
Prompts the user to select which chemical elements to keep in a data.table of bonds or angles. A row is kept when the base chemical symbol of the atom label in a specified column (e.g., "CentralAtom") matches one of the selected elements.

Usage

filter_atoms_by_symbol(data_table, atom_col = "CentralAtom")

Arguments

data_table

A data.table object containing atomic information, such as the output from calculate_angles or minimum_distance.

atom_col

A character string specifying the name of the column in data_table that contains the atom labels to filter by. Defaults to "CentralAtom".

Value

A data.table filtered to include only the rows where the atom label in atom_col corresponds to one of the user-selected chemical symbols. If the user provides no input, an empty data.table is returned.

Details

The function first identifies all unique base chemical symbols from the atom labels in the specified column (e.g., it extracts 'C' from 'C1', 'Si' from 'Si2_1'). It then presents these symbols to the user and asks for a comma-separated list of the symbols they wish to retain.
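The symbol extraction described above can be sketched in base R. This is only an illustration, not the package's actual implementation: the helper name extract_base_symbol is hypothetical, and it assumes that the base symbol is the run of letters at the start of a label.

```r
# Hypothetical helper (not part of the package): remove everything from the
# first non-letter character onward, leaving the leading alphabetic part.
# Assumption: atom labels start with their element symbol.
extract_base_symbol <- function(labels) {
  sub("[^A-Za-z].*$", "", labels)
}

labels <- c("C1", "Si2_1", "C_10", "Cr1", "C")

# One base symbol per label: "C" "Si" "C" "Cr" "C"
base_symbols <- extract_base_symbol(labels)

# The distinct symbols that would be offered to the user: "C" "Si" "Cr"
unique_symbols <- unique(base_symbols)
print(unique_symbols)
```

A label with no digits or separators, such as a lone "C", passes through unchanged because the pattern finds nothing to remove.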

The matching logic is deliberately strict to avoid ambiguity between elements. For example, if the user enters 'C', the function matches labels such as 'C1', 'C2', 'C_10', or a lone 'C'. It does not match labels for other elements whose symbols start with C, such as 'Cr1' or 'Ca2'. This is achieved by constructing a regular expression that requires the selected symbol to be followed either by a non-letter character or by the end of the label.
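A minimal sketch of this kind of matching, written in base R, is shown here. The function name build_symbol_pattern and the exact regular expression are assumptions made for illustration; the package may build its pattern differently. The user's reply is simulated with a fixed string instead of an interactive prompt.

```r
# Hypothetical helper (not part of the package): build one anchored pattern
# that matches any selected symbol followed by a non-letter or end of string.
build_symbol_pattern <- function(symbols) {
  paste0("^(", paste(symbols, collapse = "|"), ")([^A-Za-z]|$)")
}

# Simulated comma-separated user input; split it and trim whitespace.
user_input <- "C, O"
selected <- trimws(strsplit(user_input, ",", fixed = TRUE)[[1]])

pattern <- build_symbol_pattern(selected)

labels <- c("C1", "C2", "C_10", "C", "Cr1", "Ca2", "O1")
keep <- grepl(pattern, labels)

# 'Cr1' and 'Ca2' are rejected because a letter follows the 'C'.
print(labels[keep])
```

The `$` alternative in the pattern is what lets a bare label such as 'C' match, while the leading `^` prevents a symbol from matching in the middle of a label.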

This function is intended for interactive use.

Examples

# 1. Create a sample data.table of bond angles
sample_angles <- data.table::data.table(
  CentralAtom = c("C1", "C2", "Si1", "Cr1", "O1", "O2", "C"),
  Neighbor1 = c("O1", "O2", "O1", "N1", "C1", "C2", "H1"),
  Neighbor2 = c("H1", "H2", "O2", "N2", "H3", "H4", "H2"),
  Angle = c(109.5, 109.5, 120.0, 90.0, 104.5, 104.5, 120)
)

# 2. In an interactive R session, the function prompts for the symbols to keep.
#    Entering C, for example, should keep the rows for C1, C2 and C but not Cr1.
if (interactive()) {
  filtered_data <- filter_atoms_by_symbol(sample_angles, atom_col = "CentralAtom")
  print(filtered_data)
}