qstack.qml.b2r2

Bond-based reaction representation (B2R2) for chemical reactions.

Provides:

defaults: default parameters for B2R2 computation.

qstack.qml.b2r2.get_b2r2(reactions, variant='l', progress=False, rcut=3.5, gridspace=0.03)

High-level interface for computing bond-based reaction representations (B2R2).

Reference:

P. van Gerwen, A. Fabrizio, M. D. Wodrich, C. Corminboeuf, “Physics-based representations for machine learning properties of chemical reactions”, Mach. Learn.: Sci. Technol. 3, 045005 (2022), doi:10.1088/2632-2153/ac8f1a

Parameters:
  • reactions (List[rxn]) – List of reaction objects with attributes:
      - rxn.reactants (List[Mol]): List of reactant molecules.
      - rxn.products (List[Mol]): List of product molecules.
    Mol can be any type with .numbers and .positions (Å) attributes, for example ASE Atoms objects.

  • variant (str) – B2R2 variant to compute. Options:
      - 'l': Local variant with element-resolved skewed Gaussians (default).
      - 'a': Agnostic variant with element-pair Gaussians.
      - 'n': Nuclear variant with combined skewed Gaussians.

  • progress (bool) – If True, displays progress bar. Defaults to False.

  • rcut (float) – Cutoff radius for bond detection in Å. Defaults to 3.5.

  • gridspace (float) – Grid spacing for discretization in Å. Defaults to 0.03.

Returns:

B2R2 representations of shape (n_reactions, n_features).

For variants 'l' and 'a', each row is the difference (products - reactants). For variant 'n', each row is the concatenation [reactants, products].

Return type:

numpy.ndarray

Raises:

RuntimeError – If an unknown variant is specified.
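The input format described above can be sketched as follows. This is a minimal illustration, not taken from the qstack sources: `make_mol`, `featurize`, and the toy geometries are hypothetical, and `SimpleNamespace` merely stands in for any object exposing the documented attributes (ASE Atoms objects would serve equally well).

```python
from types import SimpleNamespace

import numpy as np


def make_mol(numbers, positions):
    """Hypothetical helper: any object with .numbers and .positions (in Å) is accepted."""
    return SimpleNamespace(
        numbers=np.array(numbers, dtype=int),
        positions=np.array(positions, dtype=float),
    )


# Toy reaction H2 + F2 -> 2 HF with illustrative (not optimized) geometries.
h2 = make_mol([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]])
f2 = make_mol([9, 9], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.41]])
hf = make_mol([1, 9], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.92]])

reactions = [SimpleNamespace(reactants=[h2, f2], products=[hf, hf])]


def featurize(reactions, variant='l'):
    """Call get_b2r2 with the documented signature (requires qstack to be installed)."""
    from qstack.qml.b2r2 import get_b2r2
    return get_b2r2(reactions, variant=variant, progress=False, rcut=3.5, gridspace=0.03)
```

With qstack installed, `featurize(reactions)` would be expected to return an array with one row per reaction, per the Returns description above.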

qstack.qml.b2r2.get_b2r2_a_molecular(ncharges, coords, elements, rcut=3.5, gridspace=0.03)

Compute B2R2_a representation for a single molecule.

Parameters:
  • ncharges (array-like) – Atomic numbers for all atoms in the molecule.

  • coords (array-like) – Atomic coordinates in Å, shape (natom, 3).

  • elements (array-like) – Unique atomic numbers present in the dataset.

  • rcut (float) – Cutoff radius for bond detection in Å. Defaults to 3.5.

  • gridspace (float) – Grid spacing for discretization in Å. Defaults to 0.03.

Returns:

B2R2_a representation of shape (n_pairs*ngrid,).

Return type:

numpy.ndarray

qstack.qml.b2r2.get_b2r2_inner(reactions, progress=False, rcut=3.5, gridspace=0.03, get_b2r2_molecular=None, combine=None)

Compute the B2R2 representations for a list of reactions.

Internal implementation function that computes B2R2 representations using the provided molecular representation function and combination strategy. The element set is determined automatically from all reactant molecules.

Parameters:
  • reactions (List[rxn]) – List of reaction objects with attributes:
      - rxn.reactants (List[Mol]): List of reactant molecules.
      - rxn.products (List[Mol]): List of product molecules.
    Mol can be any type with .numbers and .positions (Å) attributes, for example ASE Atoms objects.

  • progress (bool) – If True, displays progress bar. Defaults to False.

  • rcut (float) – Cutoff radius for bond detection in Å. Defaults to 3.5.

  • gridspace (float) – Grid spacing for discretization in Å. Defaults to 0.03.

  • get_b2r2_molecular (callable) – Function to compute molecular representations. Should be one of get_b2r2_{l,n,a}_molecular.

  • combine (callable) – Function(r: ndarray, p: ndarray) -> ndarray to combine reactant and product representations (e.g., difference or concatenation).

Returns:

B2R2 representations of shape (n_reactions, n_features), where each row represents a reaction according to the combine function.

Return type:

numpy.ndarray
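The `combine` argument can be pictured with the two strategies named in the get_b2r2 documentation (difference and concatenation). The sketch below is illustrative only: the function names are hypothetical, and `r` and `p` are assumed to be the already-aggregated reactant and product vectors for one reaction.

```python
import numpy as np


def combine_difference(r, p):
    """Products minus reactants; output has the same length as each input."""
    return p - r


def combine_concatenate(r, p):
    """Reactants followed by products; output is twice as long as each input."""
    return np.concatenate([r, p])


# Toy vectors standing in for reactant-side and product-side representations.
r = np.array([1.0, 2.0, 0.5])
p = np.array([0.5, 2.0, 1.5])

diff = combine_difference(r, p)     # array([-0.5,  0. ,  1. ])
concat = combine_concatenate(r, p)  # length 6
```

Any callable with the same `(r, p) -> ndarray` shape contract could be passed instead, provided it returns vectors of equal length for every reaction.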

qstack.qml.b2r2.get_b2r2_l_molecular(ncharges, coords, elements, rcut=3.5, gridspace=0.03)

Compute B2R2_l representation for a single molecule.

Parameters:
  • ncharges (array-like) – Atomic numbers for all atoms in the molecule.

  • coords (array-like) – Atomic coordinates in Å, shape (natom, 3).

  • elements (array-like) – Unique atomic numbers present in the dataset.

  • rcut (float) – Cutoff radius for bond detection in Å. Defaults to 3.5.

  • gridspace (float) – Grid spacing for discretization in Å. Defaults to 0.03.

Returns:

B2R2_l representation of shape (n_elements*ngrid,).

Return type:

numpy.ndarray

qstack.qml.b2r2.get_b2r2_n_molecular(ncharges, coords, elements, rcut=3.5, gridspace=0.03)

Compute B2R2_n representation for a single molecule.

Parameters:
  • ncharges (array-like) – Atomic numbers for all atoms in the molecule.

  • coords (array-like) – Atomic coordinates in Å, shape (natom, 3).

  • elements (array-like) – Unique atomic numbers present in the dataset.

  • rcut (float) – Cutoff radius for bond detection in Å. Defaults to 3.5.

  • gridspace (float) – Grid spacing for discretization in Å. Defaults to 0.03.

Returns:

B2R2_n representation of shape (ngrid,).

Return type:

numpy.ndarray
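The three molecular functions above return vectors of different lengths. The sketch below estimates those lengths under one assumption that is not stated in this reference: that the radial grid runs from 0 up to (but excluding) rcut in steps of gridspace. All function names are hypothetical.

```python
import math


def n_grid_points(rcut=3.5, gridspace=0.03):
    """Points 0, gridspace, 2*gridspace, ... strictly below rcut (assumed grid layout)."""
    return math.ceil(rcut / gridspace)


def n_element_pairs(n_elements):
    """Unordered element pairs including self-pairs: n*(n+1)/2."""
    return n_elements * (n_elements + 1) // 2


def molecular_feature_length(variant, n_elements, rcut=3.5, gridspace=0.03):
    """Expected length of a molecular vector, following the documented shapes."""
    ngrid = n_grid_points(rcut, gridspace)
    if variant == 'a':
        return n_element_pairs(n_elements) * ngrid
    if variant == 'l':
        return n_elements * ngrid
    if variant == 'n':
        return ngrid
    raise ValueError(f"unknown variant: {variant!r}")


# Three elements in the dataset (for example H, C, O) with default rcut and gridspace.
lengths = {v: molecular_feature_length(v, n_elements=3) for v in 'aln'}
# → {'a': 702, 'l': 351, 'n': 117}
```

The pair count explains why the 'a' variant grows quadratically with the number of elements while 'l' grows linearly and 'n' does not depend on it.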

qstack.qml.b2r2.get_bags(unique_ncharges)

Generate all unique element pair combinations including self-interactions.

Parameters:

unique_ncharges (array-like) – Array of unique atomic numbers (nuclear charges).

Returns:

List of all unique element pairs [Z_i, Z_j] including self-interactions.

Return type:

list
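The pair enumeration described above can be sketched with the standard library. This is an equivalent construction under the stated description, not the library's code; `element_pairs` is a hypothetical name, and the ordering of pairs returned by get_bags may differ.

```python
from itertools import combinations_with_replacement


def element_pairs(unique_ncharges):
    """All unordered pairs [Z_i, Z_j] with Z_i <= Z_j, self-pairs included."""
    return [list(pair) for pair in combinations_with_replacement(sorted(unique_ncharges), 2)]


pairs = element_pairs([1, 6, 8])
# → [[1, 1], [1, 6], [1, 8], [6, 6], [6, 8], [8, 8]]
```

For n unique elements this yields n*(n+1)/2 pairs, which sets the number of blocks in the B2R2_a molecular vector.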

qstack.qml.b2r2.get_gaussian(x, R)

Compute Gaussian function values for a given interatomic distance.

Parameters:
  • x (numpy.ndarray) – Grid points to evaluate the Gaussian.

  • R (float) – Interatomic distance determining the Gaussian parameters.

Returns:

Gaussian function values at the grid points.

Return type:

numpy.ndarray

qstack.qml.b2r2.get_mu_sigma(R)

Get Gaussian distribution parameters from interatomic distance.

The constants used here are taken from the original B2R2 implementation.

Parameters:

R (float) – Interatomic distance.

Returns:

Mean (mu) and standard deviation (sigma) for the Gaussian distribution.

Return type:

tuple
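Evaluating a Gaussian on the radial grid, once mu and sigma are known, can be sketched as below. The constants that map R to (mu, sigma) are not reproduced in this reference, so the sketch takes mu and sigma as explicit, hypothetical inputs. The grid layout and the unit-area normalization are assumptions as well.

```python
import numpy as np


def gaussian_on_grid(x, mu, sigma):
    """Normalized Gaussian density evaluated at grid points x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


# Assumed grid: 0 up to (excluding) rcut=3.5 Å in steps of gridspace=0.03 Å.
x = np.arange(0.0, 3.5, 0.03)

# Illustrative values only; in the library these would come from get_mu_sigma(R).
g = gaussian_on_grid(x, mu=1.1, sigma=0.2)

peak_position = x[np.argmax(g)]  # grid point closest to mu
area = g.sum() * 0.03            # Riemann-sum estimate of the integral, close to 1
```

Each interatomic distance therefore contributes a smooth bump to the discretized vector rather than a single nonzero bin.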

qstack.qml.b2r2.get_skew_gaussian_l_both(x, R, Z_I, Z_J)

Compute skewed Gaussian distributions for B2R2_l representation.

Parameters:
  • x (numpy.ndarray) – Grid points to evaluate the functions.

  • R (float) – Interatomic distance.

  • Z_I (int) – Atomic number of atom I.

  • Z_J (int) – Atomic number of atom J.

Returns:

Two skewed Gaussian distributions (a, b) for the atom pair.

Return type:

tuple

qstack.qml.b2r2.get_skew_gaussian_n_both(x, R, Z_I, Z_J)

Compute combined skewed Gaussian distribution for B2R2_n representation.

Parameters:
  • x (numpy.ndarray) – Grid points to evaluate the function.

  • R (float) – Interatomic distance.

  • Z_I (int) – Atomic number of atom I.

  • Z_J (int) – Atomic number of atom J.

Returns:

Combined skewed Gaussian distribution for the atom pair.

Return type:

numpy.ndarray
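For readers unfamiliar with skewed Gaussians, the sketch below evaluates the standard skew-normal density on a grid. It is a generic illustration, not the library's code: how Z_I and Z_J enter (as skewness, as weights, or both) is not specified in this reference, so the shape parameter `alpha` and all numerical values are hypothetical.

```python
import math

import numpy as np


def skew_gaussian(x, mu, sigma, alpha):
    """Skew-normal density: (2/sigma) * phi(z) * Phi(alpha*z), with z = (x - mu)/sigma."""
    z = (x - mu) / sigma
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)  # standard normal pdf
    # Standard normal cdf via the error function, applied elementwise.
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(alpha * z / math.sqrt(2.0)))
    return 2.0 / sigma * phi * Phi


x = np.arange(0.0, 3.5, 0.03)

symmetric = skew_gaussian(x, mu=1.1, sigma=0.2, alpha=0.0)  # reduces to a plain Gaussian
skewed = skew_gaussian(x, mu=1.1, sigma=0.2, alpha=4.0)     # weight shifted to x > mu

# Mean position of the skewed curve on the grid; larger than mu for alpha > 0.
mean_skewed = (x * skewed).sum() / skewed.sum()
```

With `alpha = 0` the curve is symmetric about mu; a positive `alpha` pushes weight to larger distances and a negative one to smaller distances, which is what lets a pair contribution be asymmetric.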