qstack.regression.global_kernels

Global (molecular) kernel implementations.

Provides:

global_kernels_dict: Dictionary mapping global kernel names to functions.

qstack.regression.global_kernels.avg_kernel(kernel, _options)

Compute the average kernel value.

Parameters:
  • kernel (numpy ndarray) – Kernel matrix.

  • _options (dict) – Options dictionary (unused).

Returns:

Average of all kernel matrix elements.

Return type:

float
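
As a minimal sketch of the averaging described above (not the library's source), the function below takes the mean over all elements of a local kernel matrix. The name avg_kernel_sketch and the sample matrix are illustrative assumptions.

```python
import numpy as np

def avg_kernel_sketch(kernel, _options=None):
    """Illustrative stand-in: mean over every element of the kernel matrix."""
    # _options is accepted only to mirror the documented signature; it is unused.
    return float(np.mean(kernel))

# Hypothetical 2x2 local kernel between two small molecules.
local_k = np.array([[1.0, 0.5],
                    [0.25, 0.25]])
avg_value = avg_kernel_sketch(local_k)
print(avg_value)  # 0.5
```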

qstack.regression.global_kernels.get_covariance(mol1, mol2, species, max_atoms, max_size, kernel, sigma=None)

Compute the covariance matrix between two molecules using local kernels.

Parameters:
  • mol1 (dict) – First molecule represented as dictionary of atomic environments by species.

  • mol2 (dict) – Second molecule represented as dictionary of atomic environments by species.

  • species (numpy ndarray) – Array of unique atomic species present in the dataset.

  • max_atoms (dict) – Maximum number of atoms per species across all molecules.

  • max_size (int) – Total size of the padded covariance matrix.

  • kernel (callable) – Local kernel function.

  • sigma (float, optional) – Kernel width parameter. Defaults to None.

Returns:

Covariance matrix of shape (max_size, max_size).

Return type:

numpy ndarray
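
One way to picture the padded covariance matrix is a block-diagonal layout with one block per species, each block sized by max_atoms. The sketch below assumes that layout and assumes that a local kernel is called as kernel(A, B, sigma). Neither assumption is confirmed by this page, and covariance_sketch and linear_kernel are hypothetical names.

```python
import numpy as np

def covariance_sketch(mol1, mol2, species, max_atoms, max_size, kernel, sigma=None):
    """Illustrative only: fill one zero-padded block per species (assumed layout)."""
    cov = np.zeros((max_size, max_size))
    offset = 0
    for q in species:
        a = mol1.get(q, [])
        b = mol2.get(q, [])
        # Only same-species environments are compared; missing species leave zeros.
        if len(a) and len(b):
            block = kernel(np.asarray(a), np.asarray(b), sigma)
            cov[offset:offset + len(a), offset:offset + len(b)] = block
        offset += max_atoms[q]
    return cov

def linear_kernel(A, B, _sigma):
    # Hypothetical local kernel: plain dot products, sigma ignored.
    return A @ B.T

mol1 = {1: np.array([[1.0, 0.0], [0.0, 1.0]]), 8: np.array([[1.0, 1.0]])}
mol2 = {1: np.array([[1.0, 0.0]]), 8: np.empty((0, 2))}
cov = covariance_sketch(mol1, mol2, [1, 8], {1: 2, 8: 1}, 3, linear_kernel)
print(cov.shape)  # (3, 3)
```

In this example only the hydrogen block has entries, because the second molecule has no oxygen environments.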

qstack.regression.global_kernels.get_global_K(X, Y, sigma, local_kernel, global_kernel, options)

Compute the global kernel matrix between two sets of molecular representations.

Parameters:
  • X (list) – List of molecular representations (first set).

  • Y (list) – List of molecular representations (second set).

  • sigma (float) – Kernel width parameter.

  • local_kernel (callable) – Local kernel function for atomic environments.

  • global_kernel (callable) – Global kernel function for combining local kernels.

  • options (dict) – Dictionary of global kernel options.

Returns:

Global kernel matrix of shape (len(X), len(Y)).

Return type:

numpy ndarray
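
The composition of a local and a global kernel can be sketched as a double loop over molecule pairs. This is a simplified illustration, not the library's implementation: it assumes each molecule is a plain array of atomic feature vectors, omits the species handling and padding, and uses the hypothetical helpers gaussian_local_kernel and global_K_sketch.

```python
import numpy as np

def gaussian_local_kernel(A, B, sigma):
    # Hypothetical local kernel: Gaussian of pairwise squared distances.
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def global_K_sketch(X, Y, sigma, local_kernel, global_kernel, options):
    """Illustrative only: reduce each pairwise local kernel matrix to one scalar."""
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            K[i, j] = global_kernel(local_kernel(x, y, sigma), options)
    return K

def mean_global_kernel(local_k, _options):
    # Averaging reduction, standing in for a global kernel function.
    return float(np.mean(local_k))

# Two toy molecules with 2 and 1 atomic environments of 3 features each.
mols = [np.zeros((2, 3)), np.ones((1, 3))]
K = global_K_sketch(mols, mols, 1.0, gaussian_local_kernel, mean_global_kernel, {})
print(K.shape)  # (2, 2)
```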

qstack.regression.global_kernels.mol_to_dict(mol, species)

Convert a molecular representation to a dictionary organized by atomic species.

Parameters:
  • mol (numpy ndarray) – Molecular representation where each row is [atomic_number, features…].

  • species (numpy ndarray) – Array of unique atomic species.

Returns:

Dictionary mapping atomic numbers to arrays of atomic feature vectors.

Return type:

dict
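
The grouping can be sketched as follows, assuming for simplicity that the molecule is a flat 2D numeric array whose first column holds the atomic number. The library's actual input layout may differ, and mol_to_dict_sketch is a hypothetical name.

```python
import numpy as np

def mol_to_dict_sketch(mol, species):
    """Illustrative only: group feature rows by the atomic number in column 0."""
    mol = np.asarray(mol, dtype=float)
    # Species absent from the molecule map to an empty (0, n_features) array.
    return {int(q): mol[mol[:, 0] == q][:, 1:] for q in species}

# Toy water-like molecule: rows are [atomic_number, feature_1, feature_2].
mol = np.array([[1, 0.1, 0.2],
                [8, 0.3, 0.4],
                [1, 0.5, 0.6]])
by_species = mol_to_dict_sketch(mol, np.array([1, 8]))
print(by_species[1].shape, by_species[8].shape)  # (2, 2) (1, 2)
```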

qstack.regression.global_kernels.normalize_kernel(kernel, self_x=None, self_y=None, verbose=0)

Normalize a kernel matrix using self-kernel values.

Parameters:
  • kernel (numpy ndarray) – Kernel matrix to normalize.

  • self_x (numpy ndarray, optional) – Self-kernel values for X. If None, taken from the diagonal of kernel. Defaults to None.

  • self_y (numpy ndarray, optional) – Self-kernel values for Y. If None, taken from the diagonal of kernel. Defaults to None.

  • verbose (int) – Verbosity level. Defaults to 0.

Returns:

Normalized kernel matrix.

Return type:

numpy ndarray
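
A common form of self-kernel normalization divides each element by the geometric mean of the corresponding self-kernel values, K_ij / sqrt(K_ii K_jj). The sketch below assumes that form. The page itself only states that self-kernel values are used, and normalize_kernel_sketch is a hypothetical name.

```python
import numpy as np

def normalize_kernel_sketch(kernel, self_x=None, self_y=None):
    """Illustrative only: K_ij / sqrt(self_x_i * self_y_j)."""
    kernel = np.asarray(kernel, dtype=float)
    # Falling back to the diagonal is only meaningful for a square matrix.
    if self_x is None:
        self_x = np.diag(kernel)
    if self_y is None:
        self_y = np.diag(kernel)
    return kernel / np.sqrt(np.outer(self_x, self_y))

K_raw = np.array([[4.0, 1.0],
                  [1.0, 1.0]])
K_norm = normalize_kernel_sketch(K_raw)
print(K_norm)  # unit diagonal, off-diagonal 0.5
```

For a rectangular train/test kernel the diagonal is not defined, so self_x and self_y would have to be supplied explicitly.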

qstack.regression.global_kernels.rematch_kernel(kernel, options)

Compute the REMatch (Regularized Entropy Match) kernel.

Uses the Sinkhorn algorithm to compute an optimal-transport-based kernel similarity.

Reference:

S. De, A. P. Bartók, G. Csányi, M. Ceriotti, “Comparing molecules and solids across structural and alchemical space”, Phys. Chem. Chem. Phys. 18, 13754 (2016), doi:10.1039/C6CP00415F

Parameters:
  • kernel (numpy ndarray) – Local kernel matrix.

  • options (dict) – Options dictionary containing the 'alpha' regularization parameter.

Returns:

REMatch kernel value.

Return type:

float
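
The entropy-regularized matching can be sketched with a textbook Sinkhorn iteration. This is an illustration under stated assumptions, not the library's code: it uses uniform row and column marginals and the Gibbs kernel exp(-(1 - C) / alpha), and the names rematch_sketch, n_iter and tol are hypothetical.

```python
import numpy as np

def rematch_sketch(kernel, alpha=0.5, n_iter=200, tol=1e-10):
    """Illustrative only: sum_ij P_ij * C_ij with P from Sinkhorn scaling."""
    kernel = np.asarray(kernel, dtype=float)
    n, m = kernel.shape
    # Cost is 1 - C, so similar environments are cheap to match.
    gibbs = np.exp(-(1.0 - kernel) / alpha)
    row_marginal = np.full(n, 1.0 / n)  # assumed uniform marginals
    col_marginal = np.full(m, 1.0 / m)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iter):
        u_new = row_marginal / (gibbs @ v)
        v = col_marginal / (gibbs.T @ u_new)
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break
    plan = u[:, None] * gibbs * v[None, :]  # regularized transport plan
    return float(np.sum(plan * kernel))

same = rematch_sketch(np.ones((2, 2)))  # all environments identical
distinct = rematch_sketch(np.eye(2))    # environments match one-to-one only
print(same, distinct)
```

In this sketch, a smaller alpha pushes the plan toward the best one-to-one matching, while a larger alpha pushes it toward the plain average of the local kernel.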

qstack.regression.global_kernels.sumsq(x)

Compute the sum of squares of a vector (its dot product with itself).

Parameters:
  • x (numpy ndarray) – Input vector.

Returns:

Sum of squared elements.

Return type:

float
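
For a one-dimensional vector, the sum of squares equals the dot product of the vector with itself. A minimal sketch, with sumsq_sketch as a hypothetical name:

```python
import numpy as np

def sumsq_sketch(x):
    """Illustrative only: dot product of a 1D vector with itself."""
    x = np.asarray(x, dtype=float)
    return float(x @ x)

total = sumsq_sketch([3.0, 4.0])
print(total)  # 25.0
```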