qstack.regression.hyperparameters¶
Hyperparameter optimization.
- qstack.regression.hyperparameters.hyperparameters(X, y, sigma=[1.0, 3.1622776601683795, 10.0, 31.622776601683793, 100.0, 316.22776601683796, 1000.0, 3162.2776601683795, 10000.0, 31622.776601683792, 100000.0, 316227.7660168379, 1000000.0], eta=[1e-10, 3.162277660168379e-08, 1e-05, 0.0031622776601683794, 1.0], akernel='L', gkernel=None, gdict={'alpha': 1.0, 'normalize': 1, 'verbose': 0}, test_size=0.2, splits=5, idx_test=None, idx_train=None, printlevel=0, adaptive=False, read_kernel=False, sparse=None, random_state=0)[source]¶
Perform a K-fold cross-validated hyperparameter optimization (for the kernel width and the regularization parameter).
- Parameters:
X (numpy.ndarray[Nsamples,...]) – Array containing the representations of all Nsamples.
y (numpy.1darray[Nsamples]) – Array containing the target property of all Nsamples.
sigma (list) – List of kernel widths for the grid search.
eta (list) – List of regularization strengths for the grid search.
akernel (str) – Local kernel (‘L’ for Laplacian, ‘G’ for Gaussian, ‘dot’, ‘cosine’).
gkernel (str) – Global kernel (None, ‘REM’, ‘avg’).
gdict (dict) – Parameters of the global kernels.
test_size (float or int) – Test set fraction (or number of samples).
splits (int) – Number of splits (K) for the K-fold cross-validation.
idx_test (numpy.1darray) – List of indices for the test set (based on the sequence in X).
idx_train (numpy.1darray) – List of indices for the training set (based on the sequence in X).
printlevel (int) – Controls level of output printing.
adaptive (bool) – Whether to expand the grid search adaptively.
read_kernel (bool) – Whether ‘X’ is a kernel rather than an array of representations.
sparse (int) – The number of reference environments to consider for sparse regression.
random_state (int) – The seed for the random number generator (controls the train/test splitting).
- Returns:
- The results of the grid search as a numpy.2darray [Cx(MAE,std,eta,sigma)], where C is the number of parameter sets and the array is sorted by MAE (the minimum comes last); see the usage sketch below.
- Raises:
RuntimeError – If ‘X’ is a kernel and sparse regression is chosen.
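A minimal usage sketch of the Python API (the file names, array shapes, and grid values below are illustrative, not part of qstack; the default grids correspond to numpy.logspace(0, 6, 13) for sigma and numpy.logspace(-10, 0, 5) for eta):
import numpy as np
from qstack.regression.hyperparameters import hyperparameters

# Hypothetical input files: a 2D array of representations and a 1D array of targets.
X = np.load('representations.npy')   # shape (Nsamples, Nfeatures)
y = np.load('properties.npy')        # shape (Nsamples,)

# Coarse custom grids instead of the (much denser) defaults.
errors = hyperparameters(X, y,
                         sigma=[1.0, 10.0, 100.0, 1000.0],
                         eta=[1e-10, 1e-7, 1e-4, 1e-1],
                         akernel='L', test_size=0.2, splits=5)

# The returned array is sorted by MAE, with the best parameter set last.
mae, std, eta, sigma = errors[-1]
print(f'best: sigma={sigma:g}, eta={eta:g}, MAE={mae:g} +/- {std:g}')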
- qstack.regression.hyperparameters.main()[source]¶
Command-line entry point for hyperparameter optimization.
Command-line use¶
This program finds the optimal hyperparameters.
usage: python3 -m qstack.regression.hyperparameters [-h] --x REPR --y PROP
[--eta ETA [ETA ...]]
[--sigma SIGMA [SIGMA ...]]
[--akernel {G,L,dot,cosine,G_sklearn,G_custom_c,L_sklearn,L_custom_c,L_custom_py,myG,myL,myLfast}]
[--gkernel {avg,rem}]
[--gdict [GDICT ...]]
[--test TEST_SIZE] [--ll]
[--readkernel]
[--sparse SPARSE]
[--splits SPLITS]
[--print PRINTLEVEL]
[--ada] [--name NAMEOUT]
Named Arguments¶
- --x
path to the representations file
- --y
path to the properties file
- --eta
eta array
Default:
[1e-10, 3.162277660168379e-08, 1e-05, 0.0031622776601683794, 1.0]
- --sigma
sigma array
Default:
[1.0, 3.1622776601683795, 10.0, 31.622776601683793, 100.0, 316.22776601683796, 1000.0, 3162.2776601683795, 10000.0, 31622.776601683792, 100000.0, 316227.7660168379, 1000000.0]
- --akernel
Possible choices: G, L, dot, cosine, G_sklearn, G_custom_c, L_sklearn, L_custom_c, L_custom_py, myG, myL, myLfast
local kernel type: “G” for Gaussian, “L” for Laplacian, “dot” for dot products, “cosine” for cosine similarity; “G_{sklearn,custom_c}” and “L_{sklearn,custom_c,custom_py}” select specific implementations; “L_custom_py” is suited to open-shell systems
Default:
'L'
- --gkernel
Possible choices: avg, rem
global kernel type: “avg” for average, “rem” for REMatch
- --gdict
dictionary-like input string to initialize the global kernel parameters, e.g. “--gdict alpha=2 normalize=0”
Default:
{'alpha': 1.0, 'normalize': 1, 'verbose': 0}
- --test
test set fraction
Default:
0.2
- --ll
whether to correct for the number of threads
Default:
False
- --readkernel
whether X is a kernel (rather than an array of representations)
Default:
False
- --sparse
regression basis size for sparse learning
- --splits
K in K-fold cross-validation
Default:
5
- --print
printlevel
Default:
0
- --ada
whether to expand the sigma grid adaptively
Default:
False
- --name
the name of the output file
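For example, a typical invocation could read as follows (the file names repr.npy, prop.npy and the output name hyper are placeholders):
python3 -m qstack.regression.hyperparameters --x repr.npy --y prop.npy \
    --akernel G --test 0.2 --splits 5 --name hyper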
Note
If you built these docs yourself and the command-line section is empty, please make sure you have installed the required components of qstack.