Utilities#
Potential conversion#
There are two basic formats for ACE potentials:
- B-basis set - YAML format, e.g. 'Al.pbe.yaml'. This is the complete internal format used for potential fitting.
- Ctilde-basis set - YACE (a special form of YAML) format, e.g. 'Al.pbe.yace'. This format is produced by an irreversible conversion from the B-basis set and is used for distributing public potentials and for running LAMMPS simulations.
Please see [pacemaker paper] for more details about the B-basis and Ctilde-basis sets.
To convert a potential, use the following utility, which is installed together with the pyace package into your executable path:
* YAML to YACE: pace_yaml2yace. Usage:
usage: pace_yaml2yace [-h] [-o OUTPUT] input [input ...]
Conversion utility from B-basis (.yaml file) to new-style Ctilde-basis (.yace
file)
positional arguments:
input input B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
output Ctilde-basis file name (.yace)
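The conversion can also be scripted. The following is a minimal sketch, assuming `pace_yaml2yace` is available on the executable path. The helper name `build_yaml2yace_cmd` and the choice of default output name are assumptions of this example; only the `-o` option and the positional input come from the help text above.

```python
import shutil
import subprocess
from pathlib import Path


def build_yaml2yace_cmd(yaml_path, yace_path=None):
    """Return the argument list for converting one B-basis file.

    Hypothetical helper: only `-o` and the positional input come from
    the help text; the default output name is chosen by this sketch.
    """
    yaml_path = Path(yaml_path)
    if yace_path is None:
        # Same file name with the .yaml suffix replaced by .yace.
        yace_path = yaml_path.with_suffix(".yace")
    return ["pace_yaml2yace", "-o", str(yace_path), str(yaml_path)]


cmd = build_yaml2yace_cmd("Al.pbe.yaml")
print(" ".join(cmd))

# Run the conversion only when the tool and the input file are available.
if shutil.which(cmd[0]) and Path(cmd[-1]).exists():
    subprocess.run(cmd, check=True)
```

Because the conversion is irreversible, it is worth keeping the original .yaml file next to the generated .yace file.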
YAML potential timing#
Utility to run a single-CPU timing test for a PACE (.yaml) potential. Usage:
pace_timing [-h] potential_file
YAML potential info#
Utility to show basic information (type of embedding, cutoff, radial functions, n-max, l-max, etc.) for a PACE (.yaml) potential. Usage:
pace_info [-h] potential_file
Collect and store VASP data in pickle file#
Utility to collect VASP calculations from a top-level directory and store them in a *.pckl.gzip file that can be used for fitting with pacemaker.
The reference energy can be provided for each element (the default value is zero)
or extracted automatically from a calculation of a single atom in a sufficiently large volume (>500 Ang^3/atom). Usage:
usage: pace_collect [-h] [-wd WORKING_DIR] [--output-dataset-filename OUTPUT_DATASET_FILENAME]
[--free-atom-energy [FREE_ATOM_ENERGY [FREE_ATOM_ENERGY ...]]] [--selection SELECTION]
optional arguments:
-h, --help show this help message and exit
-wd WORKING_DIR, --working-dir WORKING_DIR
top directory where keep calculations
--output-dataset-filename OUTPUT_DATASET_FILENAME
pickle filename, default is collected.pckl.gzip
--free-atom-energy [FREE_ATOM_ENERGY [FREE_ATOM_ENERGY ...]]
dictionary of reference energies (auto for extraction from dataset), i.e. `Al:-0.123 Cu:-0.456 Zn:auto`, default is zero. If option is `auto`, then it will be extracted from dataset
--selection SELECTION
Option to select from multiple configurations of single VASP calculation: first, last, all, first_and_last (default: last)
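The `--free-atom-energy` option takes space-separated `Element:value` tokens. The sketch below shows one way to assemble such a call from a Python dictionary. The helper names and the `./vasp_runs` directory are hypothetical; the flags, the default file name and the allowed selection values are taken from the help text above.

```python
import shlex


def free_atom_energy_args(energies):
    """Format {"Al": -0.123, "Zn": "auto"} as `Al:-0.123 Zn:auto` tokens."""
    return [f"{element}:{value}" for element, value in energies.items()]


def build_collect_cmd(working_dir, energies=None,
                      output="collected.pckl.gzip", selection="last"):
    """Hypothetical helper that assembles a pace_collect call."""
    allowed = {"first", "last", "all", "first_and_last"}
    if selection not in allowed:
        raise ValueError(f"selection must be one of {sorted(allowed)}")
    cmd = ["pace_collect", "-wd", working_dir,
           "--output-dataset-filename", output,
           "--selection", selection]
    if energies:
        # Omitting the option keeps the documented default of zero.
        cmd += ["--free-atom-energy", *free_atom_energy_args(energies)]
    return cmd


# "./vasp_runs" is a placeholder directory used only for illustration.
cmd = build_collect_cmd("./vasp_runs",
                        {"Al": -0.123, "Cu": -0.456, "Zn": "auto"})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once the utility is installed.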
Active set generation#
Utility to generate an active set, which is used to calculate the extrapolation grade.
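For orientation, the extrapolation grade can be illustrated with a toy example. The sketch below assumes the MaxVol-style definition, gamma = max_j |(b · A^-1)_j|, where A is the square active-set matrix of B-basis projections and b is the projection vector of one atomic environment. This definition, the function name and the 2x2 numbers are assumptions of this example and not a description of the pyace internals.

```python
def extrapolation_grade(b, active_set_inv):
    """Return max_j |(b . A^-1)_j| for one B-basis projection vector b."""
    n = len(b)
    coeffs = [sum(b[i] * active_set_inv[i][j] for i in range(n))
              for j in range(n)]
    return max(abs(c) for c in coeffs)


# Toy 2x2 active set A = diag(2, 4), so its inverse is diag(0.5, 0.25).
A_inv = [[0.5, 0.0],
         [0.0, 0.25]]

inside = extrapolation_grade([2.0, 0.0], A_inv)   # a row of A itself
outside = extrapolation_grade([1.0, 8.0], A_inv)  # far outside the set
print(inside, outside)
```

Under this definition, an environment that belongs to the active set has a grade of 1, and larger values indicate extrapolation.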
usage: pace_activeset [-h] [-d DATASET] [-f] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-m MEMORY_LIMIT] potential_file
Utility to compute active set for PACE (.yaml) potential
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name, ex.: filename.pckl.gzip
-f, --full Compute active set on full (linearized) design matrix
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
-g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
Gamma tolerance
-i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
Number of maximum iteration in MaxVol algorithm
-r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
Number of refinements (epochs)
-m MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
Memory limit (i.e. 1GB, 500MB or 'auto')
Example of usage:
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml
This generates a linear active set and stores it in the output_potential.asi file.
Alternatively,
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml -f
generates a full active set (including the linearized part of the non-linear embedding function)
and stores it in the output_potential.asi.nonlinear file.
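The naming rule described above can be sketched in a few lines, which is handy when a script needs to locate the active-set file afterwards. The helper names are hypothetical, and whether the file is written next to the potential or into the current directory is not stated here, so only the file name pattern is reproduced.

```python
from pathlib import Path


def active_set_filename(potential_file, full=False):
    """Active-set file name for a potential, following the rule above."""
    stem = Path(potential_file).with_suffix("")  # drop the .yaml suffix
    suffix = ".asi.nonlinear" if full else ".asi"
    return f"{stem}{suffix}"


def build_activeset_cmd(potential_file, dataset, full=False):
    """Hypothetical helper that assembles a pace_activeset call."""
    cmd = ["pace_activeset", "-d", dataset, potential_file]
    if full:
        cmd.append("-f")  # full (linearized) design matrix
    return cmd


print(build_activeset_cmd("output_potential.yaml",
                          "fitting_data_info.pckl.gzip", full=True))
print(active_set_filename("output_potential.yaml", full=True))
```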
D-optimality structure selection#
Utility to select the most representative training structures from a large dataset using the D-optimality criterion.
usage: pace_select [-h] -p POTENTIAL_FILE [-a ACTIVE_SET_INV_FNAME] [-e ELEMENTS] [-m MAX_STRUCTURES] [-o SELECTED_STRUCTURES_FILENAME] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-mem MEMORY_LIMIT] [-V] dataset [dataset ...]
Utility to select structures for training set based on D-optimality criterion
positional arguments:
dataset Dataset file name(s), ex.: filename.pckl.gzip [extrapolative_structures.dat]
optional arguments:
-h, --help show this help message and exit
-p POTENTIAL_FILE, --potential_file POTENTIAL_FILE
B-basis file name (.yaml)
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename
-e ELEMENTS, --elements ELEMENTS
List of elements, used in LAMMPS, i.e. "Ni Nb O"
-m MAX_STRUCTURES, --max-structures MAX_STRUCTURES
Maximum number of structures to select (default -1 = all)
-o SELECTED_STRUCTURES_FILENAME, --output SELECTED_STRUCTURES_FILENAME
Selected structures filename, default: selected.pkl.gz
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
-g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
Gamma tolerance
-i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
Number of maximum iteration in MaxVol algorithm
-r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
Number of refinements (epochs)
-mem MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
Memory limit (i.e. 1GB, 500MB or 'auto')
-V suppress verbosity of numerical procedures
Example of usage:
pace_select -p NiNb_potential.yaml -a NiNb_potential.asi -m 100 -e "Ni Nb" extrapolative_structures_1.dump extrapolative_structures_2.dump
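When the selection is driven from a script, the element list has to be passed to `-e` as one quoted string. The following sketch assembles the example call above; the helper name is hypothetical, and only options listed in the help text are used.

```python
import shlex


def build_select_cmd(potential_file, datasets, active_set_inv=None,
                     elements=None, max_structures=-1,
                     output="selected.pkl.gz"):
    """Hypothetical helper that assembles a pace_select call."""
    if not datasets:
        raise ValueError("at least one dataset file is required")
    cmd = ["pace_select", "-p", potential_file]
    if active_set_inv:
        cmd += ["-a", active_set_inv]
    if elements:
        # -e takes a single string such as "Ni Nb", so join the list.
        cmd += ["-e", " ".join(elements)]
    cmd += ["-m", str(max_structures), "-o", output, *datasets]
    return cmd


cmd = build_select_cmd(
    "NiNb_potential.yaml",
    ["extrapolative_structures_1.dump", "extrapolative_structures_2.dump"],
    active_set_inv="NiNb_potential.asi",
    elements=["Ni", "Nb"],
    max_structures=100,
)
print(shlex.join(cmd))
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues with the element string.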
Data augmentation#
Utility to generate an augmented dataset. Energies and forces are predicted with the ZBL potential.
usage: pace_augment [-h] -d DATASET [-a ACTIVE_SET_INV_FNAME] [-m MAX_STRUCTURES] [-o AUGMENTED_STRUCTURES_FILENAME] [-V] [-mnat MAX_NUM_AT] [-mss MAX_SEED_STRUCTURES] [-minepa MIN_AUG_EPA] [-maxepa MAX_AUG_EPA] [-eparmax EPA_RELIABLE_MAX] [-nnstep NN_DISTANCE_STEP]
[-nnmin NN_DISTANCE_MIN]
potential_file
Utility to generate augmented dataset with ZBL and/or EOS data
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name(s), ex.: -d filename.pckl.gzip [-d filename2.pckl.gzip]
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename, considered as extra B-projections
-m MAX_STRUCTURES, --max-structures MAX_STRUCTURES
Maximum number of structures to select (default -1 = all)
-o AUGMENTED_STRUCTURES_FILENAME, --output AUGMENTED_STRUCTURES_FILENAME
Augmented structures filename, (default: aug_df.pkl.gz)
-V suppress verbosity of numerical procedures
-mnat MAX_NUM_AT, --max-num-atoms MAX_NUM_AT
Maximum number of atoms for seed structures, selected for augmentation (-1 = no limit, default = 32)
-mss MAX_SEED_STRUCTURES, --max-seed-structures MAX_SEED_STRUCTURES
Maximum number of seed structures, selected for augmentation (-1 = all, default = 100)
-minepa MIN_AUG_EPA, --min--aug-epa MIN_AUG_EPA
Minimal augmented energy-per-atom (default None = no limit)
-maxepa MAX_AUG_EPA, --max--aug-epa MAX_AUG_EPA
Maximal augmented energy-per-atom (default None = 150)
-eparmax EPA_RELIABLE_MAX, --epa-reliable-max EPA_RELIABLE_MAX
Maximum for reliable energy-per-atom (default None = no limit)
-nnstep NN_DISTANCE_STEP, --nn-dist-step NN_DISTANCE_STEP
Nearest-neighbour distance step for data augmentation (default = 0.1)
-nnmin NN_DISTANCE_MIN, --nn-dist-min NN_DISTANCE_MIN
Min. nearest-neighbour distance for data augmentation (default = 1)
Example of usage:
pace_augment NiNb-FM-upfit-mu.yaml -a NiNb-FM-upfit-mu.asi -d df_all_NbNi_FM_new_str.pckl.gzip
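For reference, the ZBL pair energy mentioned above can be evaluated with the standard universal screening function. The sketch below uses the commonly quoted coefficients (the same form documented for the LAMMPS `zbl` pair style) and no switching function. Whether pace_augment applies a switching function or a cutoff is not stated here, so this is only an illustration of the functional form, not of the utility's implementation.

```python
import math

# Coulomb constant e^2 / (4*pi*eps0) in eV*Angstrom.
COULOMB_EV_ANGSTROM = 14.399645

# (coefficient, exponent) pairs of the universal ZBL screening function.
ZBL_TERMS = (
    (0.18175, 3.19980),
    (0.50986, 0.94229),
    (0.28022, 0.40290),
    (0.02817, 0.20162),
)


def zbl_screening(x):
    """Universal screening function phi(x), with x = r / a."""
    return sum(c * math.exp(-d * x) for c, d in ZBL_TERMS)


def zbl_energy(z1, z2, r):
    """Unswitched ZBL pair energy in eV for atomic numbers z1, z2 at r (Angstrom)."""
    a = 0.46850 / (z1 ** 0.23 + z2 ** 0.23)  # screening length, Angstrom
    return COULOMB_EV_ANGSTROM * z1 * z2 / r * zbl_screening(r / a)


# Ni-Nb pair (Z = 28 and 41) at decreasing separation.
for r in (2.0, 1.5, 1.0):
    print(r, zbl_energy(28, 41, r))
```

The energy rises steeply as the separation decreases, which is the short-range repulsion that the augmented data is meant to supply.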
Core-repulsion tuner#
Utility for automatic/manual setup of ZBL core-repulsion.
usage: pace_corerep [-h] [-d DATASET] [-a ACTIVE_SET_INV_FNAME] [-o OUTPUT_FILE] [-V] [-nnstep NN_DISTANCE_STEP] [-nnmin NN_DISTANCE_MIN] [-nnmax NN_DISTANCE_MAX] [-n NUM_OF_STRUCTURES] [-g GAMMA_MAX] [--inner-cutoff [INNER_CUTOFF_DICT [INNER_CUTOFF_DICT ...]]] potential_file
Utility to (auto)tune potential and add ZBL core-repulsion
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name(s), ex.: -d filename.pckl.gzip [-d filename2.pckl.gzip]
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename, considered as extra B-projections
-o OUTPUT_FILE, --output OUTPUT_FILE
Output filename for auto-tuned core-rep potential. default=none - same as `potential_file`. If `auto` - 'corerep' suffix will be added
-V suppress verbosity of numerical procedures
-nnstep NN_DISTANCE_STEP, --nn-dist-step NN_DISTANCE_STEP
Nearest-neighbour distance step for data augmentation (default = 0.05)
-nnmin NN_DISTANCE_MIN, --nn-dist-min NN_DISTANCE_MIN
Min. nearest-neighbour distance for data augmentation (default = 1)
-nnmax NN_DISTANCE_MAX, --nn-dist-max NN_DISTANCE_MAX
Max. nearest-neighbour distance for data augmentation (default = 2.5)
-n NUM_OF_STRUCTURES, --num-of-structures NUM_OF_STRUCTURES
Number of structures selected to compress (default = 50)
-g GAMMA_MAX, --gamma-max GAMMA_MAX
Max. extrapolation grade gamma for reliable atomic env. (default = 10)
--inner-cutoff [INNER_CUTOFF_DICT [INNER_CUTOFF_DICT ...]]
dictionary of inner cutoff `Al:-0.123 Cu-Cu:-0.456 Al-Cu:auto`, default is zero. If option is `auto`, then it will be extracted from dataset
Example of usage (automatic):
pace_corerep NiNb-5-FM-ZBL.yaml -a NiNb-5-FM-ZBL.asi -d fitting_data_info.pckl.gzip
Manual configuration of the inner cutoff:
pace_corerep AlLi-8a-auto.yaml --inner-cutoff Al:1.95 Li:1.90 Al-Li:1.85
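The `--inner-cutoff` tokens use `Element:value` for one species and `A-B:value` for a pair. A minimal sketch of building them from a dictionary is shown below. The helper names and the convention of using tuples as keys for pairs are assumptions of this example; the flags are those listed in the help text above.

```python
def inner_cutoff_args(cutoffs):
    """Format {"Al": 1.95, ("Al", "Li"): 1.85} as `Al:1.95 Al-Li:1.85` tokens."""
    tokens = []
    for key, value in cutoffs.items():
        # A plain string is one element; a tuple is joined into "A-B".
        label = key if isinstance(key, str) else "-".join(key)
        tokens.append(f"{label}:{value}")
    return tokens


def build_corerep_cmd(potential_file, cutoffs=None, dataset=None,
                      active_set_inv=None, output=None):
    """Hypothetical helper that assembles a pace_corerep call."""
    cmd = ["pace_corerep", potential_file]
    if active_set_inv:
        cmd += ["-a", active_set_inv]
    if dataset:
        cmd += ["-d", dataset]
    if output:
        cmd += ["-o", output]
    if cutoffs:
        cmd += ["--inner-cutoff", *inner_cutoff_args(cutoffs)]
    return cmd


cmd = build_corerep_cmd(
    "AlLi-8a-auto.yaml",
    cutoffs={"Al": 1.95, "Li": 1.9, ("Al", "Li"): 1.85},
)
print(" ".join(cmd))
```

Python prints 1.90 as 1.9, so the tokens differ from the manual example only in trailing zeros.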