Utilities#
Potential conversion#
There are two basic formats for ACE potentials:
- B-basis set: YAML format, e.g. 'Al.pbe.yaml'. This is the complete internal format used for potential fitting.
- Ctilde-basis set: YACE (a special form of YAML) format, e.g. 'Al.pbe.yace'. This format is produced by an irreversible conversion from the B-basis set and is used for distributing public potentials and for running LAMMPS simulations.
Please see [pacemaker paper] for more details about the B-basis and Ctilde-basis sets.
To convert a potential, use the following utility, which is installed into your executable path together with the pyace package:
* YAML to YACE: pace_yaml2yace. Usage:
usage: pace_yaml2yace [-h] [-o OUTPUT] input [input ...]
Conversion utility from B-basis (.yaml file) to new-style Ctilde-basis (.yace
file)
positional arguments:
input input B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
output Ctilde-basis file name (.yace)
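When several B-basis files need converting, it can help to build one command per file with an explicit output name. The sketch below only assembles the argument list from the options shown in the help text above. The helper name `yaml2yace_command` and the choice to place the .yace file next to the input are assumptions made for illustration, not part of pyace.

```python
import shlex
from pathlib import Path


def yaml2yace_command(yaml_path):
    """Return the argv list that converts one B-basis file to a .yace file."""
    src = Path(yaml_path)
    # Assumption: write the output next to the input, swapping only the suffix.
    dst = src.with_suffix(".yace")
    return ["pace_yaml2yace", "-o", str(dst), str(src)]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for name in ["Al.pbe.yaml", "Cu.pbe.yaml"]:
        print(shlex.join(yaml2yace_command(name)))
```

The returned list can be passed to `subprocess.run(argv, check=True)` once pyace is installed and the command is on the executable path.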
YAML potential timing#
Utility to run a single-CPU timing test for a PACE (.yaml) potential. Usage:
pace_timing [-h] potential_file
YAML potential info#
Utility to show basic information (type of embedding, cutoff, radial functions, n-max, l-max, etc.) for a PACE (.yaml) potential. Usage:
pace_info [-h] potential_file
Collect and store VASP data in pickle file#
Utility to collect VASP calculations from a top-level directory and store them in a *.pckl.gzip file that can be used for fitting with pacemaker.
The reference energies can be provided for each element (default value is zero) or extracted automatically from a calculation with a single atom and a large enough volume (>500 Ang^3/atom). Usage:
usage: pace_collect [-h] [-wd WORKING_DIR] [--output-dataset-filename OUTPUT_DATASET_FILENAME]
[--free-atom-energy [FREE_ATOM_ENERGY [FREE_ATOM_ENERGY ...]]] [--selection SELECTION]
optional arguments:
-h, --help show this help message and exit
-wd WORKING_DIR, --working-dir WORKING_DIR
top directory where keep calculations
--output-dataset-filename OUTPUT_DATASET_FILENAME
pickle filename, default is collected.pckl.gzip
--free-atom-energy [FREE_ATOM_ENERGY [FREE_ATOM_ENERGY ...]]
dictionary of reference energies (auto for extraction from dataset), i.e. `Al:-0.123 Cu:-0.456 Zn:auto`, default is zero. If option is `auto`, then it will be extracted from dataset
--selection SELECTION
Option to select from multiple configurations of single VASP calculation: first, last, all, first_and_last (default: last)
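The `--free-atom-energy` value is a space-separated list of `Element:value` items, where the value is either a number or `auto`. The sketch below shows one way to read such a string into a Python dictionary, for example to validate it before launching `pace_collect`. The function name `parse_free_atom_energy` is hypothetical and this is not the parser used inside pyace.

```python
def parse_free_atom_energy(spec):
    """Parse a string such as 'Al:-0.123 Cu:-0.456 Zn:auto' into a dict."""
    energies = {}
    for item in spec.split():
        # Split on the first colon only, so negative values stay intact.
        element, _, value = item.partition(":")
        if not element or not value:
            raise ValueError(f"expected Element:value, got {item!r}")
        # 'auto' is kept as a marker; anything else must be a number.
        energies[element] = "auto" if value == "auto" else float(value)
    return energies


if __name__ == "__main__":
    print(parse_free_atom_energy("Al:-0.123 Cu:-0.456 Zn:auto"))
```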
Active set generation#
Utility to generate an active set (used for extrapolation grade calculation).
usage: pace_activeset [-h] [-d DATASET] [-f] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-m MEMORY_LIMIT] potential_file
Utility to compute active set for PACE (.yaml) potential
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name, ex.: filename.pckl.gzip
-f, --full Compute active set on full (linearized) design matrix
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
-g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
Gamma tolerance
-i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
Number of maximum iteration in MaxVol algorithm
-r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
Number of refinements (epochs)
-m MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
Memory limit (i.e. 1GB, 500MB or 'auto')
Example of usage:
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml
This generates the linear active set and stores it in the output_potential.asi file.
Alternatively:
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml -f
This generates the full active set (including the linearized part of the non-linear embedding function) and stores it in the output_potential.asi.nonlinear file.
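Scripts that chain `pace_activeset` with later steps need to know which file it wrote. The sketch below derives the expected name from the two examples in this section (the .yaml suffix replaced by .asi, with .nonlinear appended when `-f` is used). The helper `active_set_filename` is hypothetical, and the naming rule is inferred from these examples only.

```python
from pathlib import Path


def active_set_filename(potential_file, full=False):
    """Return the active set file name expected for a given potential file."""
    # Replace the .yaml suffix with .asi, as in the examples in this section.
    base = Path(potential_file).with_suffix(".asi")
    # The full (linearized) active set carries an extra .nonlinear suffix.
    return str(base) + (".nonlinear" if full else "")


if __name__ == "__main__":
    print(active_set_filename("output_potential.yaml"))
    print(active_set_filename("output_potential.yaml", full=True))
```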
D-optimality structure selection#
Utility to select the most representative training structures from a large dataset using the D-optimality criterion.
usage: pace_select [-h] -p POTENTIAL_FILE [-a ACTIVE_SET_INV_FNAME] [-e ELEMENTS] [-m MAX_STRUCTURES] [-o SELECTED_STRUCTURES_FILENAME] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-mem MEMORY_LIMIT] [-V] dataset [dataset ...]
Utility to select structures for training se based on D-optimality criterion
positional arguments:
dataset Dataset file name(s), ex.: filename.pckl.gzip [extrapolative_structures.dat]
optional arguments:
-h, --help show this help message and exit
-p POTENTIAL_FILE, --potential_file POTENTIAL_FILE
B-basis file name (.yaml)
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename
-e ELEMENTS, --elements ELEMENTS
List of elements, used in LAMMPS, i.e. "Ni Nb O"
-m MAX_STRUCTURES, --max-structures MAX_STRUCTURES
Maximum number of structures to select (default -1 = all)
-o SELECTED_STRUCTURES_FILENAME, --output SELECTED_STRUCTURES_FILENAME
Selected structures filename, default: selected.pkl.gz
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
-g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
Gamma tolerance
-i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
Number of maximum iteration in MaxVol algorithm
-r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
Number of refinements (epochs)
-mem MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
Memory limit (i.e. 1GB, 500MB or 'auto')
-V suppress verbosity of numerical procedures
Example of usage:
pace_select -p NiNb_potential.yaml -a NiNb_potential.asi -m 100 -e "Ni Nb" extrapolative_structures_1.dump extrapolative_structures_2.dump
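Note that the element list is passed as a single quoted argument. When the command is assembled from a script, building an argument list avoids quoting mistakes. The following is a minimal sketch that uses only the options documented above; the function `pace_select_command` is a hypothetical helper, not part of pyace.

```python
import shlex


def pace_select_command(potential, active_set_inv, datasets,
                        elements=None, max_structures=None):
    """Assemble the argv list for a pace_select call."""
    argv = ["pace_select", "-p", potential, "-a", active_set_inv]
    if max_structures is not None:
        argv += ["-m", str(max_structures)]
    if elements:
        # All elements go into ONE argument, e.g. "Ni Nb".
        argv += ["-e", " ".join(elements)]
    argv += list(datasets)
    return argv


if __name__ == "__main__":
    cmd = pace_select_command(
        "NiNb_potential.yaml", "NiNb_potential.asi",
        ["extrapolative_structures_1.dump", "extrapolative_structures_2.dump"],
        elements=["Ni", "Nb"], max_structures=100,
    )
    # shlex.join adds the shell quoting needed around the element list.
    print(shlex.join(cmd))
```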
Data augmentation#
Utility to generate an augmented dataset. Energies and forces are predicted with the ZBL potential.
usage: pace_augment [-h] -d DATASET [-a ACTIVE_SET_INV_FNAME] [-m MAX_STRUCTURES] [-o AUGMENTED_STRUCTURES_FILENAME] [-V] [-mnat MAX_NUM_AT] [-mss MAX_SEED_STRUCTURES] [-minepa MIN_AUG_EPA] [-maxepa MAX_AUG_EPA] [-eparmax EPA_RELIABLE_MAX] [-nnstep NN_DISTANCE_STEP]
[-nnmin NN_DISTANCE_MIN]
potential_file
Utility to generate augmented dataset with ZBL and/or EOS data
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name(s), ex.: -d filename.pckl.gzip [-d filename2.pckl.gzip]
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename, considered as extra B-projections
-m MAX_STRUCTURES, --max-structures MAX_STRUCTURES
Maximum number of structures to select (default -1 = all)
-o AUGMENTED_STRUCTURES_FILENAME, --output AUGMENTED_STRUCTURES_FILENAME
Augmented structures filename, (default: aug_df.pkl.gz)
-V suppress verbosity of numerical procedures
-mnat MAX_NUM_AT, --max-num-atoms MAX_NUM_AT
Maximum number of atoms for seed structures, selected for augmentation (-1 = no limit, default = 32)
-mss MAX_SEED_STRUCTURES, --max-seed-structures MAX_SEED_STRUCTURES
Maximum number of seed structures, selected for augmentation (-1 = all, default = 100)
-minepa MIN_AUG_EPA, --min--aug-epa MIN_AUG_EPA
Minimal augmented energy-per-atom (default None = no limit)
-maxepa MAX_AUG_EPA, --max--aug-epa MAX_AUG_EPA
Maximal augmented energy-per-atom (default None = 150)
-eparmax EPA_RELIABLE_MAX, --epa-reliable-max EPA_RELIABLE_MAX
Maximum for reliable energy-per-atom (default None = no limit)
-nnstep NN_DISTANCE_STEP, --nn-dist-step NN_DISTANCE_STEP
Nearest-neighbour distance step for data augmentation (default = 0.1)
-nnmin NN_DISTANCE_MIN, --nn-dist-min NN_DISTANCE_MIN
Nearest-neighbour distance step for data augmentation (default = 1)
Example of usage:
pace_augment NiNb-FM-upfit-mu.yaml -a NiNb-FM-upfit-mu.asi -d df_all_NbNi_FM_new_str.pckl.gzip
Core-repulsion tuner#
Utility for automatic/manual setup of ZBL core-repulsion.
usage: pace_corerep [-h] [-d DATASET] [-a ACTIVE_SET_INV_FNAME] [-o OUTPUT_FILE] [-V] [-nnstep NN_DISTANCE_STEP] [-nnmin NN_DISTANCE_MIN] [-nnmax NN_DISTANCE_MAX] [-n NUM_OF_STRUCTURES] [-g GAMMA_MAX] [--inner-cutoff [INNER_CUTOFF_DICT [INNER_CUTOFF_DICT ...]]] potential_file
Utility to (auto)tune potential and add ZBL core-repulsion ZBL
positional arguments:
potential_file B-basis file name (.yaml)
optional arguments:
-h, --help show this help message and exit
-d DATASET, --dataset DATASET
Dataset file name(s), ex.: -d filename.pckl.gzip [-d filename2.pckl.gzip]
-a ACTIVE_SET_INV_FNAME, --active-set-inv ACTIVE_SET_INV_FNAME
Active Set Inverted (ASI) filename, considered as extra B-projections
-o OUTPUT_FILE, --output OUTPUT_FILE
Output filename for auto-tuned core-rep potential. default=none - same as `potential_file`. If `auto` - 'corerep' suffix will be added
-V suppress verbosity of numerical procedures
-nnstep NN_DISTANCE_STEP, --nn-dist-step NN_DISTANCE_STEP
Nearest-neighbour distance step for data augmentation (default = 0.05)
-nnmin NN_DISTANCE_MIN, --nn-dist-min NN_DISTANCE_MIN
Min. nearest-neighbour distance for data augmentation (default = 1)
-nnmax NN_DISTANCE_MAX, --nn-dist-max NN_DISTANCE_MAX
Max. nearest-neighbour distance for data augmentation (default = 2.5)
-n NUM_OF_STRUCTURES, --num-of-structures NUM_OF_STRUCTURES
Number of structures selected to compress (default = 50)
-g GAMMA_MAX, --gamma-max GAMMA_MAX
Max. extrapolation grade gamma for reliable atomic env. (default = 10)
--inner-cutoff [INNER_CUTOFF_DICT [INNER_CUTOFF_DICT ...]]
dictionary of inner cutoff `Al:-0.123 Cu-Cu:-0.456 Al-Cu:auto`, default is zero. If option is `auto`, then it will be extracted from dataset
Example of usage, automatic:
pace_corerep NiNb-5-FM-ZBL.yaml -a NiNb-5-FM-ZBL.asi -d fitting_data_info.pckl.gzip
Manual configuration of the inner cutoff:
pace_corerep AlLi-8a-auto.yaml --inner-cutoff Al:1.95 Li:1.90 Al-Li:1.85
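The `--inner-cutoff` value mixes single-element keys (`Al`) and pair keys (`Al-Li`). The sketch below reads such a string into a dictionary keyed by element tuples, which can be handy for checking a specification before running `pace_corerep`. The function `parse_inner_cutoff` is hypothetical, and how pyace itself interprets a single-element key is not assumed here.

```python
def parse_inner_cutoff(spec):
    """Parse a string such as 'Al:1.95 Li:1.90 Al-Li:1.85' into a dict."""
    cutoffs = {}
    for item in spec.split():
        # Split on the first colon; the key may itself contain a hyphen.
        key, _, value = item.partition(":")
        if not key or not value:
            raise ValueError(f"expected Key:value, got {item!r}")
        # 'Al' becomes ('Al',) and 'Al-Li' becomes ('Al', 'Li').
        elements = tuple(key.split("-"))
        cutoffs[elements] = "auto" if value == "auto" else float(value)
    return cutoffs


if __name__ == "__main__":
    print(parse_inner_cutoff("Al:1.95 Li:1.90 Al-Li:1.85"))
```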