Temperature dependent force constants calculation using pypolymlp and symfc#
This is an experimental feature, and its usage may change occasionally.
Note
This feature is supported through the phonopy-load command but not the phonopy command.
With the --pypolymlp option, phonopy can interface with the polynomial machine
learning potential (MLP) code, pypolymlp, to perform training and
evaluation tasks of MLPs. This feature aims to reduce the computational cost of
anharmonic force constant calculations by using MLPs as an intermediary layer,
efficiently representing atomic interactions. The example is found at
example/KCl-SSCHA.
The training process involves using a dataset consisting of supercell displacements, forces, and energies. The trained MLPs are then employed to compute forces for supercells with specific displacements.
For further details on combining phonopy calculations with pypolymlp, refer to A. Togo and A. Seko, J. Chem. Phys. 160, 211001 (2024) [doi] [arxiv].
Using the polynomial MLPs, stochastic self-consistent harmonic approximation (SSCHA) calculations are performed in the following sections. In this way, temperature dependent force constants are calculated within SSCHA. For details of SSCHA, please refer to the papers by L. Monacelli et al., J. Phys.: Condens. Matter 33, 363001 (2021) and A. van Roekeghem et al., Comput. Phys. Commun. 263, 107945 (2021). Technically, the computational procedure introduced here is equivalent to the approach of the latter paper.
Citation of pypolymlp#
“Tutorial: Systematic development of polynomial machine learning potentials for elemental and alloy systems”, A. Seko, J. Appl. Phys. 133, 011101 (2023) [doi].
@article{pypolymlp,
author = {Seko, Atsuto},
title = "{Tutorial: Systematic development of polynomial machine learning potentials for elemental and alloy systems}",
journal = {J. Appl. Phys.},
volume = {133},
number = {1},
pages = {011101},
year = {2023},
month = {01},
}
Citation of symfc#
“Projector-based efficient estimation of force constants”, A. Seko and A. Togo, Phys. Rev. B, 110, 214302 (2024) [doi] [arxiv].
@article{PhysRevB.110.214302,
title = {Projector-based efficient estimation of force constants},
author = {Seko, Atsuto and Togo, Atsushi},
journal = {Phys. Rev. B},
volume = {110},
issue = {21},
pages = {214302},
numpages = {18},
year = {2024},
month = {Dec},
}
Requirements#
How to calculate#
Workflow#
1. Generate random displacements in supercells. Use the --rd option.
2. Calculate the corresponding forces and energies in the supercells. Use of the VASP interface is recommended because the --sp option is supported for it.
3. Prepare a dataset composed of displacements, forces, and energies in supercells. The dataset must be stored in a phonopy-yaml-like file, e.g., phonopy_params.yaml. Use the -f and --sp options simultaneously.
4. Develop MLPs. By default, 90 and 10 percent of the dataset are used for training and testing, respectively. At this step, phonopy.pmlp is saved.
5. Generate displacements in supercells, either systematic or random.
6. Evaluate MLPs for forces of the supercells generated in step 5.
7. Calculate force constants from the displacement-force dataset from steps 5 and 6.
8. Calculate temperature dependent force constants.
Steps 4-7 are executed by running phonopy-load with the --pypolymlp option.
Steps 1-3: Dataset preparation#
For the training, the following supercell data are required in the phonopy setting to use pypolymlp:
Displacements
Forces
Total energies
These data must be stored in a phonopy.yaml-like file.
The supercells with displacements are generated by
% phonopy --pa auto --rd 1000 -c POSCAR-unitcell --dim 2 2 2 --amin 0.03 --amax 1.5
_
_ __ | |__ ___ _ __ ___ _ __ _ _
| '_ \| '_ \ / _ \| '_ \ / _ \ | '_ \| | | |
| |_) | | | | (_) | | | | (_) || |_) | |_| |
| .__/|_| |_|\___/|_| |_|\___(_) .__/ \__, |
|_| |_| |___/
2.31.1
Compiled with OpenMP support (max 10 threads).
Python version 3.12.6
Spglib version 2.5.0
Crystal structure was read from "POSCAR-unitcell".
Unit of length: angstrom
Displacements creation mode
Number of supercells with random displacements: 1000
Min displacement distance: 0.03
Max displacement distance: 1.5
Settings:
Supercell: [2 2 2]
Primitive matrix (Auto):
[0. 0.5 0.5]
[0.5 0. 0.5]
[0.5 0.5 0. ]
Spacegroup: Fm-3m (225)
Number of symmetry operations in supercell: 1536
Use -v option to watch primitive cell, unit cell, and supercell structures.
"phonopy_disp.yaml" and supercells have been created.
Summary of calculation was written in "phonopy_disp.yaml".
_
___ _ __ __| |
/ _ \ '_ \ / _` |
| __/ | | | (_| |
\___|_| |_|\__,_|
For the generated supercells, forces and energies are calculated. Here, use of the VASP code is assumed. Once the calculations are complete, the data (forces and energies) can be extracted using the following command:
% phonopy --sp -f vasprun_xmls/vasprun-{001..120}.xml
This command extracts the necessary data and stores it in the
phonopy_params.yaml file. For more details, refer to the description of the
--sp option. Currently, supercell energy extraction
from calculator outputs is only supported when using the VASP interface.
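Before running the extraction, it can help to confirm that every expected output file is present. The following is a minimal sketch that assumes the directory and zero-padded file naming used in the command above (vasprun_xmls/vasprun-001.xml to vasprun-120.xml). The helper names expected_vasprun_paths and find_missing are hypothetical and are not part of phonopy.

```python
from pathlib import Path


def expected_vasprun_paths(n_supercells, directory="vasprun_xmls"):
    """Return paths vasprun-001.xml ... vasprun-NNN.xml (assumed naming scheme)."""
    return [
        Path(directory) / f"vasprun-{i:03d}.xml"
        for i in range(1, n_supercells + 1)
    ]


def find_missing(paths):
    """Return the subset of paths that do not exist as files on disk."""
    return [p for p in paths if not p.is_file()]


if __name__ == "__main__":
    paths = expected_vasprun_paths(120)
    missing = find_missing(paths)
    if missing:
        print(f"{len(missing)} of {len(paths)} files are missing, e.g. {missing[0]}")
    else:
        print(f"All {len(paths)} vasprun.xml files were found.")
```

A check of this kind only verifies that the files exist. It does not verify that the underlying calculations finished correctly.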
Steps 4-7: Force constants calculation (systematic displacements in step 5)#
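After developing MLPs, displacements are generated systematically considering crystal symmetry.
Given the dataset file (phonopy_params.yaml in the previous step; the example below uses phonopy_mlpsscha_params_KCl-120.yaml.xz), phonopy-load is executed with the --pypolymlp
option,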
% phonopy-load phonopy_mlpsscha_params_KCl-120.yaml.xz --pypolymlp --mlp-params="ntrain=100, ntest=20"
_
_ __ | |__ ___ _ __ ___ _ __ _ _
| '_ \| '_ \ / _ \| '_ \ / _ \ | '_ \| | | |
| |_) | | | | (_) | | | | (_) || |_) | |_| |
| .__/|_| |_|\___/|_| |_|\___(_) .__/ \__, |
|_| |_| |___/
2.31.2
Compiled with OpenMP support (max 10 threads).
Running in phonopy.load mode.
Python version 3.12.3
Spglib version 2.5.0
Crystal structure was read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
Unit of length: angstrom
Settings:
Supercell: [2 2 2]
Primitive matrix:
[0. 0.5 0.5]
[0.5 0. 0.5]
[0.5 0.5 0. ]
Spacegroup: Fm-3m (225)
Number of symmetry operations in supercell: 1536
Use -v option to watch primitive cell, unit cell, and supercell structures.
NAC parameters were read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
Displacement-force dataset was read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
----------------------------- pypolymlp start ------------------------------
Pypolymlp is a generator of polynomial machine learning potentials.
Please cite the paper: A. Seko, J. Appl. Phys. 133, 011101 (2023).
Pypolymlp is developed at https://github.com/sekocha/pypolymlp.
Parameters:
cutoff: 8.0
model_type: 3
max_p: 2
gtinv_order: 3
gtinv_maxl: (8, 8)
gaussian_params1: (1.0, 1.0, 1)
gaussian_params2: (0.0, 7.0, 10)
ntrain: 100
ntest: 20
Developing MLPs by pypolymlp...
Regression: cholesky decomposition ...
- alpha: 0.001
- alpha: 0.01
- alpha: 0.1
- alpha: 1.0
- alpha: 10.0
Clear training X.T @ X
Calculate X.T @ X for test data
Clear test X.T @ X
Regression: model selection ...
- alpha = 1.000e-03 : rmse (train, test) = 0.02432 0.23669
- alpha = 1.000e-02 : rmse (train, test) = 0.03613 0.16766
- alpha = 1.000e-01 : rmse (train, test) = 0.07193 0.22140
- alpha = 1.000e+00 : rmse (train, test) = 0.11563 0.26042
- alpha = 1.000e+01 : rmse (train, test) = 0.19375 0.31767
MLPs were written into "phonopy.pmlp"
------------------------------ pypolymlp end -------------------------------
Generate displacements
Displacement distance: 0.001
Evaluate forces in 4 supercells by pypolymlp
-------------------------------- Symfc start -------------------------------
Symfc is a force constants calculator. See the following paper:
A. Seko and A. Togo, arXiv:2403.03588.
Symfc is developed at https://github.com/symfc/symfc.
Computing [2] order force constants.
Increase log-level to watch detailed symfc log.
--------------------------------- Symfc end --------------------------------
Max drift of force constants: -0.000000 (yy) -0.000000 (yy)
----------------------------------------------------------------------------
One of the following run modes may be specified for phonon calculations.
- Mesh sampling (MESH, --mesh)
- Q-points (QPOINTS, --qpoints)
- Band structure (BAND, --band)
- Animation (ANIME, --anime)
- Modulation (MODULATION, --modulation)
- Characters of Irreps (IRREPS, --irreps)
- Create displacements (CREATE_DISPLACEMENTS, -d)
----------------------------------------------------------------------------
Dataset generated using MMLPs was written in "phonopy_mlp_eval_dataset.yaml".
Summary of calculation was written in "phonopy.yaml".
_
___ _ __ __| |
/ _ \ '_ \ / _` |
| __/ | | | (_| |
\___|_| |_|\__,_|
Information about the development of MLPs using pypolymlp is provided between
the pypolymlp start and pypolymlp end sections. The polynomial MLPs are
saved in the phonopy.pmlp file. This file is automatically searched for in
subsequent phonopy executions with the --pypolymlp option and reused.
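The model selection lines in the log can be collected with a short script to compare the training and test errors for each regularization parameter alpha. The sketch below only parses the line format shown in the log above and reports the entry with the lowest test RMSE. Whether pypolymlp itself selects the model by this criterion is an assumption that is not confirmed here, and parse_model_selection is a hypothetical helper name.

```python
import re

# Matches lines such as
#   - alpha = 1.000e-02 : rmse (train, test) = 0.03613 0.16766
RMSE_LINE = re.compile(
    r"alpha\s*=\s*(\S+)\s*:\s*rmse \(train, test\)\s*=\s*(\S+)\s+(\S+)"
)


def parse_model_selection(log_text):
    """Return a list of (alpha, rmse_train, rmse_test) tuples found in the log."""
    return [
        (float(alpha), float(train), float(test))
        for alpha, train, test in RMSE_LINE.findall(log_text)
    ]


# Lines copied from the log shown above.
log_text = """
- alpha = 1.000e-03 : rmse (train, test) = 0.02432 0.23669
- alpha = 1.000e-02 : rmse (train, test) = 0.03613 0.16766
- alpha = 1.000e-01 : rmse (train, test) = 0.07193 0.22140
- alpha = 1.000e+00 : rmse (train, test) = 0.11563 0.26042
- alpha = 1.000e+01 : rmse (train, test) = 0.19375 0.31767
"""

rows = parse_model_selection(log_text)
best = min(rows, key=lambda row: row[2])  # entry with the lowest test RMSE
print(f"lowest test RMSE {best[2]} at alpha = {best[0]:g}")
```

For the log above, the lowest test RMSE (0.16766) appears at alpha = 0.01. A test RMSE that is much larger than the training RMSE may indicate that more training data are needed.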
After the MLPs are developed, systematic displacements, such as those involving
the displacement of one or two atoms in supercells, are generated with a
displacement distance of 0.001 Angstrom. The forces for these supercells are
then evaluated using pypolymlp. Both the generated displacements and the
corresponding forces are stored in the phonopy_mlp_eval_dataset.yaml file.
The calculated force constants may be referred to as the harmonic force constants.
Step 8: Temperature dependent force constants calculation#
After the last step, the phonopy.pmlp file exists in the current directory.
This file is read automatically in the next calculation with the --pypolymlp
option. If the developed MLPs predict forces well at relatively large
displacements, temperature dependent force constants can be calculated with the
--sscha NUMBER_OF_ITERATIONS option.
% phonopy-load phonopy_mlpsscha_params_KCl-120.yaml.xz --pypolymlp --sscha 10 --rd-temperature 300 --rd 1000
_
_ __ | |__ ___ _ __ ___ _ __ _ _
| '_ \| '_ \ / _ \| '_ \ / _ \ | '_ \| | | |
| |_) | | | | (_) | | | | (_) || |_) | |_| |
| .__/|_| |_|\___/|_| |_|\___(_) .__/ \__, |
|_| |_| |___/
2.31.2
Compiled with OpenMP support (max 10 threads).
Running in phonopy.load mode.
Python version 3.12.3
Spglib version 2.5.0
Crystal structure was read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
Unit of length: angstrom
Displacements creation mode
Number of supercells with random displacements: 1000
Temperatuere to generate random displacements: 300.0
Settings:
Supercell: [2 2 2]
Primitive matrix:
[0. 0.5 0.5]
[0.5 0. 0.5]
[0.5 0.5 0. ]
Spacegroup: Fm-3m (225)
Number of symmetry operations in supercell: 1536
Use -v option to watch primitive cell, unit cell, and supercell structures.
NAC parameters were read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
Displacement-force dataset was read from "phonopy_mlpsscha_params_KCl-120.yaml.xz".
----------------------------- pypolymlp start ------------------------------
Pypolymlp is a generator of polynomial machine learning potentials.
Please cite the paper: A. Seko, J. Appl. Phys. 133, 011101 (2023).
Pypolymlp is developed at https://github.com/sekocha/pypolymlp.
Load MLPs from "phonopy.pmlp".
------------------------------ pypolymlp end -------------------------------
------------------------------- SSCHA start --------------------------------
[ SSCHA initialization (rd=0.001, n_supercells=1000) ]
Evaluate MLP to obtain forces using pypolymlp
Calculate force constants using symfc
SSCHA free energy: -102.786 meV
Initial force constants are written into "phonopy_sscha_fc_0.yaml.xz".
[ SSCHA iteration 1 / 10 ]
Generate 1000 supercells with displacements at 300.0 K
[0.010, 0.084] ****
[0.084, 0.159] ******************
[0.159, 0.234] ***************************
[0.234, 0.309] *************************
[0.309, 0.383] ****************
[0.383, 0.458] *******
[0.458, 0.533] **
[0.533, 0.608] *
[0.608, 0.683]
[0.683, 0.757]
Evaluate MLP to obtain forces using pypolymlp
Calculate force constants using symfc
SSCHA free energy: -98.205 meV
SSCHA force constants are written into "phonopy_sscha_fc_1.yaml.xz".
[ SSCHA iteration 2 / 10 ]
Generate 1000 supercells with displacements at 300.0 K
[0.007, 0.094] *****
[0.094, 0.181] ***********************
[0.181, 0.267] ********************************
[0.267, 0.354] ************************
[0.354, 0.440] ***********
[0.440, 0.527] ***
[0.527, 0.613] *
[0.613, 0.700]
[0.700, 0.787]
[0.787, 0.873]
Evaluate MLP to obtain forces using pypolymlp
Calculate force constants using symfc
SSCHA free energy: -98.221 meV
SSCHA force constants are written into "phonopy_sscha_fc_2.yaml.xz".
...
[ SSCHA iteration 10 / 10 ]
Generate 1000 supercells with displacements at 300.0 K
[0.006, 0.089] *****
[0.089, 0.171] *********************
[0.171, 0.253] *******************************
[0.253, 0.336] *************************
[0.336, 0.418] *************
[0.418, 0.500] ****
[0.500, 0.583] *
[0.583, 0.665]
[0.665, 0.747]
[0.747, 0.830]
Evaluate MLP to obtain forces using pypolymlp
Calculate force constants using symfc
SSCHA free energy: -98.122 meV
SSCHA force constants are written into "phonopy_sscha_fc_10.yaml.xz".
-------------------------------- SSCHA end ---------------------------------
----------------------------------------------------------------------------
One of the following run modes may be specified for phonon calculations.
- Mesh sampling (MESH, --mesh)
- Q-points (QPOINTS, --qpoints)
- Band structure (BAND, --band)
- Animation (ANIME, --anime)
- Modulation (MODULATION, --modulation)
- Characters of Irreps (IRREPS, --irreps)
- Create displacements (CREATE_DISPLACEMENTS, -d)
----------------------------------------------------------------------------
Summary of calculation was written in "phonopy.yaml".
_
___ _ __ __| |
/ _ \ '_ \ / _` |
| __/ | | | (_| |
\___|_| |_|\__,_|
The SSCHA force constants are stored in files named
phonopy_sscha_fc_NUM.yaml.xz, where NUM is the integer corresponding
to the iteration step (0 for the initial force constants). By performing a sufficient number of SSCHA iterations and
using a sufficiently large set of supercells with random displacements at a
given temperature, the SSCHA force constants can be reliably determined. The
convergence of these force constants can be monitored through the SSCHA free
energy. Additionally, convergence can be assessed by plotting the phonon band
structures corresponding to the SSCHA force constants at the iteration
steps. For example:
% for i in {0..10}; do phonopy-load phonopy_sscha_fc_$i.yaml.xz --band auto --band-points 101; mv band.yaml band-$i.yaml; done
% phonopy-bandplot band-{0..10}.yaml --legend
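The SSCHA free energies can also be collected from a saved log to inspect how they change between iterations. The following is a minimal sketch that assumes the log lines have the form "SSCHA free energy: VALUE meV" as in the output above. The helper names free_energies and successive_changes are hypothetical and are not part of phonopy.

```python
import re

FREE_ENERGY = re.compile(r"SSCHA free energy:\s*(-?\d+(?:\.\d+)?)\s*meV")


def free_energies(log_text):
    """Return the SSCHA free energies (meV) in the order they appear in the log."""
    return [float(value) for value in FREE_ENERGY.findall(log_text)]


def successive_changes(values):
    """Return the differences between consecutive entries of the list."""
    return [b - a for a, b in zip(values, values[1:])]


# Lines copied from the log above: initialization, iteration 1, and iteration 2.
log_text = """
 SSCHA free energy: -102.786 meV
 SSCHA free energy: -98.205 meV
 SSCHA free energy: -98.221 meV
"""

energies = free_energies(log_text)
for step, delta in enumerate(successive_changes(energies), start=1):
    print(f"iteration {step}: change = {delta:+.3f} meV")
```

Because each iteration uses a newly generated set of random displacements, the free energy is expected to fluctuate slightly even after convergence, so any tolerance applied to these changes should allow for this sampling noise.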
Parameters for developing MLPs#
A few parameters can be specified using the --mlp-params option for the
development of MLPs. The parameters are provided as a string, e.g.,
% phonopy-load phonopy_params.yaml --pypolymlp --mlp-params="ntrain=80, ntest=20"
Parameters are separated by commas. A brief explanation of the
available parameters can be found in the docstring of PypolymlpParams, which
can be displayed by
In [1]: from phonopy.interface.pypolymlp import PypolymlpParams
In [2]: help(PypolymlpParams)
ntrain and ntest are implemented in phonopy, while the remaining parameters
are directly passed to pypolymlp. Optimizing pypolymlp parameters can be
difficult, both in terms of achieving accuracy and managing the computational
resources required. The current default parameters are likely suitable for
systems up to ternary compounds. For binary systems, the calculations can
generally be run on standard laptop computers, but for ternary systems, around
40 GB of memory or more may be necessary.
For parameter adjustments, it is recommended to consult the pypolymlp documentation and review the relevant research papers.
ntrain and ntest#
This method provides a straightforward dataset split: the first ntrain
supercells from the list are used for training, while the last ntest
supercells are reserved for testing.
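The split described above can be illustrated with plain index lists. This is a sketch of the described behavior only, not phonopy's implementation, and split_dataset is a hypothetical helper. Indices are 0-based positions in the list of supercells.

```python
def split_dataset(n_supercells, ntrain, ntest):
    """Return (train_indices, test_indices): first ntrain and last ntest supercells."""
    indices = list(range(n_supercells))
    train = indices[:ntrain]
    # Slice from an explicit start so that ntest=0 gives an empty test set.
    test = indices[n_supercells - ntest:]
    return train, test


# A dataset of 120 supercells split as in the KCl example above.
train, test = split_dataset(120, ntrain=100, ntest=20)
print(train[0], train[-1], test[0], test[-1])  # prints: 0 99 100 119
```

With ntrain=80 and ntest=20 on the same 120 supercells, the 20 supercells in between are used for neither training nor testing, and the test set stays the same while ntrain is varied.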
Convergence with respect to dataset size#
In general, increasing the amount of data improves the accuracy of representing force constants. Therefore, it is recommended to check the convergence of the target property with respect to the number of supercells in the training dataset. Lattice thermal conductivity may be a convenient property to monitor when assessing convergence.
For example, by preparing an initial set with 100 supercell data, calculations can then be performed by varying the size of the training dataset while keeping the test dataset unchanged as follows:
% phonopy-load --pypolymlp --mlp-params="ntrain=20, ntest=20" --br --mesh 40 phonopy_params.yaml | tee log-20
% phonopy-load --pypolymlp --mlp-params="ntrain=40, ntest=20" --br --mesh 40 phonopy_params.yaml | tee log-40
% phonopy-load --pypolymlp --mlp-params="ntrain=60, ntest=20" --br --mesh 40 phonopy_params.yaml | tee log-60
% phonopy-load --pypolymlp --mlp-params="ntrain=80, ntest=20" --br --mesh 40 phonopy_params.yaml | tee log-80
% phonopy-load --pypolymlp --mlp-params="ntrain=100, ntest=20" --br --mesh 40 phonopy_params.yaml | tee log-100
The computed results (e.g., phonon band structures) are plotted against the size of the training dataset to observe their convergence. If the results have not converged, an additional set of supercell data (e.g., forces and energies in the next 100 supercells) can be computed and included. With this procedure in mind, it may be convenient to generate a sufficiently large number of supercells with random displacements in advance, such as 1000 supercells, before starting the temperature dependent force constants calculation with pypolymlp.
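Once a scalar property has been collected for each training dataset size, the convergence check can be automated. The sketch below is one possible criterion under stated assumptions: the property values are nonzero, and convergence is judged by the relative change between consecutive dataset sizes. The function name converged_size and the default tolerance are hypothetical choices, not something prescribed by phonopy or pypolymlp.

```python
def converged_size(results, rel_tol=0.01):
    """Return the smallest ntrain from which all later relative changes are <= rel_tol.

    results maps the training dataset size (ntrain) to the monitored property value.
    Returns None if the most recent change still exceeds rel_tol, or if fewer
    than two dataset sizes are given.
    """
    sizes = sorted(results)
    answer = None
    for prev, curr in zip(sizes, sizes[1:]):
        change = abs(results[curr] - results[prev]) / abs(results[curr])
        if change <= rel_tol:
            if answer is None:
                answer = prev  # start of a stable run
        else:
            answer = None  # a later jump invalidates the earlier stable run
    return answer
```

The dictionary passed to this function would be filled with values read from the log files, for example with keys 20, 40, 60, 80, and 100 for the runs above. A return value of None suggests that more supercell data should be added before proceeding.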