ML on system with H2O
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
ML on system with H2O
I have been exploring the ML force fields in VASP 6.3.0. While they produce reasonable results for heavier systems like GeTe, I find that the force statistics in particular are significantly reduced compared to an ab-initio calculation for a system with graphene and a few tens of H2O molecules. I suspect that the ML parameters need tuning. In particular, can anyone suggest how to tune parameters such as ML_SION1 for a system containing H2O with bond lengths around 1 Angstrom?
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: ML on system with H2O
Please post your calculations here (POSCAR, POTCAR, KPOINTS, INCAR, ML_LOGFILE, OUTCAR and ML_AB file that you get).
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
I have posted the input files. Unfortunately, the OUTCAR and ML_LOGFILE files are quite large and will be difficult to post. I will try uploading the ML_LOGFILE in a separate reply.
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
The OUTCAR file is too big to post (as is the ML_LOGFILE). I have posted a Dropbox link to the gzipped ML_LOGFILE, but I don't have enough space on Dropbox to post the OUTCAR file.
https://www.dropbox.com/s/k6d69hcufhqma ... LE.gz?dl=0
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
The system composition is H32 C32 O16. It consists of graphene with some water molecules above it. It is a toy system to learn more about ML before starting actual calculations. The input files (INCAR, KPOINTS, POTCAR, POSCAR) are identical for the two runs, except for the addition of ML-related tags and the removal of MAXMIX and the mixing tags in the ML run. Note that the standard deviation of the forces is significantly larger for the ab-initio run.
For the ab-initio simulation, here are some simple aggregate statistics by atom type:
Hydrogen Mean Forces: Fx: -0.0001 Fy: 0.0003 Fz: -0.0004
Carbon Mean Forces: Fx: -0.0015 Fy: 0.0040 Fz: -0.0041
Oxygen Mean Forces: Fx: -0.0020 Fy: 0.0051 Fz: -0.0056
Hydrogen std Forces: Fx: 1.2350 Fy: 1.2406 Fz: 1.3687
Carbon std Forces: Fx: 0.6336 Fy: 0.6440 Fz: 0.7802
Oxygen std Forces: Fx: 2.4204 Fy: 2.4183 Fz: 2.8169
Here are the same results for the machine learning run (same input without MAXMIX and mixing tags)
Hydrogen Mean Forces: Fx: 0.0000 Fy: 0.0000 Fz: 0.0000
Carbon Mean Forces: Fx: 0.0000 Fy: -0.0000 Fz: -0.0000
Oxygen Mean Forces: Fx: 0.0000 Fy: -0.0000 Fz: -0.0000
Hydrogen std Forces: Fx: 0.0925 Fy: 0.0918 Fz: 0.0786
Carbon std Forces: Fx: 0.0188 Fy: 0.0186 Fz: 0.0208
Oxygen std Forces: Fx: 0.0851 Fy: 0.0835 Fz: 0.0848
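For anyone who wants to reproduce this kind of per-species summary, here is a minimal sketch of the aggregation. It assumes the forces have already been extracted into a list of (Fx, Fy, Fz) tuples with a matching list of species labels; the function name `force_stats`, the use of the population standard deviation, and the sample numbers are illustrative assumptions, not the script used for the tables above.

```python
from statistics import fmean, pstdev


def force_stats(species, forces):
    """Group force vectors by species and return per-component mean and std.

    species: list of labels such as "H", "C", "O" (one per force vector)
    forces:  list of (Fx, Fy, Fz) tuples in the same order
    """
    grouped = {}
    for label, force in zip(species, forces):
        grouped.setdefault(label, []).append(force)

    stats = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))  # transpose into Fx, Fy, Fz columns
        stats[label] = {
            "mean": tuple(fmean(col) for col in columns),
            # Population standard deviation is an assumption of this sketch.
            "std": tuple(pstdev(col) for col in columns),
        }
    return stats


# Made-up forces, only to show the call; not data from the runs above.
species = ["H", "H", "O", "O"]
forces = [(1.0, 0.0, -1.0), (-1.0, 0.0, 1.0), (2.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
stats = force_stats(species, forces)
for label, s in stats.items():
    print(label, "mean:", s["mean"], "std:", s["std"])
```

Running the same function on the forces of both the ab-initio and the ML trajectory gives directly comparable numbers, provided both sets are collected over the same steps.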
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
I was curious to know whether you were able to download the files correctly. If you still need the OUTCAR file, please let me know. I can send it in parts or look into other ways to forward it if necessary.
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: ML on system with H2O
Thanks, I have received the ML_LOGFILE. I will need some time to have a proper look at it.
What I immediately see is:
You don't have enough local reference configurations and training structures (16 training structures, see the 5th line of the ML_AB file).
In the first 16 calculations, local reference configurations and training structures are picked up, and the new threshold for judging the Bayesian error is recalculated from the previous Bayesian errors. You can quickly check that by typing "grep ERR ML_LOGFILE". "ERR" contains the real errors of the forces compared to the ab-initio data and is only calculated when the force field is updated due to learning. Unfortunately, at the end of the 16th step the new threshold for the Bayesian error is estimated too high ("grep THRUPD ML_LOGFILE"), and from then on every new configuration has a maximum force error that is below that threshold ("grep BEEF ML_LOGFILE"), so no further learning is triggered. This can sometimes happen because the automatic determination of the new threshold relies on the previous values, and at the beginning of the calculation those values have poor statistics.
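If grep is not at hand, the same checks can be sketched in a few lines of Python. This assumes only that each record in the ML_LOGFILE starts with its tag (ERR, THRUPD, BEEF, ...) as the first whitespace-separated token; the helper name `lines_with_tag` and the sample text are made up for illustration and do not reproduce the real column layout.

```python
def lines_with_tag(log_text, tag):
    """Return the lines whose first whitespace-separated token equals `tag`."""
    return [line for line in log_text.splitlines() if line.split()[:1] == [tag]]


# Placeholder content standing in for a real ML_LOGFILE; the numbers are invented.
sample = """\
# ERR header line (placeholder)
ERR 1 0.1 0.2
BEEF 1 0.3
THRUPD 16 0.5
ERR 2 0.1 0.2
"""

for tag in ("ERR", "THRUPD", "BEEF"):
    print(tag, len(lines_with_tag(sample, tag)))
```

Unlike a plain grep, this matches the tag only at the start of a line, so comment lines that merely mention the tag are skipped. To use it on a real run, read the file first, for example with `open("ML_LOGFILE").read()`.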
One way to circumvent this problem is to learn via heating runs. Since the errors grow with increasing temperature, one is less likely to get stuck below the errors of the previous temperatures. I also saw that you run at 300 K. Is that the target temperature for your production runs? If so, I would advise also learning at temperatures above that, since higher temperatures sample a larger number of configurations and you are less likely to extrapolate with your force field in the production runs.
So you could try to set for example:
TEBEG=200
TEEND=500
Please also send me your POSCAR and POTCAR files. I need to check everything and also be consistent with your calculation.
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
Dear Ferenc,
Thank you for your reply. I can see the problem. I will try running the ML again with a temperature ramp as you suggest. I would really like to use ML because, for the test system (see my other post) of 428 atoms consisting of graphene and water, the speed-up over the ab-initio calculation is between two and three orders of magnitude. Strangely enough, running the same simulation on a GPU is still slow, with the odd result that the individual SCF steps take about 1 second, but the update steps take 10 minutes or so. It looks like ML is the way to go, but it is critical that I train the force field correctly. I will get back to you with the results of the temperature ramp and the associated forces.
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: ML on system with H2O
Yes, the speed-up of a pure ML force field (ML_ISTART=2) compared to ab-initio MD is currently around 1000-10000. We are working on optimizing the code to make the ML significantly faster still (there is plenty of room for optimization, so I'm very optimistic about that).
The ML code is not ported to GPUs at the moment, so please don't try that.
Also don't forget that you can train on a smaller system but run the force field on much larger systems.
Additionally, in our interface pinning calculations (R. Jinnouchi, F. Karsai, and G. Kresse, Phys. Rev. B 100, 014105 (2019)), we saw that it is sufficient to train on the liquid and solid systems separately (with under 100 atoms each) and then run the pure ML force field calculations on the interface with several hundred atoms. This way one can avoid huge ab-initio training calculations on the interface, or at least collect only a few training structures on the interface, and the interface cell used for training can also be kept much smaller than in the production runs. But be cautious and test it for your system, because I cannot confirm that this holds in general. We observed it in the few interface systems we have studied so far.
We have some experience with water. I will talk to the boss, and if it is OK I will upload our training structures and conditions. Of course these are for pure water, so you may need to adjust them for your system.
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
Dear Ferenc,
Thank you for your quick response. As I mentioned in my last note, I am following up on your suggestion to improve the training set by varying the temperature. I am using a system of about 100 atoms for the training, and the training process is quite fast at this system size. I had another concern after reading about the machine learning parameters in the VASP wiki, namely that the half-width of the Gaussians used in the fitting (ML_SION1) defaults to 0.5 Angstrom. In water the bond is quite a bit shorter (about 1 Angstrom) than in typical solids, so I was wondering if it would be wise to decrease ML_SION1 from 0.5 Angstrom to something smaller (0.25 Angstrom, for example). Is this concern unwarranted?
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
Dear Ferenc,
I noticed you asked for the POSCAR and POTCAR files. I am attaching them to this note. Note there is nothing special regarding the initial configuration of the water molecules. They were placed by a Monte Carlo algorithm requiring only a minimum spacing from other atoms.
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: ML on system with H2O
The Gaussian width (ML_SION) is usually more of a problem for the fitting: especially with larger values, the condition number of the problem gets worse. For ML_SION lower than the default, however, the number of radial basis functions for the descriptors may need to be increased. This results in higher computational demand.
We have a wiki article on how to get accurate force fields:
wiki/index.php/Machine_learning_force_f ... ns:_Basics
I would maybe first start by learning pure water.
Water is not an easy system to learn (liquids are generally hard to learn).
Maybe you can try
ML_IALGO_LINREG=1; ML_SION1=0.3; ML_MRB1=12
Please expect around 1000-2000 training structures (ML_MCONF) and several thousand local reference configurations (ML_MB).
I think after that you should try the water-graphene interface.
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O
I have carried out ML with only H2O molecules in a big box and then tried to continue the training with a graphene sheet in the cell. I used 5000 steps for the H2O-in-a-box training while ramping the temperature from 200 to 500 K. This generated a total of 21 configurations according to a grep of the ML_LOGFILE. As the H-O bond is only about 1 Angstrom long, I reduced ML_SION1 and ML_SION2 to 0.3 Angstrom, as I was worried that the probability density distribution might otherwise not pick up changes to the water molecules correctly. I was also under the impression that a temperature ramp together with smaller ML_SION values would increase the number of configurations found, but in 5000 steps only 21 configurations were found. What is a reasonable number of configurations, and how can I go about achieving it? I have attached the various input files and the ML_LOGFILE for reference.
LCONF 1 H 0 66 O 0 33
LCONF 2 H 65 131 O 32 65
LCONF 3 H 131 197 O 65 98
LCONF 4 H 195 261 O 97 130
LCONF 5 H 255 321 O 127 160
LCONF 6 H 308 374 O 151 184
LCONF 7 H 361 427 O 176 209
LCONF 8 H 403 469 O 194 227
LCONF 9 H 438 504 O 207 240
LCONF 10 H 471 537 O 224 257
LCONF 20 H 504 525 O 237 250
LCONF 21 H 517 582 O 245 278
LCONF 22 H 573 635 O 276 308
LCONF 23 H 628 694 O 306 339
LCONF 24 H 688 754 O 336 369
LCONF 25 H 749 815 O 367 400
LCONF 26 H 809 870 O 395 427
LCONF 27 H 859 917 O 421 449
LCONF 37 H 905 912 O 443 448
LCONF 608 H 2015 2020 O 1006 1006
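A small sketch for summarizing LCONF lines like the ones above. It assumes each line reads as: tag, MD step, then for every species its name followed by the number of local reference configurations before and after that step. That column meaning is inferred from this thread rather than taken from the documentation, and the function name `parse_lconf` is made up.

```python
def parse_lconf(line):
    """Parse one LCONF line into (step, {species: (before, after)}).

    Returns None for lines that do not start with the LCONF tag.
    The column meaning is an assumption inferred from the listing above.
    """
    tokens = line.split()
    if tokens[:1] != ["LCONF"]:
        return None
    step = int(tokens[1])
    rest = tokens[2:]
    counts = {}
    # Species entries come in triples: name, count before, count after.
    for i in range(0, len(rest) - 2, 3):
        counts[rest[i]] = (int(rest[i + 1]), int(rest[i + 2]))
    return step, counts


# Three lines copied from the listing above.
lines = [
    "LCONF 1 H 0 66 O 0 33",
    "LCONF 37 H 905 912 O 443 448",
    "LCONF 608 H 2015 2020 O 1006 1006",
]
records = [r for r in map(parse_lconf, lines) if r is not None]
last_step, last_counts = records[-1]
print("LCONF entries:", len(records))
print("last learning step:", last_step)
for name, (before, after) in last_counts.items():
    print(name, "local reference configurations:", after)
```

The number of parsed entries corresponds to the number of LCONF lines a grep would return, and the last entry gives the current size of the basis set per species under the stated assumption.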
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: ML on system with H2O
For liquids I would expect several hundred to a few thousand training structures and a few thousand local reference configurations.
So your water calculation seems quite unlucky: you seem to have stumbled upon a water setup where the current automatic threshold determination fails.
We currently have some changes to the default settings in the latest developer version that improve the stability of the determination of the criteria. These settings will be in the next update, 6.3.1.
Give us some time to check these calculations ourselves, and then we will get back to you.
So first we will try to fix your pure water calculation, and only after that should you try the interface.
- paulfons
- Jr. Member
- Posts: 85
- Joined: Sun Nov 04, 2012 2:40 am
- License Nr.: 5-1405
- Location: Yokohama, Japan
- Contact:
Re: ML on system with H2O with 6.3.1
Dear Ferenc,
I built and installed VASP 6.3.1 and tried running the H2O learning process. I realize that you may be working on confirming the changes yourself, but I thought I would give you an update. The input files are attached. Compared with version 6.3.0, the ML sampling seems to have increased, but the run hangs with the temperature becoming undefined; I assume this is because the force field is incorrect. I varied the temperature in this run from 200 to 500 K, as you suggested earlier, to improve the configuration sampling.
I also had another question. How do I confirm the number of reference configurations and training structures? The ML_LOGFILE seems to give the number of local reference configurations, but what did you mean earlier by "training structures", and how can I check their number?
I note that in this run (which hung due to the apparent temperature problem), the number of reference configurations seems to have been 841 when the job hung. If things were working normally, I assume I should aim for about 3000 reference configurations. Is this correct?
If you are still working on confirming the H2O training, please accept my apologies for jumping the gun, but I thought it would be a good idea to try it out. Would it be helpful to change the ML_CX parameter to a negative value?
LCONF 24 H 623 686 O 303 335
LCONF 25 H 678 738 O 331 361
LCONF 26 H 731 790 O 355 384
LCONF 27 H 785 837 O 377 402
LCONF 38 H 830 841 O 398 408
OSZICAR tail
38 T= 2666. E= -.34729491E+03 F= -.42015726E+03 E0= -.42015726E+03 EK= 0.33776E+02 SP= 0.14E+02 SK= 0.25E+02
39 T= 7226. E= -.30198113E+03 F= -.47947883E+03 E0= -.47947883E+03 EK= 0.91531E+02 SP= 0.16E+02 SK= 0.70E+02
40 T= 3020. E= -.29036222E+03 F= -.47911317E+03 E0= -.47911317E+03 EK= 0.38262E+02 SP= 0.18E+02 SK= 0.13E+03
41 T= 1027. E= -.27286507E+03 F= -.47950076E+03 E0= -.47950076E+03 EK= 0.13013E+02 SP= 0.20E+02 SK= 0.17E+03
42 T= 314. E= -.24654575E+03 F= -.47984964E+03 E0= -.47984964E+03 EK= 0.39803E+01 SP= 0.23E+02 SK= 0.21E+03
43 T= 98. E= -.20538318E+03 F= -.48025734E+03 E0= -.48025734E+03 EK= 0.12469E+01 SP= 0.26E+02 SK= 0.25E+03
44 T= 38. E= -.12965465E+03 F= -.48066793E+03 E0= -.48066793E+03 EK= 0.48265E+00 SP= 0.30E+02 SK= 0.32E+03
45 T= 19. E= 0.91974487E+02 F= -.48102718E+03 E0= -.48102718E+03 EK= 0.23731E+00 SP= 0.34E+02 SK= 0.54E+03
46 T= 4. E= 0.22036670E+05 F= -.48129367E+03 E0= -.48129367E+03 EK= 0.54099E-01 SP= 0.39E+02 SK= 0.22E+05
47 T= ****** E= 0.19601427+122 F= -.48107354E+03 E0= -.48107354E+03 EK= 0.14851E+57 SP= 0.58E+02 SK= 0.20+122
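A runaway like the one in this OSZICAR tail can be caught early with a small check. The sketch below assumes MD summary lines of the form "step T= value ..." as shown above; the function name `first_runaway_step` and the 1500 K limit (three times the 500 K end of the ramp) are arbitrary choices for illustration.

```python
import re

# Matches MD summary lines such as "38 T= 2666. E= ...".
T_LINE = re.compile(r"^\s*(\d+)\s+T=\s*(\S+)")


def first_runaway_step(oszicar_text, t_max):
    """Return the first MD step whose temperature exceeds t_max or is unreadable.

    An overflowed field such as "******" counts as a runaway.
    Returns None if every temperature is at or below t_max.
    """
    for line in oszicar_text.splitlines():
        match = T_LINE.match(line)
        if match is None:
            continue  # not an MD summary line
        step = int(match.group(1))
        try:
            temperature = float(match.group(2))
        except ValueError:
            return step  # field overflowed, e.g. "******"
        if temperature > t_max:
            return step
    return None


# Shortened copies of lines from the tail above.
tail = """\
38 T= 2666. E= -.34729491E+03
46 T= 4. E= 0.22036670E+05
47 T= ****** E= 0.19601427+122
"""
print(first_runaway_step(tail, t_max=1500.0))
```

With this limit the check already fires at step 38, well before the temperature field overflows at step 47.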