Dear all,
I am using MLFF to learn a PES. I realized that during the learning step VASP executes an MLFF run before the SCF loop starts, and this can take about 1000 times longer than the regular MLFF steps in the process. This happens with very accurate learning settings such as
ML_WTIFOR = 20
ML_CX = -0.1
ML_IALGO_LINREG=1
ML_SION1=0.3
ML_MRB2=12
The MLFF timings just before the SCF loop:
ML FREE ENERGIE OF THE ION-ELECTRON SYSTEM (eV)
---------------------------------------------------
free energy ML TOTEN = -1908.20671935 eV
ML energy without entropy= -1908.20671935 ML energy(sigma->0) = -1908.20671935
MLFF: cpu time 0.0794: real time 0.0807
WAVPRE: cpu time 0.1541: real time 0.1542
FEWALD: cpu time 0.0070: real time 0.0069
ORTHCH: cpu time 4.3136: real time 4.3167
LOOP+: cpu time 579.2458: real time 579.9671
For comparison, a regular MLFF step with the same accurate settings looks like this:
ML FREE ENERGIE OF THE ION-ELECTRON SYSTEM (eV)
---------------------------------------------------
free energy ML TOTEN = -1928.89655573 eV
ML energy without entropy= -1928.89655573 ML energy(sigma->0) = -1928.89655573
MLFF: cpu time 0.0981: real time 0.0995
LOOP+: cpu time 0.1069: real time 0.1094
RANDOM_SEED = 262985622 1728 0
IONSTEP: cpu time 0.0087: real time 0.0087
In my experience, this timing is reduced a lot with fewer atoms and less accurate settings. I think it is mostly related to I/O, as the MLFF times themselves are quite short. Would you have an idea why this step takes so much time and whether it can be alleviated?
The files can be accessed from: https://www.dropbox.com/scl/fo/ubdrs42b ... zjycq&dl=0
Regards,
Burak
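A quick way to locate such outliers is to scan the OUTCAR for LOOP+ lines and compare each one with the median. The following is only a minimal sketch: the function names `loop_times` and `slow_steps` and the factor of 10 are assumptions made for illustration, and the regular expression is based solely on the timing lines quoted above.

```python
import re
from statistics import median

# Matches lines such as "LOOP+: cpu time 579.2458: real time 579.9671".
LOOP_RE = re.compile(r"LOOP\+:\s+cpu time\s+([\d.]+):\s+real time\s+([\d.]+)")


def loop_times(outcar_text):
    """Return the real time of every LOOP+ line, in order of appearance."""
    return [float(m.group(2)) for m in LOOP_RE.finditer(outcar_text)]


def slow_steps(times, factor=10.0):
    """Return (index, time) pairs whose time exceeds factor * median."""
    if not times:
        return []
    base = median(times)
    return [(i, t) for i, t in enumerate(times) if t > factor * base]
```

In practice one would read the text with `open("OUTCAR").read()` and pass it to `loop_times`; the indices returned by `slow_steps` count LOOP+ lines, not ionic steps.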
Long MLFF learning times before SCF loops
-
- Jr. Member
- Posts: 51
- Joined: Thu Apr 06, 2023 12:25 pm
-
- Global Moderator
- Posts: 473
- Joined: Mon Nov 04, 2019 12:44 pm
Re: Long MLFF learning times before SCF loops
Some files needed for further analysis and to reproduce the calculation are missing.
Please send the following: OUTCAR, ML_AB, OSZICAR, ML_LOGFILE.
Re: Long MLFF learning times before SCF loops
Dear Ferenc,
Please find the files here:
https://www.dropbox.com/scl/fo/jk3v9n9g ... zu0om&dl=0
I think the large LOOP+ time is reported for the MLFF step that follows an SCF loop, and that it includes the time spent in the SCF loop of the previous step. I have more or less verified this, but it would be good to have your opinion to be sure.
Regards,
Burak
-
- Global Moderator
- Posts: 249
- Joined: Mon Apr 26, 2021 7:40 am
Re: Long MLFF learning times before SCF loops
Dear Burak,
I had a closer look at this, and there seems to be a minor bug in the OUTCAR output. As you suggested, the timing of the ab initio calculation is incorrectly added to the LOOP+ value of the next ionic iteration, which results in confusing output. For example, consider an ML run with many prediction steps and a single ab initio calculation in between. The ML_LOGFILE may look like this:
...
--------------------------------------------------------------------------------
STATUS 30 interval 7 F T 8 8
BEEF 30 1.14869189E-06 7.97017403E-03 3.17484779E-03 5.68933390E-03 2.01358232E-01 1.48497335E-01
--------------------------------------------------------------------------------
STATUS 31 interval 7 F T 9 9
BEEF 31 1.14999887E-06 8.16317631E-03 3.20769044E-03 5.68933390E-03 1.98323879E-01 1.46492904E-01
--------------------------------------------------------------------------------
STATUS 32 threshold 2 T T 0 10
BEEF 32 1.15492983E-06 8.37160870E-03 3.25291923E-03 5.68933390E-03 1.96233164E-01 1.44612855E-01
--------------------------------------------------------------------------------
STATUS 33 interval 7 F T 1 11
BEEF 33 1.15713436E-06 8.62825755E-03 3.26689721E-03 5.68933390E-03 1.91540138E-01 1.41744708E-01
--------------------------------------------------------------------------------
STATUS 34 interval 7 F T 2 12
BEEF 34 1.12730343E-06 8.76582926E-03 3.18219016E-03 5.68933390E-03 1.83203490E-01 1.34458294E-01
--------------------------------------------------------------------------------
...
Then, in the OUTCAR file we can find the following order of LOOP+ timing statements:
...
--------------------------------------- Ionic step 30 -------------------------------------------
... MLFF prediction ...
MLFF: cpu time 0.1060: real time 0.1061
LOOP+: cpu time 0.1066: real time 0.1067
...
--------------------------------------- Ionic step 31 -------------------------------------------
... MLFF prediction ...
MLFF: cpu time 0.1067: real time 0.1069
LOOP+: cpu time 0.1074: real time 0.1075
...
--------------------------------------- Ionic step 32 -------------------------------------------
... MLFF prediction ...
MLFF: cpu time 0.1175: real time 0.1176
WAVPRE: cpu time 0.0013: real time 0.0013 \
FEWALD: cpu time 0.0030: real time 0.0030 |
ORTHCH: cpu time 0.0278: real time 0.0278 | THIS SHOULD COME AFTER THE AB INITIO CALCULATION!
LOOP+: cpu time 0.1502: real time 0.1503 /
... Ab initio calculation
...
--------------------------------------- Ionic step 33 -------------------------------------------
... MLFF prediction ...
MLFF: cpu time 0.1050: real time 0.1051
LOOP+: cpu time 3.6094: real time 3.6122 | TIMING OF SCF CYCLE IS INCORRECTLY ADDED HERE!
...
--------------------------------------- Ionic step 34 -------------------------------------------
As you can see from my remarks, the LOOP+ timer is incorrectly stopped and written out after the ML prediction in step 32. It seems it is then immediately reset and accumulates the time of the ab initio calculation. Since there is no further LOOP+ output in step 32, we find the timing of the ab initio calculation only afterwards, in step 33.
Thank you for making us aware of this bug; we will fix it by the next release!
All the best,
Andreas Singraber
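Until a fixed release is available, the misplaced timing can be moved back to the ab initio step in post-processing. The sketch below is not part of VASP, and all function names are hypothetical. It makes two assumptions based only on the excerpts above: the first T/F flag on a STATUS line marks a step with an ab initio calculation, and the excess of LOOP+ over MLFF in the following step is the SCF time that belongs to the previous step.

```python
import re

STEP_RE = re.compile(r"-+\s*Ionic step\s+(\d+)\s*-+")
TIME_RE = re.compile(r"(MLFF|LOOP\+):\s+cpu time\s+([\d.]+):\s+real time\s+([\d.]+)")
# Assumption: the first T/F flag on a STATUS line marks an ab initio step.
STATUS_RE = re.compile(r"^\s*STATUS\s+(\d+)\s+\S+\s+\d+\s+([TF])\s+[TF]", re.M)


def parse_outcar(text):
    """Return {ionic step: {"MLFF": real time, "LOOP+": real time}}."""
    timings, step = {}, None
    for line in text.splitlines():
        m = STEP_RE.search(line)
        if m:
            step = int(m.group(1))
            timings[step] = {}
            continue
        m = TIME_RE.search(line)
        if m and step is not None:
            timings[step][m.group(1)] = float(m.group(3))
    return timings


def ab_initio_steps(logfile_text):
    """Return the set of steps whose STATUS line has the first flag set to T."""
    return {int(n) for n, flag in STATUS_RE.findall(logfile_text) if flag == "T"}


def corrected_loop_times(timings, abinitio):
    """Move the excess LOOP+ time of step n+1 back to ab initio step n."""
    fixed = {s: t.get("LOOP+", 0.0) for s, t in timings.items()}
    for s in sorted(abinitio):
        nxt = timings.get(s + 1)
        if s not in fixed or not nxt or "LOOP+" not in nxt or "MLFF" not in nxt:
            continue
        # Assumption: LOOP+ minus MLFF in step n+1 is the SCF time of step n.
        excess = nxt["LOOP+"] - nxt["MLFF"]
        fixed[s] += excess
        fixed[s + 1] -= excess
    return fixed
```

Applied to the excerpt above, this would attribute roughly 3.5 s of the LOOP+ time reported in step 33 to step 32, where the ab initio calculation actually ran.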