Design matrix for ML_FF of vasp.6.3


#1 Post by gerbrand_ceder2 » Sun Apr 24, 2022 7:06 pm

I am running AIMD simulations with an ML force field in vasp.6.3, and I frequently encounter the error message "!!! MLFF : Not enough storage for local reference configurations, please increase se ML_MB !!!"

Would it be okay to increase the ML_MB value? It appears to me that more training data is usually good for ML, but I also want to make sure I do not exceed any upper limit recommended by the developers.

#2 Post by alex » Mon Apr 25, 2022 7:52 am

Hello Gerbrand,

a similar issue was discussed here recently: forum/viewtopic.php?f=4&t=18475#p21820

Hth,

alex

#3 Post by ferenc_karsai » Mon Apr 25, 2022 9:33 am

It's not exactly the same problem.

In the other post the user ran out of memory immediately because scaLAPACK was not employed. Without scaLAPACK the code is practically unusable for realistic systems, since each processor needs to hold the entire design matrix, which is a huge object. With scaLAPACK the design matrix is distributed, so its size per processor decreases linearly with the number of processors.

In our current case Gerbrand doesn't run out of memory, but the maximum number of local reference configurations (ML_MB) is reached. The default is ML_MB=1500. This number is usually enough for simple to moderately difficult systems, but for complex systems or training data from different conditions (e.g. Si in its different phases) it is easily exceeded. In that case, simply increase this number.

ML_MB sets the column dimension of the design matrix for each atom type.
The row dimension is ML_MCONF. ML_MCONF refers to the whole training structures (these are exportable to other ML methods), while ML_MB refers to the local reference configurations for specific atoms (these are specific to kernel ridge regression). So the size of the design matrix that will be allocated is ML_MB*ML_MCONF*Number_of_atom_types. Again, with scaLAPACK this array is distributed across all processors, so the more processors one uses, the smaller this array gets per processor.
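
As a rough illustration of the size estimate above, here is a minimal Python sketch. It is not part of VASP: the function name `design_matrix_bytes` and its parameters are hypothetical, it assumes 8 bytes per matrix element (double precision), and it assumes an even split over processors when scaLAPACK is used. The authoritative numbers are the memory estimates printed at the beginning of the ML_LOGFILE.

```python
def design_matrix_bytes(ml_mb, ml_mconf, n_types, n_procs=1,
                        distributed=True, bytes_per_element=8):
    """Rough size estimate for the design matrix (hypothetical helper).

    Uses the formula from the post: ML_MB * ML_MCONF * number of atom types.
    Assumptions (not taken from VASP itself): 8 bytes per element and an
    even split over processors when the matrix is distributed.
    """
    total = ml_mb * ml_mconf * n_types * bytes_per_element
    # Without scaLAPACK every processor holds the full matrix.
    per_proc = total / n_procs if distributed else total
    return total, per_proc


# Example values chosen only for illustration.
total, per_proc = design_matrix_bytes(ml_mb=3000, ml_mconf=1500,
                                      n_types=2, n_procs=64)
print(f"total: {total / 1024**2:.1f} MiB, "
      f"per processor: {per_proc / 1024**2:.2f} MiB")
```

Doubling ML_MB doubles the estimate, while doubling the number of processors halves the per-processor share in the distributed case. The real allocation may differ from this simple product, so treat the result only as an order-of-magnitude check.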

The beginning of the ML_LOGFILE (wiki/index.php/ML_LOGFILE) contains information on the estimated memory.

Most of the arrays, like the design matrix, are statically allocated at the beginning of the code. Why? Because we use shared-memory MPI. When we implemented shared memory using System V, we saw that reallocating shared memory segments led to completely irregular crashes. Shared memory is used for many important arrays, so we ended up using static memory allocations.
