ML_MB in MLFF

Queries about input and output files, running specific calculations, etc.


Moderators: Global Moderator, Moderator

reza_namakian1
Newbie
Posts: 17
Joined: Wed Jun 26, 2024 3:39 pm

ML_MB in MLFF

#1 Post by reza_namakian1 » Wed Dec 11, 2024 9:05 pm

Dear VASP Team,

I am currently working on developing an MLFF for MAPbI₃ and have capped the ML_MB value at 4000. However, I have observed that certain species, such as I and H, might require up to 6000, whereas 2000 seems sufficient for Pb.
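For context, the cap mentioned above is a single global INCAR tag that applies to all species at once. A minimal sketch of the relevant settings (values taken from this post; the surrounding tags are just the usual MLFF training setup, not a complete INCAR):

```
ML_LMLFF = .TRUE.       ! enable machine-learned force fields
ML_MODE  = TRAIN        ! on-the-fly training
ML_MB    = 4000         ! global cap on local reference configurations (all species)
```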

Is there a way to tune the ML_MB value individually for each species?

Thank you in advance for your time and any suggestions you can provide!

Reza.


ferenc_karsai
Global Moderator
Posts: 473
Joined: Mon Nov 04, 2019 12:44 pm

Re: ML_MB in MLFF

#2 Post by ferenc_karsai » Thu Dec 12, 2024 9:40 am

There is currently no way to tune ML_MB separately for each species.
The important thing is the overall cap; I don't think you need to tune the species individually.
Some species, like H, sample many different local environments due to their rapid motion and therefore need significantly more local reference configurations, while others, like Pb, see very similar environments throughout. So it is absolutely fine for the number of local reference configurations to differ between species. I once tried to artificially increase the number of local reference configurations for a given species, and it changed nothing at all in the accuracy. Different numbers of local reference configurations per species are therefore not a problem.

In your case, I don't know under what conditions you are training MAPbI3, but MAPbI3 is quite easy to learn: in our papers we needed only around 1500-2000 local reference configurations for H and around 500 for Pb.


reza_namakian1
Newbie
Posts: 17
Joined: Wed Jun 26, 2024 3:39 pm

Re: ML_MB in MLFF

#3 Post by reza_namakian1 » Thu Dec 12, 2024 3:53 pm

Dear Ferenc,

Thank you for your reply!

I am trying to perform some mechanical testing and fracture analysis on a relatively large supercell of the material.

If I am not mistaken, the memory allocation primarily depends on the `ML_MB` parameter. Is that correct? If `ML_MB` is indeed the dominant factor, I was considering allocating more memory to species with rapidly changing environments, like H and I, rather than slower ones like Pb. My goal is to make full use of the 1 TB of memory available while having the flexibility to decide how it is distributed across species. I hope this makes sense.

On another note, I am a little concerned about the effect of long-range interactions. As I understand it, these interactions may not be directly captured by the MLFF. Do you have any advice on this, particularly during stretching, fracturing, or separating the material? My worry is that once the material separates, the separated parts might not remain charge-neutral, potentially amplifying the impact of long-range interactions.

I really appreciate your thoughts and suggestions on these points!

Reza.


ferenc_karsai
Global Moderator
Posts: 473
Joined: Mon Nov 04, 2019 12:44 pm

Re: ML_MB in MLFF

#4 Post by ferenc_karsai » Fri Dec 13, 2024 1:35 pm

Memory depends on the combination of ML_MB, the number of species, ML_MCONF (the number of training structures), and the number of atoms per structure. The design matrix has size ML_MCONF × [3 × (number of atoms per structure) + 1 + 6] × (number of species) × ML_MB. The covariance matrix has size 3 × [(number of species) × ML_MB]². These are probably the largest arrays, and there are some details about how they are distributed among cores/nodes which I'm not going to explain here.
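The sizes quoted above can be turned into a quick back-of-the-envelope memory estimate. The sketch below assumes the sizes are element counts stored as 8-byte doubles; the example numbers (4 species, 96 atoms, 1500 structures) are illustrative placeholders, not values from this thread:

```python
# Rough memory estimate for the two largest MLFF arrays, using the
# sizes stated above. Assumes 8-byte double-precision storage.
BYTES_PER_DOUBLE = 8

def mlff_memory_gib(ml_mconf, natoms, nspecies, ml_mb):
    """Return (design, covariance) matrix sizes in GiB."""
    basis = nspecies * ml_mb                  # total local reference configurations
    rows = ml_mconf * (3 * natoms + 1 + 6)    # 3N forces + 1 energy + 6 stress per structure
    design = rows * basis * BYTES_PER_DOUBLE
    covariance = 3 * basis ** 2 * BYTES_PER_DOUBLE
    gib = 1024 ** 3
    return design / gib, covariance / gib

# Hypothetical example: 4 species, ML_MB = 4000, 1500 structures of 96 atoms
d, c = mlff_memory_gib(ml_mconf=1500, natoms=96, nspecies=4, ml_mb=4000)
print(f"design matrix ~{d:.1f} GiB, covariance matrix ~{c:.1f} GiB")
```

This makes clear why ML_MB alone does not determine the footprint: the design matrix also scales with ML_MCONF and the atom count, and the covariance matrix scales with the square of (number of species) × ML_MB.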

Regarding long-range effects: they are not included at all in our MLFFs. I can't tell you in advance how large their effect on cracking will be.
For phase transitions, our published results were fine without long-range effects included.

