Hi, I wish to run a nudged elastic band simulation with 14 images. I wonder if my following settings are optimized for an efficient run.
If I use 84 cores (3 nodes * 28 cores-per-node), each image gets 6 cores.
According to the VASP Wiki (wiki/index.php/NPAR), I should set NPAR=2 [i.e., approximately sqrt(number of cores)] on a cluster with a fast network and NPAR=6 on a cluster with a slow network. With 6 cores per image, this corresponds to NCORE=3 and NCORE=1, respectively.
However, I found the following warning/advice in the log file, suggesting that the value of NCORE should be between 4 and sqrt(6), which is roughly 2.
So I guess my question is, which NCORE value should I choose if I have a fast network? NCORE=3 according to VASPwiki, or NCORE=4 (the minimum number in the advice), or NCORE=2 (the upper bound value in the advice)? Thanks a lot.
-----------------------------------------------------------------------------
| |
| WARNING!!! |
| |
| For optimal performance we recommend to set |
| NCORE = 4 - approx SQRT(number of cores). |
| NCORE specifies how many cores store one orbital (NPAR=cpu/NCORE). |
| This setting can greatly improve the performance of VASP for DFT. |
| The default, NCORE=1 might be grossly inefficient on modern |
| multi-core architectures or massively parallel machines. Do your |
| own testing!!!! |
| Unfortunately you need to use the default for GW and RPA |
| calculations (for HF NCORE is supported but not extensively tested |
| yet). |
| |
-----------------------------------------------------------------------------
NPAR or NCORE setting for NEB
- Global Moderator
Re: NPAR or NCORE setting for NEB
Hello!
First, you may consider using fewer images for your NEB simulation. Here is a quote from the VASP Wiki (wiki/index.php/IMAGES):
"The fewer images are used, the faster to convergence to the groundstate is. Often, it is advisable to start with a single image between the two endpoints, and to increase the number of images, once this first run has converged."
14 images looks like a very high number already; your NEB simulation may well work with far fewer. However, sticking to your example with 14 images, note that you may see unexpected parallel performance because not all of your images can rely on intra-node communication. Have a look at this schematic view of your parallel setup, where each "*" denotes one core and "|" represents the node boundaries:
Code:
nodes/cores: |****************************|****************************|****************************|
images:      |(--1-)(--2-)(--3-)(--4-)(--5|-)(--6-)(--7-)(--8-)(--9-)(-|10-)(-11-)(-12-)(-13-)(-14-)|
In the second line I aligned the images (6 cores each). You can easily spot that images 5 and 10 will require communication over node boundaries and therefore (depending on your network) may slow down the entire NEB simulation. This should be avoided, and I suggest using a combination of number of images and/or nodes that allows every image to fit within node boundaries.
Another point is that the recommendation from your warning message
Code:
NCORE = 4 - approx SQRT(number of cores)
is not up to date with the current source code and the Wiki page (this changed about 4 months ago). The current recommendation is (in accordance with the Wiki):
Code:
NCORE = 2 up to number-of-cores-per-socket
Hence, if we stick to your example where you have 6 cores per image available, the recommendation says NCORE=2, 3 or 6 can work, which would correspond to NPAR=3, 2 and 1, respectively. I recommend you test these values in a smaller setting where you are not using 14 images. Start with the configuration of a single image and do a normal VASP test run with different NCORE values. Then use the optimal value for your large NEB simulation.
Maybe another hint from the NCORE Wiki page (wiki/index.php/NCORE) is useful in your case:
"The best value NCORE depends somewhat on the number of atoms in the unit cell. Values around 4 are usually ideal for 100 atoms in the unit cell. For very large unit cells (more than 400 atoms) values around 12-16 are often optimal."
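If it helps, the node-boundary picture and the NCORE/NPAR pairs discussed in this reply can be checked with a few lines of Python before submitting a job. This is only a rough sketch, not anything taken from VASP itself: the function names are made up for illustration, and it assumes that cores are handed to images in contiguous blocks and that nodes are filled one after another, as in the schematic. Your MPI launcher may place ranks differently, so please verify against your actual job layout.

```python
def image_core_ranges(n_images, cores_per_image):
    """Return (first_core, last_core) per image, cores numbered from 0.

    Assumption: each image gets a contiguous block of cores.
    """
    return [(i * cores_per_image, (i + 1) * cores_per_image - 1)
            for i in range(n_images)]


def images_crossing_nodes(n_images, cores_per_image, cores_per_node):
    """1-based indices of images whose cores are spread over two or more nodes.

    Assumption: nodes are filled sequentially with cores_per_node cores each.
    """
    crossing = []
    ranges = image_core_ranges(n_images, cores_per_image)
    for idx, (first, last) in enumerate(ranges, start=1):
        if first // cores_per_node != last // cores_per_node:
            crossing.append(idx)
    return crossing


def ncore_npar_pairs(cores_per_image, min_ncore=2):
    """(NCORE, NPAR) pairs with NCORE * NPAR == cores_per_image, NCORE >= min_ncore."""
    return [(ncore, cores_per_image // ncore)
            for ncore in range(min_ncore, cores_per_image + 1)
            if cores_per_image % ncore == 0]


if __name__ == "__main__":
    # Setup from this thread: 3 nodes * 28 cores, 14 images -> 6 cores per image.
    print(images_crossing_nodes(14, 6, 28))   # [5, 10]
    print(ncore_npar_pairs(6))                # [(2, 3), (3, 2), (6, 1)]
    # Hypothetical alternative: 14 images with 4 cores each on 2 nodes.
    print(images_crossing_nodes(14, 4, 28))   # []
```

An empty list from images_crossing_nodes means every image fits inside a single node under the stated placement assumption. The NCORE candidates are just the divisors of the cores per image; which of them is fastest still has to be found by the single-image test runs suggested in this reply.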