NCORE Values on a 128-Atom

cgtetsas
Newbie
Posts: 1
Joined: Tue May 16, 2023 3:02 pm

NCORE Values on a 128-Atom

#1 Post by cgtetsas » Thu Feb 22, 2024 3:43 pm

Hello everyone,

I'm currently working on a computational project that involves a system
with 128 atoms running on VASP 6.4.2. I can access two nodes, each with 32
cores for my calculation. I've been experimenting with various NCORE
values, specifically 2, 4, and 8, to optimize my VASP calculations. Despite
these adjustments, I'm encountering errors that I haven't been able to
resolve.
"
srun: error: gra153: tasks 0,26,28: Out Of Memory

Some of the step tasks have been OOM Killed.
"
Could anyone share insights or suggestions on how to address these issues?
Any advice on optimal NCORE settings for a system of this size or common
pitfalls to watch out for would be greatly appreciated.

Thank you in advance for your assistance!




INCAR:

# VASP Input File
# SYSTEM: INCAR-For-SCF

SYSTEM = INCAR-For-SCF # Title of the SYSTEM

# Electronic Relaxation
ISTART = 1 # Read existing wavefunction, if available (else set ISTART=0)
ICHARG = 2 # Default: ICHARG=2 (read CHGCAR if available)
ISMEAR = -5 # Gaussian smearing type (0 for semiconductors/insulators)
SIGMA = 0.05 # Smearing value (0.03-0.05 eV for semiconductors/insulators)
ALGO = VeryFast # Fast algorithm selection (IALGO=48, RMM-DIIS)
PREC = ACC # Precision mode
ENCUT = 400.0 # Plane-wave cutoff energy (ENMAX from POTCAR)
NELM = 300 # Max number of electronic self-consistency steps
NELMIN = 6 # Min number of electronic self-consistency steps
EDIFF = 1E-06 # SCF energy convergence criterion (in eV)
LREAL = AUTO # Projection operators (Auto for automatic selection)

# Ionic Relaxation
IBRION = 2 # conjugate gradient (SCF Calculation)
POTIM = 0.5 # Time step (0.5 if IBRION=1, 2, 3 for ionic relaxation)
ISIF = 4 # Stress tensor calculation flag (Ions and cell shape, fixed volume)
NSW = 100 # Max number of ionic steps (0 for SCF Calculation)
EDIFFG = -0.02 # Ionic convergence criterion (eV/Å)

# Symmetry and Spin
ISYM = 0 # Symmetry usage in calculations (0 disables symmetry)
ISPIN = 2 # Spin polarization (1 for non-spin-polarized, 2 for spin-polarized)

# Output Controls
LWAVE = TRUE # Write WAVECAR (wavefunction)
LCHARG = TRUE # Write CHGCAR (charge density)
LORBIT = 11 # Output for magnetization density

# Advanced Settings
IDIPOL = 3 # Monopole/dipole/quadrupole corrections
LMAXMIX = 4 # Mixing parameter (4 for d-elements, 6 for f-elements)
ADDGRID = TRUE # Additional FFT grid
NELMDL = -20 # Initial electronic minimization
AMIX = 0.2 # Linear mixing parameter
BMIX = 0.00001 # Kerker mixing parameter
AMIX_MAG = 0.8 # Magnetic mixing parameter
BMIX_MAG = 0.00001 # Magnetic Kerker parameter
#IVDW = 11 # Van der Waals correction method (DFT-D3 method)
GGA = PE # Generalized Gradient Approximation (PBE)

# Magnetisation
LNONCOLLINEAR = F
MAGMOM= 26*0 256*0 25*0 25*5.0000 26*5.0000

# NCORE and NPAR Settings
NCORE =4 # Number of cores per band
#NPAR =2
LPLANE = .TRUE.
LSCALU = .FALSE.
#NSIM = 4
KPAR =2
#NBANDS = 768




KPOINTS:
K-Points
0
Monkhorst Pack
4 4 4
0 0 0

ferenc_karsai
Global Moderator
Posts: 473
Joined: Mon Nov 04, 2019 12:44 pm

Re: NCORE Values on a 128-Atom

#2 Post by ferenc_karsai » Thu Feb 22, 2024 4:00 pm

srun: error: gra153: tasks 0,26,28: Out Of Memory

Some of the step tasks have been OOM Killed.
It seems that your job runs out of memory and Slurm kills the affected tasks. Changing NCORE will not bring down the required memory.
Try reducing the system and its parameters (number of atoms, KPOINTS, ENCUT, etc.) just to test whether the error goes away.
This system size may simply be too large for the memory you have available.
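
To get a feeling for which of these parameters matters most, a rough estimate of the plane-wave coefficient storage can help. The Python sketch below is only a back-of-the-envelope illustration and not how VASP accounts for memory: it counts the plane waves inside the cutoff sphere with the free-electron relation (hbar^2/2m_e is about 3.81 eV*Angstrom^2) and multiplies by bands, k-points, spin channels and 16 bytes per complex coefficient. The cell volume and the k-point count are hypothetical placeholders, since the thread does not give them; 768 bands is taken from the commented-out NBANDS line in the INCAR. Projectors, FFT grids and work arrays are ignored, so the real requirement is higher.

```python
import math

# hbar^2 / (2 m_e) in eV * Angstrom^2
HBAR2_OVER_2M = 3.80998


def plane_waves(volume_A3, encut_eV):
    """Approximate number of plane waves inside the cutoff sphere."""
    k_cut = math.sqrt(encut_eV / HBAR2_OVER_2M)  # cutoff radius in 1/Angstrom
    return volume_A3 * k_cut**3 / (6.0 * math.pi**2)


def wavefunction_bytes(volume_A3, encut_eV, nbands, nkpts, ispin=2):
    """Rough storage for all coefficients as complex doubles (16 bytes each)."""
    return plane_waves(volume_A3, encut_eV) * nbands * nkpts * ispin * 16


# Hypothetical inputs: the 2000 A^3 volume and 32 k-points are assumptions,
# not values taken from the post.
full = wavefunction_bytes(2000.0, 400.0, nbands=768, nkpts=32)
test = wavefunction_bytes(2000.0, 300.0, nbands=768, nkpts=4)

print(f"full setup : {full / 1024**3:.1f} GiB")
print(f"test setup : {test / 1024**3:.1f} GiB")
```

In this estimate the storage grows linearly with the number of k-points and bands and as ENCUT^1.5, so a test run with a coarser k-mesh is usually the quickest way to check whether memory is the problem.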

For good NCORE values please visit the wiki page (https://www.vasp.at/wiki/index.php/NCORE). There it says, for large cells:
Ideally, NCORE should be a factor of the number-of-cores-per-socket (or number-of-cores-per-node), since this reduces communication between the sockets or nodes.
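
For the setup in the original post (two nodes with 32 cores each, KPAR = 2), this rule can be tabulated with a short script. It is only a sketch: it assumes one MPI rank per core and uses the relation NPAR = ranks / (KPAR * NCORE); the helper name ncore_candidates is made up for illustration. It lists the values that divide evenly and does not predict which one is fastest, and none of them will cure the out-of-memory error.

```python
def ncore_candidates(cores_per_node, nodes, kpar=1):
    """Return (NCORE, NPAR) pairs where NCORE divides the cores per node
    and the ranks available to each k-point group."""
    total_ranks = cores_per_node * nodes  # assumes one MPI rank per core
    if total_ranks % kpar != 0:
        raise ValueError("KPAR must divide the total number of ranks")
    ranks_per_kgroup = total_ranks // kpar

    pairs = []
    for ncore in range(1, cores_per_node + 1):
        if cores_per_node % ncore == 0 and ranks_per_kgroup % ncore == 0:
            pairs.append((ncore, ranks_per_kgroup // ncore))
    return pairs


# Two nodes with 32 cores each and KPAR = 2, as in the original post.
for ncore, npar in ncore_candidates(32, 2, kpar=2):
    print(f"NCORE = {ncore:2d} -> NPAR = {npar:2d}")
```

The values already tried in the post (2, 4 and 8) all appear in this list, which is consistent with the errors coming from memory rather than from the choice of NCORE.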
