I'm experiencing a weird problem when running vasp_std from vasp.6.2.0 compiled with the NVIDIA HPC SDK compilers (HPC SDK version 21.3).
The CPU usage blows up for each MPI process, as in the picture below.
The stdout also shows messages that I have not seen with the GPU-compiled version or with vasp5.
To run, I added two directories to LD_LIBRARY_PATH:
/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/comm_libs/openmpi/openmpi-3.1.5/lib
/opt/nvidia/hpc_sdk/Linux_x86_64/21.3/compilers/extras/qd/lib
Is this normal?
I am posting on this board because I think it might be compiler related.
The job does run until the end.
Any advice would be appreciated.
Thank you in advance.
------------------------
makefile.include
------------------------
# Precompiler options
CPP_OPTIONS= -DHOST=\"LinuxPGI\" \
-DMPI -DMPI_BLOCK=8000 -DMPI_INPLACE -Duse_collective \
-DscaLAPACK \
-DCACHE_SIZE=4000 \
-Davoidalloc \
-Dvasp6 \
-Duse_bse_te \
-Dtbdyn \
-Dqd_emulate \
-Dfock_dblbuf
nvpath = /opt/nvidia/hpc_sdk/Linux_x86_64/21.3/compilers/bin
CPP = $(nvpath)/nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)
FC = mpif90
FCL = mpif90 -c++libs
FREE = -Mfree
FFLAGS = -Mbackslash -Mlarge_arrays
OFLAG = -fast
DEBUG = -Mfree -O0 -traceback
# Specify your NV HPC-SDK installation, try to set NVROOT automatically
NVROOT =$(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')
# ...or set NVROOT manually
NVHPC ?= /opt/nvidia/hpc_sdk
NVVERSION = 21.3
NVROOT = $(NVHPC)/Linux_x86_64/$(NVVERSION)
# Use NV HPC-SDK provided BLAS and LAPACK libraries
BLAS = -lblas
LAPACK = -llapack
BLACS =
SCALAPACK = -Mscalapack
LLIBS = $(SCALAPACK) $(LAPACK) $(BLAS)
# Software emulation of quadruple precision
QD ?= $(NVROOT)/compilers/extras/qd
LLIBS += -L$(QD)/lib -lqdmod -lqd
INCS += -I$(QD)/include/qd
# Use the FFTs from fftw
FFTW ?= /xtmp/khs/fftw3/fftw-3.3.8
LLIBS += -L$(FFTW)/lib -lfftw3
INCS += -I$(FFTW)/include
OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
# Redefine the standard list of O1 and O2 objects
SOURCE_O1 := pade_fit.o
SOURCE_O2 := pead.o
# For what used to be vasp.5.lib
CPP_LIB = $(CPP)
FC_LIB = nvfortran
CC_LIB = nvc
CFLAGS_LIB = -O
FFLAGS_LIB = -O1 -Mfixed
FREE_LIB = $(FREE)
OBJECTS_LIB= linpack_double.o getshmem.o
# For the parser library
CXX_PARS = nvc++ --no_warnings
# Normally no need to change this
SRCDIR = ../../src
BINDIR = ../../bin
#================================================
# GPU Stuff
CPP_GPU = -DCUDA_GPU -DRPROMU_CPROJ_OVERLAP -DCUFFT_MIN=28 -UscaLAPACK -Ufock_dblbuf # -DUSE_PINNED_MEMORY
OBJECTS_GPU= fftmpiw.o fftmpi_map.o fft3dlib.o fftw3d_gpu.o fftmpiw_gpu.o
CC = nvc
CXX = nvc++
CFLAGS = -fPIC -DADD_ -mp -cuda -DMAGMA_WITH_MKL -DMAGMA_SETAFFINITY -DGPUSHMEM=300 -DHAVE_CUBLAS
## Use a custom CUDA installation: minimal requirement is CUDA >= 10.X. For "sm_80" you need CUDA >= 11.X.
#CUDA_ROOT ?= /usr/local/cuda
#NVCC := $(CUDA_ROOT)/bin/nvcc
#CUDA_LIB := -L$(CUDA_ROOT)/lib64 -lnvToolsExt -lcudart -lcuda -lcufft -lcublas
# Or use the CUDA installation from the NV HPC-SDK
#CUDA_ROOT ?= $(NVROOT)/cuda/11.0
NVCC := nvcc
CUDA_LIB = -cudalib=cublas,cufft -lnvToolsExt
GENCODE_ARCH := -gencode=arch=compute_60,code=\"sm_60,compute_60\" \
-gencode=arch=compute_70,code=\"sm_70,compute_70\" \
-gencode=arch=compute_80,code=\"sm_80,compute_80\"
MPI_INC = $(NVROOT)/comm_libs/mpi/include
CPU usage explodes when using vasp6.2 cpu version compiled with nvidia hpc-sdk (openacc)
Re: CPU usage explodes when using vasp6.2 cpu version compiled with nvidia hpc-sdk (openacc)
Hi!
This is certainly not normal, but at this point I can only speculate about the cause. What kind of machine are you using? Is it an 8-core machine? Am I right in assuming that you started this run with 8 MPI tasks? Although you are obviously compiling without OpenMP or OpenACC support, some library called from VASP could be creating multiple threads. Can you run again, but first set the number of threads to 1 manually, e.g. with
export OMP_NUM_THREADS=1
The stdout output is particularly weird. Can you tell me whether you get it only on the screen or also in some files?
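For the thread limit suggested above, a slightly fuller launch sketch might look like the following. It is only an illustration: `OPENBLAS_NUM_THREADS` is included on the assumption that the linked BLAS/LAPACK honours it, the rank count of 8 and the binary name `vasp_std` are taken from this thread, and `-x` is Open MPI's option for forwarding an environment variable to all ranks. The script only prints the launch line, so it can be inspected before anything is started.

```shell
#!/bin/sh
# Limit every MPI rank to a single thread before launching.
export OMP_NUM_THREADS=1
# Assumption: only has an effect if the linked BLAS/LAPACK reads this variable.
export OPENBLAS_NUM_THREADS=1

NRANKS=8   # the 8 MPI tasks discussed in this thread

# Build the launch line; -x forwards the named variable to every rank (Open MPI).
LAUNCH="mpirun -np $NRANKS -x OMP_NUM_THREADS -x OPENBLAS_NUM_THREADS vasp_std"

# Print the command instead of running it; execute it yourself once it looks right.
echo "$LAUNCH"
```

Forwarding the variables explicitly is a cautious choice for multi-node runs, where the environment of the launching shell may not reach the remote ranks automatically.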
Re: CPU usage explodes when using vasp6.2 cpu version compiled with nvidia hpc-sdk (openacc)
Hi.
Thanks for the advice. The solution (setting OMP_NUM_THREADS=1) works! The load no longer explodes.
But the stdout remains as above, with the same strange lines. Although I stopped the calculation after a few electronic steps, I have attached the file in case it is of use.
My machine is running Ubuntu 18, and I did indeed run with "mpirun -np 8" on an AMD Threadripper 2990WX (32 cores in total).
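One way to confirm that extra threads were behind the load is to count the threads of each running rank, with and without the setting. The sketch below is Linux-only and assumes the ranks show up under the process name `vasp_std`; the helper `thread_count` is a hypothetical name introduced here, not part of VASP or MPI.

```shell
#!/bin/sh
# Print the number of threads of the process with the given PID (reads /proc, Linux only).
thread_count() {
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Report the thread count of every running vasp_std rank.
# Prints nothing if no such process is running.
for pid in $(pgrep -x vasp_std 2>/dev/null); do
    echo "PID $pid: $(thread_count "$pid") thread(s)"
done
```

The MPI library may start a few helper threads of its own, so the absolute number matters less than the difference between the two runs.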