Tests failing for VASP 6.3.2 on A64FX with Arm and OpenMPI

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.


decarlson
Newbie
Posts: 3
Joined: Wed Nov 16, 2022 3:13 pm

#1 Post by decarlson » Fri Mar 15, 2024 12:53 pm

Hi All,

I'm trying to build VASP 6.3.2 on an A64FX cluster using the Arm 23.10 compilers and ArmPL libraries, with OpenMPI and HDF5.

The build completes successfully, but many of the tests are failing.

My makefile.include is below (and also attached with the test log).

Note that I'm building with a patch for vaspsol, but I encounter these errors even building the non-vaspsol version.

Code:

# Default precompiler options
CPP_OPTIONS = -DHOST=\"FJ-A64FX\" \
              -DMPI -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Duse_bse_te \
              -Dtbdyn \
              -Dfock_dblbuf \
              -D_OPENMP \
              -Dqd_emulate \
              -Dsol_compat

CPP         = cpp -P -traditional-cpp $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)

# N.B.: to cross-compile for A64FX on X86_64 replace mpifrt, frt, fcc, and FCC
#       by mpifrtpx, frtpx, fccpx, and FCCpx, respectively.
FC          = mpif90 -fno-backslash -fopenmp
FCL         = mpif90 -fno-backslash -fopenmp

FREE        = -ffree-form -ffree-line-length-none

FFLAGS      = -w -mcpu=a64fx -fopenmp
OFLAG       = -O2 -g -mcpu=a64fx -armpl
OFLAG_IN    = $(OFLAG)
DEBUG       = -O0 -g
OFLAG_1     = -O1
OFLAG_2     = -O2
OFLAG_3     = -Ofast

ARMPL_DIR = /lustre/software/arm/23.10/armpl-23.10.0_RHEL-8_arm-linux-compiler

BLAS       = -L$(ARMPL_DIR)/lib  -larmpl_lp64_mp
#BLAS        = -lopenblas
LAPACK     = -L$(LIBDIR)/lib  -larmpl_lp64_mp
#LAPACK      = -llapack
SCALAPACK  = -L/lustre/software/scalapack/arm23.10/armpl23.10/2.2.0/lib -lscalapack
BLACS      = /lustre/software/blacs/arm21/BLACS2/LIB/blacs_MPI-LINUX-0.a
QD_LIB     += -L/lustre/software/qd/arm23/2.3.24/lib -lqdmod -lqd
INCS       += -I/lustre/software/qd/arm23/2.3.24/include
INCS       += -I/lustre/software/qd/arm23/2.3.24/include/qd

LLIBS = $(SCALAPACK) $(LAPACK) $(BLAS) $(INCS) $(QD_LIB)

OBJECTS     = fftmpiw.o fftmpi_map.o fft3dlib.o fftw3d.o
OBJECTS_O1 += minimax_dependence.o fftw3d.o fftmpi.o fftmpiw.o
OBJECTS_O2 += fft3dlib.o

# For what used to be vasp.5.lib
CPP_LIB     = $(CPP)
FC_LIB      = $(FC)
CC_LIB      = armclang
#CFLAGS_LIB  = -O3
CFLAGS_LIB  = -O2
FFLAGS_LIB  = $(OFLAG)
FREE_LIB    = $(FREE)

OBJECTS_LIB = linpack_double.o

# For the parser library
CXX_PARS    = armclang++
LLIBS       += -lstdc++ -Lparser -lparser

##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##

# BLAS, LAPACK, and SCALAPACK (mandatory)
#LLIBS      += -SSL2BLAMP -SCALAPACK

FFTW       ?= ${ARMPL_DIR}
LLIBS      += -L$(FFTW)/lib -larmpl_lp64 -larmpl_lp64_mp
INCS       += -I$(FFTW)/include

# HDF5-support (optional but strongly recommended)
CPP_OPTIONS += -DVASP_HDF5
HDF5_ROOT  ?= /lustre/software/hdf5-parallel/openmpi/arm23.10/1.14.3
#/path/to/your/hdf5/installation
#LLIBS      += -L$(HDF5_ROOT)/lib  -Wl,-rpath,$(HDF5_ROOT)/lib -lhdf5_fortran -lhdf5
LLIBS      += -L$(HDF5_ROOT)/lib -lhdf5_fortran -lhdf5
INCS       += -I$(HDF5_ROOT)/include

# For the VASP-2-Wannier90 interface (optional)
#CPP_OPTIONS    += -DVASP2WANNIER90
#WANNIER90_ROOT ?= /path/to/your/wannier90/installation
#LLIBS          += -L$(WANNIER90_ROOT)/lib -lwannier

# For the fftlib library (recommended)
CPP_OPTIONS+= -Dsysv
FCL        += fftlib.o
CXX_FFTLIB  = armclang++ -fopenmp -DFFTLIB_THREADSAFE
INCS_FFTLIB = -I./include -I$(FFTW)/include
LIBS       += fftlib
LLIBS      += -ldl
Does anybody see anything I might be doing wrong or have suggestions for what I should try to solve these failing tests?

Many thanks!
Dave

andreas.singraber
Global Moderator
Posts: 248
Joined: Mon Apr 26, 2021 7:40 am

#2 Post by andreas.singraber » Mon Mar 25, 2024 3:24 pm

Hello Dave!

Unfortunately, I do not have access to an A64FX machine, but I have at least a hint of where the problem may come from. I noticed that VASP reports at the beginning of each test:

Code:

running    4 mpi-ranks, with   12 threads/rank  
So, the tests are running with 12 OpenMP threads per MPI rank. I do not understand why this leads to the observed behavior. However, may I ask you to repeat the failing tests without OpenMP parallelization? Just try with

Code:

export OMP_NUM_THREADS=1
before running the tests. Thank you!
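
A minimal sketch of that rerun, in case it helps. `RANKS` and `TOTAL` are hypothetical helper variables used only for illustration. The value 4 is taken from the "running 4 mpi-ranks" log line above. The commented-out `make test` line assumes the test suite is launched from the VASP root directory, so adjust it to however the tests were started originally.

```shell
#!/bin/sh
# Sketch: force a single OpenMP thread per MPI rank before rerunning the tests.

# Hypothetical helper variable; 4 matches the rank count in the log line above.
RANKS=4

# One OpenMP thread per rank, i.e. no OpenMP parallelization.
export OMP_NUM_THREADS=1

# Cores the run will occupy: ranks times threads per rank.
TOTAL=$((RANKS * OMP_NUM_THREADS))
echo "ranks=${RANKS} threads/rank=${OMP_NUM_THREADS} total=${TOTAL}"

# Assumption: the test suite is started from the VASP root directory.
# make test
```

If the setting took effect, the line VASP prints at the start of each test should report 1 thread per rank instead of 12. When ranks are launched on remote nodes with Open MPI, the variable may need to be forwarded explicitly, for example with `mpirun -x OMP_NUM_THREADS`.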

All the best,
Andreas Singraber

decarlson
Newbie
Posts: 3
Joined: Wed Nov 16, 2022 3:13 pm

#3 Post by decarlson » Thu Mar 28, 2024 6:37 pm

Hi Andreas,

Many thanks for your response. Yes, I will rerun the tests with only a single thread per rank and report back on the results.

Thanks!
Dave
