Benchmarks
This page collects some of the scalability benchmarks performed with EPW.
Scalability of the interpolation part of EPW v6-alpha2 (to be released) on Frontera for MgB2 on one million k and q points.
The calculations were performed using Intel 19.1.1 with Intel MPI and MKL on Intel Xeon Platinum 8280 (“Cascade Lake”) @ 2.7GHz [56 cores per node machine]. The scaling test was done during Texascale Days in December 2020.
Scalability of the interpolation part of EPW v6-alpha1 (to be released) on Summit for MgB2 on 72x72x72 k and q points.
The calculations were performed using IBM XL 16.1.1 with IBM Spectrum MPI and ESSL on IBM Power9 @ 3.07GHz [42 cores per node machine]. The scaling test was done in November 2020.
Scalability of the interpolation part of EPW v6-alpha (to be released) on Frontera for MgB2 on one million k and q points.
The calculations were performed using Intel 19.0.5 with Intel MPI and MKL on Intel Xeon Platinum 8280 (“Cascade Lake”) @ 2.7GHz [56 cores per node machine]. The scaling test was done in March 2020.
Scalability of the interpolation part of EPW v5.0 on MareNostrum 4 for cubic CsPbI3 on a 52,476 k-point grid and 4,559 q-points.
The calculations were performed using Intel 17.4 with Intel MPI, MKL and FFTW on Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz [48 cores per node machine]. The scaling test was done in April 2018.
Scalability of the interpolation part of EPW v4.2 on CSD3 for polar SiC on a 64x64x64 k-point grid and 8x8x8 q-grid.
The calculations were performed using Intel 17.0.4 with Intel MPI and MKL and with “-xAVX -mavx -axCOMMON-AVX512” vectorization flags on Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz. The machine uses an Intel Omni-Path HPC interconnect and a multi-petabyte SSD-accelerated Intel Lustre file system.
Scalability of the interpolation part of EPW v4.1 on ARCHER Cray XC30 for the polar wurtzite GaN.
The calculations were performed using the Intel 15.0.2.164 compiler on a Cray XC30 machine with 12-core Intel Xeon E5-2697v2 (Ivy Bridge) 2.7 GHz processors sharing 64GB of memory and joined by two QPI links, connected via proprietary Cray Aries interconnect (Dragonfly topology). The analysis was performed using Score-P 2.0.2 and Scalasca 2.3.1 instrumentation.
Scalability of EPW v4.0 on SiC using 6 × 6 × 6 Γ-centered coarse k- and q-point grids.
The fine grids on which the Wannier interpolation was performed were a 50 × 50 × 50 k-point grid and a 10 × 10 × 10 q-point grid. The test was performed on an Intel Xeon CPU E5620 with a clock frequency of 2.40 GHz. The codes were compiled using ifort 13.0.1 with the following compilation flags: -O2 -assume byterecl -g -traceback -nomodule -fpp. The MPI parallelization was performed using Open MPI 1.8.1.
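Scaling tests like the ones listed on this page are commonly summarized as speedup and parallel efficiency relative to the smallest run. The sketch below shows that arithmetic in Python. It is not part of EPW: the function name `strong_scaling` is hypothetical, and the core counts and wall times are made-up placeholders, not measurements from any benchmark above.

```python
def strong_scaling(cores, wall_times):
    """Return (speedup, efficiency) relative to the first (smallest) run.

    cores      -- core counts, smallest first
    wall_times -- wall-clock times in seconds, same order as cores
    """
    if not cores or len(cores) != len(wall_times):
        raise ValueError("cores and wall_times must be non-empty and equal in length")
    base_cores, base_time = cores[0], wall_times[0]
    # Speedup: how many times faster each run is than the reference run.
    speedup = [base_time / t for t in wall_times]
    # Efficiency: achieved speedup divided by the ideal speedup (cores / base_cores).
    efficiency = [s * base_cores / c for s, c in zip(speedup, cores)]
    return speedup, efficiency


# Placeholder numbers for illustration only (NOT EPW benchmark data).
cores = [56, 112, 224]
wall_times = [1000.0, 520.0, 280.0]

speedup, efficiency = strong_scaling(cores, wall_times)
for c, s, e in zip(cores, speedup, efficiency):
    print(f"{c:6d} cores  speedup {s:5.2f}  efficiency {e:6.1%}")
```

Using the smallest run as the reference, rather than a single core, is a practical choice when the problem does not fit on one core or one node.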