We describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. It aims to correlate better with existing codes from the computational science domain and to be representative of their performance. HPCG is meant to help drive computer system design and implementation in directions that will better support future performance improvement.