Symbolic mapping and allocation for the Cholesky factorization on NUMA machines

Abstract

We discuss performance issues of the tiled Cholesky factorization on non-uniform memory access (NUMA) shared-memory machines, and we show how to optimize thread and data placement to improve performance. The resulting implementation is 50% faster than PLASMA and 75% faster than MKL.
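As a rough illustration of why placement matters for a tiled Cholesky factorization, the sketch below enumerates the tasks of the standard right-looking tiled algorithm (kernels POTRF, TRSM, SYRK, GEMM) and counts how many land on each NUMA node under an owner-computes rule. The row-cyclic `owner` mapping and the function names are hypothetical, chosen only for illustration; they are not the mapping or allocation scheme of the article.

```python
from collections import Counter


def tiled_cholesky_tasks(nt):
    """Yield (kernel, written_tile) for a right-looking tiled Cholesky
    factorization of the lower triangle of an nt x nt grid of tiles."""
    for k in range(nt):
        yield ("POTRF", (k, k))          # factor the diagonal tile
        for i in range(k + 1, nt):
            yield ("TRSM", (i, k))       # triangular solve on a panel tile
        for i in range(k + 1, nt):
            yield ("SYRK", (i, i))       # symmetric update of a diagonal tile
            for j in range(k + 1, i):
                yield ("GEMM", (i, j))   # update of an off-diagonal tile


def owner(tile, num_nodes):
    """Hypothetical placement (an assumption, not the article's scheme):
    tile rows are distributed cyclically over the NUMA nodes."""
    row, _ = tile
    return row % num_nodes


def tasks_per_node(nt, num_nodes):
    """Count tasks per node when each task runs on the node that owns
    the tile it writes (owner-computes rule)."""
    load = Counter()
    for _, tile in tiled_cholesky_tasks(nt):
        load[owner(tile, num_nodes)] += 1
    return dict(load)


if __name__ == "__main__":
    print(Counter(kernel for kernel, _ in tiled_cholesky_tasks(4)))
    print(tasks_per_node(4, 2))
```

With such a naive mapping the task counts per node are uneven, because later tile rows receive more updates. This is the kind of imbalance that a deliberate thread and data placement has to address.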

Keywords

performance optimization, symbolic data mapping and clustering, Cholesky factorization, NUMA machine, PLASMA, MKL

References

Agullo, Dongarra, Hadri (2010) PLASMA users guide. Technical Report, Innovative Computing Laboratory, University of Tennessee, TN.

Bosilca, Bouteiller, Danalis (2012) DAGuE: A generic distributed DAG engine for high performance computing. Parallel Computing 38(1–2): 27–51.

Cosnard, Loi (1995) Automatic task graph generation techniques. Parallel Processing Letters 5(4): 527–538.

Cosnard, Loi (1996) A simple algorithm for the generation of efficient loop structures. International Journal of Parallel Programming 24(3): 265–289.

Cosnard, Jeannot, Yang (1999) SLC: Symbolic scheduling for executing parameterized task graphs on multiprocessors. In: International conference on parallel processing (ICPP’99), Aizu Wakamatsu, Japan.

Cosnard, Jeannot, Yang (2004) Compact DAG representation and its symbolic scheduling. Journal of Parallel and Distributed Computing 64(8): 921–935.

Feautrier (1991) Dataflow analysis of array and scalar references. International Journal of Parallel Programming 20(1): 23–53.

Feautrier (1994) Toward automatic distribution. Parallel Processing Letters 4(3): 233–244.

Intel (2012) Intel math kernel library reference manual. Technical report no. 630813-051US. Available at: http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/mklman.pdf (accessed 7 June 2013).

YarKhan, Kurzak, Dongarra (2011) QUARK users’ guide: Queueing and runtime for kernels. Technical report no. ICL-UT-11-02, Innovative Computing Laboratory, University of Tennessee, TN.