In this paper we look at a number of approaches being investigated in the Center for Research on Parallel Computation (CRPC) to develop linear algebra software for high-performance computers. These approaches are exemplified by the LAPACK, templates, and ARPACK projects. LAPACK is a software library for performing dense and banded linear algebra computations, and was designed to run efficiently on high-performance computers. We focus on the design of the distributed-memory version of LAPACK, and on an object-oriented interface to LAPACK.