Abstract
Sparse LU factorization is a fundamental operation in circuit simulation, and its efficiency directly affects overall simulation performance, particularly for large-scale circuits. Demand for high-performance simulation of radio frequency (RF) circuits is growing, driven by the proliferation of advanced wireless communication technologies such as 5G and Wi-Fi, which makes optimizing RF circuit simulation crucial. RF simulation matrices, while typically sparse, exhibit a distinctive structure characterized by dense blocks. This structural pattern has been underexplored in prior work, so existing solvers make suboptimal use of available computational resources. In this paper, we address this gap by proposing a novel blocked format for the L and U factors in sparse LU factorization, explicitly tailored to the block structure inherent in RF matrices. The format represents the data objects of LU factorization more efficiently by preserving and exploiting the spatial locality of RF matrices. We then redesign the sparse LU factorization algorithm around the proposed blocked storage format. The algorithm leverages the data locality present in RF matrices, which reduces memory transactions and minimizes the costly indirect memory accesses that typically degrade performance. We also streamline the data format transformation to remove redundant data movement, which mitigates memory-bound operations. Furthermore, we convert vector-based operations into matrix operations, which significantly enhances data reuse and enables more efficient data-level parallelism. By aligning computational patterns with the underlying memory hierarchy, our method improves computational efficiency. Experimental results demonstrate that our approach substantially outperforms existing state-of-the-art implementations, providing improved support for high-performance, large-scale RF circuit simulation.
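
As a rough illustration of block-level storage and matrix-level updates, the following is a minimal sketch and not the paper's actual format or algorithm. It assumes square blocks of one fixed size, no pivoting (every pivot inside each diagonal block must be nonzero), and a hypothetical dictionary-of-blocks layout. The names `block_lu`, `to_dense`, and the helper functions are illustrative only.

```python
# Sketch only: right-looking block LU on a block-sparse matrix, without pivoting.
# Assumed layout: blocks[(bi, bj)] is a dense bs x bs block (list of lists);
# block positions missing from the dictionary are treated as all zero.

def dense_lu(a):
    """Factor one dense block in place: unit-lower L below the diagonal, U on and above."""
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a

def solve_upper_right(a, lu):
    """Overwrite a with X where X @ U = a, and U is the upper part of lu."""
    n = len(lu)
    for row in a:
        for c in range(n):
            row[c] = (row[c] - sum(row[p] * lu[p][c] for p in range(c))) / lu[c][c]

def solve_lower_left(lu, a):
    """Overwrite a with Y where L @ Y = a, and L is the unit-lower part of lu."""
    n = len(lu)
    for r in range(n):
        for c in range(n):
            a[r][c] -= sum(lu[r][p] * a[p][c] for p in range(r))

def matmul_sub(c, a, b):
    """c -= a @ b: one block-level (matrix-matrix) update instead of many vector updates."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            c[i][j] -= sum(a[i][p] * b[p][j] for p in range(n))

def block_lu(blocks, nb, bs):
    """Factor an nb x nb grid of bs x bs blocks in place and return the same dictionary."""
    for k in range(nb):
        diag = dense_lu(blocks[(k, k)])
        rows = [i for i in range(k + 1, nb) if (i, k) in blocks]
        cols = [j for j in range(k + 1, nb) if (k, j) in blocks]
        for i in rows:
            solve_upper_right(blocks[(i, k)], diag)   # L_ik = A_ik * inv(U_kk)
        for j in cols:
            solve_lower_left(diag, blocks[(k, j)])    # U_kj = inv(L_kk) * A_kj
        for i in rows:
            for j in cols:
                # Fill-in appears as a whole new block, so indexing stays per block.
                target = blocks.setdefault((i, j), [[0.0] * bs for _ in range(bs)])
                matmul_sub(target, blocks[(i, k)], blocks[(k, j)])
    return blocks

def to_dense(blocks, nb, bs):
    """Expand the block dictionary into a dense list-of-lists matrix (for inspection)."""
    n = nb * bs
    dense = [[0.0] * n for _ in range(n)]
    for (bi, bj), blk in blocks.items():
        for r in range(bs):
            for c in range(bs):
                dense[bi * bs + r][bj * bs + c] = blk[r][c]
    return dense
```

In this sketch the only indirect lookups are one dictionary access per block, while all arithmetic inside a block walks contiguous rows. That is the kind of locality a blocked format is meant to expose. A production solver would presumably replace the inner loops with tuned dense kernels and add pivoting.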
