This case study presents an analytical performance model for the DARPA UHPC streaming sensor challenge problem. The model is developed using Aspen, a domain-specific language for performance modeling, and focuses on exploring algorithmic tradeoffs, data structures and storage, and the impact of an important tiling factor in the image formation kernel of a synthetic aperture radar image-processing computation.