Author of the publication

Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation.

, , , , , , , , and . ACM Trans. Archit. Code Optim., 18 (4): 51:1-51:23 (2021)

Other publications of authors with the same name

The impact of network noise at large-scale communication performance., , and . IPDPS, page 1-8. IEEE, (2009)

A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast., , and . IPDPS, page 1-8. IEEE, (2007)

dCUDA: hardware supported overlap of computation and communication., , and . SC, page 52. ACM, (2016)

To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations., , , , and . HPDC, page 93-104. ACM, (2017)

AllConcur: Leaderless Concurrent Atomic Broadcast., , and . HPDC, page 205-218. ACM, (2017)

Using Simulation to Evaluate the Performance of Resilience Strategies at Scale., , , , , and . PMBS@SC, volume 8551 of Lecture Notes in Computer Science, page 91-114. Springer, (2013)

sPIN: high-performance streaming processing in the network., , , , and . SC, page 59:1-59:16. ACM, (2017)

Hybrid MPI: efficient message passing for multi-core systems., , , and . SC, page 18:1-18:11. ACM, (2013)

POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures., , and . PPOPP, page 445-446. ACM, (2017)

Embedding Functions Into Reversible Circuits: A Probabilistic Approach to the Number of Lines., , and . DAC, page 72. ACM, (2019)