Author of the publication

Search Space Generation and Pruning System for Autotuners.

IPDPS Workshops, pages 1545-1554. IEEE Computer Society, (2016)

Other publications of authors with the same name

Accelerating Numerical Dense Linear Algebra Calculations with GPUs. Numerical Computations with GPUs, Springer, (2014)
Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations. ISC, volume 7905 of Lecture Notes in Computer Science, pages 67-80. Springer, (2013)
Massively Parallel Automated Software Tuning. ICPP, pages 92:1-92:10. ACM, (2019)
Autotuning Batch Cholesky Factorization in CUDA with Interleaved Layout of Matrices. IPDPS Workshops, pages 1408-1417. IEEE Computer Society, (2017)
High-performance hybrid CPU and GPU parallel algorithm for digital volume correlation. IJHPCA, 29 (1): 92-106 (2015)
A survey of recent developments in parallel implementations of Gaussian elimination. Concurrency and Computation: Practice and Experience, 27 (5): 1292-1309 (2015)
Heterogeneous Streaming. IPDPS Workshops, pages 611-620. IEEE Computer Society, (2016)
clMAGMA: high performance dense linear algebra with OpenCL. IWOCL, pages 1:1-1:9. ACM, (2014)
Bringing High Performance Computing to Big Data Algorithms. Handbook of Big Data Technologies, Springer, (2017)
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines. ACM Trans. Math. Softw., 47 (3): 21:1-21:23 (2021)