Memory Efficient Two-Pass 3D FFT Algorithm for Intel® Xeon Phi™ Coprocessor.

J. Comput. Sci. Technol., 29 (6): 989-1002 (2014)

Other publications of authors with the same name

QuantWiz: A scalable parallel software package for label-free protein quantification. BIC-TA, page 976-980. IEEE, (2010)

AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs. SC, page 25:1-25:12. ACM, (2013)

QuantWiz: A Parallel Software Package for LC-MS-based Label-Free Protein Quantification. HPCC, page 683-687. IEEE, (2009)

The BLIS Framework: Experiments in Portability. ACM Trans. Math. Softw., 42 (2): 12:1-12:19 (2016)

623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores. IJHPCA, 30 (1): 39-54 (2016)

CRSD: Application Specific Auto-tuning of SpMV for Diagonal Sparse Matrices. Euro-Par (2), volume 6853 of Lecture Notes in Computer Science, page 316-327. Springer, (2011)

Early Performance Evaluation of Dawning 5000A and DeepComp 7000. ICPADS, page 578-585. IEEE Computer Society, (2009)

Optimizing SpMV for Diagonal Sparse Matrices on GPU. ICPP, page 492-501. IEEE Computer Society, (2011)

Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor. ICPADS, page 684-691. IEEE Computer Society, (2012)