Evaluation of SYCL’s Different Data Parallel Kernels

Proceedings of the 12th International Workshop on OpenCL and SYCL, pages 1-4. New York, NY, USA, Association for Computing Machinery, (April 2024)
DOI: 10.1145/3648115.3648130

Abstract

SYCL provides programmers with four, and in the case of AdaptiveCpp even five, ways for calling and writing a device kernel. This paper analyzes the performance of these diverse kernel invocation types for DPC++ and AdaptiveCpp as SYCL implementations on an NVIDIA A100 GPU, an AMD Instinct MI210 GPU, and a dual-socket AMD EPYC 9274F CPU. Using the example of a kernel matrix assembly, we show why the performance can differ by a factor of 100 in the worst case on the same hardware for the same problem using different SYCL implementations and kernel invocation types.
