{"e8ebd2f42b453167d2065725486828cbvancraen":{"DOI":"10.1145/3648115.3648130","ISBN":"9798400717901","ISSN":"","URL":"https://doi.org/10.1145/3648115.3648130","abstract":"SYCL provides programmers with four, and in the case of AdaptiveCpp even five, ways for calling and writing a device kernel. This paper analyzes the performance of these diverse kernel invocation types for DPC++ and AdaptiveCpp as SYCL implementations on an NVIDIA A100 GPU, an AMD Instinct MI210 GPU, and a dual-socket AMD EPYC 9274F CPU. Using the example of a kernel matrix assembly, we show why the performance can differ by a factor of 100 in the worst case on the same hardware for the same problem using different SYCL implementations and kernel invocation types.","annote":"","author":[{"family":"Breyer","given":"Marcel"},{"family":"Van Craen","given":"Alexander"},{"family":"Pflüger","given":"Dirk"}],"citation-label":"breyer2024evaluation","collection-editor":[],"collection-title":"IWOCL '24","container-author":[],"container-title":"Proceedings of the 12th International Workshop on OpenCL and SYCL","documents":[],"edition":"","editor":[],"event-date":{"date-parts":[["2024","04"]],"literal":"2024"},"event-place":"Chicago, IL, USA","id":"e8ebd2f42b453167d2065725486828cbvancraen","interhash":"bfbc52cd98d241445f5051b284bf6ded","intrahash":"e8ebd2f42b453167d2065725486828cb","issue":"","issued":{"date-parts":[["2024","04"]],"literal":"2024"},"keyword":"myown Performance Evaluation CPU SVM SYCL GPU AISA exc2075","misc":{"isbn":"9798400717901","language":"english","numpages":"4","articleno":"10","location":"Chicago, IL, USA","doi":"10.1145/3648115.3648130"},"note":"","number":"","number-of-pages":"4","page":"1-4","page-first":"1","publisher":"Association for Computing Machinery","publisher-place":"Chicago, IL, USA","status":"","title":"Evaluation of SYCL’s Different Data Parallel Kernels","type":"paper-conference","username":"vancraen","version":"","volume":""}}