Abstract
This paper presents a new programming methodology for introducing and tuning parallelism for heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs). It combines algorithmic skeletons (such as farms and pipelines), Monte-Carlo tree search (MCTS) for deriving mappings of tasks to available hardware resources, and refactoring tool support for applying the patterns and mappings in an easy and effective way. Using our approach, we demonstrate easily obtainable, significant and scalable speedups on a number of case studies, reaching up to a factor of 41 over the sequential code on a 24-core machine with one GPU. We also demonstrate that the mappings the MCTS algorithm suggests yield speedups comparable to the best that can be obtained.