
A C++ Library for Memory Layout and Performance Portability of Scientific Applications

, , , and . Euro-Par 2022, pages 109--120. Germany, Springer Science and Business Media B.V., (2023). Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. 28th International European Conference on Parallel and Distributed Computing, Euro-Par 2022; conference date: 22-08-2022 through 26-08-2022.
DOI: 10.1007/978-3-031-31209-0_8

Abstract

We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.
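The idea of adapting memory layout at compile time using tuples can be illustrated with a minimal C++14 sketch. This is not the library's API: the names `AoS`, `SoA`, `get<P>(i)`, and `fill_and_sum` are hypothetical, chosen only for illustration, and GPU mirroring is left out. The sketch shows two containers that take the same list of property types but store them either as a vector of tuples (array of structures) or as a tuple of vectors (structure of arrays), with one generic kernel that runs unchanged on both.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical array-of-structures layout: one vector whose elements are tuples.
template <typename... Ts>
struct AoS {
    std::vector<std::tuple<Ts...>> data;
    explicit AoS(std::size_t n) : data(n) {}
    // Access property P of element i.
    template <std::size_t P>
    auto& get(std::size_t i) { return std::get<P>(data[i]); }
    std::size_t size() const { return data.size(); }
};

// Hypothetical structure-of-arrays layout: a tuple of vectors, one per property.
template <typename... Ts>
struct SoA {
    std::tuple<std::vector<Ts>...> data;
    explicit SoA(std::size_t n) : data(std::vector<Ts>(n)...) {}
    // Same accessor signature as AoS, so kernels do not depend on the layout.
    template <std::size_t P>
    auto& get(std::size_t i) { return std::get<P>(data)[i]; }
    std::size_t size() const { return std::get<0>(data).size(); }
};

// Layout-agnostic kernel: sets property 0 to the index, property 1 to half
// of property 0, and returns the sum of property 1 over all elements.
template <typename Container>
double fill_and_sum(Container& c) {
    double sum = 0.0;
    for (std::size_t i = 0; i < c.size(); ++i) {
        c.template get<0>(i) = static_cast<int>(i);
        c.template get<1>(i) = 0.5 * c.template get<0>(i);
        sum += c.template get<1>(i);
    }
    return sum;
}

// Runs the same kernel on both layouts and checks that the results agree.
inline void compare_layouts() {
    AoS<int, double> aos(4);
    SoA<int, double> soa(4);
    assert(fill_and_sum(aos) == fill_and_sum(soa));
}
```

Because the layout is a template parameter of the container type, switching between the two is a one-line change at the declaration site, and the choice is resolved entirely at compile time.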
