Article

Metrics and design of an instruction roofline model for AMD GPUs

M. Leinhauser, R. Widera, S. Bastrakov, A. Debus, M. Bussmann, and S. Chandrasekaran.
ACM Trans. Parallel Comput., 9 (1): 1--14 (March 2022)

Abstract

Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD (CPU-GPU) architectures, which means moving away from the traditional CPU and NVIDIA-GPU systems. Due to the current limitations of profiling tools for AMD GPUs, this shift leaves a void in how to measure application performance on AMD GPUs. In this article, we design an instruction roofline model for AMD GPUs using AMD's ROCProfiler and a benchmarking tool, BabelStream (the HIP implementation), as a way to measure an application's performance in instructions and memory transactions on new AMD hardware. Specifically, we create instruction roofline models for a case study scientific application, PIConGPU, an open source particle-in-cell simulations application used for plasma and laser-plasma physics on the NVIDIA V100, AMD Radeon Instinct MI60, and AMD Instinct MI100 GPUs. When looking at the performance of multiple kernels of interest in PIConGPU we find that although the AMD MI100 GPU achieves a similar, or better, execution time compared to the NVIDIA V100 GPU, profiling tool differences make comparing performance of these two architectures hard. When looking at execution time, GIPS, and instruction intensity, the AMD MI60 achieves the worst performance out of the three GPUs used in this work.

BibTeX key: Leinhauser2022-ao
Entry type: article
Year: 2022
Month: mar
Journal: ACM Trans. Parallel Comput.
Number: 1
Pages: 1--14
Publisher: Association for Computing Machinery (ACM)
Volume: 9
language: en

Cite this publication

@article{Leinhauser2022-ao,
  abstract = {Due to the recent announcement of the Frontier supercomputer, many scientific application developers are working to make their applications compatible with AMD (CPU-GPU) architectures, which means moving away from the traditional CPU and NVIDIA-GPU systems. Due to the current limitations of profiling tools for AMD GPUs, this shift leaves a void in how to measure application performance on AMD GPUs. In this article, we design an instruction roofline model for AMD GPUs using AMD's ROCProfiler and a benchmarking tool, BabelStream (the HIP implementation), as a way to measure an application's performance in instructions and memory transactions on new AMD hardware. Specifically, we create instruction roofline models for a case study scientific application, PIConGPU, an open source particle-in-cell simulations application used for plasma and laser-plasma physics on the NVIDIA V100, AMD Radeon Instinct MI60, and AMD Instinct MI100 GPUs. When looking at the performance of multiple kernels of interest in PIConGPU we find that although the AMD MI100 GPU achieves a similar, or better, execution time compared to the NVIDIA V100 GPU, profiling tool differences make comparing performance of these two architectures hard. When looking at execution time, GIPS, and instruction intensity, the AMD MI60 achieves the worst performance out of the three GPUs used in this work.},
  added-at = {2024-09-10T11:56:37.000+0200},
  author = {Leinhauser, Matthew and Widera, Ren{\'e} and Bastrakov, Sergei and Debus, Alexander and Bussmann, Michael and Chandrasekaran, Sunita},
  biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/2dae7f797789ecefade4bd33093b92ff9/scadsfct},
  interhash = {2b901387ee2736279ecf1fc171072415},
  intrahash = {dae7f797789ecefade4bd33093b92ff9},
  journal = {ACM Trans. Parallel Comput.},
  keywords = {topic_physchemistry},
  language = {en},
  month = mar,
  number = 1,
  pages = {1--14},
  publisher = {Association for Computing Machinery (ACM)},
  timestamp = {2024-11-28T17:41:37.000+0100},
  title = {Metrics and design of an instruction roofline model for {AMD} {GPUs}},
  volume = 9,
  year = 2022
}
