Publication result detail

Performance and Accuracy Analysis of Nonlinear k-Wave Simulations Using Local Domain Decomposition with an 8-GPU Server

TREEBY, B.; VAVERKA, F.; JAROŠ, J.

Type

Scopus Article

Abstract

Large-scale nonlinear ultrasound simulations using the open-source k-Wave toolbox are now routinely performed using the MPI version of k-Wave running on traditional CPU-based clusters. However, the all-to-all communications required by the 3D fast Fourier transform (FFT) severely impact performance when scaling to large numbers of compute cores. This can be overcome by using a domain decomposition strategy based on a local Fourier basis. In this work, we analyse the performance and accuracy of using local domain decomposition for running a high-intensity focused ultrasound (HIFU) simulation in the kidney on a single server containing eight NVIDIA P40 graphics processing units (GPUs). Different decompositions and overlap sizes are investigated and compared to a global MPI simulation running on a CPU-based supercomputer using 1280 cores. For a grid size of 960 × 960 × 1280 grid points and an overlap size of 4 grid points, the error in the simulation using local domain decomposition is on the order of 0.1% compared to the global simulation, which is sufficient for most applications. The financial cost of running the simulation is also reduced by more than an order of magnitude.

Keywords

k-Wave, local domain decomposition, Fourier basis, pseudospectral methods

RIV year

2020

Released

22.10.2018

Book

Proceedings of Meetings on Acoustics

ISSN

1939-800X

Periodical

Proceedings of Meetings on Acoustics

Volume

34

Number

1

Country

United States of America

Pages from

1

Pages to

5

Pages count

5

URL

https://asa.scitation.org/doi/10.1121/2.0000883

BibTeX

@article{BUT155074,
  author="Bradley {Treeby} and Filip {Vaverka} and Jiří {Jaroš}",
  title="Performance and Accuracy Analysis of Nonlinear k-Wave Simulations Using Local Domain Decomposition with an 8-GPU Server",
  journal="Proceedings of Meetings on Acoustics",
  year="2018",
  volume="34",
  number="1",
  pages="1--5",
  doi="10.1121/2.0000883",
  issn="1939-800X",
  url="https://asa.scitation.org/doi/10.1121/2.0000883"
}
