Performance Analysis and Optimization of Seismic Simulation Computation Based on Domestic Accelerator Cards

Affiliations:

1. The Second Monitoring and Application Center, CEA, Xi'an 710054, Shaanxi, China; 2. Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi'an 710071, Shaanxi, China; 3. School of Artificial Intelligence, Xidian University, Xi'an 710071, Shaanxi, China; 4. Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, Zhejiang, China

About the author:

Zhou Hui (b. 1981), male, senior engineer, master's degree. He works mainly on seismic software development and testing. E-mail: hui@smac.ac.cn

CLC number:

P315

Funding:

Supported by the Key Research and Development Program of Shaanxi Province (2022ZDLGY01-09), the Guanghe Fund (202302019674), and the Natural Science Basic Research Program of Shaanxi Province (2023JC-YB-242)

Abstract:

AWP-ODC is a software package for large-scale 3D seismic simulation based on the finite-difference numerical method. With foreign export restrictions on high-performance computing chips to China, there is an urgent need for China to develop its own high-performance computing chips and the accompanying software ecosystem. Early acceleration of AWP-ODC was designed and optimized primarily for the NVIDIA GPU software and hardware architecture. In recent years, a variety of heterogeneous computing platforms have developed rapidly, and how to accelerate AWP-ODC on these new heterogeneous software and hardware platforms is of significant research value. To this end, this paper ports AWP-ODC to a domestic accelerator card. For the time-consuming kernel function dstrqc, the running time is reduced through GPU memory access optimization, grid parameter optimization, and related techniques. Finally, performance tests are carried out on single-card and dual-card configurations of the domestic GPU-like accelerator, using the Fréchet Kernels seismic dataset and the 8·3 Ludian earthquake dataset. Experimental results show that, in the single-card computing environment, the FLOPS for the two datasets increased by 30.51% and 25.21%, respectively; in the dual-card computing environment, the FLOPS for the two datasets increased by 9.42% and 23.6%, respectively.

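Cite this article

Zhou Hui, Zhu Huming, Gao Tianqi, Dong Ximiao, Zhang Lingyun, Liu Huijie, Chen Zhipeng. Performance Analysis and Optimization of Seismic Simulation Computation Based on Domestic Accelerator Cards [J]. Journal of Disaster Prevention and Mitigation Engineering, 2025, 45(1): 21-33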

History
  • Received: 2024-09-12
  • Last revised: 2025-01-10
  • Published online: 2025-03-10