XcalableMP PGAS Programming Language : From Programming Model to Applications.

Bibliographic Details
Place / Publishing House:Singapore : Springer Singapore Pte. Limited, 2020.
©2021.
Year of Publication:2020
Edition:1st ed.
Language:English
Online Access:
Physical Description:1 online resource (265 pages)
id 5006403583
ctrlnum (MiAaPQ)5006403583
(Au-PeEL)EBL6403583
(OCoLC)1231606603
collection bib_alma
record_format marc
spelling Sato, Mitsuhisa.
XcalableMP PGAS Programming Language : From Programming Model to Applications.
1st ed.
Singapore : Springer Singapore Pte. Limited, 2020.
©2021.
1 online resource (265 pages)
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Intro -- Preface -- Contents -- XcalableMP Programming Model and Language -- 1 Introduction -- 1.1 Target Hardware -- 1.2 Execution Model -- 1.3 Data Model -- 1.4 Programming Models -- 1.4.1 Partitioned Global Address Space -- 1.4.2 Global-View Programming Model -- 1.4.3 Local-View Programming Model -- 1.4.4 Mixture of Global View and Local View -- 1.5 Base Languages -- 1.5.1 Array Section in XcalableMP C -- 1.5.2 Array Assignment Statement in XcalableMP C -- 1.6 Interoperability -- 2 Data Mapping -- 2.1 nodes Directive -- 2.2 template Directive -- 2.3 distribute Directive -- 2.3.1 Block Distribution -- 2.3.2 Cyclic Distribution -- 2.3.3 Block-Cyclic Distribution -- 2.3.4 Gblock Distribution -- 2.3.5 Distribution of Multi-Dimensional Templates -- 2.4 align Directive -- 2.5 Dynamic Allocation of Distributed Array -- 2.6 template_fix Construct -- 3 Work Mapping -- 3.1 task and tasks Construct -- 3.1.1 task Construct -- 3.1.2 tasks Construct -- 3.2 loop Construct -- 3.2.1 Reduction Computation -- 3.2.2 Parallelizing Nested Loop -- 3.3 array Construct -- 4 Data Communication -- 4.1 shadow Directive and reflect Construct -- 4.1.1 Declaring Shadow -- 4.1.2 Updating Shadow -- 4.2 gmove Construct -- 4.2.1 Collective Mode -- 4.2.2 In Mode -- 4.2.3 Out Mode -- 4.3 barrier Construct -- 4.4 reduction Construct -- 4.5 bcast Construct -- 4.6 wait_async Construct -- 4.7 reduce_shadow Construct -- 5 Local-View Programming -- 5.1 Introduction -- 5.2 Coarray Declaration -- 5.3 Put Communication -- 5.4 Get Communication -- 5.5 Synchronization -- 5.5.1 Sync All -- 5.5.2 Sync Images -- 5.5.3 Sync Memory -- 6 Procedure Interface -- 7 XMPT Tool Interface -- 7.1 Overview -- 7.2 Specification -- 7.2.1 Initialization -- 7.2.2 Events -- References -- Implementation and Performance Evaluation of Omni Compiler -- 1 Overview -- 2 Implementation -- 2.1 Operation Flow.
2.2 Example of Code Translation -- 2.2.1 Distributed Array -- 2.2.2 Loop Statement -- 2.2.3 Communication -- 3 Installation -- 3.1 Overview -- 3.2 Get Source Code -- 3.2.1 From GitHub -- 3.2.2 From Our Website -- 3.3 Software Dependency -- 3.4 General Installation -- 3.4.1 Build and Install -- 3.4.2 Set PATH -- 3.5 Optional Installation -- 3.5.1 OpenACC -- 3.5.2 XcalableACC -- 3.5.3 One-Sided Library -- 4 Creation of Execution Binary -- 4.1 Compile -- 4.2 Execution -- 4.2.1 XcalableMP and XcalableACC -- 4.2.2 OpenACC -- 4.3 Cooperation with Profiler -- 4.3.1 Scalasca -- 4.3.2 tlog -- 5 Performance Evaluation -- 5.1 Experimental Environment -- 5.2 EP STREAM Triad -- 5.2.1 Design -- 5.2.2 Implementation -- 5.2.3 Evaluation -- 5.3 High-Performance Linpack -- 5.3.1 Design -- 5.3.2 Implementation -- 5.3.3 Evaluation -- 5.4 Global Fast Fourier Transform -- 5.4.1 Design -- 5.4.2 Implementation -- 5.4.3 Evaluation -- 5.5 RandomAccess -- 5.5.1 Design -- 5.5.2 Implementation -- 5.5.3 Evaluation -- 5.6 Discussion -- 6 Conclusion -- References -- Coarrays in the Context of XcalableMP -- 1 Introduction -- 2 Requirements from Language Specifications -- 2.1 Images Mapped to XMP Nodes -- 2.2 Allocation of Coarrays -- 2.3 Communication -- 2.4 Synchronization -- 2.5 Subarrays and Data Contiguity -- 2.6 Coarray C Language Specifications -- 3 Implementation -- 3.1 Omni XMP Compiler Framework -- 3.2 Allocation and Registration -- 3.2.1 Three Methods of Memory Management -- 3.2.2 Initial Allocation for Static Coarrays -- 3.2.3 Runtime Allocation for Allocatable Coarrays -- 3.3 PUT/GET Communication -- 3.3.1 Determining the Possibility of DMA -- 3.3.2 Buffering Communication Methods -- 3.3.3 Non-blocking PUT Communication -- 3.3.4 Optimization of GET Communication -- 3.4 Runtime Libraries -- 3.4.1 Fortran Wrapper -- 3.4.2 Upper-layer Runtime (ULR) Library.
3.4.3 Lower-layer Runtime (LLR) Library -- 3.4.4 Communication Libraries -- 4 Evaluation -- 4.1 Fundamental Performance -- 4.2 Non-blocking Communication -- 4.3 Application Program -- 4.3.1 Coarray Version of the Himeno Benchmark -- 4.3.2 Measurement Result -- 4.3.3 Productivity -- 5 Related Work -- 6 Conclusion -- References -- XcalableACC: An Integration of XcalableMP and OpenACC -- 1 Introduction -- 1.1 Hardware Model -- 1.2 Programming Model -- 1.2.1 XMP Extensions -- 1.2.2 OpenACC Extensions -- 1.3 Execution Model -- 1.4 Data Model -- 2 XcalableACC Language -- 2.1 Data Mapping -- Example -- 2.2 Work Mapping -- Restriction -- Example 1 -- Example 2 -- 2.3 Data Communication and Synchronization -- Example -- 2.4 Coarrays -- Restriction -- Example -- 2.5 Handling Multiple Accelerators -- 2.5.1 devices Directive -- Example -- 2.5.2 on_device Clause -- 2.5.3 layout Clause -- Example -- 2.5.4 shadow Clause -- Example -- 2.5.5 barrier_device Construct -- Example -- 3 Omni XcalableACC Compiler -- 4 Performance of Lattice QCD Application -- 4.1 Overview of Lattice QCD -- 4.2 Implementation -- 5 Performance Evaluation -- 5.1 Result -- 5.2 Discussion -- 6 Productivity Improvement -- 6.1 Requirement for Productive Parallel Language -- 6.2 Quantitative Evaluation by Delta Source Lines of Codes -- 6.3 Discussion -- References -- Mixed-Language Programming with XcalableMP -- 1 Background -- 2 Translation by Omni Compiler -- 3 Functions for Mixed-Language -- 3.1 Function to Call MPI Program from XMP Program -- 3.2 Function to Call XMP Program from MPI Program -- 3.3 Function to Call XMP Program from Python Program -- 3.3.1 From Parallel Python Program -- 3.3.2 From Sequential Python Program -- 4 Application to Order/Degree Problem -- 4.1 What Is Order/Degree Program -- 4.2 Implementation -- 4.3 Evaluation -- 5 Conclusion -- References.
Three-Dimensional Fluid Code with XcalableMP -- 1 Introduction -- 2 Global-View Programming Model -- 2.1 Domain Decomposition Methods -- 2.2 Performance on the K Computer -- 2.2.1 Comparison with Hand-Coded MPI Program -- 2.2.2 Optimization for SIMD -- 2.2.3 Optimization for Allocatable Arrays -- 3 Local-View Programming Model -- 3.1 Communications Using Coarray -- 3.2 Performance on the K Computer -- 4 Summary -- References -- Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP -- 1 Introduction -- 2 Nuclear Fusion Simulation Code -- 2.1 Gyrokinetic PIC Simulation -- 2.2 GTC -- 3 Implementation of GTC-P by Hybrid-view Programming -- 3.1 Hybrid-View Programming Model -- 3.2 Implementation Based on the XMP-Localview Model: XMP-localview -- 3.3 Implementation Based on the XMP-Hybridview Model: XMP-Hybridview -- 4 Performance Evaluation -- 4.1 Experimental Setting -- 4.2 Results -- 4.3 Productivity and Performance -- 5 Related Research -- 6 Conclusion -- References -- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP -- 1 Introduction -- 2 X-ray Fluorescence Holography -- 2.1 Reconstruction of Atomic Images -- 2.2 Analysis Procedure of XFH -- 3 Parallelization -- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP -- 3.2 Parallelization of Reconstruction of Three-dimensional Atomic Images by XcalableMP -- 4 Performance Evaluation -- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images -- 4.2 Performance Results of Reconstruction of Three-dimensional Atomic Images -- 4.3 Comparison of Parallelization with MPI -- 5 Conclusion -- References -- Multi-SPMD Programming Model with YML and XcalableMP -- 1 Introduction -- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing -- 3 Multi-SPMD Programming Model.
3.1 Overview -- 3.2 YML -- 3.3 OmniRPC-MPI -- 4 Application Development in the mSPMD Programming Environment -- 4.1 Task Generator -- 4.2 Workflow Development -- 4.3 Workflow Execution -- 5 Experiments -- 6 Eigen Solver on the mSPMD Programming Model -- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model -- 6.2 Experiments -- 7 Fault-Tolerance Features in the mSPMD Programming Model -- 7.1 Overview and Implementation -- 7.2 Experiments -- 8 Runtime Correctness Check for the mSPMD Programming Model -- 8.1 Overview and Implementation -- 8.2 Experiments -- 9 Summary -- References -- XcalableMP 2.0 and Future Directions -- 1 Introduction -- 2 XcalableMP on Fugaku -- 2.1 Performance of XcalableMP Global View Programming -- 2.2 Performance of XcalableMP Local View Programming -- 3 Global Task Parallel Programming -- 3.1 OpenMP and XMP Tasklet Directive -- 3.2 A Proposal for Global Task Parallel Programming -- 3.3 Prototype Design of Code Transformation -- 3.4 Preliminary Performance -- 3.5 Communication Optimization for Manycore Clusters -- 4 Retrospectives and Challenges for Future PGAS Models -- 4.1 Low-Level Communication Layer for PGAS Model -- 4.2 XcalableMP as a DSL for Stencil Applications -- 4.3 XcalableMP API: Compiler-Free Approach -- 4.4 Global Task Parallel Programming Model for Accelerators -- References.
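Note: the contents listed above center on XcalableMP's directive-based global-view model (the nodes, template, distribute, align, and loop directives). The following is a minimal illustrative sketch in XcalableMP C written for this record, not code taken from the book; directive spellings and clause syntax should be verified against the XcalableMP language specification.

/* Minimal XcalableMP C sketch of global-view data and work mapping
   (illustrative only; not taken from the book). */
#include <stdio.h>
#define N 1000

#pragma xmp nodes p(4)                 /* declare a node set of 4 nodes          */
#pragma xmp template t(0:N-1)          /* declare a template with N indices      */
#pragma xmp distribute t(block) onto p /* block-distribute the template onto p   */

double a[N];
#pragma xmp align a[i] with t(i)       /* align array a with the template        */

int main(void) {
  double sum = 0.0;

  /* Each node initializes and sums only the part of a[] mapped to it;
     the reduction clause combines the partial sums across nodes.      */
#pragma xmp loop on t(i) reduction(+:sum)
  for (int i = 0; i < N; i++) {
    a[i] = (double)i;
    sum += a[i];
  }

  printf("sum = %f\n", sum);           /* every node prints the reduced value    */
  return 0;
}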
Description based on publisher supplied metadata and other sources.
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
Electronic books.
Print version: Sato, Mitsuhisa XcalableMP PGAS Programming Language Singapore : Springer Singapore Pte. Limited, c2020 9789811576829
ProQuest (Firm)
https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=6403583 Click to View
language English
format eBook
author Sato, Mitsuhisa.
spellingShingle Sato, Mitsuhisa.
XcalableMP PGAS Programming Language : From Programming Model to Applications.
Intro -- Preface -- Contents -- XcalableMP Programming Model and Language -- 1 Introduction -- 1.1 Target Hardware -- 1.2 Execution Model -- 1.3 Data Model -- 1.4 Programming Models -- 1.4.1 Partitioned Global Address Space -- 1.4.2 Global-View Programming Model -- 1.4.3 Local-View Programming Model -- 1.4.4 Mixture of Global View and Local View -- 1.5 Base Languages -- 1.5.1 Array Section in XcalableMP C -- 1.5.2 Array Assignment Statement in XcalableMP C -- 1.6 Interoperability -- 2 Data Mapping -- 2.1 nodes Directive -- 2.2 template Directive -- 2.3 distribute Directive -- 2.3.1 Block Distribution -- 2.3.2 Cyclic Distribution -- 2.3.3 Block-Cyclic Distribution -- 2.3.4 Gblock Distribution -- 2.3.5 Distribution of Multi-Dimensional Templates -- 2.4 align Directive -- 2.5 Dynamic Allocation of Distributed Array -- 2.6 template_fix Construct -- 3 Work Mapping -- 3.1 task and tasks Construct -- 3.1.1 task Construct -- 3.1.2 tasks Construct -- 3.2 loop Construct -- 3.2.1 Reduction Computation -- 3.2.2 Parallelizing Nested Loop -- 3.3 array Construct -- 4 Data Communication -- 4.1 shadow Directive and reflect Construct -- 4.1.1 Declaring Shadow -- 4.1.2 Updating Shadow -- 4.2 gmove Construct -- 4.2.1 Collective Mode -- 4.2.2 In Mode -- 4.2.3 Out Mode -- 4.3 barrier Construct -- 4.4 reduction Construct -- 4.5 bcast Construct -- 4.6 wait_async Construct -- 4.7 reduce_shadow Construct -- 5 Local-View Programming -- 5.1 Introduction -- 5.2 Coarray Declaration -- 5.3 Put Communication -- 5.4 Get Communication -- 5.5 Synchronization -- 5.5.1 Sync All -- 5.5.2 Sync Images -- 5.5.3 Sync Memory -- 6 Procedure Interface -- 7 XMPT Tool Interface -- 7.1 Overview -- 7.2 Specification -- 7.2.1 Initialization -- 7.2.2 Events -- References -- Implementation and Performance Evaluation of Omni Compiler -- 1 Overview -- 2 Implementation -- 2.1 Operation Flow.
2.2 Example of Code Translation -- 2.2.1 Distributed Array -- 2.2.2 Loop Statement -- 2.2.3 Communication -- 3 Installation -- 3.1 Overview -- 3.2 Get Source Code -- 3.2.1 From GitHub -- 3.2.2 From Our Website -- 3.3 Software Dependency -- 3.4 General Installation -- 3.4.1 Build and Install -- 3.4.2 Set PATH -- 3.5 Optional Installation -- 3.5.1 OpenACC -- 3.5.2 XcalableACC -- 3.5.3 One-Sided Library -- 4 Creation of Execution Binary -- 4.1 Compile -- 4.2 Execution -- 4.2.1 XcalableMP and XcalableACC -- 4.2.2 OpenACC -- 4.3 Cooperation with Profiler -- 4.3.1 Scalasca -- 4.3.2 tlog -- 5 Performance Evaluation -- 5.1 Experimental Environment -- 5.2 EP STREAM Triad -- 5.2.1 Design -- 5.2.2 Implementation -- 5.2.3 Evaluation -- 5.3 High-Performance Linpack -- 5.3.1 Design -- 5.3.2 Implementation -- 5.3.3 Evaluation -- 5.4 Global Fast Fourier Transform -- 5.4.1 Design -- 5.4.2 Implementation -- 5.4.3 Evaluation -- 5.5 RandomAccess -- 5.5.1 Design -- 5.5.2 Implementation -- 5.5.3 Evaluation -- 5.6 Discussion -- 6 Conclusion -- References -- Coarrays in the Context of XcalableMP -- 1 Introduction -- 2 Requirements from Language Specifications -- 2.1 Images Mapped to XMP Nodes -- 2.2 Allocation of Coarrays -- 2.3 Communication -- 2.4 Synchronization -- 2.5 Subarrays and Data Contiguity -- 2.6 Coarray C Language Specifications -- 3 Implementation -- 3.1 Omni XMP Compiler Framework -- 3.2 Allocation and Registration -- 3.2.1 Three Methods of Memory Management -- 3.2.2 Initial Allocation for Static Coarrays -- 3.2.3 Runtime Allocation for Allocatable Coarrays -- 3.3 PUT/GET Communication -- 3.3.1 Determining the Possibility of DMA -- 3.3.2 Buffering Communication Methods -- 3.3.3 Non-blocking PUT Communication -- 3.3.4 Optimization of GET Communication -- 3.4 Runtime Libraries -- 3.4.1 Fortran Wrapper -- 3.4.2 Upper-layer Runtime (ULR) Library.
3.4.3 Lower-layer Runtime (LLR) Library -- 3.4.4 Communication Libraries -- 4 Evaluation -- 4.1 Fundamental Performance -- 4.2 Non-blocking Communication -- 4.3 Application Program -- 4.3.1 Coarray Version of the Himeno Benchmark -- 4.3.2 Measurement Result -- 4.3.3 Productivity -- 5 Related Work -- 6 Conclusion -- References -- XcalableACC: An Integration of XcalableMP and OpenACC -- 1 Introduction -- 1.1 Hardware Model -- 1.2 Programming Model -- 1.2.1 XMP Extensions -- 1.2.2 OpenACC Extensions -- 1.3 Execution Model -- 1.4 Data Model -- 2 XcalableACC Language -- 2.1 Data Mapping -- Example -- 2.2 Work Mapping -- Restriction -- Example 1 -- Example 2 -- 2.3 Data Communication and Synchronization -- Example -- 2.4 Coarrays -- Restriction -- Example -- 2.5 Handling Multiple Accelerators -- 2.5.1 devices Directive -- Example -- 2.5.2 on_device Clause -- 2.5.3 layout Clause -- Example -- 2.5.4 shadow Clause -- Example -- 2.5.5 barrier_device Construct -- Example -- 3 Omni XcalableACC Compiler -- 4 Performance of Lattice QCD Application -- 4.1 Overview of Lattice QCD -- 4.2 Implementation -- 5 Performance Evaluation -- 5.1 Result -- 5.2 Discussion -- 6 Productivity Improvement -- 6.1 Requirement for Productive Parallel Language -- 6.2 Quantitative Evaluation by Delta Source Lines of Codes -- 6.3 Discussion -- References -- Mixed-Language Programming with XcalableMP -- 1 Background -- 2 Translation by Omni Compiler -- 3 Functions for Mixed-Language -- 3.1 Function to Call MPI Program from XMP Program -- 3.2 Function to Call XMP Program from MPI Program -- 3.3 Function to Call XMP Program from Python Program -- 3.3.1 From Parallel Python Program -- 3.3.2 From Sequential Python Program -- 4 Application to Order/Degree Problem -- 4.1 What Is Order/Degree Program -- 4.2 Implementation -- 4.3 Evaluation -- 5 Conclusion -- References.
Three-Dimensional Fluid Code with XcalableMP -- 1 Introduction -- 2 Global-View Programming Model -- 2.1 Domain Decomposition Methods -- 2.2 Performance on the K Computer -- 2.2.1 Comparison with Hand-Coded MPI Program -- 2.2.2 Optimization for SIMD -- 2.2.3 Optimization for Allocatable Arrays -- 3 Local-View Programming Model -- 3.1 Communications Using Coarray -- 3.2 Performance on the K Computer -- 4 Summary -- References -- Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP -- 1 Introduction -- 2 Nuclear Fusion Simulation Code -- 2.1 Gyrokinetic PIC Simulation -- 2.2 GTC -- 3 Implementation of GTC-P by Hybrid-view Programming -- 3.1 Hybrid-View Programming Model -- 3.2 Implementation Based on the XMP-Localview Model: XMP-localview -- 3.3 Implementation Based on the XMP-Hybridview Model: XMP-Hybridview -- 4 Performance Evaluation -- 4.1 Experimental Setting -- 4.2 Results -- 4.3 Productivity and Performance -- 5 Related Research -- 6 Conclusion -- References -- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP -- 1 Introduction -- 2 X-ray Fluorescence Holography -- 2.1 Reconstruction of Atomic Images -- 2.2 Analysis Procedure of XFH -- 3 Parallelization -- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP -- 3.2 Parallelization of Reconstruction of Three-dimensional Atomic Images by XcalableMP -- 4 Performance Evaluation -- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images -- 4.2 Performance Results of Reconstruction of Three-dimensional Atomic Images -- 4.3 Comparison of Parallelization with MPI -- 5 Conclusion -- References -- Multi-SPMD Programming Model with YML and XcalableMP -- 1 Introduction -- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing -- 3 Multi-SPMD Programming Model.
3.1 Overview -- 3.2 YML -- 3.3 OmniRPC-MPI -- 4 Application Development in the mSPMD Programming Environment -- 4.1 Task Generator -- 4.2 Workflow Development -- 4.3 Workflow Execution -- 5 Experiments -- 6 Eigen Solver on the mSPMD Programming Model -- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model -- 6.2 Experiments -- 7 Fault-Tolerance Features in the mSPMD Programming Model -- 7.1 Overview and Implementation -- 7.2 Experiments -- 8 Runtime Correctness Check for the mSPMD Programming Model -- 8.1 Overview and Implementation -- 8.2 Experiments -- 9 Summary -- References -- XcalableMP 2.0 and Future Directions -- 1 Introduction -- 2 XcalableMP on Fugaku -- 2.1 Performance of XcalableMP Global View Programming -- 2.2 Performance of XcalableMP Local View Programming -- 3 Global Task Parallel Programming -- 3.1 OpenMP and XMP Tasklet Directive -- 3.2 A Proposal for Global Task Parallel Programming -- 3.3 Prototype Design of Code Transformation -- 3.4 Preliminary Performance -- 3.5 Communication Optimization for Manycore Clusters -- 4 Retrospectives and Challenges for Future PGAS Models -- 4.1 Low-Level Communication Layer for PGAS Model -- 4.2 XcalableMP as a DSL for Stencil Applications -- 4.3 XcalableMP API: Compiler-Free Approach -- 4.4 Global Task Parallel Programming Model for Accelerators -- References.
author_facet Sato, Mitsuhisa.
author_variant m s ms
author_sort Sato, Mitsuhisa.
title XcalableMP PGAS Programming Language : From Programming Model to Applications.
title_sub From Programming Model to Applications.
title_full XcalableMP PGAS Programming Language : From Programming Model to Applications.
title_fullStr XcalableMP PGAS Programming Language : From Programming Model to Applications.
title_full_unstemmed XcalableMP PGAS Programming Language : From Programming Model to Applications.
title_auth XcalableMP PGAS Programming Language : From Programming Model to Applications.
title_new XcalableMP PGAS Programming Language :
title_sort xcalablemp pgas programming language : from programming model to applications.
publisher Springer Singapore Pte. Limited,
publishDate 2020
physical 1 online resource (265 pages)
edition 1st ed.
contents Intro -- Preface -- Contents -- XcalableMP Programming Model and Language -- 1 Introduction -- 1.1 Target Hardware -- 1.2 Execution Model -- 1.3 Data Model -- 1.4 Programming Models -- 1.4.1 Partitioned Global Address Space -- 1.4.2 Global-View Programming Model -- 1.4.3 Local-View Programming Model -- 1.4.4 Mixture of Global View and Local View -- 1.5 Base Languages -- 1.5.1 Array Section in XcalableMP C -- 1.5.2 Array Assignment Statement in XcalableMP C -- 1.6 Interoperability -- 2 Data Mapping -- 2.1 nodes Directive -- 2.2 template Directive -- 2.3 distribute Directive -- 2.3.1 Block Distribution -- 2.3.2 Cyclic Distribution -- 2.3.3 Block-Cyclic Distribution -- 2.3.4 Gblock Distribution -- 2.3.5 Distribution of Multi-Dimensional Templates -- 2.4 align Directive -- 2.5 Dynamic Allocation of Distributed Array -- 2.6 template_fix Construct -- 3 Work Mapping -- 3.1 task and tasks Construct -- 3.1.1 task Construct -- 3.1.2 tasks Construct -- 3.2 loop Construct -- 3.2.1 Reduction Computation -- 3.2.2 Parallelizing Nested Loop -- 3.3 array Construct -- 4 Data Communication -- 4.1 shadow Directive and reflect Construct -- 4.1.1 Declaring Shadow -- 4.1.2 Updating Shadow -- 4.2 gmove Construct -- 4.2.1 Collective Mode -- 4.2.2 In Mode -- 4.2.3 Out Mode -- 4.3 barrier Construct -- 4.4 reduction Construct -- 4.5 bcast Construct -- 4.6 wait_async Construct -- 4.7 reduce_shadow Construct -- 5 Local-View Programming -- 5.1 Introduction -- 5.2 Coarray Declaration -- 5.3 Put Communication -- 5.4 Get Communication -- 5.5 Synchronization -- 5.5.1 Sync All -- 5.5.2 Sync Images -- 5.5.3 Sync Memory -- 6 Procedure Interface -- 7 XMPT Tool Interface -- 7.1 Overview -- 7.2 Specification -- 7.2.1 Initialization -- 7.2.2 Events -- References -- Implementation and Performance Evaluation of Omni Compiler -- 1 Overview -- 2 Implementation -- 2.1 Operation Flow.
2.2 Example of Code Translation -- 2.2.1 Distributed Array -- 2.2.2 Loop Statement -- 2.2.3 Communication -- 3 Installation -- 3.1 Overview -- 3.2 Get Source Code -- 3.2.1 From GitHub -- 3.2.2 From Our Website -- 3.3 Software Dependency -- 3.4 General Installation -- 3.4.1 Build and Install -- 3.4.2 Set PATH -- 3.5 Optional Installation -- 3.5.1 OpenACC -- 3.5.2 XcalableACC -- 3.5.3 One-Sided Library -- 4 Creation of Execution Binary -- 4.1 Compile -- 4.2 Execution -- 4.2.1 XcalableMP and XcalableACC -- 4.2.2 OpenACC -- 4.3 Cooperation with Profiler -- 4.3.1 Scalasca -- 4.3.2 tlog -- 5 Performance Evaluation -- 5.1 Experimental Environment -- 5.2 EP STREAM Triad -- 5.2.1 Design -- 5.2.2 Implementation -- 5.2.3 Evaluation -- 5.3 High-Performance Linpack -- 5.3.1 Design -- 5.3.2 Implementation -- 5.3.3 Evaluation -- 5.4 Global Fast Fourier Transform -- 5.4.1 Design -- 5.4.2 Implementation -- 5.4.3 Evaluation -- 5.5 RandomAccess -- 5.5.1 Design -- 5.5.2 Implementation -- 5.5.3 Evaluation -- 5.6 Discussion -- 6 Conclusion -- References -- Coarrays in the Context of XcalableMP -- 1 Introduction -- 2 Requirements from Language Specifications -- 2.1 Images Mapped to XMP Nodes -- 2.2 Allocation of Coarrays -- 2.3 Communication -- 2.4 Synchronization -- 2.5 Subarrays and Data Contiguity -- 2.6 Coarray C Language Specifications -- 3 Implementation -- 3.1 Omni XMP Compiler Framework -- 3.2 Allocation and Registration -- 3.2.1 Three Methods of Memory Management -- 3.2.2 Initial Allocation for Static Coarrays -- 3.2.3 Runtime Allocation for Allocatable Coarrays -- 3.3 PUT/GET Communication -- 3.3.1 Determining the Possibility of DMA -- 3.3.2 Buffering Communication Methods -- 3.3.3 Non-blocking PUT Communication -- 3.3.4 Optimization of GET Communication -- 3.4 Runtime Libraries -- 3.4.1 Fortran Wrapper -- 3.4.2 Upper-layer Runtime (ULR) Library.
3.4.3 Lower-layer Runtime (LLR) Library -- 3.4.4 Communication Libraries -- 4 Evaluation -- 4.1 Fundamental Performance -- 4.2 Non-blocking Communication -- 4.3 Application Program -- 4.3.1 Coarray Version of the Himeno Benchmark -- 4.3.2 Measurement Result -- 4.3.3 Productivity -- 5 Related Work -- 6 Conclusion -- References -- XcalableACC: An Integration of XcalableMP and OpenACC -- 1 Introduction -- 1.1 Hardware Model -- 1.2 Programming Model -- 1.2.1 XMP Extensions -- 1.2.2 OpenACC Extensions -- 1.3 Execution Model -- 1.4 Data Model -- 2 XcalableACC Language -- 2.1 Data Mapping -- Example -- 2.2 Work Mapping -- Restriction -- Example 1 -- Example 2 -- 2.3 Data Communication and Synchronization -- Example -- 2.4 Coarrays -- Restriction -- Example -- 2.5 Handling Multiple Accelerators -- 2.5.1 devices Directive -- Example -- 2.5.2 on_device Clause -- 2.5.3 layout Clause -- Example -- 2.5.4 shadow Clause -- Example -- 2.5.5 barrier_device Construct -- Example -- 3 Omni XcalableACC Compiler -- 4 Performance of Lattice QCD Application -- 4.1 Overview of Lattice QCD -- 4.2 Implementation -- 5 Performance Evaluation -- 5.1 Result -- 5.2 Discussion -- 6 Productivity Improvement -- 6.1 Requirement for Productive Parallel Language -- 6.2 Quantitative Evaluation by Delta Source Lines of Codes -- 6.3 Discussion -- References -- Mixed-Language Programming with XcalableMP -- 1 Background -- 2 Translation by Omni Compiler -- 3 Functions for Mixed-Language -- 3.1 Function to Call MPI Program from XMP Program -- 3.2 Function to Call XMP Program from MPI Program -- 3.3 Function to Call XMP Program from Python Program -- 3.3.1 From Parallel Python Program -- 3.3.2 From Sequential Python Program -- 4 Application to Order/Degree Problem -- 4.1 What Is Order/Degree Program -- 4.2 Implementation -- 4.3 Evaluation -- 5 Conclusion -- References.
Three-Dimensional Fluid Code with XcalableMP -- 1 Introduction -- 2 Global-View Programming Model -- 2.1 Domain Decomposition Methods -- 2.2 Performance on the K Computer -- 2.2.1 Comparison with Hand-Coded MPI Program -- 2.2.2 Optimization for SIMD -- 2.2.3 Optimization for Allocatable Arrays -- 3 Local-View Programming Model -- 3.1 Communications Using Coarray -- 3.2 Performance on the K Computer -- 4 Summary -- References -- Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP -- 1 Introduction -- 2 Nuclear Fusion Simulation Code -- 2.1 Gyrokinetic PIC Simulation -- 2.2 GTC -- 3 Implementation of GTC-P by Hybrid-view Programming -- 3.1 Hybrid-View Programming Model -- 3.2 Implementation Based on the XMP-Localview Model: XMP-localview -- 3.3 Implementation Based on the XMP-Hybridview Model: XMP-Hybridview -- 4 Performance Evaluation -- 4.1 Experimental Setting -- 4.2 Results -- 4.3 Productivity and Performance -- 5 Related Research -- 6 Conclusion -- References -- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP -- 1 Introduction -- 2 X-ray Fluorescence Holography -- 2.1 Reconstruction of Atomic Images -- 2.2 Analysis Procedure of XFH -- 3 Parallelization -- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP -- 3.2 Parallelization of Reconstruction of Three-dimensional Atomic Images by XcalableMP -- 4 Performance Evaluation -- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images -- 4.2 Performance Results of Reconstruction of Three-dimensional Atomic Images -- 4.3 Comparison of Parallelization with MPI -- 5 Conclusion -- References -- Multi-SPMD Programming Model with YML and XcalableMP -- 1 Introduction -- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing -- 3 Multi-SPMD Programming Model.
3.1 Overview -- 3.2 YML -- 3.3 OmniRPC-MPI -- 4 Application Development in the mSPMD Programming Environment -- 4.1 Task Generator -- 4.2 Workflow Development -- 4.3 Workflow Execution -- 5 Experiments -- 6 Eigen Solver on the mSPMD Programming Model -- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model -- 6.2 Experiments -- 7 Fault-Tolerance Features in the mSPMD Programming Model -- 7.1 Overview and Implementation -- 7.2 Experiments -- 8 Runtime Correctness Check for the mSPMD Programming Model -- 8.1 Overview and Implementation -- 8.2 Experiments -- 9 Summary -- References -- XcalableMP 2.0 and Future Directions -- 1 Introduction -- 2 XcalableMP on Fugaku -- 2.1 Performance of XcalableMP Global View Programming -- 2.2 Performance of XcalableMP Local View Programming -- 3 Global Task Parallel Programming -- 3.1 OpenMP and XMP Tasklet Directive -- 3.2 A Proposal for Global Task Parallel Programming -- 3.3 Prototype Design of Code Transformation -- 3.4 Preliminary Performance -- 3.5 Communication Optimization for Manycore Clusters -- 4 Retrospectives and Challenges for Future PGAS Models -- 4.1 Low-Level Communication Layer for PGAS Model -- 4.2 XcalableMP as a DSL for Stencil Applications -- 4.3 XcalableMP API: Compiler-Free Approach -- 4.4 Global Task Parallel Programming Model for Accelerators -- References.
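Note: chapter 4 of the contents above introduces the shadow directive and reflect construct for halo exchange between neighboring nodes. As a further illustrative sketch (again not taken from the book, and assuming a one-element halo), a 1-D stencil in XcalableMP C typically looks as follows; exact directive spellings should be checked against the specification.

/* Minimal XcalableMP C sketch of halo exchange with shadow/reflect
   for a 1-D stencil (illustrative only; not taken from the book). */
#define N 1000

#pragma xmp nodes p(*)
#pragma xmp template t(0:N-1)
#pragma xmp distribute t(block) onto p

double u[N], unew[N];
#pragma xmp align u[i] with t(i)
#pragma xmp align unew[i] with t(i)
#pragma xmp shadow u[1]                 /* one halo element on each side of u     */

void step(void) {
#pragma xmp reflect (u)                 /* update halo elements from neighbors    */
#pragma xmp loop on t(i)
  for (int i = 1; i < N - 1; i++)       /* stencil sweep over local iterations    */
    unew[i] = 0.5 * (u[i - 1] + u[i + 1]);
#pragma xmp loop on t(i)
  for (int i = 1; i < N - 1; i++)       /* copy the result back for the next sweep */
    u[i] = unew[i];
}

int main(void) {
#pragma xmp loop on t(i)
  for (int i = 0; i < N; i++) u[i] = (double)i;  /* initialize the distributed array */
  for (int iter = 0; iter < 10; iter++) step();  /* run a few stencil sweeps         */
  return 0;
}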
isbn 9789811576836
9789811576829
callnumber-first Q - Science
callnumber-subject QA - Mathematics
callnumber-label QA76
callnumber-sort QA 276.76 C65
genre Electronic books.
genre_facet Electronic books.
url https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=6403583
illustrated Not Illustrated
oclc_num 1231606603
work_keys_str_mv AT satomitsuhisa xcalablemppgasprogramminglanguagefromprogrammingmodeltoapplications
status_str n
ids_txt_mv (MiAaPQ)5006403583
(Au-PeEL)EBL6403583
(OCoLC)1231606603
carrierType_str_mv cr
is_hierarchy_title XcalableMP PGAS Programming Language : From Programming Model to Applications.
marc_error Info : Unimarc and ISO-8859-1 translations identical, choosing ISO-8859-1. --- [ 856 : z ]
_version_ 1792331057319641088
fullrecord <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>10481nam a22004213i 4500</leader><controlfield tag="001">5006403583</controlfield><controlfield tag="003">MiAaPQ</controlfield><controlfield tag="005">20240229073836.0</controlfield><controlfield tag="006">m o d | </controlfield><controlfield tag="007">cr cnu||||||||</controlfield><controlfield tag="008">240229s2020 xx o ||||0 eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9789811576836</subfield><subfield code="q">(electronic bk.)</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="z">9789811576829</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(MiAaPQ)5006403583</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(Au-PeEL)EBL6403583</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1231606603</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">MiAaPQ</subfield><subfield code="b">eng</subfield><subfield code="e">rda</subfield><subfield code="e">pn</subfield><subfield code="c">MiAaPQ</subfield><subfield code="d">MiAaPQ</subfield></datafield><datafield tag="050" ind1=" " ind2="4"><subfield code="a">QA76.76.C65</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Sato, Mitsuhisa.</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">XcalableMP PGAS Programming Language :</subfield><subfield code="b">From Programming Model to Applications.</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1st ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Singapore :</subfield><subfield code="b">Springer Singapore Pte. 
Limited,</subfield><subfield code="c">2020.</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">Ã2021.</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 online resource (265 pages)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">computer</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">online resource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="505" ind1="0" ind2=" "><subfield code="a">Intro -- Preface -- Contents -- XcalableMP Programming Model and Language -- 1 Introduction -- 1.1 Target Hardware -- 1.2 Execution Model -- 1.3 Data Model -- 1.4 Programming Models -- 1.4.1 Partitioned Global Address Space -- 1.4.2 Global-View Programming Model -- 1.4.3 Local-View Programming Model -- 1.4.4 Mixture of Global View and Local View -- 1.5 Base Languages -- 1.5.1 Array Section in XcalableMP C -- 1.5.2 Array Assignment Statement in XcalableMP C -- 1.6 Interoperability -- 2 Data Mapping -- 2.1 nodes Directive -- 2.2 template Directive -- 2.3 distribute Directive -- 2.3.1 Block Distribution -- 2.3.2 Cyclic Distribution -- 2.3.3 Block-Cyclic Distribution -- 2.3.4 Gblock Distribution -- 2.3.5 Distribution of Multi-Dimensional Templates -- 2.4 align Directive -- 2.5 Dynamic Allocation of Distributed Array -- 2.6 template_fix Construct -- 3 Work Mapping -- 3.1 task and tasks Construct -- 3.1.1 task Construct -- 3.1.2 tasks Construct -- 3.2 loop Construct -- 3.2.1 Reduction Computation -- 3.2.2 Parallelizing Nested Loop -- 3.3 array Construct -- 4 Data Communication -- 4.1 shadow Directive and reflect Construct -- 4.1.1 Declaring Shadow -- 4.1.2 Updating Shadow -- 4.2 gmove Construct -- 4.2.1 Collective Mode -- 4.2.2 In Mode -- 4.2.3 Out Mode -- 4.3 barrier Construct -- 4.4 reduction Construct -- 4.5 bcast Construct -- 4.6 wait_async Construct -- 4.7 reduce_shadow Construct -- 5 Local-View Programming -- 5.1 Introduction -- 5.2 Coarray Declaration -- 5.3 Put Communication -- 5.4 Get Communication -- 5.5 Synchronization -- 5.5.1 Sync All -- 5.5.2 Sync Images -- 5.5.3 Sync Memory -- 6 Procedure Interface -- 7 XMPT Tool Interface -- 7.1 Overview -- 7.2 Specification -- 7.2.1 Initialization -- 7.2.2 Events -- References -- Implementation and Performance Evaluation of Omni Compiler -- 1 Overview -- 2 Implementation -- 2.1 Operation Flow.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">2.2 Example of Code Translation -- 2.2.1 Distributed Array -- 2.2.2 Loop Statement -- 2.2.3 Communication -- 3 Installation -- 3.1 Overview -- 3.2 Get Source Code -- 3.2.1 From GitHub -- 3.2.2 From Our Website -- 3.3 Software Dependency -- 3.4 General Installation -- 3.4.1 Build and Install -- 3.4.2 Set PATH -- 3.5 Optional Installation -- 3.5.1 OpenACC -- 3.5.2 XcalableACC -- 3.5.3 One-Sided Library -- 4 Creation of Execution Binary -- 4.1 Compile -- 4.2 Execution -- 4.2.1 XcalableMP and XcalableACC -- 4.2.2 OpenACC -- 4.3 Cooperation with Profiler -- 4.3.1 Scalasca -- 4.3.2 tlog -- 5 Performance Evaluation -- 5.1 Experimental Environment -- 5.2 EP STREAM Triad -- 5.2.1 Design -- 5.2.2 Implementation -- 5.2.3 Evaluation -- 5.3 High-Performance Linpack -- 5.3.1 Design -- 5.3.2 
Implementation -- 5.3.3 Evaluation -- 5.4 Global Fast Fourier Transform -- 5.4.1 Design -- 5.4.2 Implementation -- 5.4.3 Evaluation -- 5.5 RandomAccess -- 5.5.1 Design -- 5.5.2 Implementation -- 5.5.3 Evaluation -- 5.6 Discussion -- 6 Conclusion -- References -- Coarrays in the Context of XcalableMP -- 1 Introduction -- 2 Requirements from Language Specifications -- 2.1 Images Mapped to XMP Nodes -- 2.2 Allocation of Coarrays -- 2.3 Communication -- 2.4 Synchronization -- 2.5 Subarrays and Data Contiguity -- 2.6 Coarray C Language Specifications -- 3 Implementation -- 3.1 Omni XMP Compiler Framework -- 3.2 Allocation and Registration -- 3.2.1 Three Methods of Memory Management -- 3.2.2 Initial Allocation for Static Coarrays -- 3.2.3 Runtime Allocation for Allocatable Coarrays -- 3.3 PUT/GET Communication -- 3.3.1 Determining the Possibility of DMA -- 3.3.2 Buffering Communication Methods -- 3.3.3 Non-blocking PUT Communication -- 3.3.4 Optimization of GET Communication -- 3.4 Runtime Libraries -- 3.4.1 Fortran Wrapper -- 3.4.2 Upper-layer Runtime (ULR) Library.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">3.4.3 Lower-layer Runtime (LLR) Library -- 3.4.4 Communication Libraries -- 4 Evaluation -- 4.1 Fundamental Performance -- 4.2 Non-blocking Communication -- 4.3 Application Program -- 4.3.1 Coarray Version of the Himeno Benchmark -- 4.3.2 Measurement Result -- 4.3.3 Productivity -- 5 Related Work -- 6 Conclusion -- References -- XcalableACC: An Integration of XcalableMP and OpenACC -- 1 Introduction -- 1.1 Hardware Model -- 1.2 Programming Model -- 1.2.1 XMP Extensions -- 1.2.2 OpenACC Extensions -- 1.3 Execution Model -- 1.4 Data Model -- 2 XcalableACC Language -- 2.1 Data Mapping -- Example -- 2.2 Work Mapping -- Restriction -- Example 1 -- Example 2 -- 2.3 Data Communication and Synchronization -- Example -- 2.4 Coarrays -- Restriction -- Example -- 2.5 Handling Multiple Accelerators -- 2.5.1 devices Directive -- Example -- 2.5.2 on_device Clause -- 2.5.3 layout Clause -- Example -- 2.5.4 shadow Clause -- Example -- 2.5.5 barrier_device Construct -- Example -- 3 Omni XcalableACC Compiler -- 4 Performance of Lattice QCD Application -- 4.1 Overview of Lattice QCD -- 4.2 Implementation -- 5 Performance Evaluation -- 5.1 Result -- 5.2 Discussion -- 6 Productivity Improvement -- 6.1 Requirement for Productive Parallel Language -- 6.2 Quantitative Evaluation by Delta Source Lines of Codes -- 6.3 Discussion -- References -- Mixed-Language Programming with XcalableMP -- 1 Background -- 2 Translation by Omni Compiler -- 3 Functions for Mixed-Language -- 3.1 Function to Call MPI Program from XMP Program -- 3.2 Function to Call XMP Program from MPI Program -- 3.3 Function to Call XMP Program from Python Program -- 3.3.1 From Parallel Python Program -- 3.3.2 From Sequential Python Program -- 4 Application to Order/Degree Problem -- 4.1 What Is Order/Degree Program -- 4.2 Implementation -- 4.3 Evaluation -- 5 Conclusion -- References.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">Three-Dimensional Fluid Code with XcalableMP -- 1 Introduction -- 2 Global-View Programming Model -- 2.1 Domain Decomposition Methods -- 2.2 Performance on the K Computer -- 2.2.1 Comparison with Hand-Coded MPI Program -- 2.2.2 Optimization for SIMD -- 2.2.3 Optimization for Allocatable Arrays -- 3 Local-View Programming Model -- 3.1 Communications Using Coarray -- 3.2 Performance on the K Computer -- 4 Summary -- References -- Hybrid-View Programming of 
Nuclear Fusion Simulation Code in XcalableMP -- 1 Introduction -- 2 Nuclear Fusion Simulation Code -- 2.1 Gyrokinetic PIC Simulation -- 2.2 GTC -- 3 Implementation of GTC-P by Hybrid-view Programming -- 3.1 Hybrid-View Programming Model -- 3.2 Implementation Based on the XMP-Localview Model: XMP-localview -- 3.3 Implementation Based on the XMP-Hybridview Model: XMP-Hybridview -- 4 Performance Evaluation -- 4.1 Experimental Setting -- 4.2 Results -- 4.3 Productivity and Performance -- 5 Related Research -- 6 Conclusion -- References -- Parallelization of Atomic Image Reconstruction from X-ray Fluorescence Holograms with XcalableMP -- 1 Introduction -- 2 X-ray Fluorescence Holography -- 2.1 Reconstruction of Atomic Images -- 2.2 Analysis Procedure of XFH -- 3 Parallelization -- 3.1 Parallelization of Reconstruction of Two-Dimensional Atomic Images by OpenMP -- 3.2 Parallelization of Reconstruction of Three-dimensional Atomic Images by XcalableMP -- 4 Performance Evaluation -- 4.1 Performance Results of Reconstruction of Two-Dimensional Atomic Images -- 4.2 Performance Results of Reconstruction of Three-dimensional Atomic Images -- 4.3 Comparison of Parallelization with MPI -- 5 Conclusion -- References -- Multi-SPMD Programming Model with YML and XcalableMP -- 1 Introduction -- 2 Background: International Collaborations for the Post-Petascale and Exascale Computing -- 3 Multi-SPMD Programming Model.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">3.1 Overview -- 3.2 YML -- 3.3 OmniRPC-MPI -- 4 Application Development in the mSPMD Programming Environment -- 4.1 Task Generator -- 4.2 Workflow Development -- 4.3 Workflow Execution -- 5 Experiments -- 6 Eigen Solver on the mSPMD Programming Model -- 6.1 Implicitly Restarted Arnoldi Method (IRAM), Multiple Implicitly Restarted Arnoldi Method (MIRAM) and Their Implementations for the mSPMD Programming Model -- 6.2 Experiments -- 7 Fault-Tolerance Features in the mSPMD Programming Model -- 7.1 Overview and Implementation -- 7.2 Experiments -- 8 Runtime Correctness Check for the mSPMD Programming Model -- 8.1 Overview and Implementation -- 8.2 Experiments -- 9 Summary -- References -- XcalableMP 2.0 and Future Directions -- 1 Introduction -- 2 XcalableMP on Fugaku -- 2.1 Performance of XcalableMP Global View Programming -- 2.2 Performance of XcalableMP Local View Programming -- 3 Global Task Parallel Programming -- 3.1 OpenMP and XMP Tasklet Directive -- 3.2 A Proposal for Global Task Parallel Programming -- 3.3 Prototype Design of Code Transformation -- 3.4 Preliminary Performance -- 3.5 Communication Optimization for Manycore Clusters -- 4 Retrospectives and Challenges for Future PGAS Models -- 4.1 Low-Level Communication Layer for PGAS Model -- 4.2 XcalableMP as a DSL for Stencil Applications -- 4.3 XcalableMP API: Compiler-Free Approach -- 4.4 Global Task Parallel Programming Model for Accelerators -- References.</subfield></datafield><datafield tag="588" ind1=" " ind2=" "><subfield code="a">Description based on publisher supplied metadata and other sources.</subfield></datafield><datafield tag="590" ind1=" " ind2=" "><subfield code="a">Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries. 
</subfield></datafield><datafield tag="655" ind1=" " ind2="4"><subfield code="a">Electronic books.</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Print version:</subfield><subfield code="a">Sato, Mitsuhisa</subfield><subfield code="t">XcalableMP PGAS Programming Language</subfield><subfield code="d">Singapore : Springer Singapore Pte. Limited,c2020</subfield><subfield code="z">9789811576829</subfield></datafield><datafield tag="797" ind1="2" ind2=" "><subfield code="a">ProQuest (Firm)</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=6403583</subfield><subfield code="z">Click to View</subfield></datafield></record></collection>