Pro TBB : C++ Parallel Programming with Threading Building Blocks.

Bibliographic Details
Main Author: Voss, Michael.
Contributors: Asenjo, Rafael; Reinders, James.
Place / Publishing House: Berkeley, CA : Apress L. P., 2019.
©2019.
Year of Publication: 2019
Edition: 1st ed.
Language: English
Online Access: https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=5920567
Physical Description: 1 online resource (807 pages)
id 5005920567
ctrlnum (MiAaPQ)5005920567
(Au-PeEL)EBL5920567
(OCoLC)1132430067
collection bib_alma
record_format marc
spelling Voss, Michael.
Pro TBB : C++ Parallel Programming with Threading Building Blocks.
1st ed.
Berkeley, CA : Apress L. P., 2019.
©2019.
1 online resource (807 pages)
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Intro -- Table of Contents -- About the Authors -- Acknowledgments -- Preface -- Part 1 -- Chapter 1: Jumping Right In: "Hello, TBB!" -- Why Threading Building Blocks? -- Performance: Small Overhead, Big Benefits for C++ -- Evolving Support for Parallelism in TBB and C++ -- Recent C++ Additions for Parallelism -- The Threading Building Blocks (TBB) Library -- Parallel Execution Interfaces -- Interfaces That Are Independent of the Execution Model -- Using the Building Blocks in TBB -- Let's Get Started Already! -- Getting the Threading Building Blocks (TBB) Library -- Getting a Copy of the Examples -- Writing a First "Hello, TBB!" Example -- Building the Simple Examples -- Steps to Set Up an Environment -- Building on Windows Using Microsoft Visual Studio -- Building on a Linux Platform from a Terminal -- Using the Intel Compiler -- tbbvars and pstlvars Scripts -- Setting Up Variables Manually Without Using the tbbvars Script or the Intel Compiler -- A More Complete Example -- Starting with a Serial Implementation -- Adding a Message-Driven Layer Using a Flow Graph -- Adding a Fork-Join Layer Using a parallel_for -- Adding a SIMD Layer Using a Parallel STL Transform -- Summary -- Chapter 2: Generic Parallel Algorithms -- Functional / Task Parallelism -- A Slightly More Complicated Example: A Parallel Implementation of Quicksort -- Loops: parallel_for, parallel_reduce, and parallel_scan -- parallel_for: Applying a Body to Each Element in a Range -- A Slightly More Complicated Example: Parallel Matrix Multiplication -- parallel_reduce: Calculating a Single Result Across a Range -- A Slightly More Complicated Example: Calculating π by Numerical Integration -- parallel_scan: A Reduction with Intermediate Values -- How Does This Work? -- A Slightly More Complicated Example: Line of Sight -- Cook Until Done: parallel_do and parallel_pipeline.
parallel_do: Apply a Body Until There Are No More Items Left -- A Slightly More Complicated Example: Forward Substitution -- parallel_pipeline: Streaming Items Through a Series of Filters -- A Slightly More Complicated Example: Creating 3D Stereoscopic Images -- Summary -- For More Information -- Chapter 3: Flow Graphs -- Why Use Graphs to Express Parallelism? -- The Basics of the TBB Flow Graph Interface -- Step 1: Create the Graph Object -- Step 2: Make the Nodes -- Step 3: Add Edges -- Step 4: Start the Graph -- Step 5: Wait for the Graph to Complete Executing -- A More Complicated Example of a Data Flow Graph -- Implementing the Example as a TBB Flow Graph -- Understanding the Performance of a Data Flow Graph -- The Special Case of Dependency Graphs -- Implementing a Dependency Graph -- Estimating the Scalability of a Dependency Graph -- Advanced Topics in TBB Flow Graphs -- Summary -- Chapter 4: TBB and the Parallel Algorithms of the C++ Standard Template Library -- Does the C++ STL Library Belong in This Book? -- A Parallel STL Execution Policy Analogy -- A Simple Example Using std::for_each -- What Algorithms Are Provided in a Parallel STL Implementation? -- How to Get and Use a Copy of Parallel STL That Uses TBB -- Algorithms in Intel's Parallel STL -- Capturing More Use Cases with Custom Iterators -- Highlighting Some of the Most Useful Algorithms -- std::for_each, std::for_each_n -- std::transform -- std::reduce -- std::transform_reduce -- A Deeper Dive into the Execution Policies -- The sequenced_policy -- The parallel_policy -- The unsequenced_policy -- The parallel_unsequenced_policy -- Which Execution Policy Should We Use? -- Other Ways to Introduce SIMD Parallelism -- Summary -- For More Information -- Chapter 5: Synchronization: Why and How to Avoid It -- A Running Example: Histogram of an Image -- An Unsafe Parallel Implementation.
A First Safe Parallel Implementation: Coarse-Grained Locking -- Mutex Flavors -- A Second Safe Parallel Implementation: Fine-Grained Locking -- A Third Safe Parallel Implementation: Atomics -- A Better Parallel Implementation: Privatization and Reduction -- Thread Local Storage, TLS -- enumerable_thread_specific, ETS -- combinable -- The Easiest Parallel Implementation: Reduction Template -- Recap of Our Options -- Summary -- For More Information -- Chapter 6: Data Structures for Concurrency -- Key Data Structures Basics -- Unordered Associative Containers -- Map vs. Set -- Multiple Values -- Hashing -- Unordered -- Concurrent Containers -- Concurrent Unordered Associative Containers -- concurrent_hash_map -- Concurrent Support for map/multimap and set/multiset Interfaces -- Built-In Locking vs. No Visible Locking -- Iterating Through These Structures Is Asking for Trouble -- Concurrent Queues: Regular, Bounded, and Priority -- Bounding Size -- Priority Ordering -- Staying Thread-Safe: Try to Forget About Top, Size, Empty, Front, Back -- Iterators -- Why to Use This Concurrent Queue: The A-B-A Problem -- When to NOT Use Queues: Think Algorithms! -- Concurrent Vector -- When to Use tbb::concurrent_vector Instead of std::vector -- Elements Never Move -- Concurrent Growth of concurrent_vectors -- Summary -- Chapter 7: Scalable Memory Allocation -- Modern C++ Memory Allocation -- Scalable Memory Allocation: What -- Scalable Memory Allocation: Why -- Avoiding False Sharing with Padding -- Scalable Memory Allocation Alternatives: Which -- Compilation Considerations -- Most Popular Usage (C/C++ Proxy Library): How -- Linux: malloc/new Proxy Library Usage -- macOS: malloc/new Proxy Library Usage -- Windows: malloc/new Proxy Library Usage -- Testing our Proxy Library Usage -- C Functions: Scalable Memory Allocators for C.
C++ Classes: Scalable Memory Allocators for C++ -- Allocators with std::allocator<T> Signature -- scalable_allocator -- tbb_allocator -- zero_allocator -- cached_aligned_allocator -- Memory Pool Support: memory_pool_allocator -- Array Allocation Support: aligned_space -- Replacing new and delete Selectively -- Performance Tuning: Some Control Knobs -- What Are Huge Pages? -- TBB Support for Huge Pages -- scalable_allocation_mode(int mode, intptr_t value) -- TBBMALLOC_USE_HUGE_PAGES -- TBBMALLOC_SET_SOFT_HEAP_LIMIT -- int scalable_allocation_command(int cmd, void *param) -- TBBMALLOC_CLEAN_ALL_BUFFERS -- TBBMALLOC_CLEAN_THREAD_BUFFERS -- Summary -- Chapter 8: Mapping Parallel Patterns to TBB -- Parallel Patterns vs. Parallel Algorithms -- Patterns Categorize Algorithms, Designs, etc. -- Patterns That Work -- Data Parallelism Wins -- Nesting Pattern -- Map Pattern -- Workpile Pattern -- Reduction Patterns (Reduce and Scan) -- Fork-Join Pattern -- Divide-and-Conquer Pattern -- Branch-and-Bound Pattern -- Pipeline Pattern -- Event-Based Coordination Pattern (Reactive Streams) -- Summary -- For More Information -- Part 2 -- Chapter 9: The Pillars of Composability -- What Is Composability? -- Nested Composition -- Concurrent Composition -- Serial Composition -- The Features That Make TBB a Composable Library -- The TBB Thread Pool (the Market) and Task Arenas -- The TBB Task Dispatcher: Work Stealing and More -- Putting It All Together -- Looking Forward -- Controlling the Number of Threads -- Work Isolation -- Task-to-Thread and Thread-to-Core Affinity -- Task Priorities -- Summary -- For More Information -- Chapter 10: Using Tasks to Create Your Own Algorithms -- A Running Example: The Sequence -- The High-Level Approach: parallel_invoke -- The Highest Among the Lower: task_group -- The Low-Level Task Interface: Part One - Task Blocking.
The Low-Level Task Interface: Part Two - Task Continuation -- Bypassing the Scheduler -- The Low-Level Task Interface: Part Three - Task Recycling -- Task Interface Checklist -- One More Thing: FIFO (aka Fire-and-Forget) Tasks -- Putting These Low-Level Features to Work -- Summary -- For More Information -- Chapter 11: Controlling the Number of Threads Used for Execution -- A Brief Recap of the TBB Scheduler Architecture -- Interfaces for Controlling the Number of Threads -- Controlling Thread Count with task_scheduler_init -- Controlling Thread Count with task_arena -- Controlling Thread Count with global_control -- Summary of Concepts and Classes -- The Best Approaches for Setting the Number of Threads -- Using a Single task_scheduler_init Object for a Simple Application -- Using More Than One task_scheduler_init Object in a Simple Application -- Using Multiple Arenas with Different Numbers of Slots to Influence Where TBB Places Its Worker Threads -- Using global_control to Control How Many Threads Are Available to Fill Arena Slots -- Using global_control to Temporarily Restrict the Number of Available Threads -- When NOT to Control the Number of Threads -- Figuring Out What's Gone Wrong -- Summary -- Chapter 12: Using Work Isolation for Correctness and Performance -- Work Isolation for Correctness -- Creating an Isolated Region with this_task_arena::isolate -- Oh No! Work Isolation Can Cause Its Own Correctness Issues! -- Even When It Is Safe, Work Isolation Is Not Free -- Using Task Arenas for Isolation: A Double-Edged Sword -- Don't Be Tempted to Use task_arenas to Create Work Isolation for Correctness -- Summary -- For More Information -- Chapter 13: Creating Thread-to-Core and Task-to-Thread Affinity -- Creating Thread-to-Core Affinity -- Creating Task-to-Thread Affinity -- When and How Should We Use the TBB Affinity Features? -- Summary.
For More Information.
Description based on publisher supplied metadata and other sources.
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
Electronic books.
Asenjo, Rafael.
Reinders, James.
Print version: Voss, Michael Pro TBB Berkeley, CA : Apress L. P.,c2019 9781484243978
ProQuest (Firm)
https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=5920567
language English
format eBook
author Voss, Michael.
spellingShingle Voss, Michael.
Pro TBB : C++ Parallel Programming with Threading Building Blocks.
author_facet Voss, Michael.
Asenjo, Rafael.
Reinders, James.
author_variant m v mv
author2 Asenjo, Rafael.
Reinders, James.
author2_variant r a ra
j r jr
author2_role Contributor
Contributor
author_sort Voss, Michael.
title Pro TBB : C++ Parallel Programming with Threading Building Blocks.
title_sub C++ Parallel Programming with Threading Building Blocks.
title_full Pro TBB : C++ Parallel Programming with Threading Building Blocks.
title_fullStr Pro TBB : C++ Parallel Programming with Threading Building Blocks.
title_full_unstemmed Pro TBB : C++ Parallel Programming with Threading Building Blocks.
title_auth Pro TBB : C++ Parallel Programming with Threading Building Blocks.
title_new Pro TBB :
title_sort pro tbb : c++ parallel programming with threading building blocks.
publisher Apress L. P.,
publishDate 2019
physical 1 online resource (807 pages)
edition 1st ed.
isbn 9781484243985
9781484243978
callnumber-first Q - Science
callnumber-subject QA - Mathematics
callnumber-label QA76
callnumber-sort QA 276.76 C65
genre Electronic books.
genre_facet Electronic books.
url https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=5920567
illustrated Not Illustrated
oclc_num 1132430067
work_keys_str_mv AT vossmichael protbbcparallelprogrammingwiththreadingbuildingblocks
AT asenjorafael protbbcparallelprogrammingwiththreadingbuildingblocks
AT reindersjames protbbcparallelprogrammingwiththreadingbuildingblocks
status_str n
ids_txt_mv (MiAaPQ)5005920567
(Au-PeEL)EBL5920567
(OCoLC)1132430067
carrierType_str_mv cr
is_hierarchy_title Pro TBB : C++ Parallel Programming with Threading Building Blocks.
author2_original_writing_str_mv noLinkedField
noLinkedField
_version_ 1792331056497557505
fullrecord <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>11841nam a22004573i 4500</leader><controlfield tag="001">5005920567</controlfield><controlfield tag="003">MiAaPQ</controlfield><controlfield tag="005">20240229073832.0</controlfield><controlfield tag="006">m o d | </controlfield><controlfield tag="007">cr cnu||||||||</controlfield><controlfield tag="008">240229s2019 xx o ||||0 eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781484243985</subfield><subfield code="q">(electronic bk.)</subfield></datafield><datafield tag="020" ind1=" " ind2=" "><subfield code="z">9781484243978</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(MiAaPQ)5005920567</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(Au-PeEL)EBL5920567</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)1132430067</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">MiAaPQ</subfield><subfield code="b">eng</subfield><subfield code="e">rda</subfield><subfield code="e">pn</subfield><subfield code="c">MiAaPQ</subfield><subfield code="d">MiAaPQ</subfield></datafield><datafield tag="050" ind1=" " ind2="4"><subfield code="a">QA76.76.C65</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Voss, Michael.</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Pro TBB :</subfield><subfield code="b">C++ Parallel Programming with Threading Building Blocks.</subfield></datafield><datafield tag="250" ind1=" " ind2=" "><subfield code="a">1st ed.</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="a">Berkeley, CA :</subfield><subfield code="b">Apress L. 
P.,</subfield><subfield code="c">2019.</subfield></datafield><datafield tag="264" ind1=" " ind2="4"><subfield code="c">©2019.</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">1 online resource (807 pages)</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">computer</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">online resource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="505" ind1="0" ind2=" "><subfield code="a">Intro -- Table of Contents -- About the Authors -- Acknowledgments -- Preface -- Part 1 -- Chapter 1: Jumping Right In: "Hello, TBB!" -- Why Threading Building Blocks? -- Performance: Small Overhead, Big Benefits for C++ -- Evolving Support for Parallelism in TBB and C++ -- Recent C++ Additions for Parallelism -- The Threading Building Blocks (TBB) Library -- Parallel Execution Interfaces -- Interfaces That Are Independent of the Execution Model -- Using the Building Blocks in TBB -- Let's Get Started Already! -- Getting the Threading Building Blocks (TBB) Library -- Getting a Copy of the Examples -- Writing a First "Hello, TBB!" 
Example -- Building the Simple Examples -- Steps to Set Up an Environment -- Building on Windows Using Microsoft Visual Studio -- Building on a Linux Platform from a Terminal -- Using the Intel Compiler -- tbbvars and pstlvars Scripts -- Setting Up Variables Manually Without Using the tbbvars Script or the Intel Compiler -- A More Complete Example -- Starting with a Serial Implementation -- Adding a Message-Driven Layer Using a Flow Graph -- Adding a Fork-Join Layer Using a parallel_for -- Adding a SIMD Layer Using a Parallel STL Transform -- Summary -- Chapter 2: Generic Parallel Algorithms -- Functional / Task Parallelism -- A Slightly More Complicated Example: A Parallel Implementation of Quicksort -- Loops: parallel_for, parallel_reduce, and parallel_scan -- parallel_for: Applying a Body to Each Element in a Range -- A Slightly More Complicated Example: Parallel Matrix Multiplication -- parallel_reduce: Calculating a Single Result Across a Range -- A Slightly More Complicated Example: Calculating π by Numerical Integration -- parallel_scan: A Reduction with Intermediate Values -- How Does This Work? -- A Slightly More Complicated Example: Line of Sight -- Cook Until Done: parallel_do and parallel_pipeline.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">parallel_do: Apply a Body Until There Are No More Items Left -- A Slightly More Complicated Example: Forward Substitution -- parallel_pipeline: Streaming Items Through a Series of Filters -- A Slightly More Complicated Example: Creating 3D Stereoscopic Images -- Summary -- For More Information -- Chapter 3: Flow Graphs -- Why Use Graphs to Express Parallelism? 
-- The Basics of the TBB Flow Graph Interface -- Step 1: Create the Graph Object -- Step 2: Make the Nodes -- Step 3: Add Edges -- Step 4: Start the Graph -- Step 5: Wait for the Graph to Complete Executing -- A More Complicated Example of a Data Flow Graph -- Implementing the Example as a TBB Flow Graph -- Understanding the Performance of a Data Flow Graph -- The Special Case of Dependency Graphs -- Implementing a Dependency Graph -- Estimating the Scalability of a Dependency Graph -- Advanced Topics in TBB Flow Graphs -- Summary -- Chapter 4: TBB and the Parallel Algorithms of the C++ Standard Template Library -- Does the C++ STL Library Belong in This Book? -- A Parallel STL Execution Policy Analogy -- A Simple Example Using std::for_each -- What Algorithms Are Provided in a Parallel STL Implementation? -- How to Get and Use a Copy of Parallel STL That Uses TBB -- Algorithms in Intel's Parallel STL -- Capturing More Use Cases with Custom Iterators -- Highlighting Some of the Most Useful Algorithms -- std::for_each, std::for_each_n -- std::transform -- std::reduce -- std::transform_reduce -- A Deeper Dive into the Execution Policies -- The sequenced_policy -- The parallel_policy -- The unsequenced_policy -- The parallel_unsequenced_policy -- Which Execution Policy Should We Use? 
-- Other Ways to Introduce SIMD Parallelism -- Summary -- For More Information -- Chapter 5: Synchronization: Why and How to Avoid It -- A Running Example: Histogram of an Image -- An Unsafe Parallel Implementation.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">A First Safe Parallel Implementation: Coarse-Grained Locking -- Mutex Flavors -- A Second Safe Parallel Implementation: Fine-Grained Locking -- A Third Safe Parallel Implementation: Atomics -- A Better Parallel Implementation: Privatization and Reduction -- Thread Local Storage, TLS -- enumerable_thread_specific, ETS -- combinable -- The Easiest Parallel Implementation: Reduction Template -- Recap of Our Options -- Summary -- For More Information -- Chapter 6: Data Structures for Concurrency -- Key Data Structures Basics -- Unordered Associative Containers -- Map vs. Set -- Multiple Values -- Hashing -- Unordered -- Concurrent Containers -- Concurrent Unordered Associative Containers -- concurrent_hash_map -- Concurrent Support for map/multimap and set/multiset Interfaces -- Built-In Locking vs. No Visible Locking -- Iterating Through These Structures Is Asking for Trouble -- Concurrent Queues: Regular, Bounded, and Priority -- Bounding Size -- Priority Ordering -- Staying Thread-Safe: Try to Forget About Top, Size, Empty, Front, Back -- Iterators -- Why to Use This Concurrent Queue: The A-B-A Problem -- When to NOT Use Queues: Think Algorithms! 
-- Concurrent Vector -- When to Use tbb::concurrent_vector Instead of std::vector -- Elements Never Move -- Concurrent Growth of concurrent_vectors -- Summary -- Chapter 7: Scalable Memory Allocation -- Modern C++ Memory Allocation -- Scalable Memory Allocation: What -- Scalable Memory Allocation: Why -- Avoiding False Sharing with Padding -- Scalable Memory Allocation Alternatives: Which -- Compilation Considerations -- Most Popular Usage (C/C++ Proxy Library): How -- Linux: malloc/new Proxy Library Usage -- macOS: malloc/new Proxy Library Usage -- Windows: malloc/new Proxy Library Usage -- Testing our Proxy Library Usage -- C Functions: Scalable Memory Allocators for C.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">C++ Classes: Scalable Memory Allocators for C++ -- Allocators with std::allocator&lt;T&gt; Signature -- scalable_allocator -- tbb_allocator -- zero_allocator -- cached_aligned_allocator -- Memory Pool Support: memory_pool_allocator -- Array Allocation Support: aligned_space -- Replacing new and delete Selectively -- Performance Tuning: Some Control Knobs -- What Are Huge Pages? -- TBB Support for Huge Pages -- scalable_allocation_mode(int mode, intptr_t value) -- TBBMALLOC_USE_HUGE_PAGES -- TBBMALLOC_SET_SOFT_HEAP_LIMIT -- int scalable_allocation_command(int cmd, void *param) -- TBBMALLOC_CLEAN_ALL_BUFFERS -- TBBMALLOC_CLEAN_THREAD_BUFFERS -- Summary -- Chapter 8: Mapping Parallel Patterns to TBB -- Parallel Patterns vs. Parallel Algorithms -- Patterns Categorize Algorithms, Designs, etc. -- Patterns That Work -- Data Parallelism Wins -- Nesting Pattern -- Map Pattern -- Workpile Pattern -- Reduction Patterns (Reduce and Scan) -- Fork-Join Pattern -- Divide-and-Conquer Pattern -- Branch-and-Bound Pattern -- Pipeline Pattern -- Event-Based Coordination Pattern (Reactive Streams) -- Summary -- For More Information -- Part 2 -- Chapter 9: The Pillars of Composability -- What Is Composability? 
-- Nested Composition -- Concurrent Composition -- Serial Composition -- The Features That Make TBB a Composable Library -- The TBB Thread Pool (the Market) and Task Arenas -- The TBB Task Dispatcher: Work Stealing and More -- Putting It All Together -- Looking Forward -- Controlling the Number of Threads -- Work Isolation -- Task-to-Thread and Thread-to-Core Affinity -- Task Priorities -- Summary -- For More Information -- Chapter 10: Using Tasks to Create Your Own Algorithms -- A Running Example: The Sequence -- The High-Level Approach: parallel_invoke -- The Highest Among the Lower: task_group -- The Low-Level Task Interface: Part One - Task Blocking.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">The Low-Level Task Interface: Part Two - Task Continuation -- Bypassing the Scheduler -- The Low-Level Task Interface: Part Three - Task Recycling -- Task Interface Checklist -- One More Thing: FIFO (aka Fire-and-Forget) Tasks -- Putting These Low-Level Features to Work -- Summary -- For More Information -- Chapter 11: Controlling the Number of Threads Used for Execution -- A Brief Recap of the TBB Scheduler Architecture -- Interfaces for Controlling the Number of Threads -- Controlling Thread Count with task_scheduler_init -- Controlling Thread Count with task_arena -- Controlling Thread Count with global_control -- Summary of Concepts and Classes -- The Best Approaches for Setting the Number of Threads -- Using a Single task_scheduler_init Object for a Simple Application -- Using More Than One task_scheduler_init Object in a Simple Application -- Using Multiple Arenas with Different Numbers of Slots to Influence Where TBB Places Its Worker Threads -- Using global_control to Control How Many Threads Are Available to Fill Arena Slots -- Using global_control to Temporarily Restrict the Number of Available Threads -- When NOT to Control the Number of Threads -- Figuring Out What's Gone Wrong -- Summary -- Chapter 12: Using Work Isolation 
for Correctness and Performance -- Work Isolation for Correctness -- Creating an Isolated Region with this_task_arena::isolate -- Oh No! Work Isolation Can Cause Its Own Correctness Issues! -- Even When It Is Safe, Work Isolation Is Not Free -- Using Task Arenas for Isolation: A Double-Edged Sword -- Don't Be Tempted to Use task_arenas to Create Work Isolation for Correctness -- Summary -- For More Information -- Chapter 13: Creating Thread-to-Core and Task-to-Thread Affinity -- Creating Thread-to-Core Affinity -- Creating Task-to-Thread Affinity -- When and How Should We Use the TBB Affinity Features? -- Summary.</subfield></datafield><datafield tag="505" ind1="8" ind2=" "><subfield code="a">For More Information.</subfield></datafield><datafield tag="588" ind1=" " ind2=" "><subfield code="a">Description based on publisher supplied metadata and other sources.</subfield></datafield><datafield tag="590" ind1=" " ind2=" "><subfield code="a">Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries. </subfield></datafield><datafield tag="655" ind1=" " ind2="4"><subfield code="a">Electronic books.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Asenjo, Rafael.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Reinders, James.</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Print version:</subfield><subfield code="a">Voss, Michael</subfield><subfield code="t">Pro TBB</subfield><subfield code="d">Berkeley, CA : Apress L. 
P.,c2019</subfield><subfield code="z">9781484243978</subfield></datafield><datafield tag="797" ind1="2" ind2=" "><subfield code="a">ProQuest (Firm)</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://ebookcentral.proquest.com/lib/oeawat/detail.action?docID=5920567</subfield><subfield code="z">Click to View</subfield></datafield></record></collection>