Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
This book is about programming for data parallelism using C++. If you are new to parallel programming, that is okay. If you have never heard of SYCL or the DPC++ compiler, that is also okay.
SYCL is an industry-driven Khronos standard adding data parallelism to C++ for heterogeneous systems. DPC++ is an open source compiler project based on SYCL plus a few extensions, with broad heterogeneous support that includes GPUs, CPUs, and FPGAs. All examples in this book compile and work with DPC++ compilers.
If you are a C programmer who is not well versed in C++, you are in good company. Several of the authors of this book happily admit that we picked up C++ by reading books that used C++ like this one. With a little patience, this book should be approachable by C programmers with a desire to write modern C++ programs.
When this book project began in 2019, our vision for fully supporting C++ with data parallelism required a number of extensions beyond the then-current SYCL 1.2.1 standard. These extensions, supported by the DPC++ compiler, included support for Unified Shared Memory (USM), sub-groups to complete a three-level hierarchy throughout SYCL, anonymous lambdas, and numerous programming simplifications.
At the time that this book is being published (late 2020), a provisional SYCL 2020 specification is available for public comment. The provisional specification includes support for USM, sub-groups, anonymous lambdas, and simplifications for coding (akin to C++17 CTAD). This book teaches SYCL with extensions to approximate where SYCL will be in the future. These extensions are implemented in the DPC++ compiler project. While we expect changes to be small compared to the bulk of what this book teaches, there will be changes with SYCL as the community helps refine it. Important resources for updated information include the book GitHub and errata that can be found from the web page for this book (www.apress.com/9781484255735), as well as the oneAPI DPC++ language reference (tinyurl.com/dpcppref).
The evolution of SYCL and DPC++ continues. Prospects for the future are discussed in the Epilogue, after we have taken a journey together to learn how to use DPC++ to create programs for heterogeneous systems using SYCL.
It is our hope that our book supports and helps grow the SYCL community, and helps promote data-parallel programming in C++.
It is hard to leap in and explain everything at once. In fact, it is impossible as far as we know. Therefore, this book is a journey that takes us through what we need to know to be an effective programmer with Data Parallel C++.
Chapter 1 lays the first foundation by covering core concepts that are either new or worth refreshing in our minds.
Chapters 2–4 lay a foundation of understanding for data-parallel programming in C++. When we finish reading Chapters 1–4, we will have a solid foundation for data-parallel programming in C++. Chapters 1–4 build on each other, and are best read in order.
Chapters 5–19 fill in important details by building on each other to some degree while being easy to jump between if desired. The book concludes with an Epilogue that discusses likely and possible future directions for Data Parallel C++.
James Reinders
Ben Ashbaugh
James Brodman
Michael Kinsner
John Pennycook
Xinmin Tian
October 2020
We all get to new heights by building on the work of others. Isaac Newton credited his success to “standing on the shoulders of giants.” We would all be in trouble if this were not allowed.
Perhaps there is no easy path to writing a new book on exciting new developments such as SYCL and DPC++. Fortunately, there are good people who make that path easier; it is our great joy to thank them for their help!
We are deeply thankful for all those whose work has helped make this book possible, and we do wish to thank as many as we can recall by name. If we stood on your shoulders and did not call you out by name, you can know we are thankful, and please accept our apologies for any accidental forgetfulness.
A handful of people tirelessly read our early manuscripts and provided insightful feedback for which we are very grateful. These reviewers include Jefferson Amstutz, Thomas Applencourt, Alexey Bader, Gordon Brown, Konstantin Bobrovsky, Robert Cohn, Jessica Davies, Tom Deakin, Abhishek Deshmukh, Bill Dieter, Max Domeika, Todd Erdner, John Freeman, Joe Garvey, Nithin George, Milind Girkar, Sunny Gogar, Jeff Hammond, Tommy Hoffner, Zheming Jin, Paul Jurczak, Audrey Kertesz, Evgueny Khartchenko, Jeongnim Kim, Rakshith Krishnappa, Goutham Kalikrishna Reddy Kuncham, Victor Lomüller, Susan Meredith, Paul Petersen, Felipe De Azevedo Piovezan, Ruyman Reyes, Jason Sewall, Byron Sinclair, Philippe Thierry, and Peter Žužek.
We thank the entire development team at Intel who created DPC++ including its libraries and documentation, without which this book would not be possible.
The Khronos SYCL working group and Codeplay are giants on which we have relied. We share, with them, the goal of bringing effective and usable data parallelism to C++. We thank all those involved in the development of the SYCL specification. Their tireless work to bring forward a truly open standard for the entire industry is to be admired. The SYCL team has been true to its mission and desire to keep this standard really open. We also highly appreciate the trailblazing work done by Codeplay, to promote and support SYCL before DPC++ was even a glimmer in our eyes. They continue to be an important resource for the entire community.
Many people within Intel have contributed extensively to DPC++ and SYCL—too many to name here. We thank all of you for your hard work, both in the evolution of the language and APIs and in the implementation of prototypes, compilers, libraries, and tools. Although we can’t name everyone, we would like to specifically thank some of the key language evolution architects who have made transformative contributions to DPC++ and SYCL: Roland Schulz, Alexey Bader, Jason Sewall, Alex Wells, Ilya Burylov, Greg Lueck, Alexey Kukanov, Ruslan Arutyunyan, Jeff Hammond, Erich Keane, and Konstantin Bobrovsky.
We appreciate the patience and dedication of the DPC++ user community. The developers at Argonne National Lab have been incredibly supportive in our journey together with DPC++.
As coauthors, we cannot thank each other enough. We came together in early 2019, with a vision that we would write a book to teach SYCL and DPC++. Over the next year, we became a team that learned how to teach together. We faced challenges from many commitments that tried to pull us away from book writing and reviewing, including product deadlines and standards work. Added to the mix for the entire world was COVID-19. We are a little embarrassed to admit that the stay-at-home orders gave us a non-trivial boost in time and focus for the book. Our thoughts and prayers extend to all those affected by this global pandemic.
James Reinders: I wish to thank Jefferson Amstutz for enlightening discussions of parallelism in C++ and some critically useful help to get some C++ coding straight by using Jefferson’s superpower as C++ compiler error message whisperer. I thank my wife, Susan Meredith, for her love, support, advice, and review. I am grateful for those in Intel who thought I would enjoy helping with this project and asked me to join in the fun! Many thanks to coauthors for their patience (with me) and hard work on this ambitious project.
Ben Ashbaugh: I am grateful for the support and encouragement of my wife, Brenna, and son, Spencer. Thank you for the uninterrupted writing time, and for excuses to go for a walk or play games when I needed a break! To everyone in the Khronos SYCL and OpenCL working groups, thank you for the discussion, collaboration, and inspiration. DPC++ and this book would not be possible without you.
James Brodman: I thank my family and friends for all their support. Thanks to all my colleagues at Intel and in Khronos for great discussions and collaborations.
Michael Kinsner: I thank my wife, Jasmine, and children, Winston and Tilly, for their support during the writing of this book and throughout the DPC++ project. Both have required a lot of time and energy, and I wouldn’t have been able to do either without so much support. A thank you also goes to many people at Intel and Khronos who have poured their energy and time into SYCL and DPC++. All of you have shaped SYCL, OpenCL, and DPC++ and have been part of the countless discussions and experiments that have informed the thinking leading to DPC++ and this book.
John Pennycook: I cannot thank my wife, Louise, enough for her patience, understanding, and support in juggling book writing with care of our newborn daughter, Tamsyn. Thanks also to Roland Schulz and Jason Sewall for all of their work on DPC++ and their assistance in making sense of C++ compiler errors!
Xinmin Tian: I appreciate Alice S. Chan and Geoff Lowney for their strong support during the writing of the book and throughout the DPC++ performance work. Sincere thanks to Guei-Yuan Lueh, Konstantin Bobrovsky, Hideki Saito, Kaiyu Chen, Mikhail Loenko, Silvia Linares, Pavel Chupin, Oleg Maslov, Sergey Maslov, Vlad Romanov, Alexey Sotkin, Alexey Sachkov, and the entire DPC++ compiler, runtime, and tools teams for all of their great contributions and hard work in making the DPC++ compiler and tools possible.
We appreciate the hard work by the entire Apress team, including the people we worked with directly the most: Natalie Pao, Jessica Vakili, C Dulcy Nirmala, and Krishnan Sathyamurthy.
We were blessed with the support and encouragement of some special managers, including Herb Hinstorff, Bill Savage, Alice S. Chan, Victor Lee, Ann Bynum, John Kreatsoulas, Geoff Lowney, Zack Waters, Sanjiv Shah, John Freeman, and Kevin Stevens.
Numerous colleagues offered information, advice, and vision. We are sure that there are more than a few people whom we have failed to mention who have positively impacted this book project. We thank all those who helped by slipping their ingredients into our book project. We apologize to all who helped us and were not mentioned here.
Thank you all, and we hope you find this book invaluable in your endeavors.
James Reinders is a consultant with more than three decades of experience in parallel computing and is an author/coauthor/editor of ten technical books related to parallel programming. He has had the great fortune to help make key contributions to two of the world’s fastest computers (#1 on the TOP500 list) as well as many other supercomputers and software developer tools. James finished 10,001 days (over 27 years) at Intel in mid-2016, and he continues to write, teach, program, and consult in areas related to parallel computing (HPC and AI).
Ben Ashbaugh is a Software Architect at Intel Corporation where he has worked for over 20 years developing software drivers for Intel graphics products. For the past 10 years, Ben has focused on parallel programming models for general-purpose computation on graphics processors, including SYCL and DPC++. Ben is active in the Khronos SYCL, OpenCL, and SPIR working groups, helping to define industry standards for parallel programming, and he has authored numerous extensions to expose unique Intel GPU features.
James Brodman is a software engineer at Intel Corporation working on runtimes and compilers for parallel programming, and he is one of the architects of DPC++. He has a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign.
Michael Kinsner is a Principal Engineer at Intel Corporation developing parallel programming languages and models for a variety of architectures, and he is one of the architects of DPC++. He contributes extensively to spatial programming models and compilers, and is an Intel representative within The Khronos Group where he works on the SYCL and OpenCL industry standards for parallel programming. Mike has a Ph.D. in Computer Engineering from McMaster University, and is passionate about programming models that cross architectures while still enabling performance.
John Pennycook is an HPC Application Engineer at Intel Corporation, focused on enabling developers to fully utilize the parallelism available in modern processors. He is experienced in optimizing and parallelizing applications from a range of scientific domains, and previously served as Intel’s representative on the steering committee for the Intel eXtreme Performance User’s Group (IXPUG). John has a Ph.D. in Computer Science from the University of Warwick. His research interests are varied, but a recurring theme is the ability to achieve application “performance portability” across different hardware architectures.
Xinmin Tian is a Senior Principal Engineer and Compiler Architect at Intel Corporation, and serves as Intel’s representative on the OpenMP Architecture Review Board (ARB). He is responsible for driving OpenMP offloading, vectorization, and parallelization compiler technologies for current and future Intel architectures. His current focus is on LLVM-based OpenMP offloading, DPC++ compiler optimizations for Intel oneAPI Toolkits for CPU and Xe accelerators, and tuning HPC/AI application performance. He has a Ph.D. in Computer Science, holds 27 U.S. patents, has published over 60 technical papers with over 1200 citations of his work, and has co-authored two books that span his expertise.