| ============================================= |
| SYCL Compiler and Runtime architecture design |
| ============================================= |
| |
| .. contents:: |
| :local: |
| |
| Introduction |
| ============ |
| |
| This document describes the architecture of the SYCL compiler and runtime |
| library. More details are provided in |
| `external document <https://github.com/intel/llvm/blob/sycl/sycl/doc/CompilerAndRuntimeDesign.md>`_\ , |
| which are going to be added to clang documentation in the future. |
| |
| Address space handling |
| ====================== |
| |
| The SYCL specification represents pointers to disjoint memory regions using C++ |
| wrapper classes on an accelerator to enable compilation with a standard C++ |
| toolchain and a SYCL compiler toolchain. Section 3.8.2 of SYCL 2020 |
| specification defines |
| `memory model <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_sycl_device_memory_model>`_\ , |
| section 4.7.7 - `address space classes <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_classes>`_ |
| and section 5.9 covers `address space deduction <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_address_space_deduction>`_. |
| The SYCL specification allows two modes of address space deduction: "generic as |
| default address space" (see section 5.9.3) and "inferred address space" (see |
| section 5.9.4). Current implementation supports only "generic as default address |
| space" mode. |
| |
| SYCL borrows its memory model from OpenCL however SYCL doesn't perform |
| the address space qualifier inference as detailed in |
| `OpenCL C v3.0 6.7.8 <https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_C.html#addr-spaces-inference>`_. |
| |
| The default address space is "generic-memory", which is a virtual address space |
| that overlaps the global, local, and private address spaces. SYCL mode enables |
| following conversions: |
| |
| - explicit conversions to/from the default address space from/to the address |
| space-attributed type |
| - implicit conversions from the address space-attributed type to the default |
| address space |
| - explicit conversions to/from the global address space from/to the |
| ``__attribute__((opencl_global_device))`` or |
| ``__attribute__((opencl_global_host))`` address space-attributed type |
| - implicit conversions from the ``__attribute__((opencl_global_device))`` or |
| ``__attribute__((opencl_global_host))`` address space-attributed type to the |
| global address space |
| |
| All named address spaces are disjoint and sub-sets of default address space. |
| |
| The SPIR target allocates SYCL namespace scope variables in the global address |
| space. |
| |
| Pointers to default address space should get lowered into a pointer to a generic |
| address space (or flat to reuse more general terminology). But depending on the |
| allocation context, the default address space of a non-pointer type is assigned |
| to a specific address space. This is described in |
| `common address space deduction rules <https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#subsec:commonAddressSpace>`_ |
| section. |
| |
| This is also in line with the behaviour of CUDA (`small example |
| <https://godbolt.org/z/veqTfo9PK>`_). |
| |
| ``multi_ptr`` class implementation example: |
| |
| .. code-block:: C++ |
| |
| // check that SYCL mode is ON and we can use non-standard decorations |
| #if defined(__SYCL_DEVICE_ONLY__) |
| // GPU/accelerator implementation |
| template <typename T, address_space AS> class multi_ptr { |
| // DecoratedType applies corresponding address space attribute to the type T |
| // DecoratedType<T, global_space>::type == "__attribute__((opencl_global)) T" |
| // See sycl/include/CL/sycl/access/access.hpp for more details |
| using pointer_t = typename DecoratedType<T, AS>::type *; |
| |
| pointer_t m_Pointer; |
| public: |
| pointer_t get() { return m_Pointer; } |
| T& operator* () { return *reinterpret_cast<T*>(m_Pointer); } |
| } |
| #else |
| // CPU/host implementation |
| template <typename T, address_space AS> class multi_ptr { |
| T *m_Pointer; // regular undecorated pointer |
| public: |
| T *get() { return m_Pointer; } |
| T& operator* () { return *m_Pointer; } |
| } |
| #endif |
| |
| Depending on the compiler mode, ``multi_ptr`` will either decorate its internal |
| data with the address space attribute or not. |
| |
| To utilize clang's existing functionality, we reuse the following OpenCL address |
| space attributes for pointers: |
| |
| .. list-table:: |
| :header-rows: 1 |
| |
| * - Address space attribute |
| - SYCL address_space enumeration |
| * - ``__attribute__((opencl_global))`` |
| - global_space, constant_space |
| * - ``__attribute__((opencl_global_device))`` |
| - global_space |
| * - ``__attribute__((opencl_global_host))`` |
| - global_space |
| * - ``__attribute__((opencl_local))`` |
| - local_space |
| * - ``__attribute__((opencl_private))`` |
| - private_space |
| |
| |
| .. code-block:: C++ |
| |
| //TODO: add support for __attribute__((opencl_global_host)) and __attribute__((opencl_global_device)). |
| |