Support, Getting Involved, and FAQ
Please do not hesitate to reach out to us or join
one of our :ref:`regular calls <calls>`. Some common questions are answered in
the :ref:`faq`.
.. _calls:
OpenMP in LLVM Technical Call
- Development updates on OpenMP (and OpenACC) in the LLVM Project, including Clang, optimization, and runtime work.
- Join `OpenMP in LLVM Technical Call <>`__.
- Time: Weekly call every Wednesday at 7:00 AM Pacific time.
- Meeting minutes are `here <>`__.
- Status tracking `page <>`__.
OpenMP in Flang Technical Call
- Development updates on OpenMP and OpenACC in the Flang Project.
- Join `OpenMP in Flang Technical Call <>`_
- Time: Weekly call every Thursday at 8:00 AM Pacific time.
- Meeting minutes are `here <>`__.
- Status tracking `page <>`__.
.. _faq:
.. note::
   The FAQ is a work in progress and most of the expected content is not
   yet available. While you can expect changes, we always welcome feedback and
   additions, so please get in touch.
Q: How to contribute a patch to the webpage or any other part?
All patches go through the regular LLVM review process.
.. _build_offload_capable_compiler:
Q: How to build an OpenMP offload capable compiler?
To build an *effective* OpenMP offload capable compiler, only one extra CMake
option, `LLVM_ENABLE_RUNTIMES="openmp"`, is needed when building LLVM (Generic
information about building LLVM is available `here <>`__).
Make sure all backends that are targeted by OpenMP are enabled. By default,
Clang will be built with all backends enabled.
If your build machine is not the target machine or automatic detection of the
available GPUs failed, you should also set:
- `CLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_XX` where `XX` is the architecture of your GPU, e.g., 80.
- `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=YY` where `YY` is the numeric compute capability of your GPU, e.g., 75.
.. note::
   The compiler that generates the offload code should be the same (version) as
   the compiler that builds the OpenMP device runtimes. The OpenMP host runtime
   can be built by a different compiler.
.. _advanced_builds:
Q: Does OpenMP offloading support work in pre-packaged LLVM releases?
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.
Q: Does OpenMP offloading support work in packages distributed as part of my OS?
For now, the answer is most likely *no*. Please see :ref:`build_offload_capable_compiler`.
.. _math_and_complex_in_target_regions:
Q: Does Clang support `<math.h>` and `<complex.h>` operations in OpenMP target on GPUs?
Yes, LLVM/Clang allows math functions and complex arithmetic inside of OpenMP target regions
that are compiled for GPUs.
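
As a minimal sketch of what this looks like in user code, the following calls `std::sin` inside a target region. The helper name `sine_on_device` is hypothetical and chosen only for illustration. If the file is compiled without offloading enabled, the loop simply runs on the host.

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LLVM): computes out[i] = sin(in[i]) inside
// an OpenMP target region. Without offloading, the loop runs on the host.
std::vector<double> sine_on_device(const std::vector<double> &in) {
  // The loop below uses an int trip count, so the input must fit in an int.
  assert(in.size() <= static_cast<std::size_t>(INT_MAX));

  std::vector<double> out(in.size());
  const double *src = in.data();
  double *dst = out.data();
  const int n = static_cast<int>(in.size());

  // Copy the inputs to the device and the results back to the host.
  #pragma omp target teams distribute parallel for map(to: src[0:n]) map(from: dst[0:n])
  for (int i = 0; i < n; ++i)
    dst[i] = std::sin(src[i]); // a math call inside the target region

  return out;
}
```

Assuming an NVIDIA GPU and an offload capable Clang, a command along the lines of `clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.cpp` would compile the region for the device; the exact target triple depends on your hardware.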
Clang provides a set of wrapper headers that are found first when `math.h` and
`complex.h`, for C, `cmath` and `complex`, for C++, or similar headers are
included by the application. These wrappers will eventually include the system
version of the corresponding header file after setting up a target device
specific environment. The fact that the system header is included is important
because system headers differ based on the architecture and operating system and may
contain preprocessor, variable, and function definitions that need to be
available in the target region regardless of the targeted device architecture.
However, various functions may require specialized device versions, e.g.,
`sin`, and others are only available on certain devices, e.g., `__umul64hi`. To
provide "native" support for math and complex on the respective architecture,
Clang will wrap the "native" math functions, e.g., as provided by the device
vendor, in an OpenMP begin/end declare variant. These functions will then be
picked up instead of the host versions while host-only variables and function
definitions are still available. Complex arithmetic and functions are supported
through a similar mechanism. It is worth noting that this support requires
extensions to the OpenMP begin/end declare variant context selector
that are exposed through LLVM/Clang to the user as well.
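
The begin/end declare variant mechanism can also be used directly in application code. The sketch below is an illustration only, not the code of Clang's wrapper headers: `selected_variant` and `query_from_target_region` are hypothetical names, it uses only the standard `device={kind(gpu)}` selector rather than any Clang extension, and the variant is guarded so that compilers without this feature still see a single definition.

```cpp
#include <cassert>

// Base (host) definition, used whenever no variant matches.
// `selected_variant` is a hypothetical name used only for illustration.
int selected_variant() { return 0; }

#if defined(_OPENMP) && defined(__clang__)
// Definitions between begin/end are only considered where the context
// selector matches, here: when compiling for a GPU device.
#pragma omp begin declare variant match(device={kind(gpu)})
int selected_variant() { return 1; }
#pragma omp end declare variant
#endif

// Reports which definition a target region picks up:
// 1 if the GPU variant was selected, 0 if the base version ran.
int query_from_target_region() {
  int result = -1;
  #pragma omp target map(from: result)
  result = selected_variant();
  // Sanity check that the region actually executed and wrote a result.
  assert(result == 0 || result == 1);
  return result;
}
```

Called from host code, `selected_variant()` resolves to the base definition. Inside a target region that is actually offloaded to a GPU, the variant is expected to be chosen instead, which mirrors how device-specific math functions replace their host counterparts.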
Q: What is a way to debug errors from mapping memory to a target device?
An experimental way to debug these errors is to use :ref:`remote process
offloading <remote_offloading_plugin>`.
By using the remote offloading plugin and ``openmp-offloading-server``, it is
possible to explicitly perform memory transfers between processes on the host
CPU and run sanitizers while doing so in order to catch these errors.
Q: Why does my application say "Named symbol not found" and abort when I run it?
This is most likely caused by trying to use OpenMP offloading with static
libraries. Static libraries do not contain any device code, so when the runtime
attempts to execute the target region, it will not be found and you will get
an error like this:
.. code-block:: text

   CUDA error: Loading '__omp_offloading_fd02_3231c15__Z3foov_l2' Failed
   CUDA error: named symbol not found
   Libomptarget error: Unable to generate entries table for device id 0.

Currently, the only solution is to change how the application is built and avoid
the use of static libraries.