[libclc] Remove __attribute__((always_inline)) (#158791) always_inline doesn't guarantee performance improvement. Target-specific optimizations decide whether inlining is profitable. Changes to amdgcn--amdhsa.bc: * _Z9__clc_logDv16_f and _Z15__clc_remainderDv16_fS_ are not inlined. * sincos vector function code size has doubled due to apparent duplication. Also replace typo _CLC_DECL with _CLC_DEF for function definition. GitOrigin-RevId: 7f3661128b1e5dda69586afcff99a8f662e4126f
libclc is an open source implementation of the library requirements of the OpenCL C programming language, as specified by the OpenCL 1.1 Specification. The following sections of the specification impose library requirements:
libclc is intended to be used with the Clang compiler's OpenCL frontend.
libclc is designed to be portable and extensible. To this end, it provides generic implementations of most library requirements, allowing the target to override the generic implementation at the granularity of individual functions.
libclc currently supports PTX, AMDGPU, SPIRV and CLSPV targets, but support for more targets is welcome.
(in the following instructions you can use make or ninja)
For an in-tree build, Clang must also be built at the same time:
$ cmake <path-to>/llvm-project/llvm/CMakeLists.txt -DLLVM_ENABLE_PROJECTS="clang" \
-DLLVM_ENABLE_RUNTIMES="libclc" -DCMAKE_BUILD_TYPE=Release -G Ninja
$ ninja
Then install:
$ ninja install
Note you can use the DESTDIR Makefile variable to do staged installs.
$ DESTDIR=/path/for/staged/install ninja install
To build out of tree, or in other words, against an existing LLVM build or install:
$ cmake <path-to>/llvm-project/libclc/CMakeLists.txt -DCMAKE_BUILD_TYPE=Release \ -G Ninja -DLLVM_DIR=$(<path-to>/llvm-config --cmakedir) $ ninja
Then install as before.
In both cases this will include all supported targets. You can choose which targets are enabled by passing -DLIBCLC_TARGETS_TO_BUILD to CMake. The default is all.
In both cases, the LLVM used must include the targets you want libclc support for (AMDGPU and NVPTX are enabled in LLVM by default). Apart from SPIRV where you do not need an LLVM target but you do need the llvm-spirv tool available. Either build this in-tree, or place it in the directory pointed to by LLVM_TOOLS_BINARY_DIR.