commit | 313c5239959b8f9e5cc182b982c914978f437ae1 | |
---|---|---|
author | Jose M Monsalve Diaz <jmonsalvediaz@anl.gov> | Tue Jul 27 22:38:27 2021 -0400 |
committer | Shilei Tian <tianshilei1992@gmail.com> | Tue Jul 27 22:38:35 2021 -0400 |
tree | 5bbe22450d38bdd6781088be4c54d7e03711c5f9 | |
parent | fe7ca1a9fca0ccea7495224e0e837de705e69699 |
[OpenMP][Tool] Introducing the `llvm-omp-device-info` tool

This patch introduces the `llvm-omp-device-info` tool, which uses the omptarget library and interface to query the device info from all the available devices as seen by OpenMP. This is inspired by PGI's `pgaccelinfo`.

Since omptarget usually requires a description structure with executable kernels, I split the initialization of the RTLs and Devices to be able to initialize all possible devices and query each of them. This revision relies on the patch that introduces the print device info.

A limitation is that the order in which the devices are initialized, and the corresponding device ID, is not necessarily the one seen by OpenMP.

The changes are as follows:

1. Separate the RTL initialization that was performed in `RegisterLib` to its own `initRTLonce` function
2. Create an `initAllRTLs` method that initializes all available RTLs at runtime
3. Create the `llvm-deviceinfo.cpp` tool that uses `omptarget` to query each device and prints its information.

Example Output:

```
Device (0):
    print_device_info not implemented

Device (1):
    print_device_info not implemented

Device (2):
    print_device_info not implemented

Device (3):
    print_device_info not implemented

Device (4):
    CUDA Driver Version: 11000
    CUDA Device Number: 0
    Device Name: Quadro P1000
    Global Memory Size: 4236312576 bytes
    Number of Multiprocessors: 5
    Concurrent Copy and Execution: Yes
    Total Constant Memory: 65536 bytes
    Max Shared Memory per Block: 49152 bytes
    Registers per Block: 65536
    Warp Size: 32 Threads
    Maximum Threads per Block: 1024
    Maximum Block Dimensions: 1024, 1024, 64
    Maximum Grid Dimensions: 2147483647 x 65535 x 65535
    Maximum Memory Pitch: 2147483647 bytes
    Texture Alignment: 512 bytes
    Clock Rate: 1480500 kHz
    Execution Timeout: Yes
    Integrated Device: No
    Can Map Host Memory: Yes
    Compute Mode: DEFAULT
    Concurrent Kernels: Yes
    ECC Enabled: No
    Memory Clock Rate: 2505000 kHz
    Memory Bus Width: 128 bits
    L2 Cache Size: 1048576 bytes
    Max Threads Per SMP: 2048
    Async Engines: Yes (2)
    Unified Addressing: Yes
    Managed Memory: Yes
    Concurrent Managed Memory: Yes
    Preemption Supported: Yes
    Cooperative Launch: Yes
    Multi-Device Boars: No
    Compute Capabilities: 61
```

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D106752
This directory and its sub-directories contain source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The README briefly describes how to get started with building LLVM. For more information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Taken from https://llvm.org/docs/GettingStarted.html.
Welcome to the LLVM project!
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer. It also contains basic regression tests.
C-like languages use the Clang front end. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
The LLVM Getting Started documentation may be out of date. The Clang Getting Started page might have more accurate information.
This is an example workflow and configuration to get and build the LLVM source:
Check out LLVM (including related sub-projects like Clang):
git clone https://github.com/llvm/llvm-project.git
Or, on Windows, git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git
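As a minimal sketch of the checkout step, the shell function below prints the clone command that matches a given platform name. The helper name `clone_cmd` and the `uname -s` patterns used to detect Windows shells are assumptions made for illustration; they are not part of LLVM.

```shell
#!/bin/sh
# Hypothetical helper: print the clone command for a platform name
# as reported by `uname -s`. The pattern list is an assumption that
# covers common Windows shells (Git Bash, MSYS2, Cygwin).
clone_cmd() {
  url="https://github.com/llvm/llvm-project.git"
  case "${1:-}" in
    MINGW*|MSYS*|CYGWIN*)
      # On Windows, disable CRLF conversion as described above.
      echo "git clone --config core.autocrlf=false $url" ;;
    *)
      echo "git clone $url" ;;
  esac
}

# Print the command for the current machine; run it by hand once it looks right.
clone_cmd "$(uname -s)"
```

The function only prints the command, so it is safe to run before committing to a full checkout.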
Configure and build LLVM and Clang:
cd llvm-project
cmake -S llvm -B build -G <generator> [options]
Some common build system generators are:
Ninja --- for generating Ninja build files. Most LLVM developers use Ninja.
Unix Makefiles --- for generating make-compatible parallel makefiles.
Visual Studio --- for generating Visual Studio projects and solutions.
Xcode --- for generating Xcode projects.
Some common options:
-DLLVM_ENABLE_PROJECTS='...' --- semicolon-separated list of the LLVM sub-projects you'd like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or cross-project-tests.
For example, to build LLVM, Clang, libcxx, and libcxxabi, use -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi".
-DCMAKE_INSTALL_PREFIX=directory --- Specify for directory the full path name of where you want the LLVM tools and libraries to be installed (default /usr/local).
-DCMAKE_BUILD_TYPE=type --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.
-DLLVM_ENABLE_ASSERTIONS=On --- Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).
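The options above can be combined into a single configure command. The sketch below assembles one and prints it for review. The variable names and every value chosen here (Ninja, a Release build, the `clang;lld` project list, and an install prefix under `$HOME`) are illustrative assumptions, not recommendations.

```shell
#!/bin/sh
# Illustrative values; every choice here is an assumption.
GENERATOR="Ninja"              # a generator name containing spaces would need extra quoting
PROJECTS="clang;lld"
BUILD_TYPE="Release"
PREFIX="$HOME/llvm-install"    # hypothetical install location

# Assemble the configure command from the options described above.
# The single quotes around the project list keep the shell from
# treating ';' as a command separator when the command is pasted.
CONFIGURE_CMD="cmake -S llvm -B build -G $GENERATOR \
-DLLVM_ENABLE_PROJECTS='$PROJECTS' \
-DCMAKE_BUILD_TYPE=$BUILD_TYPE \
-DLLVM_ENABLE_ASSERTIONS=On \
-DCMAKE_INSTALL_PREFIX=$PREFIX"

# Print the command for review; run it from the llvm-project directory.
echo "$CONFIGURE_CMD"
```

Enabling assertions explicitly matters here because, as noted above, they are off by default for non-Debug build types.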
cmake --build build [-- [options] <target>] or invoke the build system you specified above directly.
The default target (i.e. ninja or make) will build all of LLVM.
The check-all target (i.e. ninja check-all) will run the regression tests to ensure everything is in working order.
CMake will generate targets for each tool and library, and most LLVM sub-projects generate their own check-<project> target.
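To illustrate how the check targets are named, the sketch below prints the build command for either check-all or a single sub-project's check-<project> target. The helper name `check_cmd` is hypothetical, and whether a given sub-project defines such a target depends on what was enabled at configure time.

```shell
#!/bin/sh
# Hypothetical helper: print the build command for a sub-project's
# check-<project> target, or for check-all when no name is given.
check_cmd() {
  if [ -n "${1:-}" ]; then
    echo "cmake --build build --target check-$1"
  else
    echo "cmake --build build --target check-all"
  fi
}

check_cmd          # all regression tests
check_cmd clang    # only Clang's tests (assumes Clang was enabled at configure time)
```

Using `cmake --build` with `--target` keeps the command independent of the generator chosen earlier.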
Running a serial build will be slow. To improve speed, try running a parallel build. That's done by default in Ninja; for make, use the option -j NNN, where NNN is the number of parallel jobs, e.g. the number of CPUs you have.
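A minimal sketch of choosing NNN automatically for a make-based build follows. It assumes `getconf _NPROCESSORS_ONLN` is available (it is on typical Linux and macOS systems) and falls back to one job otherwise; the variable names are illustrative.

```shell
#!/bin/sh
# Detect the number of online CPUs; fall back to 1 if detection fails.
# Availability of `getconf _NPROCESSORS_ONLN` is an assumption.
JOBS="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Options after `--` are passed through to the underlying build tool (here, make).
BUILD_CMD="cmake --build build -- -j $JOBS"

# Print the command for review before running it.
echo "$BUILD_CMD"
```

On machines with limited memory, a value lower than the CPU count may be the safer choice, since each parallel job adds to peak memory use.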
For more information, see CMake.
Consult the Getting Started with LLVM page for detailed information on configuring and compiling LLVM. You can visit Directory Layout to learn about the layout of the source code tree.