| ========================== |
| Vector Predication Roadmap |
| ========================== |
| |
| .. contents:: Table of Contents |
| :depth: 3 |
| :local: |
| |
| Motivation |
| ========== |
| |
| This proposal defines a roadmap towards native vector predication in LLVM, |
| specifically for vector instructions with a mask and/or an explicit vector |
| length. LLVM currently has no target-independent means to model predicated |
| vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V |
| extension and NEC SX-Aurora. Only some predicated vector operations, such as |
| masked loads and stores, are available through intrinsics [MaskedIR]_. |
| |
| The Vector Predication (VP) extension is a concrete RFC and prototype |
| implementation to achieve native vector predication in LLVM. The VP prototype |
| and all related discussions can be found in the VP patch on Phabricator |
| [VPRFC]_. |
| |
| Roadmap |
| ======= |
| |
| 1. IR-level VP intrinsics |
| ------------------------- |
| |
| - There is a consensus on the semantics/instruction set of VP. |
| - VP intrinsics and attributes are available on IR level. |
| - TargetTransformInfo (TTI) has capability flags for VP (names tentative, |
| e.g. ``supportsVP()``, ``haveActiveVectorLength()``). |
| |
| Result: VP is usable by IR-level vectorizers (LV, VPlan, RegionVectorizer), |
| with potential integration in Clang through builtins. |
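
To make the intended lane semantics concrete, here is a minimal scalar model of a predicated add with both a mask and an explicit vector length. It is a sketch under stated assumptions, not LLVM code: ``vpAddModel`` is a hypothetical name, inactive lanes are modelled as "no defined value" via ``std::optional``, and the vector length is assumed not to exceed the vector width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical scalar model of a predicated vector add; not an LLVM API.
// A lane is active only if its mask bit is set AND its index is below evl.
// Inactive lanes are left as std::nullopt, standing in for "no defined value".
std::vector<std::optional<std::uint32_t>>
vpAddModel(const std::vector<std::uint32_t> &a,
           const std::vector<std::uint32_t> &b,
           const std::vector<bool> &mask, std::size_t evl) {
  // Assumption of this model: all operands have the same width and evl
  // does not exceed that width.
  assert(a.size() == b.size() && a.size() == mask.size());
  assert(evl <= a.size());

  std::vector<std::optional<std::uint32_t>> result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (mask[i] && i < evl)
      result[i] = a[i] + b[i]; // unsigned addition wraps on overflow
  }
  return result;
}
```

With a four-lane input, mask ``{1, 0, 1, 1}`` and a vector length of 3, only lanes 0 and 2 receive a value: lane 1 is masked off and lane 3 lies beyond the vector length.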
| |
| 2. CodeGen support |
| ------------------ |
| |
| - VP intrinsics translate to first-class SDNodes |
| (e.g. ``llvm.vp.fdiv.*`` -> ``vp_fdiv``). |
| - VP legalization: fold the explicit vector length into the mask (AVX512), |
| and lower VP SDNodes to pre-existing ones (SSE, NEON). |
| |
| Result: Backend development based on VP SDNodes. |
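
The legalization of an explicit vector length to a mask can be sketched as follows. This is an illustrative scalar model only: ``foldEvlIntoMask`` is a hypothetical helper, not an LLVM function, and it assumes the vector length does not exceed the number of lanes. The idea is that a mask-only target sees the same set of active lanes once every lane at or beyond the vector length is cleared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an LLVM API: folds an explicit vector length (evl)
// into the mask. Lanes below evl keep their original mask bit; lanes at or
// beyond evl are disabled.
std::vector<bool> foldEvlIntoMask(const std::vector<bool> &mask,
                                  std::size_t evl) {
  // Assumption of this sketch: evl never exceeds the number of lanes.
  assert(evl <= mask.size());

  std::vector<bool> folded(mask.size(), false);
  for (std::size_t i = 0; i < evl; ++i)
    folded[i] = mask[i];
  return folded;
}
```

For example, folding a vector length of 3 into the mask ``{1, 1, 0, 1}`` yields ``{1, 1, 0, 0}``, while a vector length equal to the lane count leaves the mask unchanged.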
| |
| 3. Lift InstSimplify/InstCombine/DAGCombiner to VP |
| -------------------------------------------------- |
| |
| - Introduce helper classes such as ``PredicatedInstruction`` and |
| ``PredicatedBinaryOperator`` that match both standard vector IR and VP |
| intrinsics. |
| - Add a matcher context to PatternMatch and context-aware IR Builder APIs. |
| - Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular |
| vector instructions. |
| - Incrementally lift InstCombine/InstSimplify to operate on VP as well as |
| regular IR instructions. |
| |
| Result: Optimization of VP intrinsics on par with standard vector instructions. |
| |
| 4. Deprecate llvm.masked.* / llvm.experimental.reduce.* |
| ------------------------------------------------------- |
| |
| - Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating |
| them to VP. |
| - Remove (DCE) the transitional APIs. |
| |
| Result: VP has superseded earlier vector intrinsics. |
| |
| 5. Predicated IR Instructions |
| ----------------------------- |
| |
| - Vector instructions have an optional mask and vector length parameter. These |
| lower to VP SDNodes (from Stage 2). |
| - Phase out VP intrinsics, only keeping those that are not equivalent to |
| vectorized scalar instructions (reductions, shuffles, ...). |
| - InstCombine/InstSimplify expect predication in regular Instructions (Stage 3 |
| has laid the groundwork). |
| |
| Result: Native vector predication in IR. |
| |
| References |
| ========== |
| |
| .. [MaskedIR] ``llvm.masked.*`` intrinsics, |
| https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics |
| |
| .. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM, |
| https://reviews.llvm.org/D57504 |