==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length. LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension and NEC SX-Aurora. Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extension is a concrete RFC and prototype
implementation to achieve native vector predication in LLVM. The VP prototype
and all related discussions can be found in the VP patch on Phabricator
[VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is a consensus on the semantics/instruction set of VP.
- VP intrinsics and attributes are available at the IR level.
- TTI has capability flags for VP (``supportsVP()``?,
  ``haveActiveVectorLength()``?).

Result: VP is usable by IR-level vectorizers (LV, VPlan, RegionVectorizer),
with potential integration in Clang through builtins.

2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.* -> vp_fdiv``).
- VP legalization (legalize explicit vector length to mask (AVX512), legalize VP
  SDNodes to pre-existing ones (SSE, NEON)).

Result: Backend development based on VP SDNodes.

3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce helper classes (PredicatedInstruction, PredicatedBinaryOperator,
  etc.) that match both standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating
  them to VP.
- DCE transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, keeping only those that are not equivalent to
  vectorized scalar instructions (reductions, shuffles, etc.).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] ``llvm.masked.*`` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504