==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length. LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension and NEC SX-Aurora. Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extension is a concrete RFC and prototype
implementation to achieve native vector predication in LLVM. The VP prototype
and all related discussions can be found in the VP patch on Phabricator
[VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is a consensus on the semantics/instruction set of VP.
- VP intrinsics and attributes are available at the IR level.
- TTI has capability flags for VP (``supportsVP()``?,
  ``haveActiveVectorLength()``?).

Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer),
potential integration in Clang with builtins.

2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.* -> vp_fdiv``).
- VP legalization (legalize explicit vector length to mask (AVX512), legalize VP
  SDNodes to pre-existing ones (SSE, NEON)).

Result: Backend development based on VP SDNodes.

3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce PredicatedInstruction, PredicatedBinaryOperator, ... helper classes
  that match standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize llvm.masked.* / llvm.experimental.reduce.* by translating to VP.
- DCE transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, only keeping those that are not equivalent to
  vectorized scalar instructions (reduce, shuffles, ...).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] ``llvm.masked.*`` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504