Lines Matching full:we
142 // We mostly have one conditional branch, and in extremely rare cases have
234 // We have to insert the new block immediately after the current one as we
235 // don't know what layout-successor relationships the successor has and we
246 // we might have *broken* fallthrough and so need to inject a new
256 // Update the unconditional branch now that we've added one.
274 // If this is the only edge to the successor, we can just replace it in the
275 // CFG. Otherwise we need to add a new entry in the CFG for the new
323 /// FIXME: It's really frustrating that we have to do this, but SSA-form in MIR
324 /// isn't what you might expect. We may have multiple entries in PHI nodes for
325 /// a single predecessor. This makes CFG-updating extremely complex, so here we
336 // First we scan the operands of the PHI looking for duplicate entries
337 // for a particular predecessor. We retain the operand index of each duplicate
349 // FIXME: It is really frustrating that we have to use a quadratic
353 // Note that we have to process these backwards so that we don't
367 /// Helper to scan a function for loads vulnerable to misspeculation that we
370 /// We use this to avoid making changes to functions where there is nothing we
389 // We found a load.
403 // Only run if this pass is force-enabled or we detect the relevant function
421 // We support an alternative hardening technique based on a debug flag.
433 // Do a quick scan to see if we have any checkable loads.
436 // See if we have any conditional branching blocks that we will need to trace
440 // If we have no interesting conditions or loads, nothing to do here.
452 // If we have loads being hardened and we've asked for call and ret edges to
455 // We need to insert an LFENCE at the start of the function to suspend any
459 // FIXME: We could skip this for functions which unconditionally return
466 // If we guarded the entry with an LFENCE and have no conditionals to protect
467 // in blocks, then we're done.
469 // We may have changed the function's code at this point to insert fences.
475 // pointer so we pick up any misspeculation in our caller.
479 // as we don't need any initial state.
497 // We're going to need to trace predicate state throughout the function's
510 // We may also enter basic blocks in this function via exception handling
511 // control flow. Here, if we are hardening interprocedurally, we need to
531 // If we are going to harden calls and jumps we need to unfold their memory
535 // Then we trace predicate state through the indirect branches.
540 // Now that we have the predicate state available at the start of each block
542 // as we go.
563 /// We include this as an alternative mostly for the purpose of comparison. The
568 // First, we scan the function looking for blocks that are reached along edges
569 // that we might want to harden.
582 // Add all the non-EH-pad successors to the blocks we want to harden. We
603 // we need to trace through.
609 // We want to reliably handle any conditional branch terminators in the
610 // MBB, so we manually analyze the branch. We can handle all of the
616 // edge. For each conditional edge, we track the target and the opposite
619 // edge, we inject a separate cmov for each conditional branch with
622 // directly implement that. We don't bother trying to optimize either of
625 // instruction count. This late, we simply assume the minimal number of
634 // Once we've handled all the terminators, we're done.
638 // If we see a non-branch terminator, we can't handle anything so bail.
644 // If we see an unconditional branch, reset our state, clear any
652 // If we get an invalid condition, we have an indirect branch or some
653 // other unanalyzable "fallthrough" case. We model this as a nullptr for
654 // the destination so we can still guard any conditional successors.
660 // We still want to harden the edge to `L1`.
667 // We have a vanilla conditional branch, add it to our list.
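The comments around lines 609-667 describe injecting a cmov along each conditional edge so the predicate state is poisoned whenever that edge was reached by misspeculation. A minimal, purely illustrative sketch of that select (all names here are hypothetical, not the pass's real API):

```cpp
#include <cassert>
#include <cstdint>

// Along an edge guarded by a condition, keep the incoming predicate state
// when the condition actually holds, and replace it with all-ones poison
// when it does not (i.e. the branch was mispredicted). This is what the
// injected cmov computes, without introducing a new branch.
inline uint64_t updateStateOnEdge(uint64_t state, bool condHolds) {
  return condHolds ? state : ~static_cast<uint64_t>(0);
}
```

The branchless select mirrors why cmov is used in the pass: the update itself must not be subject to branch prediction.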
693 // Collect the inserted cmov instructions so we can rewrite their uses of the
698 // jumps where we need to update this register along each edge.
729 // First, we split the edge to insert the checking block into a safe
746 // We will wire each cmov to each other, but need to start with the
755 // Note that we intentionally use an empty debug location so that
798 // Decrement the successor count now that we've split one of the edges.
799 // We need to keep the count of edges to the successor accurate in order
805 // Since we may have split edges and changed the number of successors,
806 // normalize the probabilities. This avoids doing it each time we split an
810 // Finally, we need to insert cmovs into the "fallthrough" edge. Here, we
811 // need to intersect the other condition codes. We can do this by just
814 // If we have no fallthrough to protect (perhaps it is an indirect jump?)
819 "We should never have more than one edge to the unconditional "
853 // We use make_early_inc_range here so we can remove instructions if needed
859 // We only care about loading variants of these instructions.
878 // We cannot mitigate far jumps or calls, but we also don't expect them
899 // Use the generic unfold logic now that we know we're dealing with
901 // FIXME: We don't have test coverage for all of these!
911 // If we were able to compute an unfolded reg class, any failure here
962 // We use the SSAUpdater to insert PHI nodes for the target addresses of
963 // indirect branches. We don't actually need the full power of the SSA updater
964 // in this particular case as we always have immediately available values, but
972 // We need to know what blocks end up reached via indirect branches. We
1001 // We cannot mitigate far jumps or calls, but we also don't expect them
1025 // We have definitely found an indirect branch. Verify that there are no
1026 // preceding conditional branches as we don't yet support that.
1050 // Keep track of the cmov instructions we insert so we can return them.
1053 // If we didn't find any indirect branches with targets, nothing to do here.
1057 // We found indirect branches and targets that need to be instrumented to
1065 // We don't expect EH pads to ever be reached via an indirect branch. If
1066 // this is desired for some reason, we could simply skip them here rather
1071 // We should never end up threading EFLAGS into a block to harden
1079 // We can't handle having non-indirect edges into this block unless this is
1080 // the only successor and we can synthesize the necessary target address.
1082 // If we've already handled this by extracting the target directly,
1087 // Otherwise, we have to be the only successor. We generally expect this
1089 // split already. We don't however need to worry about EH pad successors
1105 // Now we need to compute the address of this block and install it as a
1106 // synthetic target in the predecessor. We do this at the bottom of the
1137 // Materialize the needed SSA value of the target. Note that we need the
1139 // branch back to itself. We can do this here because at this point, every
1151 // Check directly against a relocated immediate when we can.
1224 // Otherwise we've def'ed it, and it is live.
1227 // While at this instruction, also check if we use and kill EFLAGS
1233 // If we didn't find anything conclusive (neither definitely alive or
1241 /// We call this routine once the initial predicate state has been established
1246 /// currently valid predicate state. We have to do these two things together
1247 /// because the SSA updater only works across blocks. Within a block, we track
1250 /// This operates in two passes over each block. First, we analyze the loads in
1253 /// amenable to hardening. We have to process these first because the two
1254 /// strategies may interact -- later hardening may change what strategy we wish
1255 /// to use. We also will analyze data dependencies between loads and avoid
1257 /// address. We also skip hardening loads already behind an LFENCE as that is
1260 /// Second, we actively trace the predicate state through the block, applying
1261 /// the hardening steps we determined necessary in the first pass as we go.
1263 /// These two passes are applied to each basic block. We operate one block at a
1276 // value which we would have checked, we can omit any checks on them.
1282 // hardened. During this walk we propagate load dependence for address
1285 // we check to see if any registers used in the address will have been
1290 // FIXME: We should consider an aggressive mode where we continue to keep as
1294 // Note that we only need this pass if we are actually hardening loads.
1297 // We naively assume that all def'ed registers of an instruction have
1309 // LFENCE to be a speculation barrier, so if we see an LFENCE, there is
1336 // If we have at least one (non-frame-index, non-RIP) register operand,
1337 // and neither operand is load-dependent, we need to check the load.
1349 // If any register operand is dependent, this load is dependent and we
1351 // FIXME: Is this true in the case where we are hardening loads after
1358 // post-load hardening, and we aren't already going to harden one of the
1387 // hardening strategy we have elected. Note that we do this in a second
1388 // pass specifically so that we have the complete set of instructions for
1389 // which we will do post-load hardening and can defer it in certain
1393 // We cannot both require hardening the def of a load and its address.
1416 // interference, we want to try and sink any hardening as far as
1419 // Sink the instruction we'll need to harden as far as we can down
1423 // If we managed to sink this instruction, update everything so we
1424 // harden that instruction when we reach it in the instruction
1428 // we're done.
1432 // Otherwise, add this to the set of defs we harden.
1440 // Mark the resulting hardened register as such so we don't re-harden.
1447 // even if we couldn't find the specific load used, or were able to
1448 // avoid hardening it for some reason. Note that here we cannot break
1449 // out afterward as we may still need to handle any call aspect of this
1455 // After we finish hardening loads we handle interprocedural hardening if
1469 // Otherwise we have a call. We need to handle transferring the predicate
1481 // Currently, we only track data-dependent loads within a basic block.
1482 // FIXME: We should see if this is necessary or if we could be more
1500 // We directly copy the FLAGS register and rely on later lowering to clean
1520 /// stack pointer. The state is essentially a single bit, but we merge this in
1528 // to stay canonical on 64-bit. We should compute this somehow and support
1549 // We know that the stack pointer will have any preserved predicate state in
1550 // its high bit. We just want to smear this across the other bits. Turns out,
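Lines 1549-1550 note that the preserved predicate state lives in the stack pointer's high bit and just needs to be "smeared" across the other bits. A sketch of that recovery step, assuming the single-bit-in-the-high-bit encoding the comments describe (an arithmetic right shift by 63 broadcasts the bit, yielding all-zeros or all-ones):

```cpp
#include <cassert>
#include <cstdint>

// Broadcast the high bit of a 64-bit value (the bit carrying the
// predicate state) to every bit position. Signed right shift is
// arithmetic on mainstream targets, and defined as such since C++20;
// in x86 terms this is the sarq the comments allude to.
inline uint64_t smearHighBit(uint64_t spLikeValue) {
  return static_cast<uint64_t>(static_cast<int64_t>(spLikeValue) >> 63);
}
```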
1578 // harden it if we're covering fixed address loads as well.
1592 // For both RIP-relative addressed loads or absolute loads, we cannot
1596 // FIXME: When using a segment base (like TLS does) we end up with the
1597 // dynamic address being the base plus -1 because we can't mutate the
1629 // Otherwise, we can directly update this operand and remove it.
1633 // If there are none left, we're done.
1642 // If EFLAGS are live and we don't have access to instructions that avoid
1643 // clobbering EFLAGS we need to save and restore them. This in turn makes
1656 // If this is a vector register, we'll need somewhat custom logic to handle
1664 // FIXME: We could skip this at the cost of longer encodings with AVX-512
1740 // We need to avoid touching EFLAGS so shift out all but the least
1774 // See if we can sink hardening the loaded value.
1779 // We need to find a single use to which we can sink the check. We can
1784 // If we're already going to harden this use, it is data invariant, it
1788 // If we've already decided to harden a non-load, we must have sunk
1816 // We already have a single use, this would make two. Bail.
1820 // interfering EFLAGS, we can't sink the hardening to it.
1825 // If this instruction defines multiple registers bail as we won't harden
1830 // If this register isn't a virtual register we can't walk uses of sanely,
1831 // just bail. Also check that its register class is one of the ones we
1847 // Update which MI we're checking now.
1857 // We only support hardening virtual registers.
1864 // We don't support post-load hardening of vectors.
1871 // require REX prefix, we may not be able to satisfy that constraint when
1873 // FIXME: This seems like a pretty lame hack. The way this comes up is when we
1898 /// larger than the predicate state register. FIXME: We should support vector
1946 /// We can harden a non-leaking load into a register without touching the
1947 /// address by just hiding all of the loaded bits during misspeculation. We use
1948 /// an `or` instruction to do this because we set up our poison value as all
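Lines 1946-1948 explain post-load hardening of a non-leaking value: with the poison value set up as all-ones, a single `or` hides every loaded bit exactly during misspeculation. A hedged sketch of that idea (illustrative names, not the pass's code):

```cpp
#include <cassert>
#include <cstdint>

// Harden a loaded value against leaking under misspeculation: the
// predicate state is all-zeros on the architecturally correct path
// (leaving the value untouched) and all-ones when misspeculating
// (forcing the value to all-ones, so nothing attacker-visible survives).
inline uint64_t hardenLoadedValue(uint64_t loaded, uint64_t predicateState) {
  return loaded | predicateState;
}
```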
1961 // Because we want to completely replace the uses of this def'ed value with
1968 // use. Note that we insert the instructions to compute this *after* the
1983 /// Returns implicitly perform a load which we need to harden. Without hardening
1990 /// We can harden this by introducing an LFENCE that will delay any load of the
1992 /// speculated), or we can harden the address used by the implicit load: the
1995 /// If we are not using an LFENCE, hardening the stack pointer has an additional
2010 // No need to fence here as we'll fence at the return site itself. That
2011 // handles more cases than we can handle here.
2014 // Take our predicate state, shift it to the high 17 bits (so that we keep
2016 // extract it when we return (speculatively).
2025 /// First, we need to send the predicate state into the called function. We do
2028 /// For tail calls, this is all we need to do.
2030 /// For calls where we might return and resume the control flow, we need to
2034 /// We also need to verify that we intended to return to this location in the
2041 /// The way we verify that we returned to the correct location is by preserving
2044 /// was left by the RET instruction when it popped `%rsp`. Alternatively, we can
2046 /// call. We compare this intended return address against the address
2048 /// mismatch, we have detected misspeculation and can poison our predicate
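The doc comments around lines 2041-2048 describe verifying, after a call returns, that control actually came back to the intended location: the expected return address is preserved across the call and compared against the actual one, and any mismatch poisons the predicate state. A minimal sketch of that comparison, again with hypothetical names and a cmov-style branchless update:

```cpp
#include <cassert>
#include <cstdint>

// Compare the intended return address against the address actually
// returned to. On a mismatch the return was misspeculated, so poison
// the predicate state; otherwise pass the extracted state through.
inline uint64_t checkReturnAddress(uint64_t expected, uint64_t actual,
                                   uint64_t state) {
  return expected == actual ? state : ~static_cast<uint64_t>(0);
}
```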
2059 // Tail call, we don't return to this function.
2060 // FIXME: We should also handle noreturn calls.
2063 // We don't need to fence before the call because the function should fence
2064 // in its entry. However, we do need to fence after the call returns.
2073 // First, we transfer the predicate state into the called function by merging
2078 // If this call is also a return, it is a tail call and we don't need anything
2080 // instructions and no successors, this call does not return so we can also
2086 // machine instruction. We will lower extra symbols attached to call
2096 // If we have no red zones or if the function returns twice (possibly without
2097 // using the `ret` instruction) like setjmp, we need to save the expected
2101 // If we don't have red zones, we need to compute the expected return
2108 // the stack). But that isn't our primary goal, so we only use it as
2112 // rematerialization in the register allocator. We somehow need to force
2116 // FIXME: It is even less clear why MachineCSE can't just fold this when we
2137 // If we didn't pre-compute the expected return address into a register, then
2139 // stack immediately after the call. As the very first instruction, we load it
2152 // Now we extract the callee's predicate state from the stack pointer.
2155 // Test the expected return address against our actual address. If we can
2157 // we compute it.
2160 // FIXME: Could we fold this with the load? It would require careful EFLAGS
2178 // Now conditionally update the predicate state we just extracted if we ended
2203 /// will be adequately hardened already, we want to ensure that they are
2207 /// execution. We forcibly unfolded all relevant loads above and so will always
2208 /// have an opportunity to post-load harden here, we just need to scan for cases
2220 // We don't need to harden either far calls or far jumps as they are
2228 // We should never see a loading instruction at this point, as those should
2242 // Try to lookup a hardened version of this register. We retain a reference
2243 // here as we want to update the map to track any newly computed hardened
2247 // If we don't have a hardened register yet, compute one. Otherwise, just use
2250 // FIXME: It is a little suspect that we use partially hardened registers that