6cbac32f | 12-Feb-2024 | guenther <guenther@openbsd.org>
Retpolines are an anti-pattern for IBT, so we need to shift protecting userspace from cross-process BTI to the kernel. Have each CPU track the last pmap run on in userspace and the last vmm VCPU in guest-mode and use the IBPB msr to flush predictors right before running in userspace on a different pmap or entering guest-mode on a different VCPU. Codepatch-nop the userspace bits and conditionalize the vmm bits to keep working if IBPB isn't supported.
ok deraadt@ kettenis@
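
A minimal sketch of the per-CPU tracking described above, under assumed names (sketch_*, ci_user_pmap); the real cpu_info field, the exact patch point on the return-to-userspace path, and the vmm side are not copied from the tree. IA32_PRED_CMD (MSR 0x49) with bit 0 set is what issues the IBPB.

    #include <stdint.h>

    #define MSR_PRED_CMD    0x49            /* IA32_PRED_CMD */
    #define PRED_CMD_IBPB   (1ULL << 0)     /* indirect branch prediction barrier */

    struct pmap;                            /* opaque for this sketch */

    struct sketch_cpu {
        struct pmap *ci_user_pmap;          /* last pmap this CPU ran userspace on */
    };

    static inline void
    sketch_wrmsr(uint32_t msr, uint64_t val)
    {
        __asm volatile("wrmsr" : : "c" (msr),
            "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32)));
    }

    /* just before returning to userspace on pmap 'pm' */
    static void
    sketch_ibpb_if_needed(struct sketch_cpu *ci, struct pmap *pm)
    {
        if (ci->ci_user_pmap != pm) {
            sketch_wrmsr(MSR_PRED_CMD, PRED_CMD_IBPB);  /* flush predictors */
            ci->ci_user_pmap = pm;
        }
    }

The guest-mode path does the analogous comparison against the last VCPU, and the userspace check is codepatched to NOPs when IBPB isn't available.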

1538f8cb | 31-Jul-2023 | guenther <guenther@openbsd.org>
On CPUs with eIBRS ("enhanced Indirect Branch Restricted Speculation") or IBT enabled in the kernel, the hardware should block the attacks which retpolines were created to prevent. In those cases, retpolines should be a net negative for security as they are an indirect branch gadget. They're also slower.
* use -mretpoline-external-thunk to give us control of the code used for indirect branches
* default to using a retpoline as before, but mark it and the other ASM kernel retpolines for code patching
* if the CPU has eIBRS, then enable it
* if the CPU has eIBRS *or* IBT, then codepatch the three different retpolines to just indirect jumps
make clean && make config required after this
ok kettenis@
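
For illustration, a sketch of the two sequences involved, not the kernel's actual thunk bodies: the classic retpoline construction for an indirect jump through %rax, and the plain jump it gets patched to once eIBRS or IBT makes the retpoline unnecessary.

    /* AT&T syntax; labels are illustrative */
    __asm(
    "       .text                           \n"
    "sketch_retpoline_rax:                  \n"
    "       call    1f                      \n" /* push a return address to overwrite */
    "0:     pause                           \n"
    "       lfence                          \n"
    "       jmp     0b                      \n" /* capture any speculation of the ret */
    "1:     mov     %rax, (%rsp)            \n" /* replace return address with target */
    "       ret                             \n" /* 'return' to *%rax */
    "                                       \n"
    "sketch_plain_jmp_rax:                  \n"
    "       jmp     *%rax                   \n"
    );

With eIBRS or IBT doing the hardware-side work, the retpoline body is pure overhead plus one more indirect-branch gadget, which is why the patched kernels keep only the plain jump.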

183b7dd1 | 31-Jul-2023 | guenther <guenther@openbsd.org>
The replacement code passed to codepatch_replace() can usefully be const.
suggested by bluhm@
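
In header terms the change is just a const qualifier on the replacement-code argument, so replacement snippets can live in .rodata; this prototype sketches the shape of the interface rather than quoting codepatch.h.

    #include <stddef.h>
    #include <stdint.h>

    /* the replacement bytes are only read, never written */
    void    codepatch_replace(uint16_t tag, const void *code, size_t len);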

40ce500b | 28-Jul-2023 | guenther <guenther@openbsd.org>
Add CODEPATCH_CODE() macro to simplify defining a symbol for a chunk of code to use in codepatching. Use that for all the existing codepatching snippets.
Similarly, add CODEPATCH_CODE_LEN() which is CODEPATCH_CODE() but also provides a short variable holding the length of the codepatch snippet. Use that for some snippets that will be used for retpoline replacement.
ok kettenis@ deraadt@
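
A rough sketch of what such assembler macros can look like, for .S files run through cpp; the section name, alignment, and the exact definitions in the real codepatch.h are assumptions here, not quotes.

    #define CODEPATCH_CODE(sym, instructions...)                \
            .section .rodata                                    ;\
            .globl  sym                                         ;\
    sym:    instructions                                        ;\
            .size   sym, . - sym

    /* same, plus a 16-bit "sym_len" holding the snippet's length */
    #define CODEPATCH_CODE_LEN(sym, instructions...)            \
            CODEPATCH_CODE(sym, instructions)                   ;\
            .globl  sym##_len                                   ;\
    sym##_len:                                                  \
            .short  . - sym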

55fdb5fa | 10-Jul-2023 | guenther <guenther@openbsd.org>
Enable Indirect Branch Tracking for amd64 userland, using XSAVES/XRSTORS to save/restore the state and enabling it at exec-time (and for signal handling) if the PS_NOBTCFI flag isn't set.
Note: this changes the format of the sc_fpstate data in the signal context to possibly be in compressed format: starting now we just guarantee that the state is in a format understood by the XRSTOR instruction of the system it is being executed on.
Passing sigreturn a corrupt sc_fpstate now results in the process exiting with no attempt to fix it up or send a T_PROTFLT trap. That may change.
prodding by deraadt@ issues with my original signal handling design identified by kettenis@
lots of base and ports preparation for this by deraadt@ and the libressl and ports teams
ok deraadt@ kettenis@
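
Since sc_fpstate may now be in the compacted XSAVES layout, a consumer can tell the two layouts apart from the XSAVE header: bit 63 of XCOMP_BV, at the offset documented in the Intel SDM. The sketch below only shows that check; how the kernel or libc actually handles it is not implied.

    #include <stdint.h>
    #include <string.h>

    #define XSAVE_LEGACY_SIZE   512             /* legacy x87/SSE region precedes the header */
    #define XCOMP_BV_OFFSET     (XSAVE_LEGACY_SIZE + 8)
    #define XCOMP_BV_COMPACT    (1ULL << 63)    /* set => compacted (XSAVES) format */

    static int
    sketch_fpstate_is_compacted(const void *sc_fpstate)
    {
        uint64_t xcomp_bv;

        memcpy(&xcomp_bv, (const uint8_t *)sc_fpstate + XCOMP_BV_OFFSET,
            sizeof(xcomp_bv));
        return (xcomp_bv & XCOMP_BV_COMPACT) != 0;
    }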

67ca69ec | 11-Mar-2020 | guenther <guenther@openbsd.org>
Take a swing at blocking Load-Value-Injection attacks against the kernel by using lfence in place of stac/clac on pre-SMAP CPUs. To quote from https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection "If the OS makes use of Supervisor Mode Access Prevention (SMAP) on processors with SMAP enabled, then LVI on kernel load from user pages will be mitigated. This is because the CLAC and STAC instructions have LFENCE semantics on processors affected by LVI, and this serves as a speculation fence around kernel loads from user pages."
ok deraadt@
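
The gist, as a sketch with illustrative wrappers (not the kernel's actual patch sites): on SMAP CPUs the user-access window is opened and closed with stac/clac, which on LVI-affected parts also act as LFENCEs; on pre-SMAP CPUs there is no AC bit to manage, so a bare lfence keeps the same speculation barrier around kernel loads from user pages.

    /* SMAP present: open/close the user-access window */
    static inline void
    sketch_uaccess_begin_smap(void)
    {
        __asm volatile("stac" ::: "memory");
    }

    static inline void
    sketch_uaccess_end_smap(void)
    {
        __asm volatile("clac" ::: "memory");
    }

    /* pre-SMAP: nothing to open, but keep the fence */
    static inline void
    sketch_uaccess_begin_presmap(void)
    {
        __asm volatile("lfence" ::: "memory");
    }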

457e2542 | 28-Feb-2020 | deraadt <deraadt@openbsd.org>
oops some snapshot tests fell in

7042a378 | 28-Feb-2020 | deraadt <deraadt@openbsd.org>
sync

5c3fa5a3 | 07-Aug-2019 | guenther <guenther@openbsd.org>
Mitigate CVE-2019-1125: block speculation past conditional jump to mis-skip or mis-take swapgs in interrupt path and in trap/fault/exception path. The latter is improved to have no conditionals around this when Meltdown mitigation is in effect. Codepatch out the fences based on the description of CPU bugs in the (well written) Linux commit message.
feedback from kettenis@ ok deraadt@
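
A sketch of the entry-path shape being described, assuming an interrupt frame with no error code (saved %cs at 8(%rsp)); the lfence at the join point keeps a mis-predicted skip or take of swapgs from letting %gs-relative loads run speculatively with the wrong GS base. The real stubs codepatch these fences away on unaffected CPUs.

    __asm(
    "       .text                           \n"
    "sketch_intr_entry:                     \n"
    "       testb   $3, 8(%rsp)             \n" /* low bits of saved %cs: 0 = from kernel */
    "       jz      1f                      \n"
    "       swapgs                          \n" /* from userspace: load kernel GS base */
    "1:     lfence                          \n" /* both paths fenced before any %gs use */
    "       /* real handler continues here */\n"
    );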

b8d87cb1 | 07-Aug-2019 | guenther <guenther@openbsd.org>
Add codepatch_jmp(), like codepatch_call() but inserting a jmp instead of a call.
tweaked based on feedback from kettenis@ ok deraadt@
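
Both helpers come down to writing a 5-byte rel32 transfer over the patch site, with only the opcode differing. A sketch, assuming the text has already been made writable (the real codepatch code takes care of that):

    #include <stdint.h>
    #include <string.h>

    #define OPC_CALL_REL32  0xe8
    #define OPC_JMP_REL32   0xe9

    static void
    sketch_patch_rel32(uint8_t *site, uint8_t opcode, const void *target)
    {
        uint8_t insn[5];
        int32_t rel = (int32_t)((const uint8_t *)target - (site + 5));

        insn[0] = opcode;                       /* call or jmp */
        memcpy(&insn[1], &rel, sizeof(rel));    /* rel32 is relative to the next insn */
        memcpy(site, insn, sizeof(insn));       /* assumes site is writable here */
    }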

a0dcb178 | 17-May-2019 | guenther <guenther@openbsd.org>
Mitigate Intel's Microarchitectural Data Sampling vulnerability. If the CPU has the new VERW behavior then that is used; otherwise the proper sequence from Intel's "Deep Dive" doc is used in the return-to-userspace and enter-VMM-guest paths. The enter-C3-idle path is not mitigated because it's only a problem when SMT/HT is enabled: mitigating everything when that's enabled would be a _huge_ set of changes that we see no point in doing.
Update vmm(4) to pass through the MSR bits so that guests can apply the optimal mitigation.
VMM help and specific feedback from mlarkin@ vendor-portability help from jsg@ and kettenis@ ok kettenis@ mlarkin@ deraadt@ jsg@
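
The "new VERW behavior" means a VERW whose memory operand is any valid writable data-segment selector also clears the affected CPU buffers, so the return-to-userspace and enter-guest paths can end with something like the sketch below. The selector value is only a stand-in; in a real kernel it comes from the GDT layout.

    #include <stdint.h>

    static void
    sketch_md_clear(void)
    {
        static const uint16_t sel = 0x10;   /* illustrative data-segment selector */

        __asm volatile("verw %0" : : "m" (sel) : "cc");
    }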

f95e373f | 04-Oct-2018 | guenther <guenther@openbsd.org>
Use PCIDs where they and the INVPCID instruction are available. This uses one PCID for kernel threads, one for the U+K tables of normal processes, one for the matching U-K tables (when Meltdown is in effect), and one for temporary mappings when poking other processes. Some further tweaks are envisioned but this is good enough to provide more separation and has (finally) been stable under ports testing.
lots of ports testing and valid complaints from naddy@ and sthen@ feedback from mlarkin@ and sf@
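
A sketch of the assignment described above; the PCID numbers and the helper are illustrative. Bit 63 of the loaded %cr3 value (the "no flush" bit) is what lets a reload keep the cached translations tagged with that PCID.

    #include <stdint.h>

    #define PCID_KERN       0   /* kernel threads (kernel-only pmap) */
    #define PCID_PROC       1   /* U+K tables of a normal process */
    #define PCID_PROC_UK    2   /* matching U-K tables when Meltdown mitigation is on */
    #define PCID_TEMP       3   /* temporary mappings while poking another process */

    #define CR3_REUSE_PCID  (1ULL << 63)    /* don't flush this PCID's TLB entries */

    static inline uint64_t
    sketch_make_cr3(uint64_t pml4_pa, uint64_t pcid, int reuse)
    {
        return pml4_pa | pcid | (reuse ? CR3_REUSE_PCID : 0);
    }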

5f6ecb19 | 13-Jul-2018 | sf <sf@openbsd.org>
Disable codepatching infrastructure after boot
This way, it is not available for use in ROP attacks. This diff puts the codepatching code into a separate section and unmaps that section after boot. In the future, the memory could potentially be reused but that would require larger changes.
ok pguenther@

1fc8fad1 | 12-Jul-2018 | guenther <guenther@openbsd.org>
Reorganize the Meltdown entry and exit trampolines for syscall and traps so that the "mov %rax,%cr3" is followed by an infinite loop which is avoided because the mapping of the code being executed is changed. This means the sysretq/iretq isn't even present in that flow of instructions in the kernel mapping, so userspace code can't be speculatively reached on the kernel mapping, and it totally eliminates the conditional jump over the %cr3 change that supported CPUs without the Meltdown vulnerability. The return paths were probably vulnerable to Spectre v1 (and v1.1/1.2) style attacks, speculatively executing user code post-system-call with the kernel mappings, thus creating cache/TLB/etc side-effects.
Would like to apply this technique to the interrupt stubs too, but I'm hitting a bug in clang's assembler which misaligns the code and symbols.
While here, when on a CPU not vulnerable to Meltdown, codepatch out the unnecessary bits in cpu_switchto().
Inspiration from sf@, refined over dinner with theo ok mlarkin@ deraadt@
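
Roughly, the trick looks like the sketch below: the bytes following the %cr3 write differ between the kernel-text mapping (an infinite loop) and the user-visible trampoline alias of the same addresses (the iretq), so the kernel mapping contains no reachable iretq and no conditional branch around the %cr3 switch is needed. Labels are illustrative and the real trampolines do more bookkeeping.

    __asm(
    "       .text                                   \n"
    "       /* as laid out in the kernel mapping */ \n"
    "sketch_exit_kern_view:                         \n"
    "       mov     %rax, %cr3                      \n" /* switch to the user page tables */
    "0:     pause                                   \n"
    "       jmp     0b                              \n" /* never reached: the new mapping */
    "                                               \n" /* has different bytes here */
    "       /* the user-visible alias of the same addresses holds instead: */\n"
    "sketch_exit_user_view:                         \n"
    "       mov     %rax, %cr3                      \n"
    "       iretq                                   \n"
    );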

c9de630f | 05-Jun-2018 | guenther <guenther@openbsd.org>
Switch from lazy FPU switching to semi-eager FPU switching: track whether curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all check the flag and, if set, save the old thread's state to the PCB, clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore the thread's state
This simpler tracking also fixes the restoring of FPU state after nested signal handlers.
With this, %cr0's TS flag is never set, the FPU #DNA trap can no longer happen, and IPIs are no longer necessary for flushing or syncing FPU state; on the other hand, restoring xstate while returning to userspace means we have to handle xrstor faulting if we could be loading an altered state. If that happens, reset the state, fake a #GP fault (SIGBUS), and recheck for ASTs.
While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by using codepatching to switch to xsave/xrstor when present in the CPU. In addition, code patch in use of xsaveopt in most places when the CPU supports that. Use the 64bit-wide variants of the instructions in all cases so that x87 instruction fault IPs are reported correctly.
This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block speculation, permitting leaking of information about FPU state (AES keys?) across protection boundaries.
tested by many in snaps; prodding from deraadt@
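
The 64-bit-wide forms mentioned above take the requested-feature bitmap in %edx:%eax. A sketch of the pair; the real kernel codepatches between fxsave64, xsave64 and xsaveopt64 at the save sites rather than calling fixed wrappers like these.

    #include <stdint.h>

    static inline void
    sketch_xsave64(void *area, uint64_t mask)
    {
        __asm volatile("xsave64 (%0)"
            : : "r" (area), "a" ((uint32_t)mask), "d" ((uint32_t)(mask >> 32))
            : "memory");
    }

    static inline void
    sketch_xrstor64(const void *area, uint64_t mask)
    {
        __asm volatile("xrstor64 (%0)"
            : : "r" (area), "a" ((uint32_t)mask), "d" ((uint32_t)(mask >> 32))
            : "memory");
    }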

019cf0fb | 25-Aug-2017 | guenther <guenther@openbsd.org>
If SMAP is present, clear PSL_AC on kernel entry and interrupt so that only the code in copy{in,out}* that needs it runs with it set. Panic if it's set on entry to trap() or syscall(). Prompted by Maxime Villard's NetBSD work.
ok kettenis@ mlarkin@ deraadt@
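
A sketch of the sanity check being described; PSL_AC is the EFLAGS AC bit (0x00040000), the frame layout is illustrative, and panic() stands in for the kernel's own.

    #define PSL_AC  0x00040000      /* EFLAGS alignment-check bit, overrides SMAP when set */

    struct sketch_trapframe {
        unsigned long   tf_rflags;
    };

    extern void panic(const char *, ...);

    /* the entry stubs already ran clac, so AC must never still be set here */
    static void
    sketch_check_psl_ac(const struct sketch_trapframe *tf)
    {
        if (tf->tf_rflags & PSL_AC)
            panic("AC flag set on kernel entry");
    }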

2daa5239 | 01-Jul-2017 | sf <sf@openbsd.org>
Use absolute pointers in codepatch entries
Instead of offsets to KERNBASE, store absolute pointers in the codepatch entries. This makes the resulting kernel a few KB larger on amd64, but KERNBASE will go away when ASLR is introduced.
Requested by deraadt@

984d4744 | 19-Apr-2015 | sf <sf@openbsd.org>
Add support for x2apic mode
This is currently only enabled on hypervisors because on real hardware, it requires interrupt remapping which we don't support yet. But on virtualization it reduces the number of vmexits required per IPI from 4 to 1, causing a significant speed-up for MP guests.
ok kettenis@
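
The vmexit saving comes from the ICR being a single MSR in x2apic mode: one wrmsr sends an IPI, instead of several trapping MMIO accesses to the xAPIC ICR halves. A sketch, with an illustrative wrmsr wrapper:

    #include <stdint.h>

    #define MSR_X2APIC_ICR  0x830   /* full 64-bit ICR as one MSR */

    static inline void
    sketch_wrmsr(uint32_t msr, uint64_t val)
    {
        __asm volatile("wrmsr" : : "c" (msr),
            "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32)));
    }

    static void
    sketch_x2apic_send_ipi(uint32_t dest_apicid, uint8_t vector)
    {
        /* destination in the high half, delivery mode and vector in the low half */
        sketch_wrmsr(MSR_X2APIC_ICR, ((uint64_t)dest_apicid << 32) | vector);
    }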

61d6df42 | 16-Jan-2015 | sf <sf@openbsd.org>
Binary code patching on amd64
This commit adds generic infrastructure to do binary code patching on amd64. The existing code patching for SMAP is converted to the new infrastructure.
More consumers and support for i386 will follow later.
This version of the diff has some simplifications in codepatch_fill_nop() compared to a version that was:
OK @kettenis @mlarkin @jsg
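
The rough shape of such an infrastructure, as a sketch rather than the tree's struct codepatch: each patchable site is recorded in a dedicated linker section with its length and a tag, and at boot the table is walked to fill sites with NOPs or replacement bytes.

    #include <stdint.h>
    #include <string.h>

    struct sketch_codepatch {
        uint8_t     *addr;  /* patch site in the kernel text */
        uint16_t    len;    /* size of the patchable window */
        uint16_t    tag;    /* which feature/option the site belongs to */
    };

    static void
    sketch_codepatch_nop(struct sketch_codepatch *start,
        struct sketch_codepatch *end, uint16_t tag)
    {
        struct sketch_codepatch *p;

        for (p = start; p < end; p++) {
            if (p->tag != tag)
                continue;
            /* the real code makes the text writable first and prefers multi-byte NOPs */
            memset(p->addr, 0x90, p->len);  /* 0x90 = one-byte NOP */
        }
    }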