6cbac32f | 12-Feb-2024 | guenther <guenther@openbsd.org>
Retpolines are an anti-pattern for IBT, so we need to shift protecting userspace from cross-process BTI to the kernel. Have each CPU track the last pmap run on in userspace and the last vmm VCPU in guest-mode and use the IBPB msr to flush predictors right before running in userspace on a different pmap or entering guest-mode on a different VCPU. Codepatch-nop the userspace bits and conditionalize the vmm bits to keep working if IBPB isn't supported.
ok deraadt@ kettenis@
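
A minimal sketch of the per-CPU tracking described above, under assumed names (sketch_*, ci_user_pmap); the real cpu_info field, the exact patch point on the return-to-userspace path, and the vmm side are not copied from the tree. IA32_PRED_CMD (MSR 0x49) with bit 0 set is what issues the IBPB.

    #include <stdint.h>

    #define MSR_PRED_CMD    0x49            /* IA32_PRED_CMD */
    #define PRED_CMD_IBPB   (1ULL << 0)     /* indirect branch prediction barrier */

    struct pmap;                            /* opaque for this sketch */

    struct sketch_cpu {
        struct pmap *ci_user_pmap;          /* last pmap this CPU ran userspace on */
    };

    static inline void
    sketch_wrmsr(uint32_t msr, uint64_t val)
    {
        __asm volatile("wrmsr" : : "c" (msr),
            "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32)));
    }

    /* just before returning to userspace on pmap 'pm' */
    static void
    sketch_ibpb_if_needed(struct sketch_cpu *ci, struct pmap *pm)
    {
        if (ci->ci_user_pmap != pm) {
            sketch_wrmsr(MSR_PRED_CMD, PRED_CMD_IBPB);  /* flush predictors */
            ci->ci_user_pmap = pm;
        }
    }

The guest-mode path does the analogous comparison against the last VCPU, and the userspace check is codepatched to NOPs when IBPB isn't available.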

1538f8cb | 31-Jul-2023 | guenther <guenther@openbsd.org>
On CPUs with eIBRS ("enhanced Indirect Branch Restricted Speculation") or IBT enabled in the kernel, the hardware should block the attacks which retpolines were created to prevent. In those cases, retpolines should be a net negative for security as they are an indirect branch gadget. They're also slower.
* use -mretpoline-external-thunk to give us control of the code used for indirect branches
* default to using a retpoline as before, but mark it and the other ASM kernel retpolines for code patching
* if the CPU has eIBRS, then enable it
* if the CPU has eIBRS *or* IBT, then codepatch the three different retpolines to just indirect jumps
make clean && make config required after this
ok kettenis@
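
For illustration, a sketch of the two sequences involved, not the kernel's actual thunk bodies: the classic retpoline construction for an indirect jump through %rax, and the plain jump it gets patched to once eIBRS or IBT makes the retpoline unnecessary.

    /* AT&T syntax; labels are illustrative */
    __asm(
    "       .text                           \n"
    "sketch_retpoline_rax:                  \n"
    "       call    1f                      \n" /* push a return address to overwrite */
    "0:     pause                           \n"
    "       lfence                          \n"
    "       jmp     0b                      \n" /* capture any speculation of the ret */
    "1:     mov     %rax, (%rsp)            \n" /* replace return address with target */
    "       ret                             \n" /* 'return' to *%rax */
    "                                       \n"
    "sketch_plain_jmp_rax:                  \n"
    "       jmp     *%rax                   \n"
    );

With eIBRS or IBT doing the hardware-side work, the retpoline body is pure overhead plus one more indirect-branch gadget, which is why the patched kernels keep only the plain jump.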

183b7dd1 | 31-Jul-2023 | guenther <guenther@openbsd.org>
The replacement code passed to codepatch_replace() can usefully be const.
suggested by bluhm@
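
In header terms the change is just a const qualifier on the replacement-code argument, so replacement snippets can live in .rodata; this prototype sketches the shape of the interface rather than quoting codepatch.h.

    #include <stddef.h>
    #include <stdint.h>

    /* the replacement bytes are only read, never written */
    void    codepatch_replace(uint16_t tag, const void *code, size_t len);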

40ce500b | 28-Jul-2023 | guenther <guenther@openbsd.org>
Add CODEPATCH_CODE() macro to simplify defining a symbol for a chunk of code to use in codepatching. Use that for all the existing codepatching snippets.
Similarly, add CODEPATCH_CODE_LEN() which is CODEPATCH_CODE() but also provides a short variable holding the length of the codepatch snippet. Use that for some snippets that will be used for retpoline replacement.
ok kettenis@ deraadt@
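
A rough sketch of what such assembler macros can look like, for .S files run through cpp; the section name, alignment, and the exact definitions in the real codepatch.h are assumptions here, not quotes.

    #define CODEPATCH_CODE(sym, instructions...)                \
            .section .rodata                                    ;\
            .globl  sym                                         ;\
    sym:    instructions                                        ;\
            .size   sym, . - sym

    /* same, plus a 16-bit "sym_len" holding the snippet's length */
    #define CODEPATCH_CODE_LEN(sym, instructions...)            \
            CODEPATCH_CODE(sym, instructions)                   ;\
            .globl  sym##_len                                   ;\
    sym##_len:                                                  \
            .short  . - sym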

55fdb5fa | 10-Jul-2023 | guenther <guenther@openbsd.org>
Enable Indirect Branch Tracking for amd64 userland, using XSAVES/XRSTORS to save/restore the state and enabling it at exec-time (and for signal handling) if the PS_NOBTCFI flag isn't set.
Note: this changes the format of the sc_fpstate data in the signal context to possibly be in compressed format: starting now we just guarantee that the state is in a format understood by the XRSTOR instruction of the system it is being executed on.
Passing sigreturn a corrupt sc_fpstate now results in the process exiting with no attempt to fix it up or send a T_PROTFLT trap. That may change.
prodding by deraadt@ issues with my original signal handling design identified by kettenis@
lots of base and ports preparation for this by deraadt@ and the libressl and ports teams
ok deraadt@ kettenis@
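
Since sc_fpstate may now be in the compacted XSAVES layout, a consumer can tell the two layouts apart from the XSAVE header: bit 63 of XCOMP_BV, at the offset documented in the Intel SDM. The sketch below only shows that check; how the kernel or libc actually handles it is not implied.

    #include <stdint.h>
    #include <string.h>

    #define XSAVE_LEGACY_SIZE   512             /* legacy x87/SSE region precedes the header */
    #define XCOMP_BV_OFFSET     (XSAVE_LEGACY_SIZE + 8)
    #define XCOMP_BV_COMPACT    (1ULL << 63)    /* set => compacted (XSAVES) format */

    static int
    sketch_fpstate_is_compacted(const void *sc_fpstate)
    {
        uint64_t xcomp_bv;

        memcpy(&xcomp_bv, (const uint8_t *)sc_fpstate + XCOMP_BV_OFFSET,
            sizeof(xcomp_bv));
        return (xcomp_bv & XCOMP_BV_COMPACT) != 0;
    }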

67ca69ec | 11-Mar-2020 | guenther <guenther@openbsd.org>
Take a swing at blocking Load-Value-Injection attacks against the kernel by using lfence in place of stac/clac on pre-SMAP CPUs. To quote from https://software.intel.com/security-software-guidance/insights/deep-dive-load-value-injection "If the OS makes use of Supervisor Mode Access Prevention (SMAP) on processors with SMAP enabled, then LVI on kernel load from user pages will be mitigated. This is because the CLAC and STAC instructions have LFENCE semantics on processors affected by LVI, and this serves as a speculation fence around kernel loads from user pages."
ok deraadt@
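
The gist, as a sketch with illustrative wrappers (not the kernel's actual patch sites): on SMAP CPUs the user-access window is opened and closed with stac/clac, which on LVI-affected parts also act as LFENCEs; on pre-SMAP CPUs there is no AC bit to manage, so a bare lfence keeps the same speculation barrier around kernel loads from user pages.

    /* SMAP present: open/close the user-access window */
    static inline void
    sketch_uaccess_begin_smap(void)
    {
        __asm volatile("stac" ::: "memory");
    }

    static inline void
    sketch_uaccess_end_smap(void)
    {
        __asm volatile("clac" ::: "memory");
    }

    /* pre-SMAP: nothing to open, but keep the fence */
    static inline void
    sketch_uaccess_begin_presmap(void)
    {
        __asm volatile("lfence" ::: "memory");
    }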

457e2542 | 28-Feb-2020 | deraadt <deraadt@openbsd.org>
oops some snapshot tests fell in

7042a378 | 28-Feb-2020 | deraadt <deraadt@openbsd.org>
sync

5c3fa5a3 | 07-Aug-2019 | guenther <guenther@openbsd.org>
Mitigate CVE-2019-1125: block speculation past conditional jump to mis-skip or mis-take swapgs in interrupt path and in trap/fault/exception path. The latter is improved to have no conditionals around this when Meltdown mitigation is in effect. Codepatch out the fences based on the description of CPU bugs in the (well written) Linux commit message.
feedback from kettenis@ ok deraadt@
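
A sketch of the entry-path shape being described, assuming an interrupt frame with no error code (saved %cs at 8(%rsp)); the lfence at the join point keeps a mis-predicted skip or take of swapgs from letting %gs-relative loads run speculatively with the wrong GS base. The real stubs codepatch these fences away on unaffected CPUs.

    __asm(
    "       .text                           \n"
    "sketch_intr_entry:                     \n"
    "       testb   $3, 8(%rsp)             \n" /* low bits of saved %cs: 0 = from kernel */
    "       jz      1f                      \n"
    "       swapgs                          \n" /* from userspace: load kernel GS base */
    "1:     lfence                          \n" /* both paths fenced before any %gs use */
    "       /* real handler continues here */\n"
    );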

b8d87cb1 | 07-Aug-2019 | guenther <guenther@openbsd.org>
Add codepatch_jmp(), like codepatch_call() but inserting a jmp instead of a call.
tweaked based on feedback from kettenis@ ok deraadt@
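
Both helpers come down to writing a 5-byte rel32 transfer over the patch site, with only the opcode differing. A sketch, assuming the text has already been made writable (the real codepatch code takes care of that):

    #include <stdint.h>
    #include <string.h>

    #define OPC_CALL_REL32  0xe8
    #define OPC_JMP_REL32   0xe9

    static void
    sketch_patch_rel32(uint8_t *site, uint8_t opcode, const void *target)
    {
        uint8_t insn[5];
        int32_t rel = (int32_t)((const uint8_t *)target - (site + 5));

        insn[0] = opcode;                       /* call or jmp */
        memcpy(&insn[1], &rel, sizeof(rel));    /* rel32 is relative to the next insn */
        memcpy(site, insn, sizeof(insn));       /* assumes site is writable here */
    }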

a0dcb178 | 17-May-2019 | guenther <guenther@openbsd.org>
Mitigate Intel's Microarchitectural Data Sampling vulnerability. If the CPU has the new VERW behavior then that is used; otherwise the proper sequence from Intel's "Deep Dive" doc is used in the return-to-userspace and enter-VMM-guest paths. The enter-C3-idle path is not mitigated because it's only a problem when SMT/HT is enabled: mitigating everything when that's enabled would be a _huge_ set of changes that we see no point in doing.
Update vmm(4) to pass through the MSR bits so that guests can apply the optimal mitigation.
VMM help and specific feedback from mlarkin@ vendor-portability help from jsg@ and kettenis@ ok kettenis@ mlarkin@ deraadt@ jsg@
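
The "new VERW behavior" means a VERW whose memory operand is any valid writable data-segment selector also clears the affected CPU buffers, so the return-to-userspace and enter-guest paths can end with something like the sketch below. The selector value is only a stand-in; in a real kernel it comes from the GDT layout.

    #include <stdint.h>

    static void
    sketch_md_clear(void)
    {
        static const uint16_t sel = 0x10;   /* illustrative data-segment selector */

        __asm volatile("verw %0" : : "m" (sel) : "cc");
    }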

f95e373f | 04-Oct-2018 | guenther <guenther@openbsd.org>
Use PCIDs where they and the INVPCID instruction are available. This uses one PCID for kernel threads, one for the U+K tables of normal processes, one for the matching U-K tables (when Meltdown is in effect), and one for temporary mappings when poking other processes. Some further tweaks are envisioned but this is good enough to provide more separation and has (finally) been stable under ports testing.
lots of ports testing and valid complaints from naddy@ and sthen@ feedback from mlarkin@ and sf@
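
A sketch of the assignment described above; the PCID numbers and the helper are illustrative. Bit 63 of the loaded %cr3 value (the "no flush" bit) is what lets a reload keep the cached translations tagged with that PCID.

    #include <stdint.h>

    #define PCID_KERN       0   /* kernel threads (kernel-only pmap) */
    #define PCID_PROC       1   /* U+K tables of a normal process */
    #define PCID_PROC_UK    2   /* matching U-K tables when Meltdown mitigation is on */
    #define PCID_TEMP       3   /* temporary mappings while poking another process */

    #define CR3_REUSE_PCID  (1ULL << 63)    /* don't flush this PCID's TLB entries */

    static inline uint64_t
    sketch_make_cr3(uint64_t pml4_pa, uint64_t pcid, int reuse)
    {
        return pml4_pa | pcid | (reuse ? CR3_REUSE_PCID : 0);
    }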

5f6ecb19 | 13-Jul-2018 | sf <sf@openbsd.org>
Disable codepatching infrastructure after boot
This way, it is not available for use in ROP attacks. This diff puts the codepatching code into a separate section and unmaps that section after boot. In the future, the memory could potentially be reused but that would require larger changes.
ok pguenther@

1fc8fad1 | 12-Jul-2018 | guenther <guenther@openbsd.org>
Reorganize the Meltdown entry and exit trampolines for syscall and traps so that the "mov %rax,%cr3" is followed by an infinite loop which is avoided because the mapping of the code being executed is changed. This means the sysretq/iretq isn't even present in that flow of instructions in the kernel mapping, so userspace code can't be speculatively reached on the kernel mapping, and it totally eliminates the conditional jump over the %cr3 change that supported CPUs without the Meltdown vulnerability. The return paths were probably vulnerable to Spectre v1 (and v1.1/1.2) style attacks, speculatively executing user code post-system-call with the kernel mappings, thus creating cache/TLB/etc side-effects.
Would like to apply this technique to the interrupt stubs too, but I'm hitting a bug in clang's assembler which misaligns the code and symbols.
While here, when on a CPU not vulnerable to Meltdown, codepatch out the unnecessary bits in cpu_switchto().
Inspiration from sf@, refined over dinner with theo ok mlarkin@ deraadt@
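
Roughly, the trick looks like the sketch below: the bytes following the %cr3 write differ between the kernel-text mapping (an infinite loop) and the user-visible trampoline alias of the same addresses (the iretq), so the kernel mapping contains no reachable iretq and no conditional branch around the %cr3 switch is needed. Labels are illustrative and the real trampolines do more bookkeeping.

    __asm(
    "       .text                                   \n"
    "       /* as laid out in the kernel mapping */ \n"
    "sketch_exit_kern_view:                         \n"
    "       mov     %rax, %cr3                      \n" /* switch to the user page tables */
    "0:     pause                                   \n"
    "       jmp     0b                              \n" /* never reached: the new mapping */
    "                                               \n" /* has different bytes here */
    "       /* the user-visible alias of the same addresses holds instead: */\n"
    "sketch_exit_user_view:                         \n"
    "       mov     %rax, %cr3                      \n"
    "       iretq                                   \n"
    );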

c9de630f | 05-Jun-2018 | guenther <guenther@openbsd.org>
Switch from lazy FPU switching to semi-eager FPU switching: track whether curproc's xstate ("extended state") is loaded in the CPU or not.
- context switch, sendsig(), vmm, and doing CPU crypto in the kernel all check the flag and, if set, save the old thread's state to the PCB, clear the flag, and then load the _blank_ state
- when returning to userspace, if the flag is clear then set it and restore the thread's state
This simpler tracking also fixes the restoring of FPU state after nested signal handlers.
With this, %cr0's TS flag is never set, the FPU #DNA trap can no longer happen, and IPIs are no longer necessary for flushing or syncing FPU state; on the other hand, restoring xstate while returning to userspace means we have to handle xrstor faulting if we could be loading an altered state. If that happens, reset the state, fake a #GP fault (SIGBUS), and recheck for ASTs.
While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by using codepatching to switch to xsave/xrstor when present in the CPU. In addition, code patch in use of xsaveopt in most places when the CPU supports that. Use the 64bit-wide variants of the instructions in all cases so that x87 instruction fault IPs are reported correctly.
This change has three motivations:
1) with modern clang, SSE registers are used even in rcrt0.o, making lazy FPU switching a smaller benefit vs trap costs
2) the Intel SDM warns that lazy FPU switching may increase power costs
3) post-Spectre rumors suggest that the %cr0 TS flag might not block speculation, permitting leaking of information about FPU state (AES keys?) across protection boundaries.
tested by many in snaps; prodding from deraadt@
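
The 64-bit-wide forms mentioned above take the requested-feature bitmap in %edx:%eax. A sketch of the pair; the real kernel codepatches between fxsave64, xsave64 and xsaveopt64 at the save sites rather than calling fixed wrappers like these.

    #include <stdint.h>

    static inline void
    sketch_xsave64(void *area, uint64_t mask)
    {
        __asm volatile("xsave64 (%0)"
            : : "r" (area), "a" ((uint32_t)mask), "d" ((uint32_t)(mask >> 32))
            : "memory");
    }

    static inline void
    sketch_xrstor64(const void *area, uint64_t mask)
    {
        __asm volatile("xrstor64 (%0)"
            : : "r" (area), "a" ((uint32_t)mask), "d" ((uint32_t)(mask >> 32))
            : "memory");
    }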

019cf0fb | 25-Aug-2017 | guenther <guenther@openbsd.org>
If SMAP is present, clear PSL_AC on kernel entry and interrupt so that only the code in copy{in,out}* that needs it runs with it set. Panic if it's set on entry to trap() or syscall(). Prompted by Maxime Villard's NetBSD work.
ok kettenis@ mlarkin@ deraadt@
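
A sketch of the sanity check being described; PSL_AC is the EFLAGS AC bit (0x00040000), the frame layout is illustrative, and panic() stands in for the kernel's own.

    #define PSL_AC  0x00040000      /* EFLAGS alignment-check bit, overrides SMAP when set */

    struct sketch_trapframe {
        unsigned long   tf_rflags;
    };

    extern void panic(const char *, ...);

    /* the entry stubs already ran clac, so AC must never still be set here */
    static void
    sketch_check_psl_ac(const struct sketch_trapframe *tf)
    {
        if (tf->tf_rflags & PSL_AC)
            panic("AC flag set on kernel entry");
    }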

2daa5239 | 01-Jul-2017 | sf <sf@openbsd.org>
Use absolute pointers in codepatch entries
Instead of offsets to KERNBASE, store absolute pointers in the codepatch entries. This makes the resulting kernel a few KB larger on amd64, but KERNBASE will go away when ASLR is introduced.
Requested by deraadt@

984d4744 | 19-Apr-2015 | sf <sf@openbsd.org>
Add support for x2apic mode
This is currently only enabled on hypervisors because on real hardware, it requires interrupt remapping which we don't support yet. But on virtualization it reduces the number of vmexits required per IPI from 4 to 1, causing a significant speed-up for MP guests.
ok kettenis@
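
The vmexit saving comes from the ICR being a single MSR in x2apic mode: one wrmsr sends an IPI, instead of several trapping MMIO accesses to the xAPIC ICR halves. A sketch, with an illustrative wrmsr wrapper:

    #include <stdint.h>

    #define MSR_X2APIC_ICR  0x830   /* full 64-bit ICR as one MSR */

    static inline void
    sketch_wrmsr(uint32_t msr, uint64_t val)
    {
        __asm volatile("wrmsr" : : "c" (msr),
            "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32)));
    }

    static void
    sketch_x2apic_send_ipi(uint32_t dest_apicid, uint8_t vector)
    {
        /* destination in the high half, delivery mode and vector in the low half */
        sketch_wrmsr(MSR_X2APIC_ICR, ((uint64_t)dest_apicid << 32) | vector);
    }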

61d6df42 | 16-Jan-2015 | sf <sf@openbsd.org>
Binary code patching on amd64
This commit adds generic infrastructure to do binary code patching on amd64. The existing code patching for SMAP is converted to the new infrastructure.
More consumers and support for i386 will follow later.
This version of the diff has some simplifications in codepatch_fill_nop() compared to a version that was:
OK @kettenis @mlarkin @jsg
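
The rough shape of such an infrastructure, as a sketch rather than the tree's struct codepatch: each patchable site is recorded in a dedicated linker section with its length and a tag, and at boot the table is walked to fill sites with NOPs or replacement bytes.

    #include <stdint.h>
    #include <string.h>

    struct sketch_codepatch {
        uint8_t     *addr;  /* patch site in the kernel text */
        uint16_t    len;    /* size of the patchable window */
        uint16_t    tag;    /* which feature/option the site belongs to */
    };

    static void
    sketch_codepatch_nop(struct sketch_codepatch *start,
        struct sketch_codepatch *end, uint16_t tag)
    {
        struct sketch_codepatch *p;

        for (p = start; p < end; p++) {
            if (p->tag != tag)
                continue;
            /* the real code makes the text writable first and prefers multi-byte NOPs */
            memset(p->addr, 0x90, p->len);  /* 0x90 = one-byte NOP */
        }
    }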