History log of /netbsd-src/sys/arch/amd64/include/frameasm.h (Results 1 – 25 of 55)
Revision Date Author Comments
# 376bcb54 30-Jul-2022 riastradh <riastradh@NetBSD.org>

x86: Eliminate mfence hotpatch for membar_sync.

The more-compatible LOCK ADD $0,-N(%rsp) turns out to be cheaper
than MFENCE anyway. Let's save some space and maintenance and rip
out the hotpatching for it.

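As an illustration of the idiom, a membar_sync built on LOCK ADD might
look like the sketch below (hypothetical, assuming GNU-style inline
asm; the function name is invented):

	/*
	 * A locked read-modify-write on a dead stack slot is a
	 * full barrier on x86, like MFENCE but cheaper.
	 */
	static inline void
	membar_sync_sketch(void)
	{
		__asm__ volatile("lock; addq $0,-8(%%rsp)" ::: "memory");
	}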


# e0c914a7 09-Apr-2022 riastradh <riastradh@NetBSD.org>

x86: Every load is a load-acquire, so membar_consumer is a noop.

lfence is only needed for MD logic, such as operations on I/O memory
rather than normal cacheable memory, or special instructions like
RDTSC -- never for MI synchronization between threads/CPUs. No need
for hot-patching to do lfence here.

(The x86_lfence function might reasonably be patched on i386 to do
lfence for MD logic, but it isn't now and this doesn't change that.)

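For illustration, under x86 TSO a consumer barrier reduces to a
compiler barrier; a hypothetical sketch (name invented):

	static inline void
	membar_consumer_sketch(void)
	{
		/*
		 * No fence instruction: every x86 load is already a
		 * load-acquire. Only compiler reordering must be
		 * prevented, hence the empty asm with a memory clobber.
		 */
		__asm__ volatile("" ::: "memory");
	}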


# 63587e37 17-Apr-2021 rillig <rillig@NetBSD.org>

sys/arch/amd64: remove trailing whitespace


# 95a0a188 19-Jul-2020 maxv <maxv@NetBSD.org>

Revert most of ad's movs/stos change. Instead do something a lot
simpler: declare svs_quad_copy(), used by SVS only, with no need for
instrumentation, because SVS is disabled when sanitizers are on.


# c1ac8a41 21-Jun-2020 bouyer <bouyer@NetBSD.org>

Fix comment


# 248fe10b 01-Jun-2020 ad <ad@NetBSD.org>

Reported-by: syzbot+6dd5a230d19f0cbc7814@syzkaller.appspotmail.com

Instrument STOS/MOVS for KMSAN to unbreak it.


# 129e4c2b 26-Apr-2020 maxv <maxv@NetBSD.org>

Use the hotpatch framework for LFENCE/MFENCE.


# c24c993f 25-Apr-2020 bouyer <bouyer@NetBSD.org>

Merge the bouyer-xenpvh branch, bringing in support for Xen PV drivers
under HVM guests in GENERIC.
Xen support can be disabled at runtime with
boot -c
disable hypervisor


# e7e5aa03 17-Nov-2019 maxv <maxv@NetBSD.org>

Disable KCOV - by raising the interrupt level - in the TLB IPI handler,
because this is only noise.


# 10c5b023 14-Nov-2019 maxv <maxv@NetBSD.org>

Add support for Kernel Memory Sanitizer (kMSan). It detects uninitialized
memory used by the kernel at run time, and just like kASan and kCSan, it
is an excellent feature. It has already detected 38 uninitialized variables
in the kernel during my testing, which I have since discreetly fixed.

We use two shadows:
- "shad", to track uninitialized memory with a bit granularity (1:1).
Each bit set to 1 in the shad corresponds to one uninitialized bit of
real kernel memory.
- "orig", to track the origin of the memory with a 4-byte granularity
(1:1). Each uint32_t cell in the orig indicates the origin of the
associated uint32_t of real kernel memory.

The memory consumption of these shadows is significant, so at least 4GB of
RAM is recommended to run kMSan.

The compiler inserts calls to specific __msan_* functions on each memory
access, to manage both the shad and the orig and detect uninitialized
memory accesses that change the execution flow (like an "if" on an
uninitialized variable).

We mark as uninit several types of memory buffers (stack, pools, kmem,
malloc, uvm_km), and check each buffer passed to copyout, copyoutstr,
bwrite, if_transmit_lock and DMA operations, to detect uninitialized memory
that leaves the system. This allows us to detect kernel info leaks in a way
that is more efficient and also more user-friendly than KLEAK.

Unlike kASan, kMSan requires comprehensive coverage: we cannot
tolerate even one non-instrumented function, because that could cause
false positives. kMSan cannot instrument ASM functions, so I converted
most of them to __asm__ inlines, which kMSan is able to instrument. Those
that remain receive special treatment.

Again unlike kASan, kMSan uses TLS, so we must context-switch this
TLS during interrupts. We use different contexts depending on the interrupt
level.

The orig tracks precisely the origin of a buffer. We use a special encoding
for the orig values, and pack together in each uint32_t cell of the orig:
- a code designating the type of memory (Stack, Pool, etc), and
- a compressed pointer, which points either (1) to a string containing
the name of the variable associated with the cell, or (2) to an area
in the kernel .text section which we resolve to a symbol name + offset.

This encoding avoids consuming extra memory to associate information
with each cell, and produces precise output: it can tell, for example,
the name of an uninitialized variable on the stack, the function in
which it was pushed on the stack, and the function where we
accessed this uninitialized variable.
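
A hypothetical sketch of this packing (field widths, type codes and
names are invented; the real encoding lives in the kMSan sources):

	#include <stdint.h>

	#define ORIG_TYPE_STACK	0x1u	/* invented type codes */
	#define ORIG_TYPE_POOL	0x2u

	/* Pack a type code and a compressed pointer into one cell. */
	static inline uint32_t
	orig_encode_sketch(uint32_t type, uint32_t cptr)
	{
		return (type << 30) | (cptr & 0x3fffffffu);
	}

	static inline uint32_t
	orig_type_sketch(uint32_t cell)
	{
		return (cell >> 30);
	}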

kMSan is available with LLVM, but not with GCC.

The code is organized in a way that is similar to kASan and kCSan, so
architectures other than amd64 can be supported.



# 560337f7 12-Oct-2019 maxv <maxv@NetBSD.org>

Rewrite the FPU code on x86. This greatly simplifies the logic and removes
the dependency on IPL_HIGH. NVMM is updated accordingly. Posted on
port-amd64 a week ago.

Bump the kernel version to 9.99.16.



# 125e142f 18-May-2019 maxv <maxv@NetBSD.org>

Two changes in the CPU mitigations:

* Micro-optimize: put every mitigation in the same branch. This removes
two branches in each exc/int return path, and removes all branches in
the syscall return path.

* Modify the SpectreV2 mitigation to be compatible with SpectreV4. I
recently realized that both couldn't be enabled at the same time on
Intel. This is because initially, when there was just SpectreV2, we
could reset the whole IA32_SPEC_CTRL MSR. But then Intel added another
bit in it for SpectreV4, so it isn't right to reset it entirely
anymore. SSBD needs to stay.

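In effect the fix turns a blind MSR reset into a read-modify-write; a
sketch (rdmsr/wrmsr stand for the usual MSR accessors, declared here
as assumptions; MSR number and bit positions per the Intel SDM):

	#include <stdint.h>

	#define MSR_IA32_SPEC_CTRL	0x048
	#define SPEC_CTRL_IBRS		(1ULL << 0)
	#define SPEC_CTRL_SSBD		(1ULL << 2)

	extern uint64_t rdmsr(unsigned int);
	extern void wrmsr(unsigned int, uint64_t);

	/* Clear IBRS only; writing plain 0 would lose SSBD. */
	static inline void
	spec_ctrl_exit_kernel_sketch(void)
	{
		uint64_t v = rdmsr(MSR_IA32_SPEC_CTRL);

		wrmsr(MSR_IA32_SPEC_CTRL, v & ~SPEC_CTRL_IBRS);
	}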


# 74b8eea5 14-May-2019 maxv <maxv@NetBSD.org>

Mitigation for INTEL-SA-00233: Microarchitectural Data Sampling (MDS).

It requires a microcode update, now available on the Intel website. The
microcode modifies the behavior of the VERW instruction, and makes it flush
internal CPU buffers. We hotpatch the return-to-userland path to add VERW.

Two sysctls are added:

machdep.mds.mitigated = {0/1} user-settable
machdep.mds.method = {string} constructed by the kernel

The kernel will automatically enable the mitigation if the updated
microcode is present. If the new microcode is not present, the user can
load it via cpuctl, and set machdep.mds.mitigated=1.

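The hotpatched flush amounts to something like this sketch (the
selector value is an assumption; any valid selector will do):

	#include <stdint.h>

	static inline void
	mds_verw_sketch(void)
	{
		static const uint16_t sel = 0x10; /* assumed selector */

		/*
		 * With the updated microcode, VERW also flushes the
		 * affected internal CPU buffers.
		 */
		__asm__ volatile("verw %0" :: "m"(sel) : "memory", "cc");
	}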


# 427af037 11-Feb-2019 cherry <cherry@NetBSD.org>

We reorganise definitions for XEN source support as follows:

XEN - common sources required for baseline XEN support.
XENPV - sources required for support of XEN in PV mode.
XENPVHVM - sources required for support of XEN in HVM mode.
XENPVH - sources required for support of XEN in PVH mode.



# afb643e1 12-Aug-2018 maxv <maxv@NetBSD.org>

Move the PCPU area from slot 384 to slot 510, to avoid creating too much
fragmentation in the slot space (384 is in the middle of the kernel half
of the VA).


# 2681a041 13-Jul-2018 martin <martin@NetBSD.org>

Provide empty SVS_ENTER_NMI/SVS_LEAVE_NMI for kernels w/o options SVS


# 8b7b3795 12-Jul-2018 maxv <maxv@NetBSD.org>

Handle NMIs correctly when SVS is enabled. We store the kernel's CR3 at the
top of the NMI stack, and we unconditionally switch to it, because we don't
know with which page tables we received the NMI. Hotpatch the whole thing as
usual.

This restores the ability to use PMCs on Intel CPUs.

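The shape of the trick, as a sketch (structure and names invented for
illustration):

	#include <stdint.h>

	/*
	 * The kernel CR3 is stashed in a fixed slot at the top of
	 * the NMI stack when that stack is set up.
	 */
	struct nmi_stack_top_sketch {
		uint64_t kernel_cr3;
	};

	/*
	 * On NMI entry we cannot know which page tables are live,
	 * so we load the stashed kernel CR3 unconditionally.
	 */
	static inline void
	nmi_switch_cr3_sketch(const struct nmi_stack_top_sketch *top)
	{
		__asm__ volatile("movq %0,%%cr3"
		    :: "r"(top->kernel_cr3) : "memory");
	}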


# 0223f0c8 28-Mar-2018 maxv <maxv@NetBSD.org>

Add the IBRS mitigation for SpectreV2 on amd64.

Different operations are performed during context transitions:

user->kernel: IBRS <- 1
kernel->user: IBRS <- 0

And during context switches:

user->user: IBPB <- 0
kernel->user: IBPB <- 0
[user->kernel: IBPB <- 0, this one may not be needed]

We use two macros, IBRS_ENTER and IBRS_LEAVE, to set the IBRS bit. They
are hotpatched for better performance, like SVS.

The idea is that IBRS is a "privileged" bit, which is set to 1 in kernel
mode and 0 in user mode. To protect the branch predictor between user
processes (which are of the same privilege), we use the IBPB barrier.

The Intel manual also talks about (MWAIT/HLT)+HyperThreading, and says
that when using either of the two instructions IBRS must be disabled for
better performance on the core. I'm not totally sure about this part, so
I'm not adding it now.

IBRS is available only when the Intel microcode update is applied. The
mitigation must be enabled manually with machdep.spectreV2.mitigated.

Tested by msaitoh a week ago (but I adapted a few things since). Probably
more changes to come.

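In C terms, IBRS_ENTER/IBRS_LEAVE boil down to MSR writes like the
sketch below (the real macros are hotpatched asm; the wrmsr prototype
is an assumption, names invented):

	#include <stdint.h>

	#define MSR_IA32_SPEC_CTRL	0x048
	#define SPEC_CTRL_IBRS		(1ULL << 0)

	extern void wrmsr(unsigned int, uint64_t);

	static inline void
	ibrs_enter_sketch(void)		/* user->kernel: IBRS <- 1 */
	{
		wrmsr(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);
	}

	static inline void
	ibrs_leave_sketch(void)		/* kernel->user: IBRS <- 0 */
	{
		wrmsr(MSR_IA32_SPEC_CTRL, 0);
	}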


# ccc038a8 25-Feb-2018 maxv <maxv@NetBSD.org>

Remove INTRENTRY_L, it's not used anymore.


# 5ab8a4e7 22-Feb-2018 maxv <maxv@NetBSD.org>

Make the machdep.svs_enabled sysctl writable, and add the kernel code
needed to disable SVS at runtime.

We set 'svs_enabled' to false, and hotpatch the kernel entry/exit points
to eliminate the context switch code.

We need to make sure there is no remote CPU that is executing the code we
are hotpatching. So we use two barriers:

* After the first one each CPU is guaranteed to be executing in
svs_disable_cpu with interrupts disabled (this way it can't leave this
place).

* After the second one it is guaranteed that SVS is disabled, so we flush
the cache, enable interrupts and continue execution normally.

Between the two barriers, cpu0 will disable SVS (svs_enabled=false and
hotpatch), and each CPU will restore the generic syscall entry point.

Three notes:

* We should call svs_pgg_update(true) afterwards, to put back PG_G on
the kernel pages (for better performance). This will be done in another
commit.

* The fact that we disable interrupts does not prevent us from receiving
an NMI, and it would be problematic. So we need to add some code to
verify that PMCs are disabled before hotpatching. This will be done
in another commit.

* In svs_disable() we expect each CPU to be online. We need to add a
check to make sure they indeed are.

The sysctl allows only a 1->0 transition. There is no point in doing 0->1
transitions anyway, and it would be complicated to implement because we
need to re-synchronize the CPU user page tables with the current ones (we
lost track of them in the last 1->0 transition).

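The two-barrier rendezvous can be sketched with C11 atomics (userland
flavor, all names invented; the kernel uses its own primitives and
spins with interrupts disabled):

	#include <stdatomic.h>
	#include <stdbool.h>

	static atomic_uint barrier1, barrier2;
	static unsigned int ncpu_sketch = 4;	/* assumed CPU count */

	static void
	svs_disable_rendezvous_sketch(bool is_cpu0)
	{
		/* Barrier 1: every CPU is now parked here. */
		atomic_fetch_add(&barrier1, 1);
		while (atomic_load(&barrier1) < ncpu_sketch)
			continue;

		if (is_cpu0) {
			/* cpu0: svs_enabled = false, hotpatch the
			 * kernel entry/exit points. */
		}
		/* Every CPU: restore the generic syscall entry. */

		/* Barrier 2: SVS is now disabled everywhere. */
		atomic_fetch_add(&barrier2, 1);
		while (atomic_load(&barrier2) < ncpu_sketch)
			continue;

		/* Flush the cache, re-enable interrupts, go on. */
	}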


# ebc1f703 22-Feb-2018 maxv <maxv@NetBSD.org>

Add a dynamic detection for SVS.

The SVS_* macros are now compiled as skip-noopt. When the system boots, if
the cpu is from Intel, they are hotpatched to their real content.
Typically:

jmp 1f
int3
int3
int3
... int3 ...
1:

gets hotpatched to:

movq SVS_UTLS+UTLS_KPDIRPA,%rax
movq %rax,%cr3
movq CPUVAR(KRSP0),%rsp

The two chunks of code are exactly the same size. We fill the
placeholder with int3 (0xCC) to make sure we never execute there.

In the non-SVS (ie non-Intel) case, all it costs is one jump. Given that
the SVS_* macros are small, this jump will likely leave us in the same
icache line, so it's pretty fast.

The syscall entry point is special, because there we use a scratch uint64_t
not in curcpu but in the UTLS page, and it's difficult to hotpatch this
properly. So instead of hotpatching we declare the entry point as an ASM
macro, and define two functions: syscall and syscall_svs, the latter being
the one used in the SVS case.

While here, 'syscall' is optimized not to contain an SVS_ENTER - this way
we don't even need to do a jump in the non-SVS case.

When adding pages in the user page tables, make sure we don't have PG_G,
now that it's dynamic.

A read-only sysctl is added, machdep.svs_enabled, that tells whether the
kernel uses SVS or not.

More changes to come, svs_init() is not very clean.

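The key invariant - the replacement is exactly as long as the jmp+int3
placeholder - can be sketched as follows (names invented; real
patching must also handle write protection and CPU synchronization):

	#include <assert.h>
	#include <stddef.h>
	#include <string.h>

	static void
	hotpatch_sketch(void *placeholder, const void *newcode,
	    size_t placeholder_len, size_t newcode_len)
	{
		/*
		 * The two chunks of code must be the exact same
		 * size, so surrounding instructions are untouched.
		 */
		assert(placeholder_len == newcode_len);
		memcpy(placeholder, newcode, newcode_len);
	}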


# 2e43b1e6 27-Jan-2018 maxv <maxv@NetBSD.org>

Put the default %cs value in INTR_RECURSE_HWFRAME. Pushing an immediate
costs less than reading the %cs register and pushing its value, and here
the value can never be anything other than GSEL(GCODE_SEL,SEL_KPL).



# 7318ea98 27-Jan-2018 maxv <maxv@NetBSD.org>

Declare and use INTR_RECURSE_ENTRY, an optimized version of INTRENTRY.
When processing deferred interrupts, we are always entering the new
handler in kernel mode, so there is no point performing the userland
checks.

Saves several instructions.



# 21a8fbaf 27-Jan-2018 maxv <maxv@NetBSD.org>

Remove DO_DEFERRED_SWITCH and DO_DEFERRED_SWITCH_RETRY, unused.


# cea874c7 21-Jan-2018 maxv <maxv@NetBSD.org>

Unmap the kernel from userland in SVS, and leave only the needed
trampolines. As explained below, SVS should now completely mitigate
Meltdown on GENERIC kernels, even though it needs some more tweaking
for GENERIC_KASLR.

Until now the kernel entry points looked like:

FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
... handle interrupt ...
INTRFASTEXIT
END(intr)

With this change they are split and become:

FUNC(handle)
... handle interrupt ...
INTRFASTEXIT
END(handle)

TEXT_USER_BEGIN
FUNC(intr)
pushq $ERR
pushq $TRAPNO
INTRENTRY
jmp handle
END(intr)
TEXT_USER_END

A new section is introduced, .text.user, that contains minimal kernel
entry/exit points. In order to choose what to put in this section, two
macros are introduced, TEXT_USER_BEGIN and TEXT_USER_END.

The section is mapped in userland with normal 4K pages.

In GENERIC, the section is 4K-page-aligned and embedded in .text, which
is mapped with large pages. That is to say, when an interrupt comes in,
the CPU has the user page tables loaded and executes the 'intr' functions
on 4K pages; after calling SVS_ENTER (in INTRENTRY) these 4K pages become
2MB large pages, and remain so when executing in kernel mode.

In GENERIC_KASLR, the section is 4K-page-aligned and independent from the
other kernel texts. The prekern just picks it up and maps it at a random
address.

In GENERIC, SVS should now completely mitigate Meltdown: what we put in
.text.user is not secret.

In GENERIC_KASLR, SVS would have to be improved a bit more: the
'jmp handle' instruction is actually secret, since it leaks the address
of the section we are jumping into. By exploiting Meltdown on Intel, this
theoretically allows a local user to reconstruct the address of the first
text section. But given that our KASLR produces several texts, and that
each section is not correlated with the others, the level of protection
KASLR provides is still good.

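The section-switch macros plausibly reduce to assembler section
directives, along these lines (a sketch, not necessarily the literal
definitions in frameasm.h):

	#define TEXT_USER_BEGIN	.pushsection .text.user, "ax"
	#define TEXT_USER_END	.popsection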

