#
ffff71aa |
| 14-Jan-2025 |
mvs <mvs@openbsd.org> |
Unlock sysctl_malloc().
Move `buckstring' and `memall' strings initialization to the end of kmeminit() to make them immutable. The rest of sysctl_malloc() accessed data is already mp-safe.
ok claudio
|
#
276b1d40 |
| 13-Jan-2025 |
mvs <mvs@openbsd.org> |
Rework the INP tables walkthrough in the KERN_FILE_BYFILE case of the sysctl_file().
I don't like how KERN_FILE_BYFILE case of the sysctl_file() delivers sockets data to the userland. It not only takes exclusive netlock around all except divert socket tables walkthrough, but also does copyout() with mutex(9) held. Sounds strange, but the context switch is avoided because userland pages are wired.
The table is protected with `inpt_mtx' mutex(9), so the socket lock or netlock should be taken outside. Since we have no socket pointer, we can't use solock(). We can't use only the shared netlock because this leaves the socket unprotected against concurrent threads which rely on solock_shared().
Now it is possible to rework sysctl_file(). We can use reference counting for all socket types, and bluhm@ introduced `inp_sofree_mtx' mutex(9) to protect `inp_socket'. The INP tables have the special iterator to safely release `inpt_mtx' during table walkthrough.
The FILLSO() or FILLIT2() macros can't be used unrolled, because we need to push mutexes re-locking and solock() deep within. Introduce the FILLINPTABLE() macro which is the unrolling, but with the socket related locking dances. FILLIT2() macro is not required anymore and was merged with FILLIT().
The current implementation takes a reference on `inp_socket' and releases the `inpt_mtx' mutex(9). Now it is possible to use solock_shared() on the socket instead of the netlock while performing fill_file(). The copyout() is external to fill_file() and touches nothing that requires protection, so it can be made lockless.
The KERN_FILE_BYFILE case became mp-safe, but the rest of the sysctl_file() cases are still not, so the sysctl_vslock() dances are left as is.
ok bluhm
|
#
bec77b8e |
| 04-Jan-2025 |
mvs <mvs@openbsd.org> |
Unlock sysctl_dopool().
sysctl_dopool() only delivers pool(9) statistics; moreover, it already relies on pool(9) related locks, so it is mp-safe as is. It relies on the `pool_lock' rwlock(9) to make the `pp' pool pointer dereference safe, so copyout()s, M_WAITOK malloc()s and yield() calls happen locked too. Introduce the `pr_refcnt' reference counter to make them lockless.
ok dlg
|
#
c7672172 |
| 28-Dec-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_NOSUIDCOREDUMP.
`nosuidcoredump' is atomically accessed integer. coredump() reads it multiple times, so cache value to `nosuidcoredump_local'.
ok bluhm
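The reason for caching the value can be sketched in userland C. This is a minimal illustration using C11 atomics, not the kernel code: coredump_decide(), the enum, and the meaning given to each value are assumptions made for the example. Only the names `nosuidcoredump' and `nosuidcoredump_local' come from the commit message.

```c
#include <stdatomic.h>

/* Stand-in for the sysctl-controlled integer; a writer may change it at any time. */
static _Atomic int nosuidcoredump = 1;

/* Hypothetical outcomes, invented for this sketch. */
enum core_action { CORE_SKIP, CORE_DUMP_CWD, CORE_DUMP_CRASHDIR };

/*
 * Hypothetical decision helper. The shared value is loaded exactly
 * once, so every check below sees the same setting even if another
 * thread stores a new value in between.
 */
static enum core_action
coredump_decide(int is_suid)
{
	int nosuidcoredump_local = atomic_load(&nosuidcoredump);

	if (!is_suid)
		return CORE_DUMP_CWD;
	if (nosuidcoredump_local == 0)
		return CORE_DUMP_CWD;
	if (nosuidcoredump_local >= 2)
		return CORE_DUMP_CRASHDIR;
	return CORE_SKIP;
}
```

Reading the atomic variable at each check instead would let the two comparisons disagree when a concurrent store lands between them.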
|
#
098ff4ac |
| 16-Dec-2024 |
mvs <mvs@openbsd.org> |
Unlock sysctl_video().
This sysctl(2) path contains only `video_record_enable', which is atomically accessed boolean integer.
ok kirill mglocker
|
#
0a766465 |
| 15-Dec-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_GLOBAL_PTRACE. `global_ptrace' is atomically accessed boolean integer. Only ptrace_ctrl() loads it once outside sysctl(2) layer.
ok mpi
|
#
3e142e7f |
| 14-Dec-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_WXABORT.
`uvm_wxabort' is atomically accessed boolean integer. uvm_wxcheck() already loads it lockless.
ok mpi
|
#
0815c577 |
| 18-Nov-2024 |
mvs <mvs@openbsd.org> |
Cast atomic_load_int(9) to signed int when loading `securelevel'.
The return value of atomic_load_int(9) is unsigned so needs a cast, otherwise securelevel=-1 gets misrepresented.
From Paul Fertser.
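A userland sketch of the signedness problem, assuming C11 atomics as a stand-in for atomic_load_int(9). The helper names load_uint(), securelevel_nocast() and securelevel_cast() are hypothetical, and the example relies on the usual two's complement behaviour when converting an out-of-range unsigned value to int.

```c
#include <stdatomic.h>

/* Stand-in for `securelevel' holding -1, stored in an unsigned atomic. */
static _Atomic unsigned int securelevel = (unsigned int)-1;

/* Hypothetical analog of atomic_load_int(9): the result type is unsigned. */
static unsigned int
load_uint(_Atomic unsigned int *p)
{
	return atomic_load(p);
}

/* Widening the unsigned result directly turns -1 into a large positive value. */
static long long
securelevel_nocast(void)
{
	return load_uint(&securelevel);
}

/* Casting to int first restores the sign before any widening or comparison. */
static long long
securelevel_cast(void)
{
	return (int)load_uint(&securelevel);
}
```

With the cast the value reads back as -1; without it, any comparison such as "level > 0" would wrongly succeed.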
|
#
ea7e0400 |
| 08-Nov-2024 |
bluhm <bluhm@openbsd.org> |
Use PCB iterator for raw IPv6 input loop.
Implement inpcb iterator in rip6_input(). Factor out the real work to rip6_sbappend(). Now UDP broadcast and multicast, raw IPv4 and IPv6 input work similarly. While there, make rip_input() look more like rip6_input().
OK mvs@
|
#
7f618044 |
| 05-Nov-2024 |
bluhm <bluhm@openbsd.org> |
Use PCB iterator for raw IP input deliver loop.
Inspired by mvs@'s idea of the iterator in the UDP multicast loop, implement the same for raw IP input delivery. This removes an unnecessary rwlock and only uses the table mutex. When comparing the inp routing table, address and port, the table lock must be held. So assume that in_pcb_iterator() already has the table mutex and hold it while traversing the list and doing the checks. Release the mutex during mbuf copy, socket buffer append and the upcalls. Adapt the logic for both rip_input() and udp_input(). In rip_input() move the actual work to rip_sbappend(). This can be called without the mutex during list traversal and for the final element.
OK mvs@
|
#
f11c1ce4 |
| 05-Nov-2024 |
bluhm <bluhm@openbsd.org> |
Replace rwlock with iterator in UDP input multicast loop.
The broadcast and multicast loop in udp_input() is protected by the table mutex. The relevant PCBs were collected in a separate list, which was processed while the table notify rwlock was held. When sending UDP multicast packets over vxlan(4) configured over UDP with multicast groups, this lock was taken recursively, causing a kernel crash. By using an iterator, traversing the PCB list of the table does not require holding the mutex all the time. The mutex is taken only for a short time while accessing the next element after the iterator. udp_sbappend() and the upcall to vxlan_input() are done with neither mutex nor rwlock. The PCB is reference counted while traversing the list.
crash reported by Holger Glaess; iterator implemented by mvs@; tested and fixed by bluhm@; OK mvs@
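The locking shape described above can be sketched in userland C. This is a simplified illustration and not the kernel's iterator: struct pcb, struct table, table_iterate() and table_deliver_all() are hypothetical names, a small spinlock stands in for the table mutex, and the sketch assumes no element is unlinked during the walk, a case the real iterator has to handle.

```c
#include <stdatomic.h>
#include <stddef.h>

struct pcb {
	struct pcb *next;
	int refcnt;	/* protected by the table lock in this sketch */
	int delivered;	/* stands in for per-socket state with its own locking */
};

struct table {
	atomic_flag lock;	/* tiny spinlock standing in for the table mutex */
	struct pcb *head;
};

static void
table_lock(struct table *t)
{
	while (atomic_flag_test_and_set(&t->lock))
		;	/* spin until the flag was clear */
}

static void
table_unlock(struct table *t)
{
	atomic_flag_clear(&t->lock);
}

/*
 * Step to the element after `prev' (or to the head when prev is NULL).
 * The lock is held only for this step: the next element gains a
 * reference and the previous one gives its reference back.
 */
static struct pcb *
table_iterate(struct table *t, struct pcb *prev)
{
	struct pcb *next;

	table_lock(t);
	next = (prev == NULL) ? t->head : prev->next;
	if (next != NULL)
		next->refcnt++;
	if (prev != NULL)
		prev->refcnt--;
	table_unlock(t);
	return next;
}

/* Visit every element; the per-element work runs without the table lock. */
static int
table_deliver_all(struct table *t)
{
	struct pcb *p = NULL;
	int count = 0;

	while ((p = table_iterate(t, p)) != NULL) {
		p->delivered++;	/* placeholder for buffer append and upcalls */
		count++;
	}
	return count;
}
```

Because the per-element work runs unlocked, an upcall that re-enters the table cannot deadlock or recurse on the lock, which is the failure mode the commit describes.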
|
#
78b7da88 |
| 31-Oct-2024 |
mvs <mvs@openbsd.org> |
Unlock fs_sysctl(). It only handles the `suid_clear' variable, which is an atomically accessed integer.
ok bluhm
|
#
64293e57 |
| 28-Oct-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_ALLOWKMEM. The `allowkmem' is atomically accessed integer.
Also use atomic_load_int(9) to load `securelevel'. sysctl_securelevel() is mp-safe, but will stay under the kernel lock until all existing `securelevel' loads become mp-safe too.
ok mpi
|
#
9a8ef7f7 |
| 25-Oct-2024 |
mvs <mvs@openbsd.org> |
Unlock timeout_sysctl(). The `tostat' timeout(9) statistics are already protected by the `timeout_mtx' mutex(9).
ok kettenis
|
#
51c8e26b |
| 30-Sep-2024 |
claudio <claudio@openbsd.org> |
Use ps_ppid instead of ps_pptr->ps_pid in all places. OK mpi@
|
#
dd2b8016 |
| 24-Sep-2024 |
bluhm <bluhm@openbsd.org> |
Fix sleeping race during malloc in sysctl hw.disknames.
When mallocarray(9) sleeps, disk_count can change, and diskstatslen gets inconsistent. This caused free(9) to panic.
Reported-by: syzbot+36e1f3b306f721f90c72@syzkaller.appspotmail.com OK deraadt@ mpi@
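The commit message does not spell out the fix. One common way to handle this kind of race is to snapshot the count, allocate, and retry if the count changed while the allocation slept. The userland sketch below shows that pattern only; sleeping_alloc(), alloc_names(), the 16-byte slot size and the `change_during_alloc' hook are hypothetical and exist just to simulate the race.

```c
#include <stdlib.h>

static int disk_count = 2;		/* stand-in for the shared counter */
static int change_during_alloc = 1;	/* hook: simulate one concurrent attach */

/* Allocation that may "sleep"; the counter can change while it does. */
static void *
sleeping_alloc(size_t n, size_t size)
{
	if (change_during_alloc) {
		disk_count++;
		change_during_alloc = 0;
	}
	return calloc(n, size);
}

/*
 * Allocate room for one name slot per disk. The size used for the
 * allocation and the count reported to the caller always match,
 * because a buffer sized from a stale snapshot is thrown away.
 */
static char *
alloc_names(int *countp)
{
	for (;;) {
		int snapshot = disk_count;
		char *buf = sleeping_alloc((size_t)snapshot, 16);

		if (buf == NULL)
			return NULL;
		if (snapshot == disk_count) {
			*countp = snapshot;
			return buf;
		}
		free(buf);	/* count changed while sleeping; retry */
	}
}
```

Keeping the recorded length tied to the snapshot that sized the buffer is what keeps a later sized free consistent with the allocation.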
|
#
5b00b7dd |
| 29-Aug-2024 |
bluhm <bluhm@openbsd.org> |
Show expensive mbuf operations in netstat(1) statistics.
If the memory layout is not optimal, m_defrag(), m_prepend(), m_pullup(), and m_pulldown() will allocate mbufs or copy memory. Count these operations to find possible optimizations.
input dhill@; OK mvs@
|
#
acdebe03 |
| 26-Aug-2024 |
mvs <mvs@openbsd.org> |
style(9) fix. No functional changes.
|
#
5629e519 |
| 23-Aug-2024 |
mvs <mvs@openbsd.org> |
Fix KERN_AUDIO broken in rev 1.440.
|
#
11c54b09 |
| 22-Aug-2024 |
mvs <mvs@openbsd.org> |
Introduce sysctl_securelevel() to modify `securelevel' in an mp-safe way. Keep KERN_SECURELVL locked until the existing `securelevel' checks are moved out of the kernel lock.
Make sysctl_securelevel_int() mp-safe by using atomic_load_int(9) for unlocked read-only access to `securelevel'.
Unlock KERN_ALLOWDT. `allowdt' is the atomically accessed integer used only once in dtopen().
ok mpi
|
#
1b92846d |
| 20-Aug-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_MAXFILES.
`maxfiles' is atomically accessed integer which is lockless and read-only accessed in file descriptors layer.
lim_startup() is called during kernel bootstrap, so there is no need for atomic_load_int() within.
ok mpi
|
#
2d79d4b5 |
| 20-Aug-2024 |
mvs <mvs@openbsd.org> |
Unlock KERN_MAXPROC and KERN_MAXTHREAD from `kern_vars'. Both `maxprocess' and `maxthread' are atomically accessed integers.
ok mpi
|
#
19686eac |
| 20-Aug-2024 |
mvs <mvs@openbsd.org> |
Unlock sysctl_audio().
It only handles KERN_AUDIO_RECORD. `audio_record_enable' is an atomically accessed integer.
Reasonable from deraadt
|
#
7af15f03 |
| 14-Aug-2024 |
mvs <mvs@openbsd.org> |
Push kernel lock down to net_sysctl().
All except PF_MPLS paths are mp-safe:
- net_link_sysctl() and the following net_ifiq_sysctl() only return EOPNOTSUPP;
- uipc_sysctl() - mp-safe atomic access to integers;
- bpf_sysctl() - mp-safe atomic access to integers;
- pflow_sysctl() - returns statistics from per-CPU counters;
- pipex_sysctl() - mp-safe atomic access to integer.
Push kernel lock down to mpls_sysctl(). sysctl_int_bounded() does its copying through a local variable, so a context switch is safe. No need to wire memory or take the `sysctl_lock' rwlock(9).
Keep protocols locked as they were, including pages wiring. Copying will not sleep, so there is no network slowdown while doing it with the net lock held.
ok bluhm
|
#
e84aaa7e |
| 14-Aug-2024 |
mvs <mvs@openbsd.org> |
Make sysctl_int() and sysctl_int_lower() mp-safe and unlock KERN_HOSTID.
The only difference between sysctl_int() and sysctl_int_bounded() is the range check, so sysctl_int() is just sysctl_int_bounded(..., INT_MIN, INT_MAX). sysctl_int() is not the fast path, so the cost of this redundant check is not significant.
An mp-safe sysctl_int() is meaningless for sysctl_int_lower(), so rework it in the sysctl_int_bounded() style. At this time all affected paths are kernel locked, but this doesn't make sysctl_int_lower() worse.
Change the type of `hostid' to int. It is only stored and never used within the kernel; userland accesses it through sysctl_int(). Nothing changes, but the variable becomes consistent with sysctl_int().
ok bluhm
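The relationship between the two helpers can be sketched in userland C. This is an illustration under assumptions, not the kernel signatures: update_int_bounded() and update_int() are hypothetical names, a plain pointer read stands in for copyin(), and a C11 atomic stands in for the sysctl variable.

```c
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical bounded update. The new value is first copied into a
 * local variable and validated there, so the shared variable is only
 * touched by one atomic store of an already checked value.
 */
static int
update_int_bounded(_Atomic int *valp, const int *newp, int minimum, int maximum)
{
	int val;

	if (newp == NULL)
		return 0;		/* read-only request, nothing to store */
	val = *newp;			/* stands in for copyin() into a local */
	if (val < minimum || val > maximum)
		return EINVAL;
	atomic_store(valp, val);
	return 0;
}

/* The unbounded variant is the bounded one with the widest possible range. */
static int
update_int(_Atomic int *valp, const int *newp)
{
	return update_int_bounded(valp, newp, INT_MIN, INT_MAX);
}
```

With INT_MIN and INT_MAX as bounds the range check can never fail, which is why the commit message calls it useless but harmless.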
|