#
cc546be7 |
| 27-Jan-2025 |
mvs <mvs@openbsd.org> |
Get rid of unused `so' argument in sofree(). No functional changes.
ok bluhm
|
#
902f0fe0 |
| 27-Jan-2025 |
mvs <mvs@openbsd.org> |
Move buffer zeroing within sorflush() out of socket lock.
Only socantrcvmore() requires socket lock, the rest relies on buffer locks. Previously, some sockets were designed for socket lock only, so I intentionally kept it for the entire path to avoid dances around SB_MTXLOCK. This is not necessary now.
ok bluhm
|
#
fcc9afa3 |
| 23-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Fix out-of-band data in socket splicing.
In somove() length of receive buffer and oobmark were not modified in the same critical section. This resulted in wrong out-of-band pointer relative to data in the socket buffer. To call pru_rcvd() the receive socket buffer mutex is released, leading to this inconsistency. Moving mutex leave and calling pru_rcvd() after updating so_oobmark fixes the test regress/sys/kern/sosplice/tcp run-args-oobinline.pl.
OK mvs@
|
#
acf0b828 |
| 22-Jan-2025 |
mvs <mvs@openbsd.org> |
Completely remove SB_MTXLOCK. Left the sbmtxassertlocked() assertion soft like in soassertlocked(). The `so' argument in sbappend*() functions became unused, but left it to the next diff.
ok bluhm
|
#
31d1f423 |
| 21-Jan-2025 |
mvs <mvs@openbsd.org> |
Start cleaning up the SB_MTXLOCK logic. Sockets of all types have switched to fine-grained locks for socket buffers, so SB_MTXLOCK can go. This time only from kern/uipc_socket.c.
ok bluhm
|
#
e28482df |
| 20-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Do not unlock the socket in soabort().
One difference between UNIX and internet sockets is that UNIX sockets unlock in soabort() while TCP does not do that. in_pcbdetach() keeps the lock, so change uipc_abort() to behave similarly. This also gives symmetric lock and unlock in the caller. Refcount is needed to call unlock on an aborted socket. The queue 0 in soclose() is only used by UNIX sockets, so remove the "if" persocket. The "kassert" persocket in soisconnected() is not needed.
OK mvs@
|
#
7b08975f |
| 13-Jan-2025 |
mvs <mvs@openbsd.org> |
Unlock the tcp(4) case of somove().
Tested and OK bluhm@.
|
#
20eaa8d5 |
| 09-Jan-2025 |
mvs <mvs@openbsd.org> |
Add 'socket' refcnt type to dt(4).
We started to widely use reference counting for sockets.
ok bluhm
|
#
02fa9548 |
| 09-Jan-2025 |
mvs <mvs@openbsd.org> |
Return the EPROTO error when attempting to unsplice a socket that is not spliced. The EPROTO error was suggested by bluhm.
ok bluhm
|
#
d9cfc367 |
| 07-Jan-2025 |
mvs <mvs@openbsd.org> |
Stop doing `ssp_task' and `ssp_idleto' re-initialization in sosplice().
Initialize them only during so->so_sp or sosp->so_sp allocation and never re-initialize again.
sounsplice() could leave `ssp_task' scheduled. This means it is linked to the pending queue and the TASK_ONQUEUE bit is set in `t_flags'. task_set() overrides `t_flags' with 0, so the next task_add() could break the pending queue with a double insertion. The described problem also applies to the `ssp_idleto' timer.
To prevent task and timeout from being rescheduled during sounsplice(), do task_del() and timeout_del() after actual unsplicing. Not critical, but prevents possible dry run.
Problem reported, fix tested and OK bluhm.
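
The double-insertion hazard described in this commit can be illustrated with a small userland model. This is only a sketch under stated assumptions: `model_task_set()` and `model_task_add()` are hypothetical stand-ins, the queue is a simplified singly linked list rather than the kernel's real task queue, and only the TASK_ONQUEUE flag name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define TASK_ONQUEUE 1	/* flag name taken from the commit message */

struct task {
	struct task *t_next;	/* simplified link, not the kernel layout */
	unsigned int t_flags;
};

struct taskq {
	struct task *tq_head;
	size_t tq_len;		/* number of insertions still linked */
};

/* Model of task_set(): unconditionally resets the task, flags included. */
static void
model_task_set(struct task *t)
{
	assert(t != NULL);
	t->t_next = NULL;
	t->t_flags = 0;
}

/* Model of task_add(): links the task unless it is marked as queued. */
static int
model_task_add(struct taskq *tq, struct task *t)
{
	if (t->t_flags & TASK_ONQUEUE)
		return 0;	/* already pending, nothing to do */
	t->t_flags |= TASK_ONQUEUE;
	t->t_next = tq->tq_head;
	tq->tq_head = t;
	tq->tq_len++;
	return 1;
}
```

In this model, calling `model_task_set()` on a task that is still linked clears the flag that guards against re-insertion, so the next `model_task_add()` links the same task a second time and the list ends up pointing at itself. Initializing once at allocation time avoids that.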
|
#
ab8da1a7 |
| 04-Jan-2025 |
mvs <mvs@openbsd.org> |
Relax sockets splicing locking.
Sockets splicing works around sockets buffers which have their own locks for all socket types, especially sblock() on `so_snd' which keeps sockets being spliced.
- sosplice() does read-only socket options and state checks, the only modification is the `so_sp' assignment. The SB_SPLICE bit modification and the `ssp_socket' and `ssp_soback' assignments are protected with the `sb_mtx' mutex(9). The PCB layer does the corresponding checks with `sb_mtx' held, so shared solock() is enough in the sosplice() path. Introduce a special sosplice_solock_pair() for that purpose.
- sounsplice() requires the shared socket lock only around so{r,w}wakeup calls.
- Push exclusive solock() down to the tcp(4) case of somove(). Such sockets are not ready to do unlocked somove() yet.
ok bluhm
|
#
268735c4 |
| 03-Jan-2025 |
mvs <mvs@openbsd.org> |
Do not unlock socket within sorele().
Unlock it outside if required. By now the socket could be protected by different locks, including different shared solock() variations. sorele() does nothing that requires the socket to be locked, so there is no reason to release it locked.
ok bluhm
|
#
9bfcfc3f |
| 03-Jan-2025 |
mvs <mvs@openbsd.org> |
Remove socket state and options checks from the unsplicing path of the sosplice().
The unsplicing path was part of the splicing path, so it followed these checks too. Socket state and options checks were copy-pasted during sosplice() reordering to avoid a possible API break. However, the splicing state was never checked, so unsplicing a non-spliced socket was always successful. Given this, these checks are useless; moreover, the removal doesn't break the kern/sosplice regression test, so this API change should be transparent.
The real reason is the simplification of socket unsplicing, which relies on socket buffer locks, so there is no reason to lock the socket and stop packet processing just to do useless checks.
ok bluhm
|
#
66570633 |
| 01-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Fix whitespace.
|
#
507b5b41 |
| 31-Dec-2024 |
mvs <mvs@openbsd.org> |
Use per-sockbuf mutex(9) to protect `so_snd' buffer of tcp(4) sockets.
Even for tcp(4) case, sosend() only checks `so_snd' free space and sleeps if necessary, actual buffer handling happens in solock()ed PCB layer.
Only unlock sosend() path, the somove() is still locked exclusively. The "if (dosolock)" dances are useless, but intentionally left as is.
Tested and ok by bluhm.
|
#
c6ec13d0 |
| 30-Dec-2024 |
mvs <mvs@openbsd.org> |
The fixed version of previously reverted tcp(4) sockets unsplicing.
Rework sorele(), sofree() and soclose() to follow the closef(), fdrop() and FRELE() way. This version of sofree() never sleeps, but calls sorele() after it has finished its part of the destruction. sorele() destroys the socket if the last reference was released.
As previously, timeout(9) and task(9) reinitialization was replaced by barriers and moved to soclose(), so the only sleep points are common for all socket types.
Tests and ok from bluhm.
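
The "last reference destroys the object" pattern that this commit describes can be sketched in userland C. This is not the kernel code: `struct obj`, `obj_create()`, `obj_ref()` and `obj_rele()` are hypothetical names, and C11 atomics stand in for whatever reference counting the kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct obj {
	atomic_uint refs;
	int *destroyed;	/* hypothetical hook so a caller can observe destruction */
};

/* Create an object holding one reference, owned by the caller. */
static struct obj *
obj_create(int *destroyed)
{
	struct obj *o = malloc(sizeof(*o));

	if (o == NULL)
		return NULL;
	atomic_init(&o->refs, 1);
	o->destroyed = destroyed;
	return o;
}

/* Take an additional reference. */
static void
obj_ref(struct obj *o)
{
	atomic_fetch_add(&o->refs, 1);
}

/*
 * Drop one reference. Only the caller that releases the last one
 * destroys the object; returns 1 in that case and 0 otherwise.
 */
static int
obj_rele(struct obj *o)
{
	if (atomic_fetch_sub(&o->refs, 1) != 1)
		return 0;
	*o->destroyed = 1;
	free(o);
	return 1;
}
```

With this shape the teardown path can finish its own part of the work and then simply drop its reference, so any other thread still holding one keeps the object alive until it releases it too.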
|
#
a089aaec |
| 27-Dec-2024 |
mvs <mvs@openbsd.org> |
Backout previous. I found that soclose() could leave socket tcp(4) destruction to the tcp_timer_2msl(), and the implemented logic doesn't work in this case.
|
#
c5202efa |
| 27-Dec-2024 |
mvs <mvs@openbsd.org> |
Simplify tcp(4) sockets unsplicing.
tcp(4) PCB layer can destroy only sockets which were not yet accepted. Such sockets are not accessible from the userland and can't be spliced.
For userland accessible sockets, the tcp(4) PCB layer only destroys the PCB, but leaves the socket alive. The socket destruction always happens through the soclose() path.
So while sofree() is called from the soclose() path, it's safe to release netlock and wait for threads which work with this dying socket.
Drop the exception for tcp(4) sockets unsplicing in soclose() and follow the common path for both tcp(4) and udp(4) sockets. Also use barriers for `ssp_idleto' timeout and `ssp_task' task destruction instead of re-initialising them at runtime.
ok bluhm
|
#
cc25bede |
| 26-Dec-2024 |
bluhm <bluhm@openbsd.org> |
Run TCP output in parallel.
When called with shared netlock together with socket lock, tcp_output() is MP safe. This is the lock for tcpcb. Mark TCP protocol with PR_MPSOCKET. Also t_oobflags is protected this way, allowing parallel pru_rcvoob().
OK mvs@
|
#
f1bf6f4e |
| 19-Dec-2024 |
mvs <mvs@openbsd.org> |
Use per-sockbuf mutex(9) to protect `so_rcv' buffer of tcp(4) sockets.
Only unlock soreceive() path, somove() path still locked exclusively. Also exclusive socket lock will be taken in the soreceive() path each time before pru_rcvd() call.
Note, both socket and `sb_mtx' locks are held while SS_CANTRCVMORE is modified, so socket lock is enough to check it in the protocol input path.
ok bluhm
|
#
6fb93e47 |
| 15-Dec-2024 |
dlg <dlg@openbsd.org> |
add an AF_FRAME socket domain and an IFT_ETHER protocol family under it.
this allows userland to use sockets to send and receive Ethernet frames. as per the upcoming frame.4 man page:
frame protocol family sockets are designed as an alternative to bpf(4) for handling low data and packet rate communication protocols. Rather than filtering every frame entering the system before the network stack like bpf(4), the frame protocol family processing avoids this overhead by running after the built in protocol handlers in the kernel. For this reason, it is not possible to handle IPv4 or IPv6 packets with frame protocol sockets because the kernel network stack consumes them before the receive handling for frame sockets is run.
if you've used udp sockets then these should feel much the same.
my main motivation is to implement an lldp agent in userland, but without having to have bpf look at every packet when lldp happens every minute or two.
the only feedback i had was positive, so i'm putting it in ok claudio@
|
#
542ea96b |
| 08-Nov-2024 |
bluhm <bluhm@openbsd.org> |
Use read once in socket filter functions.
The socket filt_...() functions are called with shared netlock, but without per socket lock. This can be done as they are read-only. After unlocking, TCP will modify socket variables in parallel. So explicitly mark with READ_ONCE() where unlocked access to socket fields happens.
OK mvs@
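
The idea of marking unlocked reads can be sketched as follows. This is an illustration only: `MODEL_READ_ONCE()` is a hypothetical stand-in built on a volatile cast and the GCC/Clang `__typeof__` extension, not the kernel's definition of READ_ONCE(), and `struct model_sockbuf` and `model_filt_read()` are invented for the example.

```c
#include <assert.h>

/* Hypothetical stand-in: force exactly one load of x through a volatile lvalue. */
#define MODEL_READ_ONCE(x)	(*(volatile __typeof__(x) *)&(x))

struct model_sockbuf {
	unsigned long sb_cc;	/* bytes in buffer; may change concurrently */
	unsigned long sb_lowat;	/* low water mark */
};

/*
 * Read-only filter in the spirit of the filt_...() functions: take one
 * snapshot of each field and compute the result from the snapshots only,
 * so the reported count and the decision cannot disagree.
 */
static int
model_filt_read(struct model_sockbuf *sb, unsigned long *datap)
{
	unsigned long cc = MODEL_READ_ONCE(sb->sb_cc);

	*datap = cc;
	return cc >= MODEL_READ_ONCE(sb->sb_lowat);
}
```

The point of the single snapshot is that the compiler may otherwise reload `sb_cc` between storing it into `*datap` and comparing it, which would let a concurrent writer produce an inconsistent answer.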
|
#
030c498c |
| 31-Oct-2024 |
claudio <claudio@openbsd.org> |
No need to set pkthdr fields to 0 that are already 0. MGETHDR() does that. OK dlg@
|
#
fa82e203 |
| 11-Aug-2024 |
jsg <jsg@openbsd.org> |
spelling; ok mvs@
|
#
59401b77 |
| 06-Aug-2024 |
mvs <mvs@openbsd.org> |
For consistency with other similar sysctl(2) variables use atomic_load_int(9) while loading `somaxconn' and `sominconn'.
ok bluhm
|