# e835bce2 | 16-Jan-2025 | bluhm <bluhm@openbsd.org>
Remove net lock from TCP sysctl for keep alive.
Keep copies in seconds for the sysctl and update timer variables atomically when they change. tcp_maxidle was historically calculated in tcp_slowtimo() as the timers were called from there. Better calculate maxidle when needed. tcp_timer_init() is useless, just initialize data. While there make the names consistent.
input sthen@; OK mvs@
# 4e5e13a2 | 09-Jan-2025 | bluhm <bluhm@openbsd.org>
Run TCP sysctl ident and drop with shared net lock.
Convert exclusive net lock for TCPCTL_IDENT and TCPCTL_DROP to shared net lock and push it down into tcp_ident(). Grab the socket lock there with in_pcbsolock_ref(). Move socket release from in_pcbsolock() to in_pcbsounlock_rele() and add _ref and _rele suffix to the inpcb socket lock functions. They both lock and refcount now. in_pcbsounlock_rele() ignores NULL sockets to make the unlock path in the error case simpler. The socket lock also protects tcp_drop() and tcp_close() now, so the socket pointer from inpcb may be NULL during unlock. In tcp_ident() improve consistency check of address family.
OK mvs@
# 9b315513 | 05-Jan-2025 | bluhm <bluhm@openbsd.org>
TCP integer sysctl variables are all atomic. Remove net lock.
OK mvs@
# ab8da1a7 | 04-Jan-2025 | mvs <mvs@openbsd.org>
Relax sockets splicing locking.
Sockets splicing works around socket buffers, which have their own locks for all socket types, especially sblock() on `so_snd' which keeps sockets being spliced.
- sosplice() does read-only socket options and state checks; the only modification is the `so_sp' assignment. The SB_SPLICE bit modification and the `ssp_socket' and `ssp_soback' assignments are protected with the `sb_mtx' mutex(9). The PCB layer does corresponding checks with `sb_mtx' held, so shared solock() is sufficient in the sosplice() path. Introduce special sosplice_solock_pair() for that purpose.
- sounsplice() requires the shared socket lock only around the so{r,w}wakeup calls.
- Push exclusive solock() down to the tcp(4) case of somove(). Such sockets are not ready for unlocked somove() yet.
ok bluhm
# 66570633 | 01-Jan-2025 | bluhm <bluhm@openbsd.org>
Fix whitespace.
# 507b5b41 | 31-Dec-2024 | mvs <mvs@openbsd.org>
Use per-sockbuf mutex(9) to protect `so_snd' buffer of tcp(4) sockets.
Even in the tcp(4) case, sosend() only checks `so_snd' free space and sleeps if necessary; actual buffer handling happens in the solock()ed PCB layer.
Only unlock the sosend() path; somove() is still locked exclusively. The "if (dosolock)" dances are useless, but intentionally left as is.
Tested and ok by bluhm.
# 77957d73 | 30-Dec-2024 | bluhm <bluhm@openbsd.org>
Remove net lock from TCP syn cache sysctl.
The TCP syn cache is protected by a mutex. Make access to its sysctl variables either atomic or put them under this mutex. Then the net lock can be removed.
OK mvs@
# f1bf6f4e | 19-Dec-2024 | mvs <mvs@openbsd.org>
Use per-sockbuf mutex(9) to protect `so_rcv' buffer of tcp(4) sockets.
Only unlock the soreceive() path; the somove() path is still locked exclusively. Also the exclusive socket lock will be taken in the soreceive() path each time before the pru_rcvd() call.
Note, both the socket and `sb_mtx' locks are held while SS_CANTRCVMORE is modified, so the socket lock is enough to check it in the protocol input path.
ok bluhm
# 2162e93b | 08-Nov-2024 | bluhm <bluhm@openbsd.org>
TCP send and receive space updates are MP safe.
tcp_update_sndspace() and tcp_update_rcvspace() only read global variables that do not change after initialization. Mark them as such. Add braces around multi-line if blocks.
ok mvs@
# 93536db2 | 12-Apr-2024 | bluhm <bluhm@openbsd.org>
Split single TCP inpcb table into IPv4 and IPv6 parts.
With two separate TCP hash tables, each one becomes smaller. When we remove the exclusive net lock from TCP, contention on internet PCB table mutex will be reduced. UDP has been split earlier into IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with assertions.
OK mvs@
# 940d25ac | 11-Feb-2024 | bluhm <bluhm@openbsd.org>
Remove include netinet6/ip6_var.h from netinet/in_pcb.h.
OK mvs@
# a342f0b4 | 19-Jan-2024 | bluhm <bluhm@openbsd.org>
Unify inpcb API for inet and inet6.
Many functions for IPv4 call their IPv6 counterpart if INP_IPV6 is set at the socket's pcb. By using the generic API consistently, the logic is not in the caller and it gets more readable.
OK mvs@
# 6285ef23 | 11-Jan-2024 | bluhm <bluhm@openbsd.org>
Fix white spaces in TCP.
# ab485656 | 03-Dec-2023 | bluhm <bluhm@openbsd.org>
Use INP_IPV6 flag instead of sotopf().
During initialization in_pcballoc() sets INP_IPV6 once to avoid reaching through inp_socket->so_proto->pr_domain->dom_family. Use this flag consistently.
OK sashan@ mvs@
# cd28665a | 01-Dec-2023 | bluhm <bluhm@openbsd.org>
Set inp address, port and rtable together with inpcb hash.
The inpcb hash table is protected by table->inpt_mtx. The hash is based on addresses, ports, and routing table. These fields were not synchronized with the hash. Put writes and hash update into the same critical section. Move the updates from ip_ctloutput(), ip6_ctloutput(), syn_cache_get(), tcp_connect(), udp_disconnect() to dedicated inpcb set functions. There they use the same table mutex as in_pcbrehash(). in_pcbbind(), in_pcbconnect(), and in6_pcbconnect() need more work and are not included yet.
OK sashan@ mvs@
# cff23a6b | 01-Dec-2023 | bluhm <bluhm@openbsd.org>
Make internet PCB connect more consistent.
The public interface is in_pcbconnect(). It dispatches to in6_pcbconnect() if necessary. Call the former from tcp_connect() and udp_connect(). In in6_pcbconnect() initialization in6a = NULL is not necessary. in6_pcbselsrc() sets the pointer, but does not read the value. Pass a constant in6_addr pointer to in6_pcbselsrc() and in6_selectsrc(). It returns a reference to the address of some internal data structure. We want to be sure that in6_addr is not modified this way. IPv4 in_pcbselsrc() solves this by passing a copy of the address.
OK kn@ sashan@ mvs@
# 952c6363 | 28-Nov-2023 | bluhm <bluhm@openbsd.org>
Remove struct inpcb from in6_embedscope() parameters.
rip6_output() did modify inp_outputopts6 temporarily to provide different ip6_pktopts to in6_embedscope(). Better pass inp_outputopts6 and inp_moptions6 as separate arguments to in6_embedscope(). Simplify the code that deals with these options in in6_embedscope(). Document inp_moptions and inp_moptions6 as protected by net lock.
OK kn@
# 0bfbfbe7 | 16-Nov-2023 | bluhm <bluhm@openbsd.org>
Run TCP SYN cache timer logic without net lock.
Introduce a global TCP SYN cache mutex. Divide the timer function into parts protected by the mutex and sending with the net lock. Split the flags field into dynamic flags protected by the mutex and fixed flags set during initialization. Document whether fields of struct syn_cache are protected by net lock or mutex.
input and OK sashan@
# bf0d449c | 16-Sep-2023 | mpi <mpi@openbsd.org>
Allow counters_read(9) to take an optional scratch buffer.
Using a scratch buffer makes it possible to take a consistent snapshot of per-CPU counters without having to allocate memory.
Makes ddb(4) show uvmexp command work in OOM situations.
ok kn@, mvs@, cheloha@
# 9e96aff0 | 06-Jul-2023 | bluhm <bluhm@openbsd.org>
Convert tcp_now() time counter to 64 bit.
After changing the tcp_now() tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As the timestamp option is 32 bit in the TCP protocol, use the lower 32 bits there. There are casts to 32 bits that should behave correctly.
Start with a random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
# a3c0391f | 02-Jul-2023 | bluhm <bluhm@openbsd.org>
Use TSO and LRO on the loopback interface to transfer TCP faster.
If tcplro is activated on lo(4), ignore the MTU with TCP packets. They are passed along with the information that they have to be chopped in case they are forwarded later. New netstat(1) counter shows that software LRO is in effect. The feature is currently turned off by default.
tested by jan@; OK claudio@ jan@
# a5a54c4a | 23-May-2023 | jan <jan@openbsd.org>
New counters for LRO packets from hardware TCP offloading.
With tweaks from patrick@ and bluhm@.
OK bluhm@
# c06845b1 | 10-May-2023 | bluhm <bluhm@openbsd.org>
Implement TCP send offloading, for now in software only.
This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec needs the software fallback in general. For performance comparison or to workaround possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets.
based on work from jan@; tested by jmc@ jan@ Hrvoje Popovski; OK jan@ claudio@
# b9587575 | 14-Mar-2023 | yasuoka <yasuoka@openbsd.org>
To avoid misunderstanding, keep variables for tcp keepalive in milliseconds, which is the same unit as tcp_now(). However, keep the unit of sysctl variables in seconds and convert their unit in tcp_sysctl(). Additionally revert TCPTV_SRTTDFLT back to 3 seconds, which was mistakenly changed to 1.5 seconds by tcp_timer.h 1.19.
ok claudio
# 4b9bfff3 | 22-Jan-2023 | mvs <mvs@openbsd.org>
Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of the receive buffer. As it was done for the SS_CANTSENDMORE bit, the definition is kept as is, but now these bits belong to the `sb_state' of the receive buffer. `sb_state' is ORed with `so_state' when socket data is exported to userland.
ok bluhm@