#
cbb583bb |
| 22-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Convert bcopy() to memcpy() in tcp_respond().
Struct ip, ip6, and th point to locations on m, which is new memory from m_gethdr(). There is no overlapping memory, so use memcpy.
from dhill@; OK mvs@
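The conversion pattern can be sketched as follows. This is not the tcp_respond() code; struct demo_hdr and copy_header() are hypothetical names used only for illustration. The points to note are that bcopy() takes (src, dst, len) while memcpy() takes (dst, src, len), and that memcpy() is only valid when the two regions do not overlap.

```c
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct demo_hdr {
	unsigned short	sport;
	unsigned short	dport;
	unsigned int	seq;
};

/*
 * Copy a header into a freshly allocated buffer.  Because dst is new
 * memory, it cannot overlap src, so memcpy() is safe here.
 * The old style would have been: bcopy(src, dst, sizeof(*dst));
 * note the swapped argument order.
 */
static void
copy_header(struct demo_hdr *dst, const struct demo_hdr *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

If source and destination could overlap, memmove() would be the appropriate replacement instead.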
|
#
e835bce2 |
| 16-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Remove net lock from TCP sysctl for keep alive.
Keep copies in seconds for the sysctl and update timer variables atomically when they change. tcp_maxidle was historically calculated in tcp_slowtimo() as the timers were called from there. Better calculate maxidle when needed. tcp_timer_init() is useless, just initialize data. While there make the names consistent.
input sthen@; OK mvs@
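A rough user-space sketch of the idea, assuming C11 atomics in place of the kernel's atomic helpers. All demo_ names, the default values, and the choice of milliseconds as the internal timer unit are assumptions for illustration, not the kernel's definitions.

```c
#include <stdatomic.h>

#define DEMO_KEEPCNT	8	/* demo value: probes before giving up */

/* Copies in seconds, as a sysctl would present them. */
static int demo_keepidle_sec = 7200;
static int demo_keepintvl_sec = 75;

/* Timer variables in the internal unit (assumed: milliseconds). */
static atomic_int demo_keepidle_ms = 7200 * 1000;
static atomic_int demo_keepintvl_ms = 75 * 1000;

/* sysctl write path: store seconds, publish the converted value. */
static void
demo_set_keepidle(int sec)
{
	demo_keepidle_sec = sec;
	atomic_store(&demo_keepidle_ms, sec * 1000);
}

static int
demo_get_keepidle_sec(void)
{
	return (demo_keepidle_sec);
}

/* Timer path: a single atomic read, no lock needed. */
static int
demo_get_keepidle_ms(void)
{
	return (atomic_load(&demo_keepidle_ms));
}

/* Compute maxidle on demand instead of caching it in a slow timer. */
static int
demo_maxidle_ms(void)
{
	(void)demo_keepintvl_sec;
	return (DEMO_KEEPCNT * atomic_load(&demo_keepintvl_ms));
}
```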
|
#
4ab10cec |
| 03-Jan-2025 |
bluhm <bluhm@openbsd.org> |
Reference count the inpcb in TCP timers.
Switch from struct tcpcb to inpcb in the TCP timer argument. The latter already has a reference counter. Increment it at timeout_add() and decrement at timeout_del() or when handler runs. The reaper timeout is special as it does not need a reference, the inpcb is already dead. Use special field t_timer_reaper instead of regular TCP timeout and run it without reference or lock.
OK mvs@
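The reference counting rule described above can be sketched in a simplified, single-threaded model. struct demo_inpcb and the demo_timer_*() helpers are hypothetical; the real code uses the kernel timeout API and proper atomic reference counting.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel structure. */
struct demo_inpcb {
	int	refcnt;		/* reference counter */
	bool	timer_pending;	/* is the timeout scheduled? */
};

/* Schedule the timer; take a reference only if it was not pending. */
static void
demo_timer_arm(struct demo_inpcb *inp)
{
	if (!inp->timer_pending) {
		inp->timer_pending = true;
		inp->refcnt++;
	}
}

/* Cancel the timer; drop the reference the pending timer held. */
static void
demo_timer_disarm(struct demo_inpcb *inp)
{
	if (inp->timer_pending) {
		inp->timer_pending = false;
		inp->refcnt--;
	}
}

/* Handler runs: the timer is no longer pending, drop its reference. */
static void
demo_timer_fire(struct demo_inpcb *inp)
{
	if (inp->timer_pending) {
		inp->timer_pending = false;
		/* ... timer work would happen here ... */
		inp->refcnt--;
	}
}
```

The invariant is that a pending timer always owns exactly one reference, so the object cannot be freed while a handler may still run.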
|
#
f9d292df |
| 28-Dec-2024 |
bluhm <bluhm@openbsd.org> |
Read more TCP sysctl variables atomically.
OK mvs@
|
#
535d4cde |
| 26-Dec-2024 |
bluhm <bluhm@openbsd.org> |
Make access to tcp_mssdflt atomic.
To further unlock TCP sysctl, we need atomic access to tcp_mssdflt. pf(4) is reading the value multiple times. Better read it once and pass mssdflt down the call stack. There was a potential integer underflow in pf_calc_mss(). Use the signed variants imax(9) and imin(9), as has already been done in the TCP stack.
OK mvs@
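The underflow class mentioned above can be illustrated with a small sketch. This does not reproduce pf_calc_mss(); demo_umax() and demo_imax() are local stand-ins modelled on an unsigned maximum and a signed maximum, and the parameter names are assumptions.

```c
/* Local stand-in for an unsigned maximum helper. */
static unsigned int
demo_umax(unsigned int a, unsigned int b)
{
	return (a > b ? a : b);
}

/* Local stand-in for a signed maximum helper. */
static int
demo_imax(int a, int b)
{
	return (a > b ? a : b);
}

/* Unsigned variant: mtu - hlen wraps to a huge value when mtu < hlen. */
static unsigned int
demo_mss_unsigned(unsigned int mtu, unsigned int hlen, unsigned int lowest)
{
	return (demo_umax(mtu - hlen, lowest));
}

/* Signed variant: a negative difference loses against the lower bound. */
static int
demo_mss_signed(int mtu, int hlen, int lowest)
{
	return (demo_imax(mtu - hlen, lowest));
}
```

With mtu = 20 and hlen = 40, the signed variant falls back to the lower bound, while the unsigned variant yields a value close to UINT_MAX.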
|
#
ace0f189 |
| 17-Apr-2024 |
bluhm <bluhm@openbsd.org> |
Use struct ipsec_level within inpcb.
Instead of passing around u_char[4], introduce struct ipsec_level that contains 4 ipsec levels. This provides better type safety. The embedding struct inpcb is globally visible for netstat(1), so put struct ipsec_level outside of #ifdef _KERNEL.
OK deraadt@ mvs@
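A sketch of why a struct is safer than a bare array parameter. The field and function names below are hypothetical; only the idea of four levels wrapped in one struct comes from the commit message.

```c
/* Hypothetical layout: four levels wrapped in one named type. */
struct demo_ipsec_level {
	unsigned char	sl_auth;
	unsigned char	sl_esp_trans;
	unsigned char	sl_esp_network;
	unsigned char	sl_ipcomp;
};

/*
 * Array style: the parameter decays to "const unsigned char *", so the
 * compiler accepts any byte pointer of any length.
 */
static unsigned char
demo_max_level_array(const unsigned char lvl[4])
{
	unsigned char m = 0;
	int i;

	for (i = 0; i < 4; i++)
		if (lvl[i] > m)
			m = lvl[i];
	return (m);
}

/* Struct style: only a struct demo_ipsec_level can be passed. */
static unsigned char
demo_max_level_struct(const struct demo_ipsec_level *sl)
{
	unsigned char m = sl->sl_auth;

	if (sl->sl_esp_trans > m)
		m = sl->sl_esp_trans;
	if (sl->sl_esp_network > m)
		m = sl->sl_esp_network;
	if (sl->sl_ipcomp > m)
		m = sl->sl_ipcomp;
	return (m);
}
```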
|
#
93536db2 |
| 12-Apr-2024 |
bluhm <bluhm@openbsd.org> |
Split single TCP inpcb table into IPv4 and IPv6 parts.
With two separate TCP hash tables, each one becomes smaller. When we remove the exclusive net lock from TCP, contention on internet PCB table mutex will be reduced. UDP has been split earlier into IPv4 and IPv6. Replace branch conditions based on INP_IPV6 with assertions.
OK mvs@
|
#
94c0e2bd |
| 13-Feb-2024 |
bluhm <bluhm@openbsd.org> |
Merge struct route and struct route_in6.
Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embedded there.
OK claudio@
|
#
940d25ac |
| 11-Feb-2024 |
bluhm <bluhm@openbsd.org> |
Remove include netinet6/ip6_var.h from netinet/in_pcb.h.
OK mvs@
|
#
7b1356d5 |
| 28-Jan-2024 |
bluhm <bluhm@openbsd.org> |
Use more specific sockaddr type for inpcb notify.
in_pcbnotifyall() is an IPv4 only function. All callers check that sockaddr dst is in fact a sockaddr_in. Pass the more specific type and remove the runtime check at the beginning of in_pcbnotifyall(). Use const sockaddr_in in in_pcbnotifyall() and const sockaddr_in6 in in6_pcbnotify() as dst parameter.
OK millert@
|
#
82b5c162 |
| 27-Jan-2024 |
bluhm <bluhm@openbsd.org> |
Declare address parameter in TCP SYN cache const.
tcp6_ctlinput() cast a constant sockaddr_in6 to non-const sockaddr. sa6_src may be &sa6_any which lives in read-only data section. Better pass down the const addresses to syn_cache_lookup(). They are needed for hash lookup and are not modified.
OK mvs@
|
#
6285ef23 |
| 11-Jan-2024 |
bluhm <bluhm@openbsd.org> |
Fix white spaces in TCP.
|
#
c7641205 |
| 29-Nov-2023 |
bluhm <bluhm@openbsd.org> |
Document inp_socket as immutable and remove NULL checks.
Struct inpcb field inp_socket is initialized in in_pcballoc(). It is not NULL and never changed.
OK mvs@
|
#
2551e577 |
| 26-Nov-2023 |
bluhm <bluhm@openbsd.org> |
Remove inp parameter from ip_output().
ip_output() received inp as parameter. This is only used to lookup the IPsec level of the socket. Reasoning about MP locking is much easier if only relevant data is passed around. Convert ip_output() to receive constant inp_seclevel as argument and mark it as protected by net lock.
OK mvs@
|
#
9e96aff0 |
| 06-Jul-2023 |
bluhm <bluhm@openbsd.org> |
Convert tcp_now() time counter to 64 bit.
After changing tcp now tick to milliseconds, 32 bits will wrap around after 49 days of uptime. That may be a problem in some places of our stack. Better use a 64 bit counter.
As timestamp option is 32 bit in TCP protocol, use the lower 32 bit there. There are casts to 32 bits that should behave correctly.
Start with random 63 bit offset to avoid uptime leakage. 2^63 milliseconds result in 2.9*10^8 years of possible uptime.
OK yasuoka@
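The arithmetic behind the 49 days and 2.9*10^8 years figures can be checked with a short sketch. The demo_ helper names are hypothetical; the second function shows the truncation to the lower 32 bits mentioned for the timestamp option.

```c
#include <stdint.h>

#define DEMO_MS_PER_DAY	(1000ULL * 60 * 60 * 24)

/*
 * Days until a millisecond counter with the given number of bits
 * wraps around.  bits must be less than 64.
 */
static double
demo_wrap_days(unsigned int bits)
{
	return ((double)(1ULL << bits) / (double)DEMO_MS_PER_DAY);
}

/* Lower 32 bits of a 64-bit millisecond clock. */
static uint32_t
demo_ts_val(uint64_t now)
{
	return ((uint32_t)now);
}
```

For 32 bits this gives about 49.7 days; for 63 bits, dividing the result by 365.25 gives roughly 2.9*10^8 years.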
|
#
c06845b1 |
| 10-May-2023 |
bluhm <bluhm@openbsd.org> |
Implement TCP send offloading, for now in software only. This is meant as a fallback if network hardware does not support TSO. Driver support is still work in progress. TCP output generates large packets. In IP output the packet is chopped to TCP maximum segment size. This reduces the CPU cycles used by pf. The regular output could be assisted by hardware later, but pf route-to and IPsec need the software fallback in general. For performance comparison or to work around possible bugs, sysctl net.inet.tcp.tso=0 disables the feature. netstat -s -p tcp shows TSO counter with chopped and generated packets. based on work from jan@ tested by jmc@ jan@ Hrvoje Popovski OK jan@ claudio@
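The chopping step can be pictured with a simplified sketch that only computes segment lengths. demo_tso_chop() is a hypothetical helper; the real code operates on mbuf chains and also rewrites the IP and TCP headers of each generated packet.

```c
#include <stddef.h>

/*
 * Split a payload of len bytes into segments of at most mss bytes.
 * Stores each segment length in seglen[] (up to maxseg entries) and
 * returns the number of segments, or 0 if mss is 0 or the array is
 * too small.
 */
static size_t
demo_tso_chop(size_t len, size_t mss, size_t *seglen, size_t maxseg)
{
	size_t n = 0;

	if (mss == 0)
		return (0);
	while (len > 0) {
		size_t chunk = len < mss ? len : mss;

		if (n == maxseg)
			return (0);
		seglen[n++] = chunk;
		len -= chunk;
	}
	return (n);
}
```

For a 4000 byte payload and an MSS of 1460, this yields three segments of 1460, 1460, and 1080 bytes.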
|
#
00007ca3 |
| 07-Nov-2022 |
yasuoka <yasuoka@openbsd.org> |
Modify TCP receive buffer size auto scaling to use the smoothed RTT (SRTT) instead of the timestamp option. Since the timestamp option is disabled on some OSs (e.g. Windows) or dropped by some firewalls/routers, the window size had been fixed at 16KB in such cases, which limits throughput to a very low level on high latency networks. Also change "tcp_now" from a 2HZ tick counter to binuptime in milliseconds to calculate the SRTT better.
tested by krw matthieu jmatthew dlg djm stu stsp ok claudio
|
#
62440853 |
| 03-Oct-2022 |
bluhm <bluhm@openbsd.org> |
System calls should not fail due to temporary memory shortage in malloc(9) or pool_get(9). Pass down a wait flag to pru_attach(). During syscall socket(2) it is ok to wait, this logic was missing for internet pcb. Pfkey and route sockets were already waiting. sonewconn() must not wait when called during TCP 3-way handshake. This logic has been preserved. Unix domain stream socket connect(2) can wait until the other side has created the socket to accept. OK mvs@
|
#
a300f670 |
| 03-Sep-2022 |
bluhm <bluhm@openbsd.org> |
Initialize TCP mutex forgotten in previous commit. found by Hrvoje Popovski with witness; OK mvs@
|
#
8c664ca5 |
| 03-Sep-2022 |
bluhm <bluhm@openbsd.org> |
Use a mutex to update tcp_maxidle, tcp_iss, and tcp_now. This removes pressure from the exclusive netlock in tcp_slowtimo(). Reading is done atomically. Ensure that the tcp_now value is read only once per function to provide consistent time. OK yasuoka@
|
#
a6b8fd29 |
| 30-Aug-2022 |
bluhm <bluhm@openbsd.org> |
Refactor internet PCB lookup function. Rename in_pcbhashlookup() so the public API is in_pcblookup() and in_pcblookup_listen(). For internal use introduce in_pcbhash_insert() and in_pcbhash_lookup() to avoid code duplication. Routing domain is unsigned, change the type to u_int. OK mvs@
|
#
9e8a1cdf |
| 08-Aug-2022 |
bluhm <bluhm@openbsd.org> |
To make protocol input functions MP safe, internet PCBs need protection. Use their reference counter in more places. The in_pcb lookup functions hold the PCBs in hash tables protected by table->inpt_mtx mutex. Whenever a result is returned, increment the ref count before releasing the mutex. Then the inp can be used as long as necessary. Unref it at the end of all functions that call in_pcb lookup. As a shortcut, pf may also hold a reference to the PCB. When pf_inp_lookup() returns it, it also increments the ref count and the caller can handle it like the inp from table lookup. OK sashan@
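The lookup rule described above (take the reference before releasing the table mutex, unref when done) can be sketched in user space with a POSIX mutex. All demo_ names are hypothetical, and a linked list stands in for the hash table.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical PCB with a reference counter. */
struct demo_pcb {
	int		 port;
	int		 refcnt;
	struct demo_pcb	*next;
};

/* Hypothetical table; the mutex protects the list and the counters. */
struct demo_table {
	pthread_mutex_t	 mtx;
	struct demo_pcb	*head;
};

/* Look up by port; take a reference while still holding the mutex. */
static struct demo_pcb *
demo_lookup(struct demo_table *t, int port)
{
	struct demo_pcb *p;

	pthread_mutex_lock(&t->mtx);
	for (p = t->head; p != NULL; p = p->next) {
		if (p->port == port) {
			p->refcnt++;
			break;
		}
	}
	pthread_mutex_unlock(&t->mtx);
	return (p);
}

/* Drop the reference taken by demo_lookup(); NULL is ignored. */
static void
demo_unref(struct demo_table *t, struct demo_pcb *p)
{
	if (p == NULL)
		return;
	pthread_mutex_lock(&t->mtx);
	p->refcnt--;
	pthread_mutex_unlock(&t->mtx);
}
```

Because the count is raised before the mutex is dropped, another thread cannot free the entry between the lookup and its first use by the caller.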
|
#
1941b8b5 |
| 02-Mar-2022 |
bluhm <bluhm@openbsd.org> |
The return value of in6_pcbnotify() is never used. Make it a void function. OK gnezdo@ mvs@ florian@ sashan@
|
#
df8d9afd |
| 02-Jan-2022 |
jsg <jsg@openbsd.org> |
spelling ok jmc@ reads ok tb@
|
#
bec0ed23 |
| 11-Nov-2021 |
bluhm <bluhm@openbsd.org> |
Do not call ip_deliver() recursively from IPsec. As there is no crypto task anymore, it is possible to return the next protocol. Then ip_deliver() will walk the header chain in its loop. IPsec bridge(4) tested by jan@ OK mvs@ tobhe@ jan@
|