Revision tags: release/14.0.0 |
|
#
685dc743 |
| 16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
43b117f8 |
| 06-Jun-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: make the maximum number of retransmissions tunable per VNET
Both Windows (TcpMaxDataRetransmissions) and Linux (tcp_retries2) allow restricting the maximum number of consecutive timer-based retransmissions. Add the same capability on a per-VNET basis to FreeBSD.
Reviewed By: cc, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D40424
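A minimal sketch of the idea, not the FreeBSD implementation: a retransmission timer handler compares the number of consecutive timeouts against a limit stored per network-stack instance. The names `net_instance`, `max_rexmits`, `conn`, and `rxt_count` are hypothetical stand-ins for the per-VNET tunable the commit describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-instance settings; in the kernel this would be a VNET variable. */
struct net_instance {
	int max_rexmits;	/* assumed name: limit on consecutive timer-based retransmissions */
};

/* Hypothetical, heavily reduced connection state. */
struct conn {
	const struct net_instance *ni;	/* instance this connection belongs to */
	int rxt_count;			/* consecutive retransmission timeouts so far */
};

/*
 * Called when the retransmission timer fires.
 * Returns true when the limit is exceeded and the connection should be
 * dropped instead of retransmitting again.
 */
static inline bool
rexmit_timer_fired(struct conn *c)
{
	assert(c->ni->max_rexmits > 0);
	c->rxt_count++;
	return (c->rxt_count > c->ni->max_rexmits);
}
```

Because the limit is read through the connection's own instance pointer, two instances can run with different limits at the same time, which is the point of making the setting per-VNET.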
|
#
2169f712 |
| 11-Apr-2023 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: use IPV6_FLOWLABEL_LEN
Avoid magic numbers when handling the IPv6 flow ID for DSCP and ECN fields and use the named constant instead.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D39503
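As an illustration of why the constant helps, the first 32-bit word of an IPv6 header holds a 4-bit version, an 8-bit traffic class (DSCP plus ECN), and a 20-bit flow label. The sketch below defines `IPV6_FLOWLABEL_LEN` locally as 20 and uses hypothetical helper names (`ip6_tclass`, `ip6_dscp`, `ip6_ecn`, `ip6_set_ecn`); it operates on a host-byte-order copy of the word and is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define IPV6_FLOWLABEL_LEN	20	/* defined locally for this sketch */
#define ECN_BITS		0x03u	/* low two bits of the traffic class */

/* Traffic class: the 8 bits directly above the 20-bit flow label. */
static inline uint8_t
ip6_tclass(uint32_t flow_host)
{
	return ((uint8_t)((flow_host >> IPV6_FLOWLABEL_LEN) & 0xff));
}

/* DSCP: upper six bits of the traffic class. */
static inline uint8_t
ip6_dscp(uint32_t flow_host)
{
	return ((uint8_t)(ip6_tclass(flow_host) >> 2));
}

/* ECN: lower two bits of the traffic class. */
static inline uint8_t
ip6_ecn(uint32_t flow_host)
{
	return ((uint8_t)(ip6_tclass(flow_host) & ECN_BITS));
}

/* Replace only the ECN bits, leaving version, DSCP and flow label intact. */
static inline uint32_t
ip6_set_ecn(uint32_t flow_host, uint8_t ecn)
{
	assert((ecn & ~ECN_BITS) == 0);
	flow_host &= ~((uint32_t)ECN_BITS << IPV6_FLOWLABEL_LEN);
	return (flow_host | ((uint32_t)ecn << IPV6_FLOWLABEL_LEN));
}
```

A real header stores this word in network byte order, so a caller would convert with `ntohl()` and `htonl()` around these helpers.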
|
Revision tags: release/13.2.0 |
|
#
69c7c811 |
| 16-Mar-2023 |
Randall Stewart <rrs@FreeBSD.org> |
Move access to tcp's t_logstate into inline functions and provide new tracepoint and bbpoint capabilities.
The TCP stacks have long accessed t_logstate directly, but in order to do tracepoints and the new bbpoints we need to move to using the new inline functions. This adds them and moves rack to now use the tcp_tracepoints.
Reviewed by: tuexen, gallatin Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D38831
|
Revision tags: release/12.4.0, release/13.1.0, release/12.3.0 |
|
#
2f201df1 |
| 20-Jul-2021 |
Alfonso <gfunni234@gmail.com> |
Change hw_tls to a bool
Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/512
|
#
c0e4090e |
| 08-Feb-2023 |
Andrew Gallatin <gallatin@FreeBSD.org> |
ktls: Accurately track if ifnet ktls is enabled
This allows us to avoid spurious calls to ktls_disable_ifnet()
When we implemented ifnet kTLS, we set a flag in the tx socket buffer (SB_TLS_IFNET) to indicate ifnet kTLS. This flag meant that now, or in the past, ifnet ktls was active on a socket. Later, I added code to switch ifnet ktls sessions to software in the case of lossy TCP connections that have a high retransmit rate. Because TCP was using SB_TLS_IFNET to know if it needed to do math to calculate the retransmit ratio and potentially call into ktls_disable_ifnet(), it was doing unneeded work long after a session was moved to software.
This patch carefully tracks whether or not ifnet ktls is still enabled on a TCP connection. Because the inp is now embedded in the tcpcb, and because TCP is the most frequent accessor of this state, it made sense to move this from the socket buffer flags to the tcpcb. Because we now need reliable access to the tcpcb, we take a ref on the inp when creating a tx ktls session.
While here, I noticed that rack/bbr were incorrectly implementing tfb_hwtls_change(), and applying the change to all pending sends, when it should apply only to future sends.
This change reduces spurious calls to ktls_disable_ifnet() by 95% or so in a Netflix CDN environment.
Reviewed by: markj, rrs Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D38380
|
#
eaabc937 |
| 14-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: retire TCPDEBUG
This subsystem is superseded by modern debugging facilities, e.g. DTrace probes and TCP black box logging.
We intentionally leave SO_DEBUG in place, as many utilities may set it on a socket. Also the tcp::debug DTrace probes look at this flag on a socket.
Reviewed by: gnn, tuexen Discussed with: rscheff, rrs, jtl Differential revision: https://reviews.freebsd.org/D37694
|
#
e68b3792 |
| 07-Dec-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: embed inpcb into tcpcb
For the TCP protocol inpcb storage, specify an allocation size that provides space for most of the data a TCP connection needs, embedding into struct tcpcb several structures that previously were allocated separately.
The most important one is the inpcb itself. With embedding we can provide a strong guarantee that with a valid TCP inpcb the tcpcb is always valid and vice versa. We also reduce the number of allocs/frees per connection. The embedded inpcb is placed at the beginning of struct tcpcb, since in_pcballoc() requires that. However, later we may want to move it around for cache line efficiency, and this can be done with little effort. The new intotcpcb() macro is ready for such a move.
The congestion algorithm data, the TCP timers and osd(9) data are also embedded into tcpcb, and the temporary struct tcpcb_mem goes away. There was no extra allocation here, but we went through an extra pointer every time we accessed this data.
One interesting side effect is that TCP data is now allocated from an SMR-protected zone. Potentially this allows the TCP stacks or other TCP-related modules to utilize that for their own synchronization.
A large part of the change was done with a sed script:
    s/tp->ccv->/tp->t_ccv./g
    s/tp->ccv/\&tp->t_ccv/g
    s/tp->cc_algo/tp->t_cc/g
    s/tp->t_timers->tt_/tp->tt_/g
    s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
A dependency side effect is that code that needs to know struct tcpcb must also know struct inpcb, which added several <netinet/in_pcb.h> includes.
Differential revision: https://reviews.freebsd.org/D37127
|
#
9eb0e832 |
| 08-Nov-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: provide macros to access inpcb and socket from a tcpcb
There should be no functional changes with this commit.
Reviewed by: rscheff Differential revision: https://reviews.freebsd.org/D37123
|
#
cd84e78f |
| 04-Oct-2022 |
Randall Stewart <rrs@FreeBSD.org> |
tcp idle reduce does not work for a server.
TCP has an idle-reduce feature that allows a connection to reduce its cwnd after it has been idle for more than an RTT. This feature only works for a sending-side connection. At output, it checks the idle time (t_rcvtime vs. ticks) to see if it is more than the RTO.
The problem comes if you are a web server. You get a request, send out all the data, and then go idle. The next time you send is in response to a request from the peer asking for more data. But you updated t_rcvtime when the request came in, so you never reduce.
The fix is to do the idle reduce check also on inbound.
Reviewed by: tuexen, rscheff Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D36721
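The ordering problem described above can be sketched as follows. This is a simplified model under stated assumptions, not the kernel code: the struct `conn` and its fields `rto`, `cwnd`, `restart_cwnd`, and `in_flight` are hypothetical, and only `t_rcvtime` is a name taken from the commit message. The key point is that the idle check runs on the inbound path before the receive timestamp is refreshed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced connection state (times are in ticks). */
struct conn {
	uint32_t t_rcvtime;	/* time the last segment was received */
	uint32_t rto;		/* current retransmission timeout */
	uint32_t cwnd;		/* congestion window in bytes */
	uint32_t restart_cwnd;	/* assumed: window to restart from after idle */
	uint32_t in_flight;	/* assumed: unacknowledged bytes outstanding */
};

/* True when nothing is outstanding and the connection sat idle for >= RTO. */
static inline bool
idle_longer_than_rto(const struct conn *c, uint32_t now)
{
	return (c->in_flight == 0 && (uint32_t)(now - c->t_rcvtime) >= c->rto);
}

/*
 * Inbound path: do the idle-reduce check first, then refresh t_rcvtime.
 * Refreshing first would hide the idle period from a later output-side
 * check, which is the server case the commit describes.
 */
static inline void
on_segment_received(struct conn *c, uint32_t now)
{
	assert(c != NULL);
	if (idle_longer_than_rto(c, now) && c->cwnd > c->restart_cwnd)
		c->cwnd = c->restart_cwnd;
	c->t_rcvtime = now;
}
```

With only an output-side check, a request arriving after a long pause would reset the timestamp and the following response would go out with the stale, large window.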
|
#
08af8aac |
| 27-Sep-2022 |
Randall Stewart <rrs@FreeBSD.org> |
Tcp progress timeout
Rack has had the ability to automatically time out connections that just sit idle. This feature is off by default and requires the user to turn it on (though the socket option has been missing in tcp_usrreq.c). Let's get the progress timeout fully supported in the base stack as well as rack.
Reviewed by: tuexen Sponsored by: Netflix Inc Differential Revision: https://reviews.freebsd.org/D36716
|
#
a743fc88 |
| 22-Sep-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: fix cwnd restricted SACK retransmission loop
While doing the initial SACK retransmission segment while heavily cwnd constrained, tcp_output can erroneously send out the entire send buffer again. This may happen after a retransmission timeout, which resets snd_nxt to snd_una while the SACK scoreboard is still populated.
Reviewed By: tuexen, #transport PR: 264257 PR: 263445 PR: 260393 MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D36637
|
#
6d9e911f |
| 19-Sep-2022 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix computation of offset
Only update the offset if actually retransmitting from the scoreboard. If not done correctly, this may result in trying to (re)transmit data that is not in the socket buffer, and therefore in a panic.
PR: 264257 PR: 263445 PR: 260393 Reviewed by: rscheff@ MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D36626
|
#
4012ef77 |
| 31-Aug-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Functional implementation of Accurate ECN
The AccECN handshake and TCP header flags are supported, no support yet for the AccECN option. This minimalistic implementation is sufficient to support DCTCP while dramatically cutting the number of ACKs, and provide ECN response from the receiver to the CC modules.
Reviewed By: #transport, #manpages, rrs, pauamma Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D21011
|
#
bd30a121 |
| 08-Aug-2022 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve BBLog for output events when using the FreeBSD stack
Put the return value of ip_output()/ip6_output() in the output event instead of adding another one in case of an error. This improves consistency with other similar places.
Reviewed by: rscheff Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D36085
|
#
66605ff7 |
| 14-Jul-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Undo the increase in sequence number by 1 due to the FIN flag in case of a transient error.
If an error occurs while processing a TCP segment with some data and the FIN flag, backing out the sequence number advance does not take into account the increase by 1 due to the FIN flag.
Reviewed By: jch, gnn, #transport, tuexen Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D2970
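The arithmetic behind this fix can be sketched in a few lines. This is an illustrative model only: `backout_snd_nxt` is a hypothetical helper, and the real code works on the connection's send state rather than a bare pointer. A FIN occupies one unit of sequence space, so undoing a send of `len` data bytes plus FIN has to subtract `len + 1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Undo a sequence number advance after a transient send error.
 * The segment carried `len` data bytes and, if `fin_sent` is set, a FIN,
 * which consumed one additional sequence number.
 */
static inline void
backout_snd_nxt(uint32_t *snd_nxt, uint32_t len, bool fin_sent)
{
	assert(snd_nxt != NULL);
	*snd_nxt -= len;
	if (fin_sent)
		*snd_nxt -= 1;	/* the step the commit describes as missing */
}
```

Unsigned 32-bit arithmetic keeps the result correct when the sequence space wraps around zero.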
|
#
28173d49 |
| 02-Jun-2022 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
tcp: Correctly compute the retransmit length for all 64-bit platforms.
When the TCP sequence number subtracted is greater than 2**32 minus the window size, or 2**31 minus the window size, the use of unsigned long as an intermediate variable may result in an incorrect retransmit length computation on all 64-bit platforms.
While at it, create a helper macro to facilitate the computation of the difference between two TCP sequence numbers.
Differential Revision: https://reviews.freebsd.org/D35388 Reviewed by: rscheff MFC after: 3 days Sponsored by: NVIDIA Networking
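A small sketch of the pitfall, under stated assumptions: `seq_sub`, `seq_sub_widened`, and `rexmit_len` are hypothetical names, and `uint64_t` stands in for a 64-bit `unsigned long`. The difference of two 32-bit sequence numbers has to be interpreted as a signed 32-bit value; widening it to 64 bits as unsigned first turns a small negative difference into a huge positive one.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Signed distance from b to a in 32-bit sequence space. */
static inline int32_t
seq_sub(tcp_seq a, tcp_seq b)
{
	return ((int32_t)(a - b));
}

/*
 * What happens when the 32-bit difference passes through an unsigned
 * 64-bit intermediate: the sign information is lost.
 */
static inline int64_t
seq_sub_widened(tcp_seq a, tcp_seq b)
{
	return ((int64_t)(uint64_t)(tcp_seq)(a - b));
}

/* Retransmit length: bytes left in the hole, capped by the window. */
static inline uint32_t
rexmit_len(tcp_seq hole_end, tcp_seq rxmit, uint32_t cwin)
{
	int32_t hole = seq_sub(hole_end, rxmit);

	assert(hole >= 0);	/* the hole end must not precede rxmit */
	return ((uint32_t)hole < cwin ? (uint32_t)hole : cwin);
}
```

For example, with `a = 0xFFFFFFF0` and `b = 0x10`, `seq_sub` reports that `a` is 32 behind `b`, while the widened variant reports a distance of roughly 4 billion.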
|
#
43283184 |
| 12-May-2022 |
Gleb Smirnoff <glebius@FreeBSD.org> |
sockets: use socket buffer mutexes in struct socket directly
Since c67f3b8b78e the sockbuf mutexes belong to the containing socket, and socket buffers just point to it. In 74a68313b50 macros that access this mutex directly were added. Go over the core socket code and eliminate code that reaches the mutex by dereferencing the sockbuf compatibility pointer.
This change requires a KPI change, as some functions were given the sockbuf pointer only without any hint if it is a receive or send buffer.
This change doesn't cover the whole kernel, many protocols still use compatibility pointers internally. However, it allows operation of a protocol that doesn't use them.
Reviewed by: markj Differential revision: https://reviews.freebsd.org/D35152
|
#
732b6d4d |
| 13-Apr-2022 |
John Baldwin <jhb@FreeBSD.org> |
netinet: Use __diagused for variables only used in KASSERT().
|
#
2ff07d92 |
| 25-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Restore correct ECT marking behavior on SACK retransmissions
While coalescing all ECN-related code into new common source files, the flag to deal with SACK retransmissions was skipped. This leads to non-compliant ECT-marking of SACK retransmissions, as well as the premature sending of other TCP ECN flags (CWR).
Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34376
|
#
f7220c48 |
| 05-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: move ECN handling code to a common file
Reduce the burden to maintain correct and extensible ECN related code across multiple stacks and codepaths.
Formally no functional change.
Incidentally, this establishes correct ECN operation in one instance.
Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34162
|
#
7994ef3c |
| 05-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
Revert "tcp: move ECN handling code to a common file"
This reverts commit 0c424c90eaa6602e07bca7836b1d178b91f2a88a.
|
#
0c424c90 |
| 04-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: move ECN handling code to a common file
Reduce the burden to maintain correct and extensible ECN related code across multiple stacks and codepaths.
Formally no functional change.
Incidentally, this establishes correct ECN operation in one instance.
Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34162
|
#
f026275e |
| 03-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: set IP ECN header codepoint properly
TCP RACK can cache the IP header while preparing a new TCP packet for transmission. Thus all the IP ECN codepoint bits need to be assigned, without assuming a clear field beforehand.
Reviewed By: tuexen, kbowling, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34148
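To illustrate the difference between OR-ing bits into a field and assigning it, here is a minimal sketch. The constant and function names (`ECN_MASK`, `ECN_ECT0`, `ecn_or_only`, `ecn_assign`) are local to the example and are not taken from the commit. When a cached header still carries an ECN codepoint from an earlier packet, OR-ing the new value in leaves stale bits behind; clearing the two ECN bits first does not.

```c
#include <assert.h>
#include <stdint.h>

#define ECN_MASK	0x03	/* low two bits of the IP TOS / traffic class byte */
#define ECN_ECT0	0x02	/* ECN-capable transport, codepoint ECT(0) */
#define ECN_CE		0x03	/* congestion experienced */

/* Fragile: only correct if the ECN bits were already zero. */
static inline uint8_t
ecn_or_only(uint8_t tos, uint8_t ecn)
{
	return ((uint8_t)(tos | (ecn & ECN_MASK)));
}

/* Robust: clear the ECN bits, then assign the new codepoint. */
static inline uint8_t
ecn_assign(uint8_t tos, uint8_t ecn)
{
	assert((ecn & ~ECN_MASK) == 0);
	return ((uint8_t)((tos & ~ECN_MASK) | ecn));
}
```

With a cached TOS byte of `0xBB` (DSCP bits plus a leftover CE), `ecn_or_only(0xBB, ECN_ECT0)` still yields CE, whereas `ecn_assign(0xBB, ECN_ECT0)` yields `0xBA`, the same DSCP with ECT(0).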
|
#
1ebf4607 |
| 03-Feb-2022 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: Access all 12 TCP header flags via inline function
In order to consistently provide access to all (including reserved) TCP header flag bits, use the accessor functions tcp_get_flags() and tcp_set_flags(). Also expand any flag variable from uint8_t / char to uint16_t.
Reviewed By: hselasky, tuexen, glebius, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34130
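A sketch of what such accessors can look like, assuming a simplified two-byte view of the TCP header rather than the real `struct tcphdr`: byte 12 holds the data offset in its high nibble and four more flag or reserved bits in its low nibble, and byte 13 holds the eight classic flags. The names `tcp_hdr_bytes`, `hdr_get_flags`, `hdr_set_flags`, and the `FLAG_*` constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified view of TCP header bytes 12 and 13 (an assumption of this sketch). */
struct tcp_hdr_bytes {
	uint8_t off_x2;	/* high nibble: data offset, low nibble: upper 4 flag bits */
	uint8_t flags;	/* lower 8 flag bits (FIN, SYN, RST, PSH, ACK, URG, ECE, CWR) */
};

#define FLAG_SYN	0x0002
#define FLAG_ACK	0x0010
#define FLAG_AE		0x0100	/* lowest of the upper four bits, used by AccECN */

/* Combine both bytes into one 12-bit flag value. */
static inline uint16_t
hdr_get_flags(const struct tcp_hdr_bytes *th)
{
	return ((uint16_t)(((th->off_x2 & 0x0f) << 8) | th->flags));
}

/* Store a 12-bit flag value without disturbing the data offset. */
static inline void
hdr_set_flags(struct tcp_hdr_bytes *th, uint16_t flags)
{
	assert((flags & 0xf000) == 0);	/* only 12 flag bits exist */
	th->off_x2 = (uint8_t)((th->off_x2 & 0xf0) | ((flags >> 8) & 0x0f));
	th->flags = (uint8_t)(flags & 0xff);
}
```

Callers that kept flags in a `uint8_t` would silently drop the upper four bits, which is why the commit also widens flag variables to `uint16_t`.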
|