Revision tags: release/14.2.0 |
|
#
37898108 |
| 21-Nov-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: avoid bcopy() in tcp_mss_update()
|
#
09000cc1 |
| 21-Nov-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: mechanically rename hostcache metrics structure fields
Use hc_ prefix instead of rmx_. The latter stands for "route metrix" and is an artifact from the 90-ies, when TCP caching was embedded in
tcp: mechanically rename hostcache metrics structure fields
Use hc_ prefix instead of rmx_. The latter stands for "route metrix" and is an artifact from the 90-ies, when TCP caching was embedded into the routing table. The rename should have happened back in 97d8d152c28bb.
No functional change. Done with sed(1) command:
s/rmx_(mtu|ssthresh|rtt|rttvar|cwnd|sendpipe|recvpipe|granularity|expire|q|hits|updates)/hc_\1/g
show more ...
|
#
8f5a2e21 |
| 14-Nov-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: fix cwnd recalculation during limited transmit
Properly calculate the expected flight size (cwnd) during limited transmit. Exclude the SACK scoreboard from consideration when still in limited t
tcp: fix cwnd recalculation during limited transmit
Properly calculate the expected flight size (cwnd) during limited transmit. Exclude the SACK scoreboard from consideration when still in limited transmit.
PR: 282605 Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D47541
show more ...
|
#
dded4e9e |
| 13-Nov-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: change SOCKBUF_* macros to SOCK_[RECV|SEND]BUF_* macros
Change the older LOCK related macros over to the dedicated send/recv buffer macros in the base tcp stack.
No functional change intended.
tcp: change SOCKBUF_* macros to SOCK_[RECV|SEND]BUF_* macros
Change the older LOCK related macros over to the dedicated send/recv buffer macros in the base tcp stack.
No functional change intended.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D47567
show more ...
|
#
7dc78150 |
| 29-Oct-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: refactor cwnd during SACK transmissions to allow TSO
Refactoring of cwnd and moving the adjustment for SACKed data into tcp_output() - cwnd tracking the maximum extent starting at snd_una - all
tcp: refactor cwnd during SACK transmissions to allow TSO
Refactoring of cwnd and moving the adjustment for SACKed data into tcp_output() - cwnd tracking the maximum extent starting at snd_una - allows both SACK loss recovery as well as SACK transmissions after RTO during slow start and if allowed, the use of TSO while in loss recovery.
Reviewed By: tuexen, cc, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43470
show more ...
|
#
440f4ba1 |
| 10-Oct-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: fix duplicate retransmissions when RTO happens during SACK loss recovery
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit unexpected duplicate retransmissions. This is usually m
tcp: fix duplicate retransmissions when RTO happens during SACK loss recovery
When snd_nxt doesn't track snd_max, partial SACK ACKs may elicit unexpected duplicate retransmissions. This is usually masked by LRO not necessarily ACKing every individual segment, and prior to RFC6675 SACK loss recovery, harder to trigger even when an RTO happens while SACK loss recovery is ongoing.
Address this by improving the logic when to start a SACK loss recovery and how to deal with a RTO, as well as improvements to the adjusted congestion window during transmission selection.
Reviewed By: tuexen, cc, #transport Sponsored by: NetApp, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D43355
show more ...
|
Revision tags: release/13.4.0 |
|
#
40299c55 |
| 25-Jul-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: implement challenge ACK throttling for the base stack
Implement ACK throttling of challenge ACKs as described in RFC 5961.
Reviewed by: Peter Lei, rscheff, cc MFC after: 1 week Sponsored by:
tcp: implement challenge ACK throttling for the base stack
Implement ACK throttling of challenge ACKs as described in RFC 5961.
Reviewed by: Peter Lei, rscheff, cc MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D46066
show more ...
|
#
37b3e6a6 |
| 22-Jul-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: use TCP_MAXWIN instead of 65535
This is suggested by cc@. No functional change.
Sponsored by: Netflix, Inc.
|
#
646c28ea |
| 21-Jul-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve SEG.ACK validation
Implement the improved SEG.ACK validation described in RFC 5961. In addition to that, also detect ghost ACKs, which are ACKs for data that has never been sent. The ad
tcp: improve SEG.ACK validation
Implement the improved SEG.ACK validation described in RFC 5961. In addition to that, also detect ghost ACKs, which are ACKs for data that has never been sent. The additional checks are enabled by default, but can be disabled by setting the sysctl-variable net.inet.tcp.insecure_ack to a non-zero value.
PR: 250357 Reviewed by: Peter Lei, rscheff (older version) MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D45894
show more ...
|
Revision tags: release/14.1.0, release/13.3.0 |
|
#
b6919741 |
| 14-Nov-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
ipsec_offload: handle TSO if supported
Allow for TSO to operate if network interface supports ipsec inline offload and supports TSO over it.
Reviewed by: tuexen Sponsored by: NVIDIA networking Diff
ipsec_offload: handle TSO if supported
Allow for TSO to operate if network interface supports ipsec inline offload and supports TSO over it.
Reviewed by: tuexen Sponsored by: NVIDIA networking Differential revision: https://reviews.freebsd.org/D44222
show more ...
|
#
df9de82f |
| 25-May-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: fix sending RST after second inp lookup
When we first find an inp, we set also the tp. If then a second lookup is necessary, the inp is recomputed. If this fails, the tp is not cleared, which r
tcp: fix sending RST after second inp lookup
When we first find an inp, we set also the tp. If then a second lookup is necessary, the inp is recomputed. If this fails, the tp is not cleared, which resulted in failing KASSERT. Therefore, clear the tp when staring the inp lookup procedure. Reported by: Jenkins Fixes: 02d15215cef2 ("tcp: improve blackhole support") MFC after: 1 week Sponsored by: Netflix, Inc.
show more ...
|
#
02d15215 |
| 24-May-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve blackhole support
There are two improvements to the TCP blackhole support: (1) If net.inet.tcp.blackhole is set to 2, also sent no RST whenever a segment is received on an existing
tcp: improve blackhole support
There are two improvements to the TCP blackhole support: (1) If net.inet.tcp.blackhole is set to 2, also sent no RST whenever a segment is received on an existing closed socket or if there is a port mismatch when using UDP encapsulation. (2) If net.inet.tcp.blackhole is set to 3, no RST segment is sent in response to incoming segments on closed sockets or in response to unexpected segments on listening sockets. Thanks to gallatin@ for suggesting such an improvement.
Reviewed by: gallatin MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D45304
show more ...
|
#
fce03f85 |
| 05-May-2024 |
Randall Stewart <rrs@FreeBSD.org> |
TCP can be subject to Sack Attacks lets fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This is for sure in Rack or BBR and probably even the default stack if i
TCP can be subject to Sack Attacks lets fix this issue.
There is a type of attack that a TCP peer can launch on a connection. This is for sure in Rack or BBR and probably even the default stack if it uses lists in sack processing. The idea of the attack is that the attacker is driving you to look at 100's of sack blocks that only update 1 byte. So for example if you have 1 - 10,000 bytes outstanding the attacker sends in something like:
ACK 0 SACK(1-512) SACK(1024 - 1536), SACK(2048-2536), SACK(4096 - 4608), SACK(8192-8704) This first sack looks fine but then the attacker sends
ACK 0 SACK(1-512) SACK(1025 - 1537), SACK(2049-2537), SACK(4097 - 4609), SACK(8193-8705) ACK 0 SACK(1-512) SACK(1027 - 1539), SACK(2051-2539), SACK(4099 - 4611), SACK(8195-8707) ... These blocks are making you hunt across your linked list and split things up so that you have an entry for every other byte. Has your list grows you spend more and more CPU running through the lists. The idea here is the attacker chooses entries as far apart as possible that make you run through the list. This example is small but in theory if the window is open to say 1Meg you could end up with 100's of thousands link list entries.
To combat this we introduce three things.
when the peer requests a very small MSS we stop processing SACK's from them. This prevents a malicious peer from just using a small MSS to do the same thing. Any time we get a sack block, we use the sack-filter to remove sacks that are smaller than the smallest v4 mss (minus 40 for max TCP options) unless it ties up to snd_max (since that is legal). All other sacks in theory should be at least an MSS. If we get such an attacker that means we basically start skipping all but MSS sized Sacked blocks. The sack filter used to throw away data when its bounds were exceeded, instead now we increase its size to 15 and then throw away sack's if the filter gets over-run to prevent the malicious attacker from over-running the sack filter and thus we start to process things anyway. The default stack will need to start using the sack-filter which we have talked about in past conference calls to take full advantage of the protections offered by it (and reduce cpu consumption when processing sacks).
After this set of changes is in rack can drop its SAD detection completely
Reviewed by:tuexen@, rscheff@ Differential Revision: <https://reviews.freebsd.org/D44903>
show more ...
|
#
c9cd686b |
| 18-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: drop data received after a FIN has been processed
RFC 9293 describes the handling of data in the CLOSE-WAIT, CLOSING, LAST-ACK, and TIME-WAIT states: This should not occur since a FIN has been
tcp: drop data received after a FIN has been processed
RFC 9293 describes the handling of data in the CLOSE-WAIT, CLOSING, LAST-ACK, and TIME-WAIT states: This should not occur since a FIN has been received from the remote side. Ignore the segment text. Therefore, implement this handling.
Reviewed by: rrs, rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44746
show more ...
|
#
e8c149ab |
| 07-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: add some debug output
Also log, when dropping text or FIN after having received a FIN. This is the intended behavior described in RFC 9293. A follow-up patch will enforce this behavior for the
tcp: add some debug output
Also log, when dropping text or FIN after having received a FIN. This is the intended behavior described in RFC 9293. A follow-up patch will enforce this behavior for the base stack and the RACK stack. Reviewed by: rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44669
show more ...
|
#
3e1c8a35 |
| 06-Apr-2024 |
Michael Tuexen <tuexen@FreeBSD.org> |
tcp: improve consistency
No functional change intended.
Reported by: Coverity Scan CID: 1523781 Reviewed by: rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https
tcp: improve consistency
No functional change intended.
Reported by: Coverity Scan CID: 1523781 Reviewed by: rscheff MFC after: 3 days Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D44645
show more ...
|
#
dd7b86e2 |
| 18-Mar-2024 |
Gleb Smirnoff <glebius@FreeBSD.org> |
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that
tcp: remove IS_FASTOPEN() macro
The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
A bigger problem was that declaration of the macro in tcp_var.h depended on a kernel option. It is a bad practice to create such definitions in installable headers.
Reviewed by: rscheff, tuexen, kib Differential Revision: https://reviews.freebsd.org/D44362
show more ...
|
#
40fdc6d2 |
| 24-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: provide correct snd_fack on post_recovery
Ensure that snd_fack holds a valid value when doing the post_recovery CC processing, for preparation of the cc_cubic update, so that local pipe calcula
tcp: provide correct snd_fack on post_recovery
Ensure that snd_fack holds a valid value when doing the post_recovery CC processing, for preparation of the cc_cubic update, so that local pipe calculations can correctly refer to snd_fack during and after CC events.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43957
show more ...
|
#
fcea1cc9 |
| 14-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: fix RTO ssthresh for non-6675 pipe calculation
Follow up on D43768 to properly deal with the non-default pipe calculation. When CC_RTO is processed, the timeout will have already pulled back sn
tcp: fix RTO ssthresh for non-6675 pipe calculation
Follow up on D43768 to properly deal with the non-default pipe calculation. When CC_RTO is processed, the timeout will have already pulled back snd_nxt. Further, snd_fack is not pulled along with snd_una.
Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43876
show more ...
|
#
3eeb22cb |
| 10-Feb-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extention of the socket buffer. Remove it when the socket buffer goes away with soisdisconnected(). Veri
tcp: clean scoreboard when releasing the socket buffer
The SACK scoreboard is conceptually an extention of the socket buffer. Remove it when the socket buffer goes away with soisdisconnected(). Verify that this is also the expected state in tcp_discardcb().
PR: 276761 Reviewed by: glebius, tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43805
show more ...
|
#
0b3f9e43 |
| 27-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: move cc_post_recovery past snd_una update
The RFC6675 pipe calculation (sack.revised, enabled by default since D28702), uses outdated information, while the previous default calculated it corre
tcp: move cc_post_recovery past snd_una update
The RFC6675 pipe calculation (sack.revised, enabled by default since D28702), uses outdated information, while the previous default calculated it correctly with up-to-date information from the incoming ACK.
This difference can become as large as the receive window (not the congestion window previously), potentially triggering a massive burst of new packets.
MFC after: 1 week Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43520
show more ...
|
#
2d05a1c8 |
| 25-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: commonize check for more data to send, style changes
Use SEQ_SUB instead of a plain subtraction, for an implict type conversion and prevention of a possible overflow. Use curly brackets in stac
tcp: commonize check for more data to send, style changes
Use SEQ_SUB instead of a plain subtraction, for an implict type conversion and prevention of a possible overflow. Use curly brackets in stacked if statements throughout. Use of the ? operator to enhance readability when clearing the FIN flag in tcp_output().
None of the above change the function.
Reviewed By: tuexen, cc, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43539
show more ...
|
#
c7c325d0 |
| 24-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions) slightly by calculating maxseg only once. This typically saves one of two calls to tc
tcp: pass maxseg around instead of calculating locally
Improve slowpath processing (reordering, retransmissions) slightly by calculating maxseg only once. This typically saves one of two calls to tcp_maxseg().
Reviewed By: glebius, tuexen, cc, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43536
show more ...
|
#
429f14f8 |
| 08-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: clean PRR state after ECN congestion recovery.
PRR state was not properly reset on subsequent ECN CE events. Clean up after local transmission failures too.
Reviewed by: tuexen, cc,
tcp: clean PRR state after ECN congestion recovery.
PRR state was not properly reset on subsequent ECN CE events. Clean up after local transmission failures too.
Reviewed by: tuexen, cc, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43170
show more ...
|
#
f4574e2d |
| 08-Jan-2024 |
Richard Scheffenegger <rscheff@FreeBSD.org> |
tcp: prevent spurious empty segments and fix uncommon panic
Only try sending more data on pure ACKs when there is more data available in the send buffer.
In the case of a retransmitted SYN not bein
tcp: prevent spurious empty segments and fix uncommon panic
Only try sending more data on pure ACKs when there is more data available in the send buffer.
In the case of a retransmitted SYN not being sent due to an internal error, the snd_una/snd_nxt accounting could be off, leading to a panic. Pulling snd_nxt up to snd_una prevents this from happening.
Reported by: fengdreamer@126.com Reviewed by: cc, tuexen, #transport MFC after: 1 week Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D43343
show more ...
|