tcp_sack.c - OpenGrok history log for /dflybsd-src/sys/netinet/tcp

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: v6.4.0, v6.4.0rc1, v6.5.0, v6.2.2, v6.2.1, v6.3.0, v6.0.1, v6.0.0, v6.0.0rc1, v6.1.0, v5.8.3, v5.8.2, v5.8.1, v5.8.0, v5.9.0, v5.8.0rc1, v5.6.3, v5.6.2, v5.6.1, v5.6.0, v5.6.0rc1, v5.7.0, v5.4.3, v5.4.2, v5.4.1, v5.4.0, v5.5.0, v5.4.0rc1
# 63f17add	13-Nov-2018	Matthew Dillon <dillon@apollo.backplane.com>	kernel - Fix sack NULL pointer dereference * sack_block_lookup() can get confused when the passed-in sequence number appears to be less than sblk_start and greater than sblk_end. This situation kernel - Fix sack NULL pointer dereference * sack_block_lookup() can get confused when the passed-in sequence number appears to be less than sblk_start and greater than sblk_end. This situation can occur when the signed integer delta test has an overflow due to (sblk_end - seq) overflowing the sign bit verses (sblk_start - seq). The result is that sack_block_lookup() can crash on a NULL pointer indirection. * Check for the case, complain, and try to allow it. Though I suspect if the case occurs at all SACK will wind up with a broken list anyway. * I don't think this case can occur under normal conditions since TCP buffers do not grow to 2GB+ in size, so the crash we got was triggered by either an accidently malformed packet or an intentional one. show more ...
Revision tags: v5.2.2, v5.2.1, v5.2.0, v5.3.0, v5.2.0rc, v5.0.2, v5.0.1, v5.0.0, v5.0.0rc2, v5.1.0, v5.0.0rc1, v4.8.1, v4.8.0, v4.6.2, v4.9.0, v4.8.0rc, v4.6.1, v4.6.0, v4.6.0rc2, v4.6.0rc, v4.7.0, v4.4.3, v4.4.2, v4.4.1, v4.4.0, v4.5.0, v4.4.0rc, v4.2.4, v4.3.1, v4.2.3, v4.2.1, v4.2.0, v4.0.6, v4.3.0, v4.2.0rc, v4.0.5, v4.0.4, v4.0.3, v4.0.2, v4.0.1, v4.0.0, v4.0.0rc3, v4.0.0rc2, v4.0.0rc, v4.1.0, v3.8.2, v3.8.1, v3.6.3, v3.8.0, v3.8.0rc2, v3.9.0, v3.8.0rc, v3.6.2, v3.6.1, v3.6.0, v3.7.1, v3.6.0rc, v3.4.3, v3.4.2, v3.4.1, v3.4.0, v3.4.0rc, v3.5.0
# 4da66bbf	24-Jan-2013	Johannes Hofmann <johannes.hofmann@gmx.de>	merge
# 2fb3a851	17-Jan-2013	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: Improve sender-sender and sender-receiver fairness on the same netisr Yield to other senders or receivers on the same netisr if the current TCP stream has sent certain amount of segments (curre tcp: Improve sender-sender and sender-receiver fairness on the same netisr Yield to other senders or receivers on the same netisr if the current TCP stream has sent certain amount of segments (currently 4) and is going to burst more segments. sysctl net.inet.tcp.fairsend could be used to tune how many segements are allowed to burst. For TSO capable devices, their TSO aggregate size limit could also affect the number of segments allowed to burst. Set net.inet.tcp.fairsend to 0 will allow single TCP stream to burst as much as it wants (the old TCP sender's behaviour). "Fairsend" is performed at the places that do not affect segment sending during congestion control: - User requested output path - ACK input path Measured improvement in the following setup: +---+ +---+ \| \|<-----------\| B \| \| \| +---+ \| A \| \| \| +---+ \| \|----------->\| C \| +---+ +---+ A (i7-2600, w/ HT enabled), 82571EB B (e3-1230, w/ HT enabled), 82574L C (e3-1230, w/ HT enabled), 82574L The performance stats are gathered from 'systat -if 1' When A runs 8 TCP senders to C and 8 TCP receivers from B, sending performance are same ~975Mbps, however, the receiving performance before this commit stumbles between 670Mbps and 850Mbps; w/ "fairsend" receiving performance stays at 981Mbps. When A runs 16 TCP senders to C and 16 TCP receivers from B, sending performance are same ~975Mbps, however, the receiving performance before this commit goes from 960Mbps to 980Mbps; w/ "fairsend" receiving performance stays at 981Mbps stably. When there are more senders and receivers running on A, there is no noticable performance difference on either sending or receiving between non-"fairsend" and "fairsend", because senders are no longer being able to do continuous large burst. "Fairsend" also improves Jain's fairness index between various amount of senders (8 ~ 128) a little bit (sending only tests). show more ...
Revision tags: v3.2.2, v3.2.1, v3.2.0, v3.3.0
# aa0b1d2b	07-Sep-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# 859bc3f7	28-Aug-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: RFC3517bis is now officially RFC6675
Revision tags: v3.0.3
# 34a53f7c	29-Jun-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# 08d3c60b	15-Jun-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Take bwnd into consideration when calculate length of new segment
# f616e306	15-Jun-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Discard HighRxt, RescueRxt and LostSeq along with SACK scoreboard
# 9437e5dc	31-May-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# 9ba0ed2f	30-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: If other side reneged, discard the current SACK scoreboard Other side reneging is detected using the first SACK record: If its left edge is less than or equal to the cumulative ACK of the tcp/sack: If other side reneged, discard the current SACK scoreboard Other side reneging is detected using the first SACK record: If its left edge is less than or equal to the cumulative ACK of the incoming segment, other side probably reneged. This fixes the later assertion that the first SACK record's left edge must be above snd_una in tcp_sack_first_unsacked_len() Add statistics about other side reneging show more ...
# aaf34417	28-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Constify function arguments if possible
# 8003f426	25-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Fix off-by-one bug when updating rescue SACK information
# ccb518ea	24-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Force out more segments allowed by "pipe" during fast recovery If some segments are cumulatively acked or SACKed, and HighRxt equals snd_una, one segment (new or retransmit) will be forced tcp/sack: Force out more segments allowed by "pipe" during fast recovery If some segments are cumulatively acked or SACKed, and HighRxt equals snd_una, one segment (new or retransmit) will be forced out even if cwnd and pipe don't allow it. When large amount of segments are lost, i.e. computed pipe could be large, this avoids unnecessary retransmit timeout and could perform as good as NewReno. Sysctl node net.inet.tcp.force_sackrxt could be tuned to burst out several retransmits, default is 1 (should be good enough). Set this sysctl to 0, SACK based fast recovery will obey the computed pipe. Several unnecessary retransmit timeout graph as described above: http://leaf.dragonflybsd.org/~sephe/no_force_sack_rexmt2_15.xpl (starts @15s) http://leaf.dragonflybsd.org/~sephe/no_force_sack_rexmt_54.xpl (starts @54s) show more ...
# ba0d6f99	24-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Use RFC3517bis IsLost(snd_una) as fallback of early retransmit Since we are less certain about whether is segment is lost or not when using IsLost(snd_una), we do not send out other unSACK tcp/sack: Use RFC3517bis IsLost(snd_una) as fallback of early retransmit Since we are less certain about whether is segment is lost or not when using IsLost(snd_una), we do not send out other unSACKed segments except the first unSACKed segment under this condition. Sending out other unSACKed segments could be too aggressive here; just wait for another ACK to tick out more unSACKed segments. show more ...
# eb5f6cff	23-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Fix the condition that SACK rescue retransmit can't be done If we have nothing left above the HighRxt, the first unSACKed segment will be used as the SACK rescue retransmit.
# f3c063ed	23-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: Optimize SACK scoreboard records consolidation a little bit If the SACK block and SACK scoreboard record are matched exactly, SACK scoreboard records consolidation is not needed at all.
# c3a4b1ee	19-May-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# e2289e66	18-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: Implement RFC4653 Non-Congestion Robustness (NCR) It is enabled by default and can be disabled using sysctl node: net.inet.tcp.ncr As far as I have tested on heavily reordered network path, th tcp: Implement RFC4653 Non-Congestion Robustness (NCR) It is enabled by default and can be disabled using sysctl node: net.inet.tcp.ncr As far as I have tested on heavily reordered network path, this algorithm does avoid most of the spurious fast retransmits. While on the normal network path, the fast retransmits stil could be triggered properly. show more ...
# 16003dcf	16-May-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# 0f9e45de	16-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: Use TAILQ for segments reassemble queue So the last segment of the reassemble queue could be peeked w/ minimal cost
# ec702664	11-May-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# c7e6499a	10-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp: Add sack_flags for SACK related operations This saves us 4 bits in the crowded t_flags
# e7c6ae22	10-May-2012	Matthew Dillon <dillon@apollo.backplane.com>	Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
# 5fd89c20	09-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Don't reduce retransmit threshold as recommended by RFC3517bis - Don't reduce byte threshold in IsLost() - Don't retransmit if IsLost(tcpcb.snd_una) is true They cause spurious retransmit tcp/sack: Don't reduce retransmit threshold as recommended by RFC3517bis - Don't reduce byte threshold in IsLost() - Don't retransmit if IsLost(tcpcb.snd_una) is true They cause spurious retransmits. Add sysctl node net.inet.tcp.rfc3517bis_rxt to enable the RFC3517bis recommended retransmit threshold reduction. It is disabled by default. show more ...
# ffe35e17	08-May-2012	Sepherosa Ziehau <sephe@dragonflybsd.org>	tcp/sack: Implement RFC3517bis http://tools.ietf.org/html/draft-ietf-tcpm-3517bis-02, which will be become "Standards Track" soon. net.inet.tcp.rfc3517bis sysctl node is added to enable this update tcp/sack: Implement RFC3517bis http://tools.ietf.org/html/draft-ietf-tcpm-3517bis-02, which will be become "Standards Track" soon. net.inet.tcp.rfc3517bis sysctl node is added to enable this update. It is off by default. show more ...
12 3