#
e41e61d5 |
| 16-Jan-2013 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp/tso: Add per-device TSO aggregation size limit
- Prevent possible TSO large burst, when it is inappropriate (plenty of >24 segements bursts were observered, even when 32 parallel sending TCP
tcp/tso: Add per-device TSO aggregation size limit
- Prevent possible TSO large burst, when it is inappropriate (plenty of >24 segements bursts were observered, even when 32 parallel sending TCP streams are running on the same GigE NIC). TSO large burst has following drawbacks on a single TX queue, even on the devices that are multiple TX queues capable: o Delay other senders' packet transmission quite a lot. o Has negative effect on TCP receivers, which sends ACKs. o Cause buffer bloat in software sending queues, whose upper limit is based on "packet count". o Packet scheduler's decision could be less effective. On the other hand, TSO large burst could improve CPU usage. - Improve fairness between multiple TX queues on the devices that are multiple TX queues capable but only fetch data on TSO large packet boundary instead of TCP segment boundary.
Drivers could supply their own TSO aggregation size limit. If driver does not set it, the default value is 6000 (4 segments if MTU is 1500). The default value increases CPU usage a little bit: on i7-2600 w/ HT enabled, single TCP sending stream, CPU usage increases from 14%~17% to 17%~20%.
User could configure TSO aggregation size limit by using ifconfig(8): ifconfig ifaceX tsolen _n_
show more ...
|
#
7078f92b |
| 12-Jan-2013 |
Johannes Hofmann <johannes.hofmann@gmx.de> |
merge
|
#
4f483122 |
| 07-Jan-2013 |
Sascha Wildner <saw@online.de> |
kernel/tcp_{input,output}: Remove some unused variables.
|
Revision tags: v3.2.2, v3.2.1, v3.2.0, v3.3.0, v3.0.3 |
|
#
f78847aa |
| 16-Aug-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
243cd031 |
| 16-Aug-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: Stringent TSO segment length assertion.
|
#
1ebac809 |
| 04-Aug-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
f0336d39 |
| 01-Aug-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
mbuf: segsz -> tso_segsz, which is more expressive
|
#
7df36335 |
| 01-Aug-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
mbuf: Save linker layer, IP and TCP/UDP header length
This could ease most drivers's TSO operation and avoid extra data area accessing during TSO setting up.
This could also help Intel's 1000M/10G
mbuf: Save linker layer, IP and TCP/UDP header length
This could ease most drivers's TSO operation and avoid extra data area accessing during TSO setting up.
This could also help Intel's 1000M/10G drivers' hardware checksum offloading, which requires protocol header length.
show more ...
|
#
5f60906c |
| 27-Jul-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: Add TSO support for IPv4
It is implemented mainly according to NetBSD's TSO implementation.
Following stuffs are only in DragonFly - Add comment about devices' expected behaviour upon PUSH and
tcp: Add TSO support for IPv4
It is implemented mainly according to NetBSD's TSO implementation.
Following stuffs are only in DragonFly - Add comment about devices' expected behaviour upon PUSH and FIN flags Obtained-from: Microsoft's LSO online document - Don't use TSO, if there are SACK or DSACK blocks to report - Don't use TSO, if congestion window needs validation - Don't use TSO, if URG flag is to be set - Take IP and TCP header sizes into consideration when calculate the large TCP segment size - Pseudo checksum for the large TCP segment is calculated using only source address, destination address and IPPROTO_TCP according to Microsoft's LSO online document. This fashion of pseudo checksum calculation seems to be adopted by several NIC chips.
Several driver helper functions are added: - tcp_tso_pullup(), which extracts IPv4 and TCP header's location and length. And make sure that IPv4 and TCP headers are in contiguous memory. - ether_tso_pullup(), in addition to what tcp_tso_pullup() does, it also extracts ethernet header's length and make sure that ethernet, IPv4 and TCP headers are in contiguous memory.
Sysctl node net.inet.tcp.tso could be used to globally disable TSO. TSO is by default on.
tso/-tso are added to ifconfig(8), which could be used to enable or disable TSO on the specific interface.
show more ...
|
#
34a53f7c |
| 29-Jun-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
cde03107 |
| 28-Jun-2012 |
Alex Hornung <ahornung@gmail.com> |
Merge branch 'master' of git://git.dragonflybsd.org/dragonfly
|
#
c1fabe85 |
| 15-Jun-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: Add XMITNOW which bypasses the Nagle algorithm temporarily
This flag acts differently from ACKNOW that no pure ACK will be sent. It is currently used by the (extended) limited transmit and the
tcp: Add XMITNOW which bypasses the Nagle algorithm temporarily
This flag acts differently from ACKNOW that no pure ACK will be sent. It is currently used by the (extended) limited transmit and the SACK based fast recovery.
This flag is intended to fix the following bug in the SACK based fast recovery: The NextSeg() requires that if the unACKed segments could not pass IsLost(), previously unsent segment should be selected. In the application limited period, the size of the previously unsent segment could be less than the MSS, thus it could not be sent immediately according to the Nagle algorithm. In our SACK based fast recovery implementation, if the tcp_output() sends no segments, the current recovery transmit process will stop immediately. This could stop ACK clock and cause timeout retransmit, which could be avoided, if the Nagle algorithm is bypassed temporarily for the small unsent segment selected by NextSeg().
When this flag is used with (extended) limited transmit, certain amount of spurious early retranmits could be avoided.
show more ...
|
#
3718f371 |
| 15-Jun-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp_output: Always clear TF_ACKNOW before returning
|
#
16003dcf |
| 16-May-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
0f9e45de |
| 16-May-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: Use TAILQ for segments reassemble queue
So the last segment of the reassemble queue could be peeked w/ minimal cost
|
#
e06c72d3 |
| 30-Apr-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
ed20d0e3 |
| 21-Apr-2012 |
Sascha Wildner <saw@online.de> |
kernel: Remove newlines from the panic messages that have one.
panic() itself will add a newline.
|
#
890ced9f |
| 20-Apr-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
3d127502 |
| 18-Apr-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: Correct sending idle detection and implement part of RFC2861
This commit mainly changes how cwnd is shinked after idle period on the send side.
- Properly detect sending idle period according
tcp: Correct sending idle detection and implement part of RFC2861
This commit mainly changes how cwnd is shinked after idle period on the send side.
- Properly detect sending idle period according to RFC5681. The problem of using reception time to detect sending idle period is described in RFC5681 as:
"... Using the last time a segment was received to determine whether or not to decrease cwnd can fail to deflate cwnd in the common case of persistent HTTP connections [HTH98]. In this case, a Web server receives a request before transmitting data to the Web client. The reception of the request makes the test for an idle connection fail, and allows the TCP to begin transmission with a possibly inappropriately large cwnd. ..."
This mainly affects HTTP/1.1 persistent connection performance after the connection is idled for a long time. The impact probably should not be drastic, since 80% HTTP/1.1 persistent connection delay between two requests are less then minimum RTO (1 second) as discovered by: "Overclocking the Yahoo! CDN for Faster Web Page Loads" http://conferences.sigcomm.org/imc/2011/docs/p569.pdf
Sysctl node net.inet.tcp.idle_restart is added to disable the cwnd shinking after idle period. It is on by default. And you can set it to 0 to restore old behaviour against HTTP/1.1 persistent connection.
- Implement part of RFC2861, which decays cwnd after idle period according to the length of sending idle period. The main difference between our implementation and the RFC2861 is that we don't let cwnd go below the value allowed by RFC5861.
Sysctl node net.inet.tcp.idle_cwv is added to disable CWV after sending idle period. It is on by default. Disable net.inet.tcp.idle_restart will also indirectly disable CWV after sending idle period.
The CWV during the application-limited period is not implemented by this commit. It is just too conservative, as discovered by: "Analysing TCP for Bursty Traffic, Int'l J. of Communications, Network and System Sciences, 7(3), July 2010."
- Add statistics about how much sending idle happened
show more ...
|
#
6f225a80 |
| 11-Apr-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
c82f376d |
| 11-Apr-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp: RW=min(IW,cwnd) is standardized in RFC5681
|
#
4c986ba7 |
| 10-Apr-2012 |
Peter Avalos <pavalos@dragonflybsd.org> |
Merge remote-tracking branch 'origin/master'
|
#
9abfda27 |
| 29-Mar-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|
#
02cc2f35 |
| 27-Mar-2012 |
Sepherosa Ziehau <sephe@dragonflybsd.org> |
tcp/sack: Add more statistics
|
Revision tags: v3.0.2 |
|
#
fc2c45df |
| 18-Mar-2012 |
Matthew Dillon <dillon@apollo.backplane.com> |
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
|