..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Generic Segmentation Offload Library
====================================

Overview
--------
Generic Segmentation Offload (GSO) is a widely used software implementation of
TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
Much like TSO, GSO gains performance by enabling upper layer applications to
process a smaller number of large packets (e.g. MTU size of 64KB), instead of
processing higher numbers of small packets (e.g. MTU size of 1500B), thus
reducing per-packet overhead.

For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
that far exceed the kernel interface's MTU; this eliminates the need to segment
packets within the guest, and improves the data-to-overhead ratio of both the
guest-host link and the PCI bus. The expectation of the guest network stack in
this scenario is that segmentation of egress frames will take place either in
the NIC hardware or, where that capability is unavailable, in the host
application or network stack.

Bearing that in mind, the GSO library enables DPDK applications to segment
packets in software. Note, however, that GSO is implemented as a standalone
library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
in the underlying hardware); that is, applications must explicitly invoke the
GSO library to segment packets. The size of GSO segments (``segsz``) is
configurable by the application.

Limitations
-----------

#. The GSO library doesn't check if input packets have correct checksums.

#. In addition, the GSO library doesn't re-calculate checksums for segmented
   packets (that task is left to the application).

#. IP fragments are unsupported by the GSO library.

#. The egress interface's driver must support multi-segment packets.

#. Currently, the GSO library supports the following IPv4 packet types:

   - TCP
   - VxLAN
   - GRE

   See `Supported GSO Packet Types`_ for further details.

Packet Segmentation
-------------------

The ``rte_gso_segment()`` function is the GSO library's primary
segmentation API.

Before performing segmentation, an application must create a GSO context object
(``struct rte_gso_ctx``), which provides the library with some of the
information required to understand how the packet should be segmented. Refer to
`How to Segment a Packet`_ for additional details. Once the GSO context
has been created and populated, the application can use the
``rte_gso_segment()`` function to segment packets.

The GSO library typically stores each segment that it creates in two parts: the
first part contains a copy of the original packet's headers, while the second
part contains a pointer to an offset within the original packet. This mechanism
is explained in more detail in `GSO Output Segment Format`_.

The GSO library supports both single- and multi-segment input mbufs.

GSO Output Segment Format
~~~~~~~~~~~~~~~~~~~~~~~~~
To reduce the number of expensive memcpy operations required when segmenting a
packet, the GSO library typically stores each segment that it creates as a
two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
the elements produced by the API are also called 'segments', for clarity the
term 'part' is used here instead).

The first part of each output segment is a direct mbuf and contains a copy of
the original packet's headers, which must be prepended to each output segment.
These headers are copied from the original packet into each output segment.

The second part of each output segment represents a section of data from the
original packet, i.e. a data segment. Rather than copy the data directly from
the original packet into the output segment (which would impact performance
considerably), the second part of each output segment is an indirect mbuf,
which contains no actual data, but simply points to an offset within the
original packet.

The combination of the 'header' segment and the 'data' segment constitutes a
single logical output GSO segment of the original packet. This is illustrated
in :numref:`figure_gso-output-segment-format`.

.. _figure_gso-output-segment-format:

.. figure:: img/gso-output-segment-format.svg
   :align: center

   Two-part GSO output segment

In one situation, the output segment may contain additional 'data' segments.
This occurs only when both of the following conditions hold:

- the input packet on which GSO is to be performed is represented by a
  multi-segment mbuf.

- the output segment is required to contain data that spans the boundaries
  between segments of the input multi-segment mbuf.

The GSO library traverses each segment of the input packet, and produces
numerous output segments; for optimal performance, the number of output
segments is kept to a minimum. Consequently, the GSO library maximizes the
amount of data contained within each output segment; i.e. each output segment
contains ``segsz`` bytes of data. The only exception to this is the very
final output segment; if ``pkt_len`` is not an exact multiple of ``segsz``,
then the final segment is smaller than the rest.

In order for an output segment to meet its MSS, it may need to include data from
multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
can point to only one direct mbuf), the solution here is to add another indirect
mbuf to the output segment; this additional segment then points to the next
input segment. If necessary, this chaining process is repeated, until the sum of
all of the data 'contained' in the output segment reaches ``segsz``. This
ensures that the amount of data contained within each output segment is uniform,
with the possible exception of the last segment, as previously described.

:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
output segment. In this example, the output segment needs to include data from
the end of one input segment, and the beginning of another. To achieve this,
an additional indirect mbuf is chained to the second part of the output segment,
and is attached to the next input segment (i.e. it points to the data in the
next input segment).

.. _figure_gso-three-seg-mbuf:

.. figure:: img/gso-three-seg-mbuf.svg
   :align: center

   Three-part GSO output segment

Supported GSO Packet Types
--------------------------

TCP/IPv4 GSO
~~~~~~~~~~~~
TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
may also contain an optional VLAN tag.

VxLAN GSO
~~~~~~~~~
VxLAN GSO supports segmentation of suitably large VxLAN packets,
which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
inner and/or outer VLAN tag(s).

GRE GSO
~~~~~~~
GRE GSO supports segmentation of suitably large GRE packets, which contain
an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.

How to Segment a Packet
-----------------------

To segment an outgoing packet, an application must:

#. First create a GSO context (``struct rte_gso_ctx``); this contains:

   - a pointer to the mbuf pool for allocating the direct buffers, which are
     used to store the GSO segments' packet headers.

   - a pointer to the mbuf pool for allocating indirect buffers, which are
     used to locate GSO segments' packet payloads.

     .. note::

       An application may use the same pool for both direct and indirect
       buffers. However, since indirect mbufs simply store a pointer, the
       application may reduce its memory consumption by creating a separate
       memory pool, containing smaller elements, for the indirect pool.

   - the size of each output segment, including packet headers and payload,
     measured in bytes.

   - the bit mask of required GSO types. The GSO library uses the same macros as
     those that describe a physical device's TX offloading capabilities (i.e.
     ``DEV_TX_OFFLOAD_*_TSO``) for ``gso_types``. For example, if an application
     wants to segment TCP/IPv4 packets, it should set ``gso_types`` to
     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently supported for
     ``gso_types`` are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO`` and
     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
     allowed.

   - a flag that indicates whether the IPv4 headers of output segments should
     contain fixed or incremental ID values.

#. Set the appropriate ``ol_flags`` in the mbuf.

   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
     to determine how a packet should be segmented. It is the application's
     responsibility to ensure that these flags are set.

   - For example, in order to segment TCP/IPv4 packets, the application should
     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
     ``ol_flags``.

   - If checksum calculation in hardware is required, the application should
     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.

#. Check if the packet should be processed. Packets with one of the
   following properties are not processed and are returned immediately:

   - Packet length is less than ``segsz`` (i.e. GSO is not required).

   - Packet type is not supported by the GSO library (see
     `Supported GSO Packet Types`_).

   - Application has not enabled GSO support for the packet type.

   - Packet's ``ol_flags`` have been incorrectly set.

#. Allocate space in which to store the output GSO segments. If the amount of
   space allocated by the application is insufficient, segmentation will fail.

#. Invoke the GSO segmentation API, ``rte_gso_segment()``.

#. If required, update the L3 and L4 checksums of the newly-created segments.
   For tunneled packets, the outer IPv4 headers' checksums should also be
   updated. Alternatively, the application may offload checksum calculation
   to HW.
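
The space-allocation step above needs an estimate of how many output segments
a packet will produce. The following is a minimal sketch of that arithmetic
only; ``gso_num_segments()`` is a hypothetical helper and not part of the GSO
library API. It assumes that the configured segment size includes the packet
headers, as described for the GSO context above, so each output segment carries
``segsz - hdr_len`` bytes of payload and only the last one may be shorter.

```c
#include <assert.h> /* only needed for the checks in the usage note */
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK function): estimate how many output
 * segments a packet of pkt_len bytes yields.
 *
 * Assumptions:
 *  - hdr_len is the total length of the headers copied into every
 *    output segment (e.g. Ethernet + IPv4 + TCP);
 *  - segsz is the size of each output segment, headers included.
 *
 * Returns 0 for inputs that cannot be segmented, and 1 when the packet
 * already fits in one segment and needs no GSO.
 */
static inline uint32_t
gso_num_segments(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
    uint32_t payload_len;
    uint32_t payload_per_seg;

    /* A segment must have room for at least one payload byte. */
    if (segsz <= hdr_len || pkt_len <= hdr_len)
        return 0;

    /* Packets no longer than segsz are left as they are. */
    if (pkt_len <= segsz)
        return 1;

    payload_len = pkt_len - hdr_len;
    payload_per_seg = segsz - hdr_len;

    /* Round up: the last segment may carry less payload. */
    return (payload_len + payload_per_seg - 1) / payload_per_seg;
}
```

For instance, with 54 bytes of headers, a 64054-byte packet and a 1514-byte
segment size, each segment carries 1460 bytes of payload, so
``gso_num_segments(64054, 54, 1514)`` evaluates to 44 and the last segment
carries the remaining 1220 bytes. An application could use such an estimate to
size the array of mbuf pointers that receives the output segments. Note that
each output segment consumes one direct mbuf plus at least one indirect mbuf,
so the mbuf pools need to be sized accordingly.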