.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2017 Intel Corporation.

Generic Segmentation Offload Library
====================================

Overview
--------
Generic Segmentation Offload (GSO) is a widely used software implementation of
TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
Much like TSO, GSO gains performance by enabling upper layer applications to
process a smaller number of large packets (e.g. MTU size of 64KB), instead of
processing higher numbers of small packets (e.g. MTU size of 1500B), thus
reducing per-packet overhead.

For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
that far exceed the kernel interface's MTU; this eliminates the need to segment
packets within the guest, and improves the data-to-overhead ratio of both the
guest-host link, and PCI bus. The expectation of the guest network stack in this
scenario is that segmentation of egress frames will take place either in the NIC
HW, or where that hardware capability is unavailable, either in the host
application, or network stack.

Bearing that in mind, the GSO library enables DPDK applications to segment
packets in software. Note however, that GSO is implemented as a standalone
library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
in the underlying hardware); that is, applications must explicitly invoke the
GSO library to segment packets. The size of GSO segments ``(segsz)`` is
configurable by the application.

Limitations
-----------

#. The GSO library doesn't check if input packets have correct checksums.

#. In addition, the GSO library doesn't re-calculate checksums for segmented
   packets (that task is left to the application).

#. IP fragments are unsupported by the GSO library.

#. The egress interface's driver must support multi-segment packets.

#. Currently, the GSO library supports the following IPv4 packet types:

   - TCP
   - UDP
   - VxLAN
   - GRE

   See `Supported GSO Packet Types`_ for further details.

Packet Segmentation
-------------------

The ``rte_gso_segment()`` function is the GSO library's primary
segmentation API.

Before performing segmentation, an application must create a GSO context object
``(struct rte_gso_ctx)``, which provides the library with some of the
information required to understand how the packet should be segmented. Refer to
`How to Segment a Packet`_ for further details. Once the GSO context
has been created, and populated, the application can then use the
``rte_gso_segment()`` function to segment packets.

The GSO library typically stores each segment that it creates in two parts: the
first part contains a copy of the original packet's headers, while the second
part contains a pointer to an offset within the original packet. This mechanism
is explained in more detail in `GSO Output Segment Format`_.

The GSO library supports both single- and multi-segment input mbufs.

GSO Output Segment Format
~~~~~~~~~~~~~~~~~~~~~~~~~
To reduce the number of expensive memcpy operations required when segmenting a
packet, the GSO library typically stores each segment that it creates as a
two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
the elements produced by the API are also called 'segments', for clarity the
term 'part' is used here instead).

The first part of each output segment is a direct mbuf and contains a copy of
the original packet's headers, which must be prepended to each output segment.
These headers are copied from the original packet into each output segment.

The second part of each output segment represents a section of data from the
original packet, i.e. a data segment.
Rather than copy the data directly from
the original packet into the output segment (which would impact performance
considerably), the second part of each output segment is an indirect mbuf,
which contains no actual data, but simply points to an offset within the
original packet.

The combination of the 'header' segment and the 'data' segment constitutes a
single logical output GSO segment of the original packet. This is illustrated
in :numref:`figure_gso-output-segment-format`.

.. _figure_gso-output-segment-format:

.. figure:: img/gso-output-segment-format.*
   :align: center

   Two-part GSO output segment

In one situation, the output segment may contain additional 'data' segments.
This only occurs when both of the following are true:

- the input packet on which GSO is to be performed is represented by a
  multi-segment mbuf.

- the output segment is required to contain data that spans the boundaries
  between segments of the input multi-segment mbuf.

The GSO library traverses each segment of the input packet, and produces
numerous output segments; for optimal performance, the number of output
segments is kept to a minimum. Consequently, the GSO library maximizes the
amount of data contained within each output segment; i.e. each output segment
contains ``segsz`` bytes of data. The only exception to this is in the case of
the very final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the
final segment is smaller than the rest.

In order for an output segment to meet its MSS, it may need to include data from
multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
can point to only one direct mbuf), the solution here is to add another indirect
mbuf to the output segment; this additional segment then points to the next
input segment. If necessary, this chaining process is repeated, until the sum of
all of the data 'contained' in the output segment reaches ``segsz``.
This ensures that the amount of data contained within each output segment is
uniform, with the possible exception of the last segment, as previously
described.

:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
output segment. In this example, the output segment needs to include data from
the end of one input segment, and the beginning of another. To achieve this,
an additional indirect mbuf is chained to the second part of the output segment,
and is attached to the next input segment (i.e. it points to the data in the
next input segment).

.. _figure_gso-three-seg-mbuf:

.. figure:: img/gso-three-seg-mbuf.*
   :align: center

   Three-part GSO output segment

Supported GSO Packet Types
--------------------------

TCP/IPv4 GSO
~~~~~~~~~~~~
TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
may also contain an optional VLAN tag.

UDP/IPv4 GSO
~~~~~~~~~~~~
UDP/IPv4 GSO supports segmentation of suitably large UDP/IPv4 packets, which
may also contain an optional VLAN tag. UDP GSO is the same as IP fragmentation.
Specifically, UDP GSO treats the UDP header as a part of the payload and
does not modify it during segmentation. Therefore, after UDP GSO, only the
first output packet has the original UDP header, and the others have only L2
and L3 headers.

VxLAN GSO
~~~~~~~~~
VxLAN GSO supports segmentation of suitably large VxLAN packets,
which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
inner and/or outer VLAN tag(s).

GRE GSO
~~~~~~~
GRE GSO supports segmentation of suitably large GRE packets, which contain
an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.

How to Segment a Packet
-----------------------

To segment an outgoing packet, an application must:

#. First create a GSO context ``(struct rte_gso_ctx)``; this contains:

   - a pointer to the mbuf pool for allocating the direct buffers, which are
     used to store the GSO segments' packet headers.

   - a pointer to the mbuf pool for allocating indirect buffers, which are
     used to locate GSO segments' packet payloads.

     .. note::

        An application may use the same pool for both direct and indirect
        buffers. However, since indirect mbufs simply store a pointer, the
        application may reduce its memory consumption by creating a separate
        memory pool, containing smaller elements, for the indirect pool.

   - the size of each output segment, including packet headers and payload,
     measured in bytes.

   - the bit mask of required GSO types. The GSO library uses the same macros as
     those that describe a physical device's TX offloading capabilities (i.e.
     ``DEV_TX_OFFLOAD_*_TSO``) for gso_types. For example, if an application
     wants to segment TCP/IPv4 packets, it should set gso_types to
     ``DEV_TX_OFFLOAD_TCP_TSO``. The other values currently supported for
     gso_types are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, and
     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
     allowed.

   - a flag that indicates whether the IPv4 headers of output segments should
     contain fixed or incremental ID values.

#. Set the appropriate ol_flags in the mbuf.

   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute
     to determine how a packet should be segmented. It is the application's
     responsibility to ensure that these flags are set.

   - For example, in order to segment TCP/IPv4 packets, the application should
     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
     ol_flags.

   - If checksum calculation in hardware is required, the application should
     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.

#. Check if the packet should be processed. Packets with one of the
   following properties are not processed and are returned immediately:

   - Packet length is less than ``segsz`` (i.e. GSO is not required).

   - Packet type is not supported by GSO library (see
     `Supported GSO Packet Types`_).

   - Application has not enabled GSO support for the packet type.

   - Packet's ol_flags have been incorrectly set.

#. Allocate space in which to store the output GSO segments. If the amount of
   space allocated by the application is insufficient, segmentation will fail.

#. Invoke the GSO segmentation API, ``rte_gso_segment()``.

#. If required, update the L3 and L4 checksums of the newly-created segments.
   For tunneled packets, the outer IPv4 headers' checksums should also be
   updated. Alternatively, the application may offload checksum calculation
   to HW.
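
When sizing the output array for the allocation step above, it can help to
estimate how many segments a packet will produce. The following is a minimal
sketch of that arithmetic, not code from the GSO library. It assumes that
``segsz`` counts headers plus payload (as described for the GSO context), that
every output segment repeats the same ``hdr_len`` bytes of headers, and that a
packet no longer than ``segsz`` needs at most one slot. The helper names
``gso_estimate_nb_segs()`` and ``gso_last_seg_payload()`` are hypothetical, and
the library itself may round differently for some packet types.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the DPDK API): estimate the number of
 * output segments for a packet of pkt_len bytes.
 * Assumptions: segsz includes headers and payload, and each output segment
 * carries a copy of the same hdr_len bytes of headers.
 * Returns 0 if the sizes are inconsistent.
 */
static inline uint32_t
gso_estimate_nb_segs(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
    uint32_t payload, per_seg;

    if (segsz <= hdr_len || pkt_len < hdr_len)
        return 0;               /* no room for payload, or truncated packet */
    if (pkt_len <= segsz)
        return 1;               /* packet already fits; nothing to split */

    payload = pkt_len - hdr_len;    /* bytes to distribute */
    per_seg = segsz - hdr_len;      /* payload bytes per full segment */

    /* Ceiling division without risking overflow in an addition. */
    return payload / per_seg + (payload % per_seg != 0);
}

/*
 * Hypothetical helper: payload bytes carried by the final output segment
 * under the same assumptions. Returns 0 if the sizes are inconsistent.
 */
static inline uint32_t
gso_last_seg_payload(uint32_t pkt_len, uint32_t hdr_len, uint32_t segsz)
{
    uint32_t payload, per_seg, rem;

    if (segsz <= hdr_len || pkt_len < hdr_len)
        return 0;

    payload = pkt_len - hdr_len;
    if (pkt_len <= segsz)
        return payload;         /* single, unsplit packet */

    per_seg = segsz - hdr_len;
    rem = payload % per_seg;
    return rem != 0 ? rem : per_seg;    /* full-size if it divides evenly */
}
```

For instance, under these assumptions a 9014-byte TCP/IPv4 packet with 54 bytes
of headers and a ``segsz`` of 1514 yields 7 output segments, the last of which
carries 200 bytes of payload. An application might allocate at least that many
output slots before invoking ``rte_gso_segment()``.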