..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2017 Intel Corporation.

Generic Segmentation Offload Library
====================================

Overview
--------
Generic Segmentation Offload (GSO) is a widely used software implementation of
TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
Much like TSO, GSO gains performance by enabling upper layer applications to
process a smaller number of large packets (e.g. MTU size of 64KB), instead of
processing higher numbers of small packets (e.g. MTU size of 1500B).

For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
that far exceed the kernel interface's MTU; this eliminates the need to segment
packets within the guest, and improves the data-to-overhead ratio of both the
guest-host link and the PCI bus. The expectation of the guest network stack in
this scenario is that segmentation of egress frames will take place either in
the NIC hardware or, where that hardware capability is unavailable, in the host
application or network stack.

Bearing that in mind, the GSO library enables DPDK applications to segment
packets in software. Note, however, that GSO is implemented as a standalone
library, and not via a 'fallback' mechanism (i.e.
for when TSO is unsupported
in the underlying hardware); that is, applications must explicitly invoke the
GSO library to segment packets. The size of GSO segments (``segsz``) is
configurable by the application.

Limitations
-----------

#. The GSO library doesn't check if input packets have correct checksums.

#. In addition, the GSO library doesn't re-calculate checksums for segmented
   packets (that task is left to the application).

#. IP fragments are unsupported by the GSO library.

#. The egress interface's driver must support multi-segment packets.

#. Currently, the GSO library supports the following IPv4 packet types:

   - TCP
   - UDP
   - VxLAN
   - GRE

   See `Supported GSO Packet Types`_ for further details.

Packet Segmentation
-------------------

The ``rte_gso_segment()`` function is the GSO library's primary
segmentation API.

Before performing segmentation, an application must create a GSO context object
(``struct rte_gso_ctx``), which provides the library with some of the
information required to understand how the packet should be segmented. Refer to
`How to Segment a Packet`_ for additional details.
Once the GSO context
has been created and populated, the application can then use the
``rte_gso_segment()`` function to segment packets.

The GSO library typically stores each segment that it creates in two parts: the
first part contains a copy of the original packet's headers, while the second
part contains a pointer to an offset within the original packet. This mechanism
is explained in more detail in `GSO Output Segment Format`_.

The GSO library supports both single- and multi-segment input mbufs.

GSO Output Segment Format
~~~~~~~~~~~~~~~~~~~~~~~~~
To reduce the number of expensive memcpy operations required when segmenting a
packet, the GSO library typically stores each segment that it creates as a
two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
the elements produced by the API are also called 'segments', for clarity the
term 'part' is used here instead).

The first part of each output segment is a direct mbuf and contains a copy of
the original packet's headers, which must be prepended to each output segment.
These headers are copied from the original packet into each output segment.

The second part of each output segment represents a section of data from the
original packet, i.e. a data segment.
Rather than copy the data directly from
the original packet into the output segment (which would impact performance
considerably), the second part of each output segment is an indirect mbuf,
which contains no actual data, but simply points to an offset within the
original packet.

The combination of the 'header' part and the 'data' part constitutes a
single logical output GSO segment of the original packet. This is illustrated
in :numref:`figure_gso-output-segment-format`.

.. _figure_gso-output-segment-format:

.. figure:: img/gso-output-segment-format.*
   :align: center

   Two-part GSO output segment

In one situation, the output segment may contain additional 'data' parts.
This occurs only when both of the following are true:

- the input packet on which GSO is to be performed is represented by a
  multi-segment mbuf.

- the output segment is required to contain data that spans the boundaries
  between segments of the input multi-segment mbuf.

The GSO library traverses each segment of the input packet, and produces
numerous output segments; for optimal performance, the number of output
segments is kept to a minimum. Consequently, the GSO library maximizes the
amount of data contained within each output segment; i.e. each output segment
contains ``segsz`` bytes of data.
The only exception to this is in the case of the very
final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final
segment is smaller than the rest.

In order for an output segment to meet its MSS, it may need to include data from
multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
can point to only one direct mbuf), the solution here is to add another indirect
mbuf to the output segment; this additional part then points to the next
input segment. If necessary, this chaining process is repeated, until the sum of
all of the data 'contained' in the output segment reaches ``segsz``. This
ensures that the amount of data contained within each output segment is uniform,
with the possible exception of the last segment, as previously described.

:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
output segment. In this example, the output segment needs to include data from
the end of one input segment, and the beginning of another. To achieve this,
an additional indirect mbuf is chained to the second part of the output segment,
and is attached to the next input segment (i.e. it points to the data in the
next input segment).

.. _figure_gso-three-seg-mbuf:

.. figure:: img/gso-three-seg-mbuf.*
   :align: center

   Three-part GSO output segment

Supported GSO Packet Types
--------------------------

TCP/IPv4 GSO
~~~~~~~~~~~~
TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
may also contain an optional VLAN tag.

UDP/IPv4 GSO
~~~~~~~~~~~~
UDP/IPv4 GSO supports segmentation of suitably large UDP/IPv4 packets, which
may also contain an optional VLAN tag. UDP GSO is the same as IP fragmentation.
Specifically, UDP GSO treats the UDP header as a part of the payload and
does not modify it during segmentation. Therefore, after UDP GSO, only the
first output packet has the original UDP header, and the others have only L2
and L3 headers.

VxLAN GSO
~~~~~~~~~
VxLAN GSO supports segmentation of suitably large VxLAN packets,
which contain an outer IPv4 header, inner TCP/IPv4 headers, and optional
inner and/or outer VLAN tag(s).

GRE GSO
~~~~~~~
GRE GSO supports segmentation of suitably large GRE packets, which contain
an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.
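
Taken together, the packet types in this section differ mainly in which headers
are repeated in front of every output segment. The sketch below is illustrative
only and is not part of the GSO library: the ``gso_kind`` enum, the ``hdr_lens``
structure and the ``replicated_hdr_len()`` function are hypothetical names
introduced here. It assumes that, for tunneled packets, all outer and inner
headers are replicated, in line with the statement that each output segment
starts with a copy of the original packet's headers.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Hypothetical packet kinds mirroring the four subsections above. */
   enum gso_kind { GSO_TCP4, GSO_UDP4, GSO_VXLAN_TCP4, GSO_GRE_TCP4 };

   /*
    * Hypothetical header-length record; all values are in bytes.
    * A VLAN tag, when present, is counted in the relevant L2 length.
    */
   struct hdr_lens {
   	uint16_t outer_l2; /* outer Ethernet (+ VLAN); 0 if not tunneled */
   	uint16_t outer_l3; /* outer IPv4; 0 if not tunneled */
   	uint16_t tunnel;   /* UDP + VxLAN, or GRE; 0 if not tunneled */
   	uint16_t l2;       /* (inner) Ethernet (+ VLAN) */
   	uint16_t l3;       /* (inner) IPv4 */
   	uint16_t l4;       /* (inner) TCP or UDP */
   };

   /* Bytes of header repeated at the start of every output segment. */
   static uint16_t
   replicated_hdr_len(enum gso_kind kind, const struct hdr_lens *h)
   {
   	switch (kind) {
   	case GSO_TCP4:
   		return (uint16_t)(h->l2 + h->l3 + h->l4);
   	case GSO_UDP4:
   		/* The UDP header travels with the payload (first segment only). */
   		return (uint16_t)(h->l2 + h->l3);
   	case GSO_VXLAN_TCP4:
   	case GSO_GRE_TCP4:
   		return (uint16_t)(h->outer_l2 + h->outer_l3 + h->tunnel +
   				  h->l2 + h->l3 + h->l4);
   	}
   	assert(0 && "unknown gso_kind");
   	return 0;
   }

For an untagged TCP/IPv4 packet with 14-byte Ethernet, 20-byte IPv4 and 20-byte
TCP headers, this gives 54 bytes per output segment; for the equivalent UDP/IPv4
packet only the 34 bytes of L2 and L3 headers are repeated.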

How to Segment a Packet
-----------------------

To segment an outgoing packet, an application must:

#. First create a GSO context (``struct rte_gso_ctx``); this contains:

   - a pointer to the mbuf pool for allocating the direct buffers, which are
     used to store the GSO segments' packet headers.

   - a pointer to the mbuf pool for allocating indirect buffers, which are
     used to locate GSO segments' packet payloads.

     .. note::

        An application may use the same pool for both direct and indirect
        buffers. However, since indirect mbufs simply store a pointer, the
        application may reduce its memory consumption by creating a separate
        memory pool, containing smaller elements, for the indirect pool.

   - the size of each output segment, including packet headers and payload,
     measured in bytes.

   - the bit mask of required GSO types. The GSO library uses the same macros as
     those that describe a physical device's TX offloading capabilities (i.e.
     ``DEV_TX_OFFLOAD_*_TSO``) for ``gso_types``. For example, if an application
     wants to segment TCP/IPv4 packets, it should set ``gso_types`` to
     ``DEV_TX_OFFLOAD_TCP_TSO``. The only other values currently
     supported for ``gso_types`` are ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO`` and
     ``DEV_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
     allowed.

   - a flag that indicates whether the IPv4 headers of output segments should
     contain fixed or incremental ID values.

#. Set the appropriate ``ol_flags`` in the mbuf.

   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute to
     determine how a packet should be segmented. It is the application's
     responsibility to ensure that these flags are set.

   - For example, in order to segment TCP/IPv4 packets, the application should
     add the ``PKT_TX_IPV4`` and ``PKT_TX_TCP_SEG`` flags to the mbuf's
     ``ol_flags``.

   - If checksum calculation in hardware is required, the application should
     also add the ``PKT_TX_TCP_CKSUM`` and ``PKT_TX_IP_CKSUM`` flags.

#. Check if the packet should be processed. Packets with any of the
   following properties are not processed and are returned immediately:

   - Packet length is less than ``segsz`` (i.e. GSO is not required).

   - Packet type is not supported by the GSO library (see
     `Supported GSO Packet Types`_).

   - Application has not enabled GSO support for the packet type.

   - Packet's ``ol_flags`` have been incorrectly set.

#. Allocate space in which to store the output GSO segments. If the amount of
   space allocated by the application is insufficient, segmentation will fail.

#. Invoke the GSO segmentation API, ``rte_gso_segment()``.

#. If required, update the L3 and L4 checksums of the newly-created segments.
   For tunneled packets, the outer IPv4 headers' checksums should also be
   updated. Alternatively, the application may offload checksum calculation
   to HW.
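
For the allocation step above, an application needs an estimate of how many
output segments a packet can produce. The following is a minimal sketch of that
arithmetic, not code taken from the GSO library. It assumes a TCP/IPv4 packet,
that ``segsz`` covers headers plus payload (as described for the GSO context),
and that ``hdr_len`` is the number of header bytes copied into each segment.
The function name ``gso_expected_segments()`` and its parameters are
hypothetical; other packet types, such as UDP, may split the payload
differently.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /*
    * Expected number of output segments for one packet.
    *
    * pkt_len: total length of the input packet, in bytes
    * hdr_len: bytes of header copied into every output segment
    * segsz:   size of each output segment, headers included
    *
    * Returns 1 when the packet already fits within segsz, i.e. when
    * segmentation is not required.
    */
   static uint32_t
   gso_expected_segments(uint32_t pkt_len, uint16_t hdr_len, uint16_t segsz)
   {
   	/* Each segment must have room for at least one payload byte. */
   	assert(segsz > hdr_len);
   	assert(pkt_len >= hdr_len);

   	if (pkt_len <= segsz)
   		return 1;

   	uint32_t payload = pkt_len - hdr_len;           /* bytes to split */
   	uint32_t per_seg = (uint32_t)segsz - hdr_len;   /* payload per segment */

   	/* Round up: the final segment may carry less than per_seg bytes. */
   	return (payload + per_seg - 1) / per_seg;
   }

With a 9014-byte packet, 54 bytes of headers and a ``segsz`` of 1514, the
function yields 7, so the output array handed to ``rte_gso_segment()`` would
need room for at least seven mbuf pointers under these assumptions.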