..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2017 Intel Corporation.

Generic Segmentation Offload (GSO) Library
==========================================

Overview
--------
Generic Segmentation Offload (GSO) is a widely used software implementation of
TCP Segmentation Offload (TSO), which reduces per-packet processing overhead.
Much like TSO, GSO gains performance by enabling upper layer applications to
process a smaller number of large packets (e.g. MTU size of 64KB), instead of
processing higher numbers of small packets (e.g. MTU size of 1500B), thus
reducing per-packet overhead.

For example, GSO allows guest kernel stacks to transmit over-sized TCP segments
that far exceed the kernel interface's MTU; this eliminates the need to segment
packets within the guest, and improves the data-to-overhead ratio of both the
guest-host link and the PCI bus. The expectation of the guest network stack in
this scenario is that segmentation of egress frames will take place in the NIC
hardware or, where that capability is unavailable, in either the host
application or the host network stack.

Bearing that in mind, the GSO library enables DPDK applications to segment
packets in software. Note, however, that GSO is implemented as a standalone
library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
in the underlying hardware); that is, applications must explicitly invoke the
GSO library to segment packets. They must also call ``rte_pktmbuf_free()``
to free the mbufs of the GSO segments after calling ``rte_gso_segment()``.
The size of GSO segments (``segsz``) is configurable by the application.

Limitations
-----------

#. The GSO library doesn't check if input packets have correct checksums.

#. In addition, the GSO library doesn't re-calculate checksums for segmented
   packets (that task is left to the application).

#. IP fragments are unsupported by the GSO library.

#. The egress interface's driver must support multi-segment packets.

#. Currently, the GSO library supports the following IPv4 packet types:

   - TCP
   - UDP
   - VXLAN
   - GRE TCP

   See `Supported GSO Packet Types`_ for further details.

Packet Segmentation
-------------------

The ``rte_gso_segment()`` function is the GSO library's primary
segmentation API.

Before performing segmentation, an application must create a GSO context object
(``struct rte_gso_ctx``), which provides the library with some of the
information required to understand how the packet should be segmented. Refer to
`How to Segment a Packet`_ for additional details. Once the GSO context
has been created and populated, the application can then use the
``rte_gso_segment()`` function to segment packets.

The GSO library typically stores each segment that it creates in two parts: the
first part contains a copy of the original packet's headers, while the second
part contains a pointer to an offset within the original packet. This mechanism
is explained in more detail in `GSO Output Segment Format`_.

The GSO library supports both single- and multi-segment input mbufs.

GSO Output Segment Format
~~~~~~~~~~~~~~~~~~~~~~~~~
To reduce the number of expensive memcpy operations required when segmenting a
packet, the GSO library typically stores each segment that it creates as a
two-part mbuf (technically, this is termed a 'two-segment' mbuf; however, since
the elements produced by the API are also called 'segments', for clarity the
term 'part' is used here instead).

The first part of each output segment is a direct mbuf and contains a copy of
the original packet's headers, which must be prepended to each output segment.
These headers are copied from the original packet into each output segment.

The second part of each output segment represents a section of data from the
original packet, i.e. a data segment. Rather than copy the data directly from
the original packet into the output segment (which would impact performance
considerably), the second part of each output segment is an indirect mbuf,
which contains no actual data, but simply points to an offset within the
original packet.

The combination of the 'header' part and the 'data' part constitutes a
single logical output GSO segment of the original packet. This is illustrated
in :numref:`figure_gso-output-segment-format`.

.. _figure_gso-output-segment-format:

.. figure:: img/gso-output-segment-format.*
   :align: center

   Two-part GSO output segment

In one situation, the output segment may contain additional 'data' parts.
This only occurs when both of the following hold:

- the input packet on which GSO is to be performed is represented by a
  multi-segment mbuf.

- the output segment is required to contain data that spans the boundaries
  between segments of the input multi-segment mbuf.

The GSO library traverses each segment of the input packet, and produces
numerous output segments; for optimal performance, the number of output
segments is kept to a minimum. Consequently, the GSO library maximizes the
amount of data contained within each output segment; i.e. each output segment
contains ``segsz`` bytes of data. The only exception to this is the very
final output segment; if ``pkt_len`` % ``segsz`` is non-zero, then the final
segment is smaller than the rest.
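
As a minimal sketch of this arithmetic (an illustration, not the library's
implementation), the standalone C helpers below compute the number of output
segments and the size of the final one. The names ``demo_gso_seg_count()`` and
``demo_gso_last_seg_len()`` are hypothetical, and ``segsz`` is assumed here to
count only the data bytes carried by each output segment.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Hypothetical helper: number of output segments needed to carry
    * data_len bytes when each segment holds at most segsz data bytes. */
   static inline uint32_t
   demo_gso_seg_count(uint32_t data_len, uint32_t segsz)
   {
       assert(segsz != 0);
       /* Ceiling division written so that it cannot overflow. */
       return data_len / segsz + (data_len % segsz != 0);
   }

   /* Hypothetical helper: data bytes carried by the final output segment. */
   static inline uint32_t
   demo_gso_last_seg_len(uint32_t data_len, uint32_t segsz)
   {
       uint32_t rem;

       assert(segsz != 0);
       if (data_len == 0)
           return 0;
       rem = data_len % segsz;
       /* A zero remainder means the last segment is full-sized. */
       return rem != 0 ? rem : segsz;
   }

For instance, 4000 bytes of data with a ``segsz`` of 1460 yield three output
segments, the last of which carries 1080 bytes.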

In order for an output segment to meet its MSS, it may need to include data from
multiple input segments. Due to the nature of indirect mbufs (each indirect mbuf
can point to only one direct mbuf), the solution here is to add another indirect
mbuf to the output segment; this additional part then points to the next
input segment. If necessary, this chaining process is repeated, until the sum of
all of the data 'contained' in the output segment reaches ``segsz``. This
ensures that the amount of data contained within each output segment is uniform,
with the possible exception of the last segment, as previously described.
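
The chaining rule can be modelled with a short standalone sketch. The function
``demo_gso_indirect_parts()`` and its parameters are hypothetical; it only
counts how many input segments the data range of an output segment overlaps,
which, given that each indirect mbuf can point to only one direct mbuf, equals
the number of indirect parts required. Header bytes are assumed to be excluded
from the lengths passed in.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /*
    * Hypothetical model: in_len[i] is the number of data bytes held by the
    * i-th segment of the input mbuf chain. Returns how many indirect parts
    * the out_idx-th output segment needs to cover the data bytes
    * [out_idx * segsz, (out_idx + 1) * segsz).
    */
   static inline unsigned int
   demo_gso_indirect_parts(const uint32_t *in_len, size_t nb_in,
                           uint32_t segsz, uint32_t out_idx)
   {
       uint64_t start = (uint64_t)out_idx * segsz; /* first byte wanted */
       uint64_t end = start + segsz;               /* one past the last byte */
       uint64_t seg_begin = 0;  /* offset of the current input segment */
       unsigned int parts = 0;
       size_t i;

       assert(segsz != 0);

       for (i = 0; i < nb_in; i++) {
           uint64_t seg_end = seg_begin + in_len[i];

           /* Every non-empty input segment that overlaps the wanted
            * range costs one indirect part. */
           if (in_len[i] != 0 && seg_begin < end && seg_end > start)
               parts++;
           seg_begin = seg_end;
       }
       return parts;
   }

With three input segments of 1000 data bytes each and a ``segsz`` of 1500, both
output segments need two indirect parts, so each becomes a three-part mbuf once
the header part is added.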

:numref:`figure_gso-three-seg-mbuf` illustrates an example of a three-part
output segment. In this example, the output segment needs to include data from
the end of one input segment, and the beginning of another. To achieve this,
an additional indirect mbuf is chained to the second part of the output segment,
and is attached to the next input segment (i.e. it points to the data in the
next input segment).

.. _figure_gso-three-seg-mbuf:

.. figure:: img/gso-three-seg-mbuf.*
   :align: center

   Three-part GSO output segment

Supported GSO Packet Types
--------------------------

TCP/IPv4 GSO
~~~~~~~~~~~~
TCP/IPv4 GSO supports segmentation of suitably large TCP/IPv4 packets, which
may also contain an optional VLAN tag.

UDP/IPv4 GSO
~~~~~~~~~~~~
UDP/IPv4 GSO supports segmentation of suitably large UDP/IPv4 packets, which
may also contain an optional VLAN tag. UDP GSO is the same as IP fragmentation.
Specifically, UDP GSO treats the UDP header as a part of the payload and
does not modify it during segmentation. Therefore, after UDP GSO, only the
first output packet has the original UDP header, and the others have only L2
and L3 headers.
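
Because UDP GSO is described as equivalent to IP fragmentation, the standard
IPv4 fragmentation fields give a feel for what differs between the output
packets. The sketch below illustrates generic IPv4 fragmentation only (offsets
expressed in 8-byte units plus a 'more fragments' bit); it is not a description
of the library's internals, and ``demo_ipv4_frag_field()`` is a hypothetical
name.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   #define DEMO_IPV4_MF_FLAG 0x2000u /* 'more fragments' bit, host order */

   /*
    * Hypothetical helper: value of the 16-bit IPv4 flags/fragment-offset
    * field (host byte order) for fragment frag_idx, where every fragment
    * except the last carries frag_payload bytes of the L3 payload.
    */
   static inline uint16_t
   demo_ipv4_frag_field(uint32_t l3_payload_len, uint32_t frag_payload,
                        uint32_t frag_idx)
   {
       uint32_t offset;
       uint16_t field;

       /* Offsets are stored in 8-byte units, so all but the last
        * fragment must carry a multiple of 8 bytes. */
       assert(frag_payload != 0 && frag_payload % 8 == 0);
       assert(l3_payload_len <= 65535);

       offset = frag_idx * frag_payload;
       assert(offset < l3_payload_len);

       field = (uint16_t)(offset / 8);
       if (offset + frag_payload < l3_payload_len)
           field |= DEMO_IPV4_MF_FLAG; /* more data follows */
       return field;
   }

For a 3008-byte L3 payload (an 8-byte UDP header plus 3000 data bytes) split
into 1480-byte fragments, the three fragments get the field values 0x2000,
0x20B9 and 0x0172; only the first fragment contains the UDP header.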

VXLAN IPv4 GSO
~~~~~~~~~~~~~~
VXLAN IPv4 GSO supports segmentation of suitably large VXLAN packets,
which contain an outer IPv4 header, inner TCP/IPv4 or UDP/IPv4 headers, and
optional inner and/or outer VLAN tag(s).

GRE TCP/IPv4 GSO
~~~~~~~~~~~~~~~~
GRE GSO supports segmentation of suitably large GRE packets, which contain
an outer IPv4 header, inner TCP/IPv4 headers, and an optional VLAN tag.

How to Segment a Packet
-----------------------

To segment an outgoing packet, an application must:

#. First create a GSO context (``struct rte_gso_ctx``); this contains:

   - a pointer to the mbuf pool for allocating the direct buffers, which are
     used to store the GSO segments' packet headers.

   - a pointer to the mbuf pool for allocating indirect buffers, which are
     used to locate GSO segments' packet payloads.

     .. note::

       An application may use the same pool for both direct and indirect
       buffers. However, since indirect mbufs simply store a pointer, the
       application may reduce its memory consumption by creating a separate memory
       pool, containing smaller elements, for the indirect pool.

   - the size of each output segment, including packet headers and payload,
     measured in bytes.

   - the bit mask of required GSO types. The GSO library uses the same macros as
     those that describe a physical device's TX offloading capabilities (i.e.
     ``RTE_ETH_TX_OFFLOAD_*_TSO``) for ``gso_types``. For example, if an application
     wants to segment TCP/IPv4 packets, it should set ``gso_types`` to
     ``RTE_ETH_TX_OFFLOAD_TCP_TSO``. The only other values currently
     supported for ``gso_types`` are ``RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO`` and
     ``RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO``; a combination of these macros is also
     allowed.

   - a flag that indicates whether the IPv4 headers of output segments should
     contain fixed or incremental ID values.

#. Set the appropriate ``ol_flags`` in the mbuf.

   - The GSO library uses the value of an mbuf's ``ol_flags`` attribute to
     determine how a packet should be segmented. It is the application's
     responsibility to ensure that these flags are set.

   - For example, in order to segment TCP/IPv4 packets, the application should
     add the ``RTE_MBUF_F_TX_IPV4`` and ``RTE_MBUF_F_TX_TCP_SEG`` flags to the mbuf's
     ``ol_flags``.

   - If checksum calculation in hardware is required, the application should
     also add the ``RTE_MBUF_F_TX_TCP_CKSUM`` and ``RTE_MBUF_F_TX_IP_CKSUM`` flags.

#. Check if the packet should be processed. Packets with one of the
   following properties are not processed and are returned immediately:

   - Packet length is less than ``segsz`` (i.e. GSO is not required).

   - Packet type is not supported by GSO library (see
     `Supported GSO Packet Types`_).

   - Application has not enabled GSO support for the packet type.

   - Packet's ``ol_flags`` have been incorrectly set.

#. Allocate space in which to store the output GSO segments. If the amount of
   space allocated by the application is insufficient, segmentation will fail.

#. Invoke the GSO segmentation API, ``rte_gso_segment()``.

#. Call ``rte_pktmbuf_free()`` to free the mbufs of the segments produced by
   ``rte_gso_segment()``.

#. If required, update the L3 and L4 checksums of the newly-created segments.
   For tunneled packets, the outer IPv4 headers' checksums should also be
   updated. Alternatively, the application may offload checksum calculation
   to HW.
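
The checks in step 3 and the sizing decision in step 4 can be approximated by
the following standalone sketch, which decides whether a packet would be
segmented and, if so, how many output slots to reserve. Everything prefixed
with ``demo_`` or ``DEMO_`` is hypothetical and exists only for this
illustration; the library performs its own checks, and a real application would
use the ``RTE_ETH_TX_OFFLOAD_*_TSO`` macros. The sketch assumes that the segment
size includes the headers, as stated in step 1.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Hypothetical type bits for this sketch only. */
   #define DEMO_GSO_TCP4  (1u << 0)
   #define DEMO_GSO_VXLAN (1u << 1)
   #define DEMO_GSO_GRE   (1u << 2)

   /*
    * Returns 0 when the packet would be handed back unsegmented (too short,
    * or its type is not enabled), otherwise the number of output segments
    * the application should reserve room for. gso_size counts headers plus
    * payload; hdr_len is the total length of the headers copied into every
    * output segment.
    */
   static inline uint32_t
   demo_gso_plan(uint32_t pkt_len, uint32_t hdr_len, uint32_t gso_size,
                 uint32_t pkt_type, uint32_t enabled_types)
   {
       uint32_t payload, per_seg;

       assert(hdr_len < gso_size && hdr_len <= pkt_len);

       if (pkt_len < gso_size)              /* GSO is not required */
           return 0;
       if ((pkt_type & enabled_types) == 0) /* type not enabled */
           return 0;

       payload = pkt_len - hdr_len;  /* bytes to spread over segments */
       per_seg = gso_size - hdr_len; /* payload bytes per segment */
       return payload / per_seg + (payload % per_seg != 0);
   }

For example, a 9054-byte TCP/IPv4 packet with 54 bytes of headers and a segment
size of 1514 bytes needs room for seven output segments, whereas a 1000-byte
packet is left untouched.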