..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2015 Intel Corporation.

Link Bonding Poll Mode Driver Library
=====================================

In addition to Poll Mode Drivers (PMDs) for physical and virtual hardware,
DPDK also includes a pure-software library that
allows physical PMDs to be bonded together to create a single logical PMD.

.. figure:: img/bond-overview.*

   Bonding PMDs


The Link Bonding PMD library (librte_net_bond) supports bonding of groups of
``rte_eth_dev`` ports of the same speed and duplex to provide capabilities
similar to those of the Linux bonding driver, allowing the aggregation
of multiple (member) NICs into a single logical interface between a server
and a switch. The new bonding PMD then processes these interfaces based on
the mode of operation specified to provide support for features such as
redundant links, fault tolerance and/or load balancing.

The librte_net_bond library exports a C API for the
creation of bonding devices as well as the configuration and management of the
bonding device and its member devices.

.. note::

    The Link Bonding PMD Library is enabled by default in the build
    configuration. The library can be disabled using the meson option
    "-Ddisable_drivers=net/bonding".


Link Bonding Modes Overview
---------------------------

Currently the Link Bonding PMD library supports the following modes of operation:

*   **Round-Robin (Mode 0):**

.. figure:: img/bond-mode-0.*

   Round-Robin (Mode 0)


    This mode provides load balancing and fault tolerance by transmission of
    packets in sequential order from the first available member device through
    the last. Packets are bulk dequeued from devices then serviced in a
    round-robin manner. This mode does not guarantee in-order reception of
    packets, so downstream processing should be able to handle out-of-order
    packets.

*   **Active Backup (Mode 1):**

.. figure:: img/bond-mode-1.*

   Active Backup (Mode 1)


    In this mode only one member in the bond is active at any time. A different
    member becomes active if, and only if, the primary active member fails,
    thereby providing fault tolerance to member failure. The single logical
    bonding interface's MAC address is externally visible on only one NIC (port)
    to avoid confusing the network switch.

*   **Balance XOR (Mode 2):**

.. figure:: img/bond-mode-2.*

   Balance XOR (Mode 2)


    This mode provides transmit load balancing (based on the selected
    transmission policy) and fault tolerance. The default policy (layer 2) uses
    a simple calculation based on the packet flow source and destination MAC
    addresses as well as the number of active members available to the bonding
    device to classify the packet to a specific member to transmit on. Two
    alternate transmission policies are supported: layer 2+3, which also takes
    the IP source and destination addresses into the calculation of the
    transmit member port, and layer 3+4, which uses the IP source and
    destination addresses as well as the TCP/UDP source and destination ports.

.. note::
    The color differences of the packets are used to identify the different flow
    classifications calculated by the selected transmit policy.


*   **Broadcast (Mode 3):**

.. figure:: img/bond-mode-3.*

   Broadcast (Mode 3)


    This mode provides fault tolerance by transmission of packets on all member
    ports.

*   **Link Aggregation 802.3AD (Mode 4):**

.. figure:: img/bond-mode-4.*

   Link Aggregation 802.3AD (Mode 4)


    This mode provides dynamic link aggregation according to the 802.3ad
    specification. It negotiates and monitors aggregation groups that share the
    same speed and duplex settings using the selected balance transmit policy
    for balancing outgoing traffic.

    The DPDK implementation of this mode places some additional requirements on
    the application.

    #. It needs to call ``rte_eth_tx_burst`` and ``rte_eth_rx_burst`` at
       intervals of less than 100ms.

    #. Calls to ``rte_eth_tx_burst`` must have a buffer size of at least 2xN,
       where N is the number of members. This space is required for LACP
       frames. Additionally, LACP packets are included in the statistics, but
       they are not returned to the application.

*   **Transmit Load Balancing (Mode 5):**

.. figure:: img/bond-mode-5.*

   Transmit Load Balancing (Mode 5)


    This mode provides adaptive transmit load balancing. It dynamically
    changes the transmitting member according to the computed load. Statistics
    are collected in 100ms intervals and scheduled every 10ms.
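
As an illustration of the Round-Robin (Mode 0) transmit behavior described
above, the following minimal sketch assigns the packets of a burst to the
active members in sequential order. It is a standalone simplification and not
the librte_net_bond implementation: ``rr_state``, ``rr_next_member`` and
``rr_distribute`` are hypothetical names, and packets are represented only by
their position in the burst.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical state: index of the member that gets the next packet. */
    struct rr_state {
        uint16_t next;
    };

    /* Return the member index for the next packet and advance the cursor. */
    static uint16_t
    rr_next_member(struct rr_state *st, uint16_t nb_active_members)
    {
        uint16_t member;

        assert(nb_active_members > 0);
        member = (uint16_t)(st->next % nb_active_members);
        st->next = (uint16_t)((member + 1) % nb_active_members);
        return member;
    }

    /* Assign each of nb_pkts packets to a member; member_of[i] gets the choice. */
    static void
    rr_distribute(struct rr_state *st, uint16_t nb_active_members,
                  uint16_t *member_of, size_t nb_pkts)
    {
        size_t i;

        for (i = 0; i < nb_pkts; i++)
            member_of[i] = rr_next_member(st, nb_active_members);
    }

Because the cursor is kept between bursts, consecutive bursts continue where
the previous one stopped, which is also why packets of one flow can leave on
different members and arrive out of order.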


Implementation Details
----------------------

The librte_net_bond bonding device is compatible with the Ethernet device API
exported by the Ethernet PMDs described in the *DPDK API Reference*.

The Link Bonding Library supports the creation of bonding devices at application
startup time during EAL initialization using the ``--vdev`` option as well as
programmatically via the C API ``rte_eth_bond_create`` function.

Bonding devices support the dynamic addition and removal of member devices using
the ``rte_eth_bond_member_add`` / ``rte_eth_bond_member_remove`` APIs.

After a member device is added to a bonding device, the member is stopped using
``rte_eth_dev_stop`` and then reconfigured using ``rte_eth_dev_configure``.
The RX and TX queues are also reconfigured using ``rte_eth_tx_queue_setup`` /
``rte_eth_rx_queue_setup`` with the parameters used to configure the bonding
device. If RSS is enabled for the bonding device, this mode is also enabled on
the new member and configured as well.
Any flow which was configured on the bonding device is also configured on the
added member.

Setting up multi-queue mode for the bonding device to RSS makes it fully
RSS-capable, so all members are synchronized with its configuration. This mode is
intended to provide RSS configuration on members transparently to the client
application implementation.

The bonding device stores its own version of the RSS settings, i.e. RETA, RSS
hash function and RSS key, used to set up its members. This allows the RSS
configuration of the bonding device to be defined as the desired configuration
of the whole bond (as one unit), without referring to any individual member.
This is required to ensure consistency and makes it more error-proof.

The RSS hash function set for a bonding device is the maximal set of RSS hash
functions supported by all bonding members. The RETA size is the GCD of all its
members' RETA sizes, so it can be easily used as a pattern providing the
expected behavior, even if the member RETA sizes are different. If the RSS key
is not set for the bonding device, it is not changed on the members and the
default key for each device is used.
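
The derivation of the bonding device RSS settings from its members can be
sketched as follows. This is an illustrative, standalone example and not the
driver code: ``member_rss_caps``, ``gcd_u16`` and ``bond_rss_caps`` are
hypothetical names, and the hash functions are assumed to be represented as a
bitmask so that "supported by all members" becomes a bitwise AND.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical per-member RSS capabilities (illustrative only). */
    struct member_rss_caps {
        uint64_t hash_funcs; /* bitmask of supported RSS hash functions */
        uint16_t reta_size;  /* size of the member's redirection table */
    };

    /* Greatest common divisor (Euclid's algorithm). */
    static uint16_t
    gcd_u16(uint16_t a, uint16_t b)
    {
        while (b != 0) {
            uint16_t t = (uint16_t)(a % b);

            a = b;
            b = t;
        }
        return a;
    }

    /*
     * Derive the bonding device settings from its members: the hash
     * functions supported by every member and the GCD of the RETA sizes.
     */
    static struct member_rss_caps
    bond_rss_caps(const struct member_rss_caps *members, size_t nb_members)
    {
        struct member_rss_caps bond;
        size_t i;

        assert(nb_members > 0);
        bond = members[0];
        for (i = 1; i < nb_members; i++) {
            bond.hash_funcs &= members[i].hash_funcs;
            bond.reta_size = gcd_u16(bond.reta_size, members[i].reta_size);
        }
        return bond;
    }

For members with RETA sizes of 128, 512 and 64 entries, the sketch yields a
bonding RETA size of 64, which divides every member table evenly and can
therefore be repeated as a pattern across each of them.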

As with the RSS configuration, flow consistency across the bonding members is
maintained for the following rte flow operations:

Validate:
	- Validate the flow for each member; failure for at least one member
	  causes the bond validation to fail.

Create:
	- Create the flow in all members.
	- Save all the flow objects created on the members in the bonding
	  internal flow structure.
	- Failure in flow creation for an existing member rejects the flow.
	- Failure in flow creation for a new member at member adding time rejects
	  the member.

Destroy:
	- Destroy the flow in all members and release the bond internal flow
	  memory.

Flush:
	- Destroy all the bonding PMD flows in all the members.

.. note::

    Do not call flush directly on the members. It destroys all the member flows,
    which may include external flows or the bond internal LACP flow.

Query:
	- Sum the flow counters from all the members; relevant only for
	  ``RTE_FLOW_ACTION_TYPE_COUNT``.

Isolate:
	- Call flow isolate for all members.
	- Failure in flow isolation for an existing member rejects the isolate mode.
	- Failure in flow isolation for a new member at member adding time rejects
	  the member.

All settings are managed through the bonding port API and are always propagated
in one direction (from bonding to members).
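
The validate and query rules above can be pictured with the following
standalone sketch. It does not call the rte flow API: the per-member results
are passed in as plain arrays, and ``flow_count``, ``bond_flow_validate`` and
``bond_flow_query`` are hypothetical names introduced only for illustration.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical per-member counters, as a COUNT action might report them. */
    struct flow_count {
        uint64_t hits;
        uint64_t bytes;
    };

    /*
     * Validate: per_member_rc[i] is the validation result of member i
     * (0 on success). The first member failure fails the bond validation.
     */
    static int
    bond_flow_validate(const int *per_member_rc, size_t nb_members)
    {
        size_t i;

        for (i = 0; i < nb_members; i++)
            if (per_member_rc[i] != 0)
                return per_member_rc[i];
        return 0;
    }

    /* Query: the bond-level counters are the sum over all members. */
    static struct flow_count
    bond_flow_query(const struct flow_count *per_member, size_t nb_members)
    {
        struct flow_count total = { 0, 0 };
        size_t i;

        for (i = 0; i < nb_members; i++) {
            total.hits += per_member[i].hits;
            total.bytes += per_member[i].bytes;
        }
        return total;
    }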

Link Status Change Interrupts / Polling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Link bonding devices support the registration of a link status change callback
using the ``rte_eth_dev_callback_register`` API. The callback is called when the
status of the bonding device changes. For example, in the case of a bonding
device which has 3 members, the link status will change to up when one member
becomes active or change to down when all members become inactive. There is no
callback notification when a single member changes state and the previous
conditions are not met. If a user wishes to monitor individual members then they
must register callbacks with those members directly.
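
The rule for when the bonding device status changes can be expressed as a small
standalone sketch. The names ``bond_link_is_up`` and
``bond_member_link_change`` are hypothetical and member link state is modeled
as a plain boolean array; the sketch only mirrors the behavior described above.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* The bonding device link is up if at least one member link is up. */
    static bool
    bond_link_is_up(const bool *member_up, size_t nb_members)
    {
        size_t i;

        for (i = 0; i < nb_members; i++)
            if (member_up[i])
                return true;
        return false;
    }

    /*
     * Apply a member link change and report whether the bonding device
     * status changed, i.e. whether a bond-level callback would be expected.
     */
    static bool
    bond_member_link_change(bool *member_up, size_t nb_members,
                            size_t member, bool up)
    {
        bool before = bond_link_is_up(member_up, nb_members);

        member_up[member] = up;
        return bond_link_is_up(member_up, nb_members) != before;
    }

With three members that all start down, bringing the first member up changes
the bond status, bringing a second member up does not, and only taking the last
active member down changes it again.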

The link bonding library also supports devices which do not implement link
status change interrupts. This is achieved by polling the device's link status at
a defined period which is set using the ``rte_eth_bond_link_monitoring_set``
API; the default polling interval is 10ms. When a device is added as a member to
a bonding device, the ``RTE_PCI_DRV_INTR_LSC`` flag is used to determine
whether the device supports interrupts or whether the link status should be
monitored by polling it.

Requirements / Limitations
~~~~~~~~~~~~~~~~~~~~~~~~~~

The current implementation only supports devices that support the same speed
and duplex to be added as members to the same bonding device. The bonding device
inherits these attributes from the first active member added to the bonding
device and then all further members added to the bonding device must support
these parameters.

A bonding device must have a minimum of one member before the bonding device
itself can be started.

To use the dynamic RSS configuration feature of a bonding device effectively,
it is also required that all members are RSS-capable and support at least one
common hash function. Changing the RSS key is only
possible when all member devices support the same key size.

To prevent inconsistency in how members process packets, once a device is added
to a bonding device, RSS and rte flow configurations should be managed through
the bonding device API, and not directly on the member.

As with all other PMDs, all functions exported by a PMD are lock-free functions
that are assumed not to be invoked in parallel on different logical cores to
work on the same target object.

It should also be noted that the PMD receive function should not be invoked
directly on member devices after they have been added to a bonding device, since
packets read directly from the member device will no longer be available to the
bonding device to read.

Configuration
~~~~~~~~~~~~~

Link bonding devices are created using the ``rte_eth_bond_create`` API
which requires a unique device name, the bonding mode,
and the socket ID to allocate the bonding device's resources on.
The other configurable parameters for a bonding device are its member devices,
its primary member, a user defined MAC address and the transmission policy to
use if the device is in balance XOR mode.

Member Devices
^^^^^^^^^^^^^^

Bonding devices support up to a maximum of ``RTE_MAX_ETHPORTS`` member devices
of the same speed and duplex. An Ethernet device can be added as a member to a
maximum of one bonding device. Member devices are reconfigured with the
configuration of the bonding device on being added to a bonding device.

The bonding device also guarantees to return the MAC address of a member device
to its original value on removal of the member from it.

Primary Member
^^^^^^^^^^^^^^

The primary member is used to define the default port to use when a bonding
device is in active backup mode. A different port will be used if, and
only if, the current primary port goes down. If the user does not specify a
primary port it will default to being the first port added to the bonding device.
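
A minimal sketch of this selection rule is shown below. It is not the driver
implementation: ``active_backup_select`` is a hypothetical helper, link state
is a plain boolean array, and the choice of the first other member whose link
is up as the fallback is an assumption, since the text above does not state
which member takes over.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Pick the member to use in active backup mode. Returns the primary
     * while its link is up; otherwise the first member whose link is up
     * (assumed fallback order), or -1 if no member is usable.
     */
    static int
    active_backup_select(const bool *member_up, size_t nb_members,
                         size_t primary)
    {
        size_t i;

        if (primary < nb_members && member_up[primary])
            return (int)primary;
        for (i = 0; i < nb_members; i++)
            if (member_up[i])
                return (int)i;
        return -1;
    }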

MAC Address
^^^^^^^^^^^

The bonding device can be configured with a user specified MAC address. This
address will be inherited by some or all of the member devices depending on the
operating mode. If the device is in active backup mode then only the primary
device will have the user specified MAC; all other members will retain their
original MAC address. In modes 0, 2, 3 and 4, all member devices are configured
with the bonding device's MAC address.

If a user defined MAC address is not defined then the bonding device will
default to using the primary member's MAC address.
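
The per-mode rule above can be summarized in a few lines of standalone C. The
enumeration and ``member_uses_bond_mac`` are hypothetical names that only
restate the text; mode 5 is not covered by the description above and is
therefore deliberately left out of the sketch.

.. code-block:: c

    #include <stdbool.h>

    /* Mode numbers as listed in this guide (names are illustrative). */
    enum bond_mode {
        MODE_ROUND_ROBIN = 0,
        MODE_ACTIVE_BACKUP = 1,
        MODE_BALANCE = 2,
        MODE_BROADCAST = 3,
        MODE_8023AD = 4
    };

    /* Return true if a member is given the bonding device MAC address. */
    static bool
    member_uses_bond_mac(enum bond_mode mode, bool is_primary)
    {
        switch (mode) {
        case MODE_ACTIVE_BACKUP:
            return is_primary; /* only the primary takes the bond MAC */
        case MODE_ROUND_ROBIN:
        case MODE_BALANCE:
        case MODE_BROADCAST:
        case MODE_8023AD:
            return true; /* every member takes the bond MAC */
        }
        return false; /* modes not described above */
    }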

Balance XOR Transmit Policies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are 3 supported transmission policies for a bonding device running in
Balance XOR mode: Layer 2, Layer 2+3 and Layer 3+4.

*   **Layer 2:**   Ethernet MAC address based balancing is the default
    transmission policy for Balance XOR bonding mode. It uses a simple XOR
    calculation on the source MAC address and destination MAC address of the
    packet and then calculates the modulus of this value to determine the member
    device to transmit the packet on.

*   **Layer 2 + 3:** Ethernet MAC address & IP Address based balancing uses a
    combination of source/destination MAC addresses and the source/destination
    IP addresses of the data packet to decide which member port the packet will
    be transmitted on.

*   **Layer 3 + 4:**  IP Address & UDP Port based balancing uses a combination
    of source/destination IP Address and the source/destination UDP ports of
    the data packet to decide which member port the packet will be
    transmitted on.

All these policies support 802.1Q VLAN Ethernet packets, as well as IPv4, IPv6
and UDP protocols for load balancing.
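
To make the Layer 2 calculation concrete, the following standalone sketch XORs
the two MAC addresses and takes the modulus by the number of active members.
It is a simplified illustration under stated assumptions rather than the hash
used by the driver: ``l2_xor_member`` is a hypothetical name, and folding the
six XORed bytes into a single byte is one possible way to reduce the result.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    #define MAC_ADDR_LEN 6

    /*
     * Simplified layer 2 policy: XOR the source and destination MAC
     * addresses byte by byte, fold the result into one value, then take
     * the modulus by the number of active members.
     */
    static uint16_t
    l2_xor_member(const uint8_t src[MAC_ADDR_LEN],
                  const uint8_t dst[MAC_ADDR_LEN],
                  uint16_t nb_active_members)
    {
        uint8_t hash = 0;
        int i;

        assert(nb_active_members > 0);
        for (i = 0; i < MAC_ADDR_LEN; i++)
            hash ^= (uint8_t)(src[i] ^ dst[i]);
        return (uint16_t)(hash % nb_active_members);
    }

Because XOR is symmetric, both directions of a conversation map to the same
member index, and all packets between the same pair of MAC addresses stay on
one member as long as the number of active members does not change.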

Using Link Bonding Devices
--------------------------

The librte_net_bond library supports two modes of device creation: using the
library's full C API, or using the EAL command line to statically configure link
bonding devices at application startup. Using the EAL option it is possible to
use link bonding functionality transparently without specific knowledge of the
library's API. This can be used, for example, to add bonding functionality,
such as active backup, to an existing application which has no knowledge of
the link bonding C API.

Using the Poll Mode Driver from an Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using the librte_net_bond library's API it is possible to dynamically create
and manage link bonding devices from within any application. Link bonding
devices are created using the ``rte_eth_bond_create`` API which requires a
unique device name, the link bonding mode to initialize the device in and
finally the socket ID on which to allocate the device's resources. After
successful creation of a bonding device it must be configured using the generic
Ethernet device configure API ``rte_eth_dev_configure`` and then the RX and TX
queues which will be used must be set up using ``rte_eth_tx_queue_setup`` /
``rte_eth_rx_queue_setup``.

Member devices can be dynamically added and removed from a link bonding device
using the ``rte_eth_bond_member_add`` / ``rte_eth_bond_member_remove``
APIs, but at least one member device must be added to the link bonding device
before it can be started using ``rte_eth_dev_start``.

The link status of a bonding device is dictated by that of its members. If the
link status of all member devices is down, or if all members are removed from
the link bonding device, then the link status of the bonding device will go down.

It is also possible to configure / query the configuration of the control
parameters of a bonding device using the provided APIs
``rte_eth_bond_mode_set/get``, ``rte_eth_bond_primary_set/get``,
``rte_eth_bond_mac_set/reset`` and ``rte_eth_bond_xmit_policy_set/get``.

Using Link Bonding Devices from the EAL Command Line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Link bonding devices can be created at application startup time using the
``--vdev`` EAL command line option. The device name must start with the
net_bonding prefix followed by numbers or letters. The name must be unique for
each device. Each device can have multiple options arranged in a comma
separated list. Multiple device definitions can be arranged by calling the
``--vdev`` option multiple times.
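
For applications that assemble their EAL arguments at run time, the comma
separated format can be produced with a small helper such as the one sketched
below. ``build_bond_vdev`` is a hypothetical function and not part of DPDK; it
only concatenates the device name with ``mode`` and ``member`` options using
standard C.

.. code-block:: c

    #include <stddef.h>
    #include <stdio.h>

    /*
     * Build a vdev argument of the form "name,mode=M,member=A,member=B".
     * Returns the string length, or -1 if buf is too small.
     */
    static int
    build_bond_vdev(char *buf, size_t len, const char *name, int mode,
                    const char *const *members, size_t nb_members)
    {
        size_t i;
        int n = snprintf(buf, len, "%s,mode=%d", name, mode);

        if (n < 0 || (size_t)n >= len)
            return -1;
        for (i = 0; i < nb_members; i++) {
            int m = snprintf(buf + n, len - (size_t)n, ",member=%s",
                             members[i]);

            if (m < 0 || (size_t)m >= len - (size_t)n)
                return -1;
            n += m;
        }
        return n;
    }

The resulting string is what would follow ``--vdev`` on the command line; any
further options are appended in the same ``,key=value`` form.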

Device names and bonding options must be separated by commas as shown below:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 --vdev 'net_bonding0,bond_opt0=..,bond_opt1=..' --vdev 'net_bonding1,bond_opt0=..,bond_opt1=..'

Link Bonding EAL Options
^^^^^^^^^^^^^^^^^^^^^^^^

Device definitions can be written and combined in multiple ways as
long as the following rules are respected:

*   A unique device name, in the format of net_bondingX is provided,
    where X can be any combination of numbers and/or letters,
    and the name is no greater than 32 characters long.

*   At least one member device is provided for each bonding device definition.

*   The operation mode of the bonding device being created is provided.
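
The naming rule in the first item can be checked before the argument is passed
to EAL. The sketch below is a hypothetical helper, not a DPDK function; it
assumes that the suffix X must contain at least one character and that the
32 character limit applies to the whole name including the prefix.

.. code-block:: c

    #include <ctype.h>
    #include <stdbool.h>
    #include <string.h>

    #define BOND_NAME_PREFIX "net_bonding"
    #define BOND_NAME_MAX_LEN 32

    /* Check a name: required prefix, alphanumeric suffix, length limit. */
    static bool
    bond_name_is_valid(const char *name)
    {
        size_t prefix_len = strlen(BOND_NAME_PREFIX);
        size_t len = strlen(name);
        size_t i;

        if (len <= prefix_len || len > BOND_NAME_MAX_LEN)
            return false;
        if (strncmp(name, BOND_NAME_PREFIX, prefix_len) != 0)
            return false;
        for (i = prefix_len; i < len; i++)
            if (!isalnum((unsigned char)name[i]))
                return false;
        return true;
    }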
396fc1f2750SBernard Iremonger
397fc1f2750SBernard IremongerThe different options are:
398fc1f2750SBernard Iremonger
399fc1f2750SBernard Iremonger*   mode: Integer value defining the bonding mode of the device.
400b0152b1bSDeclan Doherty    Currently supports modes 0,1,2,3,4,5 (round-robin, active backup, balance,
401b0152b1bSDeclan Doherty    broadcast, link aggregation, transmit load balancing).
402b0152b1bSDeclan Doherty
403b0152b1bSDeclan Doherty.. code-block:: console
404fc1f2750SBernard Iremonger
405fc1f2750SBernard Iremonger        mode=2
406fc1f2750SBernard Iremonger
407*4f840086SLong Wu*   member: Defines the PMD device which will be added as member to the bonding
408f6d690f2STomasz Kulasek    device. This option can be selected multiple times, for each device to be
40915e34522SLong Wu    added as a member. Physical devices should be specified using their PCI
410b0152b1bSDeclan Doherty    address, in the format domain:bus:devid.function
411b0152b1bSDeclan Doherty
412b0152b1bSDeclan Doherty.. code-block:: console
413fc1f2750SBernard Iremonger
41415e34522SLong Wu        member=0000:0a:00.0,member=0000:0a:00.1
415fc1f2750SBernard Iremonger
41615e34522SLong Wu*   primary: Optional parameter which defines the primary member port,
41715e34522SLong Wu    is used in active backup mode to select the primary member for data TX/RX if
418b0152b1bSDeclan Doherty    it is available. The primary port also is used to select the MAC address to
41915e34522SLong Wu    use when it is not defined by the user. This defaults to the first member
42015e34522SLong Wu    added to the device if it is specified. The primary device must be a member
421*4f840086SLong Wu    of the bonding device.
422b0152b1bSDeclan Doherty
423b0152b1bSDeclan Doherty.. code-block:: console
424fc1f2750SBernard Iremonger
425fc1f2750SBernard Iremonger        primary=0000:0a:00.0
426fc1f2750SBernard Iremonger
427b0152b1bSDeclan Doherty*   socket_id: Optional parameter used to select which socket on a NUMA device
428*4f840086SLong Wu    the bonding devices resources will be allocated on.
429b0152b1bSDeclan Doherty
430b0152b1bSDeclan Doherty.. code-block:: console
431fc1f2750SBernard Iremonger
432fc1f2750SBernard Iremonger        socket_id=0
433fc1f2750SBernard Iremonger
434b0152b1bSDeclan Doherty*   mac: Optional parameter to select a MAC address for link bonding device,
43515e34522SLong Wu    this overrides the value of the primary member device.
436b0152b1bSDeclan Doherty
437b0152b1bSDeclan Doherty.. code-block:: console
438fc1f2750SBernard Iremonger
439fc1f2750SBernard Iremonger        mac=00:1e:67:1d:fd:1d
440fc1f2750SBernard Iremonger
441b0152b1bSDeclan Doherty*   xmit_policy: Optional parameter which defines the transmission policy when
442*4f840086SLong Wu    the bonding device is in  balance mode. If not user specified this defaults
443b0152b1bSDeclan Doherty    to l2 (layer 2) forwarding, the other transmission policies available are
444b0152b1bSDeclan Doherty    l23 (layer 2+3) and l34 (layer 3+4)
445fc1f2750SBernard Iremonger
446b0152b1bSDeclan Doherty.. code-block:: console
447b0152b1bSDeclan Doherty
448b0152b1bSDeclan Doherty        xmit_policy=l23
449b0152b1bSDeclan Doherty
450b0152b1bSDeclan Doherty*   lsc_poll_period_ms: Optional parameter which defines the polling interval
451b0152b1bSDeclan Doherty    in milli-seconds at which devices which don't support lsc interrupts are
452b0152b1bSDeclan Doherty    checked for a change in the devices link status
453b0152b1bSDeclan Doherty
454b0152b1bSDeclan Doherty.. code-block:: console
455b0152b1bSDeclan Doherty
456b0152b1bSDeclan Doherty        lsc_poll_period_ms=100
457b0152b1bSDeclan Doherty
458b0152b1bSDeclan Doherty*   up_delay: Optional parameter which adds a delay in milli-seconds to the
459b0152b1bSDeclan Doherty    propagation of a devices link status changing to up, by default this
460b0152b1bSDeclan Doherty    parameter is zero.
461b0152b1bSDeclan Doherty
462b0152b1bSDeclan Doherty.. code-block:: console
463b0152b1bSDeclan Doherty
464b0152b1bSDeclan Doherty        up_delay=10
465b0152b1bSDeclan Doherty
466b0152b1bSDeclan Doherty*   down_delay: Optional parameter which adds a delay in milli-seconds to the
467b0152b1bSDeclan Doherty    propagation of a devices link status changing to down, by default this
468b0152b1bSDeclan Doherty    parameter is zero.
469b0152b1bSDeclan Doherty
470b0152b1bSDeclan Doherty.. code-block:: console
471b0152b1bSDeclan Doherty
472b0152b1bSDeclan Doherty        down_delay=50

Examples of Usage
^^^^^^^^^^^^^^^^^

Create a bonding device in round robin mode with two members specified by their PCI addresses:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 --vdev 'net_bonding0,mode=0,member=0000:0a:00.01,member=0000:04:00.00' -- --port-topology=chained

Create a bonding device in round robin mode with two members specified by their PCI addresses and an overriding MAC address:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 --vdev 'net_bonding0,mode=0,member=0000:0a:00.01,member=0000:04:00.00,mac=00:1e:67:1d:fd:1d' -- --port-topology=chained

Create a bonding device in active backup mode with two members and a primary member, all specified by their PCI addresses:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 --vdev 'net_bonding0,mode=1,member=0000:0a:00.01,member=0000:04:00.00,primary=0000:0a:00.01' -- --port-topology=chained

Create a bonding device in balance mode with two members specified by their PCI addresses, and a transmission policy of layer 3 + 4 forwarding:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 --vdev 'net_bonding0,mode=2,member=0000:0a:00.01,member=0000:04:00.00,xmit_policy=l34' -- --port-topology=chained
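
The command lines above all follow the same pattern: a device name, a mode,
one ``member`` entry per device, and then any optional parameters. The helper
below is a minimal sketch of how a launch script might assemble that ``--vdev``
value. The function name ``bonding_vdev_arg`` is hypothetical and the helper
does no validation of the values it is given.

```python
def bonding_vdev_arg(name, mode, members, **options):
    """Build the value passed to --vdev for a bonding device.

    members is a list of PCI addresses, each emitted as its own
    "member=" entry. options holds optional key=value parameters such as
    primary, mac, xmit_policy, lsc_poll_period_ms, up_delay or down_delay.
    """
    parts = [name, f"mode={mode}"]
    parts += [f"member={member}" for member in members]
    parts += [f"{key}={value}" for key, value in options.items()]
    return ",".join(parts)


# Reproduces the active backup example: two members, one of them primary.
arg = bonding_vdev_arg(
    "net_bonding0",
    1,
    ["0000:0a:00.01", "0000:04:00.00"],
    primary="0000:0a:00.01",
)
print(f"--vdev '{arg}'")
```

Keeping the members in a list makes it straightforward to emit one ``member=``
entry per device, which is how the examples above pass multiple members.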

.. _bonding_testpmd_commands:

Testpmd driver specific commands
--------------------------------

Some bonding driver specific features are integrated in testpmd.

create bonding device
~~~~~~~~~~~~~~~~~~~~~

Create a new bonding device::

   testpmd> create bonding device (mode) (socket)

For example, to create a bonding device in mode 1 on socket 0::

   testpmd> create bonding device 1 0
   created new bonding device (port X)

add bonding member
~~~~~~~~~~~~~~~~~~

Add an Ethernet device to a Link Bonding device::

   testpmd> add bonding member (member id) (port id)

For example, to add an Ethernet device (port 6) to a Link Bonding device (port 10)::

   testpmd> add bonding member 6 10


remove bonding member
~~~~~~~~~~~~~~~~~~~~~

Remove an Ethernet member device from a Link Bonding device::

   testpmd> remove bonding member (member id) (port id)

For example, to remove an Ethernet member device (port 6) from a Link Bonding device (port 10)::

   testpmd> remove bonding member 6 10

set bonding mode
~~~~~~~~~~~~~~~~

Set the Link Bonding mode of a Link Bonding device::

   testpmd> set bonding mode (value) (port id)

For example, to set the bonding mode of a Link Bonding device (port 10) to broadcast (mode 3)::

   testpmd> set bonding mode 3 10

set bonding primary
~~~~~~~~~~~~~~~~~~~

Set an Ethernet member device as the primary device on a Link Bonding device::

   testpmd> set bonding primary (member id) (port id)

For example, to set the Ethernet member device (port 6) as the primary port of a Link Bonding device (port 10)::

   testpmd> set bonding primary 6 10

set bonding mac
~~~~~~~~~~~~~~~

Set the MAC address of a Link Bonding device::

   testpmd> set bonding mac (port id) (mac)

For example, to set the MAC address of a Link Bonding device (port 10) to 00:00:00:00:00:01::

   testpmd> set bonding mac 10 00:00:00:00:00:01

set bonding balance_xmit_policy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Set the transmission policy for a Link Bonding device when it is in Balance XOR mode::

   testpmd> set bonding balance_xmit_policy (port_id) (l2|l23|l34)

For example, set a Link Bonding device (port 10) to use a balance policy of layer 3+4 (IP addresses & UDP ports)::

   testpmd> set bonding balance_xmit_policy 10 l34

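To give a feel for what a layer 3+4 policy means in Balance XOR mode, the
sketch below picks a member from a flow's IP addresses and L4 ports by XOR-ing
them together and reducing the result modulo the number of members. This is an
illustrative hash only: the function name ``l34_member_index`` is hypothetical
and the driver's actual hash computation may differ.

```python
import ipaddress


def l34_member_index(src_ip, dst_ip, src_port, dst_port, member_count):
    """Pick a member index from a flow's IP addresses and L4 ports.

    Illustrative XOR hash (an assumption of this sketch, not the
    driver's code): XOR the two addresses, XOR the two ports, combine
    both results and take the remainder by the number of members.
    """
    ip_hash = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    port_hash = src_port ^ dst_port
    return (ip_hash ^ port_hash) % member_count


# Packets of the same flow always map to the same member.
first = l34_member_index("10.0.0.1", "10.0.0.2", 1000, 2000, 3)
second = l34_member_index("10.0.0.1", "10.0.0.2", 1000, 2000, 3)
print(first == second)
```

Because the selection depends only on header fields, all packets of one flow
leave through the same member, while different flows can be spread across the
members.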

set bonding mon_period
~~~~~~~~~~~~~~~~~~~~~~

Set the link status monitoring polling period in milliseconds for a bonding device.

This adds support for PMD member devices which do not support link status interrupts.
When the mon_period is set to a value greater than 0, all PMDs which do not support
link status interrupts will be queried every polling interval to check if their link status has changed::

   testpmd> set bonding mon_period (port_id) (value)

For example, to set the link status monitoring polling period of bonding device (port 5) to 150ms::

   testpmd> set bonding mon_period 5 150

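The effect of polling can be pictured with the small model below, which takes
one link status sample per polling interval and reports the polls at which the
status differs from the previous sample. It is a generic polling sketch under
that assumption, not the driver's monitoring code, and the name
``detect_link_changes`` is hypothetical.

```python
def detect_link_changes(link_samples):
    """Yield (poll_index, new_status) each time the sampled status changes.

    link_samples holds the link status (True = up) observed at each
    polling interval; the first sample is taken as the initial state.
    """
    samples = iter(link_samples)
    try:
        last = next(samples)
    except StopIteration:
        return
    for index, status in enumerate(samples, start=1):
        if status != last:
            yield index, status
            last = status


# One sample per polling interval, for example every 150 ms.
changes = list(detect_link_changes([True, True, False, False, True]))
print(changes)
```

In this model a change becomes visible at the first poll after it occurs, so a
shorter period reduces detection latency at the cost of more frequent queries.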

set bonding lacp dedicated_queues
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enable or disable dedicated Tx/Rx queues on the members of a bonding device to handle
LACP control plane traffic when in mode 4 (link-aggregation-802.3ad)::

   testpmd> set bonding lacp dedicated_queues (port_id) (enable|disable)


set bonding agg_mode
~~~~~~~~~~~~~~~~~~~~

Enable one of the specific aggregator modes when in mode 4 (link-aggregation-802.3ad)::

   testpmd> set bonding agg_mode (port_id) (bandwidth|count|stable)


show bonding config
~~~~~~~~~~~~~~~~~~~

Show the current configuration of a Link Bonding device.
Link-aggregation-802.3ad information is also shown if the bonding mode is mode 4::

   testpmd> show bonding config (port id)

For example,
to show the configuration of a Link Bonding device (port 9) with 3 member devices (1, 3, 4)
in balance mode with a transmission policy of layer 2+3::

   testpmd> show bonding config 9
     - Dev basic:
        Bonding mode: BALANCE(2)
        Balance Xmit Policy: BALANCE_XMIT_POLICY_LAYER23
        Members (3): [1 3 4]
        Active Members (3): [1 3 4]
        Primary: [3]

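When this output is captured by a test script, the fields shown above can be
extracted with a few regular expressions. The parser below is a minimal sketch
that assumes the output layout matches the example exactly; the function name
``parse_bonding_config`` and the dictionary keys are hypothetical, and other
modes may print additional lines that this sketch ignores.

```python
import re


def parse_bonding_config(text):
    """Extract mode, member lists and primary port from the output text."""
    config = {}
    mode = re.search(r"Bonding mode:\s*(\w+)\((\d+)\)", text)
    if mode:
        config["mode_name"] = mode.group(1)
        config["mode"] = int(mode.group(2))
    # Anchor at the start of a line so "Members" does not match "Active Members".
    for key, label in (("members", "Members"), ("active_members", "Active Members")):
        match = re.search(rf"^\s*{label} \(\d+\): \[([\d ]*)\]", text, re.MULTILINE)
        if match:
            config[key] = [int(port) for port in match.group(1).split()]
    primary = re.search(r"Primary:\s*\[(\d+)\]", text)
    if primary:
        config["primary"] = int(primary.group(1))
    return config


# Sample text copied from the example output above.
sample = """\
  - Dev basic:
     Bonding mode: BALANCE(2)
     Balance Xmit Policy: BALANCE_XMIT_POLICY_LAYER23
     Members (3): [1 3 4]
     Active Members (3): [1 3 4]
     Primary: [3]
"""
parsed = parse_bonding_config(sample)
print(parsed)
```

Comparing the ``members`` and ``active_members`` lists is a simple way for a
script to spot a member whose link is currently down.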