..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Intel Corporation.

I40E Poll Mode Driver
======================

The i40e PMD (**librte_net_i40e**) provides poll mode driver support for
10/25/40 Gbps Intel® Ethernet 700 Series Network Adapters based on
the Intel Ethernet Controller X710/XL710/XXV710 and the Intel Ethernet
Connection X722 (which supports only a subset of the features).


Features
--------

Features of the i40e PMD are:

- Multiple queues for TX and RX
- Receiver Side Scaling (RSS)
- MAC/VLAN filtering
- Packet type information
- Flow director
- Cloud filter
- Checksum offload
- VLAN/QinQ stripping and inserting
- TSO offload
- Promiscuous mode
- Multicast mode
- Port hardware statistics
- Jumbo frames
- Link state information
- Link flow control
- Mirror on port, VLAN and VSI
- Interrupt mode for RX
- Scatter and gather for TX and RX
- Vector Poll mode driver
- DCB
- VMDQ
- SR-IOV VF
- Hot plug
- IEEE1588/802.1AS timestamping
- VF Daemon (VFD) - EXPERIMENTAL
- Dynamic Device Personalization (DDP)
- Queue region configuration
- Virtual Function Port Representors
- Malicious Device Driver event catch and notify
- Generic flow API

Linux Prerequisites
-------------------

- Identify your adapter using `Intel Support
  <http://www.intel.com/support>`_ and get the latest NVM/FW images.

- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to set up the basic DPDK environment.

- To get better performance on Intel platforms, please follow the "How to get best performance with NICs on Intel platforms"
  section of the :ref:`Getting Started Guide for Linux <linux_gsg>`.

- Upgrade the NVM/FW version following the `Intel® Ethernet NVM Update Tool Quick Usage Guide for Linux
  <https://www-ssl.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-linux-usage-guide.html>`_ and `Intel® Ethernet NVM Update Tool: Quick Usage Guide for EFI <https://www.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-efi-usage-guide.html>`_ if needed.

- For information about supported media, please refer to this document: `Intel® Ethernet Controller X710/XXV710/XL710 Feature Support Matrix
  <http://www.intel.com/content/dam/www/public/us/en/documents/release-notes/xl710-ethernet-controller-feature-matrix.pdf>`_.

   .. Note::

      * Some adapters based on the Intel(R) Ethernet Controller 700 Series only
        support Intel Ethernet Optics modules. On these adapters, other modules are not
        supported and will not function.

      * For connections based on the Intel(R) Ethernet Controller 700 Series,
        support is dependent on your system board. Please see your vendor for details.

      * In all cases Intel recommends using Intel Ethernet Optics; other modules
        may function but are not validated by Intel. Contact Intel for supported media types.

Windows Prerequisites
---------------------

- Follow the DPDK `Getting Started Guide for Windows <https://doc.dpdk.org/guides/windows_gsg/index.html>`_ to set up the basic DPDK environment.

- Identify the Intel® Ethernet adapter and get the latest NVM/FW version.

- To access any Intel® Ethernet hardware, load the NetUIO driver in place of the existing built-in (inbox) driver.

- To load the NetUIO driver, follow the steps mentioned in the `dpdk-kmods repository
  <https://git.dpdk.org/dpdk-kmods/tree/windows/netuio/README.rst>`_.

Recommended Matching List
-------------------------

It is highly recommended to upgrade the i40e kernel driver and firmware to
avoid compatibility issues with the i40e PMD. The following matching list has
been tested and verified. For details, refer to the Tested Platforms/Tested
NICs chapter in the release notes.

For X710/XL710/XXV710,

   +--------------+-----------------------+------------------+
   | DPDK version | Kernel driver version | Firmware version |
   +==============+=======================+==================+
   |    21.02     |         2.14.13       |       8.00       |
   +--------------+-----------------------+------------------+
   |    20.11     |         2.13.10       |       8.00       |
   +--------------+-----------------------+------------------+
   |    20.08     |         2.12.6        |       7.30       |
   +--------------+-----------------------+------------------+
   |    20.05     |         2.11.27       |       7.30       |
   +--------------+-----------------------+------------------+
   |    20.02     |         2.10.19       |       7.20       |
   +--------------+-----------------------+------------------+
   |    19.11     |         2.9.21        |       7.00       |
   +--------------+-----------------------+------------------+
   |    19.08     |         2.8.43        |       7.00       |
   +--------------+-----------------------+------------------+
   |    19.05     |         2.7.29        |       6.80       |
   +--------------+-----------------------+------------------+
   |    19.02     |         2.7.26        |       6.80       |
   +--------------+-----------------------+------------------+
   |    18.11     |         2.4.6         |       6.01       |
   +--------------+-----------------------+------------------+
   |    18.08     |         2.4.6         |       6.01       |
   +--------------+-----------------------+------------------+
   |    18.05     |         2.4.6         |       6.01       |
   +--------------+-----------------------+------------------+
   |    18.02     |         2.4.3         |       6.01       |
   +--------------+-----------------------+------------------+
   |    17.11     |         2.1.26        |       6.01       |
   +--------------+-----------------------+------------------+
   |    17.08     |         2.0.19        |       6.01       |
   +--------------+-----------------------+------------------+
   |    17.05     |         1.5.23        |       5.05       |
   +--------------+-----------------------+------------------+
   |    17.02     |         1.5.23        |       5.05       |
   +--------------+-----------------------+------------------+
   |    16.11     |         1.5.23        |       5.05       |
   +--------------+-----------------------+------------------+
   |    16.07     |         1.4.25        |       5.04       |
   +--------------+-----------------------+------------------+
   |    16.04     |         1.4.25        |       5.02       |
   +--------------+-----------------------+------------------+


For X722,

   +--------------+-----------------------+------------------+
   | DPDK version | Kernel driver version | Firmware version |
   +==============+=======================+==================+
   |    21.02     |         2.14.13       |       5.00       |
   +--------------+-----------------------+------------------+
   |    20.11     |         2.13.10       |       5.00       |
   +--------------+-----------------------+------------------+
   |    20.08     |         2.12.6        |       4.11       |
   +--------------+-----------------------+------------------+
   |    20.05     |         2.11.27       |       4.11       |
   +--------------+-----------------------+------------------+
   |    20.02     |         2.10.19       |       4.11       |
   +--------------+-----------------------+------------------+
   |    19.11     |         2.9.21        |       4.10       |
   +--------------+-----------------------+------------------+
   |    19.08     |         2.9.21        |       4.10       |
   +--------------+-----------------------+------------------+
   |    19.05     |         2.7.29        |       3.33       |
   +--------------+-----------------------+------------------+
   |    19.02     |         2.7.26        |       3.33       |
   +--------------+-----------------------+------------------+
   |    18.11     |         2.4.6         |       3.33       |
   +--------------+-----------------------+------------------+


Pre-Installation Configuration
------------------------------

Config File Options
~~~~~~~~~~~~~~~~~~~

The following options can be modified in the ``config/rte_config.h`` file;
a sketch of the corresponding lines follows this list.

- ``RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF`` (default ``64``)

  Number of queues reserved for the PF.

- ``RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM`` (default ``4``)

  Number of queues reserved for each VMDQ pool.

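The lines below are only a sketch of how these build-time options appear in
``config/rte_config.h``, using the defaults documented above; adjust the
values before building DPDK if different queue reservations are needed.

.. code-block:: c

   /* config/rte_config.h: i40e build-time options (documented defaults). */
   #define RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF 64 /* queues reserved for the PF */
   #define RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM 4  /* queues reserved per VMDQ pool */
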
Runtime Config Options
~~~~~~~~~~~~~~~~~~~~~~

- ``Reserved number of Queues per VF`` (default ``4``)

  The number of reserved queues per VF is determined by its host PF. If the
  PCI address of an i40e PF is aaaa:bb.cc, the number of reserved queues per
  VF can be configured with an EAL parameter such as
  ``-a aaaa:bb.cc,queue-num-per-vf=n``, where n can be 1, 2, 4, 8 or 16.
  If no such parameter is configured, the number of reserved queues per VF
  is 4 by default. If a VF requests more than the reserved number of queues,
  the PF can allocate up to 16 queues to it after a VF reset. A sketch of
  passing this parameter through the EAL is shown below.


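  The following is a minimal sketch, not taken from any existing application:
  it only shows how the same ``devargs`` string can be passed to
  ``rte_eal_init()`` programmatically, with ``aaaa:bb.cc`` standing in for a
  real PF address.

  .. code-block:: c

     #include <rte_eal.h>

     int main(int argc, char **argv)
     {
         /* Equivalent of "-a aaaa:bb.cc,queue-num-per-vf=8" on the command
          * line; the PCI address is a placeholder, not a real device. */
         char *eal_args[] = {
             argv[0],
             "-a", "aaaa:bb.cc,queue-num-per-vf=8",
         };

         (void)argc;
         if (rte_eal_init(3, eal_args) < 0)
             return -1;

         /* Normal ethdev setup would continue here. */
         return 0;
     }
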
- ``Support multiple driver`` (default ``disable``)

  There is an issue when a 700 Series Ethernet adapter is driven by both the
  Linux kernel and the DPDK PMD at the same time. To address it, the ``devargs``
  parameter ``support-multi-driver`` was introduced, for example::

    -a 84:00.0,support-multi-driver=1

  With the above configuration, the DPDK PMD will not change global registers,
  and will switch the PF interrupt from IntN to Int0 to avoid interrupt
  conflicts between DPDK and the Linux kernel.

- ``Support VF Port Representor`` (default ``not enabled``)

  The i40e PF PMD supports the creation of VF port representors for the control
  and monitoring of i40e virtual function devices. Each port representor
  corresponds to a single virtual function of that device. Using the ``devargs``
  option ``representor`` the user can specify which virtual functions to create
  port representors for on initialization of the PF PMD by passing the VF IDs of
  the VFs which are required::

    -a DBDF,representor=[0,1,4]

  Currently hot-plugging of representor ports is not supported, so all required
  representors must be specified on the creation of the PF.

- ``Enable validation for VF message`` (default ``not enabled``)

  The PF counts messages from each VF. If, within any period of
  ``period-seconds`` seconds, the number of messages from a VF exceeds
  ``maximal-message``, the PF will ignore any new message from that VF for
  ``ignore-seconds`` seconds. The format is
  ``maximal-message@period-seconds:ignore-seconds``, for example::

    -a 84:00.0,vf_msg_cfg=80@120:180

Vector RX Pre-conditions
~~~~~~~~~~~~~~~~~~~~~~~~

For Vector RX it is assumed that the number of descriptors per ring (the ring
size) is a power of 2. With this pre-condition, the ring index can wrap back
to the head after hitting the tail without a conditional check, because Vector
RX can simply mask the index with ``ring_size - 1``, as sketched below.

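The following is an illustrative sketch only, not the PMD's actual code; it
shows why the power-of-two assumption removes the wrap-around branch.

.. code-block:: c

   #include <stdint.h>

   /* Advance a ring index; assumes ring_size is a power of two. */
   static inline uint16_t
   ring_next(uint16_t idx, uint16_t ring_size)
   {
       /* The mask wraps the index back to 0 after the last descriptor,
        * so no conditional check against the ring end is needed. */
       return (idx + 1) & (ring_size - 1);
   }
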
Driver compilation and testing
------------------------------

Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
for details.


SR-IOV: Prerequisites and Sample Application Notes
--------------------------------------------------

#. Load the kernel module:

   .. code-block:: console

      modprobe i40e

   Check the output in dmesg:

   .. code-block:: console

      i40e 0000:83:00.1 ens802f0: renamed from eth0

#. Bring up the PF ports:

   .. code-block:: console

      ifconfig ens802f0 up

#. Create VF device(s):

   Echo the number of VFs to be created into the ``sriov_numvfs`` sysfs entry
   of the parent PF.

   Example:

   .. code-block:: console

      echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs

#. Assign VF MAC address:

   Assign a MAC address to the VF using the iproute2 utility. The syntax is:

   .. code-block:: console

      ip link set <PF netdev id> vf <VF id> mac <macaddr>

   Example:

   .. code-block:: console

      ip link set ens802f0 vf 0 mac a0:b0:c0:d0:e0:f0

#. Assign the VF to a VM, and bring up the VM.
   Please see the documentation for the *I40E/IXGBE/IGB Virtual Function Driver*.

#. Run testpmd:

   Follow the instructions available in the document
   :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
   to run testpmd.

   Example output:

   .. code-block:: console

      ...
      EAL: PCI device 0000:83:00.0 on NUMA socket 1
      EAL: probe driver: 8086:1572 rte_i40e_pmd
      EAL: PCI memory mapped at 0x7f7f80000000
      EAL: PCI memory mapped at 0x7f7f80800000
      PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a
      Interactive-mode selected
      Configuring Port 0 (socket 0)
      ...

      PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are
      satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0.

      ...
      Port 0: 68:05:CA:26:85:84
      Checking link statuses...
      Port 0 Link Up - speed 10000 Mbps - full-duplex
      Done

      testpmd>


Sample Application Notes
------------------------

VLAN filter
~~~~~~~~~~~

The VLAN filter only works when promiscuous mode is off.

Start ``testpmd`` and add VLAN 10 to port 0:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-15 -n 4 -- -i --forward-mode=mac
    ...

    testpmd> set promisc 0 off
    testpmd> rx_vlan add 10 0


Flow Director
~~~~~~~~~~~~~

The Flow Director works in receive mode to identify specific flows or sets of flows and route them to specific queues.
The Flow Director filters can match different fields for different types of packet: the flow type, a specific input set per flow type and the flexible payload.

The default input set of each flow type is::

   ipv4-other : src_ip_address, dst_ip_address
   ipv4-frag  : src_ip_address, dst_ip_address
   ipv4-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv4-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   ipv6-other : src_ip_address, dst_ip_address
   ipv6-frag  : src_ip_address, dst_ip_address
   ipv6-tcp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-udp   : src_ip_address, dst_ip_address, src_port, dst_port
   ipv6-sctp  : src_ip_address, dst_ip_address, src_port, dst_port,
                verification_tag
   l2_payload : ether_type

By default the flex payload is taken from offsets 0 to 15 of the packet's payload, but it is masked out from matching.

Start ``testpmd`` with ``--disable-rss`` and ``--pkt-filter-mode=perfect``:

.. code-block:: console

   ./<build_dir>/app/dpdk-testpmd -l 0-15 -n 4 -- -i --disable-rss \
                 --pkt-filter-mode=perfect --rxq=8 --txq=8 --nb-cores=8 \
                 --nb-ports=1

Add a rule to direct ``ipv4-udp`` packets whose ``dst_ip=2.2.2.5, src_ip=2.2.2.3, src_port=32, dst_port=32`` to queue 1:

.. code-block:: console

   testpmd> flow create 0 ingress pattern eth / ipv4 src is 2.2.2.3 \
            dst is 2.2.2.5 / udp src is 32 dst is 32 / end \
            actions mark id 1 / queue index 1 / end

Check the flow director status:

.. code-block:: console

   testpmd> show port fdir 0

   ######################## FDIR infos for port 0      ####################
     MODE:   PERFECT
     SUPPORTED FLOW TYPE:  ipv4-frag ipv4-tcp ipv4-udp ipv4-sctp ipv4-other
                           ipv6-frag ipv6-tcp ipv6-udp ipv6-sctp ipv6-other
                           l2_payload
     FLEX PAYLOAD INFO:
     max_len:       16          payload_limit: 480
     payload_unit:  2           payload_seg:   3
     bitmask_unit:  2           bitmask_num:   2
     MASK:
       vlan_tci: 0x0000,
       src_ipv4: 0x00000000,
       dst_ipv4: 0x00000000,
       src_port: 0x0000,
       dst_port: 0x0000
       src_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000,
       dst_ipv6: 0x00000000,0x00000000,0x00000000,0x00000000
     FLEX PAYLOAD SRC OFFSET:
       L2_PAYLOAD:    0      1      2      3      4      5      6  ...
       L3_PAYLOAD:    0      1      2      3      4      5      6  ...
       L4_PAYLOAD:    0      1      2      3      4      5      6  ...
     FLEX MASK CFG:
       ipv4-udp:    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-tcp:    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-sctp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-other:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv4-frag:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-udp:    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-tcp:    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-sctp:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-other:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ipv6-frag:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       l2_payload:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     guarant_count: 1           best_count:    0
     guarant_space: 512         best_space:    7168
     collision:     0           free:          0
     maxhash:       0           maxlen:        0
     add:           0           remove:        0
     f_add:         0           f_remove:      0


Floating VEB
~~~~~~~~~~~~~

The Intel® Ethernet 700 Series supports a feature called
"Floating VEB".

A Virtual Ethernet Bridge (VEB) is an IEEE Edge Virtual Bridging (EVB) term
for functionality that allows local switching between virtual endpoints within
a physical endpoint and also with an external bridge/network.

A "Floating" VEB doesn't have an uplink connection to the outside world, so all
switching is done internally and remains within the host. As such, this
feature provides security benefits.

In addition, a Floating VEB overcomes a limitation of normal VEBs, which
cannot forward packets when the physical link is down. Floating VEBs don't need
to connect to the NIC port, so they can still forward traffic from VF to VF
even when the physical link is down.

Therefore, with this feature enabled, VFs can be limited to communicating with
each other but not an outside network, and they can do so even when there is
no physical uplink on the associated NIC port.

To enable this feature, the user should pass a ``devargs`` parameter to the
EAL, for example::

    -a 84:00.0,enable_floating_veb=1

In this configuration the PMD will use the floating VEB feature for all the
VFs created by this PF device.

Alternatively, the user can specify which VFs need to connect to this floating
VEB using the ``floating_veb_list`` argument::

    -a 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4

In this example ``VF1``, ``VF3`` and ``VF4`` connect to the floating VEB,
while other VFs connect to the normal VEB.

The current implementation only supports one floating VEB and one regular
VEB. VFs can connect to a floating VEB or a regular VEB according to the
configuration passed on the EAL command line.

The floating VEB functionality requires a NIC firmware version of 5.0
or greater.

Dynamic Device Personalization (DDP)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel® Ethernet 700 Series, except for the Intel Ethernet Connection
X722, supports a feature called "Dynamic Device Personalization (DDP)",
which is used to configure hardware by downloading a profile to support
protocols/filters which are not supported by default. The DDP
functionality requires a NIC firmware version of 6.0 or greater.

The current implementation supports GTP-C/GTP-U/PPPoE/PPPoL2TP/ESP;
steering for these protocols can be configured with the rte_flow API.

The GTPv1 package can be downloaded from
https://downloadcenter.intel.com/download/27587.

The PPPoE package can be downloaded from
https://downloadcenter.intel.com/download/28040.

The ESP-AH package can be downloaded from
https://downloadcenter.intel.com/download/29446.

Load a profile which supports GTP and store a backup profile:

.. code-block:: console

   testpmd> ddp add 0 ./gtp.pkgo,./backup.pkgo

Delete the GTP profile and restore the backup profile:

.. code-block:: console

   testpmd> ddp del 0 ./backup.pkgo

Get the list of loaded DDP packages:

.. code-block:: console

   testpmd> ddp get list 0

Display information about a GTP profile:

.. code-block:: console

   testpmd> ddp get info ./gtp.pkgo

Input set configuration
~~~~~~~~~~~~~~~~~~~~~~~

The input set for any PCTYPE can be configured with a user-defined
configuration. For example, to use only the 48-bit prefix of the IPv6 source
address for IPv6 TCP RSS:

.. code-block:: console

   testpmd> port config 0 pctype 43 hash_inset clear all
   testpmd> port config 0 pctype 43 hash_inset set field 13
   testpmd> port config 0 pctype 43 hash_inset set field 14
   testpmd> port config 0 pctype 43 hash_inset set field 15

Queue region configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Intel® Ethernet 700 Series supports configuring queue regions for RSS in
the PF, so that different traffic classes or different packet classification
types can be separated into different queues in different queue regions.
A command-line interface is provided to configure queue regions for RSS; it
parses parameters such as the region index, queue number, queue start index,
user priority and traffic classes. Depending on the commands given, it calls
i40e private APIs to set or flush the queue region configuration. As this
feature is specific to i40e, only private APIs are used. The relevant
``testpmd`` commands are shown below; for details please refer to
:doc:`../testpmd_app_ug/index`.

.. code-block:: console

   testpmd> set port (port_id) queue-region region_id (value) \
            queue_start_index (value) queue_num (value)
   testpmd> set port (port_id) queue-region region_id (value) flowtype (value)
   testpmd> set port (port_id) queue-region UP (value) region_id (value)
   testpmd> set port (port_id) queue-region flush (on|off)
   testpmd> show port (port_id) queue-region

567~~~~~~~~~~~~~~~~~~~
568
569- ``RSS Flow``
570
571  RSS Flow supports to set hash input set, hash function, enable hash
572  and configure queues.
573  For example:
574  Configure queues as queue 0, 1, 2, 3.
575
576  .. code-block:: console
577
578    testpmd> flow create 0 ingress pattern end actions rss types end \
579      queues 0 1 2 3 end / end
580
581  Enable hash and set input set for ipv4-tcp.
582
583  .. code-block:: console
584
585    testpmd> flow create 0 ingress pattern eth / ipv4 / tcp / end \
586      actions rss types ipv4-tcp l3-src-only end queues end / end
587
588  Set symmetric hash enable for flow type ipv4-tcp.
589
590  .. code-block:: console
591
592    testpmd> flow create 0 ingress pattern eth / ipv4 / tcp / end \
593      actions rss types ipv4-tcp end queues end func symmetric_toeplitz / end
594
595  Set hash function as simple xor.
596
597  .. code-block:: console
598
599    testpmd> flow create 0 ingress pattern end actions rss types end \
600      queues end func simple_xor / end
601
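For applications that create such rules directly instead of through
``testpmd``, the first command above maps onto the generic ``rte_flow`` C API
roughly as in the following sketch. Port setup and detailed error handling
are omitted, and the queue list is only an example.

.. code-block:: c

   #include <rte_flow.h>

   /* Sketch: direct RSS to queues 0-3, the C equivalent of
    * "flow create 0 ingress pattern end actions rss ... queues 0 1 2 3 end". */
   static struct rte_flow *
   create_rss_queues_flow(uint16_t port_id)
   {
       static const uint16_t queues[] = { 0, 1, 2, 3 };

       struct rte_flow_attr attr = { .ingress = 1 };
       struct rte_flow_item pattern[] = {
           { .type = RTE_FLOW_ITEM_TYPE_END },
       };
       struct rte_flow_action_rss rss = {
           .func = RTE_ETH_HASH_FUNCTION_DEFAULT,
           .types = 0,              /* keep the currently enabled hash types */
           .queue_num = 4,
           .queue = queues,
       };
       struct rte_flow_action actions[] = {
           { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
           { .type = RTE_FLOW_ACTION_TYPE_END },
       };
       struct rte_flow_error error;

       return rte_flow_create(port_id, &attr, pattern, actions, &error);
   }
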
Limitations or Known issues
---------------------------

MPLS packet classification
~~~~~~~~~~~~~~~~~~~~~~~~~~

For firmware versions prior to 5.0, MPLS packets are not recognized by the NIC.
The L2 Payload flow type in flow director can be used to classify MPLS packets
with a testpmd command like::

   testpmd> flow_director_filter 0 mode IP add flow l2_payload ether \
            0x8847 flexbytes () fwd pf queue <N> fd_id <M>

With NIC firmware version 5.0 or greater, limited MPLS support is added:
native MPLS (MPLS in Ethernet) skipping is implemented, but no new packet
type, classification or offload is possible. With this change, the L2 Payload
flow type in flow director can no longer be used to classify MPLS packets as
with previous firmware versions. Instead, the Ethertype filter can be used to
classify MPLS packets with a testpmd command like::

   testpmd> flow create 0 ingress pattern eth type is 0x8847 / end \
            actions queue index <M> / end

16 Byte RX Descriptor setting on DPDK VF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the VF's RX descriptor mode is decided by the PF. There is no PF-VF
interface for the VF to request an RX descriptor mode, and no interface to
notify the VF of its own RX descriptor mode.
No available version of the Linux i40e kernel driver supports the 16 byte RX
descriptor, so if the Linux i40e kernel driver is used as the host driver
while the DPDK i40e PMD is used as the VF driver, the DPDK VF cannot choose
the 16 byte receive descriptor: the RX descriptor is already set to 32 bytes
by the i40e kernel driver.
If a future Linux i40e driver supports the 16 byte RX descriptor, the user
should make sure the DPDK VF uses the same RX descriptor mode, 16 byte or 32
byte, as the PF driver.

The same rule applies to DPDK PF + DPDK VF: the PF and VF should use the same
RX descriptor mode, or VF RX will not work.

Receive packets with Ethertype 0x88A8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to a firmware limitation, the PF can receive packets with Ethertype 0x88A8
only when the floating VEB is disabled.

Incorrect Rx statistics when packet is oversize
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a packet is over the maximum frame size, the packet is dropped.
However, the Rx statistics returned by `rte_eth_stats_get` incorrectly
show it as received.

RX/TX statistics may be incorrect when register overflowed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The rx_bytes/tx_bytes statistics registers are 48 bits wide. Although this
limit is extended to 64 bits on the software side, there is no way to detect
whether a register overflowed more than once between two reads. The
rx_bytes/tx_bytes statistics are therefore correct only when the statistics
are updated at least once between two overflows. A sketch of this widening
scheme is shown below.

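The following is only an illustrative sketch of the widening described above,
not the PMD's actual code: the latest 48-bit hardware value is folded into a
64-bit software counter, which stays correct as long as the register wraps at
most once between updates.

.. code-block:: c

   #include <stdint.h>

   #define STAT_48_BIT_MASK ((UINT64_C(1) << 48) - 1)

   /* Fold a new 48-bit hardware reading into a 64-bit software counter.
    * The software counter is initialised to the first hardware reading, and
    * the register is assumed to wrap at most once between two updates. */
   static uint64_t
   extend_48bit_counter(uint64_t prev_sw_total, uint64_t hw_value)
   {
       uint64_t prev_hw = prev_sw_total & STAT_48_BIT_MASK;
       uint64_t delta;

       if (hw_value >= prev_hw)
           delta = hw_value - prev_hw;
       else
           delta = (hw_value + (UINT64_C(1) << 48)) - prev_hw; /* one wrap */

       return prev_sw_total + delta;
   }
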
VF & TC max bandwidth setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The per-VF max bandwidth and the per-TC max bandwidth cannot be enabled at the
same time, and the two settings are handled differently.
When enabling per-VF max bandwidth, the software checks whether per-TC max
bandwidth is enabled; if so, it returns a failure.
When enabling per-TC max bandwidth, the software checks whether per-VF max
bandwidth is enabled; if so, it disables per-VF max bandwidth and continues
with the per-TC max bandwidth setting.

TC TX scheduling mode setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two TX scheduling modes for TCs: round robin and strict priority.
If a TC is set to strict priority mode, it can consume unlimited bandwidth,
so any max bandwidth the application has set for that TC has no effect.
It is suggested to use strict priority mode only for a TC that is latency
sensitive but does not consume much bandwidth.

VF performance is impacted by PCI extended tag setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To reach maximum NIC performance in the VF, the PCI extended tag must be
enabled. The DPDK i40e PF driver will set this feature during initialization,
but the kernel PF driver does not. So when running traffic on a VF which is
managed by the kernel PF driver, a significant NIC performance downgrade has
been observed (for 64 byte packets, there is about a 25% line-rate downgrade
for a 25GbE device and about 35% for a 40GbE device).

For kernel version 4.11 or later, the kernel's PCI driver will enable the
extended tag if it detects that the device supports it, so by default this is
not an issue. For older kernels, or when the PCI extended tag has been
disabled, it can be enabled using the steps below.

#. Get the current value of the PCI configuration register::

      setpci -s <XX:XX.X> a8.w

#. Set bit 8::

      value = value | 0x100

#. Write the new value back to the PCI configuration register::

      setpci -s <XX:XX.X> a8.w=<value>

VLAN strip of VF
~~~~~~~~~~~~~~~~

The VF VLAN strip function is only supported with i40e kernel driver version 2.1.26 or later.

DCB function
~~~~~~~~~~~~

DCB works only when RSS is enabled.

Global configuration warning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The i40e PMD writes some global registers to enable certain functions or apply
certain configuration. When different ports of the same NIC are used by the
Linux kernel and by DPDK, the port driven by the Linux kernel is affected by
the port driven by DPDK. For example, the register I40E_GL_SWT_L2TAGCTRL
controls the L2 tag, and the i40e PMD uses it to set the VLAN TPID. If the
TPID is set on port A with DPDK, the configuration also affects port B on the
same NIC, which is driven by the kernel driver and may not want that TPID.
The PMD therefore reports a warning to clarify what is changed whenever it
writes a global register.

Cloud Filter
~~~~~~~~~~~~

When programming cloud filters for IPv4/6_UDP/TCP/SCTP with SRC port only or DST port only,
any cloud filter using inner_vlan or a tunnel key becomes invalid. The default configuration
is recovered only by an NIC core reset.

Mirror rule limitation for X722
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to a firmware restriction of the X722, the same VSI cannot have more than one mirror rule.

High Performance of Small Packets on 40GbE NIC
----------------------------------------------

As the latest firmware image might contain fixes that enhance performance,
a firmware update might be needed to get the best performance.
Check the Intel support website for the latest firmware updates.
Users should consult the release notes of a specific DPDK release to identify
the validated firmware version for a NIC using the i40e driver.

Use 16 Bytes RX Descriptor Size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The i40e PMD supports both 16 and 32 byte RX descriptors, and the 16 byte size
helps the performance of small packets.
To use 16 byte RX descriptors, set the following in ``config/rte_config.h``::

   #define RTE_LIBRTE_I40E_16BYTE_RX_DESC 1

Input set requirement of each pctype for FDIR
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each PCTYPE can only have one specific FDIR input set at a time.
For example, if two rte_flow rules are created with different input sets for
one PCTYPE, the second one fails with "Conflict with the first rule's input
set", which means the current rule's input set conflicts with the first
rule's. Remove the first rule if you want to change the input set of the
PCTYPE.

Example of getting best performance with l3fwd
------------------------------------------------------

The following is an example of running the DPDK ``l3fwd`` sample application
to get high performance on a server with Intel Xeon processors and Intel
Ethernet CNA XL710 adapters.

The example scenario is to get best performance with two Intel Ethernet CNA XL710 40GbE ports.
See :numref:`figure_intel_perf_test_setup` for the performance test setup.

.. _figure_intel_perf_test_setup:

.. figure:: img/intel_perf_test_setup.*

   Performance Test Setup


1. Add two Intel Ethernet CNA XL710 cards to the platform, and use one port per card to get best performance.
   The reason for using two NICs is to overcome a PCIe v3.0 limitation: a single slot cannot provide 80GbE bandwidth
   for two 40GbE ports, but two different PCIe v3.0 x8 slots can.
   In this example ``82:00.0`` and ``85:00.0`` are selected as test ports::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

2. Connect the ports to the traffic generator. For high speed testing, it's best to use a hardware traffic generator.

3. Check the NUMA node (socket ID) of the PCI devices and get the core numbers on that socket.
   In this case, ``82:00.0`` and ``85:00.0`` are both in socket 1, and the cores on socket 1 in the referenced platform
   are 18-35 and 54-71.
   Note: do not use two logical cores on the same physical core (e.g. core 18 has two logical cores, core 18 and core 54);
   instead, use two logical cores from different physical cores (e.g. core 18 and core 19).

4. Bind these two ports to igb_uio.

5. For an Intel Ethernet CNA XL710 40GbE port, at least two queue pairs are needed to achieve best performance, so two
   queues per port are required, and each queue pair needs a dedicated CPU core for receiving/transmitting packets.

6. The DPDK sample application ``l3fwd`` will be used for performance testing, using two ports for bi-directional forwarding.
   Compile the ``l3fwd`` sample with the default LPM mode.

7. The command line for running l3fwd would be something like the following::

      ./dpdk-l3fwd -l 18-21 -n 4 -a 82:00.0 -a 85:00.0 \
              -- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'

   This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
   core 20 for port 1, queue pair 0 forwarding, and core 21 for port 1, queue pair 1 forwarding.

8. Configure the traffic at the traffic generator.

   * Create a stream on the packet generator.

   * Set the Ethernet II type to 0x0800.

Tx bytes affected by the link status change
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For firmware versions prior to 6.01 for the X710 series and 3.33 for the X722 series,
the tx_bytes statistics are affected by link down events: each time the link status
changes to down, tx_bytes decreases by 110 bytes.