..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Cavium, Inc

ThunderX NICVF Poll Mode Driver
===============================

The ThunderX NICVF PMD (**librte_pmd_thunderx_nicvf**) provides poll mode driver
support for the inbuilt NIC found in the **Cavium ThunderX** SoC family
as well as their virtual functions (VF) in SR-IOV context.

More information can be found at `Cavium, Inc Official Website
<http://www.cavium.com/ThunderX_ARM_Processors.html>`_.

Features
--------

Features of the ThunderX PMD are:

- Multiple queues for TX and RX
- Receive Side Scaling (RSS)
- Packet type information
- Checksum offload
- Promiscuous mode
- Multicast mode
- Port hardware statistics
- Jumbo frames
- Link state information
- Setting up link state
- Scatter and gather for TX and RX
- VLAN stripping
- SR-IOV VF
- NUMA support
- Multi queue set support (up to 96 queues (12 queue sets)) per port
- Skip data bytes

Supported ThunderX SoCs
-----------------------
- CN88xx
- CN81xx
- CN83xx

Prerequisites
-------------
- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to set up the basic DPDK environment.

Pre-Installation Configuration
------------------------------

Config File Options
~~~~~~~~~~~~~~~~~~~

The following options can be modified in the ``config`` file.
Please note that enabling debugging options may affect system performance.

- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD`` (default ``y``)

  Toggle compilation of the ``librte_pmd_thunderx_nicvf`` driver.

- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_RX`` (default ``n``)

  Toggle asserts of receive fast path.

- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_TX`` (default ``n``)

  Toggle asserts of transmit fast path.

Driver compilation and testing
------------------------------

Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
for details.

To compile the ThunderX NICVF PMD for Linux arm64 gcc,
use ``arm64-thunderx-linux-gcc`` as target.

Linux
-----

SR-IOV: Prerequisites and sample Application Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The current ThunderX NIC PF/VF kernel modules automatically map each physical Ethernet port
to a virtual function (VF) and present it as a PCIe-like SR-IOV device.
This section provides instructions to configure SR-IOV with Linux OS.

#. Verify PF device capabilities using ``lspci``:

   .. code-block:: console

      lspci -vvv

   Example output:

   .. code-block:: console

      0002:01:00.0 Ethernet controller: Cavium Networks Device a01e (rev 01)
      ...
      Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
      ...
      Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
      ...
      Kernel driver in use: thunder-nic
      ...

   .. note::

      If the ``thunder-nic`` driver is not in use, make sure your kernel config includes the ``CONFIG_THUNDER_NIC_PF`` setting.

#. Verify VF device capabilities and drivers using ``lspci``:

   .. code-block:: console

      lspci -vvv

   Example output:

   .. code-block:: console

      0002:01:00.1 Ethernet controller: Cavium Networks Device 0011 (rev 01)
      ...
      Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
      ...
      Kernel driver in use: thunder-nicvf
      ...

      0002:01:00.2 Ethernet controller: Cavium Networks Device 0011 (rev 01)
      ...
      Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
      ...
      Kernel driver in use: thunder-nicvf
      ...

   .. note::

      If the ``thunder-nicvf`` driver is not in use, make sure your kernel config includes the ``CONFIG_THUNDER_NIC_VF`` setting.

#. Pass VF device to VM context (PCIe Passthrough):

   The VF devices may be passed through to the guest VM using qemu or
   virt-manager or virsh etc.

   Example qemu guest launch command:

   .. code-block:: console

      sudo qemu-system-aarch64 -name vm1 \
      -machine virt,gic_version=3,accel=kvm,usb=off \
      -cpu host -m 4096 \
      -smp 4,sockets=1,cores=8,threads=1 \
      -nographic -nodefaults \
      -kernel <kernel image> \
      -append "root=/dev/vda console=ttyAMA0 rw hugepagesz=512M hugepages=3" \
      -device vfio-pci,host=0002:01:00.1 \
      -drive file=<rootfs.ext3>,if=none,id=disk1,format=raw \
      -device virtio-blk-device,scsi=off,drive=disk1,id=virtio-disk1,bootindex=1 \
      -netdev tap,id=net0,ifname=tap0,script=/etc/qemu-ifup_thunder \
      -device virtio-net-device,netdev=net0 \
      -serial stdio \
      -mem-path /dev/hugepages

#. Enable **VFIO-NOIOMMU** mode (optional):

   .. code-block:: console

      echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode

   .. note::

      **VFIO-NOIOMMU** is required only when running in VM context and should not be enabled otherwise.

#. Running testpmd:

   Follow instructions available in the document
   :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
   to run testpmd.

   Example output:

   .. code-block:: console

      ./arm64-thunderx-linux-gcc/app/testpmd -l 0-3 -n 4 -w 0002:01:00.2 \
        -- -i --no-flush-rx \
        --port-topology=loop

      ...

      PMD: rte_nicvf_pmd_init(): librte_pmd_thunderx nicvf version 1.0

      ...
      EAL:   probe driver: 177d:11 rte_nicvf_pmd
      EAL:   using IOMMU type 1 (Type 1)
      EAL:   PCI memory mapped at 0x3ffade50000
      EAL: Trying to map BAR 4 that contains the MSI-X table.
           Trying offsets: 0x40000000000:0x0000, 0x10000:0x1f0000
      EAL:   PCI memory mapped at 0x3ffadc60000
      PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2
      PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false
           loopback_supported=true
      PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01
      Interactive-mode selected
      Configuring Port 0 (socket 0)
      ...

      PMD: nicvf_dev_configure(): Configured ethdev port0 hwcap=0x0
      Port 0: A6:C6:D9:17:78:01
      Checking link statuses...
      Port 0 Link Up - speed 10000 Mbps - full-duplex
      Done
      testpmd>

Multiple Queue Set per DPDK port configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two types of VFs:

- Primary VF
- Secondary VF

Each port consists of a primary VF and n secondary VF(s). Each VF provides 8 Tx/Rx queues to a port.
When a given port is configured to use more than 8 queues, it requires one or more secondary VFs.
Each secondary VF adds 8 additional queues to the queue set.

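The queue arithmetic above can be sketched in a few lines of Python. This is
only an illustration: the helper name ``secondary_vfs_needed`` is hypothetical
and is not part of the PMD, while the constants (8 queues per VF, at most
96 queues per port) are taken from this guide.

   .. code-block:: python

      import math

      QUEUES_PER_VF = 8          # each VF (queue set) provides 8 Tx/Rx queues
      MAX_QUEUES_PER_PORT = 96   # 12 queue sets per port, per the feature list


      def secondary_vfs_needed(num_queues: int) -> int:
          """Return how many secondary VFs a port needs for num_queues queues."""
          if not 1 <= num_queues <= MAX_QUEUES_PER_PORT:
              raise ValueError("queue count must be between 1 and 96")
          # The primary VF covers the first 8 queues; every further group of
          # up to 8 queues needs one more (secondary) VF.
          queue_sets = math.ceil(num_queues / QUEUES_PER_VF)
          return queue_sets - 1

Under these assumptions, a port configured with 24 queues needs
``secondary_vfs_needed(24) == 2`` secondary VFs, and a port with 8 queues or
fewer needs none.
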
During PMD initialization, the primary VFs are enumerated by checking a
specific flag (see the ``sqs`` message in the DPDK boot log; ``sqs`` indicates secondary queue set).
They are at the beginning of the VF list (the remaining ones are secondary VFs).

The primary VFs are used as master queue sets. Secondary VFs provide
additional queue sets for primary ones. If a port is configured for more than
8 queues, it will request additional queues from secondary VFs.

Secondary VFs cannot be shared between primary VFs.

Primary VFs are present at the beginning of the 'Network devices using kernel
driver' list; secondary VFs occupy the remaining part of the list.

   .. note::

      The VNIC driver in the multiqueue setup works differently than other drivers like ``ixgbe``.
      Each specific queue set device must be bound separately with the ``usertools/dpdk-devbind.py`` utility.

   .. note::

      Depending on the hardware used, the kernel driver sets a threshold ``vf_id``. VFs that try to attach with an id below or equal to
      this boundary are considered primary VFs. VFs that try to attach with an id above this boundary are considered secondary VFs.

LBK HW Access
~~~~~~~~~~~~~

Loopback HW Unit (LBK) receives packets from NIC-RX and sends packets back to NIC-TX.
The loopback block has N channels and contains data buffering that is shared across
all channels. Four primary VFs are reserved as loopback ports.

Example device binding
~~~~~~~~~~~~~~~~~~~~~~

If a system has three interfaces, a total of 18 VF devices will be created
on a non-NUMA machine.

   .. note::

      NUMA systems have 12 VFs per port and non-NUMA 6 VFs per port.

   .. code-block:: console

      # usertools/dpdk-devbind.py --status

      Network devices using DPDK-compatible driver
      ============================================
      <none>

      Network devices using kernel driver
      ===================================
      0000:01:10.0 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci
      0000:01:10.1 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci
      0001:01:00.0 'THUNDERX Network Interface Controller a01e' if= drv=thunder-nic unused=vfio-pci
      0001:01:00.1 'Device a034' if=eth0 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.2 'Device a034' if=eth1 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.3 'Device a034' if=eth2 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.4 'Device a034' if=eth3 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.5 'Device a034' if=eth4 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.6 'Device a034' if=lbk0 drv=thunder-nicvf unused=vfio-pci
      0001:01:00.7 'Device a034' if=lbk1 drv=thunder-nicvf unused=vfio-pci
      0001:01:01.0 'Device a034' if=lbk2 drv=thunder-nicvf unused=vfio-pci
      0001:01:01.1 'Device a034' if=lbk3 drv=thunder-nicvf unused=vfio-pci
      0001:01:01.2 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:01.3 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:01.4 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:01.5 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:01.6 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:01.7 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:02.0 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:02.1 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
      0001:01:02.2 'Device a034' if= drv=thunder-nicvf unused=vfio-pci

      Other network devices
      =====================
      0002:00:03.0 'Device a01f' unused=vfio-pci,uio_pci_generic

   .. note::

      Here the total number of primary VFs = 5 (variable, depends on the number of Ethernet ports present) + 4 (fixed, loopback ports).
      Ethernet ports are indicated as ``if=eth0`` while loopback ports are indicated as ``if=lbk0``.

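The counts in the notes above can be reproduced with a small Python sketch.
The function names are hypothetical and exist only for illustration; the
per-port VF counts (6 without NUMA, 12 with NUMA) and the four fixed loopback
ports come from the notes in this section.

   .. code-block:: python

      LOOPBACK_PRIMARY_VFS = 4  # fixed number of loopback (lbk) primary VFs


      def total_vfs(num_interfaces: int, numa: bool) -> int:
          """Total VF devices created: 12 per port on NUMA, 6 otherwise."""
          vfs_per_port = 12 if numa else 6
          return num_interfaces * vfs_per_port


      def primary_vfs(num_ethernet_ports: int) -> int:
          """Primary VFs: one per Ethernet port plus the loopback ports."""
          return num_ethernet_ports + LOOPBACK_PRIMARY_VFS

For the listing above, ``total_vfs(3, numa=False)`` gives 18 and
``primary_vfs(5)`` gives 9, which leaves 9 VFs that can serve as secondary
queue sets.
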
We want to bind two physical interfaces with 24 queues each, so we attach two primary VFs
and four secondary VFs. In our example we choose two 10G interfaces, eth1 (0001:01:00.2) and eth2 (0001:01:00.3).
We will choose four secondary queue sets from the end of the list (0001:01:01.2-0001:01:02.2).

#. Bind two primary VFs to the ``vfio-pci`` driver:

   .. code-block:: console

      usertools/dpdk-devbind.py -b vfio-pci 0001:01:00.2
      usertools/dpdk-devbind.py -b vfio-pci 0001:01:00.3

#. Bind four secondary VFs to the ``vfio-pci`` driver:

   .. code-block:: console

      usertools/dpdk-devbind.py -b vfio-pci 0001:01:01.7
      usertools/dpdk-devbind.py -b vfio-pci 0001:01:02.0
      usertools/dpdk-devbind.py -b vfio-pci 0001:01:02.1
      usertools/dpdk-devbind.py -b vfio-pci 0001:01:02.2

The nicvf thunderx driver will make use of attached secondary VFs automatically during the interface configuration stage.

Thunder-nic VFs
~~~~~~~~~~~~~~~

Use sysfs to distinguish thunder-nic primary VFs and secondary VFs.

   .. code-block:: console

      ls -l /sys/bus/pci/drivers/thunder-nic/
      total 0
      drwxr-xr-x  2 root root     0 Jan 22 11:19 ./
      drwxr-xr-x 86 root root     0 Jan 22 11:07 ../
      lrwxrwxrwx  1 root root     0 Jan 22 11:19 0001:01:00.0 -> '../../../../devices/platform/soc@0/849000000000.pci/pci0001:00/0001:00:10.0/0001:01:00.0'/

   .. code-block:: console

      cat /sys/bus/pci/drivers/thunder-nic/0001\:01\:00.0/sriov_sqs_assignment
      12
      0 0001:01:00.1 vfio-pci +: 12 13
      1 0001:01:00.2 thunder-nicvf -:
      2 0001:01:00.3 thunder-nicvf -:
      3 0001:01:00.4 thunder-nicvf -:
      4 0001:01:00.5 thunder-nicvf -:
      5 0001:01:00.6 thunder-nicvf -:
      6 0001:01:00.7 thunder-nicvf -:
      7 0001:01:01.0 thunder-nicvf -:
      8 0001:01:01.1 thunder-nicvf -:
      9 0001:01:01.2 thunder-nicvf -:
      10 0001:01:01.3 thunder-nicvf -:
      11 0001:01:01.4 thunder-nicvf -:
      12 0001:01:01.5 vfio-pci: 0
      13 0001:01:01.6 vfio-pci: 0
      14 0001:01:01.7 thunder-nicvf: 255
      15 0001:01:02.0 thunder-nicvf: 255
      16 0001:01:02.1 thunder-nicvf: 255
      17 0001:01:02.2 thunder-nicvf: 255
      18 0001:01:02.3 thunder-nicvf: 255
      19 0001:01:02.4 thunder-nicvf: 255
      20 0001:01:02.5 thunder-nicvf: 255
      21 0001:01:02.6 thunder-nicvf: 255
      22 0001:01:02.7 thunder-nicvf: 255
      23 0001:01:03.0 thunder-nicvf: 255
      24 0001:01:03.1 thunder-nicvf: 255
      25 0001:01:03.2 thunder-nicvf: 255
      26 0001:01:03.3 thunder-nicvf: 255
      27 0001:01:03.4 thunder-nicvf: 255
      28 0001:01:03.5 thunder-nicvf: 255
      29 0001:01:03.6 thunder-nicvf: 255
      30 0001:01:03.7 thunder-nicvf: 255
      31 0001:01:04.0 thunder-nicvf: 255

Every row that ends with 'thunder-nicvf: number' can be used as a secondary VF.
In the printout above, all entries from '14 0001:01:01.7 thunder-nicvf: 255' onward can be used as secondary VFs.

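The rule above can be applied mechanically. The following Python sketch picks
out the rows that end with ``thunder-nicvf: number``. The row layout is inferred
only from the printout shown above, and the function name
``usable_secondary_vfs`` is hypothetical, so treat this as an illustration rather
than a supported tool.

   .. code-block:: python

      import re

      # Matches rows such as "14 0001:01:01.7 thunder-nicvf: 255".
      # Rows like "1 0001:01:00.2 thunder-nicvf -:" do not match.
      SECONDARY_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+thunder-nicvf:\s+(\d+)\s*$")


      def usable_secondary_vfs(text: str) -> list[str]:
          """Return PCI addresses of rows ending with 'thunder-nicvf: number'."""
          addresses = []
          for line in text.splitlines():
              match = SECONDARY_ROW.match(line)
              if match:
                  addresses.append(match.group(2))
          return addresses

The ``text`` argument is expected to hold the contents of the
``sriov_sqs_assignment`` file shown above, for example as read with
``open(path).read()``.
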
Debugging Options
-----------------

EAL command option to change the log level:

   .. code-block:: console

      --log-level=pmd.net.thunderx.driver:info
      or
      --log-level=pmd.net.thunderx.driver,7

Module params
-------------

skip_data_bytes
~~~~~~~~~~~~~~~
This feature is used to create a hole between HEADROOM and the actual data. The size of the hole is specified
in bytes as the module param ``skip_data_bytes`` to the PMD.
This scheme is useful when an application would like to insert a VLAN header without disturbing HEADROOM.

Example:

   .. code-block:: console

      -w 0002:01:00.2,skip_data_bytes=8

Limitations
-----------

CRC stripping
~~~~~~~~~~~~~

The ThunderX SoC family NICs strip the CRC for every packet coming into the
host interface irrespective of the offload configuration.

Maximum packet length
~~~~~~~~~~~~~~~~~~~~~

The ThunderX SoC family NICs support jumbo frames of up to 9K. The value
is fixed and cannot be changed. So, even when the ``rxmode.max_rx_pkt_len``
member of ``struct rte_eth_conf`` is set to a value lower than 9200, frames
up to 9200 bytes can still reach the host interface.

Maximum packet segments
~~~~~~~~~~~~~~~~~~~~~~~

The ThunderX SoC family NICs support up to 12 segments per packet when working
in scatter/gather mode. So, setting the MTU will result in ``EINVAL`` when the
frame size does not fit in the maximum number of segments.

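As a rough illustration of the segment limit, the check below estimates whether
a frame fits in 12 receive buffers. It is a simplified sketch, not the driver's
actual MTU validation: the function name ``frame_fits`` and the ``buffer_size``
parameter (the usable data size of one receive buffer) are assumptions
introduced here.

   .. code-block:: python

      import math

      MAX_SEGMENTS = 12  # segments per packet in scatter/gather mode


      def frame_fits(frame_size: int, buffer_size: int) -> bool:
          """Return True if frame_size bytes fit in at most 12 buffers."""
          if frame_size <= 0 or buffer_size <= 0:
              raise ValueError("sizes must be positive")
          return math.ceil(frame_size / buffer_size) <= MAX_SEGMENTS

Assuming 512-byte buffers, a 9200-byte frame would need 18 segments and would
not fit, while 2048-byte buffers would need only 5.
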
skip_data_bytes
~~~~~~~~~~~~~~~

The maximum value of ``skip_data_bytes`` is 128 bytes, and the number of bytes must be a multiple of 8.

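The two constraints can be checked before the device argument is built. The
helper below is a hypothetical sketch, not part of DPDK; it assumes that 0 is
acceptable and means "no hole", which this guide does not state explicitly.

   .. code-block:: python

      MAX_SKIP_DATA_BYTES = 128  # upper limit stated in this guide


      def skip_data_bytes_devarg(pci_addr: str, skip: int) -> str:
          """Build a '-w' device argument after validating skip_data_bytes."""
          if not 0 <= skip <= MAX_SKIP_DATA_BYTES:
              raise ValueError("skip_data_bytes must be between 0 and 128")
          if skip % 8 != 0:
              raise ValueError("skip_data_bytes must be a multiple of 8")
          return f"-w {pci_addr},skip_data_bytes={skip}"

For example, ``skip_data_bytes_devarg("0002:01:00.2", 8)`` returns
``-w 0002:01:00.2,skip_data_bytes=8``, while a value such as 12 or 136 raises
``ValueError``.
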