..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

.. include:: <isonum.txt>

Intel Virtual Function Driver
=============================

Supported Intel® Ethernet Controllers (see the *DPDK Release Notes* for details)
support the following modes of operation in a virtualized environment:

*   **SR-IOV mode**: Involves direct assignment of part of the port resources to different guest operating systems
    using the PCI-SIG Single Root I/O Virtualization (SR-IOV) standard,
    also known as "native mode" or "pass-through" mode.
    In this chapter, this mode is referred to as IOV mode.

*   **VMDq mode**: Involves central management of the networking resources by an IO Virtual Machine (IOVM) or
    a Virtual Machine Monitor (VMM), also known as software switch acceleration mode.
    In this chapter, this mode is referred to as the Next Generation VMDq mode.

SR-IOV Mode Utilization in a DPDK Environment
---------------------------------------------

The DPDK uses the SR-IOV feature for hardware-based I/O sharing in IOV mode.
Therefore, it is possible to logically partition the SR-IOV capability of the Ethernet controller NIC resources and
expose them to a virtual machine as a separate PCI function called a "Virtual Function".
Refer to :numref:`figure_single_port_nic`.

Therefore, a NIC is logically distributed among multiple virtual machines (as shown in :numref:`figure_single_port_nic`),
while still having global data in common to share with the Physical Function and other Virtual Functions.
The DPDK fm10kvf, iavf, igbvf or ixgbevf Poll Mode Driver (PMD) serves the virtual PCI functions of
the Intel® 82576 Gigabit Ethernet Controller, the Intel® Ethernet Controller I350 family,
the Intel® 82599 10 Gigabit Ethernet Controller, the Intel® Fortville 10/40 Gigabit Ethernet Controller,
and the PCIe host-interface of the Intel Ethernet Switch FM10000 Series.
Meanwhile, the DPDK Poll Mode Driver (PMD) also supports the "Physical Function" of such NICs on the host.

The DPDK PF/VF Poll Mode Driver (PMD) supports the Layer 2 switch on the Intel® 82576 Gigabit Ethernet Controller,
Intel® Ethernet Controller I350 family, Intel® 82599 10 Gigabit Ethernet Controller,
and Intel® Fortville 10/40 Gigabit Ethernet Controller NICs, so that a guest can choose it for inter virtual machine traffic in SR-IOV mode.

For more detail on SR-IOV, please refer to the following documents:

*   `SR-IOV provides hardware based I/O sharing <http://www.intel.com/network/connectivity/solutions/vmdc.htm>`_

*   `PCI-SIG-Single Root I/O Virtualization Support on IA
    <http://www.intel.com/content/www/us/en/pci-express/pci-sig-single-root-io-virtualization-support-in-virtualization-technology-for-connectivity-paper.html>`_

*   `Scalable I/O Virtualized Servers <http://www.intel.com/content/www/us/en/virtualization/server-virtualization/scalable-i-o-virtualized-servers-paper.html>`_

.. _figure_single_port_nic:

.. figure:: img/single_port_nic.*

   Virtualization for a Single Port NIC in SR-IOV Mode


Physical and Virtual Function Infrastructure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following describes the Physical Function and Virtual Functions infrastructure for the supported Ethernet Controller NICs.

Virtual Functions operate under the respective Physical Function on the same NIC Port and therefore have no access
to the global NIC resources that are shared between other functions for the same NIC port.

A Virtual Function has basic access to the queue resources and control structures of the queues assigned to it.
For global resource access, a Virtual Function has to send a request to the Physical Function for that port,
and the Physical Function operates on the global resources on behalf of the Virtual Function.
For this out-of-band communication, an SR-IOV enabled NIC provides a memory buffer for each Virtual Function,
which is called a "Mailbox".

Intel® Ethernet Adaptive Virtual Function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Adaptive Virtual Function (IAVF) is an SR-IOV Virtual Function with the same device id (8086:1889) across different Intel Ethernet Controllers.
The IAVF driver is a VF driver that supports current and future Intel devices without requiring a VM update.
Because it is an adaptive VF driver, each new release of the VF driver can enable more advanced features in the VM,
provided the underlying hardware device supports them, in a device-agnostic way and without compromising the base functionality.
IAVF provides a generic hardware interface, and the interface between the IAVF driver and a compliant PF driver is specified.

Intel products starting with the Ethernet Controller 700 Series support the Adaptive Virtual Function.

Virtual Functions are generated in the usual way, and the resources assigned to a VF depend on the NIC infrastructure.

For more detail on the Adaptive Virtual Function, please refer to the following document:

*   `Intel® IAVF HAS <https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ethernet-adaptive-virtual-function-hardware-spec.pdf>`_

.. note::

    To use the DPDK IAVF PMD on an Intel® 700 Series Ethernet Controller, the device id (0x1889) needs to be specified during device
    assignment in the hypervisor. Taking QEMU as an example, the device assignment should carry the IAVF device id (0x1889) like
    ``-device vfio-pci,x-pci-device-id=0x1889,host=03:0a.0``.

    When IAVF is backed by an Intel® E810 device, the "Protocol Extraction" feature supported by the ice PMD is also
    available for the IAVF PMD. The same devargs with the same parameters can be applied to the IAVF PMD; for details please refer to
    the section ``Protocol extraction for per queue`` of ice.rst.
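
    As a hypothetical sketch only (the authoritative ``proto_xtr`` syntax and queue mapping are documented in ice.rst,
    and the queue indexes below are illustrative and must exist on the VF), protocol extraction could be requested like this:

    .. code-block:: console

        ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 -a 18:01.0,proto_xtr='[(1,2-3,8-9):tcp,10-13:vlan]' -- -i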

    Quanta size configuration is also supported when IAVF is backed by an Intel® E810 device, by setting the ``devargs``
    parameter ``quanta_size``, for example ``-a 18:00.0,quanta_size=2048``. The default value is 1024, and the quanta size must be
    a multiple of 64 in legacy host interface mode.

    When IAVF is backed by an Intel® E810 device or an Intel® 700 Series Ethernet device, the reset watchdog is enabled
    when the link state changes to down. The default period is 2000us, defined by ``IAVF_DEV_WATCHDOG_PERIOD``.
    Set the ``devargs`` parameter ``watchdog_period`` to adjust the watchdog period in microseconds, or set it to 0 to disable the watchdog,
    for example, ``-a 18:01.0,watchdog_period=5000`` or ``-a 18:01.0,watchdog_period=0``.

    Enable VF auto-reset by setting the ``devargs`` parameter like ``-a 18:01.0,auto_reset=1``
    when IAVF is backed by an Intel\ |reg| E810 device
    or an Intel\ |reg| 700 Series Ethernet device.

    Stop polling the Rx/Tx hardware queues when the link is down
    by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-down=1``
    when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 Series Ethernet device.
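
    As a combined sketch (the PCI address and values are illustrative only), several of these ``devargs``
    can be supplied in a single allow-list entry, for example:

    .. code-block:: console

        ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 \
            -a 18:01.0,watchdog_period=5000,auto_reset=1,no-poll-on-link-down=1 -- -i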

    Similarly, when IAVF is backed by an Intel\ |reg| E810 device
    or an Intel\ |reg| 700 Series Ethernet device,
    set the ``devargs`` parameter ``mbuf_check`` to enable Tx diagnostics.
    For example, ``-a 18:01.0,mbuf_check=<case>`` or ``-a 18:01.0,mbuf_check=[<case1>,<case2>...]``.
    Thereafter, ``rte_eth_xstats_get()`` can be used to get the error counts,
    which are collected in the ``tx_mbuf_error_packets`` xstats.
    In testpmd these can be shown via ``testpmd> show port xstats all``;
    a minimal usage sketch is given after the list below.
    Supported values for the ``case`` parameter are:

    * ``mbuf``: Check for corrupted mbuf.
    * ``size``: Check min/max packet length according to HW spec.
    * ``segment``: Check number of mbuf segments does not exceed HW limits.
    * ``offload``: Check for use of an unsupported offload flag.
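
    A minimal sketch of this diagnostic flow (the PCI address is an example):

    .. code-block:: console

        ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 -a 18:01.0,mbuf_check=[mbuf,size] -- -i
        testpmd> start
        testpmd> show port xstats all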

The PCIE host-interface of Intel Ethernet Switch FM10000 Series VF infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *64 Virtual Functions (VF)*
globally per PCIE host-interface of the Intel Ethernet Switch FM10000 Series device.
Each VF can have a maximum of 16 queue pairs.
The Physical Function in the host can only be configured by the Linux* fm10k driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]); the DPDK PMD PF driver does not support it yet.

For example,

*   Using Linux* fm10k driver:

    .. code-block:: console

        rmmod fm10k (To remove the fm10k module)
        insmod fm10k.ko max_vfs=2,2 (To enable two Virtual Functions per port)

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® X710/XL710 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, the programmer can enable a maximum of *128 Virtual Functions (VF)*
globally per Intel® X710/XL710 Gigabit Ethernet Controller NIC device.
The Physical Function in the host can be configured either by the Linux* i40e driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When both DPDK PMD PF/VF drivers are used, the whole NIC is taken over by the DPDK-based application.

For example,

*   Using Linux* i40e driver:

    .. code-block:: console

        rmmod i40e (To remove the i40e module)
        insmod i40e.ko max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using the DPDK PMD PF i40e driver, bind the PF device to ``vfio_pci`` or ``igb_uio`` and
    create VF devices. See :ref:`linux_gsg_binding_kernel`.

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.
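
    For instance, a minimal sketch of this flow, assuming the PF is at 0000:02:00.0, two VFs are wanted, and the
    ``igb_uio`` module from dpdk-kmods is used (it provides the ``max_vfs`` sysfs attribute used below):

    .. code-block:: console

        # Bind the PF to igb_uio (vfio_pci can be used instead, see the Linux GSG).
        ./usertools/dpdk-devbind.py --bind=igb_uio 0000:02:00.0
        # Create two VFs on this PF.
        echo 2 > /sys/bus/pci/devices/0000:02:00.0/max_vfs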

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

    For the Intel® X710/XL710 Gigabit Ethernet Controller, queues are in pairs. One queue pair means one receive queue and
    one transmit queue. The default number of queue pairs per VF is 4, and the maximum is 16.

Intel® 82599 10 Gigabit Ethernet Controller VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The programmer can enable a maximum of *63 Virtual Functions* and there must be *one Physical Function* per Intel® 82599
10 Gigabit Ethernet Controller NIC port.
The reason for this is that the device allows for a maximum of 128 queues per port and a virtual/physical function has to
have at least one queue pair (RX/TX).
The current implementation of the DPDK ixgbevf driver supports a single queue pair (RX/TX) per Virtual Function.
The Physical Function in the host can be configured either by the Linux* ixgbe driver
(in the case of the Linux Kernel-based Virtual Machine [KVM]) or by the DPDK PMD PF driver.
When both DPDK PMD PF/VF drivers are used, the whole NIC is taken over by the DPDK-based application.

For example,

*   Using Linux* ixgbe driver:

    .. code-block:: console

        rmmod ixgbe (To remove the ixgbe module)
        insmod ixgbe max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using the DPDK PMD PF ixgbe driver, bind the PF device to ``vfio_pci`` or ``igb_uio`` and
    create VF devices. See :ref:`linux_gsg_binding_kernel`.

    Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

*   Using the DPDK PMD PF ixgbe driver to enable VF RSS:

    Same steps as above to bind the PF device, create VF devices, and
    launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

    The number of queues available per VF (at most 4) depends on the total number of pools, which is
    determined by the maximum number of VFs at the PF initialization stage and the number of queues specified
    in the configuration:

    *   If the max number of VFs (max_vfs) is set in the range of 1 to 32:

        If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then there are in total 32
        pools (RTE_ETH_32_POOLS), and each VF can have 4 Rx queues;

        If the number of Rx queues is specified as 2 (``--rxq=2`` in testpmd), then there are in total 32
        pools (RTE_ETH_32_POOLS), and each VF can have 2 Rx queues;

    *   If the max number of VFs (max_vfs) is in the range of 33 to 64:

        If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then an error message is expected,
        as ``rxq`` is not valid in this case;

        If the number of Rx queues is 2 (``--rxq=2`` in testpmd), then there are in total 64 pools (RTE_ETH_64_POOLS),
        and each VF can have 2 Rx queues;

    On the host, to enable VF RSS functionality, the Rx mq mode should be set to RTE_ETH_MQ_RX_VMDQ_RSS
    or RTE_ETH_MQ_RX_RSS mode, and SR-IOV mode should be activated (max_vfs >= 1).
    The VF RSS information, such as the hash function, RSS key and RSS key length, also needs to be configured.
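
    As an illustrative sketch (the PCI address is an example, and max_vfs is assumed to have been set to 32 or
    fewer at PF initialization), the host testpmd could be launched with four Rx/Tx queues per pool:

    .. code-block:: console

        ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 -a 0000:02:00.0 -- -i --rxq=4 --txq=4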

.. note::

    The limitation of VF RSS on the Intel® 82599 10 Gigabit Ethernet Controller is that
    the hash and key are shared between the PF and all VFs, and the 128-entry RETA table is also shared
    between the PF and all VFs. Therefore it is not possible to query the hash and RETA content per
    VF on the guest; if needed, query them on the host for the shared RETA information.

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
However:

*   Virtual Functions 0 and 2 belong to Physical Function 0

*   Virtual Functions 1 and 3 belong to Physical Function 1

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Intel® 82576 Gigabit Ethernet Controller and Intel® Ethernet Controller I350 Family VF Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In a virtualized environment, an Intel® 82576 Gigabit Ethernet Controller serves up to eight virtual machines (VMs).
The controller has 16 TX and 16 RX queues.
They are generally referred to (or thought of) as queue pairs (one TX and one RX queue).
This gives the controller 16 queue pairs.

A pool is a group of queue pairs for assignment to the same VF, used for transmit and receive operations.
The controller has eight pools, with each pool containing two queue pairs, that is, two TX and two RX queues assigned to each VF.

In a virtualized environment, an Intel® Ethernet Controller I350 family device serves up to eight virtual machines (VMs) per port.
The eight queues can be accessed by eight different VMs if configured correctly (the I350 has four 1GbE ports, each with 8 TX and 8 RX queues);
that is, one transmit and one receive queue assigned to each VF.

For example,

*   Using Linux* igb driver:

    .. code-block:: console

        rmmod igb (To remove the igb module)
        insmod igb max_vfs=2,2 (To enable two Virtual Functions per port)

*   Using the DPDK PMD PF igb driver, bind the PF device to ``vfio_pci`` or ``igb_uio`` and
    create VF devices. See :ref:`linux_gsg_binding_kernel`.

    Launch DPDK testpmd/example or your own host daemon application using the DPDK PMD library.

Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a four-port NIC.
When you enable the Virtual Functions with the above command, the enabled functions have a Function#
represented by (Bus#, Device#, Function#) in sequence, starting from 0 to 7.
However:

*   Virtual Functions 0 and 4 belong to Physical Function 0

*   Virtual Functions 1 and 5 belong to Physical Function 1

*   Virtual Functions 2 and 6 belong to Physical Function 2

*   Virtual Functions 3 and 7 belong to Physical Function 3

.. note::

    The above is an important consideration to take into account when targeting specific packets to a selected port.

Validated Hypervisors
~~~~~~~~~~~~~~~~~~~~~

The validated hypervisor is:

*   KVM (Kernel Virtual Machine) with Qemu, version 0.14.0

However, since the hypervisor is bypassed and the Virtual Function devices are configured using the Mailbox interface,
the solution is hypervisor-agnostic.
Xen* and VMware* (when SR-IOV is supported) will also be able to support the DPDK with Virtual Function driver support.

Expected Guest Operating System in Virtual Machine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The expected guest operating systems in a virtualized environment are:

*   Fedora* 14 (64-bit)

*   Ubuntu* 10.04 (64-bit)

For supported kernel versions, refer to the *DPDK Release Notes*.

.. _intel_vf_kvm:

Setting Up a KVM Virtual Machine Monitor
----------------------------------------

The following describes a target environment:

*   Host Operating System: Fedora 14

*   Hypervisor: KVM (Kernel Virtual Machine) with Qemu version 0.14.0

*   Guest Operating System: Fedora 14

*   Linux Kernel Version: Refer to the *DPDK Getting Started Guide*

*   Target Applications: l2fwd, l3fwd-vf

The setup procedure is as follows:

#.  Before booting the Host OS, open **BIOS setup** and enable **Intel® VT features**.

#.  While booting the Host OS kernel, pass the intel_iommu=on kernel command line argument using GRUB.
    When using the DPDK PF driver on the host, pass the iommu=pt kernel command line argument in GRUB.

#.  Download qemu-kvm-0.14.0 from
    `http://sourceforge.net/projects/kvm/files/qemu-kvm/ <http://sourceforge.net/projects/kvm/files/qemu-kvm/>`_
    and install it in the Host OS using the following steps:

    When using a recent kernel (2.6.25+) with kvm modules included:

    .. code-block:: console

        tar xzf qemu-kvm-release.tar.gz
        cd qemu-kvm-release
        ./configure --prefix=/usr/local/kvm
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    When using an older kernel, or a kernel from a distribution without the kvm modules,
    you must download (from the same link), compile and install the modules yourself:

    .. code-block:: console

        tar xjf kvm-kmod-release.tar.bz2
        cd kvm-kmod-release
        ./configure
        make
        sudo make install
        sudo /sbin/modprobe kvm-intel

    qemu-kvm installs in the /usr/local/bin directory.

    For more details about KVM configuration and usage, please refer to:

    `http://www.linux-kvm.org/page/HOWTO1 <http://www.linux-kvm.org/page/HOWTO1>`_.

#.  Create a Virtual Machine and install Fedora 14 on the Virtual Machine.
    This is referred to as the Guest Operating System (Guest OS).

#.  Download and install the latest ixgbe driver from
    `intel.com <https://downloadcenter.intel.com/download/14687>`_.

#.  In the Host OS:

    When using the Linux kernel ixgbe driver, unload it and reload it with the max_vfs=2,2 argument:

    .. code-block:: console

        rmmod ixgbe
        modprobe ixgbe max_vfs=2,2

    When using the DPDK PMD PF driver, bind the PF device to ``vfio_pci`` or ``igb_uio`` and
    create VF devices. See :ref:`linux_gsg_binding_kernel`.

    Let's say we have a machine with four physical ixgbe ports:


        0000:02:00.0

        0000:02:00.1

        0000:0e:00.0

        0000:0e:00.1

    The steps mentioned above should result in two VFs for device 0000:02:00.0:

    .. code-block:: console

        ls -alrt /sys/bus/pci/devices/0000\:02\:00.0/virt*
        lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn1 -> ../0000:02:10.2
        lrwxrwxrwx. 1 root root 0 Apr 13 05:40 /sys/bus/pci/devices/0000:02:00.0/virtfn0 -> ../0000:02:10.0

    It also creates two VFs for device 0000:02:00.1:

    .. code-block:: console

        ls -alrt /sys/bus/pci/devices/0000\:02\:00.1/virt*
        lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn1 -> ../0000:02:10.3
        lrwxrwxrwx. 1 root root 0 Apr 13 05:51 /sys/bus/pci/devices/0000:02:00.1/virtfn0 -> ../0000:02:10.1

#.  List the PCI devices connected and notice that the Host OS shows two Physical Functions (traditional ports)
    and four Virtual Functions (two for each port).
    This is the result of the previous step.

#.  Insert the pci_stub module to hold the PCI devices that are freed from the default driver using the following command
    (see http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM Section 4 for more information):

    .. code-block:: console

        sudo /sbin/modprobe pci-stub

    Unbind the default driver from the PCI devices representing the Virtual Functions.
    A script to perform this action is as follows:

    .. code-block:: console

        echo "8086 10ed" > /sys/bus/pci/drivers/pci-stub/new_id
        echo 0000:08:10.0 > /sys/bus/pci/devices/0000:08:10.0/driver/unbind
        echo 0000:08:10.0 > /sys/bus/pci/drivers/pci-stub/bind

    where 0000:08:10.0 is the Virtual Function visible in the Host OS.

#.  Now, start the Virtual Machine by running the following command:

    .. code-block:: console

        /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

    where:

        — -m = memory to assign

        — -smp = number of smp cores

        — -boot = boot option

        — -hda = virtual disk image

        — -device = device to attach

    .. note::

        — The pci-assign,host=08:10.0 value indicates that you want to attach a PCI device
        to a Virtual Machine and the respective (Bus:Device.Function)
        numbers should be passed for the Virtual Function to be attached.

        — qemu-kvm-0.14.0 allows a maximum of four PCI devices assigned to a VM,
        but this is qemu-kvm version dependent since qemu-kvm-0.14.1 allows a maximum of five PCI devices.

        — qemu-system-x86_64 also has a -cpu command line option that is used to select the cpu_model
        to emulate in a Virtual Machine. Therefore, it can be used as:

        .. code-block:: console

            /usr/local/kvm/bin/qemu-system-x86_64 -cpu ?

            (to list all available cpu_models)

            /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -cpu host -smp 4 -boot c -hda lucid.qcow2 -device pci-assign,host=08:10.0

            (to use the same cpu_model equivalent to the host cpu)

        For more information, please refer to: `http://wiki.qemu.org/Features/CPUModels <http://wiki.qemu.org/Features/CPUModels>`_.

#.  If vfio-pci is used to pass through the device instead of pci-assign, steps 8 and 9 need to be updated to bind the device to vfio-pci
    and to replace pci-assign with vfio-pci when starting the virtual machine.

    .. code-block:: console

        sudo /sbin/modprobe vfio-pci

        echo "8086 10ed" > /sys/bus/pci/drivers/vfio-pci/new_id
        echo 0000:08:10.0 > /sys/bus/pci/devices/0000:08:10.0/driver/unbind
        echo 0000:08:10.0 > /sys/bus/pci/drivers/vfio-pci/bind

        /usr/local/kvm/bin/qemu-system-x86_64 -m 4096 -smp 4 -boot c -hda lucid.qcow2 -device vfio-pci,host=08:10.0

#.  Install and run the DPDK host application to take over the Physical Function. For example:

    .. code-block:: console

        ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 -- -i

#.  Finally, access the Guest OS using vncviewer with the localhost:5900 port and check the lspci command output in the Guest OS.
    The virtual functions will be listed as available for use.

#.  Configure and install the DPDK on the Guest OS as normal, that is, there is no change to the normal installation procedure.

.. note::

    If you are unable to compile the DPDK and you are getting "error: CPU you selected does not support x86-64 instruction set",
    power off the Guest OS and start the virtual machine with the correct -cpu option in the qemu-system-x86_64 command as shown in step 9.
    You must select the best x86_64 cpu_model to emulate, or you can select the host option if available.

.. note::

    Run the DPDK l2fwd sample application in the Guest OS with Hugepages enabled.
    For the expected benchmark performance, you must pin the cores from the Guest OS to the Host OS (taskset can be used to do this) and
    you must also look at the PCI Bus layout on the board to ensure you are not running the traffic over the QPI Interface.

.. note::

    *   The Virtual Machine Manager (the Fedora package name is virt-manager) is a utility for virtual machine management
        that can also be used to create, start, stop and delete virtual machines.
        If this option is used, step 2 and 6 in the instructions provided will be different.

    *   virsh, a command line utility for virtual machine management,
        can also be used to bind and unbind devices to a virtual machine in Ubuntu.
        If this option is used, step 6 in the instructions provided will be different.

    *   The Virtual Machine Monitor (see :numref:`figure_perf_benchmark`) is equivalent to a Host OS with KVM installed as described in the instructions.

.. _figure_perf_benchmark:

.. figure:: img/perf_benchmark.*

   Performance Benchmark Setup


DPDK SR-IOV PMD PF/VF Driver Usage Model
----------------------------------------

Fast Host-based Packet Processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Software Defined Network (SDN) trends are demanding fast host-based packet handling.
In a virtualization environment,
the DPDK VF PMD delivers the same throughput as in a non-VT native environment.

With such fast packet processing in the host instance, many services such as filtering, QoS and
DPI can be offloaded onto the host fast path.

:numref:`figure_fast_pkt_proc` shows the scenario where some VMs directly communicate externally via their VFs,
while others connect to a virtual switch and share the same uplink bandwidth.

.. _figure_fast_pkt_proc:

.. figure:: img/fast_pkt_proc.*

   Fast Host-based Packet Processing


SR-IOV (PF/VF) Approach for Inter-VM Communication
--------------------------------------------------

Inter-VM data communication is one of the traffic bottlenecks in virtualization platforms.
SR-IOV device assignment helps a VM to attach the real device, taking advantage of the bridge in the NIC.
So VF-to-VF traffic within the same physical port (VM0<->VM1) has hardware acceleration.
However, when traffic crosses physical ports (VM0<->VM2), there is no such hardware bridge.
In this case, the DPDK PMD PF driver provides host forwarding between such VMs.

:numref:`figure_inter_vm_comms` shows an example.
In this case an update of the MAC address lookup tables in both the NIC and host DPDK application is required.

In the NIC, the destination MAC address that belongs to the VM behind another physical port is written to the PF-specific pool.
So when a packet comes in, its destination MAC address matches and the packet is forwarded to the host DPDK PMD application.

In the host DPDK application, the behavior is similar to L2 forwarding,
that is, the packet is forwarded to the correct PF pool.
The SR-IOV NIC switch forwards the packet to a specific VM according to the destination MAC address,
which belongs to the destination VF on the VM.

.. _figure_inter_vm_comms:

.. figure:: img/inter_vm_comms.*

   Inter-VM Communication


Windows Support
---------------

*   The IAVF PMD is currently supported only inside a Windows guest created on a Linux host.

*   Physical PCI resources are exposed as virtual functions
    into the Windows VM using the SR-IOV pass-through feature.

*   Create a Windows guest on a Linux host using the KVM hypervisor.
    Refer to the steps mentioned in the above section: :ref:`intel_vf_kvm`.

*   In the Host machine, download and install the kernel Ethernet driver
    for `i40e <https://downloadcenter.intel.com/download/24411>`_
    or `ice <https://downloadcenter.intel.com/download/29746>`_.

*   For the Windows guest, install the NetUIO driver
    in place of the existing built-in (inbox) Virtual Function driver.

*   To load the NetUIO driver, follow the steps mentioned in the `dpdk-kmods repository
    <https://git.dpdk.org/dpdk-kmods/tree/windows/netuio/README.rst>`_.


Inline IPsec Support
--------------------

*   The IAVF PMD supports inline crypto processing depending on the underlying
    hardware crypto capabilities. The IPsec Security Gateway Sample Application
    supports inline IPsec processing for the IAVF PMD. For more details see the
    IPsec Security Gateway Sample Application and Security library
    documentation.


Diagnostic Utilities
--------------------

Register mbuf dynfield to test Tx LLDP
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Register an mbuf dynfield ``IAVF_TX_LLDP_DYNFIELD`` on ``dev_start``
to indicate the need to send an LLDP packet.
This dynfield needs to be set to 1 when preparing the packet.

For the ``dpdk-testpmd`` application, the Tx port needs to be stopped and restarted for the setting to take effect.

Usage::

    testpmd> set tx lldp on
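
For example, assuming port 0 is the IAVF Tx port, one possible ordering (a sketch, not the only valid one) is::

    testpmd> port stop 0
    testpmd> set tx lldp on
    testpmd> port start 0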


Limitations or Known Issues
---------------------------

16 Byte RX Descriptor setting is not available
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the VF's RX descriptor size is decided by the PF. There is no PF-VF
interface for the VF to request an RX descriptor size, nor an interface to notify
the VF of its own RX descriptor size.
None of the available kernel PF driver versions support the 16 byte RX descriptor.
If the Linux kernel driver is used as the host driver while the DPDK iavf PMD is
used as the VF driver, DPDK cannot choose the 16 byte receive descriptor, because
the RX descriptor is already set to 32 bytes by all existing kernel drivers.
In the future, if any kernel driver supports the 16 byte RX descriptor, the user
should make sure the DPDK VF uses the same RX descriptor size.

i40e: VF performance is impacted by PCI extended tag setting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To reach maximum NIC performance in the VF, the PCI extended tag must be
enabled. But the kernel driver does not set this feature during initialization.
So when running traffic on a VF which is managed by the kernel PF driver, a
significant NIC performance downgrade has been observed (for 64 byte packets,
there is about a 25% line-rate downgrade for a 25GbE device and about 35% for a
40GbE device).

For kernel versions >= 4.11, the kernel's PCI driver will enable the extended
tag if it detects that the device supports it. So by default, this is not an
issue. For kernels <= 4.11, or when the PCI extended tag is disabled, it can be
enabled using the steps below.

#. Get the current value of the PCI configuration register::

      setpci -s <XX:XX.X> a8.w

#. Set bit 8::

      value = value | 0x100

#. Set the PCI configuration register with the new value::

      setpci -s <XX:XX.X> a8.w=<value>
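
As a worked example with hypothetical values, if the device is at ``08:00.0``
and the register currently reads ``0021``, setting bit 8 (0x0021 | 0x0100) gives ``0121``::

      setpci -s 08:00.0 a8.w
      0021
      setpci -s 08:00.0 a8.w=0121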

i40e: VLAN strip of VF
~~~~~~~~~~~~~~~~~~~~~~

The VF VLAN strip function is only supported by i40e kernel driver versions >= 2.1.26.

i40e: VLAN filtering of VF
~~~~~~~~~~~~~~~~~~~~~~~~~~

For i40e driver 2.17.15, configuring VLAN filters from the DPDK VF is unsupported.
When applying VLAN filters on the VF, they must first be configured on the
corresponding PF.

ice: VF inserts VLAN tag incorrectly on AVX-512 Tx path
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the kernel driver requests the VF to use the L2TAG2 field of the Tx context
descriptor to insert the hardware offload VLAN tag,
the AVX-512 Tx path cannot handle this case correctly
due to its lack of support for the Tx context descriptor.

The VLAN tag will be inserted in the wrong location (inner position of QinQ)
on the AVX-512 Tx path.
That is inconsistent with the behavior of the PF (outer position of QinQ).
Ice kernel driver versions newer than 1.8.9 request the use of L2TAG2
and are affected by this issue.

Set the parameter ``--force-max-simd-bitwidth`` to 64/128/256
to avoid selecting the AVX-512 Tx path.
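
For example, a minimal testpmd invocation (the PCI address is an example) that avoids the AVX-512 Tx path:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-3 -n 4 -a 18:01.0 --force-max-simd-bitwidth=256 -- -i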

ice: VLAN tag length not included in MTU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When configuring the MTU for a VF, the MTU must not include the VLAN tag length.
In practice, when the kernel driver configures VLAN filtering for a VF,
the VLAN tag length is automatically added to the MTU when configuring queues.
As a consequence, attempting to configure a VF port with an MTU that,
together with a VLAN tag header, exceeds the maximum supported MTU
will cause the port configuration to fail if the kernel driver has configured VLAN filtering on that VF.
743