..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Vhost Sample Application
========================

The vhost sample application demonstrates integration of the Data Plane Development Kit (DPDK)
with the Linux* KVM hypervisor by implementing the vhost-net offload API.
The sample application performs simple packet switching between virtual machines based on Media Access Control
(MAC) address or Virtual Local Area Network (VLAN) tag.
The splitting of Ethernet traffic from an external switch is performed in hardware by the Virtual Machine Device Queues
(VMDQ) and Data Center Bridging (DCB) features of the Intel® 82599 10 Gigabit Ethernet Controller.

Background
----------

Virtio networking (virtio-net) was developed as the Linux* KVM para-virtualized method for communicating network packets
between host and guest.
It was found that virtio-net performance was poor due to context switching and packet copying between host, guest, and QEMU.
The following figure shows the system architecture for virtio-based networking (virtio-net).

.. _figure_16:

**Figure 16. QEMU Virtio-net (prior to vhost-net)**

.. image19_png has been renamed

|qemu_virtio_net|

The Linux* Kernel vhost-net module was developed as an offload mechanism for virtio-net.
The vhost-net module enables KVM (QEMU) to offload the servicing of virtio-net devices to the vhost-net kernel module,
reducing the context switching and packet copies in the virtual dataplane.

This is achieved by QEMU sharing the following information with the vhost-net module through the vhost-net API:

*   The layout of the guest memory space, to enable the vhost-net module to translate addresses.

*   The locations of virtual queues in QEMU virtual address space,
    to enable the vhost module to read/write directly to and from the virtqueues.

*   An event file descriptor (eventfd) configured in KVM to send interrupts to the virtio-net device driver in the guest.
    This enables the vhost-net module to notify (call) the guest.

*   An eventfd configured in KVM to be triggered on writes to the virtio-net device's
    Peripheral Component Interconnect (PCI) config space.
    This enables the vhost-net module to receive notifications (kicks) from the guest.

The following figure shows the system architecture for virtio-net networking with vhost-net offload.

.. _figure_17:

**Figure 17. Virtio with Linux* Kernel Vhost**

.. image20_png has been renamed

|virtio_linux_vhost|

Sample Code Overview
--------------------

The DPDK vhost-net sample code demonstrates KVM (QEMU) offloading the servicing of a Virtual Machine's (VM's)
virtio-net devices to a DPDK-based application in place of the kernel's vhost-net module.

The DPDK vhost-net sample code is a simple packet switching application with the following features:

*   Management of virtio-net device creation/destruction events.

*   Mapping of the VM's physical memory into the DPDK vhost-net sample code's address space.

*   Triggering/receiving notifications to/from VMs via eventfds.

*   A virtio-net back-end implementation providing a subset of virtio-net features.

*   Packet switching between virtio-net devices and the network interface card,
    including using VMDQs to reduce the switching that needs to be performed in software.

The following figure shows the architecture of the Vhost sample application.

.. _figure_18:

**Figure 18. Vhost-net Architectural Overview**

.. image21_png has been renamed

|vhost_net_arch|

The following figure shows the flow of packets through the vhost-net sample application.

.. _figure_19:

**Figure 19. Packet Flow Through the vhost-net Sample Application**

.. image22_png has been renamed

|vhost_net_sample_app|

Supported Distributions
-----------------------

The examples in this section have been validated with the following distributions:

*   Fedora* 18

*   Fedora* 19

Prerequisites
-------------

This section lists prerequisite packages that must be installed.

Installing Packages on the Host
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The vhost sample code uses the following packages: fuse, fuse-devel, and kernel-modules-extra.

#.  Install Fuse Development Libraries and headers:

    .. code-block:: console

        yum -y install fuse fuse-devel

#.  Install the Cuse Kernel Module:

    .. code-block:: console

        yum -y install kernel-modules-extra

Setting up the Execution Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The vhost sample code requires that QEMU allocates a VM's memory on the hugetlbfs file system.
As the vhost sample code also requires hugepages,
the best practice is to partition the system into separate hugepage mount points for the VMs and the vhost sample code.

.. note::

    This is best-practice only and is not mandatory.
    For systems that only support 2 MB page sizes,
    both QEMU and vhost sample code can use the same hugetlbfs mount point without issue.

**QEMU**

VMs with gigabytes of memory can benefit from having QEMU allocate their memory from 1 GB huge pages.
1 GB huge pages must be allocated at boot time by passing kernel parameters through the grub boot loader.

#.  Calculate the maximum memory usage of all VMs to be run on the system.
    Then round this value up to the nearest gigabyte; this is the amount of hugepage memory the execution environment will require.

#.  Edit the /etc/default/grub file, and add the following to the GRUB_CMDLINE_LINUX entry:

    .. code-block:: console

        GRUB_CMDLINE_LINUX="... hugepagesz=1G hugepages=<Number of hugepages required> default_hugepagesz=1G"

#.  Update the grub boot loader:

    .. code-block:: console

        grub2-mkconfig -o /boot/grub2/grub.cfg

#.  Reboot the system.

#.  The hugetlbfs mount point (/dev/hugepages) should now default to allocating gigabyte pages.

.. note::

    Making the above modification will change the system default hugepage size to 1 GB for all applications.

**Vhost Sample Code**

In this section, we create a second hugetlbfs mount point to allocate hugepages for the DPDK vhost sample code.

#.  Allocate sufficient 2 MB pages for the DPDK vhost sample code:

    .. code-block:: console

        echo 256 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

#.  Mount hugetlbfs at a separate mount point for 2 MB pages:

    .. code-block:: console

        mount -t hugetlbfs nodev /mnt/huge -o pagesize=2M

The above steps can be automated by doing the following:

#.  Edit /etc/fstab to add an entry to automatically mount the second hugetlbfs mount point
    (the page size matches the 2 MB mount created manually above):

    ::

        hugetlbfs <tab> /mnt/huge <tab> hugetlbfs defaults,pagesize=2M 0 0

#.  Edit the /etc/default/grub file, and add the following to the GRUB_CMDLINE_LINUX entry:

    ::

        GRUB_CMDLINE_LINUX="... hugepagesz=2M hugepages=256 ... default_hugepagesz=1G"

#.  Update the grub bootloader:

    .. code-block:: console

        grub2-mkconfig -o /boot/grub2/grub.cfg

#.  Reboot the system.

.. note::

    Ensure that the default hugepage size after this setup is 1 GB.

Setting up the Guest Execution Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For testing purposes, it is recommended that the DPDK testpmd sample application is used in the guest to forward packets.
The reasons for this are discussed in Section 22.7, "Running the Virtual Machine (QEMU)".

The testpmd application forwards packets between pairs of Ethernet devices,
so it requires an even number of Ethernet devices (virtio or otherwise) to execute.
It is therefore recommended to create multiples of two virtio-net devices for each Virtual Machine, either through libvirt or
at the command line as follows.

.. note::

    Observe that in the example, "-device" and "-netdev" are repeated for two virtio-net devices.

.. code-block:: console

    user@target:~$ qemu-system-x86_64 ... \
    -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd> \
    -device virtio-net-pci,netdev=hostnet1,id=net1 \
    -netdev tap,id=hostnet2,vhost=on,vhostfd=<open fd> \
    -device virtio-net-pci,netdev=hostnet2,id=net2


Compiling the Sample Code
-------------------------

#.  Go to the examples directory:

    .. code-block:: console

        export RTE_SDK=/path/to/rte_sdk
        cd ${RTE_SDK}/examples/vhost-net

#.  Set the target (a default target is used if not specified). For example:

    .. code-block:: console

        export RTE_TARGET=x86_64-native-linuxapp-gcc

    See the DPDK Getting Started Guide for possible RTE_TARGET values.

#.  Build the application:

    .. code-block:: console

        make

    .. note::

        For zero copy, first disable CONFIG_RTE_MBUF_SCATTER_GATHER,
        CONFIG_RTE_LIBRTE_IP_FRAG and CONFIG_RTE_LIBRTE_DISTRIBUTOR
        in the config file, then re-configure and compile the core library, and then build the application:

        .. code-block:: console

            vi ${RTE_SDK}/config/common_linuxapp

        Change it as follows:

        ::

            CONFIG_RTE_MBUF_SCATTER_GATHER=n
            CONFIG_RTE_LIBRTE_IP_FRAG=n
            CONFIG_RTE_LIBRTE_DISTRIBUTOR=n

        .. code-block:: console

            cd ${RTE_SDK}
            make config ${RTE_TARGET}
            make install ${RTE_TARGET}
            cd ${RTE_SDK}/examples/vhost
            make

#.  Go to the eventfd_link directory:

    .. code-block:: console

        cd ${RTE_SDK}/examples/vhost-net/eventfd_link

#.  Build the eventfd_link kernel module:

    .. code-block:: console

        make

Running the Sample Code
-----------------------

#.  Install the cuse kernel module:

    .. code-block:: console

        modprobe cuse

#.  Go to the eventfd_link directory:

    .. code-block:: console

        export RTE_SDK=/path/to/rte_sdk
        cd ${RTE_SDK}/examples/vhost-net/eventfd_link

#.  Install the eventfd_link module:

    .. code-block:: console

        insmod ./eventfd_link.ko

#.  Go to the examples directory:

    .. code-block:: console

        export RTE_SDK=/path/to/rte_sdk
        cd ${RTE_SDK}/examples/vhost-net

#.  Run the vhost-switch sample code:

    .. code-block:: console

        user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- -p 0x1 --dev-basename usvhost --dev-index 1

.. note::

    The huge-dir parameter instructs the DPDK to allocate its memory from the 2 MB page hugetlbfs.

Parameters
~~~~~~~~~~

**Basename and Index.**
The DPDK vhost-net sample code uses a Linux* character device to communicate with QEMU.
The basename and the index are used to generate the character device's name::

    /dev/<basename>-<index>

The index parameter is provided for situations where multiple instances of the virtual switch are required.

For compatibility with the QEMU wrapper script, a base name of "usvhost" and an index of "1" should be used:

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- -p 0x1 --dev-basename usvhost --dev-index 1

**vm2vm.**
The vm2vm parameter sets the mode of packet switching between guests in the host.
A value of "0" disables vm2vm, which means that packets transmitted by a virtual machine always go to the Ethernet port.
A value of "1" selects software mode packet forwarding between guests; this requires a packet copy in vhost,
so it is valid only in the one-copy implementation and invalid in the zero copy implementation.
A value of "2" selects hardware mode packet forwarding between guests; packets go to the Ethernet port,
and the hardware L2 switch determines, based on the packet's destination MAC address and VLAN tag,
which guest the packet should be forwarded to or whether it should be sent externally.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --vm2vm [0,1,2]

**Mergeable Buffers.**
The mergeable buffers parameter controls how virtio-net descriptors are used for virtio-net headers.
In a disabled state, one virtio-net header is used per packet buffer;
in an enabled state one virtio-net header is used for multiple packets.
The default value is 0 (disabled), since the virtio-net drivers in recent kernels show performance degradation when this feature is enabled.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --mergeable [0,1]

**Stats.**
The stats parameter controls the printing of virtio-net device statistics.
The parameter specifies the interval, in seconds, at which statistics are printed; an interval of 0 seconds disables statistics.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --stats [0,n]

**RX Retry.**
The rx-retry option enables/disables enqueue retries when the guest's RX queue is full.
This feature resolves a packet loss that is observed at high data rates
by allowing the receive path to delay and retry.
This option is enabled by default.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry [0,1]

**RX Retry Number.**
The rx-retry-num option specifies the number of retries on an RX burst.
It takes effect only when rx retry is enabled.
The default value is 4.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry 1 --rx-retry-num 5

**RX Retry Delay Time.**
The rx-retry-delay option specifies the delay (in microseconds) between retries on an RX burst.
It takes effect only when rx retry is enabled.
The default value is 15.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rx-retry 1 --rx-retry-delay 20

**Zero copy.**
The zero copy option enables/disables the zero copy mode for RX/TX packets.
In zero copy mode, the packet buffer address from the guest is translated into a host physical address
and then set directly as the DMA address.
If the zero copy mode is disabled, then one copy mode is utilized in the sample.
This option is disabled by default.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy [0,1]

**RX descriptor number.**
The RX descriptor number option specifies the number of Ethernet RX descriptors.
The Linux legacy virtio-net driver uses the vring descriptors differently from the DPDK-based virtio-net PMD:
the former is likely to allocate half of them for virtio headers and the other half for frame buffers,
while the latter allocates all of them for frame buffers.
This leads to a different number of available frame buffers in the vring,
and therefore to a different number of Ethernet RX descriptors that can be used in zero copy mode.
The option is valid only when zero copy mode is enabled. The value is 32 by default.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy 1 --rx-desc-num [0, n]

**TX descriptor number.**
The TX descriptor number option specifies the number of Ethernet TX descriptors. It is valid only when zero copy mode is enabled.
The value is 64 by default.

.. code-block:: console

    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --zero-copy 1 --tx-desc-num [0, n]

Running the Virtual Machine (QEMU)
----------------------------------

QEMU must be executed with specific parameters to:

*   Ensure the guest is configured to use virtio-net network adapters.

    .. code-block:: console

        user@target:~$ qemu-system-x86_64 ... -device virtio-net-pci,netdev=hostnet1,id=net1 ...

*   Ensure the guest's virtio-net network adapter is configured with offloads disabled.

    .. code-block:: console

        user@target:~$ qemu-system-x86_64 ... -device virtio-net-pci,netdev=hostnet1,id=net1,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off

*   Redirect QEMU to communicate with the DPDK vhost-net sample code in place of the vhost-net kernel module.

    .. code-block:: console

        user@target:~$ qemu-system-x86_64 ... -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd> ...

*   Enable the vhost-net sample code to map the VM's memory into its own process address space.

    .. code-block:: console

        user@target:~$ qemu-system-x86_64 ... -mem-prealloc -mem-path /dev/hugepages ...

.. note::

    The QEMU wrapper (qemu-wrap.py) is a Python script designed to automate the QEMU configuration described above.
    It also facilitates integration with libvirt, although the script may also be used standalone without libvirt.

Redirecting QEMU to vhost-net Sample Code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To redirect QEMU to the vhost-net sample code implementation of the vhost-net API,
an open file descriptor must be passed to QEMU running as a child process.

.. code-block:: python

    #!/usr/bin/python
    import os
    import subprocess
    fd = os.open("/dev/usvhost-1", os.O_RDWR)
    # The descriptor number is converted to text; pass_fds (Python 3) keeps it open in the child.
    subprocess.call("qemu-system-x86_64 ... . -netdev tap,id=vhostnet0,vhost=on,vhostfd=" + str(fd) + "...", shell=True, pass_fds=(fd,))

.. note::

    This process is automated in the QEMU wrapper script discussed in Section 22.7.3.

Mapping the Virtual Machine's Memory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For the DPDK vhost-net sample code to be run correctly, QEMU must allocate the VM's memory on hugetlbfs.
This is done by specifying mem-prealloc and mem-path when executing QEMU.
The vhost-net sample code accesses the virtio-net device's virtual rings and packet buffers
by finding and mapping the VM's physical memory on hugetlbfs.
In this case, the path passed to the guest should be that of the 1 GB page hugetlbfs:

.. code-block:: console

    user@target:~$ qemu-system-x86_64 ... -mem-prealloc -mem-path /dev/hugepages ...

.. note::

    This process is automated in the QEMU wrapper script discussed in Section 22.7.3.

QEMU Wrapper Script
~~~~~~~~~~~~~~~~~~~

The QEMU wrapper script automatically detects and calls QEMU with the parameters required
to integrate with the vhost sample code.
It performs the following actions:

*   Automatically detects the location of the hugetlbfs and inserts this into the command line parameters.

*   Automatically opens a file descriptor for each virtio-net device and inserts it into the command line parameters.

*   Disables offloads on each virtio-net device.

*   Calls QEMU, passing both the command line parameters passed to the script itself and those it has auto-detected.

The QEMU wrapper script will automatically configure calls to QEMU:

.. code-block:: console

    user@target:~$ qemu-wrap.py -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu SandyBridge -smp 4,sockets=4,cores=1,threads=1 \
    -netdev tap,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1 -hda <disk img> -m 4096

which will become the following call to QEMU:

.. code-block:: console

    /usr/local/bin/qemu-system-x86_64 -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu SandyBridge -smp 4,sockets=4,cores=1,threads=1 \
    -netdev tap,id=hostnet1,vhost=on,vhostfd=<open fd> \
    -device virtio-net-pci,netdev=hostnet1,id=net1,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off \
    -hda <disk img> -m 4096 -mem-path /dev/hugepages -mem-prealloc

Libvirt Integration
~~~~~~~~~~~~~~~~~~~

The QEMU wrapper script (qemu-wrap.py) "wraps" libvirt calls to QEMU,
such that QEMU is called with the correct parameters described above.
To call the QEMU wrapper automatically from libvirt, the following configuration changes must be made:

*   Place the QEMU wrapper script in libvirt's binary search PATH ($PATH).
    A good location is in the directory that contains the QEMU binary.

*   Ensure that the script has the same owner/group and file permissions as the QEMU binary.

*   Update the VM xml file using virsh edit <vm name>:

    *   Set the VM to use the launch script.

    *   Set the emulator path contained in the <emulator></emulator> tags. For example,
        replace <emulator>/usr/bin/qemu-kvm</emulator> with <emulator>/usr/bin/qemu-wrap.py</emulator>

    *   Set the VM's virtio-net devices to use vhost-net offload:

        .. code-block:: xml

            <interface type="network">
            <model type="virtio"/>
            <driver name="vhost"/>
            </interface>

    *   Enable libvirt to access the DPDK Vhost sample code's character device file by adding it
        to the controllers cgroup for libvirtd using the following settings:

        ::

            cgroup_controllers = [ ... "devices", ... ]
            clear_emulator_capabilities = 0
            user = "root"
            group = "root"
            cgroup_device_acl = [
                "/dev/null", "/dev/full", "/dev/zero",
                "/dev/random", "/dev/urandom",
                "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
                "/dev/rtc", "/dev/hpet", "/dev/net/tun",
                "/dev/<devbase-name>-<index>",
            ]

*   Disable SELinux or set it to permissive mode.

*   Mount the cgroup device controller:

    .. code-block:: console

        user@target:~$ mkdir /dev/cgroup
        user@target:~$ mount -t cgroup none /dev/cgroup -o devices

*   Restart the libvirtd system process.

    For example, on Fedora* "systemctl restart libvirtd.service"

*   Edit the configuration parameters section of the script:

    *   Configure the "emul_path" variable to point to the QEMU emulator.

        .. code-block:: python

            emul_path = "/usr/local/bin/qemu-system-x86_64"

    *   Configure the "us_vhost_path" variable to point to the DPDK vhost-net sample code's character device name.
        The DPDK vhost-net sample code's character device will be in the format "/dev/<basename>-<index>".

        .. code-block:: python

            us_vhost_path = "/dev/usvhost-1"

Common Issues
~~~~~~~~~~~~~

**QEMU failing to allocate memory on hugetlbfs.**

::

    file_ram_alloc: can't mmap RAM pages: Cannot allocate memory

When running QEMU, the above error implies that it has failed to allocate memory for the Virtual Machine on the hugetlbfs.
This is typically due to insufficient hugepages being free to support the allocation request.
The number of hugepages can be checked as follows:

.. code-block:: console

    user@target:~$ cat /sys/kernel/mm/hugepages/hugepages-<pagesize>/nr_hugepages

The command above reports the total number of hugepages of that size in the pool;
the free_hugepages file in the same directory reports how many are currently free to support QEMU's allocation request.
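The hugepage check can also be scripted before launching a VM. The following is a minimal sketch, assuming the standard sysfs layout in which each page-size directory contains ``nr_hugepages`` (total) and ``free_hugepages`` (currently free). The function names and the ``sysfs_root`` parameter are illustrative only and are not part of the sample code or the QEMU wrapper script.

.. code-block:: python

    import os


    def hugepage_counts(pagesize_kb, sysfs_root="/sys/kernel/mm/hugepages"):
        """Return total and free page counts for one hugepage pool."""
        pool = os.path.join(sysfs_root, "hugepages-%dkB" % pagesize_kb)
        counts = {}
        for name in ("nr_hugepages", "free_hugepages"):
            with open(os.path.join(pool, name)) as f:
                counts[name] = int(f.read().strip())
        return counts


    def can_allocate(vm_mem_mb, pagesize_kb, sysfs_root="/sys/kernel/mm/hugepages"):
        """Check whether enough free hugepages exist to back vm_mem_mb of guest memory."""
        vm_mem_kb = vm_mem_mb * 1024
        # Round up: a partially used page still consumes a whole hugepage.
        needed = (vm_mem_kb + pagesize_kb - 1) // pagesize_kb
        return hugepage_counts(pagesize_kb, sysfs_root)["free_hugepages"] >= needed

For example, ``can_allocate(4096, 1048576)`` asks whether a guest started with ``-m 4096`` can be backed by free 1 GB pages.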

Running DPDK in the Virtual Machine
-----------------------------------

For the DPDK vhost-net sample code to switch packets into the VM,
the sample code must first learn the MAC address of the VM's virtio-net device.
The sample code detects the address from packets being transmitted from the VM, similar to a learning switch.

This behavior requires no special action or configuration with the Linux* virtio-net driver in the VM
as the Linux* Kernel will automatically transmit packets during device initialization.
However, DPDK-based applications must be modified to automatically transmit packets during initialization
to facilitate the DPDK vhost-net sample code's MAC learning.

The DPDK testpmd application can be configured to automatically transmit packets during initialization
and to act as an L2 forwarding switch.

Testpmd MAC Forwarding
~~~~~~~~~~~~~~~~~~~~~~

At high packet rates, a minor packet loss may be observed.
To resolve this issue, a "wait and retry" mode is implemented in the testpmd and vhost sample code.
In the "wait and retry" mode, if the virtqueue is found to be full, testpmd waits for a period of time before retrying to enqueue packets.

The "wait and retry" algorithm is implemented in DPDK testpmd as a forwarding method called "mac_retry".
The following sequence diagram describes the algorithm in detail.

.. _figure_20:

**Figure 20. Packet Flow on TX in DPDK-testpmd**

.. image23_png has been renamed

|tx_dpdk_testpmd|

Running Testpmd
~~~~~~~~~~~~~~~

The testpmd application is automatically built when DPDK is installed.
Run the testpmd application as follows:

.. code-block:: console

    user@target:~$ x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i

The destination MAC address for packets transmitted on each port can be set at the command line:

.. code-block:: console

    user@target:~$ x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 --socket-mem 128 -- --burst=64 -i --eth-peer=0,aa:bb:cc:dd:ee:ff --eth-peer=1,ff:ee:dd:cc:bb:aa

*   Packets received on port 1 will be forwarded on port 0 to MAC address aa:bb:cc:dd:ee:ff.

*   Packets received on port 0 will be forwarded on port 1 to MAC address ff:ee:dd:cc:bb:aa.

The testpmd application can then be configured to act as an L2 forwarding application:

.. code-block:: console

    testpmd> set fwd mac_retry

The testpmd application can then be configured to start processing packets,
transmitting packets first so the DPDK vhost sample code on the host can learn the MAC address:

.. code-block:: console

    testpmd> start tx_first

.. note::

    "set fwd mac_retry" is used in place of "set fwd mac_fwd" to ensure the retry feature is activated.

Passing Traffic to the Virtual Machine Device
---------------------------------------------

For a virtio-net device to receive traffic,
the traffic's Layer 2 header must include both the virtio-net device's MAC address and VLAN tag.
The DPDK sample code behaves in a similar manner to a learning switch in that
it learns the MAC address of the virtio-net devices from the first transmitted packet.
On learning the MAC address,
the DPDK vhost sample code prints a message with the MAC address and VLAN tag of the virtio-net device.
For example:

.. code-block:: console

    DATA: (0) MAC_ADDRESS cc:bb:bb:bb:bb:bb and VLAN_TAG 1000 registered

The above message indicates that device 0 has been registered with MAC address cc:bb:bb:bb:bb:bb and VLAN tag 1000.
Any packets received on the NIC with these values are placed on the device's receive queue.
When a virtio-net device transmits packets, the VLAN tag is added to the packet by the DPDK vhost sample code.

.. |vhost_net_arch| image:: img/vhost_net_arch.*

.. |qemu_virtio_net| image:: img/qemu_virtio_net.*

.. |tx_dpdk_testpmd| image:: img/tx_dpdk_testpmd.*

.. |vhost_net_sample_app| image:: img/vhost_net_sample_app.*

.. |virtio_linux_vhost| image:: img/virtio_linux_vhost.*
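
The learning-switch behaviour described in "Passing Traffic to the Virtual Machine Device" can be illustrated with a small model. This is a conceptual sketch only, not the sample application's code: the ``LearningTable`` class and its methods are hypothetical, and the VLAN tag is assumed to be supplied by the caller because this guide does not describe how the tag is chosen.

.. code-block:: python

    class LearningTable:
        """Toy model of MAC learning keyed on (MAC address, VLAN tag)."""

        def __init__(self):
            self._device_by_key = {}   # (mac, vlan) -> device id
            self._registered = set()   # devices that have already been learned

        def learn(self, device_id, src_mac, vlan_tag):
            """Register a device from its first transmitted packet.

            Returns a message modelled on the sample output shown in this
            section, or None if the device was already registered.
            """
            if device_id in self._registered:
                return None
            self._device_by_key[(src_mac.lower(), vlan_tag)] = device_id
            self._registered.add(device_id)
            return "(%d) MAC_ADDRESS %s and VLAN_TAG %d registered" % (
                device_id, src_mac.lower(), vlan_tag)

        def lookup(self, dst_mac, vlan_tag):
            """Return the device whose receive queue should get the packet, or None."""
            return self._device_by_key.get((dst_mac.lower(), vlan_tag))

In this model a packet is delivered only when both its destination MAC address and its VLAN tag match a registered device, which mirrors the requirement that the Layer 2 header carries both values.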