..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2016 Intel Corporation.

Vhost Sample Application
========================

The vhost sample application demonstrates integration of the Data Plane
Development Kit (DPDK) with the Linux* KVM hypervisor by implementing the
vhost-net offload API. The sample application performs simple packet
switching between virtual machines based on Media Access Control (MAC)
address or Virtual Local Area Network (VLAN) tag. The splitting of Ethernet
traffic from an external switch is performed in hardware by the Virtual
Machine Device Queues (VMDQ) and Data Center Bridging (DCB) features of
the Intel® 82599 10 Gigabit Ethernet Controller.

Testing steps
-------------

This section shows how to test a typical PVP (physical-VM-physical) case
with the dpdk-vhost sample, in which packets are received from the physical
NIC port first and enqueued to the VM's Rx queue. Through the guest testpmd's
default forwarding mode (io forward), those packets are put into
the Tx queue. The dpdk-vhost example, in turn, gets the packets and
puts them back to the same physical NIC port.

Build
~~~~~

To compile the sample application see :doc:`compiling`.

The application is located in the ``vhost`` sub-directory.

.. note::
   In this example, you need to build DPDK both on the host and inside the guest.

.. _vhost_app_run_vm:

Start the VM
~~~~~~~~~~~~

.. code-block:: console

    qemu-system-x86_64 -machine accel=kvm -cpu host \
        -m $mem -object memory-backend-file,id=mem,size=$mem,mem-path=/dev/hugepages,share=on \
        -mem-prealloc -numa node,memdev=mem \
        \
        -chardev socket,id=char1,path=/tmp/sock0,server \
        -netdev type=vhost-user,id=hostnet1,chardev=char1 \
        -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:00:00:14 \
        ...

.. note::
    For basic vhost-user support, QEMU 2.2 (or above) is required. For
    some specific features, a higher version might be needed, such as
    QEMU 2.7 (or above) for the reconnect feature.


Start the vswitch example
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    ./dpdk-vhost -l 0-3 -n 4 --socket-mem 1024 \
        -- --socket-file /tmp/sock0 --client \
        ...


Check the `Parameters`_ section for an explanation of these parameters.

.. _vhost_app_run_dpdk_inside_guest:

Run testpmd inside guest
~~~~~~~~~~~~~~~~~~~~~~~~

Make sure you have DPDK built inside the guest. Also make sure the
corresponding virtio-net PCI device is bound to a userspace I/O driver
(``vfio-pci`` in this example), which can be done by:

.. code-block:: console

    modprobe vfio-pci
    dpdk/usertools/dpdk-devbind.py -b vfio-pci 0000:00:04.0

Then start testpmd for packet forwarding testing.

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-1 -- -i
    > start tx_first

For more information about vIOMMU, NO-IOMMU and VFIO, please refer to the
:doc:`/../linux_gsg/linux_drivers` section of the DPDK Getting Started Guide.

Inject packets
--------------

When a virtio-net device is connected to dpdk-vhost, a VLAN tag starting from
1000 is assigned to it. Configure your packet generator with the right MAC
address and VLAN tag; you should then see the following log on the
dpdk-vhost console.
It means the setup is working::

    VHOST_DATA: (0) mac 52:54:00:00:00:14 and vlan 1000 registered


.. _vhost_app_parameters:

Parameters
----------

**--socket-file path**
Specifies the vhost-user socket file path.

**--client**
DPDK vhost-user acts as the client when this option is given.
In client mode, QEMU creates the socket file. Otherwise, DPDK
creates it. Put simply, the server side creates the socket file.


**--vm2vm mode**
The vm2vm parameter sets the mode of packet switching between guests in
the host.

- 0 disables vm2vm, implying that a VM's packets always go to the NIC port.
- 1 means normal MAC lookup packet routing.
- 2 means hardware mode packet forwarding between guests. Packets go to the
  NIC port, and the hardware L2 switch determines, based on the packet's
  destination MAC address and VLAN tag, which guest the packet should be
  forwarded to or whether it should be sent externally.

**--mergeable 0|1**
Set 0/1 to disable/enable the mergeable Rx feature. It's disabled by default.

**--stats interval**
The stats parameter controls the printing of virtio-net device statistics.
The parameter specifies an interval (in seconds) at which to print statistics;
an interval of 0 seconds disables statistics.

**--rx-retry 0|1**
The rx-retry option enables/disables enqueue retries when the guest's Rx queue
is full. This feature resolves packet loss observed at high data
rates by allowing the receive path to delay and retry. This option is
enabled by default.

**--rx-retry-num num**
The rx-retry-num option specifies the number of retries on an Rx burst. It
takes effect only when rx retry is enabled. The default value is 4.

**--rx-retry-delay usec**
The rx-retry-delay option specifies the timeout (in microseconds) between
retries on an Rx burst. It takes effect only when rx retry is enabled. The
default value is 15.

**--builtin-net-driver**
When this option is given, a very simple vhost-user net driver is used, which
demonstrates how to use the generic vhost APIs. It is disabled by default.

**--dmas**
This parameter specifies the DMA device assigned to a vhost device.
The async vhost-user net driver is used if ``--dmas`` is set.
For example,
``--dmas [txd0@00:04.0,txd1@00:04.1,rxd0@00:04.2,rxd1@00:04.3]`` means use
DMA channel 00:04.0/00:04.2 for vhost device 0 enqueue/dequeue operation
and use DMA channel 00:04.1/00:04.3 for vhost device 1 enqueue/dequeue
operation. The index of the device corresponds to the socket file in order:
vhost device 0 is created through the first socket file, vhost
device 1 is created through the second socket file, and so on.

**--total-num-mbufs 0-N**
This parameter sets the number of mbufs to be allocated in mbuf pools.
The default value is 147456. It can be used if the launch of a port fails
due to a shortage of mbufs.

**--tso 0|1**
Disables/enables TCP segmentation offload.

**--tx-csum 0|1**
Disables/enables TX checksum offload.

**-p mask**
Port mask which specifies the ports to be used.

Common Issues
-------------

* QEMU fails to allocate memory on hugetlbfs, with an error like the
  following::

      file_ram_alloc: can't mmap RAM pages: Cannot allocate memory

  When running QEMU, the above error indicates that it has failed to allocate
  memory for the Virtual Machine on the hugetlbfs.
  This is typically due to
  insufficient free hugepages to support the allocation request. The
  number of free hugepages can be checked as follows:

  .. code-block:: console

      dpdk-hugepages.py --show

  The command above indicates how many hugepages are free to support QEMU's
  allocation request.

* Failed to build DPDK in VM

  Make sure the "-cpu host" QEMU option is given.

* Device start fails if NIC's max queues > the default number of 128

  The mbuf pool size depends on the MAX_QUEUES configuration. If the NIC's
  max queue number is larger than 128, device start will fail due to
  insufficient mbufs. This can be adjusted using the ``--total-num-mbufs``
  parameter.

* Option "builtin-net-driver" is incompatible with QEMU

  QEMU vhost net device start will fail if the protocol feature is not
  negotiated. The DPDK virtio-user PMD can be used as a replacement for QEMU.

* Device start fails when enabling "builtin-net-driver" without memory
  pre-allocation

  The builtin example doesn't support dynamic memory allocation. When the vhost
  backend enables "builtin-net-driver", the "--socket-mem" option should be
  added on the virtio-user PMD side as a startup item.
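
For the hugepage shortage described in the first issue above, the check can
also be scripted before QEMU is launched. The following is a minimal Python
sketch, not part of DPDK: the helper names ``parse_hugepages`` and
``can_allocate`` and the sample values are illustrative assumptions. It reads
the ``HugePages_Free`` and ``Hugepagesize`` fields in the format used by
``/proc/meminfo``, so it only considers the default hugepage size and ignores
NUMA placement.

.. code-block:: python

    import re

    # Sample text in /proc/meminfo format (values are made up for illustration).
    SAMPLE_MEMINFO = """\
    MemTotal:       16384000 kB
    HugePages_Total:    1024
    HugePages_Free:      512
    Hugepagesize:       2048 kB
    """.replace("    ", "")


    def parse_hugepages(meminfo_text):
        """Return (free_pages, page_size_kb) from /proc/meminfo-style text."""
        free = re.search(r"^HugePages_Free:\s+(\d+)", meminfo_text, re.MULTILINE)
        size = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
        if free is None or size is None:
            raise ValueError("hugepage fields not found in meminfo text")
        return int(free.group(1)), int(size.group(1))


    def can_allocate(meminfo_text, vm_mem_mb):
        """Check whether free hugepages cover a VM memory size given in MB."""
        free_pages, page_kb = parse_hugepages(meminfo_text)
        return free_pages * page_kb >= vm_mem_mb * 1024


    # 512 free pages of 2048 kB each provide exactly 1024 MB.
    print(can_allocate(SAMPLE_MEMINFO, 1024))  # True
    print(can_allocate(SAMPLE_MEMINFO, 2048))  # False

On a real host, pass the contents of ``/proc/meminfo`` instead of the sample
text, and use the same memory size that is given to QEMU through ``-m``.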