..  BSD LICENSE
    Copyright(c) 2016 Red Hat, Inc. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


PVP reference benchmark setup using testpmd
===========================================

This guide lists the steps required to set up a PVP benchmark using testpmd as
a simple forwarder between NICs and Vhost interfaces. The goal of this setup
is to have a reference PVP benchmark without using external vSwitches (OVS,
VPP, ...), making it easier to obtain reproducible results and to facilitate
continuous integration testing.

The guide covers two ways of launching the VM, either by directly calling the
QEMU command line, or by relying on libvirt. It has been tested with DPDK
v16.11 using RHEL7 for both host and guest.


Setup overview
--------------

.. _figure_pvp_2nics:

.. figure:: img/pvp_2nics.*

   PVP setup using 2 NICs

In this diagram, each red arrow represents one logical core. This use-case
requires 6 dedicated logical cores. A forwarding configuration with a single
NIC is also possible, requiring 3 logical cores.


Host setup
----------

In this setup, we isolate 6 cores (from CPU2 to CPU7) on the same NUMA
node. Two cores are assigned to the VM vCPUs running testpmd and four are
assigned to testpmd on the host.


Host tuning
~~~~~~~~~~~

#. In the BIOS, disable turbo-boost and hyper-threading.

#. Append these options to the kernel command line:

   .. code-block:: console

      intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on

#. Disable hyper-threading at runtime if necessary, or if the BIOS is not
   accessible:

   .. code-block:: console

      cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \
          | sort | uniq \
          | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}'

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude the isolated CPUs (CPU2 to CPU7) from the writeback cpumask:

   .. code-block:: console

      echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0xfc # Isolate CPU2 to CPU7 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
        echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done
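After rebooting with the new kernel command line, it may be worth checking that
the tuning took effect before going further. This is an optional sanity check,
not part of the original procedure; note that ``/sys/devices/system/cpu/isolated``
is only present on recent kernels:

    .. code-block:: console

       cat /proc/cmdline                                  # options listed above are present
       grep HugePages_Total /proc/meminfo                 # 6 hugepages allocated
       cat /sys/devices/system/cpu/isolated               # 2-7, if the file exists
       cat /sys/bus/workqueue/devices/writeback/cpumask   # ffffff03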
Qemu build
~~~~~~~~~~

Build Qemu:

    .. code-block:: console

       git clone git://git.qemu.org/qemu.git
       cd qemu
       mkdir bin
       cd bin
       ../configure --target-list=x86_64-softmmu
       make


DPDK build
~~~~~~~~~~

Build DPDK:

    .. code-block:: console

       git clone git://dpdk.org/dpdk
       cd dpdk
       export RTE_SDK=$PWD
       make install T=x86_64-native-linuxapp-gcc DESTDIR=install


Testpmd launch
~~~~~~~~~~~~~~

#. Assign NICs to DPDK:

   .. code-block:: console

      modprobe vfio-pci
      $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1

   .. Note::

      The Sandy Bridge family seems to have some IOMMU limitations giving poor
      performance results. To achieve good performance on these machines
      consider using UIO instead.

#. Launch the testpmd application:

   .. code-block:: console

      $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \
          --vdev 'net_vhost0,iface=/tmp/vhost-user1' \
          --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \
          --portmask=f -i --rxq=1 --txq=1 \
          --nb-cores=4 --forward-mode=io

   With this command, isolated CPUs 2 to 5 will be used as lcores for PMD threads.

#. In testpmd interactive mode, set the portlist to obtain the correct port
   chaining:

   .. code-block:: console

      set portlist 0,2,1,3
      start
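Optionally, the resulting lcore and port assignment can be checked from the
testpmd prompt before traffic is started. These are standard testpmd console
commands; the exact output layout depends on the DPDK version:

    .. code-block:: console

       testpmd> show config fwd
       testpmd> show port info all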
VM launch
~~~~~~~~~

The VM may be launched either by calling QEMU directly, or by using libvirt.

Qemu way
^^^^^^^^

Launch QEMU with two Virtio-net devices paired to the vhost-user sockets
created by testpmd. The example below uses the default Virtio-net options,
but other options may be specified, for example to disable mergeable buffers
or indirect descriptors.

    .. code-block:: console

       <QEMU path>/bin/x86_64-softmmu/qemu-system-x86_64 \
           -enable-kvm -cpu host -m 3072 -smp 3 \
           -chardev socket,id=char0,path=/tmp/vhost-user1 \
           -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
           -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \
           -chardev socket,id=char1,path=/tmp/vhost-user2 \
           -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
           -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \
           -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \
           -numa node,memdev=mem -mem-prealloc \
           -net user,hostfwd=tcp::1002$1-:22 -net nic \
           -qmp unix:/tmp/qmp.socket,server,nowait \
           -monitor stdio <vm_image>.qcow2

You can use this `qmp-vcpu-pin <https://patchwork.kernel.org/patch/9361617/>`_
script to pin vCPUs.

It can be used as follows, for example to pin 3 vCPUs to CPUs 1, 6 and 7,
where isolated CPUs 6 and 7 will be used as lcores for the Virtio PMDs:

    .. code-block:: console

       export PYTHONPATH=$PYTHONPATH:<QEMU path>/scripts/qmp
       ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7

Libvirt way
^^^^^^^^^^^

Some initial steps are required for libvirt to be able to connect to testpmd's
sockets.

First, SELinux policy needs to be set to permissive, since testpmd is
generally run as root (note that a reboot is required):

    .. code-block:: console

       cat /etc/selinux/config

       # This file controls the state of SELinux on the system.
       # SELINUX= can take one of these three values:
       #     enforcing - SELinux security policy is enforced.
       #     permissive - SELinux prints warnings instead of enforcing.
       #     disabled - No SELinux policy is loaded.
       SELINUX=permissive

       # SELINUXTYPE= can take one of three two values:
       #     targeted - Targeted processes are protected,
       #     minimum - Modification of targeted policy.
       #               Only selected processes are protected.
       #     mls - Multi Level Security protection.
       SELINUXTYPE=targeted


Also, Qemu needs to be run as root, which has to be specified in
``/etc/libvirt/qemu.conf``:

    .. code-block:: console

       user = "root"

Once the domain has been created, the following snippet is an extract of the
most important information (hugepages, vCPU pinning, Virtio PCI devices):

    .. code-block:: xml

       <domain type='kvm'>
         <memory unit='KiB'>3145728</memory>
         <currentMemory unit='KiB'>3145728</currentMemory>
         <memoryBacking>
           <hugepages>
             <page size='1048576' unit='KiB' nodeset='0'/>
           </hugepages>
           <locked/>
         </memoryBacking>

         <vcpu placement='static'>3</vcpu>
         <cputune>
           <vcpupin vcpu='0' cpuset='1'/>
           <vcpupin vcpu='1' cpuset='6'/>
           <vcpupin vcpu='2' cpuset='7'/>
           <emulatorpin cpuset='0'/>
         </cputune>

         <numatune>
           <memory mode='strict' nodeset='0'/>
         </numatune>

         <os>
           <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
           <boot dev='hd'/>
         </os>

         <cpu mode='host-passthrough'>
           <topology sockets='1' cores='3' threads='1'/>
           <numa>
             <cell id='0' cpus='0-2' memory='3145728' unit='KiB' memAccess='shared'/>
           </numa>
         </cpu>

         <devices>
           <interface type='vhostuser'>
             <mac address='56:48:4f:53:54:01'/>
             <source type='unix' path='/tmp/vhost-user1' mode='client'/>
             <model type='virtio'/>
             <driver name='vhost' rx_queue_size='256'/>
             <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0'/>
           </interface>
           <interface type='vhostuser'>
             <mac address='56:48:4f:53:54:02'/>
             <source type='unix' path='/tmp/vhost-user2' mode='client'/>
             <model type='virtio'/>
             <driver name='vhost' rx_queue_size='256'/>
             <address type='pci' domain='0x0000' bus='0x00' slot='0x11' function='0x0'/>
           </interface>
         </devices>
       </domain>
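The complete domain XML is not reproduced here. Assuming the definition has
been saved to a file (``pvp-vm.xml`` is only a placeholder name), the domain
can be defined and started with virsh as usual, and the vCPU pinning checked
once it is running:

    .. code-block:: console

       virsh define pvp-vm.xml
       virsh start <domain name>
       virsh vcpupin <domain name>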
Guest setup
-----------


Guest tuning
~~~~~~~~~~~~

#. Append these options to the kernel command line:

   .. code-block:: console

      default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude isolated CPU1 and CPU2 from the writeback cpumask:

   .. code-block:: console

      echo 1 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0x6 # Isolate CPU1 and CPU2 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
        echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done


DPDK build
~~~~~~~~~~

Build DPDK:

    .. code-block:: console

       git clone git://dpdk.org/dpdk
       cd dpdk
       export RTE_SDK=$PWD
       make install T=x86_64-native-linuxapp-gcc DESTDIR=install


Testpmd launch
~~~~~~~~~~~~~~

Probe the vfio module without IOMMU support:

    .. code-block:: console

       modprobe -r vfio_iommu_type1
       modprobe -r vfio
       modprobe vfio enable_unsafe_noiommu_mode=1
       cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
       modprobe vfio-pci

Bind the virtio-net devices to DPDK:

    .. code-block:: console

       $RTE_SDK/usertools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0

Start testpmd:

    .. code-block:: console

       $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \
           --proc-type auto --file-prefix pg -- \
           --portmask=3 --forward-mode=macswap --port-topology=chained \
           --disable-rss -i --rxq=1 --txq=1 \
           --rxd=256 --txd=256 --nb-cores=2 --auto-start


Results template
----------------

The template below should be used when sharing results:

    .. code-block:: none

       Traffic Generator: <Test equipment (e.g. IXIA, Moongen, ...)>
       Acceptable Loss: <n>%
       Validation run time: <n>min
       Host DPDK version/commit: <version, SHA-1>
       Guest DPDK version/commit: <version, SHA-1>
       Patches applied: <link to patchwork>
       QEMU version/commit: <version>
       Virtio features: <features (e.g. mrg_rxbuf='off', leave empty if default)>
       CPU: <CPU model>, <CPU frequency>
       NIC: <NIC model>
       Result: <n> Mpps
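Before recording a result, it may be useful to confirm from both testpmd
prompts (host and guest) that packets are flowing on all ports while the
traffic generator is running. These are standard testpmd console commands:

    .. code-block:: console

       testpmd> clear port stats all
       testpmd> show port stats all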