..  BSD LICENSE

    Copyright(c) 2016 Red Hat, Inc. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


PVP reference benchmark setup using testpmd
===========================================

This guide lists the steps required to set up a PVP benchmark using testpmd as
a simple forwarder between NICs and Vhost interfaces. The goal of this setup
is to have a reference PVP benchmark without using external vSwitches (OVS,
VPP, ...) to make it easier to obtain reproducible results and to facilitate
continuous integration testing.

The guide covers two ways of launching the VM, either by directly calling the
QEMU command line, or by relying on libvirt. It has been tested with DPDK
v16.11, using RHEL7 for both host and guest.


Setup overview
--------------

.. _figure_pvp_2nics:

.. figure:: img/pvp_2nics.*

   PVP setup using 2 NICs

In this diagram, each red arrow represents one logical core. This use case
requires 6 dedicated logical cores. A forwarding configuration with a single
NIC is also possible, requiring 3 logical cores.


Host setup
----------

In this setup, we isolate 6 cores (from CPU2 to CPU7) on the same NUMA
node. Two cores are assigned to the VM vCPUs running testpmd, and four are
assigned to testpmd on the host.
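
Optionally, the NUMA placement of the isolated cores and of the NICs can be
double-checked before going further. The PCI address below is the one used
later in this guide; adapt it to your system:

   .. code-block:: console

      # List the CPUs attached to each NUMA node
      lscpu | grep "NUMA node"

      # Display the NUMA node of one of the benchmark NICs
      # (adapt the PCI address to your setup)
      cat /sys/bus/pci/devices/0000:11:00.0/numa_node
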
Host tuning
~~~~~~~~~~~

#. In the BIOS, disable turbo-boost and hyper-threading.

#. Append these options to the kernel command line:

   .. code-block:: console

      intel_pstate=disable mce=ignore_ce default_hugepagesz=1G hugepagesz=1G hugepages=6 isolcpus=2-7 rcu_nocbs=2-7 nohz_full=2-7 iommu=pt intel_iommu=on

#. Disable hyper-threading at runtime, if necessary or if the BIOS is not
   accessible:

   .. code-block:: console

      cat /sys/devices/system/cpu/cpu*[0-9]/topology/thread_siblings_list \
          | sort | uniq \
          | awk -F, '{system("echo 0 > /sys/devices/system/cpu/cpu"$2"/online")}'

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude the isolated CPUs from the writeback cpumask (the mask ``ffffff03``
   clears bits 2 to 7, i.e. the isolated CPUs):

   .. code-block:: console

      echo ffffff03 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0xfc # Isolate CPU2 to CPU7 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
          echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done


Qemu build
~~~~~~~~~~

Build Qemu:

   .. code-block:: console

      git clone git://git.qemu.org/qemu.git
      cd qemu
      mkdir bin
      cd bin
      ../configure --target-list=x86_64-softmmu
      make


DPDK build
~~~~~~~~~~

Build DPDK:

   .. code-block:: console

      git clone git://dpdk.org/dpdk
      cd dpdk
      export RTE_SDK=$PWD
      make install T=x86_64-native-linuxapp-gcc DESTDIR=install


Testpmd launch
~~~~~~~~~~~~~~

#. Assign NICs to DPDK:

   .. code-block:: console

      modprobe vfio-pci
      $RTE_SDK/install/sbin/dpdk-devbind -b vfio-pci 0000:11:00.0 0000:11:00.1

   .. Note::

      The Sandy Bridge family seems to have some IOMMU limitations that give
      poor performance results. To achieve good performance on these machines,
      consider using UIO instead.

#. Launch the testpmd application:

   .. code-block:: console

      $RTE_SDK/install/bin/testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4 \
          --vdev 'net_vhost0,iface=/tmp/vhost-user1' \
          --vdev 'net_vhost1,iface=/tmp/vhost-user2' -- \
          --portmask=f --disable-hw-vlan -i --rxq=1 --txq=1 \
          --nb-cores=4 --forward-mode=io

   With this command, isolated CPUs 2 to 5 will be used as lcores for the PMD
   threads.

#. In testpmd interactive mode, set the portlist to obtain the correct port
   chaining:

   .. code-block:: console

      set portlist 0,2,1,3
      start

   This pairs each physical NIC with one of the Vhost interfaces, as required
   by the PVP topology, instead of chaining the two NICs together.


VM launch
~~~~~~~~~

The VM may be launched either by calling QEMU directly, or by using libvirt.

Qemu way
^^^^^^^^

Launch QEMU with two Virtio-net devices paired to the vhost-user sockets
created by testpmd. The example below uses the default Virtio-net options, but
options may be specified, for example to disable mergeable buffers or indirect
descriptors.

   .. code-block:: console

      <QEMU path>/bin/x86_64-softmmu/qemu-system-x86_64 \
          -enable-kvm -cpu host -m 3072 -smp 3 \
          -chardev socket,id=char0,path=/tmp/vhost-user1 \
          -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
          -device virtio-net-pci,netdev=mynet1,mac=52:54:00:02:d9:01,addr=0x10 \
          -chardev socket,id=char1,path=/tmp/vhost-user2 \
          -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce \
          -device virtio-net-pci,netdev=mynet2,mac=52:54:00:02:d9:02,addr=0x11 \
          -object memory-backend-file,id=mem,size=3072M,mem-path=/dev/hugepages,share=on \
          -numa node,memdev=mem -mem-prealloc \
          -net user,hostfwd=tcp::1002$1-:22 -net nic \
          -qmp unix:/tmp/qmp.socket,server,nowait \
          -monitor stdio <vm_image>.qcow2

You can use this `qmp-vcpu-pin <https://patchwork.kernel.org/patch/9361617/>`_
script to pin vCPUs.

It can be used as follows, for example to pin 3 vCPUs to CPUs 1, 6 and 7,
where isolated CPUs 6 and 7 will be used as lcores for the Virtio PMDs:

   .. code-block:: console

      export PYTHONPATH=$PYTHONPATH:<QEMU path>/scripts/qmp
      ./qmp-vcpu-pin -s /tmp/qmp.socket 1 6 7
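
Optionally, the affinity of the QEMU threads can be listed to confirm that the
pinning has been applied (this assumes a single ``qemu-system-x86_64`` process
is running on the host):

   .. code-block:: console

      # Print the CPU affinity of every QEMU thread (vCPU threads included)
      taskset -acp $(pgrep -f qemu-system-x86_64)
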
Libvirt way
^^^^^^^^^^^

Some initial steps are required for libvirt to be able to connect to testpmd's
sockets.

First, the SELinux policy needs to be set to permissive, since testpmd is
generally run as root (note that a reboot is required):

   .. code-block:: console

      cat /etc/selinux/config

      # This file controls the state of SELinux on the system.
      # SELINUX= can take one of these three values:
      #     enforcing - SELinux security policy is enforced.
      #     permissive - SELinux prints warnings instead of enforcing.
      #     disabled - No SELinux policy is loaded.
      SELINUX=permissive

      # SELINUXTYPE= can take one of three values:
      #     targeted - Targeted processes are protected,
      #     minimum - Modification of targeted policy.
      #               Only selected processes are protected.
      #     mls - Multi Level Security protection.
      SELINUXTYPE=targeted


Also, Qemu needs to be run as root, which has to be specified in
``/etc/libvirt/qemu.conf``:

   .. code-block:: console

      user = "root"

Once the domain has been created, the following snippet is an extract of the
most important information (hugepages, vCPU pinning, Virtio PCI devices):

   .. code-block:: xml

      <domain type='kvm'>
        <memory unit='KiB'>3145728</memory>
        <currentMemory unit='KiB'>3145728</currentMemory>
        <memoryBacking>
          <hugepages>
            <page size='1048576' unit='KiB' nodeset='0'/>
          </hugepages>
          <locked/>
        </memoryBacking>

        <vcpu placement='static'>3</vcpu>
        <cputune>
          <vcpupin vcpu='0' cpuset='1'/>
          <vcpupin vcpu='1' cpuset='6'/>
          <vcpupin vcpu='2' cpuset='7'/>
          <emulatorpin cpuset='0'/>
        </cputune>
        <numatune>
          <memory mode='strict' nodeset='0'/>
        </numatune>

        <os>
          <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
          <boot dev='hd'/>
        </os>

        <cpu mode='host-passthrough'>
          <topology sockets='1' cores='3' threads='1'/>
          <numa>
            <cell id='0' cpus='0-2' memory='3145728' unit='KiB' memAccess='shared'/>
          </numa>
        </cpu>

        <devices>
          <interface type='vhostuser'>
            <mac address='56:48:4f:53:54:01'/>
            <source type='unix' path='/tmp/vhost-user1' mode='client'/>
            <model type='virtio'/>
            <driver name='vhost' rx_queue_size='256' />
            <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0'/>
          </interface>
          <interface type='vhostuser'>
            <mac address='56:48:4f:53:54:02'/>
            <source type='unix' path='/tmp/vhost-user2' mode='client'/>
            <model type='virtio'/>
            <driver name='vhost' rx_queue_size='256' />
            <address type='pci' domain='0x0000' bus='0x00' slot='0x11' function='0x0'/>
          </interface>
        </devices>
      </domain>
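
The domain can then be defined and started with ``virsh``; the XML file name
and domain name below are only examples:

   .. code-block:: console

      # File and domain names are examples, adapt them to your setup
      virsh define pvp-vm.xml
      virsh start pvp-vm
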
Guest setup
-----------


Guest tuning
~~~~~~~~~~~~

#. Append these options to the kernel command line:

   .. code-block:: console

      default_hugepagesz=1G hugepagesz=1G hugepages=1 intel_iommu=on iommu=pt isolcpus=1,2 rcu_nocbs=1,2 nohz_full=1,2

#. Disable NMIs:

   .. code-block:: console

      echo 0 > /proc/sys/kernel/nmi_watchdog

#. Exclude isolated CPU1 and CPU2 from the writeback cpumask:

   .. code-block:: console

      echo 1 > /sys/bus/workqueue/devices/writeback/cpumask

#. Isolate CPUs from IRQs:

   .. code-block:: console

      clear_mask=0x6 # Isolate CPU1 and CPU2 from IRQs
      for i in /proc/irq/*/smp_affinity
      do
          echo "obase=16;$(( 0x$(cat $i) & ~$clear_mask ))" | bc > $i
      done


DPDK build
~~~~~~~~~~

Build DPDK:

   .. code-block:: console

      git clone git://dpdk.org/dpdk
      cd dpdk
      export RTE_SDK=$PWD
      make install T=x86_64-native-linuxapp-gcc DESTDIR=install


Testpmd launch
~~~~~~~~~~~~~~

Probe the vfio module without iommu:

   .. code-block:: console

      modprobe -r vfio_iommu_type1
      modprobe -r vfio
      modprobe vfio enable_unsafe_noiommu_mode=1
      cat /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
      modprobe vfio-pci

Bind the virtio-net devices to DPDK:

   .. code-block:: console

      $RTE_SDK/tools/dpdk-devbind.py -b vfio-pci 0000:00:10.0 0000:00:11.0

Start testpmd:

   .. code-block:: console

      $RTE_SDK/install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 \
          --proc-type auto --file-prefix pg -- \
          --portmask=3 --forward-mode=macswap --port-topology=chained \
          --disable-hw-vlan --disable-rss -i --rxq=1 --txq=1 \
          --rxd=256 --txd=256 --nb-cores=2 --auto-start


Results template
----------------

The template below should be used when sharing results:

   .. code-block:: none

      Traffic Generator: <Test equipment (e.g. IXIA, Moongen, ...)>
      Acceptable Loss: <n>%
      Validation run time: <n>min
      Host DPDK version/commit: <version, SHA-1>
      Guest DPDK version/commit: <version, SHA-1>
      Patches applied: <link to patchwork>
      QEMU version/commit: <version>
      Virtio features: <features (e.g. mrg_rxbuf='off', leave empty if default)>
      CPU: <CPU model>, <CPU frequency>
      NIC: <NIC model>
      Result: <n> Mpps