..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Intel Corporation.

.. _virtio_user_for_container_networking:

Virtio_user for Container Networking
====================================

Containers have become increasingly popular thanks to strengths such as low
overhead, fast boot-up time, and ease of deployment. How to use DPDK to
accelerate container networking has become a common question for users. There
are two use models for running DPDK inside containers, as shown in
:numref:`figure_use_models_for_running_dpdk_in_containers`.

.. _figure_use_models_for_running_dpdk_in_containers:

.. figure:: img/use_models_for_running_dpdk_in_containers.*

   Use models of running DPDK inside container

This page covers only the aggregation model.

Overview
--------

The virtual device, virtio-user, works with an unmodified vhost-user backend
and is designed for high-performance user space container networking or
inter-process communication (IPC).

An overview of accelerating container networking with virtio-user is shown
in :numref:`figure_virtio_user_for_container_networking`.

.. _figure_virtio_user_for_container_networking:

.. figure:: img/virtio_user_for_container_networking.*

   Overview of accelerating container networking by virtio-user

Unlike the virtio PCI devices usually used for para-virtualized I/O in the
context of QEMU/VM, the basic idea here is to present a kind of virtual device
that can be attached and initialized by DPDK. The device emulation layer that
QEMU provides in the VM context is avoided by simply registering a new kind of
virtual device in DPDK's ether layer. To minimize changes, the existing virtio
PMD code (drivers/net/virtio/) is reused.

Virtio is, in essence, a shared-memory (shm) based solution for transmitting
and receiving packets. How is memory shared? In the VM case, QEMU always shares
the whole physical memory layout of the VM with the vhost backend. However, it
is not feasible for a container, as a process, to share all of its virtual
memory regions with the backend. So only specific virtual memory regions
(namely, the hugepages initialized in DPDK) are sent to the backend. As a
result, only addresses in these regions can be used to transmit or receive
packets.

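To see which regions of a running DPDK process are candidates for sharing, one
option is to look for hugepage-backed mappings in ``/proc/<pid>/maps``. The
following is a minimal sketch and not part of DPDK: the function name
``list_hugepage_mappings`` is hypothetical, and the default mount point
``/dev/hugepages`` is an assumption taken from the sample usage in this guide.

.. code-block:: shell

   # Hypothetical helper: print the address range and backing file of every
   # hugepage-backed mapping in a /proc/<pid>/maps style listing read from stdin.
   # $1: hugepage mount point (assumed to be /dev/hugepages if omitted).
   list_hugepage_mappings() {
       mnt="${1:-/dev/hugepages}"
       # Field 1 is the address range, field 6 is the backing file path.
       awk -v mnt="$mnt" 'index($6, mnt "/") == 1 { print $1, $6 }'
   }

It can be fed the maps file of the process of interest, for example
``list_hugepage_mappings < /proc/<pid>/maps``. Mappings that are not backed by
files under the hugepage mount point are filtered out, which mirrors the
restriction described above.
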
Sample Usage
------------

Here we use Docker as the container engine. The same steps also apply to LXC
and Rocket with some minor changes.

#. Write a Dockerfile like the one below.

    .. code-block:: console

        cat <<EOT >> Dockerfile
        FROM ubuntu:latest
        WORKDIR /usr/src/dpdk
        COPY . /usr/src/dpdk
        ENV PATH "$PATH:/usr/src/dpdk/<build_dir>/app/"
        EOT

#. Build a Docker image.

    .. code-block:: console

        docker build -t dpdk-app-testpmd .

#. Start testpmd on the host with a vhost-user port.

    .. code-block:: console

        $(testpmd) -l 0-1 -n 4 --socket-mem 1024,1024 \
            --vdev 'eth_vhost0,iface=/tmp/sock0' \
            --file-prefix=host --no-pci -- -i

#. Start a container instance with a virtio-user port.

    .. code-block:: console

        docker run -i -t -v /tmp/sock0:/var/run/usvhost \
            -v /dev/hugepages:/dev/hugepages \
            dpdk-app-testpmd testpmd -l 6-7 -n 4 -m 1024 --no-pci \
            --vdev=virtio_user0,path=/var/run/usvhost \
            --file-prefix=container \
            -- -i

Note: If the whole setup above is run on the host, it is shm-based IPC.

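When the steps above are scripted, the container should only be started once
the host-side socket exists. The following is a minimal sketch of such a wait
loop. The function name ``wait_for_socket`` and the default timeout of 10
seconds are assumptions for illustration, and the sketch assumes that the
vhost-user port on the host creates the socket given by ``iface``
(``/tmp/sock0`` in the example above).

.. code-block:: shell

   # Hypothetical helper: wait until a UNIX socket exists at the given path.
   # $1: socket path, $2: timeout in seconds (assumed default: 10).
   # Returns 0 once the socket exists, 1 if the timeout expires first.
   wait_for_socket() {
       path="$1"
       timeout="${2:-10}"
       elapsed=0
       while [ ! -S "$path" ]; do
           if [ "$elapsed" -ge "$timeout" ]; then
               echo "timed out waiting for $path" >&2
               return 1
           fi
           sleep 1
           elapsed=$((elapsed + 1))
       done
       return 0
   }

A script could call ``wait_for_socket /tmp/sock0 30`` between the host testpmd
step and the ``docker run`` step, and abort if the call fails.
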
Limitations
-----------

This solution has the following limitations:

* It cannot work with the ``--huge-unlink`` option, because the hugepage file
  must be reopened to share it with the vhost backend.
* It cannot work with the ``--no-huge`` option. Currently, DPDK uses anonymous
  mapping under this option, which cannot be reopened to share with the vhost
  backend.
* It cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
  hugepages. If you have more regions (especially when 2MB hugepages are used),
  the ``--single-file-segments`` option can help to reduce the number of shared
  files.
* Applications should not use file names like HUGEFILE_FMT ("%smap_%d"), as
  this causes confusion when hugepage files are shared with the backend by name.
* Root privilege is required. DPDK resolves the physical addresses of
  hugepages, which seems unnecessary here, and discussions are ongoing to
  remove this restriction.
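
Some of these limitations can be checked before launching an application. The
following is a minimal sketch and not part of DPDK: the function names
``check_eal_args`` and ``count_hugepage_files`` are hypothetical, only the
exact option spellings named above are matched, and the file name pattern is
assumed from the HUGEFILE_FMT format ("%smap_%d") quoted above.

.. code-block:: shell

   # Hypothetical helper: reject EAL options that the limitations rule out.
   # Returns 1 if --huge-unlink or --no-huge is among the arguments, else 0.
   check_eal_args() {
       for arg in "$@"; do
           case "$arg" in
               --huge-unlink|--no-huge)
                   echo "unsupported with virtio-user: $arg" >&2
                   return 1
                   ;;
           esac
       done
       return 0
   }

   # Hypothetical helper: count hugepage files named "<prefix>map_<n>".
   # $1: hugepage directory, $2: file prefix (the --file-prefix value).
   count_hugepage_files() {
       count=0
       for f in "$1"/"$2"map_*; do
           # An unmatched glob stays literal, so skip names that do not exist.
           if [ -e "$f" ]; then
               count=$((count + 1))
           fi
       done
       echo "$count"
   }

For the container in the sample usage, ``count_hugepage_files /dev/hugepages
container`` reports how many hugepage files carry that prefix. A result above
8 suggests trying ``--single-file-segments``.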