# SPDK Docker suite

This suite is meant to serve as an example of how SPDK can be encapsulated
into docker container images. The example containers consist of an SPDK NVMe-oF
target sharing devices with another SPDK NVMe-oF application, which serves
as both initiator and target. Finally, a traffic generator based on FIO
issues I/O to the connected devices.
Please note that some simplifications have been made to the configuration files
for the purpose of the example. Please do not use the files directly in
a production environment.

## Prerequisites

docker: We recommend version 20.10 or newer because it supports cgroups v2 for
customization of host resources like CPUs, memory, and block I/O.

docker-compose: We recommend version 1.29.2 or newer.
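
The recommended minimum versions above can be checked with a small helper. This is only a sketch: the function names `version_ge` and `check_min_version` are made up for illustration, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```sh
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min_version NAME FOUND REQUIRED: print a one-line verdict and
# return non-zero when FOUND is older than REQUIRED.
check_min_version() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
        return 1
    fi
}

# Usage (requires docker and docker-compose to be installed):
#   check_min_version docker "$(docker version --format '{{.Server.Version}}')" 20.10
#   check_min_version docker-compose "$(docker-compose version --short)" 1.29.2
```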

kernel: Hugepages must be allocated prior to running the containers, and a hugetlbfs
mount must be available under /dev/hugepages. Also, tmpfs should be mounted
under /dev/shm. Depending on the use case, some kernel modules should also be
loaded into the kernel prior to running the containers.
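
A quick way to verify the hugepage and mount requirements before starting the containers is sketched below. The helper names `hugepages_total` and `is_mounted` are hypothetical, and the optional file arguments exist only so the parsing can be tried against sample data. How many hugepages are needed is not stated here, so the sketch only reports what is allocated.

```sh
# hugepages_total [MEMINFO]: print the number of allocated hugepages.
# MEMINFO defaults to /proc/meminfo.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# is_mounted FSTYPE MOUNTPOINT [MOUNTS]: succeed when a filesystem of type
# FSTYPE is mounted at MOUNTPOINT. MOUNTS defaults to /proc/mounts, whose
# columns are: device, mount point, filesystem type, options, ...
is_mounted() {
    awk -v t="$1" -v m="$2" \
        '$3 == t && $2 == m { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}"
}

# Usage on the docker host:
#   echo "hugepages allocated: $(hugepages_total)"
#   is_mounted hugetlbfs /dev/hugepages || echo "hugetlbfs is not mounted under /dev/hugepages"
#   is_mounted tmpfs /dev/shm || echo "tmpfs is not mounted under /dev/shm"
```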

proxy: If you are working behind a firewall, make sure dockerd is aware of the
proxy. Please refer to:
[docker-proxy](https://docs.docker.com/config/daemon/systemd/#httphttps-proxy)

To pass `$http_proxy` to docker-compose build use:
~~~{.sh}
docker-compose build --build-arg PROXY=$http_proxy
~~~

## How-To

`docker-compose.yaml` shows an example deployment of the storage containers based on SPDK.
Running `docker-compose build` creates 5 docker images:

- build_base
- storage-target
- proxy-container
- traffic-generator-nvme
- traffic-generator-virtio

The `build_base` image provides the core components required to containerize SPDK
applications. It starts from the fedora:35 image from the Fedora Container Registry and
installs SPDK from the provided `build_base/spdk.tar.gz` archive.
See the `build_base` folder for details on what's included in the final image.

Running `docker-compose up` creates 4 docker containers:

- storage-target: Contains an SPDK NVMe-oF target exposing a single subsystem, based on a malloc bdev, to `proxy-container`.
- proxy-container: Connects to `storage-target` and then exposes the same devices to `traffic-generator-nvme` using NVMe-oF and to `traffic-generator-virtio` using Virtio.
- traffic-generator-nvme: Contains FIO using the SPDK plugin to connect to `proxy-container` and run a sample workload.
- traffic-generator-virtio: Contains FIO using the SPDK plugin to connect to `proxy-container` and run a sample workload.

Each container is connected to a separate "spdk" network which is created before
deploying the containers. See `docker-compose.yaml` for the network's detailed setup and IP assignment.

All the above boils down to:

~~~{.sh}
cd docker
tar -czf build_base/spdk.tar.gz --exclude='docker/*' -C .. .
docker-compose build
docker-compose up
~~~

The `storage-target` and `proxy-container` can be started as services,
allowing multiple traffic generator containers to connect.

~~~{.sh}
docker-compose up -d proxy-container
docker-compose run traffic-generator-nvme
docker-compose run traffic-generator-virtio
~~~

Environment variables can be passed to the containers as shown in the
[docs](https://docs.docker.com/compose/environment-variables/).
For example, extra arguments to fio can be passed like so:

~~~{.sh}
docker-compose run -e FIO_ARGS="--minimal" traffic-generator-nvme
~~~

As each container includes an SPDK installation, it is possible to use rpc.py to
examine the final setup. E.g.:

~~~{.sh}
docker-compose exec storage-target rpc.py bdev_get_bdevs
docker-compose exec proxy-container rpc.py nvmf_get_subsystems
~~~
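
When the RPC output is to be piped or saved, a thin wrapper can help. The sketch below is not part of the suite: `spdk_rpc` and the `COMPOSE` variable are hypothetical names, and it assumes the services are already running. The `-T` flag of `docker-compose exec` disables pseudo-TTY allocation, which keeps the JSON output clean when redirected.

```sh
# Compose command to use; override with e.g. COMPOSE="docker compose".
COMPOSE="${COMPOSE:-docker-compose}"

# spdk_rpc SERVICE METHOD [ARGS...]: run rpc.py inside a running service
# container and print whatever rpc.py prints.
spdk_rpc() {
    service="$1"
    shift
    # $COMPOSE is intentionally unquoted so a two-word command still works.
    $COMPOSE exec -T "$service" rpc.py "$@"
}

# Usage:
#   spdk_rpc storage-target bdev_get_bdevs > bdevs.json
#   spdk_rpc proxy-container nvmf_get_subsystems
```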

## Monitoring

`docker-compose.monitoring.yaml` shows an example deployment of an SPDK storage container together with a monitoring stack.

Running `docker-compose -f docker-compose.monitoring.yaml up` creates 3 docker containers:

- storage-target: Contains an SPDK NVMe-oF target exposing a single subsystem based on a malloc bdev.
- [telegraf](https://www.influxdata.com/time-series-platform/telegraf/): An agent with a very small memory footprint for collecting and sending metrics and events.
- [prometheus](https://prometheus.io/): A leading open-source monitoring solution.

`telegraf` connects to `spdk` via `rpc_http_proxy.py` and uses `bdev_get_iostat` commands to fetch bdev statistics.

To see the data change, once all 3 containers are up, use `docker-compose run traffic-generator-nvme` to generate some traffic.

Open the Prometheus UI or query it from the command line. E.g.:

~~~{.sh}
curl --fail 'http://127.0.0.1:9090/api/v1/query?query=spdk_bytes_read'
curl --fail 'http://127.0.0.1:9090/api/v1/query?query=spdk_bytes_written'
~~~
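
For queries that contain PromQL operators or brackets, it is easier to let curl do the URL encoding. The following is a minimal sketch: `prom_query` and `PROM_URL` are names invented here, and the `rate(...)` example assumes the metric behaves as a monotonically increasing counter, which has not been verified against the telegraf configuration.

```sh
# Base URL of the Prometheus instance; the default matches the address
# used in the curl examples above.
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# prom_query EXPR: run an instant query and print the raw JSON response.
# --get turns the --data-urlencode payload into a URL query string, so
# characters such as ( ) [ ] need no manual escaping.
prom_query() {
    curl --fail --silent --get \
        --data-urlencode "query=$1" \
        "$PROM_URL/api/v1/query"
}

# Usage:
#   prom_query spdk_bytes_read
#   prom_query 'rate(spdk_bytes_written[1m])'
```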

## Caveats

- If you run docker < 20.10 under a distro which has switched fully to cgroups v2
  (e.g. Fedora 33), make sure that /sys/fs/cgroup/systemd exists, otherwise docker/build
  will simply fail.
- Each SPDK app inside the containers is limited to a single, separate CPU.