xref: /spdk/doc/nvmf.md (revision 2f2acf4eb25cee406c156120cee22721275ca7fd)
# NVMe over Fabrics Target {#nvmf}

@sa @ref nvme_fabrics_host
@sa @ref tracepoints

## NVMe-oF Target Getting Started Guide {#nvmf_getting_started}

The SPDK NVMe over Fabrics target is a user space application that presents block devices over a fabric
such as Ethernet, InfiniBand or Fibre Channel. SPDK currently supports RDMA and TCP transports.

The NVMe over Fabrics specification defines subsystems that can be exported over different transports.
SPDK has chosen to call the software that exports these subsystems a "target", which is the term used
for iSCSI. The specification refers to the "client" that connects to the target as a "host". Many
people will also refer to the host as an "initiator", which is the equivalent thing in iSCSI
parlance. SPDK will try to stick to the terms "target" and "host" to match the specification.

The Linux kernel also implements an NVMe-oF target and host, and SPDK is tested for
interoperability with the Linux kernel implementations.

To stop the application with a signal, use SIGTERM so that it can release all of its shared memory
resources before exiting. SIGKILL gives the application no chance to release them, and you may then
need to release the resources manually.

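As a minimal sketch of the shutdown advice above, the helper below sends SIGTERM and then waits for the
process to exit instead of escalating to SIGKILL. The function name `stop_gracefully` is hypothetical
and is not part of SPDK.

```sh
# Hypothetical helper (not part of SPDK): ask a process to exit with SIGTERM
# and wait until it is gone, so that it can release its shared memory itself.
stop_gracefully() {
    pid=$1
    kill -TERM "$pid" 2>/dev/null || return 1   # no such process
    # Reap the process if it is a child of this shell; its exit status is ignored.
    wait "$pid" 2>/dev/null || true
    # Otherwise poll until the PID disappears. SIGKILL is never sent.
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
    done
}
```

For example, after starting the target in the background with `build/bin/nvmf_tgt &`, calling
`stop_gracefully $!` stops it again.
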
## RDMA transport support {#nvmf_rdma_transport}

The RDMA transport requires an RDMA-capable NIC with its corresponding OFED (OpenFabrics Enterprise
Distribution) software package installed. Your OS distribution may provide packages, but OFED is also
available [here](https://downloads.openfabrics.org/OFED/).

### Prerequisites {#nvmf_prereqs}

To build nvmf_tgt with the RDMA transport, there are some additional dependencies,
which can be installed using the pkgdep.sh script.

~~~{.sh}
sudo scripts/pkgdep.sh --rdma
~~~

Then build SPDK with RDMA enabled:

~~~{.sh}
./configure --with-rdma <other config parameters>
make
~~~

Once built, the binary will be in `build/bin`.

### Prerequisites for InfiniBand/RDMA Verbs {#nvmf_prereqs_verbs}

Before starting our NVMe-oF target with the RDMA transport we must load the InfiniBand and RDMA modules
that allow userspace processes to use InfiniBand/RDMA verbs directly.

~~~{.sh}
modprobe ib_cm
modprobe ib_core
# Please note that ib_ucm does not exist in newer versions of the kernel and is not required.
modprobe ib_ucm || true
modprobe ib_umad
modprobe ib_uverbs
modprobe iw_cm
modprobe rdma_cm
modprobe rdma_ucm
~~~

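To confirm that the modules above are loaded, their names can be compared against the output of `lsmod`.
The following is a minimal sketch; the function name `missing_modules` is hypothetical. Modules that
are compiled into the kernel do not appear in `lsmod`, so a reported name is a hint rather than proof
that the functionality is absent.

```sh
# Hypothetical helper: read `lsmod`-style output on stdin and print every
# module named in the arguments that does not appear in it.
missing_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')   # first column, header line skipped
    for mod in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
    done
}

# Usage:
#   lsmod | missing_modules ib_cm ib_core ib_umad ib_uverbs iw_cm rdma_cm rdma_ucm
```
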
### Prerequisites for RDMA NICs {#nvmf_prereqs_rdma_nics}

Before starting our NVMe-oF target we must detect RDMA NICs and assign them IP addresses.

#### Finding RDMA NICs and associated network interfaces

~~~{.sh}
ls /sys/class/infiniband/*/device/net
~~~

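The `ls` command above prints one directory listing per device. As a sketch of how the same sysfs layout
can be turned into explicit device-to-interface pairs, the hypothetical helper below prints one
`<rdma device> <network interface>` line per pair. The optional argument that overrides the sysfs root
exists only to make the function easy to try against a test directory.

```sh
# Hypothetical helper: print "<rdma device> <network interface>" pairs.
# The optional argument overrides the sysfs root (useful for testing).
list_rdma_netdevs() {
    root=${1:-/sys/class/infiniband}
    for dev in "$root"/*; do
        for nic in "$dev"/device/net/*; do
            [ -e "$nic" ] || continue        # the glob did not match anything
            printf '%s %s\n' "${dev##*/}" "${nic##*/}"
        done
    done
}
```
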
#### Mellanox ConnectX-3 RDMA NICs

~~~{.sh}
modprobe mlx4_core
modprobe mlx4_ib
modprobe mlx4_en
~~~

#### Mellanox ConnectX-4 RDMA NICs

~~~{.sh}
modprobe mlx5_core
modprobe mlx5_ib
~~~

#### Assigning IP addresses to RDMA NICs

~~~{.sh}
ifconfig eth1 192.168.100.8 netmask 255.255.255.0 up
ifconfig eth2 192.168.100.9 netmask 255.255.255.0 up
~~~

### RDMA Limitations {#nvmf_rdma_limitations}

RDMA NICs limit the number of memory regions that can be registered, so the SPDK NVMe-oF
target application may eventually start failing to allocate more DMA-able memory. This is
an imperfection of the DPDK dynamic memory management and is most likely to occur when too
many 2MB hugepages are reserved at runtime. For example, some NICs report a maximum of
2048 memory regions, which with 2MB hugepages limits the total registered memory to 4GB.
The limit can be overcome by using 1GB hugepages or by pre-reserving memory at application
startup with the `--mem-size` or `-s` option. All pre-reserved memory will be registered as a
single region, but won't be returned to the system until the SPDK application is terminated.

Another known issue occurs when using E810 NICs in RoCE mode. Specifically, the NVMe-oF target
sometimes cannot destroy a qpair because its posted work requests don't get flushed. This can
leave the NVMe-oF target application unable to terminate cleanly.

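The 4GB figure in the memory region discussion above is the number of memory regions multiplied by the
hugepage size. A minimal sketch of that arithmetic follows; the function name is hypothetical, 2048 is
the example region count from the text, and the calculation assumes that every hugepage is registered
as its own region.

```sh
# Hypothetical helper: upper bound on registered memory, in MB, assuming
# every hugepage is registered as its own memory region.
max_registered_mb() {
    regions=$1       # maximum number of memory regions reported by the NIC
    hugepage_mb=$2   # hugepage size in MB
    echo $(( regions * hugepage_mb ))
}

max_registered_mb 2048 2      # 2MB hugepages: prints 4096 (4GB)
max_registered_mb 2048 1024   # 1GB hugepages: prints 2097152 (2TB)
```
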
## TCP transport support {#nvmf_tcp_transport}

The TCP transport is built into nvmf_tgt by default and does not need any special libraries.

## FC transport support {#nvmf_fc_transport}

Building nvmf_tgt with the FC transport requires additional FC LLD (Low Level Driver) code.
Please contact your FC vendor for instructions on obtaining the FC driver module.

### Broadcom FC LLD code

The FC LLD driver for Broadcom FC NVMe capable adapters can be obtained from
https://github.com/ecdufcdrvr/bcmufctdrvr.

### Fetch FC LLD module and then build SPDK with FC enabled

After cloning the SPDK repository and initializing its submodules, build the FC LLD library, which
can then be linked with the FC transport.

~~~{.sh}
git clone https://github.com/spdk/spdk --recursive
git clone https://github.com/ecdufcdrvr/bcmufctdrvr fc
cd fc
make DPDK_DIR=../spdk/dpdk/build SPDK_DIR=../spdk
cd ../spdk
./configure --with-fc=../fc/build
make
~~~

## Configuring the SPDK NVMe over Fabrics Target {#nvmf_config}

An NVMe over Fabrics target can be configured using JSON RPCs.
The basic RPCs needed to configure the NVMe-oF subsystem are detailed below. More information about
working with NVMe over Fabrics specific RPCs can be found on the @ref jsonrpc_components_nvmf_tgt RPC page.

### Using RPCs {#nvmf_config_rpc}

Start the nvmf_tgt application with elevated privileges. Once the target is started,
the nvmf_create_transport RPC can be used to initialize a given transport. Below is an
example where the target is started and configured with two different transports.
The RDMA transport is configured with an I/O unit size of 8192 bytes, a max I/O size of 131072 bytes
and an in-capsule data size of 8192 bytes. The TCP transport is configured with an I/O unit size of
16384 bytes, a maximum of 8 qpairs per controller, and an in-capsule data size of 8192 bytes.

~~~{.sh}
build/bin/nvmf_tgt
scripts/rpc.py nvmf_create_transport -t RDMA -u 8192 -i 131072 -c 8192
scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192
~~~

Below is an example of creating a malloc bdev and assigning it to a subsystem. Adjust the bdevs,
NQN, serial number, and IP address to your own circumstances. The example adds a listener with the
RDMA transport; if you replace "rdma" with "TCP", the listener is added with the TCP transport instead.

~~~{.sh}
scripts/rpc.py bdev_malloc_create -b Malloc0 512 512
scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK00000000000001 -d SPDK_Controller1
scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 Malloc0
scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t rdma -a 192.168.100.8 -s 4420
~~~

### NQN Formal Definition

NVMe qualified names or NQNs are defined in section 7.9 of the
[NVMe specification](http://nvmexpress.org/wp-content/uploads/NVM_Express_Revision_1.3.pdf). SPDK has attempted to
formalize that definition using [Extended Backus-Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).
SPDK modules use this formal definition (provided below) when validating NQNs.

~~~{.sh}

Basic Types
year = 4 * digit ;
month = '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' ;
digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
hex digit = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '0' |
'1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;

NQN Definition
NVMe Qualified Name = ( NVMe-oF Discovery NQN | NVMe UUID NQN | NVMe Domain NQN ), '\0' ;
NVMe-oF Discovery NQN = "nqn.2014-08.org.nvmexpress.discovery" ;
NVMe UUID NQN = "nqn.2014-08.org.nvmexpress:uuid:", string UUID ;
string UUID = 8 * hex digit, '-', 3 * (4 * hex digit, '-'), 12 * hex digit ;
NVMe Domain NQN = "nqn.", year, '-', month, '.', reverse domain, ':', utf-8 string ;

~~~

Please note that the following types from the definition above are defined elsewhere:

1. utf-8 string: Defined in [RFC 3629](https://tools.ietf.org/html/rfc3629).
2. reverse domain: Equivalent to domain name as defined in [RFC 1034](https://tools.ietf.org/html/rfc1034).

While not stated in the formal definition, SPDK enforces the requirement from the spec that the
"maximum name is 223 bytes in length". SPDK does not include the null terminating character when
defining the length of an NQN, and will accept an NQN containing up to 223 valid bytes with an
additional null terminator. To be precise, SPDK follows the same conventions as the C standard
library function [strlen()](http://man7.org/linux/man-pages/man3/strlen.3.html).

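The grammar and the length rule above can be approximated with a few regular expressions. The following
is a rough sketch only and is not SPDK's own validator: the function name `is_valid_nqn` is hypothetical,
and the reverse domain (RFC 1034) and UTF-8 (RFC 3629) parts are not checked beyond being non-empty.

```sh
# Hypothetical helper: rough NQN check based on the grammar and the length rule.
# It does not validate the reverse domain (RFC 1034) or UTF-8 (RFC 3629) parts.
is_valid_nqn() {
    nqn=$1
    # At most 223 bytes, not counting the null terminator (strlen() convention).
    len=$(printf '%s' "$nqn" | wc -c)
    [ "$((len))" -le 223 ] || return 1
    hex='[0-9A-Fa-f]'
    uuid="$hex{8}-($hex{4}-){3}$hex{12}"
    # One pattern per alternative: discovery NQN, UUID NQN, domain NQN.
    printf '%s\n' "$nqn" | grep -Eq \
        -e '^nqn\.2014-08\.org\.nvmexpress\.discovery$' \
        -e "^nqn\.2014-08\.org\.nvmexpress:uuid:$uuid\$" \
        -e '^nqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[^:]+:.+$'
}
```

For example, `is_valid_nqn nqn.2016-06.io.spdk:cnode1` succeeds, while `is_valid_nqn nqn.2016-13.io.spdk:cnode1`
fails because 13 is not a valid month.
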
#### NQN Comparisons

SPDK compares NQNs byte for byte without case matching or Unicode normalization. This has specific implications for
UUID-based NQNs. The following pair of NQNs, for example, would not match when compared in the SPDK NVMe-oF Target:

* `nqn.2014-08.org.nvmexpress:uuid:11111111-aaaa-bbdd-ffee-123456789abc`
* `nqn.2014-08.org.nvmexpress:uuid:11111111-AAAA-BBDD-FFEE-123456789ABC`

In order to ensure the consistency of UUID-based NQNs while using SPDK, users should use lowercase when representing
alphabetic hex digits in their NQNs.

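One way to follow the lowercase recommendation is to normalize UUID-based NQNs before passing them to any
RPC. A minimal sketch, with the hypothetical function name `normalize_uuid_nqn`:

```sh
# Hypothetical helper: lowercase the hex digits of a UUID-based NQN.
# Other NQNs are printed unchanged, since their case may be significant.
normalize_uuid_nqn() {
    case $1 in
        nqn.2014-08.org.nvmexpress:uuid:*)
            printf '%s\n' "$1" | tr 'A-F' 'a-f' ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}
```

Applied to the uppercase NQN from the pair above, the function prints the lowercase one, so both sides
of a comparison see the same bytes.
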
### Assigning CPU Cores to the NVMe over Fabrics Target {#nvmf_config_lcore}

SPDK uses the [DPDK Environment Abstraction Layer](http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html)
to gain access to hardware resources such as huge memory pages and CPU core(s). DPDK EAL provides
functions to assign threads to specific cores.
To ensure the SPDK NVMe-oF target has the best performance, configure the NICs and NVMe devices to
be located on the same NUMA node.

The `-m` core mask option specifies a bit mask of the CPU cores that
SPDK is allowed to execute work items on.
For example, to allow SPDK to use cores 24, 25, 26 and 27:

~~~{.sh}
build/bin/nvmf_tgt -m 0xF000000
~~~

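The mask in the example above has one bit set per core, with bit N standing for core N. As a sketch of how such a
mask can be derived from a list of core numbers (the function name `cores_to_mask` is hypothetical):

```sh
# Hypothetical helper: build a hexadecimal core mask from a list of core numbers.
cores_to_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))   # set the bit for this core
    done
    printf '0x%X\n' "$mask"
}

cores_to_mask 24 25 26 27   # prints 0xF000000
```

The result can be passed directly to the target, e.g. `build/bin/nvmf_tgt -m "$(cores_to_mask 24 25 26 27)"`.
Shell arithmetic is limited to the width of the shell's integer type, so this sketch is only suitable for
low core numbers (below 63 on typical 64-bit shells).
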
## Configuring the Linux NVMe over Fabrics Host {#nvmf_host}

Both the Linux kernel and SPDK implement an NVMe over Fabrics host.
The Linux kernel NVMe-oF host support is provided by the `nvme-rdma` driver
(for the RDMA transport) and the `nvme-tcp` driver (for the TCP transport). The
following commands load the two drivers.

~~~{.sh}
modprobe nvme-rdma
modprobe nvme-tcp
~~~

The nvme-cli tool may be used to interface with the Linux kernel NVMe over Fabrics host.
See below for examples of the discover, connect and disconnect commands. In the discover and
connect commands, the transport can be changed to TCP by replacing 'rdma' with 'tcp'.

Discovery:

~~~{.sh}
nvme discover -t rdma -a 192.168.100.8 -s 4420
~~~

Connect:

~~~{.sh}
nvme connect -t rdma -n "nqn.2016-06.io.spdk:cnode1" -a 192.168.100.8 -s 4420
~~~

Disconnect:

~~~{.sh}
nvme disconnect -n "nqn.2016-06.io.spdk:cnode1"
~~~

## Enabling NVMe-oF target tracepoints for offline analysis and debug {#nvmf_trace}

SPDK has a tracing framework for capturing low-level event information at runtime.
@ref tracepoints enable analysis of both performance and application crashes.

## Enabling NVMe-oF Multipath

The SPDK NVMe-oF target and initiator support multiple independent paths to the same NVMe-oF subsystem.
For step-by-step instructions for configuring and switching between paths, see @ref nvmf_multipath_howto .

## Enabling NVMe-oF TLS

The SPDK NVMe-oF target and initiator support establishing a secure TCP connection using the Transport
Layer Security (TLS) protocol in compliance with the NVMe TCP transport specification. Only version 1.3
of the TLS protocol is supported. This feature is considered experimental.

Currently, it is only possible to establish a fabric secure channel using TLS. The channel is
protected by a symmetric pre-shared key (PSK) using either the `TLS_AES_256_GCM_SHA384` (recommended) or
the `TLS_AES_128_GCM_SHA256` cipher suite. The cipher suite is selected based on the hash function
associated with a key. During configuration, the keys are expected to be in the PSK interchange
format (see NVMe TCP transport specification 1.0c, section 3.6.1.5).

The target supports assigning different keys for each host connecting to a given subsystem. It is
also possible for a single host to use different keys for different subsystems. The keys are
expected to be placed in separate files (with permissions configured only to allow read/write
access to the owner) and can be configured using the `--psk` option in the `nvmf_subsystem_add_host`
RPC. Additionally, to allow establishing TLS connections on a given listener, it must be created
with the `--secure-channel` option enabled. Note that this option is mutually
exclusive with the `--allow-any-host` subsystem option, and trying to add a listener to such a subsystem
will result in an error.

On the initiator side, the key can be specified using the `--psk` option in the
`bdev_nvme_attach_controller` RPC.

Recommendations on the pre-shared keys:

* It is strongly recommended to change the keys at least once a year.
* Use a strong cryptographic random number generator that provides sufficient entropy
  to generate the keys (e.g. HSM).
* Use a single key to secure transmission between two systems only.
* Delete files containing PSKs as soon as they are not needed.

Additionally, it is recommended to follow
[RFC 9257 'Guidance for External Pre-Shared Key (PSK) Usage in TLS'](https://www.rfc-editor.org/rfc/rfc9257.html).

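The requirement that key files allow read/write access to the owner only can be met when the file is
created. A minimal sketch follows; the function name `write_psk_file` is hypothetical, and the key is
assumed to be in the PSK interchange format already.

```sh
# Hypothetical helper: write a PSK (already in interchange format) to a file
# that only the owner can read and write.
write_psk_file() {
    psk=$1
    path=$2
    # Create (or truncate) the file under a restrictive umask. The subshell
    # keeps the umask change from leaking into the caller.
    ( umask 077 && : > "$path" ) || return 1
    chmod 600 "$path" || return 1    # tighten a file that already existed
    printf '%s\n' "$psk" > "$path"   # the secret is written only after that
}
```

For example, `write_psk_file "$psk" key.txt` produces a file that `ls -l` lists with mode `-rw-------`
and that can be passed to the `--psk` option. Once the key is no longer needed, remove the file in line
with the recommendations above.
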
### Target setup

~~~{.sh}
cat key.txt
NVMeTLSkey-1:01:MDAxMTIyMzM0NDU1NjY3Nzg4OTlhYWJiY2NkZGVlZmZwJEiQ:

build/bin/nvmf_tgt &
scripts/rpc.py nvmf_create_transport -t TCP
scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -s SPDK00000000000001 -m 10
scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t tcp -a 127.0.0.1 -s 4420 \
               --secure-channel
scripts/rpc.py nvmf_subsystem_add_host nqn.2016-06.io.spdk:cnode1 nqn.2016-06.io.spdk:host1 \
               --psk key.txt
~~~

### Initiator setup

The bdevperf application can serve as an example SPDK initiator, because it depends on SPDK's
NVMe TCP driver.

~~~{.sh}
cat key.txt
NVMeTLSkey-1:01:MDAxMTIyMzM0NDU1NjY3Nzg4OTlhYWJiY2NkZGVlZmZwJEiQ:

build/examples/bdevperf -m 0x2 -z -r /var/tmp/bdevperf.sock -q 128 -o 4096 -w verify -t 10 &
scripts/rpc.py -s /var/tmp/bdevperf.sock bdev_nvme_attach_controller -b TLSTEST -t tcp -a 127.0.0.1 \
               -s 4420 -f ipv4 -n nqn.2016-06.io.spdk:cnode1 -q nqn.2016-06.io.spdk:host1 \
               --psk key.txt
~~~

The first of the two commands launches bdevperf; the second attempts to construct an NVMe bdev
and establish a TLS connection. The same PSK must be used on both the target and the
initiator side.

## NVMe-oF in-band authentication

The NVMe-oF driver and NVMe-oF target both support in-band authentication using the DH-HMAC-CHAP
protocol.  It allows the target to authenticate the host and the host to authenticate the target
(the latter part is optional).

The authentication will be performed if a subsystem is configured to allow a host with a set of
DH-HMAC-CHAP keys.  Each host is allowed to use different keys to connect to different subsystems
and each subsystem might use different keys for different hosts.  For instance, the following
configures three hosts, two of which can request bidirectional authentication:

```{.sh}
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key0 --dhchap-ctrlr-key ctrlr-key0
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host1 \
    --dhchap-key key1 --dhchap-ctrlr-key ctrlr-key1
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host2 \
    --dhchap-key key2
```

Additionally, it's possible to change the keys while preserving existing connections to a subsystem
via `nvmf_subsystem_set_keys`.  After that's done, new connections and reauthentication requests
will be required to use the new keys.

```{.sh}
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key0 --dhchap-ctrlr-key ctrlr-key0
# Host nqn.2024-05.io.spdk:host0 connects to subsystem nqn.2024-05.io.spdk:cnode0
$ scripts/rpc.py nvmf_subsystem_set_keys nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key1 --dhchap-ctrlr-key ctrlr-key1
```

On the host side, the keys are specified when attaching controllers, e.g.:

```{.sh}
$ scripts/rpc.py bdev_nvme_attach_controller -b nvme0 -t tcp -f ipv4 -a 127.0.0.1 -s 4420 \
    -n nqn.2024-05.io.spdk:cnode0 -q nqn.2024-05.io.spdk:host0 --dhchap-key key0 \
    --dhchap-ctrlr-key ctrlr-key0
```

All hash functions/Diffie-Hellman groups defined in the NVMe Base Specification 2.0d are supported
and the algorithms used for a given DH-HMAC-CHAP transaction are negotiated at the beginning.  The
SPDK NVMe-oF target selects the strongest available hash/group depending on its configuration and
the capabilities of a peer.  Users can limit the allowed hash functions and/or Diffie-Hellman groups
via RPCs.  For example, the following limits the target (`nvmf_set_config`) and the driver
(`bdev_nvme_set_options`) to use sha384, sha512 and ffdhe6144, ffdhe8192:

```{.sh}
$ scripts/rpc.py nvmf_set_config --dhchap-digests sha384,sha512 \
    --dhchap-dhgroups ffdhe6144,ffdhe8192
$ scripts/rpc.py bdev_nvme_set_options --dhchap-digests sha384,sha512 \
    --dhchap-dhgroups ffdhe6144,ffdhe8192
```

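Digest and group names are passed to the RPCs as plain comma-separated strings, so it can help to check
them before use. The sketch below assumes the accepted names are the three SHA-2 digests and the FFDHE
groups (plus `null`) of the NVMe Base Specification; the function names are hypothetical, and the exact
spellings accepted should be confirmed against your SPDK version. For example,
`check_dhgroups ffdhe6144,ffdhe8192` succeeds, while a misspelled group name makes the function return
non-zero.

```sh
# Hypothetical helper: check a comma-separated list against allowed names.
# usage: check_names <comma-separated list> <allowed name>...
check_names() {
    list=$1
    shift
    for name in $(printf '%s' "$list" | tr ',' ' '); do
        found=no
        for allowed in "$@"; do
            if [ "$name" = "$allowed" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then
            echo "unknown name: $name" >&2
            return 1
        fi
    done
}

# Assumed name lists; verify them against the SPDK version in use.
check_digests() { check_names "$1" sha256 sha384 sha512; }
check_dhgroups() { check_names "$1" null ffdhe2048 ffdhe3072 ffdhe4096 ffdhe6144 ffdhe8192; }
```
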
The NVMe specification describes the method for using in-band authentication in conjunction with
establishing a secure channel (e.g. TLS).  However, that isn't supported currently, so in order to
perform in-band authentication, hosts must connect over regular listeners (i.e. those that weren't
created with the `--secure-channel` option).