# NVMe over Fabrics Target {#nvmf}

@sa @ref nvme_fabrics_host
@sa @ref tracepoints

## NVMe-oF Target Getting Started Guide {#nvmf_getting_started}

The SPDK NVMe over Fabrics target is a user space application that presents block devices over
fabrics such as Ethernet, InfiniBand or Fibre Channel. SPDK currently supports RDMA and TCP transports.

The NVMe over Fabrics specification defines subsystems that can be exported over different transports.
SPDK has chosen to call the software that exports these subsystems a "target", which is the term used
for iSCSI. The specification refers to the "client" that connects to the target as a "host". Many
people will also refer to the host as an "initiator", which is the equivalent thing in iSCSI
parlance. SPDK will try to stick to the terms "target" and "host" to match the specification.

The Linux kernel also implements an NVMe-oF target and host, and SPDK is tested for
interoperability with the Linux kernel implementations.

If you want to stop the application with a signal, use SIGTERM so that the application can release
all of its shared memory resources before exiting. SIGKILL gives the application no chance to
release them, and you may need to release the resources manually.
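
The advice on signals can be sketched as a small shell helper. This is only an illustration under
stated assumptions: the function name `stop_gracefully` and the default timeout are hypothetical and
not part of SPDK. It sends SIGTERM, waits for the process to exit, and deliberately never escalates
to SIGKILL.

```sh
# Hypothetical helper (not part of SPDK): send SIGTERM to a PID and wait for it to exit.
# Returns 0 once the process is gone, 1 if it is still running after the timeout.
stop_gracefully() {
    pid=$1
    timeout=${2:-10}    # seconds to wait; illustrative default

    # If the signal cannot be delivered, the process has already exited.
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$timeout" -gt 0 ]; do
        # "kill -0" sends no signal; it only checks that the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1    # still running; investigate rather than reaching for SIGKILL
}
```

For example, `stop_gracefully "$(pidof nvmf_tgt)" 30` would give the target up to 30 seconds to
release its resources.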

## RDMA transport support {#nvmf_rdma_transport}

The RDMA transport requires an RDMA-capable NIC with its corresponding OFED (OpenFabrics Enterprise
Distribution) software package installed. Your OS distribution may provide packages, but OFED is also
available [here](https://downloads.openfabrics.org/OFED/).

### Prerequisites {#nvmf_prereqs}

To build nvmf_tgt with the RDMA transport, there are some additional dependencies,
which can be installed using the `pkgdep.sh` script.

~~~{.sh}
sudo scripts/pkgdep.sh --rdma
~~~

Then build SPDK with RDMA enabled:

~~~{.sh}
./configure --with-rdma <other config parameters>
make
~~~

Once built, the binary will be in `build/bin`.

### Prerequisites for InfiniBand/RDMA Verbs {#nvmf_prereqs_verbs}

Before starting our NVMe-oF target with the RDMA transport we must load the InfiniBand and RDMA modules
that allow userspace processes to use InfiniBand/RDMA verbs directly.

~~~{.sh}
modprobe ib_cm
modprobe ib_core
# Please note that ib_ucm does not exist in newer versions of the kernel and is not required.
modprobe ib_ucm || true
modprobe ib_umad
modprobe ib_uverbs
modprobe iw_cm
modprobe rdma_cm
modprobe rdma_ucm
~~~

### Prerequisites for RDMA NICs {#nvmf_prereqs_rdma_nics}

Before starting our NVMe-oF target we must detect RDMA NICs and assign them IP addresses.

### Finding RDMA NICs and associated network interfaces

~~~{.sh}
ls /sys/class/infiniband/*/device/net
~~~

#### Mellanox ConnectX-3 RDMA NICs

~~~{.sh}
modprobe mlx4_core
modprobe mlx4_ib
modprobe mlx4_en
~~~

#### Mellanox ConnectX-4 RDMA NICs

~~~{.sh}
modprobe mlx5_core
modprobe mlx5_ib
~~~

#### Assigning IP addresses to RDMA NICs

~~~{.sh}
ifconfig eth1 192.168.100.8 netmask 255.255.255.0 up
ifconfig eth2 192.168.100.9 netmask 255.255.255.0 up
~~~

### RDMA Limitations {#nvmf_rdma_limitations}

As RDMA NICs put a limitation on the number of memory regions registered, the SPDK NVMe-oF
target application may eventually start failing to allocate more DMA-able memory. This is
an imperfection of the DPDK dynamic memory management and is most likely to occur with too
many 2MB hugepages reserved at runtime. One type of memory bottleneck is the number of NIC memory
regions, e.g., some NICs report as many as 2048 for the maximum number of memory regions. This
gives us a 4GB memory limit with 2MB hugepages for the total memory regions. It can be overcome by
using 1GB hugepages or by pre-reserving memory at application startup with the `--mem-size` or `-s`
option. All pre-reserved memory will be registered as a single region, but won't be returned to the
system until the SPDK application is terminated.

Another known issue occurs when using the E810 NICs in RoCE mode. Specifically, the NVMe-oF target
sometimes cannot destroy a qpair, because its posted work requests don't get flushed. This can leave
the NVMe-oF target application unable to terminate cleanly.

## TCP transport support {#nvmf_tcp_transport}

The transport is built into the nvmf_tgt by default, and it does not need any special libraries.

## FC transport support {#nvmf_fc_transport}

To build nvmf_tgt with the FC transport, there is an additional FC LLD (Low Level Driver) code dependency.
Please contact your FC vendor for instructions on obtaining the FC driver module.

### Broadcom FC LLD code

The FC LLD driver for Broadcom FC NVMe-capable adapters can be obtained from
https://github.com/ecdufcdrvr/bcmufctdrvr.

### Fetch FC LLD module and then build SPDK with FC enabled

After cloning the SPDK repo and initializing its submodules, build the FC LLD library, which can then
be linked with the FC transport.

~~~{.sh}
git clone https://github.com/spdk/spdk --recursive
git clone https://github.com/ecdufcdrvr/bcmufctdrvr fc
cd fc
make DPDK_DIR=../spdk/dpdk/build SPDK_DIR=../spdk
cd ../spdk
./configure --with-fc=../fc/build
make
~~~

## Configuring the SPDK NVMe over Fabrics Target {#nvmf_config}

An NVMe over Fabrics target can be configured using JSON RPCs.
The basic RPCs needed to configure the NVMe-oF subsystem are detailed below. More information about
working with NVMe over Fabrics specific RPCs can be found on the @ref jsonrpc_components_nvmf_tgt RPC page.

### Using RPCs {#nvmf_config_rpc}

Start the nvmf_tgt application with elevated privileges. Once the target is started,
the `nvmf_create_transport` RPC can be used to initialize a given transport. Below is an
example where the target is started and configured with two different transports.
The RDMA transport is configured with an I/O unit size of 8192 bytes, a max I/O size of 131072 bytes
and an in-capsule data size of 8192 bytes. The TCP transport is configured with an I/O unit size of
16384 bytes, 8 max qpairs per controller, and an in-capsule data size of 8192 bytes.

~~~{.sh}
build/bin/nvmf_tgt
scripts/rpc.py nvmf_create_transport -t RDMA -u 8192 -i 131072 -c 8192
scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192
~~~

Below is an example of creating a malloc bdev and assigning it to a subsystem. Adjust the bdevs,
NQN, serial number, and IP address to your own circumstances. The example adds a listener with the
RDMA transport; if you replace "rdma" with "tcp", the subsystem will add a listener with the TCP
transport instead.

~~~{.sh}
scripts/rpc.py bdev_malloc_create -b Malloc0 512 512
scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK00000000000001 -d SPDK_Controller1
scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 Malloc0
scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t rdma -a 192.168.100.8 -s 4420
~~~

### NQN Formal Definition

NVMe qualified names or NQNs are defined in section 7.9 of the
[NVMe specification](http://nvmexpress.org/wp-content/uploads/NVM_Express_Revision_1.3.pdf). SPDK has attempted to
formalize that definition using [Extended Backus-Naur form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).
SPDK modules use this formal definition (provided below) when validating NQNs.

~~~{.sh}

Basic Types
year = 4 * digit ;
month = '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' ;
digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
hex digit = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | '0' |
'1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;

NQN Definition
NVMe Qualified Name = ( NVMe-oF Discovery NQN | NVMe UUID NQN | NVMe Domain NQN ), '\0' ;
NVMe-oF Discovery NQN = "nqn.2014-08.org.nvmexpress.discovery" ;
NVMe UUID NQN = "nqn.2014-08.org.nvmexpress:uuid:", string UUID ;
string UUID = 8 * hex digit, '-', 3 * (4 * hex digit, '-'), 12 * hex digit ;
NVMe Domain NQN = "nqn.", year, '-', month, '.', reverse domain, ':', utf-8 string ;

~~~

Please note that the following types from the definition above are defined elsewhere:

1. utf-8 string: Defined in [RFC 3629](https://tools.ietf.org/html/rfc3629).
2. reverse domain: Equivalent to domain name as defined in [RFC 1034](https://tools.ietf.org/html/rfc1034).
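
The grammar above can be approximated with a few extended regular expressions. The following is a
minimal sketch, not SPDK's own validation code: the function name `is_valid_nqn` is hypothetical,
the reverse domain is matched loosely as dot-separated letters, digits and hyphens, and the trailing
string is only required to be non-empty rather than checked as UTF-8. It also applies the 223-byte
maximum name length from the specification, not counting the null terminator.

```sh
# Hypothetical helper (not SPDK code): succeed if the argument looks like a valid NQN.
is_valid_nqn() {
    nqn=$1
    uuid_re='^nqn\.2014-08\.org\.nvmexpress:uuid:[0-9A-Fa-f]{8}-([0-9A-Fa-f]{4}-){3}[0-9A-Fa-f]{12}$'
    # Assumption: simplified reverse domain and an unchecked, non-empty trailing string.
    domain_re='^nqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?:.+$'

    # Count bytes (not characters), excluding any terminator.
    len=$(( $(printf '%s' "$nqn" | wc -c) ))
    [ "$len" -le 223 ] || return 1

    # The three alternatives of the "NVMe Qualified Name" rule.
    [ "$nqn" = "nqn.2014-08.org.nvmexpress.discovery" ] && return 0
    printf '%s\n' "$nqn" | grep -Eq "$uuid_re" && return 0
    printf '%s\n' "$nqn" | grep -Eq "$domain_re" && return 0
    return 1
}
```

For example, `is_valid_nqn nqn.2016-06.io.spdk:cnode1` succeeds, while
`is_valid_nqn nqn.2016-13.io.spdk:cnode1` fails because `13` is not a valid month.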

While not stated in the formal definition, SPDK enforces the requirement from the spec that the
"maximum name is 223 bytes in length". SPDK does not include the null terminating character when
defining the length of an NQN, and will accept an NQN containing up to 223 valid bytes with an
additional null terminator. To be precise, SPDK follows the same conventions as the C standard
library function [strlen()](http://man7.org/linux/man-pages/man3/strlen.3.html).

#### NQN Comparisons

SPDK compares NQNs byte for byte without case matching or Unicode normalization. This has specific implications for
UUID-based NQNs. The following pair of NQNs, for example, would not match when compared in the SPDK NVMe-oF Target:

nqn.2014-08.org.nvmexpress:uuid:11111111-aaaa-bbdd-ffee-123456789abc
nqn.2014-08.org.nvmexpress:uuid:11111111-AAAA-BBDD-FFEE-123456789ABC

In order to ensure the consistency of UUID-based NQNs while using SPDK, users should use lowercase when representing
alphabetic hex digits in their NQNs.

### Assigning CPU Cores to the NVMe over Fabrics Target {#nvmf_config_lcore}

SPDK uses the [DPDK Environment Abstraction Layer](http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html)
to gain access to hardware resources such as huge memory pages and CPU core(s). DPDK EAL provides
functions to assign threads to specific cores.
To ensure the SPDK NVMe-oF target has the best performance, configure the NICs and NVMe devices to
be located on the same NUMA node.

The `-m` core mask option specifies a bit mask of the CPU cores that
SPDK is allowed to execute work items on.
For example, to allow SPDK to use cores 24, 25, 26 and 27:
~~~{.sh}
build/bin/nvmf_tgt -m 0xF000000
~~~

## Configuring the Linux NVMe over Fabrics Host {#nvmf_host}

Both the Linux kernel and SPDK implement an NVMe over Fabrics host.
The Linux kernel NVMe-oF host support is provided by the `nvme-rdma` driver
(to support the RDMA transport) and the `nvme-tcp` driver (to support the TCP transport).
The following commands load the two drivers.

~~~{.sh}
modprobe nvme-rdma
modprobe nvme-tcp
~~~

The nvme-cli tool may be used to interface with the Linux kernel NVMe over Fabrics host.
See below for examples of the discover, connect and disconnect commands. For discover and connect,
the transport can be changed to TCP by replacing 'rdma' with 'tcp'.

Discovery:
~~~{.sh}
nvme discover -t rdma -a 192.168.100.8 -s 4420
~~~

Connect:
~~~{.sh}
nvme connect -t rdma -n "nqn.2016-06.io.spdk:cnode1" -a 192.168.100.8 -s 4420
~~~

Disconnect:
~~~{.sh}
nvme disconnect -n "nqn.2016-06.io.spdk:cnode1"
~~~

## Enabling NVMe-oF target tracepoints for offline analysis and debug {#nvmf_trace}

SPDK has a tracing framework for capturing low-level event information at runtime.
@ref tracepoints enable analysis of both performance and application crashes.

## Enabling NVMe-oF Multipath

The SPDK NVMe-oF target and initiator support multiple independent paths to the same NVMe-oF subsystem.
For step-by-step instructions for configuring and switching between paths, see @ref nvmf_multipath_howto .

## Enabling NVMe-oF TLS

The SPDK NVMe-oF target and initiator support establishing a secure TCP connection using the Transport
Layer Security (TLS) protocol in compliance with the NVMe TCP transport specification. Only version 1.3
of the TLS protocol is supported. This feature is considered experimental.

Currently, it is only possible to establish a fabric secure channel using TLS. The channel is
protected by a symmetric pre-shared key (PSK) using either the `TLS_AES_256_GCM_SHA384` (recommended) or
the `TLS_AES_128_GCM_SHA256` cipher suite. The cipher suite is selected based on the hash function
associated with a key. During configuration, the keys are expected to be in the PSK interchange
format (see NVMe TCP transport specification 1.0c, section 3.6.1.5).

The target supports assigning different keys for each host connecting to a given subsystem. It is
also possible for a single host to use different keys for different subsystems. The keys are
expected to be placed in separate files (with permissions configured only to allow read/write
access to the owner) and can be configured using the `--psk` option in the `nvmf_subsystem_add_host`
RPC. Additionally, to allow establishing TLS connections on a given listener, it must be created
with the `--secure-channel` option enabled. This option is mutually exclusive with the
`--allow-any-host` subsystem option, and trying to add such a listener to a subsystem that allows
any host will result in an error.

On the initiator side, the key can be specified using the `--psk` option in the
`bdev_nvme_attach_controller` RPC.

Recommendations on the pre-shared keys:

* It is strongly recommended to change the keys at least once a year.
* Use a strong cryptographic random number generator that provides sufficient entropy
  to generate the keys (e.g. HSM).
* Use a single key to secure transmission between two systems only.
* Delete files containing PSKs as soon as they are not needed.

Additionally, it is recommended to follow:
[RFC 9257 'Guidance for External Pre-Shared Key (PSK) Usage in TLS'](https://www.rfc-editor.org/rfc/rfc9257.html)

### Target setup

~~~{.sh}
cat key.txt
NVMeTLSkey-1:01:MDAxMTIyMzM0NDU1NjY3Nzg4OTlhYWJiY2NkZGVlZmZwJEiQ:

build/bin/nvmf_tgt &
scripts/rpc.py nvmf_create_transport -t TCP
scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -s SPDK00000000000001 -m 10
scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t tcp -a 127.0.0.1 -s 4420 \
    --secure-channel
scripts/rpc.py nvmf_subsystem_add_host nqn.2016-06.io.spdk:cnode1 nqn.2016-06.io.spdk:host1 \
    --psk key.txt
~~~

### Initiator setup

As an SPDK initiator example, the bdevperf application may be used, because it depends on SPDK's
NVMe TCP driver.

~~~{.sh}
cat key.txt
NVMeTLSkey-1:01:MDAxMTIyMzM0NDU1NjY3Nzg4OTlhYWJiY2NkZGVlZmZwJEiQ:

build/examples/bdevperf -m 0x2 -z -r /var/tmp/bdevperf.sock -q 128 -o 4096 -w verify -t 10 &
scripts/rpc.py -s /var/tmp/bdevperf.sock bdev_nvme_attach_controller -b TLSTEST -t tcp -a 127.0.0.1 \
    -s 4420 -f ipv4 -n nqn.2016-06.io.spdk:cnode1 -q nqn.2016-06.io.spdk:host1 \
    --psk key.txt
~~~

The first of the two commands launches bdevperf; the second attempts to construct an NVMe bdev
and establish a TLS connection. The same PSK must be used on both the target and the
initiator side.

## NVMe-oF in-band authentication

The NVMe-oF driver and NVMe-oF target both support in-band authentication using the DH-HMAC-CHAP
protocol. It allows the target to authenticate the host and the host to authenticate the target
(the latter part is optional).

The authentication will be performed if a subsystem is configured to allow a host with a set of
DH-HMAC-CHAP keys. Each host is allowed to use different keys to connect to different subsystems
and each subsystem might use different keys for different hosts. For instance, the following
configures three hosts, two of which can request bidirectional authentication:

```{.sh}
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key0 --dhchap-ctrlr-key ctrlr-key0
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host1 \
    --dhchap-key key1 --dhchap-ctrlr-key ctrlr-key1
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host2 \
    --dhchap-key key2
```

Additionally, it's possible to change the keys while preserving existing connections to a subsystem
via `nvmf_subsystem_set_keys`. After that's done, new connections and reauthentication requests
will be required to use the new keys.

```{.sh}
$ scripts/rpc.py nvmf_subsystem_add_host nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key0 --dhchap-ctrlr-key ctrlr-key0
# Host nqn.2024-05.io.spdk:host0 connects to subsystem nqn.2024-05.io.spdk:cnode0
$ scripts/rpc.py nvmf_subsystem_set_keys nqn.2024-05.io.spdk:cnode0 nqn.2024-05.io.spdk:host0 \
    --dhchap-key key1 --dhchap-ctrlr-key ctrlr-key1
```

On the host side, the keys are specified when attaching controllers, e.g.:

```{.sh}
$ scripts/rpc.py bdev_nvme_attach_controller -b nvme0 -t tcp -f ipv4 -a 127.0.0.1 -s 4420 \
    -n nqn.2024-05.io.spdk:cnode0 -q nqn.2024-05.io.spdk:host0 --dhchap-key key0 \
    --dhchap-ctrlr-key ctrlr-key0
```

All hash functions/Diffie-Hellman groups defined in the NVMe Base Specification 2.0d are supported
and the algorithms used for a given DH-HMAC-CHAP transaction are negotiated at the beginning. The
SPDK NVMe-oF target selects the strongest available hash/group depending on its configuration and
the capabilities of a peer. Users can limit the allowed hash functions and/or Diffie-Hellman groups
via RPCs. For example, the following limits the target (`nvmf_set_config`) and the driver
(`bdev_nvme_set_options`) to use sha384, sha512 and ffdhe6144, ffdhe8192:

```{.sh}
$ scripts/rpc.py nvmf_set_config --dhchap-digests sha384,sha512 \
    --dhchap-dhgroups ffdhe6144,ffdhe8192
$ scripts/rpc.py bdev_nvme_set_options --dhchap-digests sha384,sha512 \
    --dhchap-dhgroups ffdhe6144,ffdhe8192
```

The NVMe specification describes the method for using in-band authentication in conjunction with
establishing a secure channel (e.g. TLS). However, that isn't supported currently, so in order to
perform in-band authentication, hosts must connect over regular listeners (i.e. those that weren't
created with the `--secure-channel` option).