ec9e6728 | 03-Apr-2024 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Fix the case that namespace was removed during reset
Some NVMe-oF targets add a listener and then add a namespace to a subsystem. If such an NVMe-oF target resets, the subsystem can be temporarily empty when its connection is re-established. The namespace can then be exposed again later.
The NVMe-oF initiator of the NVMe bdev module could not handle reset correctly for such targets because nvme_ns->ns was not updated and was never checked for NULL.
This patch adds the following changes to fix the bug.
After adminq is reconnected, check if ns exists or not. If ns does not exist, clear nvme_ns->ns to NULL.
Async event callback fills nvme_ns->ns if ns is active and nvme_ns->ns is different from ns.
Add check if nvme_ns->ns is not NULL to all corresponding functions.
Add unit test for verification.
Fixes github issue #3313
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Iad3c2aace56a21191a7d772981bec8b4e2047af7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/22485 Reviewed-by: Ben Walker <ben@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <jim.harris@samsung.com>
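The three changes listed in this commit message can be pictured with a small sketch. It is only an illustration under stated assumptions: the demo_* types and functions are hypothetical stand-ins, not the SPDK structures or the code in the patch, and the -ENXIO return value is chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real SPDK types. */
struct demo_ns {
	int id;
};

struct demo_nvme_ns {
	struct demo_ns *ns; /* NULL while the target has not re-exposed the namespace */
};

/* After the admin queue reconnects: drop the stale pointer if the namespace is gone. */
static void
demo_after_adminq_reconnect(struct demo_nvme_ns *nvme_ns, struct demo_ns *active_ns)
{
	assert(nvme_ns != NULL);
	if (active_ns == NULL) {
		nvme_ns->ns = NULL;
	}
}

/* Async event: the namespace is active, so refill the pointer if it differs. */
static void
demo_on_ns_event(struct demo_nvme_ns *nvme_ns, struct demo_ns *active_ns)
{
	assert(nvme_ns != NULL);
	if (active_ns != NULL && nvme_ns->ns != active_ns) {
		nvme_ns->ns = active_ns;
	}
}

/* Any path that would dereference ns checks it first (error code is an assumption). */
static int
demo_submit_io(const struct demo_nvme_ns *nvme_ns)
{
	if (nvme_ns->ns == NULL) {
		return -ENXIO;
	}
	return 0;
}
```

In this sketch an I/O submitted while the subsystem is temporarily empty fails cleanly instead of dereferencing a stale pointer, and succeeds again once the event handler has refilled nvme_ns->ns.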
|
1c41caa7 | 11-Mar-2024 |
Marcin Spiewak <marcin.spiewak@intel.com> |
bdev/nvme: handle uuid generation errors
This is the second patch in a series that implements error code handling during uuid generation.
The definition of nvme_generate_uuid() is changed: it takes an additional parameter, a pointer to the location where the uuid is stored, and returns an error code instead of the uuid structure.
If nvme_generate_uuid() returns an error code in nvme_disk_create(), an error message is logged and the function returns the error code passed from nvme_generate_uuid().
Now that errors from nvme_generate_uuid() are handled, the assert that verified the status of the snprintf() operation is replaced by a regular 'if' that returns an error code on failure.
Change-Id: Iaf1a76597d9aa98c433ebb9d69ba6f9a22773deb Signed-off-by: Marcin Spiewak <marcin.spiewak@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/22291 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <jim.harris@samsung.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
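A minimal sketch of the calling convention described above: the result is written through an out-parameter and the return value carries the error code. The demo_* names, the 16-byte container, and the toy derivation are assumptions made for illustration; they are not the SPDK implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 16-byte UUID container; not the SPDK uuid type. */
struct demo_uuid {
	uint8_t bytes[16];
};

/* Returns 0 on success or a negative errno; the UUID is written through 'out'. */
static int
demo_generate_uuid(const char *sn, uint32_t nsid, struct demo_uuid *out)
{
	char merged[64];
	int rc;

	assert(out != NULL);
	if (sn == NULL) {
		return -EINVAL;
	}

	rc = snprintf(merged, sizeof(merged), "%s%u", sn, (unsigned)nsid);
	/* The snprintf() result is checked with a regular 'if', not an assert. */
	if (rc <= 0 || (size_t)rc >= sizeof(merged)) {
		return -EINVAL;
	}

	/* Toy derivation for the sketch only: fold the string into 16 bytes. */
	memset(out->bytes, 0, sizeof(out->bytes));
	for (int i = 0; i < rc; i++) {
		out->bytes[i % 16] ^= (uint8_t)merged[i];
	}
	return 0;
}

/* The caller logs the failure and propagates the same error code. */
static int
demo_disk_create(const char *sn, uint32_t nsid, struct demo_uuid *uuid)
{
	int rc = demo_generate_uuid(sn, nsid, uuid);

	if (rc != 0) {
		fprintf(stderr, "UUID generation failed (%d)\n", rc);
		return rc;
	}
	return 0;
}
```

Returning an int keeps the failure visible in release builds, where an assert would be compiled out.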
|
75883719 | 09-Feb-2024 |
Konrad Sztyber <konrad.sztyber@intel.com> |
bdev/nvme: specify allowed DH-HMAC-CHAP digests/dhgroups
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Id4a422d63d7a1526e1e78a84bd6e3b8624c9e41b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/22021 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <ben@nvidia.com> Reviewed-by: Jim Harris <jim.harris@samsung.com>
|
1ecd5b03 | 15-Dec-2023 |
Konrad Sztyber <konrad.sztyber@intel.com> |
bdev/nvme: use keyring for PSKs
It is now possible to specify NVMe/TLS PSKs via keys attached to the keyring. For now, the old method is also available, but it's deprecated and will be removed in the future. No new RPC parameters have been added; instead, the PSK is first interpreted as a key name and, if that fails, as a path to the key file.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I663e67ff11a3943c3c11d2f4ba4e31473fcc2e67 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21749 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <jim.harris@samsung.com>
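The lookup order described in this commit message (key name first, file path as the deprecated fallback) could be sketched as follows. Everything here is hypothetical: the demo_* names, the fixed key list, and the stubbed file probe exist only to make the example self-contained and do not reflect the SPDK keyring API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_psk_source {
	DEMO_PSK_FROM_KEYRING,
	DEMO_PSK_FROM_FILE,
	DEMO_PSK_NOT_FOUND
};

/* Hypothetical keyring contents; a real keyring would be populated at runtime. */
static const char *const g_demo_keys[] = { "key0", "key1" };

static int
demo_keyring_has(const char *name)
{
	for (size_t i = 0; i < sizeof(g_demo_keys) / sizeof(g_demo_keys[0]); i++) {
		if (strcmp(g_demo_keys[i], name) == 0) {
			return 1;
		}
	}
	return 0;
}

/* Stub for the sketch: pretend every absolute path names a readable key file. */
static int
demo_file_exists(const char *path)
{
	return path[0] == '/';
}

/* Interpret the same string as a key name first, then as a file path. */
static enum demo_psk_source
demo_resolve_psk(const char *psk)
{
	assert(psk != NULL);
	if (demo_keyring_has(psk)) {
		return DEMO_PSK_FROM_KEYRING;
	}
	if (demo_file_exists(psk)) {
		return DEMO_PSK_FROM_FILE; /* deprecated fallback */
	}
	return DEMO_PSK_NOT_FOUND;
}
```

Reusing the existing parameter this way means callers do not need a new RPC field to opt in to the keyring.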
|
a3bc7320 | 29-Sep-2023 |
Ben Walker <ben@nvidia.com> |
bdev/nvme: Replace spdk_bdev_io module_link with per-io ctx retry_link
We're removing module_link from spdk_bdev_io, so add a list element to this module's per-io area.
Change-Id: I771c6d415107cacd797b51b89f9d10b52250d39d Signed-off-by: Ben Walker <ben@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21937 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
|
4f1a0b26 | 04-Jan-2024 |
Jacek Kalwas <jacek.kalwas@intel.com> |
nvme: add contig version of rd/wr with ext opts
The flow is simplified for the case when there is only a single iovec.
Some past measurements showed better results for the contig version than for the sgl version (without ext opts being used).
Change-Id: I5315703b2814e1f61bdf7b991d6a82853f27ec22 Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21913 Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <jim.harris@samsung.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
|
e3367a23 | 20-Feb-2024 |
Jim Harris <jim.harris@samsung.com> |
bdev/nvme: set max_num_segments based on controller limit
Fixes issue #3269.
Signed-off-by: Jim Harris <jim.harris@samsung.com> Change-Id: I48f8ef47841123c451421c53ad68714eff26722c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/21952 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Mellanox Build Bot
|
d65bd99e | 27-Sep-2023 |
Pierre Lestringant <plestringant@kalray.eu> |
include: Remove duplicate includes in source files
Change-Id: I7dd6ae6fa11603a956c3d178b9b23d2c755913d1 Signed-off-by: Pierre Lestringant <plestringant@kalray.eu> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20106 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <jim.harris@samsung.com>
|
e85b6814 | 27-Nov-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: bdev_nvme_io_complete() also checks retry_count for -ENXIO
We should be able to disable retry also for bdev_nvme_io_complete().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I47c3e7c3b9a6a136bd7070feac6e710645944387 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20702 Reviewed-by: Richael <richael.zhuang@arm.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Community-CI: Mellanox Build Bot
|
efb48abc | 27-Nov-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Remove remove parameter from bdev_nvme_failover_ctrlr()
The remove parameter of bdev_nvme_failover_ctrlr() is always set to false. _bdev_nvme_delete() sets the remove parameter to true but it calls bdev_nvme_failover_ctrlr_unsafe().
For clarification and simplification, remove the remove parameter from bdev_nvme_failover_ctrlr().
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I5373ba0bb1aeac896579d2effbec7a410782d7a1 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20701 Reviewed-by: Richael <richael.zhuang@arm.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@solidigm.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Community-CI: Mellanox Build Bot
|
04a428f5 | 30-Oct-2023 |
Karl Bonde Torp <k.torp@samsung.com> |
nvme: add iovec passthru
This is used for sending big passthru commands, like Report Zones, over nvmf.
Change-Id: I83188367e0266e093faadd49cdb2e051eae71829 Signed-off-by: Karl Bonde Torp <k.torp@samsung.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20498 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <ben@nvidia.com>
|
07034ca3 | 27-Jul-2023 |
Artsiom Koltun <artsiom.koltun@intel.com> |
bdev/nvme: wait for detached nvme controller in rpc
rpc_bdev_nvme_detach_controller did not wait for the operation to complete. If an attach-detach-attach sequence with the same name is run, the script might fail if the detach has not completed on an active path before the second attach. Wait until the path no longer exists before sending back the JSON-RPC response.
Change-Id: Id0e1fb49e69745acf1479188307aba20bf1b2020 Signed-off-by: Artsiom Koltun <artsiom.koltun@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/19340 Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
277dbc96 | 04-Apr-2023 |
Konrad Sztyber <konrad.sztyber@intel.com> |
bdev/nvme: implement accel sequence callbacks
For now these include functions required for managing an accel sequence: finish, reverse, and abort, as well as a callback for appending a crc32c operation. Additional callbacks can be added as needed at a later date.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ic9d37bddc295eb4456707be6cd56294b4ac166ae Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18764 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
ea941cae | 28-Jul-2023 |
Konrad Sztyber <konrad.sztyber@intel.com> |
test/unit: use spdk_ut_run_tests()
Replaced direct calls to the CUnit's functions to run the tests with spdk_ut_run_tests(). That way, each test will have the ability to run a specific test case.
The blob.c unit test wasn't changed, because it runs all tests multiple times with different parameter combinations, so it cannot be easily converted. In the future, each such combination could be split into a separate test suite, which would make it compatible with spdk_ut_run_tests().
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I4463f808f89844e9bf32b5b31eda197c5d729d1d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/19288 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
ae431e31 | 28-Jul-2023 |
Konrad Sztyber <konrad.sztyber@intel.com> |
test/unit: move spdk_cunit.h to include/spdk_internal
It'll make it easier to include this file outside of unit tests.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I171ddb8649f67b5786f08647560e2907603d0574 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/19284 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
|
43cfaf81 | 18-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: nvme_ctrlr_op_rpc() can enable/disables ctrlr dynamically
Extend nvme_ctrlr_op_rpc() and nvme_bdev_ctrlr_op_rpc() to support enabling or disabling a controller, or all controllers in a bdev controller.
To disable a controller, bdev_nvme_disable_ctrlr() cancels the reconnect and disables the controller if a reconnect is already scheduled; otherwise it disconnects and disables the controller. Disabling keeps a controller disconnected without scheduling a reconnect.
To enable a controller, bdev_nvme_enable_ctrlr() reconnects the controller if it is disabled.
To indicate that a controller is disabled, add a disabled variable to the nvme_ctrlr structure.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I02f97cdc549f317f4d37c802a125bf0f0db855fe Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18235 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Richael <richael.zhuang@arm.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
|
512b7553 | 25-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Rename bdev_nvme_reset/failover() to bdev_nvme_reset/failover_ctrlr()
Renaming bdev_nvme_reset() and bdev_nvme_failover() to bdev_nvme_reset_ctrlr() and bdev_nvme_failover_ctrlr(), respectively makes the APIs clearer. We will be able to add upcoming APIs for ctrlr enable/disablement more easily.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I93d7f8fa81b200dfdb6851543c76462312aff393 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18376 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Richael <richael.zhuang@arm.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
|
6f2e8fa5 | 23-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Add nvme_bdev_ctrlr_op_rpc() to operate all ctrlrs in a nbdev_ctrlr
The bdev_nvme_reset_controller RPC was convenient but did not support multipath configuration. Support multipath configuration in this patch.
Add nvme_bdev_ctrlr_op_rpc() to operate all ctrlrs in a nbdev_ctrlr.
Add a new parameter cntlid to the bdev_nvme_reset_controller RPC.
The bdev_nvme_reset_controller RPC calls nvme_ctrlr_op_rpc() if cntlid is omitted or nvme_bdev_ctrlr_op_rpc() otherwise.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I9e71db79ad395428bb07c4bbf64d615fda711420 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16744 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Richael <richael.zhuang@arm.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
|
ba5ae93d | 23-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: nvme_ctrlr_op_rpc() always call callback for simplification
For nvme_ctrlr_op_rpc(), change its return type to void and make it always call the completion callback. This simplifies the code and makes it easier for the next patch to reset each ctrlr sequentially when multipath is configured.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I744313b9f8ac650fe7c468d657a2ce899629479c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16743 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
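The "return void, always call the callback" convention from this commit message can be illustrated with a short sketch. The demo_* names and the specific error codes are assumptions for the example, and the operation completes synchronously here, which the real asynchronous code does not.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical completion callback type: rc is 0 or a negative errno. */
typedef void (*demo_op_cb)(void *cb_arg, int rc);

struct demo_ctrlr {
	int busy;
};

/*
 * Returns void: every outcome, including early errors, is reported
 * through cb_fn, so callers have a single place to handle completion.
 */
static void
demo_ctrlr_op(struct demo_ctrlr *ctrlr, demo_op_cb cb_fn, void *cb_arg)
{
	int rc = 0;

	assert(cb_fn != NULL);
	if (ctrlr == NULL) {
		rc = -EINVAL;
	} else if (ctrlr->busy) {
		rc = -EBUSY;
	}
	/* In this synchronous sketch, success completes immediately as well. */
	cb_fn(cb_arg, rc);
}

/* Example callback: store the result where the caller can read it. */
static void
demo_record_rc(void *cb_arg, int rc)
{
	*(int *)cb_arg = rc;
}
```

Because the callback is the only completion path, a caller that walks several controllers in sequence can start the next operation from inside the callback without a separate branch for immediate failures.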
|
aefc9cc4 | 17-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Rename bdev_nvme_reset_rpc() to nvme_ctrlr_op_rpc()
Rename bdev_nvme_reset_rpc() to nvme_ctrlr_op_rpc() and change related function pointers and variables accordingly to reuse these for other operations.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I63188ee4891715f5e9390948206ebeaf3dd2f7b3 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18212 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Richael <richael.zhuang@arm.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
|
d62441e3 | 17-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Change 2nd param of reset_cb from boolean to int
Change the second parameter of bdev_nvme_reset_cb from boolean to int. This helps change the return type of bdev_nvme_reset_rpc() from int to void, because bdev_nvme_reset_rpc() will always call reset_cb, including in error cases, but must not lose the error information.
The second parameter of bdev_nvme_reset_complete() is still boolean because bdev_nvme_reset_complete() is tied to disconnected_cb, and the second parameter of disconnected_cb is also boolean. Changing all of these would require a lot of effort. This may be done in the future.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I946e879dfe82c8f6423f7e1a72ec058a8c58c2ba Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18211 Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
bb4be4dc | 22-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Ctrlr reset establishes connection for qpairs sequentially
A full controller reset sequence disconnects qpairs sequentially, but connects qpairs in parallel. It moves to the next ctrlr_channel after starting connection establishment for the previous ctrlr_channel. Connection establishment for qpairs is done in parallel.
However, this design caused one very difficult race issue. As highlighted in the previous patch, a full controller reset sequence returned success even though connection establishment had failed. It is better to confirm connection establishment one by one and return success only after all connections are actually established.
A controller reset sequence is not performance-critical. This change also reduces resource contention for connection establishment on the target side.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ifa2836f06865bcce6bc528719a51119522c8f43b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18252 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com>
|
6cdc0f25 | 17-May-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Defer failover() until reset() completes
There was a complex issue in which a failover was lost and the I/O qpair was never created again if the fabric connect command for the I/O qpair timed out while the controller was being reset.
To create the I/O qpair in such a case, add a boolean pending_failover variable to the nvme_ctrlr structure. When bdev_nvme_failover() is called, if nvme_ctrlr->resetting is true, set pending_failover to true and return. Then, in _bdev_nvme_reset_complete(), if pending_failover is true, set pending_failover to false and call bdev_nvme_failover().
However, we have to be more careful. Most SPDK threads call bdev_nvme_failover() almost simultaneously for a network error. In this case, we have to call bdev_nvme_failover() only once per network error. To do this, add and use another boolean variable, in_failover.
After this change, a bdev_nvme_failover() call is not lost but deferred. Hence, use -EINPROGRESS instead of -EBUSY for clarity.
Verify this change by adding a unit test case.
NOTE: A more practical workaround is to extend the timeout for the fabric connect command. While a fabric connect command is in progress, I/Os are queued even if the upper layer does not enable I/O error resiliency. This fix is still necessary, however; otherwise, connection establishment is not retried.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ibe346b8ae35cab5bd2bcbda1aaa12d2d9364e283 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18209 Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot
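The interaction of the two flags described in this commit message can be sketched as a small state machine. This is an illustration under assumptions, not the patch itself: the demo_* names and the failover_runs counter are hypothetical, and returning -EBUSY for a request that arrives while a failover is already running is a choice made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical controller state; only the flags named in the message are modeled. */
struct demo_ctrlr {
	bool resetting;
	bool pending_failover;
	bool in_failover;
	int failover_runs; /* counts failovers that were actually started */
};

static int
demo_failover(struct demo_ctrlr *c)
{
	if (c->resetting) {
		if (c->in_failover) {
			/* A failover is already running for this error (sketch assumption). */
			return -EBUSY;
		}
		/* A plain reset is running: remember the request instead of dropping it. */
		c->pending_failover = true;
		return -EINPROGRESS;
	}

	c->resetting = true;
	c->in_failover = true;
	c->failover_runs++;
	return 0;
}

static void
demo_reset_complete(struct demo_ctrlr *c)
{
	c->resetting = false;
	c->in_failover = false;

	if (c->pending_failover) {
		/* Run the deferred failover exactly once. */
		c->pending_failover = false;
		(void)demo_failover(c);
	}
}
```

However many threads report the same network error during a reset, the sketch records a single pending request and starts one failover when the reset completes.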
|
7348b89c | 26-Apr-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Reset tries alternate paths immediately if allowed
Previously, one bdev_nvme_failover() call tried only one alternate path. However, when the controller had more than two alternate paths and its reconnect_delay_sec was non-zero, if the first failover failed, the following retries were delayed by reconnect_delay_sec seconds.
We want all alternate paths to be tried immediately, but want to apply a backoff if we try the same alternate path again in a single bdev_nvme_failover() call.
Hence, add last_failed_tsc to the nvme_path_id structure. Set last_failed_tsc to the current timestamp when bdev_nvme_failover() is called or a reconnect fails, and clear last_failed_tsc when a reconnect succeeds.
Then, control trid switching and reconnect timing based on last_failed_tsc.
Add comments in the source code to explain this complex logic and modify unit tests to test this behavior.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I464ea12a32efe0de3889a6705fa0a6c92aeadbd6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16864 Reviewed-by: Jim Harris <james.r.harris@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
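A minimal sketch of how a per-path failure timestamp can separate "try immediately" from "back off". The demo_* names are hypothetical, ticks are plain integers rather than a real TSC, and treating zero as "no failure recorded" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-path record; 0 means no failure is recorded for this path. */
struct demo_path_id {
	uint64_t last_failed_tsc;
};

/* Record the time of a failover request or of a failed reconnect. */
static void
demo_mark_failed(struct demo_path_id *p, uint64_t now)
{
	assert(now != 0);
	p->last_failed_tsc = now;
}

/* A successful reconnect clears the record. */
static void
demo_mark_connected(struct demo_path_id *p)
{
	p->last_failed_tsc = 0;
}

/*
 * A path that has not failed recently may be tried at once; a path that
 * already failed must wait until delay_ticks have elapsed since the failure.
 */
static bool
demo_can_try_now(const struct demo_path_id *p, uint64_t now, uint64_t delay_ticks)
{
	if (p->last_failed_tsc == 0) {
		return true;
	}
	return now - p->last_failed_tsc >= delay_ticks;
}
```

With one record per path, untried alternates are attempted back to back, and the delay applies only when the rotation returns to a path that has already failed.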
|
8d8208d6 | 09-Mar-2023 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Clear caching io_path when removing io_path dynamically
A user can remove an io_path dynamically while I/O is being processed.
An I/O that is to be retried should clear its cached io_path and get another io_path from scratch for the retry.
Verify this by adding a unit test case.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I891aafbb132c3beaef5cd4f55c9b4fde21aeaae9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/17120 Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|