c113e4cd | 09-Mar-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Alloc qpair context dynamically on nvme_ctrlr_channel
This is another preparation to disconnect qpair asynchronously.
Add nvme_qpair object and move the qpair and poll_group pointers and the io_path_list list from nvme_ctrlr_channel to nvme_qpair. nvme_qpair is allocated dynamically when creating nvme_ctrlr_channel, and nvme_ctrlr_channel points to nvme_qpair.
We want to keep the number of pointer dereferences in the I/O path unchanged. Change nvme_io_path to point to nvme_qpair instead of nvme_ctrlr_channel, and add an nvme_ctrlr_channel pointer to nvme_qpair.
nvme_ctrlr_channel may be freed earlier than nvme_qpair. nvme_poll_group lists nvme_qpair instead of nvme_ctrlr_channel and nvme_qpair has a pointer to nvme_ctrlr.
By using the nvme_ctrlr pointer of the nvme_qpair, a helper function nvme_ctrlr_channel_get_ctrlr() is not necessary any more. Remove it.
Change-Id: Ib3f579d3441f31b9db7d3844ec56c49e2bb53a5d Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11832 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
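The pointer relationships described in this commit can be pictured with a minimal sketch. All `ex_` names and the placeholder `void *` fields are hypothetical stand-ins, not the module's actual definitions; only the direction of the pointers follows the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real structs carry many more fields. */
struct ex_nvme_ctrlr {
	int id;
};

struct ex_nvme_qpair;

struct ex_nvme_ctrlr_channel {
	struct ex_nvme_qpair *qpair;		/* allocated when the channel is created */
};

struct ex_nvme_qpair {
	struct ex_nvme_ctrlr *ctrlr;		/* usable even after the channel is freed */
	struct ex_nvme_ctrlr_channel *ctrlr_ch;	/* NULL once the channel is gone */
	void *qpair;				/* placeholder for the I/O qpair pointer */
	void *group;				/* placeholder for the poll group pointer */
};

struct ex_nvme_io_path {
	struct ex_nvme_qpair *qpair;		/* pointed at the channel before this change */
};

/* Reach the controller through the qpair context, without touching the channel. */
static struct ex_nvme_ctrlr *
ex_io_path_get_ctrlr(const struct ex_nvme_io_path *io_path)
{
	return io_path->qpair->ctrlr;
}
```

Because the controller pointer lives in the qpair context, the lookup keeps working when `ctrlr_ch` has already been cleared, which is the situation the commit prepares for.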
|
d7f0a182 | 08-Mar-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Inline bdev_nvme_destroy_qpair()
In the following patches, spdk_nvme_ctrlr_disconnect_io_qpair() will be changed to be asynchronous. spdk_nvme_ctrlr_disconnect_io_qpair() will be called first, and spdk_nvme_ctrlr_free_io_qpair() will be called only after the qpair is actually disconnected.
We will not be able to keep the current bdev_nvme_destroy_qpair() function.
As a preparation, inline bdev_nvme_destroy_qpair() and remove it.
Additionally, this patch has the following changes.
Previously I/O qpair was freed and then I/O path caches were cleared. Both are SPDK thread local. So there is no dependency for the ordering of these two operations. However, it will reduce the size of the following patches if we clear I/O path caches before freeing I/O qpair when the qpair is disconnected. Hence we clear I/O path caches and then free I/O qpair.
Remove DTRACE for bdev_nvme_destroy_qpair() for now. It will be restored in the following patches.
Furthermore, fix a potential NULL pointer access in bdev_nvme_create_qpair().
Change-Id: I0ab78ccb0d240e56b95b53179341afcd909a31f6 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10746 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
00a79982 | 04-Mar-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Move per controller settings into an option structure
The following patches will enable us to specify I/O error resiliency options per nvme_ctrlr as global options. To make this easier, move the per-controller options for I/O error resiliency into struct nvme_ctrlr_opts.
prchk_flags is not exactly for resiliency, but move it into struct nvme_ctrlr_opts too.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I85fd1738bb6e293cd804b086ade82274485f213d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11829 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
|
1a00f5c0 | 08-Mar-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Fix overflow of RB tree comparison when the NSID is very big
If 0 - UINT32_MAX or UINT32_MAX - 0 is assigned to an int variable, the value overflows and the comparison gives an unexpected result.
Fix the bug and add a unit test case to verify the fix.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: Ib045273238753e16755328805b38569909c8b83a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11836 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot
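The overflow described in this commit comes from returning a difference of two 32-bit NSIDs as an int. A minimal sketch of an overflow-free comparator follows; `struct example_ns` and `example_ns_cmp()` are hypothetical names for illustration, not the identifiers used in the module.

```c
#include <stdint.h>

/* Hypothetical stand-in for a namespace object keyed by its 32-bit NSID. */
struct example_ns {
	uint32_t id;
};

/*
 * Overflow-free three-way comparison for an RB tree.
 * Returning (int)(a->id - b->id) instead would wrap around for pairs
 * such as 0 and UINT32_MAX and could report the wrong ordering.
 */
static int
example_ns_cmp(const struct example_ns *a, const struct example_ns *b)
{
	/* Each relational expression is 0 or 1, so the result is -1, 0, or 1. */
	return (a->id > b->id) - (a->id < b->id);
}
```

Comparing instead of subtracting keeps the result within -1, 0, and 1 regardless of how far apart the two NSIDs are.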
|
08f9b401 | 27-Jan-2022 |
Evgeniy Kochetov <evgeniik@nvidia.com> |
bdev/nvme: Fix namespace comparison
This patch aligns namespace comparison with the Linux kernel implementation:
- UUID is optional and may be NULL
- command set (CSI) should be the same
Signed-off-by: Evgeniy Kochetov <evgeniik@nvidia.com> Change-Id: I8f889989f24cd51b104057217f87eb303b30fa68 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11312 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
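The two rules listed in this commit can be sketched as a small predicate. The struct and function names are hypothetical, and treating "one side has a UUID, the other does not" as a mismatch is an assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_UUID_LEN 16

/* Hypothetical subset of the identifiers compared between two namespaces. */
struct ex_ns_ids {
	const uint8_t *uuid;	/* NULL when the namespace reports no UUID */
	uint8_t csi;		/* command set identifier */
};

static bool
ex_ns_ids_equal(const struct ex_ns_ids *a, const struct ex_ns_ids *b)
{
	if (a->csi != b->csi) {
		return false;
	}
	if (a->uuid == NULL || b->uuid == NULL) {
		/* Assumption: equal only when both sides lack a UUID. */
		return a->uuid == b->uuid;
	}
	return memcmp(a->uuid, b->uuid, EX_UUID_LEN) == 0;
}
```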
|
3185df90 | 03-Jan-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
ut/bdev_nvme: Manage adminq's state and return -ENXIO if adminq is disconnected
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I81d4a8ce5c487449ab634bcd4f984d6867febf35 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10949 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
49b8d1f3 | 20-Dec-2021 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
ut/bdev_nvme: Delete qpair after unwinding context from process_completions()
This is the same effort as the last patch.
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I94ef08abdbb2bd2e07d0cd1e552c5d05c805233e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10817 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
5485f55d | 20-Dec-2021 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
ut/bdev_nvme: Separate disconnected and connected qpair in poll_group
More precise stubs for spdk_nvme_poll_group are critically important to verify upcoming changes.
Add a flag is_failed to struct spdk_nvme_qpair separately from is_connected. This is used to inject an error into a connection.
Replace the single list qpairs with two lists, connected_qpairs and disconnected_qpairs, in struct spdk_nvme_poll_group.
Then use these to manage qpairs in the poll group.
spdk_nvme_ctrlr_reconnect_io_qpair() is not used in the NVMe bdev module now. Remove the corresponding stub.
Adjust polling count accordingly.
Change-Id: I4d867c56ae518276813f6f96d23a5f6933364fd4 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10816 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
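A minimal sketch of a stub that keeps connected and disconnected qpairs on separate lists is shown below. The `ut_` types and functions are hypothetical and much smaller than the real unit test stubs; only the field names is_connected, is_failed, connected_qpairs, and disconnected_qpairs come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct ut_qpair {
	bool is_connected;
	bool is_failed;		/* set by a test to inject a connection error */
	TAILQ_ENTRY(ut_qpair) link;
};

struct ut_poll_group {
	TAILQ_HEAD(, ut_qpair) connected_qpairs;
	TAILQ_HEAD(, ut_qpair) disconnected_qpairs;
};

static void
ut_poll_group_init(struct ut_poll_group *group)
{
	TAILQ_INIT(&group->connected_qpairs);
	TAILQ_INIT(&group->disconnected_qpairs);
}

/* Track a qpair on the list that matches its connection state. */
static void
ut_poll_group_add(struct ut_poll_group *group, struct ut_qpair *qpair)
{
	if (qpair->is_connected) {
		TAILQ_INSERT_TAIL(&group->connected_qpairs, qpair, link);
	} else {
		TAILQ_INSERT_TAIL(&group->disconnected_qpairs, qpair, link);
	}
}

/* Move failed qpairs to the disconnected list; return how many moved. */
static int
ut_poll_group_poll(struct ut_poll_group *group)
{
	struct ut_qpair *qpair, *next;
	int moved = 0;

	for (qpair = TAILQ_FIRST(&group->connected_qpairs); qpair != NULL; qpair = next) {
		next = TAILQ_NEXT(qpair, link);	/* saved first: qpair may be unlinked */
		if (qpair->is_failed) {
			TAILQ_REMOVE(&group->connected_qpairs, qpair, link);
			qpair->is_connected = false;
			TAILQ_INSERT_TAIL(&group->disconnected_qpairs, qpair, link);
			moved++;
		}
	}
	return moved;
}
```

Keeping the injected error in is_failed, apart from is_connected, lets a test decide when a connection breaks while the poll call decides when the stub notices it.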
|
80e81273 | 14-Jan-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Do not use ctrlr for I/O submission if reconnect failed repeatedly
If ctrlr_loss_timeout_sec is set to -1, reconnect is retried indefinitely, and I/Os continue to be queued.
This patch adds another option, fast_io_fail_timeout_sec, and a flag, fast_io_fail_timedout, to nvme_ctrlr.
If fast_io_fail_timeout_sec seconds have passed since the reset started, set fast_io_fail_timedout to true so that the path is no longer used for I/O submission.
fast_io_fail_timeout_sec is initialized to zero, the same as ctrlr_loss_timeout_sec and reconnect_delay_sec.
The parameter name follows fast_io_fail_tmo of the well-known DM-multipath.
Change-Id: Ib870cf8e2fd29300c47f1df69617776f4e67bd8c Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10301 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
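The timeout check described in this commit can be sketched as follows. The struct, the function, and the use of whole seconds are hypothetical simplifications, and treating a zero timeout as "disabled" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

struct ex_ctrlr_state {
	uint32_t fast_io_fail_timeout_sec;	/* 0 is assumed to mean "disabled" */
	uint64_t reset_start_sec;		/* when the current reset began */
	bool fast_io_fail_timedout;
};

/* Update the flag; a caller would skip this path for new I/O once it is true. */
static bool
ex_update_fast_io_fail(struct ex_ctrlr_state *ctrlr, uint64_t now_sec)
{
	if (ctrlr->fast_io_fail_timeout_sec != 0 &&
	    now_sec - ctrlr->reset_start_sec >= ctrlr->fast_io_fail_timeout_sec) {
		ctrlr->fast_io_fail_timedout = true;
	}
	return ctrlr->fast_io_fail_timedout;
}
```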
|
ae4e54fd | 13-Jan-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed
Previously reconnect retry was not controlled and was repeated indefinitely.
This patch adds two options, ctrlr_loss_timeout_sec and reconnect_delay_sec, to nvme_ctrlr, and adds reset_start_tsc, reconnect_is_delayed, and reconnect_delay_timer to nvme_ctrlr to control reconnect retry.
Both ctrlr_loss_timeout_sec and reconnect_delay_sec are initialized to zero. This means reconnect is not throttled, as was the case before this patch.
A few more changes are added.
Change nvme_io_path_is_failed() to return false if reset is throttled even if nvme_ctrlr is resetting or is to be reconnected.
spdk_nvme_ctrlr_reconnect_poll_async() may continue returning -EAGAIN indefinitely. To detect such an exceptional case, use ctrlr_loss_timeout_sec.
Not only ctrlr reset but also non-multipath ctrlr failover is controlled. So we need to include path failover in ctrlr reconnect.
When the active path is removed and switched to one of the alternative paths, if ctrlr reconnect is scheduled, connecting to the alternative path is left to the scheduled reconnect.
If resetting or reconnecting the ctrlr fails and a retry is scheduled, switch the active path to one of the alternative paths.
Restore unit test cases removed in the previous patches.
Change-Id: Idec636c4eced39eb47ff4ef6fde72d6fd9fe4f85 Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10128 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
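How the two options might interact after a failed reset can be sketched as a small decision function. The enum, struct, and function names are hypothetical, elapsed time is passed in as whole seconds for simplicity, and the reading of -1 and 0 for ctrlr_loss_timeout_sec is an assumption of this sketch. What the module does once it gives up is not described in the text, so the sketch only reports the decision.

```c
#include <stdbool.h>
#include <stdint.h>

enum ex_op_after_failed_reset {
	EX_OP_NO_THROTTLE,		/* reconnect_delay_sec == 0: nothing is scheduled */
	EX_OP_DELAYED_RECONNECT,	/* wait reconnect_delay_sec, then retry */
	EX_OP_GIVE_UP			/* ctrlr_loss_timeout_sec has elapsed */
};

struct ex_reconnect_opts {
	int32_t ctrlr_loss_timeout_sec;	/* assumed: -1 retries forever, 0 sets no limit */
	uint32_t reconnect_delay_sec;	/* 0 means reconnect is not throttled */
};

static enum ex_op_after_failed_reset
ex_check_op_after_failed_reset(const struct ex_reconnect_opts *opts, uint64_t elapsed_sec)
{
	if (opts->reconnect_delay_sec == 0) {
		return EX_OP_NO_THROTTLE;
	}
	if (opts->ctrlr_loss_timeout_sec > 0 &&
	    elapsed_sec >= (uint64_t)opts->ctrlr_loss_timeout_sec) {
		return EX_OP_GIVE_UP;
	}
	return EX_OP_DELAYED_RECONNECT;
}
```

Checking the loss timeout on every failed attempt is what bounds the case where the reconnect poll keeps returning -EAGAIN.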
|
962c4c38 | 13-Jan-2022 |
Shuhei Matsumoto <smatsumoto@nvidia.com> |
bdev/nvme: Fix a degradation that I/O gets queued infinitely
We noticed a difference between SPDK 21.10 and the latest master in a test.
The simplified scenario is as follows:
1. Start the SPDK NVMe-oF target.
2. Run bdevperf against the target with the -f parameter to suppress exit on failure.
3. Kill the target after I/O has started.
With SPDK 21.10, bdevperf retries failed I/Os and exits after the test time is over.
With the latest SPDK master, bdevperf hangs and does not exit even after the test time is over.
The cause was as follows:
Reset ctrlr is repeated very quickly (once per 10 ms by default), and hence I/Os were queued indefinitely because nvme_io_path_is_failed() returned false if nvme_ctrlr is resetting.
We should queue I/O when nvme_ctrlr is resetting only if reset is throttled and fail-fast for repeated failures is supported.
Hence in this patch, fix the degradation and remove the related unit test cases.
Reported-by: Evgeniy Kochetov <evgeniik@nvidia.com> Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com> Change-Id: I4047d42dc44488a05264c6a841d101a7c371358b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11062 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
521a9bb2 | 30-Dec-2021 |
Shuhei Matsumoto <shuheimatsumoto@gmail.com> |
bdev/nvme: Fix race between failover and add secondary trid
We sort secondary trids to avoid using disconnected trids for failover. However the sort had a bug.
This bug was found by running test/nvmf/host/multipath.sh in a loop.
Verify the fix by adding a unit test.
Fixes #2300
Signed-off-by: Shuhei Matsumoto <shuheimatsumoto@gmail.com> Change-Id: I22b0ede4d2ef98b786c3e0d1f5337a2d568ba56d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10921 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
|
b68f2eeb | 04-Dec-2021 |
Jim Harris <james.r.harris@intel.com> |
bdev_nvme: add bdev_nvme_start_discovery RPC
This patch adds the framework for a discovery service in the bdev/nvme module.
Users can specify an IP/port of a discovery service. The bdev/nvme module will connect to a discovery controller, get the discovery log page, and then register for AERs. It will connect to each subsystem specified in the initial log page. AER completions will trigger fetching the log page again, at which point new subsystems will be connected to, or removed subsystems will be detached.
This patch does the following:
* Adds the new start_discovery RPC
* Connects to the discovery controller
* Gets the discovery log page
* Registers for AERs
* Detaches from discovery controllers at shutdown
Subsequent patches in this series will:
* Connect to subsystems listed in discovery log page
* Detach from subsystems that were listed in earlier discovery log pages but subsequently removed
* Add a stop_discovery RPC
Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I54bfa896a48c5619676f156b5ea9f2d1f886c72f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10694 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
|
21551806 | 28-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: nvme_ctrlr_create() gets prchk_flags from nvme_async_probe_ctx
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Id3deca8e0aba23299347a6aee6f0f44ee683556e Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10555 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
|
696ad465 | 25-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Remove the failover_in_progress flag from struct nvme_ctrlr
The failover_in_progress flag is used to decide the return value of bdev_nvme_failover().
bdev_nvme_delete() calls bdev_nvme_failover() with remove=true to remove nvme_ctrlr->active_path_id. However bdev_nvme_failover() returns zero if nvme_ctrlr->failover_in_progress is true. bdev_nvme_failover() may return zero even if it does not remove nvme_ctrlr->active_path_id.
The following will be better.
bdev_nvme_failover() returns -EBUSY if nvme_ctrlr->resetting is true, and the caller repeats calling bdev_nvme_failover() until the target trid becomes alternative path or bdev_nvme_failover() returns zero.
To do that, the failover_in_progress flag is not necessary any more.
Removing the failover_in_progress will also simplify the following patches to unify ctrlr reset and failover.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I57ab944beb1d06ea4def144c81c69705860de35f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10441 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
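The retry contract proposed in this commit can be modeled with a small sketch. Everything here is hypothetical: the struct is a toy model of a controller with indexed paths, and `ex_failover()` only imitates the described return values, it is not bdev_nvme_failover().

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of a controller with indexed paths; num_paths must be > 0. */
struct ex_failover_ctrlr {
	bool resetting;
	int active_path;
	int num_paths;
};

/* Imitates the described contract: refuse with -EBUSY while a reset runs. */
static int
ex_failover(struct ex_failover_ctrlr *ctrlr)
{
	if (ctrlr->resetting) {
		return -EBUSY;
	}
	ctrlr->active_path = (ctrlr->active_path + 1) % ctrlr->num_paths;
	return 0;
}

/*
 * Caller side: one attempt to move target_path off the active slot.
 * Returns true when done, false when the caller should try again later.
 */
static bool
ex_try_remove_active_path(struct ex_failover_ctrlr *ctrlr, int target_path)
{
	if (ctrlr->active_path != target_path) {
		return true;	/* target is already an alternative path */
	}
	return ex_failover(ctrlr) == 0;
}
```

With an explicit -EBUSY, the caller can tell "not done yet" apart from "done", which a zero return under failover_in_progress could not express.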
|
7cc66c0a | 24-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Check if ns can be shared when configuring multipath
We had not checked bit 0 of the Namespace Multipath I/O and Namespace Sharing Capabilities (NMIC) field in the Identify Namespace data structure.
If bit 0 of the NMIC is zero, it is likely that namespaces are not identical.
We should check the value of the NMIC first, and this patch does so.
Additionally, it is unusual for bit 0 of the CMIC and bit 0 of the NMIC not to match. So in unit tests, rename the parameter multi_ctrlr of ut_attach_ctrlr() to multipath and use it for the value of the NMIC.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I6aa7cbcc99be2507dbf18930f7b585a9ea7d0f90 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10380 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
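The NMIC check amounts to testing a single bit. A minimal sketch follows, with a hypothetical macro and function name and the NMIC passed in as a raw byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of NMIC: the namespace may be shared by two or more controllers. */
#define EX_NMIC_CAN_SHARE 0x1u

static bool
ex_ns_can_share(uint8_t nmic)
{
	return (nmic & EX_NMIC_CAN_SHARE) != 0;
}
```

A multipath setup would run a check like this before comparing namespace identifiers, since a namespace that cannot be shared is unlikely to be the same namespace seen through another controller.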
|
8afa746b | 03-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Use new APIs in a reset ctrlr sequence
Replace the spdk_nvme_ctrlr_reset_async() and spdk_nvme_reset_poll_async() calls with the spdk_nvme_ctrlr_disconnect(), spdk_nvme_ctrlr_reconnect_async(), and spdk_nvme_ctrlr_reconnect_poll_async() calls in a reset ctrlr sequence.
spdk_nvme_ctrlr_disconnect() can fail if the ctrlr is already resetting or removed, but neither case is possible here: reset is controlled, and the hot remove callback is called when the ctrlr is hot removed. So we assume spdk_nvme_ctrlr_disconnect() always succeeds.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I1299e198597b2a2110f80b9a868e2dae015682ee Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10092 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>
|
c9c7c281 | 25-Nov-2021 |
Josh Soref <jsoref@gmail.com> |
spelling: test
Part of #2256
* achieve * additionally * against * aliases * already * another * arguments * between * capabilities * comparison * compatibility * configuration * continuing * controlq * cpumask * default * depends * dereferenced * discussed * dissect * driver * environment * everything * excluded * existing * expectation * failed * fails * following * functions * hugepages * identifiers * implicitly * in_capsule * increment * initialization * initiator * integrity * iteration * latencies * libraries * management * namespace * negotiated * negotiation * nonexistent * number * occur * occurred * occurring * offsetting * operations * outstanding * overwhelmed * parameter * parameters * partition * preempts * provisioned * responded * segment * skipped * struct * subsystem * success * successfully * sufficiently * this * threshold * transfer * transferred * unchanged * unexpected * unregistered * useless * utility * value * variable * workload
Change-Id: I21ca7dab4ef575b5767e50aaeabc34314ab13396 Signed-off-by: Josh Soref <jsoref@gmail.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10409 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
|
f9fba507 | 31-Oct-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Redirect the reset ctrlr operation into nvme_ctrlr->thread
In the following patches, we want to retry reconnect if reconnect failed in a reset ctrlr sequence, but we want to delay the retry. While we wait for the delayed retry, we want to quiesce the ctrlr completely.
As part of the quiesce ctrlr operations, we want to pause the adminq poller, but we need to do it on the nvme_ctrlr->thread.
If a reset ctrlr sequence runs on the nvme_ctrlr->thread, we can avoid redirecting the pending destruct request at completion too.
So we redirect the reset ctrlr sequence into the nvme_ctrlr->thread.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I538b962e2a7b5cf00ebbac2a1e888482ddeeee61 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10075 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
50b10bc2 | 31-Oct-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: bdev_nvme_reset_io() redirect to the orig_thread at completion
In the following patches, bdev_nvme_reset() will execute the reset ctrlr operation on the nvme_ctrlr->thread until completion as bdev_nvme_admin_passthru() does. Hence change the callback bdev_nvme_reset_io_continue() to redirect to the orig_thread by using bio. Furthermore, use bio->cpl.cdw0 to store the completion status of the reset processing. bdev_nvme_reset() does not use bio->cpl.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I361cc44494190ba83ad6e360788d78851416c46c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10074 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
b4447abf | 15-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Retry failed admin passthru up to retry_count times
This patch retries an admin passthrough request up to retry_count times when it completes with any error that has DNR=0, except ABORTED_BY_REQUEST.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I1bf29570791fdbe8651fa70c4c8685bb740fb86b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9944 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
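The retry condition stated in this commit can be written as a small predicate. The status struct and function are hypothetical; a real completion carries these facts in its status field rather than as separate booleans.

```c
#include <stdbool.h>

/* Hypothetical summary of a completion status. */
struct ex_cpl_status {
	bool is_error;
	bool dnr;			/* Do Not Retry bit from the completion */
	bool aborted_by_request;
};

/* Retry only errors with DNR=0 that were not aborted on request. */
static bool
ex_admin_should_retry(const struct ex_cpl_status *status, int retries_done, int retry_count)
{
	if (!status->is_error || status->dnr || status->aborted_by_request) {
		return false;
	}
	return retries_done < retry_count;
}
```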
|
a9a86a14 | 19-Oct-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Retry admin passthru immediately if it got ctrlr path error
This patch retries an admin passthrough request when it completes with a ctrlr path error.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ice0045b84054ec66a9db9ef23e21786d2c082b1d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9943 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
35a2f4e2 | 18-Nov-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Retry admin passthru a second later if any ctrlr may become available
When resetting ctrlr, adminq is disconnected first. If adminq is disconnected, admin passthrough request is rejected with -ENXIO.
But resetting ctrlr may succeed. If resetting ctrlr succeeds, adminq is connected again, and admin passthrough request will be submitted successfully.
On the other hand, if ctrlr is failed, admin passthrough request is rejected with -ENXIO. But when resetting ctrlr, ctrlr is set to unfailed.
Hence bdev_nvme_admin_passthru() skips any ctrlr which is resetting or failed, and calls bdev_nvme_admin_passthru_complete() with -ENXIO if no available ctrlr is found.
bdev_nvme_admin_passthru_complete() queues the admin passthrough request and retries it one second later if a ctrlr is resetting.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ic748dc4faf29ebf717ae5c29dcf7c55fe2ea9243 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9942 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
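The selection logic described in this commit can be sketched as a scan over candidate controllers. The struct, the function, and the `retry_later` output parameter are hypothetical; the sketch only mirrors the stated rules: skip controllers that are resetting or failed, fail with -ENXIO when none is usable, and retry later only if some controller is resetting.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct ex_admin_ctrlr {
	bool resetting;
	bool failed;
};

/*
 * Return the index of the first usable ctrlr, or -ENXIO if there is none.
 * *retry_later is set only on failure, when a resetting ctrlr may come back.
 */
static int
ex_pick_admin_ctrlr(const struct ex_admin_ctrlr *ctrlrs, size_t count, bool *retry_later)
{
	bool any_resetting = false;
	size_t i;

	*retry_later = false;
	for (i = 0; i < count; i++) {
		if (ctrlrs[i].resetting) {
			any_resetting = true;
			continue;
		}
		if (ctrlrs[i].failed) {
			continue;
		}
		return (int)i;
	}
	*retry_later = any_resetting;
	return -ENXIO;
}
```

A caller that gets -ENXIO with `retry_later` set would queue the request and try again after the one-second delay mentioned above; otherwise it would complete the request with the error.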
|
7b8e7212 | 21-Oct-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Abort the queued I/O for retry
The NVMe bdev module queues retried I/Os itself now. bdev_nvme_abort() needs to check and abort the target I/O if it is queued for retry.
This change will cover admin passthrough requests too, because they will be queued on the same thread as their callers and the public API spdk_bdev_reset() must be submitted on the same thread as the target I/O or admin passthrough requests.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: If37e8188bd3875805cef436437439220698124b9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9913 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|
72e4a4d4 | 15-Oct-2021 |
Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> |
bdev/nvme: Each nvme_bdev_channel caches its current io_path
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I3ec3a588ff741cf04383e89f5a701e33bf1987a6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9894 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
|