History log of /dpdk/drivers/net/mlx5/linux/mlx5_verbs.c (Results 1 – 25 of 53)
Revision Date Author Comments
# e12a0166 14-May-2024 Tyler Retzlaff <roretzla@linux.microsoft.com>

drivers: use stdatomic API

Replace the use of gcc builtin __atomic_xxx intrinsics with
corresponding rte_atomic_xxx optional rte stdatomic API.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microso

drivers: use stdatomic API

Replace the use of gcc builtin __atomic_xxx intrinsics with
corresponding rte_atomic_xxx optional rte stdatomic API.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>

show more ...


# 8fa8d147 05-Jul-2023 Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add comprehensive send completion trace

There is the demand to trace the send completions of
every WQE if time scheduling is enabled.

The patch extends the size of completion queue and
re

net/mlx5: add comprehensive send completion trace

There is the demand to trace the send completions of
every WQE if time scheduling is enabled.

The patch extends the size of completion queue and
requests completion on every issued WQE in the
send queue. As the result hardware provides CQE on
each completed WQE and driver is able to fetch
completion timestamp for dedicated operation.

The add code is under conditional compilation
RTE_ENABLE_TRACE_FP flag and does not impact the
release code.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# ed090599 20-Mar-2023 Tyler Retzlaff <roretzla@linux.microsoft.com>

rework atomic intrinsics fetch operations

Use __atomic_fetch_{add,and,or,sub,xor} instead of
__atomic_{add,and,or,sub,xor}_fetch adding the necessary code to
allow consumption of the resulting value

rework atomic intrinsics fetch operations

Use __atomic_fetch_{add,and,or,sub,xor} instead of
__atomic_{add,and,or,sub,xor}_fetch adding the necessary code to
allow consumption of the resulting value.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

show more ...


# f9eb7a4b 02-Mar-2023 Tyler Retzlaff <roretzla@linux.microsoft.com>

use atomic intrinsics closer to C11

Use __atomic_fetch_{add,and,or,sub,xor} instead of
__atomic_{add,and,or,sub,xor}_fetch when we have no interest in the
result of the operation.

This change reduc

use atomic intrinsics closer to C11

Use __atomic_fetch_{add,and,or,sub,xor} instead of
__atomic_{add,and,or,sub,xor}_fetch when we have no interest in the
result of the operation.

This change reduces unnecessary code that provided the result of the
atomic operation while this result was not used.

It also brings us to a closer alignment with atomics available in C11
standard and will reduce review effort when they are integrated.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

show more ...


# 1485d961 18-May-2022 Raja Zidane <rzidane@nvidia.com>

net/mlx5: fix Tx recovery

When an error occurs in Tx, and it is moved to ERROR state, it
is not recoverable, during recovery it's state cannot be modified
to INIT. to modify state from RESET to INIT

net/mlx5: fix Tx recovery

When an error occurs in Tx, and it is moved to ERROR state, it
is not recoverable, during recovery it's state cannot be modified
to INIT. to modify state from RESET to INIT, the port must be
passed in modify attributes, and in case of ERROR to READY
modification path, it was not provided.

Provide port number when changing state from RESET to INIT.

Fixes: 3a87b964edd3 ("net/mlx5: create Tx queues with DevX")
Cc: stable@dpdk.org

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>

show more ...


# 2f5122df 24-Feb-2022 Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: configure Tx queue with send on time offload

The wait on time configuration flag is copied to the Tx queue
structure due to performance considerations. Timestamp
mask is prepared and store

net/mlx5: configure Tx queue with send on time offload

The wait on time configuration flag is copied to the Tx queue
structure due to performance considerations. Timestamp
mask is prepared and stored in queue structure as well.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# a6b9d5a5 23-Feb-2022 Michael Baum <michaelba@nvidia.com>

common/mlx5: update doorbell mapping parameter name

The "tx_db_nc" devarg forces doorbell register mapping to non-cached
region eliminating the extra write memory barrier. This argument was
used in

common/mlx5: update doorbell mapping parameter name

The "tx_db_nc" devarg forces doorbell register mapping to non-cached
region eliminating the extra write memory barrier. This argument was
used in creating the UAR for Tx and thus affected its performance.

Recently [1] its use has been extended to all UAR creation in all mlx5
drivers, and now its name is no longer so accurate.

This patch changes its name to "sq_db_nc" to suit any send queue that
uses it. The old name will still work for backward compatibility.

[1] commit 5dfa003db53f ("common/mlx5: fix post doorbell barrier")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 91d1cfaf 14-Feb-2022 Michael Baum <michaelba@nvidia.com>

net/mlx5: rearrange device attribute structure

Rearrange the mlx5_os_get_dev_attr() function in such a way that it
first executes the queries and only then updates the fields.
In addition, it change

net/mlx5: rearrange device attribute structure

Rearrange the mlx5_os_get_dev_attr() function in such a way that it
first executes the queries and only then updates the fields.
In addition, it changed its name in preparation for expanding its
operations to configure the capabilities inside it.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# 6dc0cbc6 14-Feb-2022 Michael Baum <michaelba@nvidia.com>

net/mlx5: remove DevX flag duplication

The sharing device context structure has a field named "devx" which
indicates if DevX is supported.
The common configure structure has also field named "devx"

net/mlx5: remove DevX flag duplication

The sharing device context structure has a field named "devx" which
indicates if DevX is supported.
The common configure structure has also field named "devx" with the same
meaning.

There is no need for this duplication, because there is a reference to
the common structure from within the sharing device context structure.

This patch removes it from sharing device context structure and uses the
common config structure instead.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# 0947ed38 23-Nov-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: improve stride parameter names

In the striding RQ management there are two important parameters, the
size of the single stride in bytes and the number of strides.

Both the data-path struc

net/mlx5: improve stride parameter names

In the striding RQ management there are two important parameters, the
size of the single stride in bytes and the number of strides.

Both the data-path structure and config structure keep the log of the
above parameters. However, in their names there is no mention that the
value is a log which may be misleading as if the fields represent the
values themselves.

This patch updates their names describing the values more accurately.

Fixes: ecb160456aed ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# 5dfa003d 03-Nov-2021 Michael Baum <michaelba@nvidia.com>

common/mlx5: fix post doorbell barrier

The rdma-core library can map doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":

- as regular cached memory, the variab

common/mlx5: fix post doorbell barrier

The rdma-core library can map doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":

- as regular cached memory, the variable is either missing or set to
zero. This type of mapping may cause the significant doorbell
register writing latency and requires an explicit memory write
barrier to mitigate this issue and prevent write combining.

- as non-cached memory, the variable is present and set to not "0"
value. This type of mapping may cause performance impact under
heavy loading conditions but the explicit write memory barrier is
not required and it may improve core performance.

The UAR creation function maps a doorbell in one of the above ways
according to the system. In run time, it always adds an explicit memory
barrier after writing to.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.

The commit [1] solved this problem for a Tx queue. In run time, it
checks the mapping type and provides the memory barrier after writing to
a Tx doorbell register if it is needed. The mapping type is extracted
directly from the uar_mmap_offset field in the queue properties.

This patch shares this code between the drivers and extends the above
solution for each of them.

[1] commit 8409a28573d3
("net/mlx5: control transmit doorbell register mapping")

Fixes: f8c97babc9f4 ("compress/mlx5: add data-path functions")
Fixes: 8e196c08ab53 ("crypto/mlx5: support enqueue/dequeue operations")
Fixes: 4d4e245ad637 ("regex/mlx5: support enqueue")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# b6e9c33c 03-Nov-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: remove duplicated reference of Tx doorbell

The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmap it to i

net/mlx5: remove duplicated reference of Tx doorbell

The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmap it to its own virtual address.
The primary doorbell references were saved in two shared memory
locations: the TxQ structure and a dedicated doorbell array.

Remove the doorbell reference from the TxQ structure and move the
primary processes to take the UAR information from the primary doorbell
array.

Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# 09c25553 04-Nov-2021 Xueming Li <xuemingl@nvidia.com>

net/mlx5: support shared Rx queue

This patch introduces shared RxQ. All shared Rx queues with same group
and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared,
all queues from diffe

net/mlx5: support shared Rx queue

This patch introduces shared RxQ. All shared Rx queues with same group
and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared,
all queues from different member port share same WQ and CQ, essentially
one Rx WQ, mbufs are filled into this singleton WQ.

Shared rxq_data is set into device Rx queues of all member ports as
RxQ object, used for receiving packets. Polling queue of any member
ports returns packets of any member, mbuf->port is used to identify
source port.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 5cf0707f 04-Nov-2021 Xueming Li <xuemingl@nvidia.com>

net/mlx5: remove Rx queue data list from device

Rx queue data list(priv->rxqs) can be replaced by Rx queue
list(priv->rxq_privs), removes it and replaces with universal wrapper
API.

Signed-off-by:

net/mlx5: remove Rx queue data list from device

Rx queue data list(priv->rxqs) can be replaced by Rx queue
list(priv->rxq_privs), removes it and replaces with universal wrapper
API.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 5ceb3a02 04-Nov-2021 Xueming Li <xuemingl@nvidia.com>

net/mlx5: move Rx queue DevX resource

To support shared RX queue, moves DevX RQ which is per queue resource to
Rx queue private data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viach

net/mlx5: move Rx queue DevX resource

To support shared RX queue, moves DevX RQ which is per queue resource to
Rx queue private data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 5fbc75ac 19-Oct-2021 Michael Baum <michaelba@nvidia.com>

common/mlx5: add global MR cache create function

Add function for global shared MR cache structure initialization.
This function include:
- btree initialization.
- set callbacks for reg and dereg

common/mlx5: add global MR cache create function

Add function for global shared MR cache structure initialization.
This function include:
- btree initialization.
- set callbacks for reg and dereg MR.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# e35ccf24 19-Oct-2021 Michael Baum <michaelba@nvidia.com>

common/mlx5: share protection domain object

Create shared Protection Domain in common area and add it and its PDN as
fields of common device structure.

Use this Protection Domain in all drivers and

common/mlx5: share protection domain object

Create shared Protection Domain in common area and add it and its PDN as
fields of common device structure.

Use this Protection Domain in all drivers and remove the PD and PDN
fields from their private structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# ca1418ce 19-Oct-2021 Michael Baum <michaelba@nvidia.com>

common/mlx5: share device context object

Create shared context device in common area and add it as a field of
common device.
Use this context device in all drivers and remove the ctx field from
thei

common/mlx5: share device context object

Create shared context device in common area and add it as a field of
common device.
Use this context device in all drivers and remove the ctx field from
their private structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# 23233fd6 17-May-2021 Bing Zhao <bingz@nvidia.com>

net/mlx5: fix loopback for Direct Verbs queue

In the past, all the queues and other hardware objects were created
through Verbs interface. Currently, most of the objects creation are
migrated to Dev

net/mlx5: fix loopback for Direct Verbs queue

In the past, all the queues and other hardware objects were created
through Verbs interface. Currently, most of the objects creation are
migrated to Devx interface by default, including queues. Only when
the DV is disabled by device arg or eswitch is enabled, all or some
of the objects are created through Verbs interface.

When using Devx interface to create queues, the kernel driver
behavior is different from the case using Verbs. The Tx loopback
cannot work properly even if the Tx and Rx queues are configured
with loopback attribute. To fix the support self loopback for Tx, a
Verbs dummy queue pair needs to be created to trigger the kernel to
enable the global loopback capability.

This is only required when TIR is created for Rx and loopback is
needed. Only CQ and QP are needed for this case, no WQ(RQ) needs to
be created.

Bugzilla ID: 645
Fixes: 6deb19e1b2d2 ("net/mlx5: separate Rx queue object creations")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 377b69fb 12-Apr-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: separate Tx function declarations to another file

This patch separates Tx function declarations to different header file
in preparation for removing their implementation from the source fi

net/mlx5: separate Tx function declarations to another file

This patch separates Tx function declarations to different header file
in preparation for removing their implementation from the source file
and as an optional preparation for Tx cleanup.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 151cbe3a 12-Apr-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: separate Rx function declarations to another file

The mlx5_rxtx.c file contains a lot of Tx burst functions, each of those
is performance-optimized for the specific set of requested offloa

net/mlx5: separate Rx function declarations to another file

The mlx5_rxtx.c file contains a lot of Tx burst functions, each of those
is performance-optimized for the specific set of requested offloads.
These ones are generated on the basis of the template function and it
takes significant time to compile, just due to a large number of giant
functions generated in the same file and this compilation is not being
done in parallel with using multithreading.

Therefore we can split the mlx5_rxtx.c file into several separate files
to allow different functions to be compiled simultaneously.
In this patch, we separate Rx function declarations to different header
file in preparation for removing them from the source file and as an
optional preparation step for further consolidation of Rx burst
functions.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# 87acdcc7 09-Mar-2021 Thomas Monjalon <thomas@monjalon.net>

net/mlx5: enable debug logs dynamically

Most debug logs are using DRV_LOG(DEBUG,)
but some were using DEBUG().
The macro DEBUG is doing nothing if not compiled with
RTE_LIBRTE_MLX5_DEBUG.

As it is

net/mlx5: enable debug logs dynamically

Most debug logs are using DRV_LOG(DEBUG,)
but some were using DEBUG().
The macro DEBUG is doing nothing if not compiled with
RTE_LIBRTE_MLX5_DEBUG.

As it is not used in the data path, the macro DEBUG
can be replaced with DRV_LOG.
Then all debug logs can be enabled at runtime with:
--log-level pmd.net.mlx5:debug

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


# fdc44cdc 01-Feb-2021 Alexander Kozyrev <akozyrev@nvidia.com>

net/mlx5: fix miniCQE configuration for Verbs

Verbs cannot be used to configure newly introduced miniCQE formats for
Flow Tag and L3/L4 Header compression. Support for these formats has
been added t

net/mlx5: fix miniCQE configuration for Verbs

Verbs cannot be used to configure newly introduced miniCQE formats for
Flow Tag and L3/L4 Header compression. Support for these formats has
been added to the DevX configuration only. And the RX queue descriptor
has been updated with the CQE compression format information only as
well. But the datapath relies on this info no matter which method is
used for Rx queues configuration. Set proper CQE compression format
information in the Verbs configuration to fix the miniCQE parsing logic.

Fixes: 54c2d46b160f ("net/mlx5: support flow tag and packet header miniCQEs")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

show more ...


# df96fd0d 29-Jan-2021 Bruce Richardson <bruce.richardson@intel.com>

ethdev: make driver-only headers private

The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by:

ethdev: make driver-only headers private

The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>

show more ...


# 4a7f979a 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: remove CQE padding device argument

The data-path code doesn't take care on 'rxq_cqe_pad_en' and use padded
CQE for any case when the system cache-line size is 128B.

This makes the argumen

net/mlx5: remove CQE padding device argument

The data-path code doesn't take care on 'rxq_cqe_pad_en' and use padded
CQE for any case when the system cache-line size is 128B.

This makes the argument redundant.

Remove it.

Fixes: bc91e8db12cd ("net/mlx5: add 128B padding of Rx completion entry")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

show more ...


123