History log of /dpdk/drivers/net/mlx5/mlx5_devx.c (Results 51 – 75 of 96)
Revision Date Author Comments
# 151cbe3a 12-Apr-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: separate Rx function declarations to another file

The mlx5_rxtx.c file contains a lot of Tx burst functions, each of
which is performance-optimized for a specific set of requested offloads.
They are generated on the basis of a template function, and compiling
them takes significant time, simply due to the large number of giant
functions generated in the same file; this compilation is not done in
parallel using multithreading.

Therefore we can split the mlx5_rxtx.c file into several separate files
to allow different functions to be compiled simultaneously.
In this patch, we separate the Rx function declarations into a
different header file in preparation for removing them from the source
file, and as an optional preparation step for further consolidation of
the Rx burst functions.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>


# d61381ad 14-Mar-2021 Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: support timestamp format

This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.

The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.

Fixes: 86fc67fc9315 ("net/mlx5: create advanced RxQ object via DevX")
Fixes: ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Fixes: 15c3807e86ab ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
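
As an illustration of the described queue-context change, a minimal
sketch in C (the capability and attribute names here are invented
stand-ins; the real DevX structures in the driver differ):

    enum ts_format { TS_FREE_RUNNING = 0, TS_REAL_TIME = 1 };
    struct hca_caps { int rt_ts_configured; };  /* queried from firmware */
    struct q_attr   { int ts_format; };         /* part of queue context */

    /* Pick a format the firmware accepts instead of leaving the newly
     * added field at its zero default, which fails queue creation on
     * NICs configured with real-time timestamps. */
    static void set_queue_ts_format(const struct hca_caps *caps,
                                    struct q_attr *attr)
    {
        attr->ts_format = caps->rt_ts_configured ?
                          TS_REAL_TIME : TS_FREE_RUNNING;
    }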


# e6988afd 25-Feb-2021 Matan Azrad <matan@nvidia.com>

net/mlx5: fix imissed statistics

The imissed port statistic counts packets that were dropped by the
device Rx queues.

In mlx5, the imissed counter summarizes 2 counters:
- packets dropped by the SW queue handling counted by SW.
- packets dropped by the HW queues due to "out of buffer" events
detected when no SW buffer is available for the incoming
packets.

There is a HW counter object that should be created per device, and all
the Rx queues should be assigned to this counter at configuration time.

This part was missed when the Rx queues were created by DevX, which
left the "out of buffer" counter permanently clear in this case.

Add 2 options to assign the DevX Rx queues to a queue counter:
- Create queue counter per device by DevX and assign all the
queues to it.
- Query the kernel counter and assign all the queues to it.

Use the first option by default and, if it fails, fall back to the
second option.

Fixes: e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
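
A minimal sketch of the two-option assignment in C (the helper names
are hypothetical; the real driver goes through the DevX command layer
and the kernel counter interface):

    struct rxq;                              /* opaque Rx queue handle */

    /* Hypothetical stand-ins for the two counter sources. */
    void *devx_queue_counter_create(void);   /* option 1: DevX, per device */
    void *kernel_queue_counter_get(void);    /* option 2: kernel counter */
    void rxq_assign_counter(struct rxq *q, void *counter);

    /* Try the DevX counter first; fall back to the kernel one. */
    static void *pick_queue_counter(void)
    {
        void *cnt = devx_queue_counter_create();

        if (cnt == NULL)
            cnt = kernel_queue_counter_get();
        return cnt; /* every Rx queue is then assigned to this counter */
    }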


# 00984de5 04-Feb-2021 Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix Tx queue size created with DevX

The number of descriptors specified for queue creation
implies the queue should be able to contain the specified
number of packets being sent. Typically one packet takes
one queue descriptor (WQE) to be handled. If the inline
data option is enabled, one packet might require more WQEs to
embrace the inline data, and the overall queue size (the
number of queue descriptors) should be adjusted accordingly.

In mlx5 PMD the queues can be created either via Verbs, using
the rdma-core library or via DevX as direct kernel/firmware call.
The rdma-core does queue size adjustment internally, depending on
TSO and inline setting. The DevX approach missed this point.
This caused the queue size discrepancy and performance variations.

The patch adjusts the Tx queue size for the DevX approach
in the same way as it is done in the rdma-core implementation.

Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
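
The adjustment amounts to sizing the SQ in WQEs rather than in packets,
roughly as in this simplified sketch (not the exact driver arithmetic):

    #include <stdint.h>

    /* One packet normally takes one WQE; with inline data enabled, a
     * packet may need extra WQEBBs to carry the inlined bytes, so the
     * overall queue size must grow accordingly (simplified model). */
    static uint32_t sq_size_in_wqes(uint32_t txd, uint32_t inline_bytes,
                                    uint32_t wqebb_size)
    {
        uint32_t wqe_per_pkt = 1;

        if (inline_bytes > 0)
            wqe_per_pkt += (inline_bytes + wqebb_size - 1) / wqebb_size;
        return txd * wqe_per_pkt;
    }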


# 6e0a3637 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: move Rx RQ creation to common

Use the common function for Rx RQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 74e91860 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: move Tx SQ creation to common

Use the common function for Tx SQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 5cd33796 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: move Rx CQ creation to common

Use the common function for Rx CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 5f04f70c 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

net/mlx5: move Tx CQ creation to common

Use the common function for Tx CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# a2521c8f 06-Jan-2021 Michael Baum <michaelba@nvidia.com>

common/mlx5: fix completion queue entry size configuration

According to the current data-path implementation in the PMD, the CQE
size must follow the cache-line size.
So, the configuration of the CQE size should depend on
RTE_CACHE_LINE_SIZE.

Wrongly, some of the CQE creations did not follow this rule, which
caused an incompatibility between HW and SW in the data-path when
working on systems with a 128B cache-line size.

Adjust the rule for any CQE creation.
Remove the cqe_size attribute from the DevX CQ creation command and set
it inside the command translation according to the cache-line size.

Fixes: 79a7e409a2f6 ("common/mlx5: prepare support of packet pacing")
Fixes: 5cd0a83f413e ("common/mlx5: support more fields in DevX CQ create")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
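
The resulting rule can be condensed to one line of C (the enum encoding
below is illustrative, not the PRM one):

    #include <rte_common.h> /* RTE_CACHE_LINE_SIZE */

    enum { CQE_SIZE_64B = 0, CQE_SIZE_128B = 1 }; /* illustrative */

    /* Every CQ creation derives the CQE size from the cache line. */
    static int cqe_size_for_cache_line(void)
    {
        return RTE_CACHE_LINE_SIZE == 128 ? CQE_SIZE_128B : CQE_SIZE_64B;
    }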


# f1ae0b35 28-Dec-2020 Ophir Munk <ophirmu@nvidia.com>

net/mlx5: enable more shared code on Windows

Use the macro HAVE_INFINIBAND_VERBS_H to successfully compile files
both under Linux and Windows (or any non-Linux OS in general). Under
Windows this macro:
1. Hides Verbs references.
2. Exposes required DV structs that are under ifdefs related to
rdma-core.

Linux code under definitions such as #ifdef HAVE_IBV_FLOW_DV_SUPPORT is
required unconditionally under Windows; however, those definitions are
never effective without rdma-core presence. Therefore, update the #ifdef
condition to consider HAVE_INFINIBAND_VERBS_H as well (a macro that is
undefined when building without an rdma-core library).

For example:
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+#if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 88019723 28-Dec-2020 Ophir Munk <ophirmu@nvidia.com>

net/mlx5: fix flow operation wrapper per OS

Wrap glue call dv_create_flow_action_dest_devx_tir() with an OS API.

Fixes: b293fbf9672b ("net/mlx5: add OS specific flow actions operations")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 98174626 28-Dec-2020 Tal Shnaiderman <talshn@nvidia.com>

common/mlx5: wrap event channel functions per OS

Wrap the API to create/destroy an event channel and to subscribe to an
event with OS calls. In Linux those calls are implemented by glue
functions, while in Windows they are not supported.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 07a99de8 28-Dec-2020 Tal Shnaiderman <talshn@nvidia.com>

net/mlx5: wrap glue reg/dereg UMEM per OS

Wrap glue calls for UMEM registration and deregistration with generic
OS calls, since each OS (Linux or Windows) has different glue API
parameters.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
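
The wrapper pattern is a thin per-OS shim, roughly as sketched below
(the wrapper name and the Windows call are illustrative; on Linux the
glue table does expose devx_umem_reg):

    #include <stddef.h>

    #ifdef HAVE_INFINIBAND_VERBS_H
    /* Linux: route the call through the rdma-core glue table. */
    static void *os_umem_reg(void *ctx, void *addr, size_t size, int acc)
    {
        return mlx5_glue->devx_umem_reg(ctx, addr, size, acc);
    }
    #else
    /* Windows: call the native DevX library directly (illustrative). */
    static void *os_umem_reg(void *ctx, void *addr, size_t size, int acc)
    {
        return devx_umem_reg(ctx, addr, size, acc);
    }
    #endif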


# 6e09c7bb 24-Nov-2020 Gregory Etelson <getelson@nvidia.com>

net/mlx5: fix DevX resources freeing

Invalid memory release order of DevX resources caused PMD crash.

1. SQ and CQ memory must be unregistered with DevX before it is freed.
2. An SQ object references a CQ object. Hence, the SQ should be
destroyed before the CQ it references.

Fixes: 6deb19e1b2d2 ("net/mlx5: separate Rx queue object creations")
Fixes: 88f2e3f18cc7 ("net/mlx5: rearrange SQ and CQ creation in DevX module")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
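
The correct teardown order can be summarized in a short sketch (handle
types and helpers are hypothetical placeholders for the DevX objects):

    struct devx_obj;                       /* hypothetical DevX handle */
    void devx_obj_destroy(struct devx_obj *o);
    void devx_umem_dereg(void *umem);
    void queue_buf_free(void *buf);

    static void txq_obj_teardown(struct devx_obj *sq, struct devx_obj *cq,
                                 void *umem, void *buf)
    {
        devx_obj_destroy(sq);  /* 1. SQ first: it references the CQ */
        devx_obj_destroy(cq);  /* 2. then the CQ it pointed to */
        devx_umem_dereg(umem); /* 3. unregister the memory from DevX... */
        queue_buf_free(buf);   /* 4. ...and only then free it */
    }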


# fa7ad49e 22-Nov-2020 Andrey Vesnovaty <andreyv@nvidia.com>

net/mlx5: fix shared RSS action update

The shared RSS action update was not operational due to lack
of kernel driver support for TIR object modification.
This commit introduces a workaround to support shared RSS
action modification using an indirect queue table update instead of
touching the TIR object directly.
Limitation: the only supported RSS property to update is the queue
list; the rest of the properties are ignored.

Fixes: d2046c09aa64 ("net/mlx5: support shared action for RSS")

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
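
The essence of the workaround: update the indirection (queue) table
that the TIR points at, never the TIR itself. A sketch with a
hypothetical helper standing in for the DevX RQT-modify command:

    #include <stdint.h>

    struct rqt; /* indirection (queue) table referenced by the TIR */
    int rqt_modify(struct rqt *rqt, const uint16_t *queues, uint16_t n);

    /* TIR modification is not supported by the kernel driver, so only
     * the queue list behind the TIR is replaced; all other RSS
     * properties remain unchanged. */
    static int shared_rss_update_queues(struct rqt *rqt,
                                        const uint16_t *queues, uint16_t n)
    {
        return rqt_modify(rqt, queues, n);
    }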


# ff2deada 15-Nov-2020 Alexander Kozyrev <akozyrev@nvidia.com>

net/mlx5: fix Rx packet padding config via DevX

Received packets can be aligned to the size of the cache line on
PCI transactions. This could improve performance by avoiding
partial cache line writes in exchange for increased PCI bandwidth.

This feature is supposed to be controlled by the rxq_pkt_pad_en
devarg, and this is true for an RxQ created via the Verbs API.
But in the DevX API case, it is erroneously controlled by the
rxq_cqe_pad_en devarg instead, which is in charge of the CQE
padding and should not control the RxQ creation.

Fix DevX RxQ creation by using the proper configuration flag for
Rx packet padding that is being set by the rxq_pkt_pad_en devarg.

Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
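
Conceptually the fix is a one-flag change, sketched below with invented
field names standing in for the DevX RQ attributes:

    struct devx_rq_attr { int end_padding; };            /* illustrative */
    struct dev_config   { int hw_padding; int cqe_pad; };

    static void set_rq_padding(const struct dev_config *cfg,
                               struct devx_rq_attr *attr)
    {
        /* Before: attr->end_padding = cfg->cqe_pad;  (wrong devarg) */
        attr->end_padding = cfg->hw_padding; /* rxq_pkt_pad_en setting */
    }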


# 876b5d52 03-Nov-2020 Matan Azrad <matan@nvidia.com>

net/mlx5: fix Tx queue stop state

The Tx queue stop API doesn't call the PMD callback when the state of
the queue is stopped.
The drivers should update the state to be stopped when the queue stop
callback is done successfully or when the port is stopped.
The drivers should update the state to be started when the queue start
callback is done successfully or when the port is started.

The driver wrongly did not update the state to started when the port
start callback completed, which kept the state as stopped.
A following call to the queue stop API was then not completed by the
ethdev layer because the state was already stopped.

Move the state update from the Tx queue setup to the port start
callback.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
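
The bookkeeping itself uses the standard ethdev queue-state constants;
moved into the port start path it looks roughly like this (simplified
from the actual callback):

    #include <rte_ethdev.h>

    /* On port start, mark every Tx queue as started so that a later
     * queue-stop request is actually forwarded to the PMD. */
    static void mark_txqs_started(struct rte_eth_dev *dev)
    {
        uint16_t i;

        for (i = 0; i < dev->data->nb_tx_queues; i++)
            dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STARTED;
    }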


# 8178d9be 28-Oct-2020 Tal Shnaiderman <talshn@nvidia.com>

net/mlx5: fix SQ resources release in error flow

Fix an error flow in which the function
mlx5_txq_release_devx_sq_resources is called twice, by setting the
released objects to NULL after the first call.

The incorrect flow was introduced in the work done on generic
object creation.

Once an error occurs inside mlx5_txq_create_devx_sq_resources,
the function calls mlx5_txq_release_devx_sq_resources;
however, the released pointers are not set to NULL after the release
calls, and the already-freed memory is released again later in
mlx5_txq_release_devx_resources.

This results in calls to MLX5_FREE with
already released memory addresses and an assert in mlx5_release_dbr:

EAL: Error: Invalid memory
EAL: Error: Invalid memory

PANIC in mlx5_txq_release_devx_sq_resources():
assert "(mlx5_release_dbr(&txq_obj->txq_ctrl->priv->dbrpgs,
mlx5_os_get_umem_id (txq_obj->sq_dbrec_page->umem),
txq_obj->sq_dbrec_offset)) == 0" failed

The fix is setting the released pointers to NULL after the first release
calls.

Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
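
The fix follows the classic free-and-NULL idiom, which turns a repeated
release into a no-op (simplified sketch; the driver uses MLX5_FREE and
mlx5_release_dbr rather than plain free):

    #include <stdlib.h>

    struct sq_resources { void *wqe_buf; void *dbrec; }; /* simplified */

    static void release_sq_resources(struct sq_resources *sq)
    {
        free(sq->wqe_buf);
        sq->wqe_buf = NULL; /* a second release now does nothing */
        free(sq->dbrec);
        sq->dbrec = NULL;
    }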


# 54c2d46b 01-Nov-2020 Alexander Kozyrev <akozyrev@nvidia.com>

net/mlx5: support flow tag and packet header miniCQEs

CQE compression allows us to save the PCI bandwidth and improve
the performance by compressing several CQEs together to a miniCQE.
But the miniCQE size is only 8 bytes and this limits the ability
to successfully keep the compression session in case of various
traffic patterns.

The current miniCQE format only keeps the compression session alive
in case of uniform traffic with the Hash RSS as the only difference.
There are requests to keep the compression session in case of tagged
traffic by RTE Flow Mark Id and mixed UDP/TCP and IPv4/IPv6 traffic.
Add 2 new miniCQE formats in order to achieve the best performance
for these traffic patterns: Flow Tag and Packet Header miniCQEs.

The existing rxq_cqe_comp_en devarg is modified to specify the
desired miniCQE format. Specifying 2 selects Flow Tag format
for better compression rate in case of RTE Flow Mark traffic.
Specifying 3 selects Checksum format (existing format for MPRQ).
Specifying 4 selects L3/L4 Header format for better compression
rate in case of mixed TCP/UDP and IPv4/IPv6 traffic.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
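
The devarg-to-format mapping can be sketched as a small switch (enum
names are illustrative; the numeric values are the documented
rxq_cqe_comp_en settings):

    enum minicqe_fmt { FMT_HASH, FMT_FLOW_TAG, FMT_CSUM, FMT_L34H };

    static enum minicqe_fmt minicqe_format(unsigned int devarg)
    {
        switch (devarg) {
        case 2: return FMT_FLOW_TAG; /* RTE Flow Mark traffic */
        case 3: return FMT_CSUM;     /* existing format for MPRQ */
        case 4: return FMT_L34H;     /* mixed TCP/UDP, IPv4/IPv6 */
        default: return FMT_HASH;    /* original hash RSS format */
        }
    }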


# b8cc58c1 23-Oct-2020 Andrey Vesnovaty <andreyv@nvidia.com>

net/mlx5: modify hash Rx queue objects

Implement modification for the hashed table of a Rx queue object (see
mlx5_hrxq_modify()). This implementation relies on the capability to
modify the TIR object via the DevX API, i.e. the current implementation
doesn't support Verbs HW object operations. The functionality to modify
the hashed table of a Rx queue object is a prerequisite to implementing
rte_flow_shared_action_update() for the shared RSS action in the mlx5
PMD.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>


# 0f20acbf 21-Oct-2020 Alexander Kozyrev <akozyrev@nvidia.com>

net/mlx5: implement vectorized MPRQ burst

MPRQ (Multi-Packet Rx Queue) processes one packet at a time using
simple scalar instructions. MPRQ works by posting a single large buffer
(consisting of multiple fixed-size strides) in order to receive multiple
packets at once on this buffer. An Rx packet is then copied to a
user-provided mbuf, or the PMD attaches the Rx packet to the mbuf via a
pointer to an external buffer.

There is an opportunity to speed up the packet receiving by processing
4 packets simultaneously using SIMD (single instruction, multiple data)
extensions. Allocate mbufs in batches for every MPRQ buffer and process
the packets in groups of 4 until all the strides are exhausted. Then
switch to another MPRQ buffer and repeat the process over again.

The vectorized MPRQ burst routine is engaged automatically in case
the mprq_en=1 devarg is specified and the vectorization is not disabled
explicitly by providing rx_vec_en=0 devarg. There is a limitation:
LRO is not supported and scalar MPRQ is selected if it is on.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
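
The engagement rule reduces to a simple predicate over the devargs
(a sketch; the flag names mirror the devargs named above, not the
driver's internal fields):

    struct rx_config { int mprq_en; int rx_vec_en; int lro_en; };

    /* Vectorized MPRQ: MPRQ on, vectorization not disabled, LRO off
     * (with LRO enabled the scalar MPRQ burst is selected instead). */
    static int use_vectorized_mprq(const struct rx_config *c)
    {
        return c->mprq_en && c->rx_vec_en && !c->lro_en;
    }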


# e96242ef 01-Oct-2020 Michael Baum <michaelba@nvidia.com>

net/mlx5: remove Rx queue object type field

Once the separation between Verbs and DevX is done using function
pointers, the type field of the Rx queue object structure becomes
redundant and is no longer used.
Remove the unnecessary field from the structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 4c6d80f1 01-Oct-2020 Michael Baum <michaelba@nvidia.com>

net/mlx5: separate Rx queue state modification

Separate Rx state modification to the Verbs and DevX modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# 354cc08a 01-Oct-2020 Michael Baum <michaelba@nvidia.com>

net/mlx5: remove Tx queue object type field

Once the separation between Verbs and DevX is done using function
pointers, the type field of the Tx queue object structure becomes
redundant and is no longer used.
Remove the unnecessary field from the structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>


# a9c79306 01-Oct-2020 Michael Baum <michaelba@nvidia.com>

net/mlx5: share Tx queue object modification

Use the new modify_qp functions for Tx object creation in the DevX and
Verbs modules.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

