#
151cbe3a |
| 12-Apr-2021 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: separate Rx function declarations to another file
The mlx5_rxtx.c file contains a lot of Tx burst functions, each of them performance-optimized for a specific set of requested offloads. They are generated from a template function, and compiling them takes significant time simply because of the large number of giant functions generated in the same file, a compilation that cannot be parallelized with multithreading.
Therefore we can split the mlx5_rxtx.c file into several separate files to allow different functions to be compiled simultaneously. In this patch, we separate the Rx function declarations into a different header file, in preparation for removing them from the source file and as an optional preparation step for further consolidation of Rx burst functions.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
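A minimal sketch of the resulting layout, assuming illustrative file and declaration choices (the exact prototypes moved by the patch may differ): the Rx burst prototypes live in their own header so the Rx and Tx sources can later become independent compilation units.
    /* mlx5_rx.h - illustrative header holding only the Rx-path declarations. */
    #ifndef RTE_PMD_MLX5_RX_H_
    #define RTE_PMD_MLX5_RX_H_

    #include <stdint.h>
    #include <rte_mbuf.h>

    /* Rx burst entry points, previously declared next to the Tx ones. */
    uint16_t mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n);
    uint16_t mlx5_rx_burst_vec(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n);
    uint16_t mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n);

    #endif /* RTE_PMD_MLX5_RX_H_ */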
|
#
d61381ad |
| 14-Mar-2021 |
Viacheslav Ovsiienko <viacheslavo@nvidia.com> |
net/mlx5: support timestamp format
This patch adds support for the timestamp format settings for the receive and send queues. If the firmware version x.30.1000 or above is installed and the NIC timestamps are configured with the real-time format, the default zero values for newly added fields cause the queue creation to fail.
The patch queries the timestamp formats supported by the hardware and sets the configuration values in queue context accordingly.
Fixes: 86fc67fc9315 ("net/mlx5: create advanced RxQ object via DevX")
Fixes: ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Fixes: 15c3807e86ab ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>
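A hedged sketch of the selection logic, using stand-in constants rather than the firmware-defined encodings, to show how a supported format is chosen instead of leaving the new fields at zero:
    #include <stdbool.h>

    /* Stand-ins for the firmware-defined timestamp format encodings. */
    enum ts_format { TS_FORMAT_FREE_RUNNING = 0, TS_FORMAT_REAL_TIME = 1 };

    /* Pick a timestamp format the queue can actually be created with. */
    static enum ts_format
    pick_ts_format(bool hw_real_time_capable, bool real_time_configured)
    {
        if (real_time_configured && hw_real_time_capable)
            return TS_FORMAT_REAL_TIME;
        /* Fall back to free-running instead of leaving the field zeroed. */
        return TS_FORMAT_FREE_RUNNING;
    }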
|
#
e6988afd |
| 25-Feb-2021 |
Matan Azrad <matan@nvidia.com> |
net/mlx5: fix imissed statistics
The imissed port statistic counts packets that were dropped by the device Rx queues.
In mlx5, the imissed counter summarizes 2 counters:
- packets dropped by the SW queue handling, counted by SW.
- packets dropped by the HW queues due to "out of buffer" events detected when no SW buffer is available for the incoming packets.
There is a HW counter object that should be created per device, and all the Rx queues should be assigned to this counter at configuration time.
This part was missed when the Rx queues were created by DevX, which left the "out of buffer" counter permanently clean (zero) in this case.
Add 2 options to assign the DevX Rx queues to a queue counter:
- Create a queue counter per device by DevX and assign all the queues to it.
- Query the kernel counter and assign all the queues to it.
Use the first option by default and, if it fails, fall back to the second option.
Fixes: e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
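A hedged sketch of the fallback, with hypothetical helper names standing in for the DevX queue-counter command and the kernel counter query:
    #include <stdint.h>

    /* Hypothetical primitives standing in for the real DevX/kernel calls. */
    void *devx_create_queue_counter(void *ctx);
    int   kernel_query_queue_counter(void *ctx, uint32_t *counter_id);

    /* Preferred: one DevX counter per device; otherwise reuse the kernel one.
     * All Rx queues are later created pointing at the obtained counter. */
    static int
    obtain_oob_counter(void *ctx, void **devx_obj, uint32_t *counter_id)
    {
        *devx_obj = devx_create_queue_counter(ctx);
        if (*devx_obj != NULL)
            return 0;
        return kernel_query_queue_counter(ctx, counter_id);
    }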
|
#
00984de5 |
| 04-Feb-2021 |
Viacheslav Ovsiienko <viacheslavo@nvidia.com> |
net/mlx5: fix Tx queue size created with DevX
The number of descriptors specified for queue creation implies the queue should be able to contain the specified number of packets being sent. Typically one packet takes one queue descriptor (WQE) to be handled. If the inline data option is enabled, one packet might require more WQEs to embrace the inline data, and the overall queue size (the number of queue descriptors) should be adjusted accordingly.
In the mlx5 PMD the queues can be created either via Verbs, using the rdma-core library, or via DevX as a direct kernel/firmware call. rdma-core does the queue size adjustment internally, depending on the TSO and inline settings. The DevX approach missed this point, which caused a queue size discrepancy and performance variations.
The patch adjusts the Tx queue size for the DevX approach in the same way as it is done in the rdma-core implementation.
Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations") Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
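A hedged sizing sketch under simplified assumptions (64B WQEBBs, a fixed inline budget per packet); the actual adjustment also accounts for TSO and the selected inline mode:
    #include <rte_common.h>

    /* With inline data enabled a packet can span several 64B WQEBBs, so the
     * ring must be scaled for 'desc' packets to still fit. */
    static uint32_t
    txq_wqebb_count(uint16_t desc, unsigned int inline_bytes)
    {
        const unsigned int wqebb = 64;
        unsigned int wqebbs_per_pkt = 1 + (inline_bytes + wqebb - 1) / wqebb;

        return rte_align32pow2(desc * wqebbs_per_pkt);
    }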
|
#
6e0a3637 |
| 06-Jan-2021 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: move Rx RQ creation to common
Use a common function for Rx RQ creation.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
74e91860 |
| 06-Jan-2021 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: move Tx SQ creation to common
Use a common function for Tx SQ creation.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
5cd33796 |
| 06-Jan-2021 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: move Rx CQ creation to common
Use a common function for Rx CQ creation.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
5f04f70c |
| 06-Jan-2021 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: move Tx CQ creation to common
Use a common function for Tx CQ creation.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
a2521c8f |
| 06-Jan-2021 |
Michael Baum <michaelba@nvidia.com> |
common/mlx5: fix completion queue entry size configuration
According to the current data-path implementation in the PMD, the CQE size must follow the cache-line size. So the configuration of the CQE size should depend on RTE_CACHE_LINE_SIZE.
Wrongly, some of the CQE creations didn't follow this exactly, which caused an incompatibility between HW and SW in the data-path when working on systems with a 128B cache-line size.
Adjust the rule for every CQE creation. Remove the cqe_size attribute from the DevX CQ creation command and set it inside the command translation according to the cache-line size.
Fixes: 79a7e409a2f6 ("common/mlx5: prepare support of packet pacing")
Fixes: 5cd0a83f413e ("common/mlx5: support more fields in DevX CQ create")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
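A hedged sketch of the rule, assuming the usual 0/1 encoding for 64B/128B CQEs (the real code writes the value while translating the CQ creation command):
    #include <rte_common.h>

    /* Keep the CQE size in lockstep with the CPU cache-line size. */
    static unsigned int
    cq_cqe_size_encoding(void)
    {
        return RTE_CACHE_LINE_SIZE == 128 ? 1 /* 128B CQE */ : 0 /* 64B CQE */;
    }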
|
#
f1ae0b35 |
| 28-Dec-2020 |
Ophir Munk <ophirmu@nvidia.com> |
net/mlx5: enable more shared code on Windows
Use the macro HAVE_INFINIBAND_VERBS_H to successfully compile files both under Linux and Windows (or any non-Linux OS in general). Under Windows this macro:
1. Hides Verbs references.
2. Exposes required DV structs that are under ifdefs related to rdma-core.
Linux code under definitions such as #ifdef HAVE_IBV_FLOW_DV_SUPPORT is required unconditionally under Windows; however, those definitions are never effective without rdma-core presence. Therefore, update the #ifdef condition to also consider HAVE_INFINIBAND_VERBS_H (a macro which is undefined when building without an rdma-core library).
For example:
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+#if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)
Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
88019723 |
| 28-Dec-2020 |
Ophir Munk <ophirmu@nvidia.com> |
net/mlx5: fix flow operation wrapper per OS
Wrap the glue call dv_create_flow_action_dest_devx_tir() with an OS API.
Fixes: b293fbf9672b ("net/mlx5: add OS specific flow actions operations") Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
98174626 |
| 28-Dec-2020 |
Tal Shnaiderman <talshn@nvidia.com> |
common/mlx5: wrap event channel functions per OS
Wrap the API to create/destroy an event channel and to subscribe to an event with OS calls. In Linux, those calls are implemented by glue functions, while in Windows they are not supported.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
07a99de8 |
| 28-Dec-2020 |
Tal Shnaiderman <talshn@nvidia.com> |
net/mlx5: wrap glue reg/dereg UMEM per OS
Wrap the glue calls for UMEM registration and deregistration with generic OS calls, since each OS (Linux or Windows) has different glue API parameters.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
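A hedged sketch of the wrapper pattern, with illustrative names and signatures: common code calls one OS-neutral symbol, and each OS directory supplies its own definition, the Linux one forwarding to the rdma-core glue.
    #include <stddef.h>
    #include <stdint.h>

    /* OS-neutral prototype used by the shared code. */
    void *os_umem_reg(void *ctx, void *addr, size_t size, uint32_t access);

    #ifdef __linux__
    /* Hypothetical glue entry standing in for the rdma-core DevX UMEM call. */
    void *glue_devx_umem_reg(void *ctx, void *addr, size_t size, uint32_t access);

    void *
    os_umem_reg(void *ctx, void *addr, size_t size, uint32_t access)
    {
        return glue_devx_umem_reg(ctx, addr, size, access);
    }
    #endif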
|
#
6e09c7bb |
| 24-Nov-2020 |
Gregory Etelson <getelson@nvidia.com> |
net/mlx5: fix DevX resources freeing
Invalid memory release order of DevX resources caused a PMD crash.
1. SQ and CQ memory must be unregistered with DevX before it is freed.
2. SQ objects reference CQ objects. Hence, an SQ should be destroyed before the CQ it references.
Fixes: 6deb19e1b2d2 ("net/mlx5: separate Rx queue object creations") Fixes: 88f2e3f18cc7 ("net/mlx5: rearrange SQ and CQ creation in DevX module")
Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
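A hedged teardown sketch with hypothetical types and helpers, showing the two rules: destroy the SQ before the CQ it references, and deregister DevX memory before freeing it.
    #include <stdlib.h>

    /* Hypothetical handles and primitives for the sketch. */
    struct q_res { void *sq; void *cq; void *umem; void *buf; };
    void devx_obj_destroy(void *obj);
    void devx_umem_dereg(void *umem);

    static void
    q_res_free(struct q_res *r)
    {
        if (r->sq != NULL)
            devx_obj_destroy(r->sq);   /* SQ first: it references the CQ */
        if (r->cq != NULL)
            devx_obj_destroy(r->cq);
        if (r->umem != NULL)
            devx_umem_dereg(r->umem);  /* deregister before freeing */
        free(r->buf);
    }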
|
#
fa7ad49e |
| 22-Nov-2020 |
Andrey Vesnovaty <andreyv@nvidia.com> |
net/mlx5: fix shared RSS action update
The shared RSS action update was not operational due to the lack of kernel driver support for TIR object modification. This commit introduces a workaround to support shared RSS action modification using an indirect queue table update instead of touching the TIR object directly. Limitation: the only supported RSS property to update is the queues; the rest of the properties are ignored.
Fixes: d2046c09aa64 ("net/mlx5: support shared action for RSS")
Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
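A hedged sketch of the workaround shape, with illustrative names: the queue list is changed by rewriting the indirection table behind the TIR, because the TIR object itself cannot be modified on these kernels.
    #include <stdint.h>

    /* Hypothetical handle and primitive for the sketch. */
    struct hrxq { void *ind_table; };
    int ind_table_modify(void *ind_table, const uint16_t *queues, uint32_t n);

    static int
    shared_rss_update_queues(struct hrxq *h, const uint16_t *queues, uint32_t n)
    {
        /* Only the queue set can change; hash key and fields stay untouched. */
        return ind_table_modify(h->ind_table, queues, n);
    }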
|
#
ff2deada |
| 15-Nov-2020 |
Alexander Kozyrev <akozyrev@nvidia.com> |
net/mlx5: fix Rx packet padding config via DevX
Received packets can be aligned to the size of the cache line on PCI transactions. This could improve performance by avoiding partial cache line writes in exchange for increased PCI bandwidth.
This feature is supposed to be controlled by the rxq_pkt_pad_en devarg, and that is true for an RxQ created via the Verbs API. But in the DevX API case, it is erroneously controlled by the rxq_cqe_pad_en devarg instead, which is in charge of the CQE padding and should not control the RxQ creation.
Fix DevX RxQ creation by using the proper configuration flag for Rx packet padding that is being set by the rxq_pkt_pad_en devarg.
Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX") Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
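A hedged sketch of the corrected wiring, with illustrative attribute names: the RQ end-padding mode now follows the value parsed from rxq_pkt_pad_en rather than the CQE-padding devarg.
    #include <stdbool.h>

    /* Illustrative RQ attribute selection driven by rxq_pkt_pad_en. */
    enum rq_pad_mode { RQ_PAD_NONE = 0, RQ_PAD_TO_CACHE_LINE = 1 };

    static enum rq_pad_mode
    rq_padding_mode(bool hw_padding /* parsed from rxq_pkt_pad_en */)
    {
        return hw_padding ? RQ_PAD_TO_CACHE_LINE : RQ_PAD_NONE;
    }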
|
#
876b5d52 |
| 03-Nov-2020 |
Matan Azrad <matan@nvidia.com> |
net/mlx5: fix Tx queue stop state
The Tx queue stop API doesn't call the PMD callback when the state of the queue is stopped. The drivers should update the state to be stopped when the queue stop callback is done successfully or when the port is stopped. The drivers should update the state to be started when the queue start callback is done successfully or when the port is started.
The driver wrongly didn't update the state to started when the port start callback was done, which kept the state as stopped. A following call to the queue stop API was not completed by the ethdev layer because the state was already stopped.
Move the state update from the Tx queue setup to the port start callback.
Fixes: 161d103b231c ("net/mlx5: add queue start and stop") Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
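A hedged sketch of the fix location, using the generic ethdev state fields: the started state is published from the port-start callback so a later queue-stop request actually reaches the PMD.
    #include <rte_ethdev_driver.h>  /* driver-side header; name varies by DPDK release */

    /* Mark every Tx queue as started from the port-start callback. */
    static void
    dev_start_mark_txqs(struct rte_eth_dev *dev)
    {
        uint16_t i;

        for (i = 0; i < dev->data->nb_tx_queues; i++)
            dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STARTED;
    }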
|
#
8178d9be |
| 28-Oct-2020 |
Tal Shnaiderman <talshn@nvidia.com> |
net/mlx5: fix SQ resources release in error flow
Fix an error flow in which the function mlx5_txq_release_devx_sq_resources is called twice, by setting the released objects to NULL after the first call.
The incorrect flow was introduced in the work done on generic object creation.
Once an error flow inside mlx5_txq_create_devx_sq_resources occurs, the function calls mlx5_txq_release_devx_sq_resources; however, the released pointers are not set to NULL after the release calls, and undefined memory is released again in mlx5_txq_release_devx_resources.
This results in calls to MLX5_FREE with already released memory addresses and an assert in mlx5_release_dbr:
EAL: Error: Invalid memory EAL: Error: Invalid memory
PANIC in mlx5_txq_release_devx_sq_resources(): assert "(mlx5_release_dbr(&txq_obj->txq_ctrl->priv->dbrpgs, mlx5_os_get_umem_id (txq_obj->sq_dbrec_page->umem), txq_obj->sq_dbrec_offset)) == 0" failed
The fix is setting the released pointers to NULL after the first release calls.
Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations") Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
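A hedged sketch of the fix pattern with hypothetical field names: each pointer is cleared immediately after its release call, so a second pass through the release path becomes a no-op.
    #include <stdlib.h>

    /* Hypothetical resource holder and release primitive. */
    struct sq_res { void *umem; void *buf; };
    void devx_umem_dereg(void *umem);

    static void
    sq_res_release(struct sq_res *r)
    {
        if (r->umem != NULL) {
            devx_umem_dereg(r->umem);
            r->umem = NULL;     /* prevent a double deregistration */
        }
        if (r->buf != NULL) {
            free(r->buf);
            r->buf = NULL;      /* prevent a double free */
        }
    }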
|
#
54c2d46b |
| 01-Nov-2020 |
Alexander Kozyrev <akozyrev@nvidia.com> |
net/mlx5: support flow tag and packet header miniCQEs
CQE compression allows us to save the PCI bandwidth and improve the performance by compressing several CQEs together to a miniCQE. But the miniCQE size is only 8 bytes and this limits the ability to successfully keep the compression session in case of various traffic patterns.
The current miniCQE format only keeps the compression session alive in case of uniform traffic with the Hash RSS as the only difference. There are requests to keep the compression session in case of tagged traffic by RTE Flow Mark Id and mixed UDP/TCP and IPv4/IPv6 traffic. Add 2 new miniCQE formats in order to achieve the best performance for these traffic patterns: Flow Tag and Packet Header miniCQEs.
The existing rxq_cqe_comp_en devarg is modified to specify the desired miniCQE format:
- 2 selects the Flow Tag format for a better compression rate in case of RTE Flow Mark traffic.
- 3 selects the Checksum format (the existing format for MPRQ).
- 4 selects the L3/L4 Header format for a better compression rate in case of mixed TCP/UDP and IPv4/IPv6 traffic.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
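A hedged usage sketch: an application selecting the Flow Tag miniCQE format through the devargs of the probed port (the PCI address is a placeholder).
    #include <stdio.h>
    #include <rte_eal.h>

    int
    main(void)
    {
        /* "-a" allowlists the device and passes its devargs. */
        char *argv[] = { "app", "-a", "0000:03:00.0,rxq_cqe_comp_en=2", NULL };

        if (rte_eal_init(3, argv) < 0) {
            printf("EAL init failed\n");
            return -1;
        }
        return 0;
    }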
|
#
b8cc58c1 |
| 23-Oct-2020 |
Andrey Vesnovaty <andreyv@nvidia.com> |
net/mlx5: modify hash Rx queue objects
Implement modification for the hashed table of the Rx queue object (see mlx5_hrxq_modify()). This implementation relies on the capability to modify the TIR object via the DevX API, i.e. the current implementation doesn't support Verbs HW object operations. The functionality to modify the hashed table of the Rx queue object is a prerequisite to implement rte_flow_shared_action_update() for the shared RSS action in the mlx5 PMD.
Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
0f20acbf |
| 21-Oct-2020 |
Alexander Kozyrev <akozyrev@nvidia.com> |
net/mlx5: implement vectorized MPRQ burst
MPRQ (Multi-Packet Rx Queue) processes one packet at a time using simple scalar instructions. MPRQ works by posting a single large buffer (consisting of multiple fixed-size strides) in order to receive multiple packets at once on this buffer. An Rx packet is then copied to a user-provided mbuf, or the PMD attaches the Rx packet to the mbuf via a pointer to an external buffer.
There is an opportunity to speed up the packet receiving by processing 4 packets simultaneously using SIMD (single instruction, multiple data) extensions. Allocate mbufs in batches for every MPRQ buffer and process the packets in groups of 4 until all the strides are exhausted. Then switch to another MPRQ buffer and repeat the process over again.
The vectorized MPRQ burst routine is engaged automatically if the mprq_en=1 devarg is specified and vectorization is not disabled explicitly by providing the rx_vec_en=0 devarg. There is a limitation: LRO is not supported, and scalar MPRQ is selected if it is enabled.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
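A hedged sketch of the batching idea only, with hypothetical fields and helpers (the real routine performs the copy/attach step with SIMD intrinsics): strides of the current MPRQ buffer are consumed four packets at a time before moving to the next buffer.
    #include <stdint.h>
    #include <rte_common.h>
    #include <rte_mbuf.h>

    /* Hypothetical per-queue state and helper for the sketch. */
    struct mprq_rxq { uint32_t strd_idx; uint32_t strd_n; };
    uint16_t copy_or_attach(struct mprq_rxq *q, struct rte_mbuf **out, uint16_t n);

    static uint16_t
    mprq_drain_buffer(struct mprq_rxq *q, struct rte_mbuf **out, uint16_t max)
    {
        uint16_t done = 0;

        while (q->strd_idx < q->strd_n && done < max) {
            uint32_t left = q->strd_n - q->strd_idx;
            uint16_t n = (uint16_t)RTE_MIN(left, 4u);

            if (n > max - done)
                n = max - done;
            done += copy_or_attach(q, out + done, n);  /* SIMD inside */
            q->strd_idx += n;
        }
        return done;
    }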
|
#
e96242ef |
| 01-Oct-2020 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: remove Rx queue object type field
Once the separation between Verbs and DevX is done using function pointers, the type field of the Rx queue object structure becomes redundant and is no longer used. Remove the unnecessary field from the structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
4c6d80f1 |
| 01-Oct-2020 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: separate Rx queue state modification
Separate Rx state modification to the Verbs and DevX modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
354cc08a |
| 01-Oct-2020 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: remove Tx queue object type field
Once the separation between Verbs and DevX is done using function pointers, the type field of the Tx queue object structure becomes redundant and is no longer used. Remove the unnecessary field from the structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
a9c79306 |
| 01-Oct-2020 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: share Tx queue object modification
Use new modify_qp functions for Tx object creation in DevX and Verbs modules.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|