#
2d876343 |
| 05-Jul-2024 |
Jiawei Wang <jiaweiw@nvidia.com> |
net/mlx5: fix shared Rx queue data access race
The rxq_data resources are shared between shared Rx queues with the same group and queue ID. The cq_ci:24 field of rxq_data was not aligned with the other fields packed into the same 32-bit word, such as dynf_meta and delay_drop.
(bit layout of the shared 32-bit word: the 24-bit cq_ci field sits alongside the single-bit flags such as dynf_meta and delay_drop)
The issue is that while the control thread updates the dynf_meta:1 or delay_drop:1 value during port start, a data thread may update cq_ci at the same time. This byte-level race between the threads can overwrite cq_ci, and an abnormal value may be written to the HW CQ doorbell.
This patch separates cq_ci from the configuration data space and adds checks for delay_drop and dynf_meta when the shared Rx queue is started.
Fixes: 02a6195cbeaa ("net/mlx5: support enhanced CQE compression in Rx burst") Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
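The sketch below illustrates the kind of layout that causes such a race; the struct and field names are hypothetical, not the actual mlx5 rxq_data definition. Two threads cannot safely update bit-fields that share one 32-bit word, so moving the hot counter into its own word removes the conflict.

```c
#include <stdint.h>

/* Hypothetical "before" layout: a 24-bit counter shares one 32-bit word with
 * single-bit configuration flags, so a read-modify-write of the flags by the
 * control thread touches the same bytes as the data-path counter update.
 */
struct rxq_shared_word {
	uint32_t cq_ci:24;      /* updated by the data-path thread */
	uint32_t dynf_meta:1;   /* updated by the control thread at port start */
	uint32_t delay_drop:1;  /* updated by the control thread at port start */
	uint32_t reserved:6;
};

/* Hypothetical "after" layout: the counter owns a full word, so control-plane
 * writes to the flags can no longer overwrite in-flight counter updates.
 */
struct rxq_separated_word {
	uint32_t cq_ci;         /* exclusively owned by the data path */
	uint32_t dynf_meta:1;
	uint32_t delay_drop:1;
	uint32_t reserved:30;
};
```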
|
#
cd00dce6 |
| 01-Jul-2024 |
Shani Peretz <shperetz@nvidia.com> |
net/mlx5: add hairpin out-of-buffer per-port counter
Currently, the mlx5 PMD exposes an rx_out_of_buffer counter that tracks packets dropped when an Rx queue was full.
To provide more granular statistics, this patch splits the `rx_out_of_buffer` counter into two separate counters:
1. hairpin_out_of_buffer - tracks packets dropped by the device's hairpin Rx queues.
2. rx_out_of_buffer - tracks packets dropped by the device's Rx queues, excluding the hairpin Rx queues.
Two hardware counter objects will be created per device, and all the Rx queues will be assigned to these counters during the configuration phase.
The `hairpin_out_of_buffer` counter will be created only if there is at least one hairpin Rx queue present on the device.
Signed-off-by: Shani Peretz <shperetz@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
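A hedged sketch of how an application could read both counters through the generic xstats API; the counter names follow the commit text and error handling is trimmed.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <rte_ethdev.h>

/* Print rx_out_of_buffer and hairpin_out_of_buffer for a port, if present. */
static void
print_oob_counters(uint16_t port_id)
{
	int n = rte_eth_xstats_get(port_id, NULL, 0);
	if (n <= 0)
		return;

	struct rte_eth_xstat xstats[n];
	struct rte_eth_xstat_name names[n];

	if (rte_eth_xstats_get(port_id, xstats, n) != n ||
	    rte_eth_xstats_get_names(port_id, names, n) != n)
		return;

	for (int i = 0; i < n; i++) {
		if (strcmp(names[i].name, "rx_out_of_buffer") == 0 ||
		    strcmp(names[i].name, "hairpin_out_of_buffer") == 0)
			printf("%s: %" PRIu64 "\n", names[i].name, xstats[i].value);
	}
}
```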
|
#
1944fbc3 |
| 05-Jun-2024 |
Suanming Mou <suanmingm@nvidia.com> |
net/mlx5: support flow match with external Tx queue
To allow externally created Tx queues to be used in RTE_FLOW_ITEM_TX_QUEUE, this commit provides map and unmap functions that convert the DevX ID of an externally created SQ to a DPDK flow item Tx queue ID.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
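A rough sketch of matching on a Tx queue in a transfer flow rule. RTE_FLOW_ITEM_TX_QUEUE comes from the commit text; the exact spec struct layout is an assumption to verify against rte_flow.h, and the queue index is assumed to be the DPDK flow item Tx queue ID obtained from the new mapping helper for external SQs.

```c
#include <rte_flow.h>

/* Validate a rule that matches packets sent through the given Tx queue index
 * and drops them (the drop action is only a placeholder for the example).
 */
static int
match_tx_queue(uint16_t port_id, uint16_t tx_queue_id, struct rte_flow_error *err)
{
	/* Assumed spec layout; check struct rte_flow_item_tx_queue in rte_flow.h. */
	struct rte_flow_item_tx_queue spec = { .tx_queue = tx_queue_id };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_TX_QUEUE, .spec = &spec },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_attr attr = { .transfer = 1 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_DROP },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_validate(port_id, &attr, pattern, actions, err);
}
```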
|
#
8e8b44f2 |
| 05-Jun-2024 |
Suanming Mou <suanmingm@nvidia.com> |
net/mlx5: rename external queue
Since external Tx queues are going to be supported, rename the current external Rx queue structure to a generic external queue structure so that it can be reused.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
|
#
177d90dd |
| 13-Jul-2023 |
Bing Zhao <bingz@nvidia.com> |
net/mlx5: fix drop action memory leak
In DV mode, when quitting an application, the default drop action and its resources should be released. The DevX action for the TIR was not destroyed and it would cause an 80-byte memory leak.
With this commit, in DV mode, the action is destroyed explicitly in mlx5_devx_drop_action_destroy().
Bugzilla ID: 1192 Bugzilla ID: 1255
Fixes: bc5bee028ebc ("net/mlx5: create drop queue using DevX") Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com> Reported-by: Mário Kuka <kuka@cesnet.cz> Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
8fa8d147 |
| 05-Jul-2023 |
Viacheslav Ovsiienko <viacheslavo@nvidia.com> |
net/mlx5: add comprehensive send completion trace
There is a demand to trace the send completion of every WQE when time scheduling is enabled.
The patch extends the size of the completion queue and requests a completion for every WQE issued to the send queue. As a result, hardware provides a CQE for each completed WQE and the driver is able to fetch the completion timestamp for the dedicated operation.
The added code is under the RTE_ENABLE_TRACE_FP conditional compilation flag and does not impact the release code.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
0e04e1e2 |
| 06-Jul-2023 |
Xueming Li <xuemingl@nvidia.com> |
net/mlx5: support symmetric RSS hash function
This patch supports a symmetric hash function that produces the same hash result for bi-directional traffic, i.e. flows whose source and destination IP addresses and L4 ports are reversed.
Since the hash algorithm differs from the one in the spec (XOR), a warning is emitted during validation.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
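A hedged sketch of requesting the symmetric hash through the rte_flow RSS action; the queue list and hash types are illustrative placeholders.

```c
#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

/* RSS action requesting a symmetric hash so both directions of a connection
 * land on the same queue.
 */
static const uint16_t rss_queues[] = { 0, 1, 2, 3 };

static const struct rte_flow_action_rss symmetric_rss = {
	.func = RTE_ETH_HASH_FUNCTION_SYMMETRIC_TOEPLITZ,
	.types = RTE_ETH_RSS_IP | RTE_ETH_RSS_TCP,
	.queue_num = RTE_DIM(rss_queues),
	.queue = rss_queues,
};

static const struct rte_flow_action rss_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &symmetric_rss },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
```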
|
#
99532fb1 |
| 28-Feb-2023 |
Alexander Kozyrev <akozyrev@nvidia.com> |
net/mlx5: enable enhanced CQE compression
Extend the rxq_cqe_comp_en devarg to allow the Enhanced CQE Compression layout to be enabled by the user. Setting the 8th bit turns it on. For example, rxq_cqe_comp_en=0x84 means the L3/L4 header miniCQE format with the Enhanced CQE Compression layout. Enhanced CQE Compression can be enabled only if it is supported by the FW. Create the CQ with the proper CQE compression layout based on capabilities.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
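A minimal sketch of passing the devarg at EAL init time; the PCI address is a placeholder and the value 0x84 is the example from the commit text (Enhanced CQE Compression layout plus the L3/L4 header miniCQE format).

```c
#include <rte_common.h>
#include <rte_eal.h>

/* Illustrative EAL arguments enabling rxq_cqe_comp_en=0x84 on one device. */
static char *eal_args[] = {
	"app",
	"-a", "0000:08:00.0,rxq_cqe_comp_en=0x84",
};

int
init_eal(void)
{
	return rte_eal_init((int)RTE_DIM(eal_args), eal_args);
}
```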
|
#
ce306af6 |
| 22-Feb-2023 |
Jiawei Wang <jiaweiw@nvidia.com> |
net/mlx5: enhance Tx queue affinity
rte_eth_dev_map_aggr_tx_affinity() was introduced in the ethdev library; it is used to set the affinity value per Tx queue.
This patch adds MLX5 PMD support for two device ops:
- map_aggr_tx_affinity
- count_aggr_ports
After a Tx queue is mapped to an aggregated port by calling map_aggr_tx_affinity() and traffic is started, the MLX5 PMD updates TIS creation with the tx_aggr_affinity value of the Tx queue. TIS index 1 goes to the first physical port, TIS index 2 goes to the second physical port, and so on; TIS index 0 is reserved for the default HW hash mode.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
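A hedged sketch of how an application could use the two ethdev calls this commit implements in the PMD; the round-robin mapping policy is illustrative.

```c
#include <rte_ethdev.h>

/* Pin each Tx queue to one of the aggregated physical ports in round-robin
 * fashion; affinity 0 keeps the default HW hash mode, so values start at 1.
 * Intended to be called after queue configuration and before port start.
 */
static int
map_txqs_to_aggr_ports(uint16_t port_id, uint16_t nb_txq)
{
	int aggr_ports = rte_eth_dev_count_aggr_ports(port_id);

	if (aggr_ports <= 0)
		return aggr_ports; /* no aggregated ports reported */

	for (uint16_t q = 0; q < nb_txq; q++) {
		uint8_t affinity = (uint8_t)(q % (uint16_t)aggr_ports) + 1;
		int ret = rte_eth_dev_map_aggr_tx_affinity(port_id, q, affinity);

		if (ret != 0)
			return ret;
	}
	return 0;
}
```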
|
#
5d542232 |
| 05-Jan-2023 |
Erez Shitrit <erezsh@nvidia.com> |
net/mlx5/hws: support actions with shared resources
TIR/FT actions are different in the context of a shared ibv resource: they are created on the local ibv_context and aliased to the shared ibv_context. Other actions should be created on the shared ibv resource; a new flag was added to indicate where the object came from.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
a2364004 |
| 17-Nov-2022 |
Gregory Etelson <getelson@nvidia.com> |
net/mlx5: fix maximum LRO message size
The PMD analyzes the maximal LRO size of each Rx queue and selects one that fits all queues to configure the TIR LRO attribute. The TIR LRO attribute is the number of 256-byte chunks matching the selected maximal LRO size.
The PMD used `priv->max_lro_msg_size` both for the selected maximal LRO size and for the number of TIR chunks.
Fixes: b9f1f4c239 ("net/mlx5: fix port initialization with small LRO") Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
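A small illustration of the chunk arithmetic implied above; only the 256-byte granularity comes from the commit text, while the helper name and rounding choice are hypothetical.

```c
/* Express a maximal LRO message size as the number of 256-byte chunks that
 * the TIR LRO attribute expects (rounding is illustrative).
 */
#define LRO_CHUNK_SIZE 256u

static inline unsigned int
lro_size_to_chunks(unsigned int max_lro_size_bytes)
{
	return (max_lro_size_bytes + LRO_CHUNK_SIZE - 1) / LRO_CHUNK_SIZE;
}
```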
|
#
d9bad050 |
| 06-Nov-2022 |
Suanming Mou <suanmingm@nvidia.com> |
net/mlx5: fix flow table and queue routine on Windows
The macro HAVE_MLX5_HWS_SUPPORT was introduced for HWS only, and HWS is not supported on Windows. So the macro HAVE_MLX5_HWS_SUPPORT should only surround the code that HWS uses, and should not enclose code blocks shared by Linux and Windows.
Fixes: 22681deead3e ("net/mlx5/hws: enable hardware steering")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
22681dee |
| 20-Oct-2022 |
Alex Vesker <valex@nvidia.com> |
net/mlx5/hws: enable hardware steering
Replace stub implementation of HWS with mlx5dr code.
Signed-off-by: Alex Vesker <valex@nvidia.com>
|
#
f2d43ff5 |
| 06-Oct-2022 |
Dariusz Sosnowski <dsosnowski@nvidia.com> |
net/mlx5: allow hairpin Rx queue in locked memory
This patch adds a capability to place hairpin Rx queue in locked device memory. This capability is equivalent to storing hairpin RQ's data buffers in locked internal device memory.
Hairpin Rx queue creation is extended with requesting that the RQ is allocated in locked internal device memory. If the allocation fails and the force_memory hairpin configuration is set, then hairpin queue creation (and, as a result, device start) fails. If force_memory is unset, then the PMD will fall back to allocating memory for the hairpin RQ in unlocked internal device memory.
To allow such allocation, the user must set HAIRPIN_DATA_BUFFER_LOCK flag in FW using mlxconfig tool.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
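A hedged sketch of the application-side configuration that exercises this capability; the peer port/queue and descriptor count are placeholders.

```c
#include <rte_ethdev.h>

/* Request that a hairpin Rx queue's data buffers be placed in locked device
 * memory; with force_memory set, queue setup (and thus device start) fails if
 * the locked allocation cannot be satisfied.
 */
static int
setup_locked_hairpin_rxq(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
			 uint16_t peer_port, uint16_t peer_txq)
{
	struct rte_eth_hairpin_conf conf = {
		.peer_count = 1,
		.use_locked_device_memory = 1,
		.force_memory = 1,
		.peers[0] = { .port = peer_port, .queue = peer_txq },
	};

	return rte_eth_rx_hairpin_queue_setup(port_id, queue_id, nb_desc, &conf);
}
```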
|
#
7274b417 |
| 06-Oct-2022 |
Dariusz Sosnowski <dsosnowski@nvidia.com> |
net/mlx5: allow hairpin Tx queue in host memory
This patch adds a capability to place hairpin Tx queue in host memory managed by DPDK. This capability is equivalent to storing hairpin SQ's WQ buffer in host memory.
Hairpin Tx queue creation is extended with allocating a memory buffer of proper size (calculated from required number of packets and WQE BB size advertised in HCA capabilities).
The force_memory flag of the hairpin queue configuration is also supported. If it is set and:
- allocation of the memory buffer fails,
- or hairpin SQ creation fails,
then device start will fail. If it is unset, the PMD will fall back to creating the hairpin SQ with the WQ buffer located in unlocked device memory.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
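A hedged sketch of the Tx-side counterpart; peer values and descriptor count are placeholders.

```c
#include <rte_ethdev.h>

/* Place a hairpin Tx queue's WQ buffer in host memory managed by DPDK.
 * Without force_memory, the PMD may fall back to unlocked device memory if
 * the host allocation or SQ creation fails.
 */
static int
setup_host_mem_hairpin_txq(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
			   uint16_t peer_port, uint16_t peer_rxq)
{
	struct rte_eth_hairpin_conf conf = {
		.peer_count = 1,
		.use_rte_memory = 1,
		.peers[0] = { .port = peer_port, .queue = peer_rxq },
	};

	return rte_eth_tx_hairpin_queue_setup(port_id, queue_id, nb_desc, &conf);
}
```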
|
#
593f913a |
| 27-Jul-2022 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: fix LRO requirements check
One of the conditions to allow LRO offload is the DV configuration.
The function incorrectly checks the DV configuration before it is initialized from the user devargs; hence, LRO cannot be allowed.
This patch moves this check to mlx5_shared_dev_ctx_args_config, where DV configuration is initialized.
Fixes: c4b862013598 ("net/mlx5: refactor to detect operation by DevX") Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com> Reported-by: Gal Shalom <galshalom@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
25025da3 |
| 16-Jun-2022 |
Spike Du <spiked@nvidia.com> |
net/mlx5: handle Rx descriptor LWM event
When the RQ's WQE count reaches the LWM, the kernel driver raises an event to SW. Use a DevX event_channel to catch it and notify the user. This channel is allocated per shared device, and its cookie identifies the specific event port and queue.
Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
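On the application side this surfaces through the ethdev available-descriptor threshold event; a hedged sketch follows, with the query-loop semantics paraphrased from the ethdev API documentation.

```c
#include <stdio.h>
#include <rte_ethdev.h>

/* Walk the Rx queues whose free-descriptor count crossed the configured
 * threshold and report them; registered for RTE_ETH_EVENT_RX_AVAIL_THRESH.
 */
static int
rx_avail_thresh_event_cb(uint16_t port_id, enum rte_eth_event_type event,
			 void *cb_arg, void *ret_param)
{
	uint16_t queue_id = 0;
	uint8_t thresh;

	(void)event;
	(void)cb_arg;
	(void)ret_param;

	/* Positive return: a queue at or after queue_id crossed its threshold. */
	while (rte_eth_rx_avail_thresh_query(port_id, &queue_id, &thresh) > 0) {
		printf("port %u queue %u: available descriptors below %u%%\n",
		       port_id, queue_id, thresh);
		queue_id++;
	}
	return 0;
}

static int
register_lwm_event_cb(uint16_t port_id)
{
	return rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_RX_AVAIL_THRESH,
					     rx_avail_thresh_event_cb, NULL);
}
```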
|
#
7158e46c |
| 16-Jun-2022 |
Spike Du <spiked@nvidia.com> |
net/mlx5: support descriptor LWM for Rx queue
Add an LWM (Limit WaterMark) field to the Rxq object, which indicates the percentage of the Rx queue size used by HW to raise a descriptor event to the user. Allow LWM setting in the modify_rq command. Allow dynamic LWM configuration by adding the RDY2RDY state change.
Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
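A minimal sketch of arming the threshold from the application; the 30% value is arbitrary and 0 disables the event.

```c
#include <rte_ethdev.h>

/* Raise an RTE_ETH_EVENT_RX_AVAIL_THRESH event once fewer than 30% of the
 * queue's Rx descriptors remain available.
 */
static int
arm_rx_lwm(uint16_t port_id, uint16_t queue_id)
{
	return rte_eth_rx_avail_thresh_set(port_id, queue_id, 30);
}
```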
|
#
18ca4a4e |
| 12-May-2022 |
Raja Zidane <rzidane@nvidia.com> |
net/mlx5: support ESP SPI match and RSS hash
In packets with an ESP header, the inner IP is encrypted and its fields cannot be used for RSS hashing, so ESP packets can be hashed only by the outer IP layer. As a result, RSS on ESP packets may not spread traffic well: the only fields available to the hash functions are the outer IPs, so all traffic of all tunnels between a given pair of gateways lands on one core. Adding the SPI hash field extends the spreading of IPsec packets.
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
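A hedged sketch of adding the ESP SPI to the RSS hash via the rte_flow RSS action; the queue list and companion hash types are placeholders.

```c
#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

/* Spread IPsec traffic by hashing on the outer IPs plus the ESP SPI. */
static const uint16_t esp_rss_queues[] = { 0, 1, 2, 3 };

static const struct rte_flow_action_rss esp_rss = {
	.types = RTE_ETH_RSS_IPV4 | RTE_ETH_RSS_ESP,
	.queue_num = RTE_DIM(esp_rss_queues),
	.queue = esp_rss_queues,
};

static const struct rte_flow_action esp_rss_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &esp_rss },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
```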
|
#
7a993368 |
| 25-Apr-2022 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: fix LRO configuration in drop Rx queue
The driver wrongly set the LRO configuration in the TIR of the DevX drop queue even when LRO is not supported. Actually, the LRO configuration is not relevant to the drop queue at all.
This caused a failure during device initialization, on devices that do not support LRO, when the drop queue was created.
Probably, the drop queue creation by DevX missed the fact that LRO is set by default in the TIR creation function and did not unset it for the drop queue case, unlike the other cases that unset LRO.
Change the default LRO configuration to unset, and set it only when all the TIR's queues are configured with LRO.
Fixes: bc5bee028ebc ("net/mlx5: create drop queue using DevX") Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
9011af71 |
| 09-Mar-2022 |
Thinh Tran <thinhtr@linux.vnet.ibm.com> |
net/mlx5: fix CPU socket ID for Rx queue creation
The default CPU socket ID was used while creating the Rx queue, which caused a creation failure when the hardware did not reside on the default socket.
The patch sets the correct CPU socket ID for the mlx5_rxq_ctrl before calling mlx5_rxq_create_devx_rq_resources(), which eventually calls mlx5_devx_rq_create() with the correct CPU socket ID.
Fixes: bc5bee028ebc ("net/mlx5: create drop queue using DevX") Cc: stable@dpdk.org
Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
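A short sketch of the application-side analogue: creating an Rx queue on the device's NUMA node instead of the default socket. The descriptor count is a placeholder.

```c
#include <rte_ethdev.h>
#include <rte_memory.h>

/* Set up an Rx queue on the NUMA node the device is attached to. */
static int
setup_rxq_on_dev_node(uint16_t port_id, uint16_t queue_id, struct rte_mempool *mp)
{
	int socket_id = rte_eth_dev_socket_id(port_id);

	if (socket_id < 0)
		socket_id = SOCKET_ID_ANY; /* NUMA node could not be determined */

	return rte_eth_rx_queue_setup(port_id, queue_id, 512, socket_id, NULL, mp);
}
```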
|
#
311b17e6 |
| 24-Feb-2022 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: support queue/RSS actions for external Rx queue
Add support for the queue/RSS action with external Rx queues. During indirection table creation, the queue index is taken from the mapping array.
This feature supports neither LRO nor Hairpin.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
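A hedged sketch of mapping an external RQ and steering traffic to it; the mapping helper lives in rte_pmd_mlx5.h (its exact prototype and the valid external index range should be checked there), and the index values are placeholders.

```c
#include <rte_flow.h>
#include <rte_pmd_mlx5.h>

/* Map an externally created RQ (by its HW/DevX index) to a DPDK queue index,
 * then reference that index from a QUEUE flow action.
 */
static int
use_external_rxq(uint16_t port_id, uint16_t dpdk_idx, uint32_t hw_idx,
		 const struct rte_flow_attr *attr,
		 const struct rte_flow_item *pattern,
		 struct rte_flow_error *err)
{
	int ret = rte_pmd_mlx5_external_rx_queue_map(port_id, dpdk_idx, hw_idx);

	if (ret != 0)
		return ret;

	struct rte_flow_action_queue queue = { .index = dpdk_idx };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_validate(port_id, attr, pattern, actions, err);
}
```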
|
#
c06f77ae |
| 24-Feb-2022 |
Michael Baum <michaelba@nvidia.com> |
net/mlx5: optimize queue type checks
The RxQ/TxQ control structure has a field named type. This type is an enum with values for standard and hairpin; the field is used to check whether a queue is of the hairpin type or the standard type.
This patch replaces it with a boolean variable that stores whether the queue is a hairpin queue.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
|
#
3a2f674b |
| 24-Feb-2022 |
Suanming Mou <suanmingm@nvidia.com> |
net/mlx5: add queue and RSS HW steering action
This commit adds the queue and RSS action. Similar to the jump action, dynamic ones will be added to the action construct list.
Because the queue and RSS actions in a template should not be destroyed during port restart, the actions are created with a standalone indirect table, as indirect actions are. When the port stops, the indirect table is detached from the action; when the port starts, it is attached back to the action.
One more change is made to accelerate action creation. Currently, the mlx5_hrxq_get() function returns the object index instead of the object pointer. This introduces an extra index-to-object conversion through mlx5_ipool_get() in most cases, and that extra conversion hurts multi-thread performance since mlx5_ipool_get() takes a global lock internally. As the hash Rx queue object itself also contains the index, returning the object directly achieves better performance without the global lock.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
|
#
2f5122df |
| 24-Feb-2022 |
Viacheslav Ovsiienko <viacheslavo@nvidia.com> |
net/mlx5: configure Tx queue with send on time offload
The wait-on-time configuration flag is copied to the Tx queue structure for performance considerations. The timestamp mask is prepared and stored in the queue structure as well.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
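A hedged sketch of the application side of the send-on-time offload: storing the scheduled transmit time in the mbuf's registered dynamic timestamp field and setting the matching dynamic flag. Lookups should be cached in real code.

```c
#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

/* Schedule a packet for transmission at tx_time (in the device clock domain)
 * by filling the timestamp dynamic field and marking the Tx timestamp flag.
 */
static int
schedule_tx_time(struct rte_mbuf *m, uint64_t tx_time)
{
	int ts_off = rte_mbuf_dynfield_lookup(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
	int ts_flag = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);

	if (ts_off < 0 || ts_flag < 0)
		return -1; /* offload not enabled, field/flag not registered */

	*RTE_MBUF_DYNFIELD(m, ts_off, uint64_t *) = tx_time;
	m->ol_flags |= 1ULL << ts_flag;
	return 0;
}
```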
|