# f8f294c6 | 25-Nov-2024 | Bing Zhao <bingz@nvidia.com>
net/mlx5: fix shared Rx queue control release
Correct the reference counting and condition checking for shared Rx queue control structures. This fix ensures proper memory management during port stop and device close stages.
The change moves the control structure's reference count decrease outside the owners-list-empty condition, and adjusts the reference count check to subtract first and then evaluate the result.
This prevents potential crashes during port restart by ensuring shared Rx queues' control structures are properly freed.
Fixes: 3c9a82fa6edc ("net/mlx5: fix Rx queue control management")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
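
A minimal C sketch of the release ordering described above (illustrative names, not the driver's actual symbols): the reference count is decreased unconditionally and the result is evaluated afterwards, so the last release frees the shared control structure.

    #include <stdlib.h>

    struct rxq_ctrl {
        unsigned int refcnt; /* one reference per Rx queue sharing this control */
        /* ... shared Rx queue control data ... */
    };

    static void
    rxq_ctrl_release(struct rxq_ctrl *ctrl)
    {
        /* Subtract first, then evaluate: only when the last reference is
         * dropped is the shared control structure freed. */
        if (--ctrl->refcnt == 0)
            free(ctrl);
    }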
# 3c9a82fa | 04-Nov-2024 | Bing Zhao <bingz@nvidia.com>
net/mlx5: fix Rx queue control management
With the shared Rx queue feature introduced, the control and private Rx queue structures are decoupled; each control structure can be shared by multiple queues for all representors inside a domain.
So it should be managed only by the shared context instead of the private data of each device. The previous workaround used a flag to check the owner (allocator) of the structure and handled it only at that device's close stage.
The proper solution is to add a reference count to each control structure and free the structure only when no reference to it remains, getting rid of the UAF issue.
Fixes: f957ac996435 ("net/mlx5: workaround list management of Rx queue control")
Fixes: bcc220cb57d7 ("net/mlx5: fix shared Rx queue list management")
CC: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
# f957ac99 | 23-Jul-2024 | Bing Zhao <bingz@nvidia.com>
net/mlx5: workaround list management of Rx queue control
The LIST_REMOVE macro only removes the entry from the list and updates the list itself. The pointers of this entry are not reset to NULL, so nothing prevents it from being accessed a second time.
In the previous fix for the memory access, the "rxq_ctrl" was removed from the list in a device's private data when the "refcnt" decreased to 0. With only shared or only non-shared queues, this was safe since all the "rxq_ctrl" entries were either freed or kept.
There is one case in which shared and non-shared Rx queues are configured simultaneously; for example, a hairpin Rx queue cannot be shared. When closing the port that allocated the shared Rx queues' "rxq_ctrl", if the next entry is a hairpin "rxq_ctrl", the hairpin "rxq_ctrl" is freed directly with the other resources. When trying to close the other port sharing the "rxq_ctrl", LIST_REMOVE is called again and causes a UAF issue. If the memory is no longer mapped, there will be a SIGSEGV.
Add a flag in the Rx queue private structure so that the "rxq_ctrl" is removed from the list only by the port/queue that allocated it.
Fixes: bcc220cb57d7 ("net/mlx5: fix shared Rx queue list management")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
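
A minimal C sketch of the hazard and the workaround (hypothetical structure and field names, not the driver's): LIST_REMOVE() leaves the removed entry's pointers dangling, so only the allocating port may unlink and free it.

    #include <stdbool.h>
    #include <stdlib.h>
    #include <sys/queue.h>

    struct rxq_ctrl {
        LIST_ENTRY(rxq_ctrl) next; /* le_next/le_prev are NOT reset by LIST_REMOVE() */
        /* ... control data shared by several ports ... */
    };

    struct rxq_priv {
        struct rxq_ctrl *ctrl;
        bool possessor; /* set only on the port/queue that allocated ctrl */
    };

    static void
    rxq_release(struct rxq_priv *priv)
    {
        struct rxq_ctrl *ctrl = priv->ctrl;

        if (ctrl == NULL)
            return;
        /* Unlinking (and freeing) twice, from two ports sharing the same
         * ctrl, would dereference the stale le_prev/le_next pointers of an
         * already freed entry - the use-after-free described above. */
        if (priv->possessor) {
            LIST_REMOVE(ctrl, next);
            free(ctrl);
        }
        priv->ctrl = NULL;
    }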
# 2d876343 | 05-Jul-2024 | Jiawei Wang <jiaweiw@nvidia.com>
net/mlx5: fix shared Rx queue data access race
The rxq_data resources are shared between shared Rx queues with the same group and queue ID. The cq_ci:24 field of rxq_data is not isolated from the other fields in the same 32-bit word, such as dynf_meta and delay_drop.
32bit: xxxx xxxI IIII IIII IIII IIII IIII IIIx
              ^ .... .... .... .... ...^
              |         cq_ci          |
The issue is that while the control thread updates the dynf_meta:1 or delay_drop:1 value during port start, a data thread may update cq_ci at the same time. This causes a race condition between the threads, and the cq_ci value may be overwritten, pushing an abnormal value into the HW CQ doorbell.
This patch separates cq_ci from the configuration data space, and adds a check for delay_drop and dynf_meta if the shared Rx queue is already started.
Fixes: 02a6195cbeaa ("net/mlx5: support enhanced CQE compression in Rx burst")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
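
A hypothetical C layout illustrating the race and the fix; the real rxq_data structure contains many more fields.

    #include <stdint.h>

    /* Before: all bit fields share one 32-bit storage unit, so a
     * read-modify-write of delay_drop/dynf_meta by the control thread can
     * clobber a concurrent cq_ci update by a data-path thread. */
    struct rxq_word_shared {
        uint32_t cq_ci:24;     /* advanced by the data-path thread */
        uint32_t delay_drop:1; /* toggled by the control thread at port start */
        uint32_t dynf_meta:1;  /* toggled by the control thread at port start */
        uint32_t reserved:6;
    };

    /* After: cq_ci gets its own word, so configuration updates and data-path
     * updates no longer touch the same storage unit. */
    struct rxq_word_split {
        uint32_t cq_ci;        /* data-path only */
        uint32_t delay_drop:1; /* control-path only */
        uint32_t dynf_meta:1;
        uint32_t reserved:30;
    };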
# 1944fbc3 | 05-Jun-2024 | Suanming Mou <suanmingm@nvidia.com>
net/mlx5: support flow match with external Tx queue
To use externally created Tx queues in RTE_FLOW_ITEM_TX_QUEUE, this commit provides the map and unmap functions to convert the externally created SQ's DevX ID to a DPDK flow item Tx queue ID.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
# 8e8b44f2 | 05-Jun-2024 | Suanming Mou <suanmingm@nvidia.com>
net/mlx5: rename external queue
Since external Tx queues will be supported, rename the current external Rx queue struct to a generic external queue struct so it can be reused.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
# e12a0166 | 14-May-2024 | Tyler Retzlaff <roretzla@linux.microsoft.com>
drivers: use stdatomic API
Replace the use of the GCC builtin __atomic_xxx intrinsics with the corresponding optional rte_atomic_xxx RTE stdatomic API.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
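
A small sketch of the conversion style, assuming the rte_stdatomic.h wrappers (RTE_ATOMIC, rte_atomic_fetch_add_explicit, rte_memory_order_relaxed); the counter itself is purely illustrative.

    #include <stdint.h>
    #include <rte_stdatomic.h>

    static RTE_ATOMIC(uint32_t) refcnt;

    static inline void
    refcnt_inc(void)
    {
        /* Before: __atomic_fetch_add(&refcnt, 1, __ATOMIC_RELAXED); */
        rte_atomic_fetch_add_explicit(&refcnt, 1, rte_memory_order_relaxed);
    }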
# 27595cd8 | 15-Apr-2024 | Tyler Retzlaff <roretzla@linux.microsoft.com>
drivers: move alignment attribute on types for MSVC
Move location of __rte_aligned(a) to new conventional location. The new placement between {struct,union} and the tag allows the desired alignment to be imparted on the type regardless of the toolchain being used for both C and C++. Additionally, it avoids confusion by Doxygen when generating documentation.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
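
A short sketch of the two placements with an illustrative (non-driver) type; __rte_aligned comes from rte_common.h.

    #include <stdint.h>
    #include <rte_common.h>

    /* Old placement, trailing the definition (not accepted by MSVC):
     *
     *   struct cache_line_blob {
     *       uint64_t words[8];
     *   } __rte_aligned(64);
     *
     * New conventional placement, between the struct keyword and the tag,
     * which imparts the alignment for both C and C++ on all toolchains: */
    struct __rte_aligned(64) cache_line_blob {
        uint64_t words[8];
    };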
# 86647d46 | 31-Oct-2023 | Thomas Monjalon <thomas@monjalon.net>
net/mlx5: add global API prefix to public constants
The file rte_pmd_mlx5.h is a public API, so its components must be prefixed with RTE_PMD_.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
# a1e910f5 | 05-Jul-2023 | Viacheslav Ovsiienko <viacheslavo@nvidia.com>
net/mlx5: introduce tracepoints
There is an intention to engage DPDK tracing capabilities for monitoring and profiling of the mlx5 PMD in various modes. The patch introduces tracepoints for the Tx datapath in the ethernet device driver.
To engage this tracing capability the following steps should be taken:
- meson option -Denable_trace_fp=true
- meson option -Dc_args='-DALLOW_EXPERIMENTAL_API'
- EAL command line parameter --trace=pmd.net.mlx5.tx.*
Tx datapath tracing allows getting information on how packets are pushed into hardware descriptors, as well as timestamps for scheduled wait and send completions, etc.
A dedicated post-processing script is intended to provide a human-readable form of the trace results.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 0e04e1e2 | 06-Jul-2023 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: support symmetric RSS hash function
This patch supports a symmetric hash function that creates the same hash result for bi-directional traffic with reversed source and destination IP addresses and L4 ports.
Since the hash algorithm differs from the spec (XOR), a warning is left in validation.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
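
A sketch of how an application could request a symmetric hash through the rte_flow RSS action. RTE_ETH_HASH_FUNCTION_SYMMETRIC_TOEPLITZ is the generic enum value; whether that is exactly the variant mlx5 accepts here is an assumption, and the queue list and RSS types are illustrative.

    #include <rte_ethdev.h>
    #include <rte_flow.h>

    static const uint16_t rss_queues[] = {0, 1, 2, 3};

    static const struct rte_flow_action_rss rss_conf = {
        .func = RTE_ETH_HASH_FUNCTION_SYMMETRIC_TOEPLITZ, /* symmetric hash */
        .types = RTE_ETH_RSS_IP | RTE_ETH_RSS_TCP,
        .queue_num = 4,
        .queue = rss_queues,
    };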
# fca8cba4 | 21-Jun-2023 | David Marchand <david.marchand@redhat.com>
ethdev: advertise flow restore in mbuf
As reported by Ilya [1], unconditionally calling rte_flow_get_restore_info() impacts application performance for drivers that do not provide this op. It could also impact the processing of packets that require no call to rte_flow_get_restore_info() at all.
Register a dynamic mbuf flag when an application negotiates tunnel metadata delivery (calling rte_eth_rx_metadata_negotiate() with RTE_ETH_RX_METADATA_TUNNEL_ID).
Drivers then advertise that metadata can be extracted by setting this dynamic flag in each mbuf.
The application then calls rte_flow_get_restore_info() only when required.
Link: http://inbox.dpdk.org/dev/5248c2ca-f2a6-3fb0-38b8-7f659bfa40de@ovn.org/
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
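
A sketch of the negotiated flow from the application side, assuming the rte_flow_restore_info_dynflag() helper introduced with this change; error handling is omitted.

    #include <rte_ethdev.h>
    #include <rte_flow.h>
    #include <rte_mbuf.h>

    static uint64_t restore_info_flag;

    static void
    setup_tunnel_metadata(uint16_t port_id)
    {
        uint64_t features = RTE_ETH_RX_METADATA_TUNNEL_ID;

        rte_eth_rx_metadata_negotiate(port_id, &features);
        restore_info_flag = rte_flow_restore_info_dynflag();
    }

    static void
    handle_packet(uint16_t port_id, struct rte_mbuf *m)
    {
        struct rte_flow_restore_info info;
        struct rte_flow_error error;

        /* Call the restore-info op only when the driver marked the mbuf. */
        if (m->ol_flags & restore_info_flag)
            rte_flow_get_restore_info(port_id, m, &info, &error);
    }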
# f9eb7a4b | 02-Mar-2023 | Tyler Retzlaff <roretzla@linux.microsoft.com>
use atomic intrinsics closer to C11
Use __atomic_fetch_{add,and,or,sub,xor} instead of __atomic_{add,and,or,sub,xor}_fetch when we have no interest in the result of the operation.
This change reduces unnecessary code that provided the result of the atomic operation while this result was not used.
It also brings us closer to the atomics available in the C11 standard and will reduce review effort when they are integrated.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
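
A tiny illustrative example of the substitution when the returned value is ignored (the counter is hypothetical):

    #include <stdint.h>

    static uint32_t refcnt;

    static inline void
    refcnt_inc(void)
    {
        /* Before: __atomic_add_fetch(&refcnt, 1, __ATOMIC_RELAXED); */
        __atomic_fetch_add(&refcnt, 1, __ATOMIC_RELAXED);
    }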
# fc3e1798 | 28-Feb-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: support enhanced CQE zipping in vector Rx burst
Add Enhanced CQE compression support to vectorized Rx burst routines. Adopt the same algorithm as the scalar Rx burst routines have today:
1. Retrieve the validity_iteration_count from CQEs and use it, instead of the owner_bit, to check whether the CQE is ready to be processed.
2. Do not invalidate reserved CQEs between miniCQE arrays.
3. Copy the title packet from the last processed uncompressed CQE, since it is needed later to build packets from zipped CQEs.
4. Skip the regular CQE processing and go straight to the CQE unzip function in case the very first CQE is compressed, to save CPU time.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 02a6195c | 28-Feb-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: support enhanced CQE compression in Rx burst
Enhanced CQE compression changes the structure of the compression block and the number of miniCQEs per miniCQE array. Adapt to these changes in the datapath by defining a new parsing mechanism of a miniCQE array:
1. The title CQE is no longer marked as the compressed one. It needs to be copied for future miniCQE array parsing.
2. MiniCQE arrays now consist of up to 7 miniCQEs and a control block. The control block contains the number of miniCQEs in the array as well as an indication that this CQE is compressed.
3. The invalidation of reserved CQEs between miniCQE arrays is not needed.
4. The owner_bit is replaced by the validity_iteration_count for all CQEs.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# aa67ed30 | 27-Jan-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: ignore non-critical syndromes for Rx queue
For non-fatal syndromes like LOCAL_LENGTH_ERR, the Rx queue reset shouldn't be triggered. The Rx queue can continue with the next packets without any recovery. Only three syndromes warrant an Rx queue reset: LOCAL_QP_OP_ERR, LOCAL_PROT_ERR and WR_FLUSH_ERR. Do not initiate an Rx queue reset in any other case. Skip all non-critical error CQEs and continue with packet processing.
Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 633684e0 | 27-Jan-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: fix error CQE dumping for vectorized Rx
There is a dump file with debug information created for an error CQE to help with troubleshooting later. It starts with the last CQE, which presumably is the error CQE. But this is only true for the scalar Rx burst routine, since CQEs are handled there one by one and the error is detected immediately. For vectorized Rx bursts, we may have already moved to another CQE when the error is detected, since CQEs are handled in batches there. Go back to the error CQE in this case to dump the proper CQE.
Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 5c9f3294 | 16-Jun-2022 | Spike Du <spiked@nvidia.com>
net/mlx5: support Rx descriptor threshold event
Add mlx5-specific available descriptor threshold configuration and query handlers. In the mlx5 PMD, the available descriptor threshold is also called LWM (limit watermark). When the Rx queue fullness reaches the LWM limit, the driver catches an HW event and invokes the user callback. The query handler finds the next Rx queue with a pending LWM event, if any, starting from the given Rx queue index.
Signed-off-by: Spike Du <spiked@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
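
A sketch of driving this from an application through the generic ethdev availability-threshold API; the function and event names (rte_eth_rx_avail_thresh_set/query, RTE_ETH_EVENT_RX_AVAIL_THRESH) are assumed from the ethdev side of this feature, and the 70% value is illustrative.

    #include <rte_common.h>
    #include <rte_ethdev.h>

    static int
    rx_avail_thresh_cb(uint16_t port_id, enum rte_eth_event_type type,
                       void *cb_arg, void *ret_param)
    {
        uint16_t queue_id = 0;
        uint8_t thresh = 0;

        RTE_SET_USED(type);
        RTE_SET_USED(cb_arg);
        RTE_SET_USED(ret_param);
        /* The PMD query handler finds the next Rx queue with a pending
         * LWM event, starting from the given queue index. */
        rte_eth_rx_avail_thresh_query(port_id, &queue_id, &thresh);
        return 0;
    }

    static void
    enable_rx_avail_thresh(uint16_t port_id, uint16_t queue_id)
    {
        rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_RX_AVAIL_THRESH,
                                      rx_avail_thresh_cb, NULL);
        rte_eth_rx_avail_thresh_set(port_id, queue_id, 70);
    }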
# 25025da3 | 16-Jun-2022 | Spike Du <spiked@nvidia.com>
net/mlx5: handle Rx descriptor LWM event
When the LWM is hit on an RQ WQE, the kernel driver raises an event to SW. Use a devx event_channel to catch it and notify the user. Allocate this channel per shared device. The channel has a cookie that identifies the specific event port and queue.
Signed-off-by: Spike Du <spiked@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 7158e46c | 16-Jun-2022 | Spike Du <spiked@nvidia.com>
net/mlx5: support descriptor LWM for Rx queue
Add an LWM (Limit WaterMark) field to the Rxq object, which indicates the percentage of the Rx queue size used by HW to raise a descriptor event to the user. Allow LWM setting in the modify_rq command. Allow dynamic LWM configuration by adding the RDY2RDY state change.
Signed-off-by: Spike Du <spiked@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 773a7de2 | 20-Apr-2022 | Raja Zidane <rzidane@nvidia.com>
net/mlx5: fix Rx/Tx stats concurrency
Queue statistics are being continuously updated in Rx/Tx burst routines while handling traffic. In addition to that, statistics can be reset (written with zeroes) on statistics reset in other threads, causing a race condition, which in turn could result in wrong stats.
The patch provides an approach with reference values, allowing the actual counters to be writable within Rx/Tx burst threads only, and updating reference values on stats reset.
Fixes: 87011737b715 ("mlx5: add software counters")
Cc: stable@dpdk.org
Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
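
A minimal model (hypothetical names) of the reference-value approach: only the datapath thread writes the live counter, and a reset merely snapshots it as the new baseline.

    #include <stdint.h>

    struct q_stats {
        uint64_t packets;     /* updated in the Rx/Tx burst thread only */
        uint64_t packets_ref; /* written on stats reset from another thread */
    };

    static inline uint64_t
    stats_read(const struct q_stats *s)
    {
        return s->packets - s->packets_ref;
    }

    static inline void
    stats_reset(struct q_stats *s)
    {
        /* Instead of zeroing the live counter (which races with the
         * datapath), remember its current value as the new baseline. */
        s->packets_ref = s->packets;
    }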
# 3a29cb3a | 11-Mar-2022 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: handle MPRQ incompatibility with external buffers
Multi-Packet Rx queue uses PMD-managed buffers to store packets. These buffers are externally attached to user mbufs. This conflicts with the feature that allows using user-managed externally attached buffers in an application. Fall back to SPRQ in case external buffers mempool is configured. The limitation is already documented in mlx5 guide.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 311b17e6 | 24-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: support queue/RSS actions for external Rx queue
Add support for the queue/RSS actions on external Rx queues. In indirection table creation, the queue index is taken from the mapping array.
This feature supports neither LRO nor Hairpin.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 80f872ee | 24-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: add external Rx queue mapping API
An external queue is a queue that has been created and is managed outside the PMD. The queue's owner might use the PMD to generate flow rules using these external queues.
When the queue is created in hardware, it is given an ID represented by 32 bits. In contrast, the index of the queues in the PMD is represented by 16 bits. To enable the use of the PMD to generate flow rules, the queue owner must provide a mapping between the HW index and a 16-bit index corresponding to the ethdev API.
This patch adds an API to insert and cancel a mapping between a HW queue ID and an ethdev queue ID.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
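
A sketch of registering an externally created RQ with the PMD, assuming the mapping helper and ID-range constant currently exported by rte_pmd_mlx5.h; the hardware object ID value is illustrative.

    #include <stdint.h>
    #include <rte_pmd_mlx5.h>

    static int
    map_external_rxq(uint16_t port_id, uint32_t hw_rq_id)
    {
        /* Pick an ethdev-side index inside the external Rx queue range. */
        uint16_t dpdk_idx = RTE_PMD_MLX5_EXTERNAL_RX_QUEUE_ID_MIN;

        return rte_pmd_mlx5_external_rx_queue_id_map(port_id, dpdk_idx,
                                                     hw_rq_id);
    }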
# c06f77ae | 24-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: optimize queue type checks
The RxQ/TxQ control structure has a field named type. This type is an enum with values for standard and hairpin. The field is used to check whether the queue is of the hairpin type or the standard type.
This patch replaces it with a boolean variable indicating whether the queue is a hairpin queue.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
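
A hypothetical before/after of the control-structure field (the real structure has many more members and different names):

    #include <stdbool.h>

    /* Before (sketch):
     *   enum rxq_type { RXQ_TYPE_STANDARD, RXQ_TYPE_HAIRPIN };
     *   struct rxq_ctrl { enum rxq_type type; };
     *
     * After: a single flag answers the only question the code asks. */
    struct rxq_ctrl {
        bool is_hairpin;
    };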