# 73f7ae1d | 06-Dec-2024 | Gavin Hu <gahu@nvidia.com>
net/mlx5: fix polling CQEs
In certain situations, the receive queue (rxq) fails to replenish its internal ring with memory buffers (mbufs) from the pool. This can happen when the pool has a limited number of mbufs allocated, and the user application holds incoming packets for an extended period, resulting in a delayed release of mbufs. Consequently, the pool becomes depleted, preventing the rxq from replenishing from it.
There was a bug in the behavior of the vectorized rxq_cq_process_v routine, which handled completion queue entries (CQEs) in batches of four. This routine consistently accessed four mbufs from the internal queue ring, regardless of whether they had been replenished. As a result, it could access mbufs that no longer belonged to the poll mode driver (PMD).
The fix involves checking if there are four replenished mbufs available before allowing rxq_cq_process_v to handle the batch. Once replenishment succeeds during the polling process, the routine will resume its operation.
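As an illustration of that guard, here is a minimal, self-contained C sketch; the names (rq_ci, rq_pi, DESCS_PER_LOOP) are stand-ins chosen for this example, not the driver's actual data layout.

#include <stdint.h>
#include <stdio.h>

#define DESCS_PER_LOOP 4 /* the vector routine consumes CQEs in fours */

struct rxq {
    uint32_t rq_ci; /* index up to which mbufs were replenished */
    uint32_t rq_pi; /* index of the next mbuf to hand out */
};

/* Let the batch routine run only when a full group of replenished
 * mbufs is available; otherwise stop and retry after replenishment. */
static int can_process_batch(const struct rxq *rxq)
{
    return rxq->rq_ci - rxq->rq_pi >= DESCS_PER_LOOP;
}

int main(void)
{
    struct rxq q = { .rq_ci = 2, .rq_pi = 0 }; /* pool depleted: 2 left */

    printf("process batch: %s\n", can_process_batch(&q) ? "yes" : "no");
    q.rq_ci += 4; /* replenishment succeeded during polling */
    printf("process batch: %s\n", can_process_batch(&q) ? "yes" : "no");
    return 0;
}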
Fixes: 1ded26239aa0 ("net/mlx5: refactor vectorized Rx")
Cc: stable@dpdk.org
Reported-by: Changqi Dingluo <dingluochangqi.ck@bytedance.com>
Signed-off-by: Gavin Hu <gahu@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 90ec9b0d | 01-Nov-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: replenish MPRQ buffers for miniCQEs
Keep unzipping if the next CQE is a miniCQE array in the rxq_cq_decompress_v() routine only for the non-MPRQ scenario; MPRQ requires buffer replenishment between the miniCQEs.
Restore the check for the initial compressed CQE for SPRQ and check that the current CQE is not compressed before copying it as a possible title CQE.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
# fc3e1798 | 28-Feb-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: support enhanced CQE zipping in vector Rx burst
Add Enhanced CQE compression support to the vectorized Rx burst routines, adopting the same algorithm the scalar Rx burst routines use today:
1. Retrieve the validity_iteration_count from CQEs and use it, instead of the owner_bit, to check whether a CQE is ready to be processed.
2. Do not invalidate reserved CQEs between miniCQE arrays.
3. Copy the title packet from the last processed uncompressed CQE, since it is needed later to build packets from zipped CQEs.
4. Skip the regular CQE processing and go straight to the CQE unzip function in case the very first CQE is compressed, to save CPU time.
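A hedged sketch of the readiness test in step 1; the field and constant names here are invented for illustration, while the PRM-accurate CQE layout lives elsewhere in the driver.

#include <stdint.h>
#include <stdio.h>

struct cqe {
    uint8_t validity_iteration_count; /* bumped on each CQ ring wrap */
};

/* a CQE is ready when its iteration count matches the count expected
 * for the current pass over the completion queue ring */
static int cqe_ready(const struct cqe *cqe, uint8_t expected_vic)
{
    return cqe->validity_iteration_count == expected_vic;
}

int main(void)
{
    struct cqe c = { .validity_iteration_count = 3 };

    printf("%d\n", cqe_ready(&c, 3)); /* 1: ready on this pass */
    printf("%d\n", cqe_ready(&c, 4)); /* 0: stale, wait for hardware */
    return 0;
}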
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# aa67ed30 | 27-Jan-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: ignore non-critical syndromes for Rx queue
For non-fatal syndromes like LOCAL_LENGTH_ERR, the Rx queue reset shouldn't be triggered; the Rx queue can continue with the next packets without any recovery. Only three syndromes warrant an Rx queue reset: LOCAL_QP_OP_ERR, LOCAL_PROT_ERR and WR_FLUSH_ERR. Do not initiate an Rx queue reset in any other case. Skip all non-critical error CQEs and continue with packet processing.
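The triage rule can be pictured with the following sketch; the syndrome names mirror the ones above, but the enum values are illustrative rather than the PRM-defined codes.

#include <stdio.h>

enum cqe_syndrome {
    SYND_LOCAL_LENGTH_ERR,
    SYND_LOCAL_QP_OP_ERR,
    SYND_LOCAL_PROT_ERR,
    SYND_WR_FLUSH_ERR,
};

/* only three syndromes are critical enough to reset the Rx queue;
 * everything else is skipped and packet processing continues */
static int needs_queue_reset(enum cqe_syndrome s)
{
    switch (s) {
    case SYND_LOCAL_QP_OP_ERR:
    case SYND_LOCAL_PROT_ERR:
    case SYND_WR_FLUSH_ERR:
        return 1;
    default:
        return 0;
    }
}

int main(void)
{
    printf("%d\n", needs_queue_reset(SYND_LOCAL_LENGTH_ERR)); /* 0: skip */
    printf("%d\n", needs_queue_reset(SYND_WR_FLUSH_ERR));     /* 1: reset */
    return 0;
}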
Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 633684e0 | 27-Jan-2023 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: fix error CQE dumping for vectorized Rx
There is a dump file with debug information created for an error CQE to help with troubleshooting later. It starts with the last CQE, which, presumably, is the error CQE. But this is only true for the scalar Rx burst routine, since CQEs are handled there one by one and the error is detected immediately. For vectorized Rx bursts, we may have already moved past the error CQE by the time we detect it, since CQEs are handled there in batches. Go back to the error CQE in this case to dump the proper CQE.
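A minimal sketch of the rewind, assuming an invented ring layout and an "advanced" count that records how far the batch loop moved past the failing entry.

#include <stdint.h>
#include <stdio.h>

#define CQ_MASK 255 /* illustrative: a completion ring of 256 entries */

struct cqe { int is_error; };

/* the vector loop may have advanced cq_ci past the failing entry;
 * step back before dumping so the dump shows the error CQE itself */
static const struct cqe *error_cqe(const struct cqe *ring,
                                   uint32_t cq_ci, uint32_t advanced)
{
    return &ring[(cq_ci - advanced) & CQ_MASK];
}

int main(void)
{
    static struct cqe ring[CQ_MASK + 1];

    ring[5].is_error = 1;
    /* batch handling moved cq_ci to 8 before the error was noticed */
    printf("dumping the error CQE? %d\n",
           error_cqe(ring, 8, 3)->is_error); /* prints 1 */
    return 0;
}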
Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 0947ed38 | 23-Nov-2021 | Michael Baum <michaelba@nvidia.com>
net/mlx5: improve stride parameter names
In the striding RQ management there are two important parameters, the size of the single stride in bytes and the number of strides.
Both the data-path structure and the config structure keep the log (base 2) of the above parameters. However, their names do not mention that the value is a log, which may mislead a reader into thinking the fields hold the values themselves.
This patch updates their names describing the values more accurately.
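An illustrative before/after, assuming hypothetical field names; the point is that the stored value is log2 of the parameter, and the new name says so.

#include <stdint.h>
#include <stdio.h>

struct rxq_before {
    uint32_t strd_num_n; /* misleading: actually log2(number of strides) */
    uint32_t strd_sz_n;  /* misleading: actually log2(stride size) */
};

struct rxq_after {
    uint32_t log_strd_num; /* log2(number of strides), stated in the name */
    uint32_t log_strd_sz;  /* log2(stride size in bytes) */
};

int main(void)
{
    struct rxq_after q = { .log_strd_num = 6, .log_strd_sz = 11 };

    /* recover the real values by shifting */
    printf("strides: %u of %u bytes\n",
           1u << q.log_strd_num, 1u << q.log_strd_sz); /* 64 of 2048 */
    return 0;
}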
Fixes: ecb160456aed ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 5cf0707f | 04-Nov-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: remove Rx queue data list from device
The Rx queue data list (priv->rxqs) can be replaced by the Rx queue list (priv->rxq_privs), so remove it and replace accesses with a universal wrapper API.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 5db77fef | 04-Nov-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: remove port info from shareable Rx queue
To prepare for shared Rx queues, remove the port info from the shareable Rx queue control structure.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 828274b7 | 04-Aug-2021 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: fix mbuf replenishment check for zipped CQE
A core dump is being generated with the following call stack:
 0 _mm256_storeu_si256 (__A=..., __P=0x80)
 1 rte_mov32 (src=0x2299c9140 "", dst=0x80)
 2 rte_memcpy_aligned (n=60, src=0x2299c9140, dst=0x80)
 3 rte_memcpy (n=60, src=0x2299c9140, dst=0x80)
 4 mprq_buf_to_pkt (strd_cnt=1, strd_idx=0, buf=0x2299c8a00, len=60, pkt=0x18345f0c0, rxq=0x18345ef40)
 5 rxq_copy_mprq_mbuf_v (rxq=0x18345ef40, pkts=0x7f76e0ff6d18, pkts_n=5)
 6 rxq_burst_mprq_v (rxq=0x18345ef40, pkts=0x7f76e0ff6d18, pkts_n=46, err=0x7f76e0ff6a28, no_cq=0x7f76e0ff6a27)
 7 mlx5_rx_burst_mprq_vec (dpdk_rxq=0x18345ef40, pkts=0x7f76e0ff6a88, pkts_n=128)
 8 rte_eth_rx_burst (nb_pkts=128, rx_pkts=0x7f76e0ff6a88, queue_id=<optimized out>, port_id=<optimized out>)
This crash is caused by an attempt to copy previously uncompressed CQEs into non-allocated mbufs. There is a check to make sure we only use allocated mbufs in the rxq_burst_mprq_v() function, but it is done only before the main processing loop. Leftovers of a compressed CQE session are handled before that loop and may lead to the mbuf overflow seen above.
Move the check for replenished mbufs up to protect uncompressed CQEs session leftovers from accessing non-allocated mbufs after the mlx5_rx_mprq_replenish_bulk_mbuf() function is invoked.
Bugzilla ID: 746
Fixes: 0f20acbf5eda ("net/mlx5: implement vectorized MPRQ burst")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# acc87479 | 13-Jul-2021 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: fix threshold for mbuf replenishment in MPRQ
The replenishment scheme for the vectorized MPRQ Rx burst aims to improve the cache locality by allocating new mbufs only when there are almost no mbufs left: one burst gap between allocated and consumed indexes.
This gap is not big enough to accommodate a corner case when we have a very aggressive CQE compression with multiple regular CQEs at the beginning and 64 zipped CQEs at the end.
Keep this corner case in mind and extend the replenishment threshold by MLX5_VPMD_RX_MAX_BURST (64) to avoid mbuf overflow.
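Worked numbers for that corner case, with an assumed one-burst gap of 32 mbufs; only MLX5_VPMD_RX_MAX_BURST (64) is taken from the message above.

#include <stdio.h>

#define BURST 32             /* assumed one-burst replenish gap */
#define VPMD_RX_MAX_BURST 64 /* zipped CQEs a single session may expand to */

int main(void)
{
    int old_thresh = BURST;
    /* an aggressive compression session can expand into 64 packets at
     * once, overrunning a one-burst gap; widen the threshold instead */
    int new_thresh = BURST + VPMD_RX_MAX_BURST;

    printf("old gap: %d mbufs, new gap: %d mbufs\n",
           old_thresh, new_thresh);
    return 0;
}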
Fixes: 5fc2e5c27d6 ("net/mlx5: fix mbuf overflow in vectorized MPRQ")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 1db288f9 | 07-Jul-2021 | Ruifeng Wang <ruifeng.wang@arm.com>
net/mlx5: reduce unnecessary memory access in Rx
The MR btree length is constant during Rx replenish. Move the retrieval of the value out of the loop to reduce data loads. A slight performance uplift was measured on both N1SDP and x86.
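A generic illustration of this loop-invariant hoist; the struct and field names are stand-ins, not the PMD's MR cache. Without the hoist, the load may be repeated on every iteration, e.g. when the compiler cannot prove the field is unchanged.

#include <stdint.h>
#include <stdio.h>

struct btree { uint16_t len; /* constant while replenishing */ };

/* before: bt->len is loaded from memory on every iteration */
static uint64_t lookup_all_before(const struct btree *bt,
                                  const uint64_t *keys, int n)
{
    uint64_t acc = 0;

    for (int i = 0; i < n; i++)
        acc += keys[i] % bt->len;
    return acc;
}

/* after: read the invariant once, keep it in a register */
static uint64_t lookup_all_after(const struct btree *bt,
                                 const uint64_t *keys, int n)
{
    const uint16_t len = bt->len;
    uint64_t acc = 0;

    for (int i = 0; i < n; i++)
        acc += keys[i] % len;
    return acc;
}

int main(void)
{
    struct btree bt = { .len = 7 };
    uint64_t keys[] = { 10, 20, 30 };

    /* both produce the same result; only the memory traffic differs */
    printf("%llu %llu\n",
           (unsigned long long)lookup_all_before(&bt, keys, 3),
           (unsigned long long)lookup_all_after(&bt, keys, 3));
    return 0;
}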
Suggested-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 151cbe3a | 12-Apr-2021 | Michael Baum <michaelba@nvidia.com>
net/mlx5: separate Rx function declarations to another file
The mlx5_rxtx.c file contains a lot of Tx burst functions, each performance-optimized for a specific set of requested offloads. These are generated on the basis of a template function, and they take significant time to compile simply due to the large number of giant functions generated in the same file; compilation of a single file cannot be parallelized with multithreading.
Therefore we can split the mlx5_rxtx.c file into several separate files to allow different functions to be compiled simultaneously. In this patch, we separate the Rx function declarations into a different header file in preparation for removing them from the source file, and as an optional preparation step for further consolidation of Rx burst functions.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 5fc2e5c2 | 21-Nov-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: fix mbuf overflow in vectorized MPRQ
Changing the allocation scheme to improve mbuf locality caused mbuf overruns in some cases. Revert to the previous replenish logic. Calculate the number of unused mbufs and replenish at most this number of mbufs.
Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed mbufs in the future. Keep the consumed index and the produced index 4 mbufs apart for this purpose.
Replenish some mbufs only in case the consumed index is within the replenish threshold of the produced index in order to retain the cache locality for the vectorized MPRQ routine.
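A sketch of the guard band described above, with invented sizes and names: the tail ring slots point at a shared fake mbuf, so a four-wide vector store past the produced index cannot touch mbufs the application still owns.

#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 16
#define GUARD 4 /* keep produced and consumed indices this far apart */

struct mbuf { int fake; };

int main(void)
{
    static struct mbuf pool[RING_SIZE];
    static struct mbuf fake_mbuf = { .fake = 1 };
    struct mbuf *ring[RING_SIZE + GUARD];
    uint32_t produced = 12, consumed = 4;

    for (int i = 0; i < RING_SIZE; i++)
        ring[i] = &pool[i];
    /* the tail slots are decoys: a four-wide vector store may write
     * through them, but never into mbufs the application still owns */
    for (int i = 0; i < GUARD; i++)
        ring[RING_SIZE + i] = &fake_mbuf;

    uint32_t in_use = produced - consumed; /* 8 mbufs still outstanding */
    printf("may replenish at most %u mbufs\n", RING_SIZE - in_use);
    printf("slot %d points at a fake mbuf: %d\n",
           RING_SIZE, ring[RING_SIZE]->fake);
    return 0;
}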
Fixes: 5c68764377 ("net/mlx5: improve vectorized MPRQ descriptors locality")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 5c687643 | 08-Nov-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: improve vectorized MPRQ descriptors locality
There is a performance penalty for the replenish scheme used in the vectorized Rx burst for both MPRQ and SPRQ. Mbuf elements are being filled at the end of the mbufs array and replenished at the beginning. That leads to an increase in cache misses and a performance drop. The more Rx descriptors are used, the worse the situation gets.
Change the allocation scheme for vectorized MPRQ Rx burst: allocate new mbufs only when consumed mbufs are almost depleted (always have one burst gap between allocated and consumed indices). Keeping a small number of mbufs allocated improves cache locality and improves performance a lot.
Unfortunately, this approach cannot be applied to SPRQ Rx burst routine. In MPRQ Rx burst we simply copy packets from external MPRQ buffers or attach these buffers to mbufs. In SPRQ Rx burst we allow the NIC to fill mbufs for us. Hence keeping a small number of allocated mbufs will limit NIC ability to fill as many buffers as possible. This fact offsets the advantage of better cache locality.
Fixes: 0f20acbf5eda ("net/mlx5: implement vectorized MPRQ burst")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 0f20acbf | 21-Oct-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: implement vectorized MPRQ burst
MPRQ (Multi-Packet Rx Queue) processes one packet at a time using simple scalar instructions. MPRQ works by posting a single large buffer (consisting of multiple fixed-size strides) in order to receive multiple packets at once on this buffer. An Rx packet is then copied into a user-provided mbuf, or the PMD attaches the Rx packet to the mbuf via a pointer to an external buffer.
There is an opportunity to speed up the packet receiving by processing 4 packets simultaneously using SIMD (single instruction, multiple data) extensions. Allocate mbufs in batches for every MPRQ buffer and process the packets in groups of 4 until all the strides are exhausted. Then switch to another MPRQ buffer and repeat the process over again.
The vectorized MPRQ burst routine is engaged automatically in case the mprq_en=1 devarg is specified and the vectorization is not disabled explicitly by providing rx_vec_en=0 devarg. There is a limitation: LRO is not supported and scalar MPRQ is selected if it is on.
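The shape of the stride walk can be sketched as follows, with invented counts (64 strides per buffer): packets are built four at a time from consecutive strides, then processing moves to the next MPRQ buffer.

#include <stdio.h>

#define STRIDES_PER_BUF 64 /* assumed stride count per MPRQ buffer */
#define PKTS_PER_LOOP 4    /* packets built per SIMD iteration */

int main(void)
{
    int bufs = 2, buf_idx = 0, strd_idx = 0, rx = 0;

    while (buf_idx < bufs) {
        /* four packets per pass, from consecutive strides */
        rx += PKTS_PER_LOOP;
        strd_idx += PKTS_PER_LOOP;
        if (strd_idx >= STRIDES_PER_BUF) { /* buffer exhausted */
            strd_idx = 0;
            buf_idx++; /* switch to the next MPRQ buffer */
        }
    }
    printf("received %d packets from %d MPRQ buffers\n", rx, bufs);
    return 0;
}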
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 1ded2623 | 21-Oct-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: refactor vectorized Rx
Move the main processing cycle into a separate function: rxq_cq_process_v. Put the regular rxq_burst_v function into a non-arch-specific file. Having all SIMD instructions in a single reusable block is a first preparatory step toward implementing vectorized Rx burst for the MPRQ feature.
Pass a pointer to the storage of mbufs directly to rxq_copy_mbuf_v instead of calculating the pointer inside this function. This is needed for the future vectorized Rx routine, which is going to pass a different pointer here.
Calculate the number of packets to replenish inside the mlx5_rx_replenish_bulk_mbuf. Containing this logic in one place allows us to do the same for MPRQ case.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 2c5e0dd2 | 19-Oct-2020 | Ciara Power <ciara.power@intel.com>
net/mlx5: check max SIMD bitwidth
When choosing a vector path to take, an extra condition must be satisfied to ensure the max SIMD bitwidth allows for the CPU enabled path.
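A stand-alone sketch of the extra gate; in DPDK the configured limit would come from the max-SIMD-bitwidth EAL setting, which is stubbed out here.

#include <stdint.h>
#include <stdio.h>

#define SIMD_256 256 /* bit width the vector Rx path requires */

/* stub: in DPDK this would be the configured max SIMD bitwidth */
static uint16_t max_simd_bitwidth(void) { return 256; }

/* stub for the CPU feature check the driver already performs */
static int cpu_supports_vec(void) { return 1; }

/* both conditions must hold: hardware capability and the configured
 * max SIMD bitwidth must admit the vector path */
static int rx_vec_allowed(void)
{
    return cpu_supports_vec() && max_simd_bitwidth() >= SIMD_256;
}

int main(void)
{
    printf("Rx path: %s\n", rx_vec_allowed() ? "vectorized" : "scalar");
    return 0;
}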
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 9d60f545 | 19-Jul-2020 | Ophir Munk <ophirmu@mellanox.com>
common/mlx5: remove inclusion of Verbs header files
Several source files include Verbs header files as in (1). These source files will not compile under non-Linux operating systems. This commit removes this inclusion in two cases:
Case 1: There is no usage of ibv_* or mlx5dv_* symbols in the source file so the inclusion in (1) can be safely removed.
Case 2: Verbs symbols are used. Please note the inclusion in (1) already appears in file linux/mlx5_glue.h (which represents the interface to the rdma-core library). Therefore, replace (1) in the source file with (2). Under non-Linux operating systems - file mlx5_glue.h will not include (1).
(1) #include <infiniband/verbs.h>
    #include <infiniband/mlx5dv.h>
(2) #include <mlx5_glue.h>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
# 0f006468 | 24-Jun-2020 | Michael Baum <michaelba@mellanox.com>
net/mlx5: fix iterator type in Rx queue management
The mlx5_check_vec_rx_support function in the mlx5_rxtx_vec.c file iterates over the Rx queues array in a loop. Similarly, the mlx5_mprq_enabled function in the mlx5_rxq.c file iterates over the Rx queues array in a loop.
In both cases, the loop iterator is called i and the variable representing the array size is called rxqs_n. The i variable is of type uint16_t while the rxqs_n variable is of type unsigned int. The range of rxqs_n is much larger than the number of iterations the type of i allows; theoretically, the value of rxqs_n may be greater than can be represented by 16 bits, in which case the loop will never end.
Change the type of i to uint32_t.
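A self-contained demonstration of the wrap-around (not driver code): with n greater than 65535, a uint16_t counter never reaches n.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    unsigned int rxqs_n = 70000; /* exceeds UINT16_MAX (65535) */
    uint32_t steps = 0;

    for (uint16_t i = 0; i < rxqs_n; ++i) { /* i wraps 65535 -> 0 */
        if (++steps > 200000) { /* safety valve for this demo */
            puts("uint16_t counter: loop would never end");
            break;
        }
    }

    for (uint32_t i = 0; i < rxqs_n; ++i)
        ; /* uint32_t counter: ends after 70000 iterations */
    puts("uint32_t counter: loop ended normally");
    return 0;
}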
Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
# c9cc554b | 02-Jun-2020 | Alexander Kozyrev <akozyrev@mellanox.com>
net/mlx5: fix vectorized Rx burst termination
The maximum burst size of the vectorized Rx burst routine is set to MLX5_VPMD_RX_MAX_BURST (64). This limits the performance of any application that would like to gather more than 64 packets from a single Rx burst for batch processing (i.e. VPP).
The situation gets worse with a mix of zipped and unzipped CQEs. They are processed separately, and the Rx burst function returns a small number of packets on every call.
Repeat the cycle of gathering packets from the vectorized Rx routine until the requested number of packets is collected or there are no more CQEs left to process.
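The outer gather loop has roughly this shape; the inner routine is stubbed here to return at most 64 packets per call, as MLX5_VPMD_RX_MAX_BURST would bound it.

#include <stdint.h>
#include <stdio.h>

#define VPMD_RX_MAX_BURST 64

/* stub inner routine: pretends 150 packets are queued in total */
static uint16_t rxq_burst_v(uint16_t max, int *no_cq)
{
    static int backlog = 150;
    uint16_t n = backlog < max ? (uint16_t)backlog : max;

    backlog -= n;
    *no_cq = (backlog == 0);
    return n;
}

int main(void)
{
    uint16_t pkts_n = 128, rx = 0; /* application asked for 128 */
    int no_cq = 0;

    do {
        uint16_t budget = pkts_n - rx;

        if (budget > VPMD_RX_MAX_BURST)
            budget = VPMD_RX_MAX_BURST;
        rx += rxq_burst_v(budget, &no_cq);
    } while (rx < pkts_n && !no_cq);

    printf("returned %u packets in one burst call\n", rx);
    return 0;
}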
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
# ce6427dd | 09-Feb-2020 | Thomas Monjalon <thomas@monjalon.net>
replace cold attributes
The new macro __rte_cold, for compiler hinting, is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
# 8e46d4e1 | 30-Jan-2020 | Alexander Kozyrev <akozyrev@mellanox.com>
common/mlx5: improve assert control
Use the MLX5_ASSERT macros instead of the standard assert clause. The definition depends on the RTE_LIBRTE_MLX5_DEBUG configuration option. If RTE_LIBRTE_MLX5_DEBUG is enabled, MLX5_ASSERT is equal to RTE_VERIFY, bypassing the global CONFIG_RTE_ENABLE_ASSERT option. If RTE_LIBRTE_MLX5_DEBUG is disabled, the global CONFIG_RTE_ENABLE_ASSERT can still make the assert active, since RTE_ASSERT calls RTE_VERIFY when enabled.
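A minimal sketch of the selection logic, with stub macros standing in for RTE_VERIFY and RTE_ASSERT; compile with -DLIBRTE_MLX5_DEBUG and/or -DENABLE_ASSERT to exercise the branches.

#include <stdio.h>
#include <stdlib.h>

/* stand-ins: VERIFY always checks; ASSERT checks only when the global
 * assert option is compiled in */
#define STUB_VERIFY(exp) \
    do { if (!(exp)) { puts("assert failed"); abort(); } } while (0)
#ifdef ENABLE_ASSERT
#define STUB_ASSERT(exp) STUB_VERIFY(exp)
#else
#define STUB_ASSERT(exp) do { } while (0)
#endif

#ifdef LIBRTE_MLX5_DEBUG
#define MLX5_ASSERT(exp) STUB_VERIFY(exp) /* debug build: always active */
#else
#define MLX5_ASSERT(exp) STUB_ASSERT(exp) /* follows the global option */
#endif

int main(void)
{
    MLX5_ASSERT(2 + 2 == 4); /* passes, or compiles away entirely */
    puts("ok");
    return 0;
}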
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
# 7b4f1e6b | 29-Jan-2020 | Matan Azrad <matan@mellanox.com>
common/mlx5: introduce common library
A new Mellanox vdpa PMD will be added to support vdpa operations by Mellanox adapters.
This vdpa PMD design includes mlx5_glue and mlx5_devx operations and large parts of them are shared with the net/mlx5 PMD.
Create a new common library in drivers/common for mlx5 PMDs. Move mlx5_glue, mlx5_devx_cmds and their dependencies to the new mlx5 common library in drivers/common.
The files mlx5_devx_cmds.c, mlx5_devx_cmds.h, mlx5_glue.c, mlx5_glue.h and mlx5_prm.h are moved as is from drivers/net/mlx5 to drivers/common/mlx5.
Share the log mechanism macros. Also separate the log mechanism to allow different log level control for the common library.
Build files and version files are adjusted accordingly. Include lines are adjusted accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
# 2e542da7 | 16-Aug-2019 | David Christensen <drc@linux.vnet.ibm.com>
net/mlx5: add Altivec Rx
Added mlx5_rxtx_vec_altivec.h, which supports vectorized Rx using Altivec vector code. Modified associated build files to use the new code.
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Tested-by: Raslan Darawsheh <rasland@mellanox.com>
# 17ed314c | 29-Jul-2019 | Matan Azrad <matan@mellanox.com>
net/mlx5: allow LRO per Rx queue
Enabling LRO offload per queue makes sense because the user will probably want to allocate a different mempool for LRO queues - the LRO mempool mbuf size may be bigger than that of a non-LRO mempool.
Change the LRO offload to be per queue instead of per port.
If one of the queues is with LRO enabled, all the queues will be configured via DevX.
If RSS flows direct TCP packets to queues with different LRO enabling, these flows will not be offloaded with LRO.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>