# 295968d1 | 22-Oct-2021 | Ferruh Yigit <ferruh.yigit@intel.com>
ethdev: add namespace
Add the 'RTE_ETH' namespace to all enums and macros in a backward-compatible way. The macros kept for backward compatibility can be removed in the next LTS. Also update some struct names to have the 'rte_eth' prefix.
All internal components switched to using new names.
Syntax fixed on lines that this patch touches.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
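For illustration, a minimal sketch of the backward-compatible aliasing pattern this commit describes. RTE_ETH_LINK_DOWN/UP and their old counterparts are real rte_ethdev.h names, but the exact alias form shown here is an assumption, not copied from the patch:

```c
/* New namespaced macros become the canonical names. */
#define RTE_ETH_LINK_DOWN 0
#define RTE_ETH_LINK_UP   1

/* Backward-compatibility aliases, removable in the next LTS. */
#define ETH_LINK_DOWN RTE_ETH_LINK_DOWN
#define ETH_LINK_UP   RTE_ETH_LINK_UP
```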
# 9f1d636f | 19-Oct-2021 | Michael Baum <michaelba@nvidia.com>
common/mlx5: share MR management
Add a global shared MR cache as a field of the common device structure. Move MR management to use this global cache for all drivers.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 0f20acbf | 21-Oct-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: implement vectorized MPRQ burst
MPRQ (Multi-Packet Rx Queue) processes one packet at a time using simple scalar instructions. MPRQ works by posting a single large buffer (consisting of multiple fixed-size strides) in order to receive multiple packets at once on this buffer. An Rx packet is then either copied to a user-provided mbuf, or the PMD attaches the Rx packet to the mbuf via a pointer to an external buffer.
There is an opportunity to speed up packet reception by processing 4 packets simultaneously using SIMD (single instruction, multiple data) extensions. Allocate mbufs in batches for every MPRQ buffer and process the packets in groups of 4 until all the strides are exhausted. Then switch to another MPRQ buffer and repeat the process all over again.
The vectorized MPRQ burst routine is engaged automatically when the mprq_en=1 devarg is specified and vectorization is not disabled explicitly with the rx_vec_en=0 devarg. One limitation: LRO is not supported, and the scalar MPRQ routine is selected when it is enabled.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
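A schematic sketch of the batching idea described above, assuming hypothetical helper and loop structure (this is not the driver's actual code; only rte_pktmbuf_alloc_bulk() is a real API):

```c
#include <rte_mbuf.h>

/* Hypothetical outline: allocate mbufs in one batch per MPRQ buffer,
 * then consume its strides four at a time so each group can be handled
 * with SIMD loads/stores, with a scalar tail for the remainder.
 */
static uint16_t
mprq_burst_outline(struct rte_mempool *mp, struct rte_mbuf **pkts,
		   uint16_t n_strides)
{
	uint16_t i;

	if (rte_pktmbuf_alloc_bulk(mp, pkts, n_strides) != 0)
		return 0; /* not enough mbufs to cover the MPRQ buffer */
	for (i = 0; i + 4 <= n_strides; i += 4) {
		/* process 4 strides/packets with vector instructions */
	}
	for (; i < n_strides; i++) {
		/* scalar processing of the remaining strides */
	}
	return n_strides;
}
```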
# 1ded2623 | 21-Oct-2020 | Alexander Kozyrev <akozyrev@nvidia.com>
net/mlx5: refactor vectorized Rx
Move the main processing cycle into a separate function: rxq_cq_process_v. Move the regular rxq_burst_v function to a non-arch-specific file. Having all SIMD instructions in a single reusable block is a first preparatory step towards implementing a vectorized Rx burst for the MPRQ feature.
Pass a pointer to the storage of mbufs directly to rxq_copy_mbuf_v instead of calculating the pointer inside this function. This is needed for the future vectorized Rx routine, which is going to pass a different pointer here.
Calculate the number of packets to replenish inside mlx5_rx_replenish_bulk_mbuf. Keeping this logic in one place allows us to do the same for the MPRQ case.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# f0f5d844 | 23-Sep-2020 | Phil Yang <phil.yang@arm.com>
eal: remove deprecated coherent IO memory barriers
The 20.08 release deprecated the rte_cio_*mb APIs because they provide the same functionality as the rte_io_*mb APIs on all platforms, so remove them and use rte_io_*mb instead.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
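A minimal migration sketch, assuming a hypothetical driver doorbell helper; rte_io_wmb() is the real replacement API named by the commit:

```c
#include <stdint.h>
#include <rte_atomic.h>

/* Hypothetical doorbell write: the deprecated rte_cio_wmb() call is
 * replaced one-for-one by rte_io_wmb() with identical semantics.
 */
static inline void
ring_doorbell(volatile uint64_t *db, uint64_t val)
{
	/* was: rte_cio_wmb(); */
	rte_io_wmb(); /* order descriptor stores before the doorbell write */
	*db = val;
}
```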
# b8dc6b0e | 13-Apr-2020 | Vu Pham <vuhuong@mellanox.com>
common/mlx5: refactor memory management
Refactor the common memory B-tree and cache management into the common driver. Replace some input parameters of the MR APIs with more common data structures such as PD, port_id, share_cache, ... so that multiple PMD drivers can use those MR APIs.
Modify the mlx5 net PMD to use the MR management APIs from the common driver.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
# 8e46d4e1 | 30-Jan-2020 | Alexander Kozyrev <akozyrev@mellanox.com>
common/mlx5: improve assert control
Use the MLX5_ASSERT macros instead of the standard assert() clause. Their definition depends on the RTE_LIBRTE_MLX5_DEBUG configuration option: if RTE_LIBRTE_MLX5_DEBUG is enabled, MLX5_ASSERT is equal to RTE_VERIFY, bypassing the global CONFIG_RTE_ENABLE_ASSERT option; if RTE_LIBRTE_MLX5_DEBUG is disabled, the global CONFIG_RTE_ENABLE_ASSERT can still make the assert active, since RTE_VERIFY is called inside RTE_ASSERT.
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
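A plausible shape of the macro, inferred from the description above rather than copied from the patch (RTE_VERIFY and RTE_ASSERT are real rte_debug.h macros):

```c
#include <rte_debug.h>

/* Inferred from the commit message: with debug enabled, the check is
 * unconditional (RTE_VERIFY); otherwise it follows the global
 * RTE_ENABLE_ASSERT setting through RTE_ASSERT.
 */
#ifdef RTE_LIBRTE_MLX5_DEBUG
#define MLX5_ASSERT(exp) RTE_VERIFY(exp)
#else
#define MLX5_ASSERT(exp) RTE_ASSERT(exp)
#endif
```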
# 7b4f1e6b | 29-Jan-2020 | Matan Azrad <matan@mellanox.com>
common/mlx5: introduce common library
A new Mellanox vdpa PMD will be added to support vdpa operations by Mellanox adapters.
This vdpa PMD design includes the mlx5_glue and mlx5_devx operations, and large parts of them are shared with the net/mlx5 PMD.
Create a new common library in drivers/common for mlx5 PMDs. Move mlx5_glue, mlx5_devx_cmds and their dependencies to the new mlx5 common library in drivers/common.
The files mlx5_devx_cmds.c, mlx5_devx_cmds.h, mlx5_glue.c, mlx5_glue.h and mlx5_prm.h are moved as is from drivers/net/mlx5 to drivers/common/mlx5.
Share the log mechanism macros. Also separate the log mechanism to allow a different log level control for the common library.
Build files, version files, and include lines are adjusted accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
# bdb8e5b1 | 20-Jan-2020 | Viacheslav Ovsiienko <viacheslavo@mellanox.com>
net/mlx5: allow allocated mbuf with external buffer
In the Rx datapath, the flags in the newly allocated mbufs are all explicitly cleared, but the EXT_ATTACHED_MBUF flag must be preserved. This allows the use of mbuf pools with pre-attached external data buffers.
The vectorized rx_burst routines are updated to inherit EXT_ATTACHED_MBUF from the mbuf pool's private RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF flag.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
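A one-line sketch of the flag-preserving reset this implies; the helper is illustrative, not the driver's code, while EXT_ATTACHED_MBUF is the real ol_flags bit of that era:

```c
#include <rte_mbuf.h>

/* Illustrative: clear all ol_flags except EXT_ATTACHED_MBUF so a
 * pre-attached external buffer stays attached across replenishment.
 */
static inline void
mbuf_reset_keep_extbuf(struct rte_mbuf *m)
{
	m->ol_flags &= EXT_ATTACHED_MBUF;
}
```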
# 9bf26e13 | 05-Nov-2019 | Viacheslav Ovsiienko <viacheslavo@mellanox.com>
ethdev: move egress metadata to dynamic field
The dynamic mbuf fields were introduced by [1]. The egress metadata is a good candidate to be moved from the statically allocated tx_metadata field to a dynamic one. Because mbufs are used in a half-duplex fashion only, it is safe to share this dynamic field with the ingress metadata.
The shared dynamic field contains either egress metadata (if the application is going to transmit the mbuf with tx_burst) or ingress metadata (if the mbuf was received with rx_burst) and can be accessed via the RTE_FLOW_DYNF_METADATA() macro or with the rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. The PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag is set along with the data.
The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to accessing the data.
The availability of the dynamic mbuf metadata field can be checked with the rte_flow_dynf_metadata_avail() routine.
The DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed. Metadata support in PMDs is engaged upon dynamic field registration.
The metadata feature is getting complex. We might have some set of actions and items that might be supported by PMDs in multiple combinations; the supported values and masks are subject to query by performing trials (with rte_flow_validate).
[1] http://patches.dpdk.org/patch/62040/
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
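A usage sketch of the APIs named above (the wrapper function is hypothetical; error handling is trimmed):

```c
#include <rte_mbuf.h>
#include <rte_flow.h>

/* Register the dynamic field once, then tag an mbuf for egress. */
static int
tag_egress_metadata(struct rte_mbuf *m, uint32_t meta)
{
	if (!rte_flow_dynf_metadata_avail() &&
	    rte_flow_dynf_metadata_register() < 0)
		return -1; /* dynamic field could not be registered */
	*RTE_FLOW_DYNF_METADATA(m) = meta;   /* write egress metadata */
	m->ol_flags |= PKT_TX_DYNF_METADATA; /* mark metadata as present */
	return 0;
}
```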
# 8b8f7994 | 22-Jul-2019 | Matan Azrad <matan@mellanox.com>
net/mlx5: update LRO fields in completion entry
Update the CQE structure to include LRO fields.
Some reserved values were changed; hence the data-path code that used the reserved values was updated accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
# 9c55c6bd | 25-Mar-2019 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: revert mbuf address calculation for x86
When replenishing mbufs on Rx, the buffer address (mbuf->buf_addr) should be loaded. Non-x86 processors (mostly RISC, such as ARM and Power) are more vulnerable to load stalls; for x86, reducing the number of instructions seems to matter most.
For x86 this is simply a load, but for other architectures the address is calculated from the address of the mbuf structure by rte_mbuf_buf_addr(), without having to load the first cacheline of the mbuf.
Fixes: 12d468a62bc1 ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
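A sketch of the two strategies; the wrapper is hypothetical, while rte_mbuf_buf_addr() is the real helper the commit references:

```c
#include <rte_mbuf.h>

/* On x86, a plain load of the stored pointer is cheapest; elsewhere,
 * derive the address arithmetically so the mbuf's first cacheline is
 * never touched during replenishment.
 */
static inline void *
mbuf_buf_addr_of(struct rte_mbuf *m, struct rte_mempool *mp)
{
#ifdef RTE_ARCH_X86
	return m->buf_addr;
#else
	return rte_mbuf_buf_addr(m, mp);
#endif
}
```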
# 12d468a6 | 14-Jan-2019 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: fix instruction hotspot on replenishing Rx buffer
On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr doesn't need to be accessed, as it is static and easily calculated from the mbuf address. Accessing the mbuf content causes an unnecessary load stall, and this is worsened on ARM.
Fixes: 545b884b1da3 ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
# 6bd7fbd0 | 23-Oct-2018 | Dekel Peled <dekelp@mellanox.com>
net/mlx5: support metadata as flow rule criteria
As described in the series starting at [1], this adds the option to set a metadata value as a match pattern when creating a new flow rule.
This patch adds metadata support in the mlx5 driver, in two parts:
- Add the validation and setting of the metadata value in the matcher when creating a new flow rule.
- Add the passing of the metadata value from mbuf to WQE, when indicated by ol_flags, in the different burst functions.
[1] "ethdev: support metadata as flow rule criteria" http://mails.dpdk.org/archives/dev/2018-September/113269.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
# e10245a1 | 26-Jun-2018 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: fix Rx buffer replenishment threshold
The threshold of buffer replenishment for the vectorized Rx burst is a constant value (64). If the size of the Rx queue is comparatively small, the device could run out of buffers. For example, if the size of the Rx queue is 128, buffers are replenished only twice per wraparound. This can cause jitter in receiving packets, and the jitter can cause unnecessary retransmissions for TCP connections.
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
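A sketch of the scaling fix this implies, with hypothetical names and constants (the idea: derive the threshold from the ring size instead of keeping the fixed 64, so a 128-entry ring refills every 32 consumed slots rather than only twice per wraparound):

```c
#include <rte_common.h>

/* Hypothetical: scale the replenishment threshold with the queue
 * size, capped at the old constant, so small rings refill often
 * enough to avoid running dry.
 */
#define RXQ_RPLNSH_THRESH_MAX 64U

static inline unsigned int
rxq_rplnsh_thresh(unsigned int rxq_n)
{
	return RTE_MIN(RXQ_RPLNSH_THRESH_MAX, rxq_n / 4);
}
```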
# 7d6bf6b8 | 09-May-2018 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: add Multi-Packet Rx support
Multi-Packet Rx Queue (MPRQ, a.k.a. Striding RQ) can further save PCIe bandwidth by posting a single large buffer for multiple packets. Instead of posting one buffer per packet, one large buffer is posted in order to receive multiple packets on that buffer. An MPRQ buffer consists of multiple fixed-size strides and each stride receives one packet.
An Rx packet is memory-copied to a user-provided mbuf if its size is comparatively small; otherwise the PMD attaches the Rx packet to the mbuf by external buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external buffers is allocated and managed by the PMD.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
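A usage sketch of the attachment path; the helper names and no-op free callback are illustrative, while rte_pktmbuf_attach_extbuf() and rte_pktmbuf_ext_shinfo_init_helper() are the real APIs:

```c
#include <rte_mbuf.h>

/* Illustrative no-op: a real PMD would return the stride's parent
 * MPRQ buffer to its mempool once all strides are released.
 */
static void
stride_free_cb(void *addr __rte_unused, void *opaque __rte_unused)
{
}

/* Attach one stride of a large MPRQ buffer to an mbuf, zero-copy. */
static int
attach_stride(struct rte_mbuf *m, void *stride, rte_iova_t iova,
	      uint16_t len)
{
	struct rte_mbuf_ext_shared_info *shinfo;

	shinfo = rte_pktmbuf_ext_shinfo_init_helper(stride, &len,
						    stride_free_cb, NULL);
	if (shinfo == NULL)
		return -1;
	rte_pktmbuf_attach_extbuf(m, stride, iova, len, shinfo);
	return 0;
}
```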
# 974f1e7e | 09-May-2018 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: add new memory region support
This is the new design of Memory Region (MR) for the mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.
There are multiple layers for MR search.
L0 is to look up the last-hit entry, which is pointed to by mr_ctrl->mru (Most Recently Used). If L0 misses, L1 looks up the address in a fixed-size array by linear search. L0/L1 are in an inline function - mlx5_mr_lookup_cache().
If L1 misses, the bottom-half function is called to look up the address in the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh() - and it is not an inline function. The data structure for L2 is a B-tree.
If L2 misses, the search falls into the slowest path, which takes locks in order to access the global device cache (priv->mr.cache), which is also a B-tree and caches the original MR list (priv->mr.mr_list) of the device. Unless the global cache has overflowed, it is all-inclusive of the MR list. This is L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and can't be expanded on the fly due to deadlock. Refer to the comments in the code for the details - mr_lookup_dev(). If L3 has overflowed, the list has to be searched directly, bypassing the cache, although this is slower.
If L3 misses, a new MR for the address should be created - mlx5_mr_create(). When creating a new MR, it tries to register as many adjacent memsegs as possible that are virtually contiguous around the address. This must take two locks - memory_hotplug_lock and priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any allocation/free of memory inside.
In the free callback of the memory hotplug event, the freed space is searched in the MR list and the corresponding bits are cleared from the bitmap of MRs. This can fragment an MR, and the MR will then have multiple search entries in the caches. Once there's a change triggered by the event, the global cache must be rebuilt and all the per-queue caches are flushed as well. If memory is frequently freed at run-time, that may cause jitter in dataplane processing in the worst case by incurring MR cache flushes and rebuilds. But it would be the least probable scenario.
To guarantee optimal performance, it is highly recommended to use the EAL option '--socket-mem'. Then the reserved memory will be pinned and won't be freed dynamically. It is also recommended to configure a per-lcore cache for the Mempool. Even if there are many MRs for a device, or the MRs are highly fragmented, the Mempool cache will help greatly to reduce misses on the per-queue caches.
'--legacy-mem' is also supported.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
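A condensed sketch of the L0/L1 fast path feeding the L2/L3 bottom half; all types, sizes, and names here are illustrative, not the driver's:

```c
#include <stdint.h>

/* Hypothetical per-queue MR cache control structure. */
struct mr_cache_entry { uintptr_t start, end; uint32_t lkey; };

#define MR_CACHE_N 8

struct mr_ctrl {
	uint16_t mru;                            /* L0: most recently used */
	struct mr_cache_entry cache[MR_CACHE_N]; /* L1: linear-search array */
};

/* L2/L3: per-queue B-tree, then the locked global device cache. */
uint32_t mr_addr2mr_bh(struct mr_ctrl *ctrl, uintptr_t addr);

static inline uint32_t
mr_lookup_cache(struct mr_ctrl *ctrl, uintptr_t addr)
{
	uint16_t i = ctrl->mru;

	/* L0: check the last-hit entry first. */
	if (addr >= ctrl->cache[i].start && addr < ctrl->cache[i].end)
		return ctrl->cache[i].lkey;
	/* L1: linear search over the fixed-size array. */
	for (i = 0; i < MR_CACHE_N; i++) {
		if (addr >= ctrl->cache[i].start &&
		    addr < ctrl->cache[i].end) {
			ctrl->mru = i;
			return ctrl->cache[i].lkey;
		}
	}
	/* Miss: fall through to the non-inline bottom half. */
	return mr_addr2mr_bh(ctrl, addr);
}
```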
# 5feecc57 | 20-Mar-2018 | Shahaf Shuler <shahafs@mellanox.com>
align SPDX Mellanox copyrights
Aligning the Mellanox SPDX copyrights to a single format. In addition, convert the license to SPDX in files where it was missed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
# 8fd92a66 | 29-Jan-2018 | Olivier Matz <olivier.matz@6wind.com>
net/mlx5: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
# 4fe7f662 | 25-Jan-2018 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: replace I/O memory barrier with coherent version
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
# dbccb4cd | 10-Jan-2018 | Shahaf Shuler <shahafs@mellanox.com>
net/mlx5: convert to new Tx offloads API
Ethdev Tx offloads API has changed since:
commit cba7f53b717d ("ethdev: introduce Tx queue offloads API")
This commit supports the new Tx offloads API.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
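A configuration sketch of the per-queue Tx offloads API of that era; the setup function is hypothetical, while txconf.offloads and the DEV_TX_OFFLOAD_* flags are the real API surface:

```c
#include <rte_ethdev.h>

/* Hypothetical queue setup: start from the device's default Tx config
 * and request an offload only if the port advertises support for it.
 */
static int
setup_txq(uint16_t port, uint16_t queue, uint16_t nb_desc)
{
	struct rte_eth_dev_info info;
	struct rte_eth_txconf txconf;

	rte_eth_dev_info_get(port, &info);
	txconf = info.default_txconf;
	if (info.tx_offload_capa & DEV_TX_OFFLOAD_IPV4_CKSUM)
		txconf.offloads |= DEV_TX_OFFLOAD_IPV4_CKSUM;
	return rte_eth_tx_queue_setup(port, queue, nb_desc,
				      SOCKET_ID_ANY, &txconf);
}
```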
# 03e0868b | 10-Oct-2017 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: fix deadlock due to buffered slots in Rx SW ring
When replenishing the Rx ring, there are always buffered slots reserved between the consumed entries and the HW-owned entries. These have to be filled with fake mbufs to protect against possible overflow, rather than optimistically expecting successful replenishment, which can cause a deadlock with a small-sized queue.
Fixes: fc048bd52cb7 ("net/mlx5: fix overflow of Rx SW ring")
Cc: stable@dpdk.org
Reported-by: Martin Weiser <martin.weiser@allegro-packets.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Tested-by: Martin Weiser <martin.weiser@allegro-packets.com>
# 570acdb1 | 09-Oct-2017 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: add vectorized Rx/Tx burst for ARM
Brings vectorization through NEON instructions.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
# 3c2ddbd4 | 09-Oct-2017 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: separate shareable vector functions
Considering that more architectures (e.g. ARM and PowerPC) will be added for vectorized Rx/Tx burst, all the shareable functions that don't use any vector intrinsics need to be separated from the architecture-dependent functions. All the vector functions for x86 SSE are moved to a new header file - mlx5_rxtx_vec_sse.h - and the shareable common functions are now in mlx5_rxtx_vec.c.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
# 5bfc9fc1 | 09-Oct-2017 | Yongseok Koh <yskoh@mellanox.com>
net/mlx5: use static assert for compile-time sanity checks
Replace the compile-time sanity checks with static_assert() as the C11 standard has been adopted. Add mlx5_rxtx_vec.h and move the sanity checks to that file.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
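An illustrative C11 check of the kind this commit introduces; the asserted relationship mirrors a well-known mbuf layout invariant, but treat the specific expression as an assumption for the example, not as the patch's actual check:

```c
#include <assert.h>   /* static_assert (C11) */
#include <rte_mbuf.h>

/* Layout-sensitive vector code can verify its assumptions at compile
 * time instead of at runtime; the build fails if the layout changes.
 */
static_assert(sizeof(struct rte_mbuf) == RTE_CACHE_LINE_MIN_SIZE * 2,
	      "unexpected rte_mbuf size");
```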