# 6be4c57a | 14-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: fix errno update in shared context creation
The mlx5_alloc_shared_dev_ctx() function has a local variable named "err" which contains the errno value in case of failure.
When functions called by this function fail, this variable is updated with their return value (which should be a positive errno value). However, some of those functions do not update errno themselves, or return a negative errno value. If one of them fails, the "err" variable can hold a negative value, which causes an assertion failure.
This patch updates all functions used by the mlx5_alloc_shared_dev_ctx() function to update rte_errno, and takes that value instead of the "err" value.
Fixes: 5dfa003db53f ("common/mlx5: fix post doorbell barrier")
Fixes: 5d55a494f4e6 ("net/mlx5: split multi-thread flow handling per OS")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
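A minimal sketch of the convention this fix converges on, with a hypothetical callee name (mlx5_do_init) standing in for the functions mlx5_alloc_shared_dev_ctx() calls: every callee stores a positive value in rte_errno on failure, and the caller reads rte_errno instead of trusting the sign of the callee's return value.

```c
#include <errno.h>
#include <stddef.h>
#include <rte_errno.h>

/* Hypothetical callee: on failure, set rte_errno (positive) and
 * return its negation. */
static int
mlx5_do_init(void *obj)
{
	if (obj == NULL) {
		rte_errno = ENOMEM; /* always a positive errno value */
		return -rte_errno;
	}
	return 0;
}

/* Caller pattern after the fix: take the positive rte_errno rather
 * than the callee's (possibly negative) return value. */
static int
mlx5_caller(void)
{
	int err;

	if (mlx5_do_init(NULL) != 0) {
		err = rte_errno; /* guaranteed positive, so asserts hold */
		return -err;
	}
	return 0;
}
```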
# ad9d0c63 | 14-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: fix ineffective metadata argument adjustment
In "dv_xmeta_en" devarg there is an option of dv_xmeta_en=3 which engages tunnel offload mode. In E-Switch configuration, that mode implicitly
net/mlx5: fix ineffective metadata argument adjustment
In "dv_xmeta_en" devarg there is an option of dv_xmeta_en=3 which engages tunnel offload mode. In E-Switch configuration, that mode implicitly activates dv_xmeta_en=1.
The adjustment according to E-Switch support is done immediately after the first parsing of the devargs, but the devargs are parsed again later, which overrides that adjustment.
This patch moves the adjustment after the second parsing.
Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# dcbaafdc | 14-Feb-2022 | Michael Baum <michaelba@nvidia.com>
net/mlx5: fix sibling device config check
The MLX5 net driver supports "probe again". In probing again, it creates a new ethdev under an existing infiniband device context.
Sibling devices sharing an infiniband device context should have compatible configurations, so some of the devargs given in the probe again, mainly the ones relevant to the shared device context, are sent to the mlx5_dev_check_sibling_config function, which makes sure they are compatible with its siblings. However, the arguments are adjusted according to the capabilities of the device, and the function compares the probe-again arguments before adjustment with the sibling arguments after adjustment. A user who sends the same values to all siblings may fail this comparison when requesting something the device does not support, since the value gets adjusted.
This patch moves the call to the mlx5_dev_check_sibling_config function after the relevant adjustments.
Fixes: 92d5dd483450 ("net/mlx5: check sibling device configurations mismatch")
Fixes: 2d241515ebaf ("net/mlx5: add devarg for extensive metadata support")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# a41f593f | 11-Feb-2022 | Ferruh Yigit <ferruh.yigit@intel.com>
ethdev: introduce generic dummy packet burst function
Multiple PMDs have dummy/noop Rx/Tx packet burst functions.
These dummy functions are very simple; introduce a common function in ethdev and update the drivers to use it, instead of each driver having its own copy.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
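The common function is rte_eth_pkt_burst_dummy() in ethdev_driver.h; it takes the standard burst signature and returns 0 packets. A minimal sketch of how a PMD points its burst ops at it (the helper name below is illustrative):

```c
#include <ethdev_driver.h>

/* Illustrative helper: during stop/reconfiguration a PMD can park its
 * datapath on the common no-op burst function, which accepts any queue
 * and simply reports 0 received/sent packets. */
static void
drv_park_burst_ops(struct rte_eth_dev *dev)
{
	dev->rx_pkt_burst = rte_eth_pkt_burst_dummy;
	dev->tx_pkt_burst = rte_eth_pkt_burst_dummy;
}
```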
# 34776af6 | 23-Nov-2021 | Michael Baum <michaelba@nvidia.com>
net/mlx5: fix MPRQ stride devargs adjustment
In Multi-Packet RQ creation, the user can choose the number of strides and their size in bytes, using a specific devarg for each of these parameters. These two parameters determine the size of the WQE, which is their product.
If the user selects values that are not in the supported range, the PMD changes them to default values. However, apart from the range limitations on each parameter individually, there is also a minimum value on their product. When the user selects values whose product is lower than the minimum, no adjustment is made and the creation of the WQE fails.
This patch adds an adjustment in these cases as well: when the user selects values whose product is lower than the minimum, they are replaced with the default values.
Fixes: ecb160456aed ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
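A sketch of the added adjustment under assumed names: the driver works with log2 values (see the next commit), so the product check on the WQE size becomes a sum of logs. The macros and defaults below are illustrative, not the PMD's actual identifiers.

```c
#include <stdint.h>

#define MIN_LOG_WQE_SIZE    6  /* assumed device minimum, as log2 */
#define DEF_LOG_STRIDE_NUM  9  /* hypothetical default */
#define DEF_LOG_STRIDE_SIZE 11 /* hypothetical default */

static void
mprq_adjust_strides(uint32_t *log_stride_num, uint32_t *log_stride_size)
{
	/* WQE size = stride_num * stride_size, i.e. a sum in log2 terms.
	 * If the requested product is below the minimum, fall back to
	 * the defaults, as the fix does. */
	if (*log_stride_num + *log_stride_size < MIN_LOG_WQE_SIZE) {
		*log_stride_num = DEF_LOG_STRIDE_NUM;
		*log_stride_size = DEF_LOG_STRIDE_SIZE;
	}
}
```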
# 0947ed38 | 23-Nov-2021 | Michael Baum <michaelba@nvidia.com>
net/mlx5: improve stride parameter names
In the striding RQ management there are two important parameters, the size of the single stride in bytes and the number of strides.
Both the data-path structure and the config structure keep the log of the above parameters. However, their names make no mention that the value is a log, which may be misleading, as if the fields represented the values themselves.
This patch updates their names describing the values more accurately.
Fixes: ecb160456aed ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 7be78d02 | 29-Nov-2021 | Josh Soref <jsoref@gmail.com>
fix spelling in comments and strings
The tool comes from https://github.com/jsoref
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
# 11cfe349 | 10-Nov-2021 | Viacheslav Ovsiienko <viacheslavo@nvidia.com>
net/mlx5: fix Tx scheduling check
There was a redundant check for the enabled E-Switch; it resulted in device probing failure when Tx scheduling was requested and E-Switch was enabled.
Fixes: f17e4b4ffef9 ("net/mlx5: add Tx scheduling check on queue creation")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# febcac7b | 05-Nov-2021 | Bing Zhao <bingz@nvidia.com>
net/mlx5: support Rx queue delay drop
For the Ethernet RQs, if all receiving descriptors are exhausted, the packets being received are dropped. This behavior prevents slow or malicious software entities at the host from affecting the network. While for hairpin cases, even though no software is involved during the packet forwarding from the Rx to the Tx side, some hiccup in the hardware or back pressure from the Tx side may still cause the descriptors to be exhausted. In certain scenarios it may be preferred to configure the device to avoid such packet drops, assuming the posting of descriptors will resume shortly.
To support this, a new devarg "delay_drop" is introduced. By default, the delay drop is enabled for hairpin Rx queues and disabled for standard Rx queues. This value is used as a bit mask:
- bit 0: enablement of standard Rx queue
- bit 1: enablement of hairpin Rx queue
This attribute will be applied to all Rx queues of a device.
The "rq_delay_drop" capability in the HCA_CAP is checked before creating any queue. If the hardware capabilities do not support this delay drop, all the Rx queues will still be created without this attribute, and the devarg setting will be ignored even if it is specified explicitly. A warning log is used to notify the application when this occurs.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
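A sketch of how the bit-mask devarg could be interpreted per Rx queue type; the macro and function names here are assumptions for illustration, not the driver's actual identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DELAY_DROP_STANDARD (1u << 0) /* bit 0: standard Rx queues */
#define DELAY_DROP_HAIRPIN  (1u << 1) /* bit 1: hairpin Rx queues */

static bool
rxq_delay_drop_on(uint32_t delay_drop, bool is_hairpin, bool hca_cap)
{
	/* Without the rq_delay_drop HCA capability, the devarg is
	 * ignored (the driver only warns). */
	if (!hca_cap)
		return false;
	return (delay_drop &
		(is_hairpin ? DELAY_DROP_HAIRPIN : DELAY_DROP_STANDARD)) != 0;
}
```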
# 09c25553 | 04-Nov-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: support shared Rx queue
This patch introduces shared RxQ. All shared Rx queues with the same group and queue ID share the same rxq_ctrl. rxq_ctrl and rxq_data are shared; all queues from different member ports share the same WQ and CQ (essentially one Rx WQ), and mbufs are filled into this singleton WQ.
The shared rxq_data is set into the device Rx queues of all member ports as the RxQ object, used for receiving packets. Polling a queue of any member port returns packets of any member; mbuf->port is used to identify the source port.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
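A usage sketch from the application side, assuming the ports are members of one shared Rx queue group: polling any member can return another member's traffic, so mbuf->port, not the polled port ID, identifies the source.

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST 32

static void
poll_shared_rxq(uint16_t member_port, uint16_t queue_id)
{
	struct rte_mbuf *pkts[BURST];
	uint16_t i;
	uint16_t n = rte_eth_rx_burst(member_port, queue_id, pkts, BURST);

	for (i = 0; i < n; i++) {
		/* The actual ingress port, which may differ from
		 * member_port when the queue is shared. */
		uint16_t src_port = pkts[i]->port;

		(void)src_port; /* ... dispatch per source port ... */
		rte_pktmbuf_free(pkts[i]);
	}
}
```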
# 9086ac09 | 02-Nov-2021 | Gregory Etelson <getelson@nvidia.com>
net/mlx5: add flex parser DevX object management
The DevX flex parsers can be shared between representors within the same IB context. We should put the flex parser objects into the shared list and engage the standard mlx5_list_xxx API to manage them.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# db25cadc | 02-Nov-2021 | Viacheslav Ovsiienko <viacheslavo@nvidia.com>
net/mlx5: add flex item operations
This patch is a preparation step for implementing the flex item feature in the driver, and it provides:
- external entry point routines for flex item creation/deletion
- flex item objects management over the ports.
The flex item object keeps information about the item created over the port: a reference counter to track whether the item is in use by some active flows, and a pointer to the underlying shared DevX object, providing all the data needed to translate the flow flex pattern into matcher fields according to the hardware configuration.
Not too many flex items are supposed to be created on the port; the design is optimized for flow insertion rate rather than for memory savings.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
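A sketch of what such a per-port object might hold; the type and field names here are illustrative, not the driver's actual definitions.

```c
#include <stdint.h>

struct mlx5_devx_obj; /* opaque DevX handle, declared only for the sketch */

/* Hypothetical flex item bookkeeping: a reference counter tracks use by
 * active flows, and the DevX pointer carries the data needed to translate
 * a flex pattern into matcher fields. */
struct flex_item_sketch {
	uint32_t refcnt;
	struct mlx5_devx_obj *devx_obj;
	/* ... pattern-to-matcher translation data ... */
};
```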
# bc5bee02 | 02-Nov-2021 | Dmitry Kozlyuk <dkozlyuk@nvidia.com>
net/mlx5: create drop queue using DevX
Drop queue creation and destruction were not implemented for the DevX flow engine, and Verbs engine methods were used as a workaround. Implement these methods for DevX so that there is a valid queue ID that can be used regardless of queue configuration via the API.
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 3c4338a4 | 27-Oct-2021 | Jiawei Wang <jiaweiw@nvidia.com>
net/mlx5: optimize device spawn time with representors
During the device spawn process, the mlx5 PMD queried the available flow priorities by calling mlx5_flow_discover_priorities, queried whether the DR drop action is supported on the root table by calling the mlx5_flow_discover_dr_action_support routine, and queried the availability of metadata register C by calling mlx5_flow_discover_mreg_c.
These functions created test flows to get the supported fields, and destroyed the test flows at the end. The test flows in the first two functions were created on the root table. If the device was spawned with multiple representors, these test flows were created and destroyed on each representor as well. The above operations took a significant amount of init time during the device spawn.
This patch optimizes the device discover functions: if a device with multiple representors (VF/SF) is being spawned, the priority, drop action, and metadata register support checks are done only once, and the results are shared among all representors.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
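A sketch of the discover-once idea with hypothetical names: the shared device context caches the discovery results, so only the first representor pays for the test flows.

```c
#include <stdbool.h>

/* Hypothetical cache in the shared IB device context. */
struct shared_ctx_sketch {
	bool discovered;     /* discovery already performed? */
	int flow_priorities; /* cached priorities result */
	bool dr_root_drop;   /* cached DR drop-on-root support */
	unsigned int mreg_c; /* cached metadata register C availability */
};

static void
discover_once(struct shared_ctx_sketch *sh)
{
	if (sh->discovered)
		return; /* later representors reuse the cached results */
	/* ... create/destroy the test flows and fill the fields ... */
	sh->discovered = true;
}
```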
# 7299ab68 | 26-Oct-2021 | Rongwei Liu <rongweil@nvidia.com>
net/mlx5: support socket direct mode bonding
In socket direct mode, it's possible to bind any two (maybe four in the future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to have the same PCIe domain/bus/device ID.
The kernel driver uses "system_image_guid" to identify whether devices can be bonded together. The sysfs attribute "phys_switch_id" is used to get the "system_image_guid" of each network interface.
OFED 5.4+ is required to support "phys_switch_id".
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
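A sketch of reading the sysfs attribute in question; two interfaces whose values match share a "system_image_guid" and are candidates for bonding (error handling kept minimal).

```c
#include <stdio.h>
#include <string.h>

static int
read_phys_switch_id(const char *ifname, char *buf, size_t len)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/class/net/%s/phys_switch_id", ifname);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(buf, (int)len, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0'; /* strip trailing newline */
	return 0;
}
```

Comparing the strings returned for two interfaces then tells whether they belong to the same socket-direct pair.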
# d61138d4 | 22-Oct-2021 | Harman Kalra <hkalra@marvell.com>
drivers: remove direct access to interrupt handle
Remove direct access to interrupt handle structure fields and use the respective get/set APIs instead. All drivers accessing the interrupt handle fields are changed accordingly.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
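A minimal sketch of the accessor pattern the series moves drivers to; the interrupt type and flags depend on the specific use, EXT is shown here as one example.

```c
#include <rte_interrupts.h>

static struct rte_intr_handle *
make_intr_handle(int fd)
{
	struct rte_intr_handle *h;

	/* Allocate an instance instead of embedding the struct. */
	h = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
	if (h == NULL)
		return NULL;
	/* Set fields through the accessors, never directly. */
	if (rte_intr_fd_set(h, fd) != 0 ||
	    rte_intr_type_set(h, RTE_INTR_HANDLE_EXT) != 0) {
		rte_intr_instance_free(h);
		return NULL;
	}
	return h;
}
```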
# 295968d1 | 22-Oct-2021 | Ferruh Yigit <ferruh.yigit@intel.com>
ethdev: add namespace
Add the 'RTE_ETH' namespace to all enums & macros in a backward-compatible way. The macros for backward compatibility can be removed in the next LTS. Also updated some struct names to have the 'rte_eth' prefix.
All internal components switched to using new names.
Syntax fixed on lines that this patch touches.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
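A few representative renames, with the rough shape the backward-compatibility aliases take (illustrative; the full list in the patch is much longer):

```c
/* Old name                ->  new, RTE_ETH-prefixed name */
/* ETH_LINK_FULL_DUPLEX    ->  RTE_ETH_LINK_FULL_DUPLEX   */
/* ETH_MQ_RX_RSS           ->  RTE_ETH_MQ_RX_RSS          */
/* ETH_RSS_IP              ->  RTE_ETH_RSS_IP             */

/* Compatibility aliases map old names onto the new ones, e.g.: */
#define ETH_MQ_RX_RSS RTE_ETH_MQ_RX_RSS
```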
# a89f6433 | 21-Oct-2021 | Rongwei Liu <rongweil@nvidia.com>
net/mlx5: set Tx queue affinity in round-robin
Previously, we set txq affinity to 0 and let the firmware perform round-robin when bonding. The firmware uses a global counter to assign txq affinity to different physical ports according to the remainder after division.
There are three disadvantages:
1. The global counter is shared between the kernel and DPDK.
2. After restarting the PMD or the port, the previous counter value is reused, so the new affinity is unpredictable.
3. There is no way to get what affinity is set by the firmware.
In this update, we create several TISs, up to the number of bonding ports, and bind each TIS to one PF port.
Each port starts picking up a TIS using its own port index. An upper-layer application can quickly calculate each txq's affinity without querying.
At the DPDK layer, when creating a txq with 2 bonding ports, the affinity is set like:
port 0: 1->2->1->2
port 1: 2->1->2->1
port 2: 1->2->1->2
Note: this is only applicable to the DevX API, and the affinity is subject to the HW hash.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
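A sketch of the per-port round-robin implied by the pattern above; the formula is inferred from the example, not lifted from the driver source.

```c
#include <stdint.h>

/* Affinity in 1..nb_bond_ports, starting from the port's own index. */
static uint8_t
txq_affinity(uint16_t port_index, uint16_t txq_index,
	     uint16_t nb_bond_ports)
{
	return (uint8_t)((port_index + txq_index) % nb_bond_ports + 1);
}
/* nb_bond_ports == 2: port 0 -> 1,2,1,2,...; port 1 -> 2,1,2,1,... */
```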
# ea823b2c | 14-Oct-2021 | Dmitry Kozlyuk <dkozlyuk@nvidia.com>
net/mlx5: close tools socket with last device
MLX5 PMD exposes a socket for external tools to dump port state. Socket events are listened for using an interrupt source of EXT type. The socket was closed and the interrupt callback was unregistered at program exit, which is incorrect because DPDK could already be shut down at that point. Move the actions performed at program exit to the moment the last MLX5 port is closed. The socket will be opened again if a new MLX5 device is later plugged in and probed. Also fix comments that mistakenly talked about secondary processes instead of external tools.
Fixes: e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Reported-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
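A sketch of the lifetime change: tie the global socket to a port counter so it is opened with the first port and closed with the last, instead of at program exit (names are illustrative).

```c
static unsigned int mlx5_port_count; /* open MLX5 ports, illustrative */

static void
on_port_open(void)
{
	if (mlx5_port_count++ == 0) {
		/* open the tools socket and register the EXT
		 * interrupt callback here */
	}
}

static void
on_port_close(void)
{
	if (--mlx5_port_count == 0) {
		/* unregister the callback and close the socket while
		 * DPDK is still fully initialized */
	}
}
```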
# 614966c2 | 19-Oct-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: check DevX to support more Verbs ports
The Verbs API doesn't support device port numbers larger than 255 by design.
To support more VF or SubFunction port representors, force the DevX API check when the maximum number of Verbs device link ports is larger than 255.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 686d05b6 | 19-Oct-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: enable DevX Tx queue creation
The Verbs API does not support an Infiniband device port number larger than 255 by design. To support more representors on a single Infiniband device, the DevX API should be engaged.
While creating a Send Queue (SQ) object with the Verbs API, the PMD assigned the IB device port attribute and the kernel created the default miss flows in the FDB domain, to redirect egress traffic from the queue being created to the representor's appropriate peer (wire, HPF, VF or SF).
With the DevX API there is no IB-device port attribute (it is merely a kernel one; DevX operates in PRM terms) and the PMD must create the default miss flows in the FDB explicitly. The PMD did not provide this, and using the DevX API for E-Switch configurations was disabled.
The default miss FDB flow matches the E-Switch manager vport (to make sure the source is some representor) and the SQn (Send Queue number, the device's internal queue index). The root flow table is managed by kernel/firmware and does not support the vport redirect action, so we have to split the default miss flow into two:
- a flow with the lowest priority in the root table that matches the E-Switch manager vport ID and jumps to group 1
- a flow in group 1 that matches the E-Switch manager vport ID and the SQn, and forwards the packet to the peer vport
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
# 3fd2961e | 19-Oct-2021 | Xueming Li <xuemingl@nvidia.com>
net/mlx5: use Netlink when IB port greater than 255
The IB spec doesn't allow more than 255 ports on a single HCA; a port number of 256 was cast to the u8 value 0, which is invalid for ibv_query_port().
This patch invokes the Netlink API to query the port state when the port number is greater than 255.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
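A sketch of the dispatch this implies: ibv_query_port() takes a u8 port number, so ports above 255 must be queried another way. netlink_port_state() below is a hypothetical stand-in for the PMD's Netlink-based query.

```c
#include <stdint.h>
#include <infiniband/verbs.h>

int netlink_port_state(uint32_t ifindex); /* hypothetical Netlink query */

static int
query_port_state(struct ibv_context *ctx, uint32_t port, uint32_t ifindex)
{
	struct ibv_port_attr attr;

	if (port <= UINT8_MAX) /* fits the u8 ibv_query_port() expects */
		return ibv_query_port(ctx, (uint8_t)port, &attr);
	return netlink_port_state(ifindex);
}
```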
# 9f1d636f | 19-Oct-2021 | Michael Baum <michaelba@nvidia.com>
common/mlx5: share MR management
Add a global shared MR cache as a field of the common device structure. Move MR management to use this global cache for all drivers.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# 5fbc75ac | 19-Oct-2021 | Michael Baum <michaelba@nvidia.com>
common/mlx5: add global MR cache create function
Add a function for global shared MR cache structure initialization. This function includes:
- btree initialization
- setting callbacks for MR registration and deregistration
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
# fe46b20c | 19-Oct-2021 | Michael Baum <michaelba@nvidia.com>
common/mlx5: share HCA capabilities handle
Add the HCA attributes structure as a field of the device config structure. It is queried during common probing, and the timestamp format fields are updated.
Each driver uses the HCA attributes from the common device config structure, instead of querying them by itself.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>