#
6fd05da9 |
| 05-Aug-2019 |
Xiaoyu Min <jackmin@mellanox.com> |
net/mlx5: fix link speed info when link is down
When the link is down, the link speed returned by ethtool is UINT32_MAX and the link status is 0.
In this case, the DPDK ethdev link speed should be set to ETH_SPEED_NUM_NONE. Otherwise the link speed is non-zero while the link status is zero, which is treated as an inconsistent state, and -EAGAIN is returned, which is not right.
Fixes: 188408719888 ("net/mlx5: fix support for newer link speeds") Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
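The fix described above can be illustrated with a minimal sketch. The function names and the SKETCH_SPEED_NUM_NONE macro are hypothetical stand-ins (the macro mirrors DPDK's ETH_SPEED_NUM_NONE, which is 0), and the symmetric consistency rule is an assumption based on the description, not the driver's actual code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for DPDK's ETH_SPEED_NUM_NONE (defined as 0). */
#define SKETCH_SPEED_NUM_NONE 0u

/* Map the raw ethtool speed to the value reported through ethdev. */
static uint32_t
sketch_normalize_speed(uint32_t ethtool_speed, int link_up)
{
	/* ethtool reports UINT32_MAX ("unknown") while the link is down. */
	if (!link_up && ethtool_speed == UINT32_MAX)
		return SKETCH_SPEED_NUM_NONE;
	return ethtool_speed;
}

/* Assumed rule: speed and status must be both set or both clear. */
static int
sketch_check_link(uint32_t speed, int link_up)
{
	if ((speed != 0) != (link_up != 0))
		return -EAGAIN;
	return 0;
}
```

Without the normalization step, a down link carrying the raw UINT32_MAX speed fails the check; with it, the down link is reported consistently.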
|
#
17ed314c |
| 29-Jul-2019 |
Matan Azrad <matan@mellanox.com> |
net/mlx5: allow LRO per Rx queue
Enabling LRO offload per queue makes sense because the user will probably want to allocate a different mempool for LRO queues: the mbuf size of an LRO mempool may be bigger than that of a non-LRO mempool.
Change the LRO offload to be per queue instead of per port.
If any of the queues has LRO enabled, all the queues will be configured via DevX.
If RSS flows direct TCP packets to queues with differing LRO settings, these flows will not be offloaded with LRO.
Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
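The two rules in this message (DevX for all queues once any queue uses LRO, and no LRO for flows spanning queues with mixed settings) can be sketched as follows. All names are hypothetical, and reading "different LRO enabling" as "LRO only if every target queue enables it" is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if at least one Rx queue requests LRO; all queues then use DevX. */
static bool
sketch_use_devx(const bool *queue_lro, size_t nb_queues)
{
	for (size_t i = 0; i < nb_queues; i++)
		if (queue_lro[i])
			return true;
	return false;
}

/* A flow is offloaded with LRO only if every queue it targets enables LRO. */
static bool
sketch_flow_lro(const bool *queue_lro, const unsigned int *targets,
		size_t nb_targets)
{
	if (nb_targets == 0)
		return false;
	for (size_t i = 0; i < nb_targets; i++)
		if (!queue_lro[targets[i]])
			return false;
	return true;
}
```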
|
#
bd41389e |
| 29-Jul-2019 |
Matan Azrad <matan@mellanox.com> |
net/mlx5: allow LRO in regular Rx queue
LRO was supported only with MPRQ, hence the MPRQ Rx burst function was selected whenever LRO was configured on the port.
The current MPRQ support suffers from poor memory utilization, since the PMD allocates an external mempool for the packet data in addition to the user mempool. In addition, the user may receive packet data addresses that the user did not configure.
Although MPRQ has the best receive performance in most cases, given the above it is better to remove the automatic MPRQ selection when LRO is configured.
Select MPRQ only when the user forces it through the PMD arguments, including in the LRO case.
Allow LRO offload using the regular RQ with the regular Rx burst function.
Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
|
#
175f1c21 |
| 22-Jul-2019 |
Dekel Peled <dekelp@mellanox.com> |
net/mlx5: check conditions to enable LRO
Use DevX API to read device LRO capabilities. Check if LRO is supported and can be enabled. Check if MPRQ is supported and can be used. Enable MPRQ for LRO use if not enabled by user. Added note for mlx5_mprq_enabled(), to emphasize that LRO enables MPRQ. Disable CQE compression and CRC stripping if LRO is enabled.
Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
|
#
21bb6c7e |
| 22-Jul-2019 |
Dekel Peled <dekelp@mellanox.com> |
net/mlx5: introduce LRO
Add command-line argument to set LRO session timeout. Add LRO settings struct in PMD configuration struct. Add support of LRO offload in port configuration. Add macros and function to check if LRO is supported and enabled.
Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
|
#
ff45f462 |
| 21-Jul-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: revert Netlink socket sharing
This reverts commit e28111ac9864af09e826241a915dfff87a9c00ad. The Netlink requests have been replaced by ifindex caching and are no longer needed.
Fixes: e28111ac9864 ("net/mlx5: fix master device Netlink socket sharing") Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
fa2e14d4 |
| 21-Jul-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: cache associated network device index
The associated device index is retrieved via a Netlink request to the underlying Infiniband device driver. This network device index is permanent throughout the lifetime of the device. We do not spawn rte_eth_dev ports without an associated network device, and if the network device is being unbound we get the remove notification message and the rte_eth_dev port is also detached. So, we may store the ifindex in the mlx5_device_spawn() routine at rte_eth_dev port creation and initialization time, and use the cached value afterwards instead of issuing an actual Netlink request.
Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
cb9cb61e |
| 21-Jul-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: report max number of mbuf segments
This patch fills the tx_desc_lim.nb_seg_max and tx_desc_lim.nb_mtu_seg_max fields of the rte_eth_dev_info structure to report the maximal number of packet segments. The requested inline data configuration is taken into account in a conservative way.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
a6bd4911 |
| 21-Jul-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: remove Tx implementation
This patch removes the existing Tx datapath code as a preparation step before introducing the new implementation. The following entities are being removed:
- deprecated devargs support
- tx_burst() routines
- related PRM definitions
- SQ configuration code
- Tx routine selection code
- incompatible Tx completion code
The following devargs are deprecated and ignored:
- "txq_inline" is going to be converted to "txq_inline_max" for compatibility
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted to "txq_inline_mpw" for compatibility
The deprecated devarg keys are recognized by the PMD and ignored or converted to the new ones in order not to block device probing.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
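The handling of deprecated devarg keys listed above could look roughly like the sketch below. The function name is hypothetical and the lookup is only an illustration of the "ignore or convert" policy; it is not taken from the PMD sources.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the key the PMD should act on:
 *  - the replacement key for a deprecated key that is converted,
 *  - NULL for a deprecated key that is recognized but ignored,
 *  - the key itself for anything else.
 */
static const char *
sketch_convert_devarg_key(const char *key)
{
	if (strcmp(key, "txq_inline") == 0)
		return "txq_inline_max";
	if (strcmp(key, "txq_max_inline_len") == 0)
		return "txq_inline_mpw";
	if (strcmp(key, "tx_vec_en") == 0 ||
	    strcmp(key, "txqs_max_vec") == 0 ||
	    strcmp(key, "txq_mpw_hdr_dseg_en") == 0)
		return NULL; /* accepted so probing is not blocked, then dropped */
	return key;
}
```

Returning NULL rather than an error for the ignored keys reflects the stated goal of not blocking device probing.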
|
#
f15db67d |
| 16-Jul-2019 |
Matan Azrad <matan@mellanox.com> |
net/mlx5: accelerate DV flow counter query
All the DV counters are cached in the PMD memory and are contained in pools, which are in turn contained in containers according to the counter allocation type: batch or single.
Currently, the flow counter query is done synchronously at pool resolution, meaning that a user request triggers a FW command to read all the counters in the pool.
A new DevX feature that asynchronously reads a batch of flow counters makes it possible to accelerate the user query operation.
Using the DPDK host thread, the PMD periodically triggers an asynchronous query at pool resolution for all the counter pools, and the FW triggers an interrupt when the values are updated. In the interrupt handler, the raw data of the pool counter values is replaced using a double buffer algorithm (very fast). In the user query, the PMD just returns the last queried values from the PMD cache: no system calls or FW commands are triggered from the user control thread on a query operation.
More synchronization is added with the host thread:
- Container resize uses a double buffer algorithm.
- Pool growth in a container uses an atomic operation.
- Pool query buffer replacement uses a spinlock.
- Pool minimum DevX counter ID uses an atomic operation.
Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
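The double buffer idea mentioned for the counter raw data can be sketched in a few lines of C11. The structure and function names are hypothetical, the pool size is arbitrary, and the sketch leaves out the spinlock that the message says guards buffer replacement, so it only illustrates the publish and read pattern.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_POOL_SIZE 4

/* Two copies of the raw counter values; "active" selects the readable one. */
struct sketch_pool {
	uint64_t raw[2][SKETCH_POOL_SIZE];
	atomic_int active;
};

/* Interrupt side: fill the inactive buffer, then flip the active index. */
static void
sketch_pool_publish(struct sketch_pool *pool, const uint64_t *fresh)
{
	int next = 1 - atomic_load(&pool->active);

	memcpy(pool->raw[next], fresh, sizeof(pool->raw[next]));
	atomic_store(&pool->active, next);
}

/* User side: read the last published value, no FW command involved. */
static uint64_t
sketch_pool_query(struct sketch_pool *pool, unsigned int idx)
{
	return pool->raw[atomic_load(&pool->active)][idx];
}
```

Readers never see a half-written buffer because the writer only touches the inactive copy before the flip; back-to-back publishes would still need the extra locking described in the message.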
|
#
cb1d2cce |
| 19-Jun-2019 |
Asaf Penso <asafp@mellanox.com> |
net/mlx5: fix condition for link update fallback
mlx5_link_update uses the newer ethtool command ETHTOOL_GLINKSETTINGS to determine interface capabilities but falls back to the older (deprecated) ETHTOOL_GSET command if the new method fails for any reason. The older method only supports reporting of capabilities up to 40G.
However, mlx5_link_update_unlocked_gs can return a failure for a number of reasons (including the link being down). Using the older method in cases of transient failure of the method can result in reporting of reduced capabilities to the application.
The older method (mlx5_link_update_unlocked_gset) should only be invoked if the newer method returns EOPNOTSUPP.
Fixes: 7d2e32f76cfc ("net/mlx5: fix ethtool link setting call order") Cc: stable@dpdk.org
Reported-by: Srinivas Narayan <srinivas.narayan@att.com> Signed-off-by: Asaf Penso <asafp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
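The corrected fallback condition can be illustrated as below. The query callbacks are hypothetical stubs standing in for mlx5_link_update_unlocked_gs and mlx5_link_update_unlocked_gset, and the negative errno return convention is an assumption made for the sketch.

```c
#include <errno.h>

typedef int (*sketch_query_fn)(void); /* returns 0 or a negative errno */

static int sketch_legacy_calls; /* counts how often the old method ran */

/* Hypothetical stubs for the new (GLINKSETTINGS) and old (GSET) queries. */
static int sketch_new_unsupported(void) { return -EOPNOTSUPP; }
static int sketch_new_transient(void) { return -EAGAIN; }
static int sketch_legacy_ok(void) { sketch_legacy_calls++; return 0; }

/* Fall back to the legacy query only when the new one is unsupported. */
static int
sketch_link_update(sketch_query_fn query_new, sketch_query_fn query_legacy)
{
	int ret = query_new();

	if (ret == -EOPNOTSUPP)
		ret = query_legacy();
	return ret;
}
```

A transient failure such as a down link is returned to the caller as is, so the reduced capability set of the legacy method is never reported by accident.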
|
#
028669bc |
| 05-Jul-2019 |
Anatoly Burakov <anatoly.burakov@intel.com> |
eal: hide shared memory config
Now that everything that has ever accessed the shared memory config is doing so through the public APIs, we can make it internal. Since we're removing quite a few headers from rte_eal_memconfig.h, we need to add them back in places where this header is used.
This bumps the ABI, so also change all build files and update the documentation.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>
|
#
e28111ac |
| 06-Jun-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: fix master device Netlink socket sharing
Patch [1] uses the master device Netlink socket to retrieve the master device link settings. This is not thread safe, because this resource may be in use by another call to the master device itself. Using the same Netlink socket concurrently from multiple threads causes Netlink requests to malfunction and must be eliminated. The patch replaces the master Netlink socket with the socket from the representor device.
[1] http://patches.dpdk.org/patch/53120/
Fixes: 0333b2f584d9 ("net/mlx5: inherit master link settings for representors") Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|
#
5897ac13 |
| 27-May-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: fix event handler uninstall
When a device is being closed and tries to unregister its interrupt callback, there is a chance the handler is still active (called in the context of the eal_intr_thread_main thread). If so, rte_intr_callback_unregister returns -EAGAIN and keeps the handler registered, causing a crash once the underlying resource is gone.
This race condition may happen if event handling in the application takes a long time. We should check the return code of the unregistering routine and try again to unregister the handler. Diagnostic messages are shown once a second while trying to unregister.
Fixes: 028b2a28c3cb ("net/mlx5: update event handler for multiport IB devices") Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
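The retry logic described above can be sketched with a generic callback in place of rte_intr_callback_unregister. Everything here is hypothetical: the stub simulates a handler that stays busy for a few attempts, the positive success value is an assumption, and the loop is bounded only to keep the sketch finite, whereas the message describes retrying with a diagnostic once a second.

```c
#include <errno.h>

typedef int (*sketch_unregister_fn)(void *ctx);

/* Hypothetical stub: reports -EAGAIN while the handler is still running. */
static int
sketch_busy_unregister(void *ctx)
{
	int *busy_rounds = ctx;

	if (*busy_rounds > 0) {
		(*busy_rounds)--;
		return -EAGAIN;
	}
	return 1; /* assumed: number of callbacks removed */
}

/* Keep trying while the handler is active; give up after max_tries. */
static int
sketch_unregister_retry(sketch_unregister_fn unreg, void *ctx,
			unsigned int max_tries)
{
	int ret = -EAGAIN;

	for (unsigned int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = unreg(ctx);
		/* A real driver would pause here and log periodically. */
	}
	return ret;
}
```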
|
#
e571ad55 |
| 02-May-2019 |
Tom Barbette <barbette@kth.se> |
net/mlx5: support reading clock
Implements support for read_clock in the mlx5 driver. mlx5 supports hardware timestamp offload, setting the packet timestamp field to the device clock. rte_eth_read_clock allows reading the device's current clock value and therefore comparing values on a similar time base.
See rxtx_callbacks for an example.
Signed-off-by: Tom Barbette <barbette@kth.se> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|
#
40d9f906 |
| 12-May-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: fix device removal handler for multiport
The IBV_EVENT_DEVICE_FATAL event is generated by the driver once for the entire multiport Infiniband device, not for each existing port. The port index in the event is zero, which causes the device removal event to be dropped. We should invoke the removal event processing routine for each port we have installed a handler for.
Fixes: 028b2a28c3cb ("net/mlx5: update event handler for multiport IB devices")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
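A minimal sketch of the corrected dispatch: a device-wide fatal event is fanned out to every port that has a handler installed instead of being matched against a single port index. The structure, its fields, and the port limit are hypothetical.

```c
#define SKETCH_MAX_PORTS 4

/* Per-device state shared by all ports of one multiport IB device. */
struct sketch_shared {
	int handler_installed[SKETCH_MAX_PORTS];
	int removal_notified[SKETCH_MAX_PORTS];
};

/* Deliver the device-wide fatal event to each port with a handler. */
static unsigned int
sketch_handle_device_fatal(struct sketch_shared *sh)
{
	unsigned int notified = 0;

	for (unsigned int i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (!sh->handler_installed[i])
			continue;
		sh->removal_notified[i] = 1;
		notified++;
	}
	return notified;
}
```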
|
#
09ba4c58 |
| 05-May-2019 |
Dekel Peled <dekelp@mellanox.com> |
net/mlx5: fix init with zero Rx queue
A recent patch [1] added, at the end of mlx5_dev_configure(), a call to mlx5_proc_priv_init(), which initializes the process_private data of eth_dev. This call is not reached if the PMD is started with zero Rx queues. In this case mlx5_dev_configure() returns earlier due to the check: if (rxqs_n == priv->rxqs_n) return 0; In such a scenario, later references to the uninitialized process_private data will result in a segmentation fault. For an example, see the function txq_uar_init().
This patch changes the check logic. The following code is executed if (rxqs_n != priv->rxqs_n), and skipped otherwise. Function mlx5_proc_priv_init() is always invoked, to ensure process_private data is initialized.
[1] http://patches.dpdk.org/patch/52629/
Fixes: 120dc4a7dcd3 ("net/mlx5: remove device register remap") Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
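The change in control flow can be shown with a reduced model of the configure routine. The structure and its fields are hypothetical; the proc_priv_ready flag merely stands for the effect of mlx5_proc_priv_init().

```c
/* Hypothetical, reduced view of the private device data. */
struct sketch_priv {
	unsigned int rxqs_n;     /* currently configured Rx queue count */
	int rx_reconfigured;     /* how many times the Rx setup was redone */
	int proc_priv_ready;     /* set once process_private is initialized */
};

static int
sketch_dev_configure(struct sketch_priv *priv, unsigned int rxqs_n)
{
	/* Redo the Rx-dependent setup only when the queue count changed. */
	if (rxqs_n != priv->rxqs_n) {
		priv->rxqs_n = rxqs_n;
		priv->rx_reconfigured++;
	}
	/* Always reached now, even with zero Rx queues. */
	priv->proc_priv_ready = 1;
	return 0;
}
```

With the earlier early-return form, a call with zero Rx queues on a freshly zeroed structure would leave before the last step.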
|
#
0333b2f5 |
| 27-Apr-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: inherit master link settings for representors
Some physical link settings can be queried from Ethernet devices: link status, link speed, speed capabilities, duplex mode, etc. These settings do not make a lot of sense for representors, which have no physical link. The new kernel drivers dropped the link settings query for representors, causing the ioctl call to fail. This patch adds an emulation of link settings to the PMD: representors inherit the link parameters from the master device. The actual link status (up/down) is retrieved from the representor device.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|
#
2e4c987a |
| 18-Apr-2019 |
Ori Kam <orika@mellanox.com> |
net/mlx5: validate Direct Rule E-Switch
Add validation logic for E-Switch using Direct Rules.
Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
30a86157 |
| 16-Apr-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: support PF representor
On the BlueField platform there is a new entity, the PF representor. It represents the PCI PF attached to the external host on the ARM side. The traffic sent by the external host to the NIC via the PF will be seen by ARM on this PF representor.
This patch refactors the port recognition logic, which is based on the physical port name. There are two groups of name formats. Legacy name formats are used by kernels before version 5.0 (more precisely, before the patch [1]) or by Mellanox OFED before 4.6; new naming formats were added by the patch [1].
Legacy naming formats are supported:
- missing physical port name (no sysfs/netlink key) at all, master is assumed
- decimal digits (for example "12"), representor is assumed, the value is the index of attached VF
New naming formats are supported:
- "p" followed by decimal digits, for example "p2", master is assumed
- "pf" followed by PF index concatenated with "vf" followed by VF index, for example "pf0vf1", representor is assumed. If index of VF is "-1" it is a special case of host PF representor, this representor must be indexed in devargs as 65535, for example representor=[0-3,65535] will allow representors for VF0, VF1, VF2, VF3 and for host PF.
Note: do not specify representor=[0-65535], it causes devargs processing error, because number of ports (rte_eth_dev) is limited.
Applications should distinguish representors from master devices exclusively by the device flag RTE_ETH_DEV_REPRESENTOR and should not rely on the switch port_id values (the mlx5 PMD deduces them from representor_id) returned by the dev_infos_get() API.
[1] https://www.spinics.net/lists/netdev/msg547007.html Linux-tree: c12ecc23 (Or Gerlitz 2018-04-25 17:32 +0300) "net/mlx5e: Move to use common phys port names for vport representors"
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
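The legacy and new physical port name formats listed in this message can be recognized with a few sscanf() patterns. This is only a sketch: the type and function names are hypothetical and the PMD's real parser may differ. The mapping of VF index -1 to representor index 65535 follows the description above.

```c
#include <stddef.h>
#include <stdio.h>

enum sketch_kind { SKETCH_UNKNOWN, SKETCH_MASTER, SKETCH_REPRESENTOR };

struct sketch_port {
	enum sketch_kind kind;
	int pf;      /* PF index, -1 if the name does not carry one */
	int repr_id; /* representor index as used in devargs, -1 for master */
};

static struct sketch_port
sketch_parse_port_name(const char *name)
{
	struct sketch_port port = { SKETCH_UNKNOWN, -1, -1 };
	int pf, vf, n;
	char tail; /* a successful %c means trailing garbage after the match */

	if (name == NULL || name[0] == '\0') {
		/* Legacy: no port name at all, master is assumed. */
		port.kind = SKETCH_MASTER;
	} else if (sscanf(name, "pf%dvf%d%c", &pf, &vf, &tail) == 2) {
		/* New: "pf<PF>vf<VF>", VF -1 is the host PF representor. */
		port.kind = SKETCH_REPRESENTOR;
		port.pf = pf;
		port.repr_id = (vf == -1) ? 65535 : vf;
	} else if (sscanf(name, "p%d%c", &n, &tail) == 1) {
		/* New: "p<N>", master is assumed. */
		port.kind = SKETCH_MASTER;
	} else if (sscanf(name, "%d%c", &n, &tail) == 1) {
		/* Legacy: plain decimal digits, index of the attached VF. */
		port.kind = SKETCH_REPRESENTOR;
		port.repr_id = n;
	}
	return port;
}
```

The order of the checks matters: the "pf...vf..." pattern is tried before the shorter "p..." pattern so that a representor name is never mistaken for a master name.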
|
#
120dc4a7 |
| 10-Apr-2019 |
Yongseok Koh <yskoh@mellanox.com> |
net/mlx5: remove device register remap
UAR (User Access Region) register does not need to be remapped for primary process but it should be remapped only for secondary process. UAR register table is in the process private structure in rte_eth_devices[], (struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private
The actual UAR table follows the data structure and the table is used for both Tx and Rx.
For Tx, BlueFlame in UAR is used to ring the doorbell. MLX5_TX_BFREG(txq) is defined to get a register for the txq. Each process accesses its own private data to acquire the register from the UAR table.
For Rx, the doorbell in UAR is required in arming CQ event. However, it is a known issue that the register isn't remapped for secondary process.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
|
#
9c2bbd04 |
| 05-Apr-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: fix device probing for old kernel drivers
Retrieving the network interface index via Netlink fails when an old ib_core kernel driver is installed: the mlx5_nl_ifindex() routine fails because the RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not supported by the old driver.
Retrieving the network interface index and name via Netlink was made possible by the patch [1]. So, the problem depends on the ib_core module version: 4.16 supports getting ifindex via Netlink, 4.15 does not.
This error was ignored in previous versions of the MLX5 PMD probing routine. For a single device, ifindex was retrieved via sysfs and link control was not lost, so the problem simply was not noticed. In order to support the MLX5 PMD over an old kernel driver, this patch adds ifindex retrieval via sysfs to the probing routine. It is worth noting that this method works for master/standalone devices only.
[1] https://www.spinics.net/lists/linux-rdma/msg62948.html Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)
Fixes: ad74bc619504 ("net/mlx5: support multiport IB device during probing")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|
#
d874a4ee |
| 01-Apr-2019 |
Thomas Monjalon <thomas@monjalon.net> |
net/mlx5: use port sibling iterators
Iterating over siblings was done with RTE_ETH_FOREACH_DEV() which skips the owned ports. The new iterators RTE_ETH_FOREACH_DEV_SIBLING() and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
9a8ab29b |
| 01-Apr-2019 |
Yongseok Koh <yskoh@mellanox.com> |
net/mlx5: replace IPC socket with EAL API
Socket API is used for IPC in order for secondary process to acquire Verb command file descriptor. The FD is used to remap UAR address. The multi-process APIs (rte_mp) in EAL are newly introduced. mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.
As it is PMD global infrastructure, only one IPC channel is established. All the IPC message types may have port_id in the message if there is need to reference a specific device.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|
#
028b2a28 |
| 27-Mar-2019 |
Viacheslav Ovsiienko <viacheslavo@mellanox.com> |
net/mlx5: update event handler for multiport IB devices
This patch modifies asynchronous event handler to support multiport Infiniband devices. Handler queries the event parameters, including event source port index, and invokes the handler for specific devices with appropriate port_id.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
|