..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2018-2019 HiSilicon Limited.

HNS3 Poll Mode Driver
=====================

The hns3 PMD (**librte_net_hns3**) provides poll mode driver support
for the built-in HiSilicon Network Subsystem (HNS) network engine
found in the HiSilicon Kunpeng 920 SoC (HIP08) and Kunpeng 930 SoC (HIP09/HIP10).

Features
--------

Features of the HNS3 PMD are:

- Multiple queues for TX and RX
- Receive Side Scaling (RSS)
- Packet type information
- Checksum offload
- TSO offload
- LRO offload
- Promiscuous mode
- Multicast mode
- Port hardware statistics
- Jumbo frames
- Link state information
- Interrupt mode for RX
- VLAN stripping and inserting
- QinQ inserting
- DCB
- Scatter and gather for TX and RX
- Vector Poll mode driver
- SR-IOV VF
- Multi-process
- MAC/VLAN filter
- MTU update
- NUMA support
- Generic flow API
- IEEE1588/802.1AS timestamping
- Basic stats
- Extended stats
- Traffic Management API
- Speed capabilities
- Link Auto-negotiation
- Link flow control
- Dump register
- Dump private info from device
- FW version

Prerequisites
-------------

- Get information about the Kunpeng 920 chip from
  `<https://www.hisilicon.com/en/products/Kunpeng>`_.

- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to
  set up the basic DPDK environment.

Link status event Pre-conditions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Firmware 1.8.0.0 and later versions support reporting link changes to the PF.
Therefore, to use link status change (LSC) events with the PF driver,
ensure that the firmware version supports reporting link changes.
For the VF driver to support LSC, the following kernel patch must be applied:
`<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18b6e31f8bf4ac7af7b057228f38a5a530378e4e>`_.

Note: The patch is included in the Linux kernel mainline as of version 5.13.


Configuration
-------------

Compilation Options
~~~~~~~~~~~~~~~~~~~

The following options can be modified in the ``config/rte_config.h`` file.

- ``RTE_LIBRTE_HNS3_MAX_TQP_NUM_PER_PF`` (default ``256``)

  Maximum number of queues reserved for the PF on HIP09 and HIP10.
  The effective maximum is also limited by the value the firmware reports.

Runtime Configuration
~~~~~~~~~~~~~~~~~~~~~

- ``rx_func_hint`` (default ``none``)

  Selects the Rx burst function. Supported values are ``vec``, ``sve``,
  ``simple`` and ``common``:

  - ``vec``: if supported, use the ``vec`` Rx function, which is the
    default vector algorithm (NEON on the Kunpeng Arm platform).
  - ``sve``: if supported, use the ``sve`` Rx function, which is the
    SVE algorithm.
  - ``simple``: if supported, use the ``simple`` Rx function, which is
    the scalar simple algorithm.
  - ``common``: if supported, use the ``common`` Rx function, which is
    the scalar scattered algorithm.

  When the requested value is not supported, the driver first checks whether
  the conditions for ``vec`` are met and uses ``vec`` if so. Otherwise it
  tries ``simple``, and finally falls back to ``common``.

  For example::

    -a 0000:7d:00.0,rx_func_hint=simple

- ``tx_func_hint`` (default ``none``)

  Selects the Tx burst function. Supported values are ``vec``, ``sve``,
  ``simple`` and ``common``:

  - ``vec``: if supported, use the ``vec`` Tx function, which is the
    default vector algorithm (NEON on the Kunpeng Arm platform).
  - ``sve``: if supported, use the ``sve`` Tx function, which is the
    SVE algorithm.
  - ``simple``: if supported, use the ``simple`` Tx function, which is
    the scalar simple algorithm.
  - ``common``: if supported, use the ``common`` Tx function, which is
    the scalar algorithm.

  When the requested value is not supported, the driver first checks whether
  the conditions for ``vec`` are met and uses ``vec`` if so. Otherwise it
  tries ``simple``, and finally falls back to ``common``.

  For example::

    -a 0000:7d:00.0,tx_func_hint=common

- ``dev_caps_mask`` (default ``0``)

  Masks off capabilities queried from the firmware.
  The argument is a hexadecimal bitmask in which each set bit masks off the
  corresponding capability. For example, if the capability value queried from
  the firmware is 0xFFFF and the argument is 0xF, bit0 to bit3 are masked off
  and the resulting capability value is 0xFFF0.
  The main purpose of this option is debugging and working around problems.

  For example::

    -a 0000:7d:00.0,dev_caps_mask=0xF

- ``mbx_time_limit_ms`` (default ``500``)

  Sets the mailbox (MBX) response time limit.
  By default, the PMD waits at most 500 ms for an MBX response, which is not
  enough in some scenarios. The response comes from the kernel mode driver,
  so the response time depends on system scheduling. For example, when most
  cores are isolated and only a few are left for system scheduling, starting
  a large number of services keeps the scheduler very busy. The reply to an
  MBX message can then time out, which causes PMD initialization to fail.
  This option lets the user raise the mailbox time limit.

  For example::

    -a 0000:7d:00.0,mbx_time_limit_ms=600

- ``fdir_vlan_match_mode`` (default ``strict``)

  Selects the VLAN match mode. The value can be ``strict`` or ``nostrict``
  and is only valid for PF devices.
  In ``strict`` mode (the default), the hardware strictly matches the input
  flow based on the number of VLAN tags.

  Consider the following scenario with two rules:

  .. code-block:: console

     rule0:
         pattern: eth type is 0x0806
         actions: queue index 3
     rule1:
         pattern: eth type is 0x0806 / vlan vid is 20
         actions: queue index 4

  If the application selects ``strict`` mode, only ARP packets with VLAN 20
  are directed to queue 4, and ARP packets with any other VLAN ID cannot be
  directed to the specified queue. If the application wants all ARP packets,
  with or without a VLAN tag, to be directed to the specified queue, it can
  select ``nostrict`` mode and only needs to set rule0.

  For example::

    -a 0000:7d:00.0,fdir_vlan_match_mode=nostrict

- ``fdir_tuple_config`` (default ``none``)

  Customizes the flow director tuples. The currently supported options are:

  - ``+outvlan-insmac``: disable the inner src mac tuple and enable the outer vlan tuple.
  - ``+outvlan-indmac``: disable the inner dst mac tuple and enable the outer vlan tuple.
  - ``+outvlan-insip``: disable the inner src ip tuple and enable the outer vlan tuple.
  - ``+outvlan-indip``: disable the inner dst ip tuple and enable the outer vlan tuple.
  - ``+outvlan-sctptag``: disable the sctp tag tuple and enable the outer vlan tuple.
  - ``+outvlan-tunvni``: disable the tunnel vni tuple and enable the outer vlan tuple.

- ``fdir_index_config`` (default ``hash``)

  Selects the flow director index strategy.
  The flow director index is the index position in the hardware flow director table.
  A lower index denotes a higher priority
  (when a packet matches multiple indexes, the smaller index wins).
  The currently supported options are:

  - ``hash``: the driver generates a flow index based on the hash of the rte_flow key.
  - ``priority``: the driver uses the rte_flow priority field as the flow director index.

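
As a rough illustration of the ``dev_caps_mask`` arithmetic described above,
the following C sketch clears the masked bits from a capability word.
The helper name ``apply_caps_mask`` and the 64-bit width are assumptions made
for this example only; they are not taken from the hns3 PMD source.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>
   #include <stdio.h>

   /*
    * Hypothetical helper (not a driver API): every bit that is set in
    * 'mask' is cleared from the capability value reported by firmware.
    */
   static uint64_t apply_caps_mask(uint64_t fw_caps, uint64_t mask)
   {
       return fw_caps & ~mask;
   }

   int main(void)
   {
       /* Values from the text: capability 0xFFFF, dev_caps_mask=0xF. */
       uint64_t caps = apply_caps_mask(0xFFFF, 0xF);

       /* bit0 to bit3 are masked off, leaving 0xFFF0. */
       assert(caps == 0xFFF0);
       printf("0x%llX\n", (unsigned long long)caps);
       return 0;
   }

With the default ``dev_caps_mask=0`` nothing is cleared, so the capability
value queried from the firmware is used unchanged.
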

Driver compilation and testing
------------------------------

Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
for details.

Sample Application Notes
------------------------

VLAN filter
~~~~~~~~~~~

The VLAN filter only works when promiscuous mode is off.

To start ``testpmd`` and add VLAN 10 to port 0:

.. code-block:: console

    ./<build_dir>/app/dpdk-testpmd -l 0-15 -n 4 -- -i --forward-mode=mac
    ...

    testpmd> set promisc 0 off
    testpmd> vlan set filter on 0
    testpmd> rx_vlan add 10 0


Flow Director
~~~~~~~~~~~~~

The Flow Director works in receive mode to identify specific flows or sets of
flows and route them to specific queues.
Flow Director filters can match different fields for different packet types:
the flow type and a specific input set per flow type.


Start ``testpmd``:

.. code-block:: console

   ./<build_dir>/app/dpdk-testpmd -l 0-15 -n 4 -- -i --rxq=8 --txq=8 \
                                  --nb-cores=8 --nb-ports=1

Add a rule to direct ``ipv4-udp`` packets with ``dst_ip=2.2.2.5, src_ip=2.2.2.3,
src_port=32, dst_port=32`` to queue 1:

.. code-block:: console

   testpmd> flow create 0 ingress pattern eth / ipv4 src is 2.2.2.3 \
            dst is 2.2.2.5 / udp src is 32 dst is 32 / end \
            actions mark id 1 / queue index 1 / end

The flow rules::

   rule-0: flow create 0 ingress pattern eth / end \
           actions queue index 1 / end
   rule-1: flow create 0 ingress pattern eth / vlan vid is 10 / end \
           actions queue index 1 / end
   rule-2: flow create 0 ingress pattern eth / vlan / vlan vid is 10 / end \
           actions queue index 1 / end
   rule-3: flow create 0 ingress pattern eth / vlan vid is 10 / vlan vid is 11 / end \
           actions queue index 1 / end

will match the following packet types, with the specific VLAN ID at the specified
VLAN layer and any VLAN ID at the remaining VLAN layers.


  +--------+------------------+-------------------------------------------+
  | rules  | ``strict``       | ``nostrict``                              |
  +========+==================+===========================================+
  | rule-0 | untagged         | untagged || single-tagged || multi-tagged |
  +--------+------------------+-------------------------------------------+
  | rule-1 | single-tagged    | single-tagged || multi-tagged             |
  +--------+------------------+-------------------------------------------+
  | rule-2 | double-tagged    | multi-tagged                              |
  +--------+------------------+-------------------------------------------+
  | rule-3 | double-tagged    | multi-tagged                              |
  +--------+------------------+-------------------------------------------+

The attributes ``has_vlan`` and ``has_more_vlan`` are supported.
The usage is as follows::

   rule-4: flow create 0 ingress pattern eth has_vlan is 1 / end \
           actions queue index 1 / end
   rule-5: flow create 0 ingress pattern eth has_vlan is 0 / end \
           actions queue index 1 / end
   rule-6: flow create 0 ingress pattern eth / vlan has_more_vlan is 1 / end \
           actions queue index 1 / end
   rule-7: flow create 0 ingress pattern eth / vlan has_more_vlan is 0 / end \
           actions queue index 1 / end

They will match the following packet types with any VLAN ID.


  +--------+------------------+-------------------------------------------+
  | rules  | ``strict``       | ``nostrict``                              |
  +========+==================+===========================================+
  | rule-4 | single-tagged    | untagged || single-tagged || multi-tagged |
  +--------+------------------+-------------------------------------------+
  | rule-5 | untagged         | untagged || single-tagged || multi-tagged |
  +--------+------------------+-------------------------------------------+
  | rule-6 | double-tagged    | untagged || single-tagged || multi-tagged |
  +--------+------------------+-------------------------------------------+
  | rule-7 | single-tagged    | untagged || single-tagged || multi-tagged |
  +--------+------------------+-------------------------------------------+

These two fields may be followed by a VLAN item,
in which case they may partially overlap or conflict with it.
For example, rule-8 is rejected by the driver,
while rule-9 and rule-10 are equivalent to rule-4.
The same applies to ``has_more_vlan``.

::

   rule-8: flow create 0 ingress pattern eth has_vlan is 0 / vlan / end \
           actions queue index 1 / end
   rule-9: flow create 0 ingress pattern eth has_vlan is 1 / vlan / end \
           actions queue index 1 / end
   rule-10: flow create 0 ingress pattern eth / vlan / end \
           actions queue index 1 / end


Generic flow API
~~~~~~~~~~~~~~~~

- ``RSS Flow``

  RSS Flow supports creating rules based on the input tuple, hash key, queues
  and hash algorithm. However, the hash key, queues and hash algorithm are
  global hardware configuration, so changing them affects other rules.
  A rule that sets only the input tuple is completely independent.

  In addition, if a rule priority level is set, no error is reported,
  but the priority level does not take effect.

  Run ``testpmd``:

  .. code-block:: console

     dpdk-testpmd -a 0000:7d:00.0 -l 10-18 -- -i --rxq=8 --txq=8

  All IP packets can be distributed to 8 queues.

  Distribute IPv4-TCP packets to 8 queues based on the L3/L4 source fields only.

  .. code-block:: console

     testpmd> flow create 0 ingress pattern eth / ipv4 / tcp / end actions \
              rss types ipv4-tcp l4-src-only l3-src-only end queues end / end

  Disable the RSS hash for IPv4 packets.

  .. code-block:: console

     testpmd> flow create 0 ingress pattern eth / ipv4 / end actions rss \
              types none end queues end / end

  Set the hash function to symmetric Toeplitz.

  .. code-block:: console

     testpmd> flow create 0 ingress pattern end actions rss types end \
              queues end func symmetric_toeplitz / end

  In this case, all packets with RSS enabled are hashed using the symmetric
  Toeplitz algorithm.

  Flush all RSS rules.

  .. code-block:: console

     testpmd> flow flush 0

  The hardware RSS configuration reverts to the one set through the ethdev ops.

Statistics
----------

HNS3 supports various methods to report statistics:

Port statistics can be queried using ``rte_eth_stats_get()``. They report the
number of packets received or sent successfully by the PMD, while the received
and sent byte counts are maintained by software only. The imissed counter is the
number of packets that could not be delivered to software because a queue was
full. The oerror counter is the number of packets dropped by hardware in Tx.

Extended statistics can be queried using ``rte_eth_xstats_get()``. The extended
statistics expose a wider set of counters counted by the device. The extended
port statistics contain per-queue packet statistics, MAC statistics, the HW reset
count and the IO error count.

Finally, per-flow statistics can be queried using ``rte_flow_query`` when a count
action is attached to a specific flow.
The flow counter counts the number of packets
received successfully by the port that match the specific flow.

Performance tuning
------------------

Hardware configuration
~~~~~~~~~~~~~~~~~~~~~~

32 GB DIMMs are used to ensure that each channel is fully configured.
Dynamic CPU Tuning is disabled.

Queue depth configuration
~~~~~~~~~~~~~~~~~~~~~~~~~

According to actual tests, performance is best when the queue depth
ranges from 1024 to 2048.

IO burst configuration
~~~~~~~~~~~~~~~~~~~~~~

According to actual tests, performance is best when the IO burst is set to 64.
The IO burst is the number of packets per burst.

Queue number configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Performance is better when the number of port queues corresponds to the
number of CPU cores.

Hugepage configuration
~~~~~~~~~~~~~~~~~~~~~~

For systems with 4 KB pages, 1 GB hugepages are recommended. For systems with
64 KB pages, 512 MB hugepages are recommended.

CPU core isolation
~~~~~~~~~~~~~~~~~~

To reduce the possibility of context switching, kernel isolation parameters should
be provided so that the CPU cores used by DPDK application threads are not scheduled
for other tasks. Add the kernel isolation boot parameters before starting the Linux OS.
For example, ``isolcpus=1-18 nohz_full=1-18 rcu_nocbs=1-18``.

Dump registers
--------------

HNS3 supports dumping register values with their names,
and supports filtering by module name.
The available module names are ``bios``, ``ssu``, ``igu_egu``,
``rpu``, ``ncsi``, ``rtc``, ``ppp``, ``rcb``, ``tqp``, ``cmdq``,
``common_pf``, ``common_vf``, ``ring``, ``tqp_intr``, ``32_bit_dfx``,
``64_bit_dfx``.

Limitations or Known issues
---------------------------

Currently, VF devices driven by the DPDK driver are supported only when the PF
is driven by the kernel mode hns3 ethdev driver.
VF is not supported when the PF is driven by
the DPDK driver.

For the sake of Rx/Tx performance, IEEE 1588 is not supported when using the
vec or sve burst function. When IEEE 1588 is enabled, the Rx/Tx burst mode
should be simple or common. It is recommended to enable IEEE 1588 before
starting the ethdev, so that the correct Rx/Tx burst function is selected.

Building with ICC is not supported yet.
X86-32, Power8, ARMv7 and BSD are not supported yet.