..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(C) 2021 Marvell.

Marvell cnxk platform guide
===========================

This document gives an overview of the **Marvell OCTEON CN9K and CN10K** RVU H/W blocks,
the packet flow, and the procedure to build DPDK on the OCTEON cnxk platform.

More information about the CN9K and CN10K SoCs can be found at the `Marvell Official Website
<https://www.marvell.com/embedded-processors/infrastructure-processors/>`_.

Supported OCTEON cnxk SoCs
--------------------------

- CN106xx
- CNF105xx

Resource Virtualization Unit architecture
-----------------------------------------

The :numref:`figure_cnxk_resource_virtualization` diagram depicts the
RVU architecture and a resource provisioning example.

.. _figure_cnxk_resource_virtualization:

.. figure:: img/cnxk_resource_virtualization.*

    cnxk Resource virtualization architecture and provisioning example


The Resource Virtualization Unit (RVU) on Marvell's OCTEON CN9K/CN10K SoC maps HW
resources belonging to the network, crypto and other functional blocks onto
PCI-compatible physical and virtual functions.

Each functional block has multiple local functions (LFs) for
provisioning to different PCIe devices. The RVU supports multiple PCIe SR-IOV
physical functions (PFs) and virtual functions (VFs).

:numref:`table_cnxk_rvu_dpdk_mapping` shows the various local
functions (LFs) provided by the RVU and their functional mapping to
DPDK subsystems.

.. _table_cnxk_rvu_dpdk_mapping:

.. table:: RVU managed functional blocks and their mapping to DPDK subsystems

   +----+-----+---------------------------------------------------------------+
   | #  | LF  | DPDK subsystem mapping                                        |
   +====+=====+===============================================================+
   | 1  | NIX | rte_ethdev, rte_tm, rte_event_eth_[rt]x_adapter, rte_security |
   +----+-----+---------------------------------------------------------------+
   | 2  | NPA | rte_mempool                                                   |
   +----+-----+---------------------------------------------------------------+
   | 3  | NPC | rte_flow                                                      |
   +----+-----+---------------------------------------------------------------+
   | 4  | CPT | rte_cryptodev, rte_event_crypto_adapter                       |
   +----+-----+---------------------------------------------------------------+
   | 5  | SSO | rte_eventdev                                                  |
   +----+-----+---------------------------------------------------------------+
   | 6  | TIM | rte_event_timer_adapter                                       |
   +----+-----+---------------------------------------------------------------+
   | 7  | LBK | rte_ethdev                                                    |
   +----+-----+---------------------------------------------------------------+
   | 8  | DPI | rte_rawdev                                                    |
   +----+-----+---------------------------------------------------------------+
   | 9  | SDP | rte_ethdev                                                    |
   +----+-----+---------------------------------------------------------------+
   | 10 | REE | rte_regexdev                                                  |
   +----+-----+---------------------------------------------------------------+

PF0 is called the administrative / admin function (AF) and has exclusive
privileges to provision the LFs of the RVU functional blocks to each PF/VF.

PF/VFs communicate with the AF via a shared memory region (mailbox). Upon receiving
requests from a PF/VF, the AF performs resource provisioning and other HW configuration.

The AF is always attached to the host, but PF/VFs may be used by the host kernel itself,
or attached to VMs or to userspace applications such as DPDK. The AF therefore has to
handle provisioning/configuration requests sent by any device from any domain.
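
Which domain currently owns a given PF/VF can be checked from the host through
sysfs, where each PCI function has a ``driver`` symlink naming the driver bound
to it. The following is a minimal sketch rather than a supported tool: the helper
name ``rvu_bound_driver`` and the ``SYSFS_ROOT`` override are illustrative
assumptions, and the loop assumes that the RVU functions report the Cavium PCI
vendor ID ``0x177d``.

.. code-block:: console

   # Print the driver bound to a PCI function, or "none" if it is unbound.
   # SYSFS_ROOT is a hypothetical override, handy for trying this on a mock tree.
   rvu_bound_driver() {
       link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
       if [ -L "$link" ]; then
           basename "$(readlink "$link")"
       else
           echo "none"
       fi
   }

   # List every function with the (assumed) Cavium vendor ID and its driver.
   for dev in "${SYSFS_ROOT:-/sys}"/bus/pci/devices/*; do
       [ -r "$dev/vendor" ] || continue
       [ "$(cat "$dev/vendor")" = "0x177d" ] || continue
       bdf="$(basename "$dev")"
       echo "$bdf $(rvu_bound_driver "$bdf")"
   done

A function reported as bound to ``vfio-pci`` is available to a userspace
application such as DPDK, while one bound to a kernel netdev driver is in use
by the host kernel.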

The AF driver does not receive or process any data.
It is only a configuration driver used in the control path.

The :numref:`figure_cnxk_resource_virtualization` diagram also shows a
resource provisioning example where:

1. PFx and PFx-VF0 are bound to the Linux netdev driver.
2. The PFx-VF1 ethdev driver is bound to the first DPDK application.
3. The PFy ethdev driver, PFy-VF0 ethdev driver, PFz eventdev driver and
   PFm-VF0 cryptodev driver are bound to the second DPDK application.

LBK HW Access
-------------

The Loopback HW Unit (LBK) receives packets from NIX-RX and sends packets back to NIX-TX.
The loopback block has N channels and contains data buffering that is shared across
all channels. The LBK HW Unit is abstracted using the ethdev subsystem, where PF0's
VFs are exposed as ethdev devices and odd-even pairs of VFs are tied together;
that is, packets sent on the odd VF are received on the even VF and vice versa.
This enables a HW accelerated means of communication between two domains,
where the even VF is bound to the first domain and the odd VF is bound to the second domain.

Typical application usage models are:

#. Communication between the Linux kernel and a DPDK application.
#. Exception path to the Linux kernel from a DPDK application as a SW ``KNI`` replacement.
#. Communication between two different DPDK applications.

SDP interface
-------------

The System DPI Packet Interface unit (SDP) provides PCIe endpoint support for a remote host
to DMA packets into and out of the cnxk SoC. The SDP interface becomes available only when
the cnxk SoC is connected in PCIe endpoint mode. It can be used to send/receive
packets to/from the remote host machine using the input/output queue pairs exposed to it.
The SDP interface receives input packets from the remote host through NIX-RX and sends packets
to the remote host using NIX-TX. The remote host machine needs to use the corresponding driver
(kernel/user mode) to communicate with the SDP interface on the cnxk SoC.
SDP supports a single PCIe SR-IOV physical function (PF) and multiple virtual functions (VFs).
Users can bind the PF or a VF to use the SDP interface, and it will be enumerated as an ethdev port.

The primary use case for SDP is the smart NIC. Typical usage models are:

#. Communication channel between the remote host and the cnxk SoC over PCIe.
#. Transfer of packets received from the network interface to the remote host over PCIe and
   vice versa.

cnxk packet flow
----------------

The :numref:`figure_cnxk_packet_flow_hw_accelerators` diagram depicts
the packet flow on the cnxk SoC in conjunction with the use of various HW accelerators.

.. _figure_cnxk_packet_flow_hw_accelerators:

.. figure:: img/cnxk_packet_flow_hw_accelerators.*

    cnxk packet flow in conjunction with use of HW accelerators

HW Offload Drivers
------------------

This section lists the dataplane H/W block(s) available in the cnxk SoC.

Procedure to Setup Platform
---------------------------

There are three main prerequisites for setting up DPDK on a cnxk
compatible board:

1. **RVU AF Linux kernel driver**

   The dependent kernel drivers can be obtained from
   `kernel.org <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/marvell/octeontx2>`_.

   Alternatively, the Marvell SDK also provides the required kernel drivers.

   The Linux kernel should be configured with the following features enabled:

   .. code-block:: console

      # 64K pages enabled for better performance
      CONFIG_ARM64_64K_PAGES=y
      CONFIG_ARM64_VA_BITS_48=y
      # huge pages support enabled
      CONFIG_HUGETLBFS=y
      CONFIG_HUGETLB_PAGE=y
      # VFIO enabled with TYPE1 IOMMU at minimum
      CONFIG_VFIO_IOMMU_TYPE1=y
      CONFIG_VFIO_VIRQFD=y
      CONFIG_VFIO=y
      CONFIG_VFIO_NOIOMMU=y
      CONFIG_VFIO_PCI=y
      CONFIG_VFIO_PCI_MMAP=y
      # SMMUv3 driver
      CONFIG_ARM_SMMU_V3=y
      # ARMv8.1 LSE atomics
      CONFIG_ARM64_LSE_ATOMICS=y
      # OCTEONTX2 drivers
      CONFIG_OCTEONTX2_MBOX=y
      CONFIG_OCTEONTX2_AF=y
      # Enable if netdev PF driver required
      CONFIG_OCTEONTX2_PF=y
      # Enable if netdev VF driver required
      CONFIG_OCTEONTX2_VF=y
      CONFIG_CRYPTO_DEV_OCTEONTX2_CPT=y
      # Enable if OCTEONTX2 DMA PF driver required
      CONFIG_OCTEONTX2_DPI_PF=n

2. **ARM64 Linux Tool Chain**

   For example, the *aarch64* Linaro Toolchain, which can be obtained from
   `here <https://releases.linaro.org/components/toolchain/binaries/7.4-2019.02/aarch64-linux-gnu/>`_.

   Alternatively, the Marvell SDK also provides a GNU GCC toolchain, which is
   optimized for the cnxk CPU.

3. **Root file system**

   Any *aarch64* supporting filesystem may be used. For example, the
   Ubuntu 15.10 (Wily) or 16.04 LTS (Xenial) userland, which can be obtained
   from `<http://cdimage.ubuntu.com/ubuntu-base/releases/16.04/release/ubuntu-base-16.04.1-base-arm64.tar.gz>`_.

   Alternatively, the Marvell SDK provides a buildroot based root filesystem.
   The SDK includes all the above prerequisites necessary to bring up the cnxk board.

- Follow the DPDK :doc:`../linux_gsg/index` to set up the basic DPDK environment.


Debugging Options
-----------------

.. _table_cnxk_common_debug_options:

.. table:: cnxk common debug options

   +---+-----------+-----------------------------------+
   | # | Component | EAL log command                   |
   +===+===========+===================================+
   | 1 | Common    | --log-level='pmd\.cnxk\.base,8'   |
   +---+-----------+-----------------------------------+
   | 2 | Mailbox   | --log-level='pmd\.cnxk\.mbox,8'   |
   +---+-----------+-----------------------------------+

Debugfs support
~~~~~~~~~~~~~~~

The **RVU AF Linux kernel driver** provides support to dump RVU block
context or stats using debugfs.

Enable ``debugfs`` as follows:

1. Compile the kernel with debugfs enabled, i.e. ``CONFIG_DEBUG_FS=y``.
2. Boot the OCTEON CN9K/CN10K with the debugfs supported kernel.
3. Verify that ``debugfs`` is mounted by default with ``mount | grep -i debugfs``,
   or mount it manually using:

   .. code-block:: console

      # mount -t debugfs none /sys/kernel/debug

Currently, ``debugfs`` supports the following RVU blocks: NIX, NPA, NPC, NDC,
SSO and RPM.

The file structure under ``/sys/kernel/debug`` is as follows:

.. code-block:: console

   octeontx2/
   |
   cn10k/
   |-- rpm
   |   |-- rpm0
   |   |   '-- lmac0
   |   |       '-- stats
   |   |-- rpm1
   |   |   |-- lmac0
   |   |   |   '-- stats
   |   |   '-- lmac1
   |   |       '-- stats
   |   '-- rpm2
   |       '-- lmac0
   |           '-- stats
   |-- cpt
   |   |-- cpt_engines_info
   |   |-- cpt_engines_sts
   |   |-- cpt_err_info
   |   |-- cpt_lfs_info
   |   '-- cpt_pc
   |-- nix
   |   |-- cq_ctx
   |   |-- ndc_rx_cache
   |   |-- ndc_rx_hits_miss
   |   |-- ndc_tx_cache
   |   |-- ndc_tx_hits_miss
   |   |-- qsize
   |   |-- rq_ctx
   |   '-- sq_ctx
   |-- npa
   |   |-- aura_ctx
   |   |-- ndc_cache
   |   |-- ndc_hits_miss
   |   |-- pool_ctx
   |   '-- qsize
   |-- npc
   |   |-- mcam_info
   |   |-- mcam_rules
   |   '-- rx_miss_act_stats
   |-- rsrc_alloc
   '-- sso
       |-- hws
       |   '-- sso_hws_info
       '-- hwgrp
           |-- sso_hwgrp_aq_thresh
           |-- sso_hwgrp_iaq_walk
           |-- sso_hwgrp_pc
           |-- sso_hwgrp_free_list_walk
           |-- sso_hwgrp_ient_walk
           '-- sso_hwgrp_taq_walk

RVU block LF allocation:

.. code-block:: console

   cat /sys/kernel/debug/cn10k/rsrc_alloc

   pcifunc    NPA    NIX    SSO GROUP    SSOWS    TIM    CPT
   PF1         0       0
   PF4                 1
   PF13                          0, 1     0, 1      0

RPM example usage:

.. code-block:: console

   cat /sys/kernel/debug/cn10k/rpm/rpm0/lmac0/stats

   =======Link Status======

   Link is UP 25000 Mbps

   =======NIX RX_STATS(rpm port level)======

   rx_ucast_frames: 0
   rx_mcast_frames: 0
   rx_bcast_frames: 0
   rx_frames: 0
   rx_bytes: 0
   rx_drops: 0
   rx_errors: 0

   =======NIX TX_STATS(rpm port level)======

   tx_ucast_frames: 0
   tx_mcast_frames: 0
   tx_bcast_frames: 0
   tx_frames: 0
   tx_bytes: 0
   tx_drops: 0

   =======rpm RX_STATS======

   Octets of received packets: 0
   Octets of received packets with out error: 0
   Received packets with alignment errors: 0
   Control/PAUSE packets received: 0
   Packets received with Frame too long Errors: 0
   Packets received with a1nrange length Errors: 0
   Received packets: 0
   Packets received with FrameCheckSequenceErrors: 0
   Packets received with VLAN header: 0
   Error packets: 0
   Packets recievd with unicast DMAC: 0
   Packets received with multicast DMAC: 0
   Packets received with broadcast DMAC: 0
   Dropped packets: 0
   Total frames received on interface: 0
   Packets received with an octet count < 64: 0
   Packets received with an octet count == 64: 0
   Packets received with an octet count of 65–127: 0
   Packets received with an octet count of 128-255: 0
   Packets received with an octet count of 256-511: 0
   Packets received with an octet count of 512-1023: 0
   Packets received with an octet count of 1024-1518: 0
   Packets received with an octet count of > 1518: 0
   Oversized Packets: 0
   Jabber Packets: 0
   Fragmented Packets: 0
   CBFC(class based flow control) pause frames received for class 0: 0
   CBFC pause frames received for class 1: 0
   CBFC pause frames received for class 2: 0
   CBFC pause frames received for class 3: 0
   CBFC pause frames received for class 4: 0
   CBFC pause frames received for class 5: 0
   CBFC pause frames received for class 6: 0
   CBFC pause frames received for class 7: 0
   CBFC pause frames received for class 8: 0
   CBFC pause frames received for class 9: 0
   CBFC pause frames received for class 10: 0
   CBFC pause frames received for class 11: 0
   CBFC pause frames received for class 12: 0
   CBFC pause frames received for class 13: 0
   CBFC pause frames received for class 14: 0
   CBFC pause frames received for class 15: 0
   MAC control packets received: 0

   =======rpm TX_STATS======

   Total octets sent on the interface: 0
   Total octets transmitted OK: 0
   Control/Pause frames sent: 0
   Total frames transmitted OK: 0
   Total frames sent with VLAN header: 0
   Error Packets: 0
   Packets sent to unicast DMAC: 0
   Packets sent to the multicast DMAC: 0
   Packets sent to a broadcast DMAC: 0
   Packets sent with an octet count == 64: 0
   Packets sent with an octet count of 65–127: 0
   Packets sent with an octet count of 128-255: 0
   Packets sent with an octet count of 256-511: 0
   Packets sent with an octet count of 512-1023: 0
   Packets sent with an octet count of 1024-1518: 0
   Packets sent with an octet count of > 1518: 0
   CBFC(class based flow control) pause frames transmitted for class 0: 0
   CBFC pause frames transmitted for class 1: 0
   CBFC pause frames transmitted for class 2: 0
   CBFC pause frames transmitted for class 3: 0
   CBFC pause frames transmitted for class 4: 0
   CBFC pause frames transmitted for class 5: 0
   CBFC pause frames transmitted for class 6: 0
   CBFC pause frames transmitted for class 7: 0
   CBFC pause frames transmitted for class 8: 0
   CBFC pause frames transmitted for class 9: 0
   CBFC pause frames transmitted for class 10: 0
   CBFC pause frames transmitted for class 11: 0
   CBFC pause frames transmitted for class 12: 0
   CBFC pause frames transmitted for class 13: 0
   CBFC pause frames transmitted for class 14: 0
   CBFC pause frames transmitted for class 15: 0
   MAC control packets sent: 0
   Total frames sent on the interface: 0

CPT example usage:

.. code-block:: console

   cat /sys/kernel/debug/cn10k/cpt/cpt_pc

   CPT instruction requests               0
   CPT instruction latency                0
   CPT NCB read requests                  0
   CPT NCB read latency                   0
   CPT read requests caused by UC fills   0
   CPT active cycles pc                   1395642
   CPT clock count pc                     5579867595493

NIX example usage:

.. code-block:: console

   Usage: echo <nixlf> [cq number/all] > /sys/kernel/debug/cn10k/nix/cq_ctx
          cat /sys/kernel/debug/cn10k/nix/cq_ctx
   echo 0 0 > /sys/kernel/debug/cn10k/nix/cq_ctx
   cat /sys/kernel/debug/cn10k/nix/cq_ctx

   =====cq_ctx for nixlf:0 and qidx:0 is=====
   W0: base                        158ef1a00

   W1: wrptr                       0
   W1: avg_con                     0
   W1: cint_idx                    0
   W1: cq_err                      0
   W1: qint_idx                    0
   W1: bpid                        0
   W1: bp_ena                      0

   W2: update_time                 31043
   W2:avg_level                    255
   W2: head                        0
   W2:tail                         0

   W3: cq_err_int_ena              5
   W3:cq_err_int                   0
   W3: qsize                       4
   W3:caching                      1
   W3: substream                   0x000
   W3: ena                         1
   W3: drop_ena                    1
   W3: drop                        64
   W3: bp                          0

NPA example usage:

.. code-block:: console

   Usage: echo <npalf> [pool number/all] > /sys/kernel/debug/cn10k/npa/pool_ctx
          cat /sys/kernel/debug/cn10k/npa/pool_ctx
   echo 0 0 > /sys/kernel/debug/cn10k/npa/pool_ctx
   cat /sys/kernel/debug/cn10k/npa/pool_ctx

   ======POOL : 0=======
   W0: Stack base                  1375bff00
   W1: ena                         1
   W1: nat_align                   1
   W1: stack_caching               1
   W1: stack_way_mask              0
   W1: buf_offset                  1
   W1: buf_size                    19
   W2: stack_max_pages             24315
   W2: stack_pages                 24314
   W3: op_pc                       267456
   W4: stack_offset                2
   W4: shift                       5
   W4: avg_level                   255
   W4: avg_con                     0
   W4: fc_ena                      0
   W4: fc_stype                    0
   W4: fc_hyst_bits                0
   W4: fc_up_crossing              0
   W4: update_time                 62993
   W5: fc_addr                     0
   W6: ptr_start                   1593adf00
   W7: ptr_end                     180000000
   W8: err_int                     0
   W8: err_int_ena                 7
   W8: thresh_int                  0
   W8: thresh_int_ena              0
   W8: thresh_up                   0
   W8: thresh_qint_idx             0
   W8: err_qint_idx                0

NPC example usage:

.. code-block:: console

   cat /sys/kernel/debug/cn10k/npc/mcam_info

   NPC MCAM info:
   RX keywidth    : 224bits
   TX keywidth    : 224bits

   MCAM entries   : 2048
   Reserved       : 158
   Available      : 1890

   MCAM counters  : 512
   Reserved       : 1
   Available      : 511

SSO example usage:

.. code-block:: console

   Usage: echo [<hws>/all] > /sys/kernel/debug/cn10k/sso/hws/sso_hws_info
   echo 0 > /sys/kernel/debug/cn10k/sso/hws/sso_hws_info

   ==================================================
   SSOW HWS[0] Arbitration State      0x0
   SSOW HWS[0] Guest Machine Control  0x0
   SSOW HWS[0] SET[0] Group Mask[0] 0xffffffffffffffff
   SSOW HWS[0] SET[0] Group Mask[1] 0xffffffffffffffff
   SSOW HWS[0] SET[0] Group Mask[2] 0xffffffffffffffff
   SSOW HWS[0] SET[0] Group Mask[3] 0xffffffffffffffff
   SSOW HWS[0] SET[1] Group Mask[0] 0xffffffffffffffff
   SSOW HWS[0] SET[1] Group Mask[1] 0xffffffffffffffff
   SSOW HWS[0] SET[1] Group Mask[2] 0xffffffffffffffff
   SSOW HWS[0] SET[1] Group Mask[3] 0xffffffffffffffff
   ==================================================

Compile DPDK
------------

DPDK may be compiled either natively on the OCTEON CN9K/CN10K platform or cross-compiled on
an x86 based platform.

Native Compilation
~~~~~~~~~~~~~~~~~~

.. code-block:: console

   meson build
   ninja -C build

Cross Compilation
~~~~~~~~~~~~~~~~~

Refer to :doc:`../linux_gsg/cross_build_dpdk_for_arm64` for generic arm64 details.

.. code-block:: console

   meson build --cross-file config/arm/arm64_cn10k_linux_gcc
   ninja -C build

.. note::

   By default, meson cross compilation uses the ``aarch64-linux-gnu-gcc`` toolchain.
   If the Marvell toolchain is available, it can be used instead by overriding the
   c, cpp, ar and strip ``binaries`` attributes with the respective Marvell
   toolchain binaries in the ``config/arm/arm64_cn10k_linux_gcc`` file.
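
As an illustration of the override described in the note, the ``[binaries]``
section of the cross file might be edited along the following lines. The
``aarch64-marvell-linux-gnu-`` prefix is a hypothetical placeholder, not a
confirmed name; substitute the prefix of the toolchain shipped with your SDK
and leave the other sections of the file unchanged.

.. code-block:: ini

   [binaries]
   # Hypothetical Marvell toolchain prefix; adjust to match the installed SDK.
   c = 'aarch64-marvell-linux-gnu-gcc'
   cpp = 'aarch64-marvell-linux-gnu-g++'
   ar = 'aarch64-marvell-linux-gnu-ar'
   strip = 'aarch64-marvell-linux-gnu-strip'

The named binaries must be reachable through ``PATH`` when ``meson`` is run,
or be given as absolute paths.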