15630257fSFerruh Yigit.. SPDX-License-Identifier: BSD-3-Clause 25630257fSFerruh Yigit Copyright(c) 2010-2014 Intel Corporation. 3fc1f2750SBernard Iremonger 4*41dd9a6bSDavid YoungPacket Framework Library 5*41dd9a6bSDavid Young======================== 6fc1f2750SBernard Iremonger 7fc1f2750SBernard IremongerDesign Objectives 8fc1f2750SBernard Iremonger----------------- 9fc1f2750SBernard Iremonger 1048624fd9SSiobhan ButlerThe main design objectives for the DPDK Packet Framework are: 11fc1f2750SBernard Iremonger 12fc1f2750SBernard Iremonger* Provide standard methodology to build complex packet processing pipelines. 13fc1f2750SBernard Iremonger Provide reusable and extensible templates for the commonly used pipeline functional blocks; 14fc1f2750SBernard Iremonger 15fc1f2750SBernard Iremonger* Provide capability to switch between pure software and hardware-accelerated implementations for the same pipeline functional block; 16fc1f2750SBernard Iremonger 17fc1f2750SBernard Iremonger* Provide the best trade-off between flexibility and performance. 18fc1f2750SBernard Iremonger Hardcoded pipelines usually provide the best performance, but are not flexible, 19fc1f2750SBernard Iremonger while developing flexible frameworks is never a problem, but performance is usually low; 20fc1f2750SBernard Iremonger 21fc1f2750SBernard Iremonger* Provide a framework that is logically similar to Open Flow. 22fc1f2750SBernard Iremonger 23fc1f2750SBernard IremongerOverview 24fc1f2750SBernard Iremonger-------- 25fc1f2750SBernard Iremonger 26fc1f2750SBernard IremongerPacket processing applications are frequently structured as pipelines of multiple stages, 27fc1f2750SBernard Iremongerwith the logic of each stage glued around a lookup table. 28fc1f2750SBernard IremongerFor each incoming packet, the table defines the set of actions to be applied to the packet, 29fc1f2750SBernard Iremongeras well as the next stage to send the packet to. 30fc1f2750SBernard Iremonger 3148624fd9SSiobhan ButlerThe DPDK Packet Framework minimizes the development effort required to build packet processing pipelines 32fc1f2750SBernard Iremongerby defining a standard methodology for pipeline development, 33fc1f2750SBernard Iremongeras well as providing libraries of reusable templates for the commonly used pipeline blocks. 34fc1f2750SBernard Iremonger 35fc1f2750SBernard IremongerThe pipeline is constructed by connecting the set of input ports with the set of output ports 36fc1f2750SBernard Iremongerthrough the set of tables in a tree-like topology. 37fc1f2750SBernard IremongerAs result of lookup operation for the current packet in the current table, 38fc1f2750SBernard Iremongerone of the table entries (on lookup hit) or the default table entry (on lookup miss) 39fc1f2750SBernard Iremongerprovides the set of actions to be applied on the current packet, 40fc1f2750SBernard Iremongeras well as the next hop for the packet, which can be either another table, an output port or packet drop. 41fc1f2750SBernard Iremonger 424a22e6eeSJohn McNamaraAn example of packet processing pipeline is presented in :numref:`figure_figure32`: 43fc1f2750SBernard Iremonger 444a22e6eeSJohn McNamara.. _figure_figure32: 45fc1f2750SBernard Iremonger 464a22e6eeSJohn McNamara.. figure:: img/figure32.* 47fc1f2750SBernard Iremonger 484a22e6eeSJohn McNamara Example of Packet Processing Pipeline where Input Ports 0 and 1 494a22e6eeSJohn McNamara are Connected with Output Ports 0, 1 and 2 through Tables 0 and 1 50fc1f2750SBernard Iremonger 51fc1f2750SBernard Iremonger 52fc1f2750SBernard IremongerPort Library Design 53fc1f2750SBernard Iremonger------------------- 54fc1f2750SBernard Iremonger 55fc1f2750SBernard IremongerPort Types 56fc1f2750SBernard Iremonger~~~~~~~~~~ 57fc1f2750SBernard Iremonger 588c9a3374SJohn McNamara:numref:`table_qos_19` is a non-exhaustive list of ports that can be implemented with the Packet Framework. 59fc1f2750SBernard Iremonger 608c9a3374SJohn McNamara.. _table_qos_19: 61fc1f2750SBernard Iremonger 628c9a3374SJohn McNamara.. table:: Port Types 63fc1f2750SBernard Iremonger 64fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 65fc1f2750SBernard Iremonger | # | Port type | Description | 66fc1f2750SBernard Iremonger | | | | 67fc1f2750SBernard Iremonger +===+==================+=======================================================================================+ 68fc1f2750SBernard Iremonger | 1 | SW ring | SW circular buffer used for message passing between the application threads. Uses | 6948624fd9SSiobhan Butler | | | the DPDK rte_ring primitive. Expected to be the most commonly used type of | 70fc1f2750SBernard Iremonger | | | port. | 71fc1f2750SBernard Iremonger | | | | 72fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 73fc1f2750SBernard Iremonger | 2 | HW ring | Queue of buffer descriptors used to interact with NIC, switch or accelerator ports. | 7448624fd9SSiobhan Butler | | | For NIC ports, it uses the DPDK rte_eth_rx_queue or rte_eth_tx_queue | 75fc1f2750SBernard Iremonger | | | primitives. | 76fc1f2750SBernard Iremonger | | | | 77fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 78fc1f2750SBernard Iremonger | 3 | IP reassembly | Input packets are either IP fragments or complete IP datagrams. Output packets are | 79fc1f2750SBernard Iremonger | | | complete IP datagrams. | 80fc1f2750SBernard Iremonger | | | | 81fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 82fc1f2750SBernard Iremonger | 4 | IP fragmentation | Input packets are jumbo (IP datagrams with length bigger than MTU) or non-jumbo | 83fc1f2750SBernard Iremonger | | | packets. Output packets are non-jumbo packets. | 84fc1f2750SBernard Iremonger | | | | 85fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 86fc1f2750SBernard Iremonger | 5 | Traffic manager | Traffic manager attached to a specific NIC output port, performing congestion | 87fc1f2750SBernard Iremonger | | | management and hierarchical scheduling according to pre-defined SLAs. | 88fc1f2750SBernard Iremonger | | | | 89fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 90f78c100bSStephen Hemminger | 6 | Source | Input port used as packet generator. Similar to Linux kernel /dev/zero character | 91fc1f2750SBernard Iremonger | | | device. | 92fc1f2750SBernard Iremonger | | | | 93fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 94f78c100bSStephen Hemminger | 7 | Sink | Output port used to drop all input packets. Similar to Linux kernel /dev/null | 95fc1f2750SBernard Iremonger | | | character device. | 96fc1f2750SBernard Iremonger | | | | 97fc1f2750SBernard Iremonger +---+------------------+---------------------------------------------------------------------------------------+ 98f78c100bSStephen Hemminger | 8 | Sym_crypto | Output port used to extract DPDK Cryptodev operations from a fixed offset of the | 99cc85c078SFan Zhang | | | packet and then enqueue to the Cryptodev PMD. Input port used to dequeue the | 100cc85c078SFan Zhang | | | Cryptodev operations from the Cryptodev PMD and then retrieve the packets from them. | 101cc85c078SFan Zhang +---+------------------+---------------------------------------------------------------------------------------+ 102fc1f2750SBernard Iremonger 103fc1f2750SBernard IremongerPort Interface 104fc1f2750SBernard Iremonger~~~~~~~~~~~~~~ 105fc1f2750SBernard Iremonger 106fc1f2750SBernard IremongerEach port is unidirectional, i.e. either input port or output port. 107fc1f2750SBernard IremongerEach input/output port is required to implement an abstract interface that 108fc1f2750SBernard Iremongerdefines the initialization and run-time operation of the port. 109fc1f2750SBernard IremongerThe port abstract interface is described in. 110fc1f2750SBernard Iremonger 1118c9a3374SJohn McNamara.. _table_qos_20: 112fc1f2750SBernard Iremonger 1138c9a3374SJohn McNamara.. table:: 20 Port Abstract Interface 114fc1f2750SBernard Iremonger 115fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 116fc1f2750SBernard Iremonger | # | Port Operation | Description | 117fc1f2750SBernard Iremonger | | | | 118fc1f2750SBernard Iremonger +===+================+=========================================================================================+ 119fc1f2750SBernard Iremonger | 1 | Create | Create the low-level port object (e.g. queue). Can internally allocate memory. | 120fc1f2750SBernard Iremonger | | | | 121fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 122fc1f2750SBernard Iremonger | 2 | Free | Free the resources (e.g. memory) used by the low-level port object. | 123fc1f2750SBernard Iremonger | | | | 124fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 125fc1f2750SBernard Iremonger | 3 | RX | Read a burst of input packets. Non-blocking operation. Only defined for input ports. | 126fc1f2750SBernard Iremonger | | | | 127fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 128fc1f2750SBernard Iremonger | 4 | TX | Write a burst of input packets. Non-blocking operation. Only defined for output ports. | 129fc1f2750SBernard Iremonger | | | | 130fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 131fc1f2750SBernard Iremonger | 5 | Flush | Flush the output buffer. Only defined for output ports. | 132fc1f2750SBernard Iremonger | | | | 133fc1f2750SBernard Iremonger +---+----------------+-----------------------------------------------------------------------------------------+ 134fc1f2750SBernard Iremonger 135fc1f2750SBernard IremongerTable Library Design 136fc1f2750SBernard Iremonger-------------------- 137fc1f2750SBernard Iremonger 138fc1f2750SBernard IremongerTable Types 139fc1f2750SBernard Iremonger~~~~~~~~~~~ 140fc1f2750SBernard Iremonger 1418c9a3374SJohn McNamara:numref:`table_qos_21` is a non-exhaustive list of types of tables that can be implemented with the Packet Framework. 142fc1f2750SBernard Iremonger 1438c9a3374SJohn McNamara.. _table_qos_21: 144fc1f2750SBernard Iremonger 1458c9a3374SJohn McNamara.. table:: Table Types 146fc1f2750SBernard Iremonger 147fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 148fc1f2750SBernard Iremonger | # | Table Type | Description | 149fc1f2750SBernard Iremonger | | | | 150fc1f2750SBernard Iremonger +===+============================+=============================================================================+ 151fc1f2750SBernard Iremonger | 1 | Hash table | Lookup key is n-tuple based. | 152fc1f2750SBernard Iremonger | | | | 153fc1f2750SBernard Iremonger | | | Typically, the lookup key is hashed to produce a signature that is used to | 154fc1f2750SBernard Iremonger | | | identify a bucket of entries where the lookup key is searched next. | 155fc1f2750SBernard Iremonger | | | | 156fc1f2750SBernard Iremonger | | | The signature associated with the lookup key of each input packet is either | 157fc1f2750SBernard Iremonger | | | read from the packet descriptor (pre-computed signature) or computed at | 158fc1f2750SBernard Iremonger | | | table lookup time. | 159fc1f2750SBernard Iremonger | | | | 160fc1f2750SBernard Iremonger | | | The table lookup, add entry and delete entry operations, as well as any | 161fc1f2750SBernard Iremonger | | | other pipeline block that pre-computes the signature all have to use the | 162fc1f2750SBernard Iremonger | | | same hashing algorithm to generate the signature. | 163fc1f2750SBernard Iremonger | | | | 164fc1f2750SBernard Iremonger | | | Typically used to implement flow classification tables, ARP caches, routing | 165fc1f2750SBernard Iremonger | | | table for tunnelling protocols, etc. | 166fc1f2750SBernard Iremonger | | | | 167fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 168fc1f2750SBernard Iremonger | 2 | Longest Prefix Match (LPM) | Lookup key is the IP address. | 169fc1f2750SBernard Iremonger | | | | 170fc1f2750SBernard Iremonger | | | Each table entries has an associated IP prefix (IP and depth). | 171fc1f2750SBernard Iremonger | | | | 172fc1f2750SBernard Iremonger | | | The table lookup operation selects the IP prefix that is matched by the | 173fc1f2750SBernard Iremonger | | | lookup key; in case of multiple matches, the entry with the longest prefix | 174fc1f2750SBernard Iremonger | | | depth wins. | 175fc1f2750SBernard Iremonger | | | | 176fc1f2750SBernard Iremonger | | | Typically used to implement IP routing tables. | 177fc1f2750SBernard Iremonger | | | | 178fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 179fc1f2750SBernard Iremonger | 3 | Access Control List (ACLs) | Lookup key is 7-tuple of two VLAN/MPLS labels, IP destination address, | 180fc1f2750SBernard Iremonger | | | IP source addresses, L4 protocol, L4 destination port, L4 source port. | 181fc1f2750SBernard Iremonger | | | | 182fc1f2750SBernard Iremonger | | | Each table entry has an associated ACL and priority. The ACL contains bit | 183fc1f2750SBernard Iremonger | | | masks for the VLAN/MPLS labels, IP prefix for IP destination address, IP | 184fc1f2750SBernard Iremonger | | | prefix for IP source addresses, L4 protocol and bitmask, L4 destination | 185fc1f2750SBernard Iremonger | | | port and bit mask, L4 source port and bit mask. | 186fc1f2750SBernard Iremonger | | | | 187fc1f2750SBernard Iremonger | | | The table lookup operation selects the ACL that is matched by the lookup | 188fc1f2750SBernard Iremonger | | | key; in case of multiple matches, the entry with the highest priority wins. | 189fc1f2750SBernard Iremonger | | | | 190fc1f2750SBernard Iremonger | | | Typically used to implement rule databases for firewalls, etc. | 191fc1f2750SBernard Iremonger | | | | 192fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 193fc1f2750SBernard Iremonger | 4 | Pattern matching search | Lookup key is the packet payload. | 194fc1f2750SBernard Iremonger | | | | 195fc1f2750SBernard Iremonger | | | Table is a database of patterns, with each pattern having a priority | 196fc1f2750SBernard Iremonger | | | assigned. | 197fc1f2750SBernard Iremonger | | | | 198fc1f2750SBernard Iremonger | | | The table lookup operation selects the patterns that is matched by the | 199fc1f2750SBernard Iremonger | | | input packet; in case of multiple matches, the matching pattern with the | 200fc1f2750SBernard Iremonger | | | highest priority wins. | 201fc1f2750SBernard Iremonger | | | | 202fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 203fc1f2750SBernard Iremonger | 5 | Array | Lookup key is the table entry index itself. | 204fc1f2750SBernard Iremonger | | | | 205fc1f2750SBernard Iremonger +---+----------------------------+-----------------------------------------------------------------------------+ 206fc1f2750SBernard Iremonger 207fc1f2750SBernard IremongerTable Interface 208fc1f2750SBernard Iremonger~~~~~~~~~~~~~~~ 209fc1f2750SBernard Iremonger 210fc1f2750SBernard IremongerEach table is required to implement an abstract interface that defines the initialization 211fc1f2750SBernard Iremongerand run-time operation of the table. 2128c9a3374SJohn McNamaraThe table abstract interface is described in :numref:`table_qos_29_1`. 213fc1f2750SBernard Iremonger 2148c9a3374SJohn McNamara.. _table_qos_29_1: 215fc1f2750SBernard Iremonger 2168c9a3374SJohn McNamara.. table:: Table Abstract Interface 217fc1f2750SBernard Iremonger 218fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 219fc1f2750SBernard Iremonger | # | Table operation | Description | 220fc1f2750SBernard Iremonger | | | | 221fc1f2750SBernard Iremonger +===+=================+========================================================================================+ 222fc1f2750SBernard Iremonger | 1 | Create | Create the low-level data structures of the lookup table. Can internally allocate | 223fc1f2750SBernard Iremonger | | | memory. | 224fc1f2750SBernard Iremonger | | | | 225fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 226fc1f2750SBernard Iremonger | 2 | Free | Free up all the resources used by the lookup table. | 227fc1f2750SBernard Iremonger | | | | 228fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 229fc1f2750SBernard Iremonger | 3 | Add entry | Add new entry to the lookup table. | 230fc1f2750SBernard Iremonger | | | | 231fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 232fc1f2750SBernard Iremonger | 4 | Delete entry | Delete specific entry from the lookup table. | 233fc1f2750SBernard Iremonger | | | | 234fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 235fc1f2750SBernard Iremonger | 5 | Lookup | Look up a burst of input packets and return a bit mask specifying the result of the | 236fc1f2750SBernard Iremonger | | | lookup operation for each packet: a set bit signifies lookup hit for the corresponding | 237fc1f2750SBernard Iremonger | | | packet, while a cleared bit a lookup miss. | 238fc1f2750SBernard Iremonger | | | | 239fc1f2750SBernard Iremonger | | | For each lookup hit packet, the lookup operation also returns a pointer to the table | 240fc1f2750SBernard Iremonger | | | entry that was hit, which contains the actions to be applied on the packet and any | 241fc1f2750SBernard Iremonger | | | associated metadata. | 242fc1f2750SBernard Iremonger | | | | 243fc1f2750SBernard Iremonger | | | For each lookup miss packet, the actions to be applied on the packet and any | 244fc1f2750SBernard Iremonger | | | associated metadata are specified by the default table entry preconfigured for lookup | 245fc1f2750SBernard Iremonger | | | miss. | 246fc1f2750SBernard Iremonger | | | | 247fc1f2750SBernard Iremonger +---+-----------------+----------------------------------------------------------------------------------------+ 248fc1f2750SBernard Iremonger 249fc1f2750SBernard Iremonger 250fc1f2750SBernard IremongerHash Table Design 251fc1f2750SBernard Iremonger~~~~~~~~~~~~~~~~~ 252fc1f2750SBernard Iremonger 253fc1f2750SBernard IremongerHash Table Overview 254fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^^^^ 255fc1f2750SBernard Iremonger 256fc1f2750SBernard IremongerHash tables are important because the key lookup operation is optimized for speed: 257fc1f2750SBernard Iremongerinstead of having to linearly search the lookup key through all the keys in the table, 258fc1f2750SBernard Iremongerthe search is limited to only the keys stored in a single table bucket. 259fc1f2750SBernard Iremonger 260fc1f2750SBernard Iremonger**Associative Arrays** 261fc1f2750SBernard Iremonger 262fc1f2750SBernard IremongerAn associative array is a function that can be specified as a set of (key, value) pairs, 263fc1f2750SBernard Iremongerwith each key from the possible set of input keys present at most once. 264fc1f2750SBernard IremongerFor a given associative array, the possible operations are: 265fc1f2750SBernard Iremonger 266fc1f2750SBernard Iremonger#. *add (key, value)*: When no value is currently associated with *key*, then the (key, *value* ) association is created. 267fc1f2750SBernard Iremonger When *key* is already associated value *value0*, then the association (*key*, *value0*) is removed 268fc1f2750SBernard Iremonger and association *(key, value)* is created; 269fc1f2750SBernard Iremonger 270fc1f2750SBernard Iremonger#. *delete key*: When no value is currently associated with *key*, this operation has no effect. 271fc1f2750SBernard Iremonger When *key* is already associated *value*, then association *(key, value)* is removed; 272fc1f2750SBernard Iremonger 273fc1f2750SBernard Iremonger#. *lookup key*: When no value is currently associated with *key*, then this operation returns void value (lookup miss). 274fc1f2750SBernard Iremonger When *key* is associated with *value*, then this operation returns *value*. 275fc1f2750SBernard Iremonger The *(key, value)* association is not changed. 276fc1f2750SBernard Iremonger 277fc1f2750SBernard IremongerThe matching criterion used to compare the input key against the keys in the associative array is *exact match*, 278fc1f2750SBernard Iremongeras the key size (number of bytes) and the key value (array of bytes) have to match exactly for the two keys under comparison. 279fc1f2750SBernard Iremonger 280fc1f2750SBernard Iremonger**Hash Function** 281fc1f2750SBernard Iremonger 282fc1f2750SBernard IremongerA hash function deterministically maps data of variable length (key) to data of fixed size (hash value or key signature). 283fc1f2750SBernard IremongerTypically, the size of the key is bigger than the size of the key signature. 284fc1f2750SBernard IremongerThe hash function basically compresses a long key into a short signature. 285fc1f2750SBernard IremongerSeveral keys can share the same signature (collisions). 286fc1f2750SBernard Iremonger 287fc1f2750SBernard IremongerHigh quality hash functions have uniform distribution. 288fc1f2750SBernard IremongerFor large number of keys, when dividing the space of signature values into a fixed number of equal intervals (buckets), 289fc1f2750SBernard Iremongerit is desirable to have the key signatures evenly distributed across these intervals (uniform distribution), 290fc1f2750SBernard Iremongeras opposed to most of the signatures going into only a few of the intervals 291fc1f2750SBernard Iremongerand the rest of the intervals being largely unused (non-uniform distribution). 292fc1f2750SBernard Iremonger 293fc1f2750SBernard Iremonger**Hash Table** 294fc1f2750SBernard Iremonger 295fc1f2750SBernard IremongerA hash table is an associative array that uses a hash function for its operation. 296fc1f2750SBernard IremongerThe reason for using a hash function is to optimize the performance of the lookup operation 297fc1f2750SBernard Iremongerby minimizing the number of table keys that have to be compared against the input key. 298fc1f2750SBernard Iremonger 299fc1f2750SBernard IremongerInstead of storing the (key, value) pairs in a single list, the hash table maintains multiple lists (buckets). 300fc1f2750SBernard IremongerFor any given key, there is a single bucket where that key might exist, and this bucket is uniquely identified based on the key signature. 301fc1f2750SBernard IremongerOnce the key signature is computed and the hash table bucket identified, 302fc1f2750SBernard Iremongerthe key is either located in this bucket or it is not present in the hash table at all, 303fc1f2750SBernard Iremongerso the key search can be narrowed down from the full set of keys currently in the table 304fc1f2750SBernard Iremongerto just the set of keys currently in the identified table bucket. 305fc1f2750SBernard Iremonger 306fc1f2750SBernard IremongerThe performance of the hash table lookup operation is greatly improved, 307fea1d908SJohn McNamaraprovided that the table keys are evenly distributed among the hash table buckets, 308fc1f2750SBernard Iremongerwhich can be achieved by using a hash function with uniform distribution. 309fc1f2750SBernard IremongerThe rule to map a key to its bucket can simply be to use the key signature (modulo the number of table buckets) as the table bucket ID: 310fc1f2750SBernard Iremonger 311fc1f2750SBernard Iremonger *bucket_id = f_hash(key) % n_buckets;* 312fc1f2750SBernard Iremonger 313fc1f2750SBernard IremongerBy selecting the number of buckets to be a power of two, the modulo operator can be replaced by a bitwise AND logical operation: 314fc1f2750SBernard Iremonger 315fc1f2750SBernard Iremonger *bucket_id = f_hash(key) & (n_buckets - 1);* 316fc1f2750SBernard Iremonger 317fc1f2750SBernard Iremongerconsidering *n_bits* as the number of bits set in *bucket_mask = n_buckets - 1*, 318fc1f2750SBernard Iremongerthis means that all the keys that end up in the same hash table bucket have the lower *n_bits* of their signature identical. 319fc1f2750SBernard IremongerIn order to reduce the number of keys in the same bucket (collisions), the number of hash table buckets needs to be increased. 320fc1f2750SBernard Iremonger 3214a22e6eeSJohn McNamaraIn packet processing context, the sequence of operations involved in hash table operations is described in :numref:`figure_figure33`: 322fc1f2750SBernard Iremonger 3234a22e6eeSJohn McNamara.. _figure_figure33: 324fc1f2750SBernard Iremonger 3254a22e6eeSJohn McNamara.. figure:: img/figure33.* 326fc1f2750SBernard Iremonger 3274a22e6eeSJohn McNamara Sequence of Steps for Hash Table Operations in a Packet Processing Context 3284a22e6eeSJohn McNamara 329fc1f2750SBernard Iremonger 330fc1f2750SBernard Iremonger 331fc1f2750SBernard IremongerHash Table Use Cases 332fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^^^^^ 333fc1f2750SBernard Iremonger 334fc1f2750SBernard Iremonger**Flow Classification** 335fc1f2750SBernard Iremonger 336fc1f2750SBernard Iremonger*Description:* The flow classification is executed at least once for each input packet. 337fc1f2750SBernard IremongerThis operation maps each incoming packet against one of the known traffic flows in the flow database that typically contains millions of flows. 338fc1f2750SBernard Iremonger 339fc1f2750SBernard Iremonger*Hash table name:* Flow classification table 340fc1f2750SBernard Iremonger 341fc1f2750SBernard Iremonger*Number of keys:* Millions 342fc1f2750SBernard Iremonger 343fc1f2750SBernard Iremonger*Key format:* n-tuple of packet fields that uniquely identify a traffic flow/connection. 344fc1f2750SBernard IremongerExample: DiffServ 5-tuple of (Source IP address, Destination IP address, L4 protocol, L4 protocol source port, L4 protocol destination port). 345fc1f2750SBernard IremongerFor IPv4 protocol and L4 protocols like TCP, UDP or SCTP, the size of the DiffServ 5-tuple is 13 bytes, while for IPv6 it is 37 bytes. 346fc1f2750SBernard Iremonger 347fc1f2750SBernard Iremonger*Key value (key data):* actions and action meta-data describing what processing to be applied for the packets of the current flow. 348fc1f2750SBernard IremongerThe size of the data associated with each traffic flow can vary from 8 bytes to kilobytes. 349fc1f2750SBernard Iremonger 350fc1f2750SBernard Iremonger**Address Resolution Protocol (ARP)** 351fc1f2750SBernard Iremonger 352fc1f2750SBernard Iremonger*Description:* Once a route has been identified for an IP packet (so the output interface and the IP address of the next hop station are known), 353fc1f2750SBernard Iremongerthe MAC address of the next hop station is needed in order to send this packet onto the next leg of the journey 354fc1f2750SBernard Iremongertowards its destination (as identified by its destination IP address). 355fc1f2750SBernard IremongerThe MAC address of the next hop station becomes the destination MAC address of the outgoing Ethernet frame. 356fc1f2750SBernard Iremonger 357fc1f2750SBernard Iremonger*Hash table name:* ARP table 358fc1f2750SBernard Iremonger 359fc1f2750SBernard Iremonger*Number of keys:* Thousands 360fc1f2750SBernard Iremonger 361fc1f2750SBernard Iremonger*Key format:* The pair of (Output interface, Next Hop IP address), which is typically 5 bytes for IPv4 and 17 bytes for IPv6. 362fc1f2750SBernard Iremonger 363fc1f2750SBernard Iremonger*Key value (key data):* MAC address of the next hop station (6 bytes). 364fc1f2750SBernard Iremonger 365fc1f2750SBernard IremongerHash Table Types 366fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^ 367fc1f2750SBernard Iremonger 3688c9a3374SJohn McNamara:numref:`table_qos_22` lists the hash table configuration parameters shared by all different hash table types. 369fc1f2750SBernard Iremonger 3708c9a3374SJohn McNamara.. _table_qos_22: 371fc1f2750SBernard Iremonger 3728c9a3374SJohn McNamara.. table:: Configuration Parameters Common for All Hash Table Types 373fc1f2750SBernard Iremonger 374fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 375fc1f2750SBernard Iremonger | # | Parameter | Details | 376fc1f2750SBernard Iremonger | | | | 377fc1f2750SBernard Iremonger +===+===========================+==============================================================================+ 378fc1f2750SBernard Iremonger | 1 | Key size | Measured as number of bytes. All keys have the same size. | 379fc1f2750SBernard Iremonger | | | | 380fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 381fc1f2750SBernard Iremonger | 2 | Key value (key data) size | Measured as number of bytes. | 382fc1f2750SBernard Iremonger | | | | 383fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 384fc1f2750SBernard Iremonger | 3 | Number of buckets | Needs to be a power of two. | 385fc1f2750SBernard Iremonger | | | | 386fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 387fc1f2750SBernard Iremonger | 4 | Maximum number of keys | Needs to be a power of two. | 388fc1f2750SBernard Iremonger | | | | 389fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 390fc1f2750SBernard Iremonger | 5 | Hash function | Examples: jhash, CRC hash, etc. | 391fc1f2750SBernard Iremonger | | | | 392fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 393fc1f2750SBernard Iremonger | 6 | Hash function seed | Parameter to be passed to the hash function. | 394fc1f2750SBernard Iremonger | | | | 395fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 396fc1f2750SBernard Iremonger | 7 | Key offset | Offset of the lookup key byte array within the packet meta-data stored in | 397fc1f2750SBernard Iremonger | | | the packet buffer. | 398fc1f2750SBernard Iremonger | | | | 399fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 400fc1f2750SBernard Iremonger 401fc1f2750SBernard IremongerBucket Full Problem 402fc1f2750SBernard Iremonger""""""""""""""""""" 403fc1f2750SBernard Iremonger 404fc1f2750SBernard IremongerOn initialization, each hash table bucket is allocated space for exactly 4 keys. 405fc1f2750SBernard IremongerAs keys are added to the table, it can happen that a given bucket already has 4 keys when a new key has to be added to this bucket. 406fc1f2750SBernard IremongerThe possible options are: 407fc1f2750SBernard Iremonger 408fc1f2750SBernard Iremonger#. **Least Recently Used (LRU) Hash Table.** 409fc1f2750SBernard Iremonger One of the existing keys in the bucket is deleted and the new key is added in its place. 410fc1f2750SBernard Iremonger The number of keys in each bucket never grows bigger than 4. The logic to pick the key to be dropped from the bucket is LRU. 411fc1f2750SBernard Iremonger The hash table lookup operation maintains the order in which the keys in the same bucket are hit, so every time a key is hit, 412fc1f2750SBernard Iremonger it becomes the new Most Recently Used (MRU) key, i.e. the last candidate for drop. 413fc1f2750SBernard Iremonger When a key is added to the bucket, it also becomes the new MRU key. 414fc1f2750SBernard Iremonger When a key needs to be picked and dropped, the first candidate for drop, i.e. the current LRU key, is always picked. 415fc1f2750SBernard Iremonger The LRU logic requires maintaining specific data structures per each bucket. 416fc1f2750SBernard Iremonger 417fea1d908SJohn McNamara#. **Extendable Bucket Hash Table.** 418fc1f2750SBernard Iremonger The bucket is extended with space for 4 more keys. 419fc1f2750SBernard Iremonger This is done by allocating additional memory at table initialization time, 420fc1f2750SBernard Iremonger which is used to create a pool of free keys (the size of this pool is configurable and always a multiple of 4). 421fc1f2750SBernard Iremonger On key add operation, the allocation of a group of 4 keys only happens successfully within the limit of free keys, 422fc1f2750SBernard Iremonger otherwise the key add operation fails. 423fc1f2750SBernard Iremonger On key delete operation, a group of 4 keys is freed back to the pool of free keys 424fc1f2750SBernard Iremonger when the key to be deleted is the only key that was used within its group of 4 keys at that time. 425fc1f2750SBernard Iremonger On key lookup operation, if the current bucket is in extended state and a match is not found in the first group of 4 keys, 426fc1f2750SBernard Iremonger the search continues beyond the first group of 4 keys, potentially until all keys in this bucket are examined. 427fea1d908SJohn McNamara The extendable bucket logic requires maintaining specific data structures per table and per each bucket. 428fc1f2750SBernard Iremonger 4298c9a3374SJohn McNamara.. _table_qos_23: 430fc1f2750SBernard Iremonger 4312fe68f32SJohn McNamara.. table:: Configuration Parameters Specific to Extendable Bucket Hash Table 432fc1f2750SBernard Iremonger 433fc1f2750SBernard Iremonger +---+---------------------------+--------------------------------------------------+ 434fc1f2750SBernard Iremonger | # | Parameter | Details | 435fc1f2750SBernard Iremonger | | | | 436fc1f2750SBernard Iremonger +===+===========================+==================================================+ 437fc1f2750SBernard Iremonger | 1 | Number of additional keys | Needs to be a power of two, at least equal to 4. | 438fc1f2750SBernard Iremonger | | | | 439fc1f2750SBernard Iremonger +---+---------------------------+--------------------------------------------------+ 440fc1f2750SBernard Iremonger 441fc1f2750SBernard Iremonger 442fc1f2750SBernard IremongerSignature Computation 443fc1f2750SBernard Iremonger""""""""""""""""""""" 444fc1f2750SBernard Iremonger 445fc1f2750SBernard IremongerThe possible options for key signature computation are: 446fc1f2750SBernard Iremonger 447fc1f2750SBernard Iremonger#. **Pre-computed key signature.** 448fc1f2750SBernard Iremonger The key lookup operation is split between two CPU cores. 449fc1f2750SBernard Iremonger The first CPU core (typically the CPU core that performs packet RX) extracts the key from the input packet, 450fc1f2750SBernard Iremonger computes the key signature and saves both the key and the key signature in the packet buffer as packet meta-data. 451fc1f2750SBernard Iremonger The second CPU core reads both the key and the key signature from the packet meta-data 452fc1f2750SBernard Iremonger and performs the bucket search step of the key lookup operation. 453fc1f2750SBernard Iremonger 454fc1f2750SBernard Iremonger#. **Key signature computed on lookup ("do-sig" version).** 455fc1f2750SBernard Iremonger The same CPU core reads the key from the packet meta-data, uses it to compute the key signature 456fc1f2750SBernard Iremonger and also performs the bucket search step of the key lookup operation. 457fc1f2750SBernard Iremonger 4588c9a3374SJohn McNamara.. _table_qos_24: 459fc1f2750SBernard Iremonger 4608c9a3374SJohn McNamara.. table:: Configuration Parameters Specific to Pre-computed Key Signature Hash Table 461fc1f2750SBernard Iremonger 462fc1f2750SBernard Iremonger +---+------------------+-----------------------------------------------------------------------+ 463fc1f2750SBernard Iremonger | # | Parameter | Details | 464fc1f2750SBernard Iremonger | | | | 465fc1f2750SBernard Iremonger +===+==================+=======================================================================+ 466fc1f2750SBernard Iremonger | 1 | Signature offset | Offset of the pre-computed key signature within the packet meta-data. | 467fc1f2750SBernard Iremonger | | | | 468fc1f2750SBernard Iremonger +---+------------------+-----------------------------------------------------------------------+ 469fc1f2750SBernard Iremonger 470fc1f2750SBernard IremongerKey Size Optimized Hash Tables 471fc1f2750SBernard Iremonger"""""""""""""""""""""""""""""" 472fc1f2750SBernard Iremonger 473fc1f2750SBernard IremongerFor specific key sizes, the data structures and algorithm of key lookup operation can be specially handcrafted for further performance improvements, 474fc1f2750SBernard Iremongerso following options are possible: 475fc1f2750SBernard Iremonger 476fc1f2750SBernard Iremonger#. **Implementation supporting configurable key size.** 477fc1f2750SBernard Iremonger 478fc1f2750SBernard Iremonger#. **Implementation supporting a single key size.** 479fc1f2750SBernard Iremonger Typical key sizes are 8 bytes and 16 bytes. 480fc1f2750SBernard Iremonger 481fc1f2750SBernard IremongerBucket Search Logic for Configurable Key Size Hash Tables 482fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 483fc1f2750SBernard Iremonger 484fc1f2750SBernard IremongerThe performance of the bucket search logic is one of the main factors influencing the performance of the key lookup operation. 485fc1f2750SBernard IremongerThe data structures and algorithm are designed to make the best use of Intel CPU architecture resources like: 486fc1f2750SBernard Iremongercache memory space, cache memory bandwidth, external memory bandwidth, multiple execution units working in parallel, 487fc1f2750SBernard Iremongerout of order instruction execution, special CPU instructions, etc. 488fc1f2750SBernard Iremonger 489fc1f2750SBernard IremongerThe bucket search logic handles multiple input packets in parallel. 490fc1f2750SBernard IremongerIt is built as a pipeline of several stages (3 or 4), with each pipeline stage handling two different packets from the burst of input packets. 491fc1f2750SBernard IremongerOn each pipeline iteration, the packets are pushed to the next pipeline stage: for the 4-stage pipeline, 492fc1f2750SBernard Iremongertwo packets (that just completed stage 3) exit the pipeline, 493fc1f2750SBernard Iremongertwo packets (that just completed stage 2) are now executing stage 3, two packets (that just completed stage 1) are now executing stage 2, 494fc1f2750SBernard Iremongertwo packets (that just completed stage 0) are now executing stage 1 and two packets (next two packets to read from the burst of input packets) 495fc1f2750SBernard Iremongerare entering the pipeline to execute stage 0. 496fc1f2750SBernard IremongerThe pipeline iterations continue until all packets from the burst of input packets execute the last stage of the pipeline. 497fc1f2750SBernard Iremonger 498fc1f2750SBernard IremongerThe bucket search logic is broken into pipeline stages at the boundary of the next memory access. 499fc1f2750SBernard IremongerEach pipeline stage uses data structures that are stored (with high probability) into the L1 or L2 cache memory of the current CPU core and 500fc1f2750SBernard Iremongerbreaks just before the next memory access required by the algorithm. 501fc1f2750SBernard IremongerThe current pipeline stage finalizes by prefetching the data structures required by the next pipeline stage, 502fc1f2750SBernard Iremongerso given enough time for the prefetch to complete, 503fc1f2750SBernard Iremongerwhen the next pipeline stage eventually gets executed for the same packets, 504fc1f2750SBernard Iremongerit will read the data structures it needs from L1 or L2 cache memory and thus avoid the significant penalty incurred by L2 or L3 cache memory miss. 505fc1f2750SBernard Iremonger 506fc1f2750SBernard IremongerBy prefetching the data structures required by the next pipeline stage in advance (before they are used) 507fc1f2750SBernard Iremongerand switching to executing another pipeline stage for different packets, 508fc1f2750SBernard Iremongerthe number of L2 or L3 cache memory misses is greatly reduced, hence one of the main reasons for improved performance. 509fc1f2750SBernard IremongerThis is because the cost of L2/L3 cache memory miss on memory read accesses is high, as usually due to data dependency between instructions, 510fc1f2750SBernard Iremongerthe CPU execution units have to stall until the read operation is completed from L3 cache memory or external DRAM memory. 511fc1f2750SBernard IremongerBy using prefetch instructions, the latency of memory read accesses is hidden, 512e4e1f2f7SFlore Norceideprovided that it is performed early enough before the respective data structure is actually used. 513fc1f2750SBernard Iremonger 514fc1f2750SBernard IremongerBy splitting the processing into several stages that are executed on different packets (the packets from the input burst are interlaced), 515fc1f2750SBernard Iremongerenough work is created to allow the prefetch instructions to complete successfully (before the prefetched data structures are actually accessed) and 516fc1f2750SBernard Iremongeralso the data dependency between instructions is loosened. 517fc1f2750SBernard IremongerFor example, for the 4-stage pipeline, stage 0 is executed on packets 0 and 1 and then, 518fc1f2750SBernard Iremongerbefore same packets 0 and 1 are used (i.e. before stage 1 is executed on packets 0 and 1), 519fc1f2750SBernard Iremongerdifferent packets are used: packets 2 and 3 (executing stage 1), packets 4 and 5 (executing stage 2) and packets 6 and 7 (executing stage 3). 520fc1f2750SBernard IremongerBy executing useful work while the data structures are brought into the L1 or L2 cache memory, the latency of the read memory accesses is hidden. 521fc1f2750SBernard IremongerBy increasing the gap between two consecutive accesses to the same data structure, the data dependency between instructions is loosened; 522fc1f2750SBernard Iremongerthis allows making the best use of the super-scalar and out-of-order execution CPU architecture, 523fc1f2750SBernard Iremongeras the number of CPU core execution units that are active (rather than idle or stalled due to data dependency constraints between instructions) is maximized. 524fc1f2750SBernard Iremonger 525fc1f2750SBernard IremongerThe bucket search logic is also implemented without using any branch instructions. 526fc1f2750SBernard IremongerThis avoids the important cost associated with flushing the CPU core execution pipeline on every instance of branch misprediction. 527fc1f2750SBernard Iremonger 528fc1f2750SBernard IremongerConfigurable Key Size Hash Table 529fc1f2750SBernard Iremonger"""""""""""""""""""""""""""""""" 530fc1f2750SBernard Iremonger 5318c9a3374SJohn McNamara:numref:`figure_figure34`, :numref:`table_qos_25` and :numref:`table_qos_26` detail the main data structures used to implement configurable key size hash tables (either LRU or extendable bucket, 532fc1f2750SBernard Iremongereither with pre-computed signature or "do-sig"). 533fc1f2750SBernard Iremonger 5344a22e6eeSJohn McNamara.. _figure_figure34: 535fc1f2750SBernard Iremonger 5364a22e6eeSJohn McNamara.. figure:: img/figure34.* 537fc1f2750SBernard Iremonger 5384a22e6eeSJohn McNamara Data Structures for Configurable Key Size Hash Tables 539fc1f2750SBernard Iremonger 540fc1f2750SBernard Iremonger 5418c9a3374SJohn McNamara.. _table_qos_25: 542fc1f2750SBernard Iremonger 5438c9a3374SJohn McNamara.. table:: Main Large Data Structures (Arrays) used for Configurable Key Size Hash Tables 544fc1f2750SBernard Iremonger 545fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+---------------------------+-------------------------------+ 546fc1f2750SBernard Iremonger | # | Array name | Number of entries | Entry size (bytes) | Description | 547fc1f2750SBernard Iremonger | | | | | | 548fc1f2750SBernard Iremonger +===+=========================+==============================+===========================+===============================+ 549fc1f2750SBernard Iremonger | 1 | Bucket array | n_buckets (configurable) | 32 | Buckets of the hash table. | 550fc1f2750SBernard Iremonger | | | | | | 551fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+---------------------------+-------------------------------+ 552fc1f2750SBernard Iremonger | 2 | Bucket extensions array | n_buckets_ext (configurable) | 32 | This array is only created | 5532fe68f32SJohn McNamara | | | | | for extendable bucket tables. | 554fc1f2750SBernard Iremonger | | | | | | 555fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+---------------------------+-------------------------------+ 556fc1f2750SBernard Iremonger | 3 | Key array | n_keys | key_size (configurable) | Keys added to the hash table. | 557fc1f2750SBernard Iremonger | | | | | | 558fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+---------------------------+-------------------------------+ 559fc1f2750SBernard Iremonger | 4 | Data array | n_keys | entry_size (configurable) | Key values (key data) | 560fc1f2750SBernard Iremonger | | | | | associated with the hash | 561fc1f2750SBernard Iremonger | | | | | table keys. | 562fc1f2750SBernard Iremonger | | | | | | 563fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+---------------------------+-------------------------------+ 564fc1f2750SBernard Iremonger 5658c9a3374SJohn McNamara.. _table_qos_26: 566fc1f2750SBernard Iremonger 5678c9a3374SJohn McNamara.. table:: Field Description for Bucket Array Entry (Configurable Key Size Hash Tables) 568fc1f2750SBernard Iremonger 569fc1f2750SBernard Iremonger +---+------------------+--------------------+------------------------------------------------------------------+ 570fc1f2750SBernard Iremonger | # | Field name | Field size (bytes) | Description | 571fc1f2750SBernard Iremonger | | | | | 572fc1f2750SBernard Iremonger +===+==================+====================+==================================================================+ 573fc1f2750SBernard Iremonger | 1 | Next Ptr/LRU | 8 | For LRU tables, this fields represents the LRU list for the | 574fc1f2750SBernard Iremonger | | | | current bucket stored as array of 4 entries of 2 bytes each. | 575fc1f2750SBernard Iremonger | | | | Entry 0 stores the index (0 .. 3) of the MRU key, while entry 3 | 576fc1f2750SBernard Iremonger | | | | stores the index of the LRU key. | 577fc1f2750SBernard Iremonger | | | | | 5782fe68f32SJohn McNamara | | | | For extendable bucket tables, this field represents the next | 579fc1f2750SBernard Iremonger | | | | pointer (i.e. the pointer to the next group of 4 keys linked to | 580fc1f2750SBernard Iremonger | | | | the current bucket). The next pointer is not NULL if the bucket | 581fc1f2750SBernard Iremonger | | | | is currently extended or NULL otherwise. | 582fc1f2750SBernard Iremonger | | | | To help the branchless implementation, bit 0 (least significant | 583fc1f2750SBernard Iremonger | | | | bit) of this field is set to 1 if the next pointer is not NULL | 584fc1f2750SBernard Iremonger | | | | and to 0 otherwise. | 585fc1f2750SBernard Iremonger | | | | | 586fc1f2750SBernard Iremonger +---+------------------+--------------------+------------------------------------------------------------------+ 587fc1f2750SBernard Iremonger | 2 | Sig[0 .. 3] | 4 x 2 | If key X (X = 0 .. 3) is valid, then sig X bits 15 .. 1 store | 588fc1f2750SBernard Iremonger | | | | the most significant 15 bits of key X signature and sig X bit 0 | 589fc1f2750SBernard Iremonger | | | | is set to 1. | 590fc1f2750SBernard Iremonger | | | | | 591fc1f2750SBernard Iremonger | | | | If key X is not valid, then sig X is set to zero. | 592fc1f2750SBernard Iremonger | | | | | 593fc1f2750SBernard Iremonger +---+------------------+--------------------+------------------------------------------------------------------+ 594fc1f2750SBernard Iremonger | 3 | Key Pos [0 .. 3] | 4 x 4 | If key X is valid (X = 0 .. 3), then Key Pos X represents the | 595fc1f2750SBernard Iremonger | | | | index into the key array where key X is stored, as well as the | 596fc1f2750SBernard Iremonger | | | | index into the data array where the value associated with key X | 597fc1f2750SBernard Iremonger | | | | is stored. | 598fc1f2750SBernard Iremonger | | | | | 599fc1f2750SBernard Iremonger | | | | If key X is not valid, then the value of Key Pos X is undefined. | 600fc1f2750SBernard Iremonger | | | | | 601fc1f2750SBernard Iremonger +---+------------------+--------------------+------------------------------------------------------------------+ 602fc1f2750SBernard Iremonger 603fc1f2750SBernard Iremonger 6048c9a3374SJohn McNamara:numref:`figure_figure35` and :numref:`table_qos_27` detail the bucket search pipeline stages (either LRU or extendable bucket, 605fc1f2750SBernard Iremongereither with pre-computed signature or "do-sig"). 606fc1f2750SBernard IremongerFor each pipeline stage, the described operations are applied to each of the two packets handled by that stage. 607fc1f2750SBernard Iremonger 6084a22e6eeSJohn McNamara.. _figure_figure35: 609fc1f2750SBernard Iremonger 6104a22e6eeSJohn McNamara.. figure:: img/figure35.* 611fc1f2750SBernard Iremonger 6124a22e6eeSJohn McNamara Bucket Search Pipeline for Key Lookup Operation (Configurable Key Size Hash 6134a22e6eeSJohn McNamara Tables) 6144a22e6eeSJohn McNamara 615fc1f2750SBernard Iremonger 6168c9a3374SJohn McNamara.. _table_qos_27: 617fc1f2750SBernard Iremonger 6188c9a3374SJohn McNamara.. table:: Description of the Bucket Search Pipeline Stages (Configurable Key Size Hash Tables) 619fc1f2750SBernard Iremonger 620fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 621fc1f2750SBernard Iremonger | # | Stage name | Description | 622fc1f2750SBernard Iremonger | | | | 623fc1f2750SBernard Iremonger +===+===========================+==============================================================================+ 624fc1f2750SBernard Iremonger | 0 | Prefetch packet meta-data | Select next two packets from the burst of input packets. | 625fc1f2750SBernard Iremonger | | | | 626fc1f2750SBernard Iremonger | | | Prefetch packet meta-data containing the key and key signature. | 627fc1f2750SBernard Iremonger | | | | 628fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 629fc1f2750SBernard Iremonger | 1 | Prefetch table bucket | Read the key signature from the packet meta-data (for extendable bucket hash | 630fc1f2750SBernard Iremonger | | | tables) or read the key from the packet meta-data and compute key signature | 631fc1f2750SBernard Iremonger | | | (for LRU tables). | 632fc1f2750SBernard Iremonger | | | | 633fc1f2750SBernard Iremonger | | | Identify the bucket ID using the key signature. | 634fc1f2750SBernard Iremonger | | | | 635fc1f2750SBernard Iremonger | | | Set bit 0 of the signature to 1 (to match only signatures of valid keys from | 636fc1f2750SBernard Iremonger | | | the table). | 637fc1f2750SBernard Iremonger | | | | 638fc1f2750SBernard Iremonger | | | Prefetch the bucket. | 639fc1f2750SBernard Iremonger | | | | 640fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 641fc1f2750SBernard Iremonger | 2 | Prefetch table key | Read the key signatures from the bucket. | 642fc1f2750SBernard Iremonger | | | | 643fc1f2750SBernard Iremonger | | | Compare the signature of the input key against the 4 key signatures from the | 644fc1f2750SBernard Iremonger | | | packet. As result, the following is obtained: | 645fc1f2750SBernard Iremonger | | | | 646fc1f2750SBernard Iremonger | | | *match* | 647fc1f2750SBernard Iremonger | | | = equal to TRUE if there was at least one signature match and to FALSE in | 648fc1f2750SBernard Iremonger | | | the case of no signature match; | 649fc1f2750SBernard Iremonger | | | | 650fc1f2750SBernard Iremonger | | | *match_many* | 651fc1f2750SBernard Iremonger | | | = equal to TRUE is there were more than one signature matches (can be up to | 652fc1f2750SBernard Iremonger | | | 4 signature matches in the worst case scenario) and to FALSE otherwise; | 653fc1f2750SBernard Iremonger | | | | 654fc1f2750SBernard Iremonger | | | *match_pos* | 655fc1f2750SBernard Iremonger | | | = the index of the first key that produced signature match (only valid if | 656fc1f2750SBernard Iremonger | | | match is true). | 657fc1f2750SBernard Iremonger | | | | 658fc1f2750SBernard Iremonger | | | For extendable bucket hash tables only, set | 659fc1f2750SBernard Iremonger | | | *match_many* | 660fc1f2750SBernard Iremonger | | | to TRUE if next pointer is valid. | 661fc1f2750SBernard Iremonger | | | | 662fc1f2750SBernard Iremonger | | | Prefetch the bucket key indicated by | 663fc1f2750SBernard Iremonger | | | *match_pos* | 664fc1f2750SBernard Iremonger | | | (even if | 665fc1f2750SBernard Iremonger | | | *match_pos* | 666fc1f2750SBernard Iremonger | | | does not point to valid key valid). | 667fc1f2750SBernard Iremonger | | | | 668fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 669fc1f2750SBernard Iremonger | 3 | Prefetch table data | Read the bucket key indicated by | 670fc1f2750SBernard Iremonger | | | *match_pos*. | 671fc1f2750SBernard Iremonger | | | | 672fc1f2750SBernard Iremonger | | | Compare the bucket key against the input key. As result, the following is | 673fc1f2750SBernard Iremonger | | | obtained: | 674fc1f2750SBernard Iremonger | | | *match_key* | 675fc1f2750SBernard Iremonger | | | = equal to TRUE if the two keys match and to FALSE otherwise. | 676fc1f2750SBernard Iremonger | | | | 677fc1f2750SBernard Iremonger | | | Report input key as lookup hit only when both | 678fc1f2750SBernard Iremonger | | | *match* | 679fc1f2750SBernard Iremonger | | | and | 680fc1f2750SBernard Iremonger | | | *match_key* | 681fc1f2750SBernard Iremonger | | | are equal to TRUE and as lookup miss otherwise. | 682fc1f2750SBernard Iremonger | | | | 683fc1f2750SBernard Iremonger | | | For LRU tables only, use branchless logic to update the bucket LRU list | 684fc1f2750SBernard Iremonger | | | (the current key becomes the new MRU) only on lookup hit. | 685fc1f2750SBernard Iremonger | | | | 686fc1f2750SBernard Iremonger | | | Prefetch the key value (key data) associated with the current key (to avoid | 687fc1f2750SBernard Iremonger | | | branches, this is done on both lookup hit and miss). | 688fc1f2750SBernard Iremonger | | | | 689fc1f2750SBernard Iremonger +---+---------------------------+------------------------------------------------------------------------------+ 690fc1f2750SBernard Iremonger 691fc1f2750SBernard Iremonger 692fc1f2750SBernard IremongerAdditional notes: 693fc1f2750SBernard Iremonger 694fc1f2750SBernard Iremonger#. The pipelined version of the bucket search algorithm is executed only if there are at least 7 packets in the burst of input packets. 695fc1f2750SBernard Iremonger If there are less than 7 packets in the burst of input packets, 696fc1f2750SBernard Iremonger a non-optimized implementation of the bucket search algorithm is executed. 697fc1f2750SBernard Iremonger 698fc1f2750SBernard Iremonger#. Once the pipelined version of the bucket search algorithm has been executed for all the packets in the burst of input packets, 699fc1f2750SBernard Iremonger the non-optimized implementation of the bucket search algorithm is also executed for any packets that did not produce a lookup hit, 700fc1f2750SBernard Iremonger but have the *match_many* flag set. 701fc1f2750SBernard Iremonger As result of executing the non-optimized version, some of these packets may produce a lookup hit or lookup miss. 702fc1f2750SBernard Iremonger This does not impact the performance of the key lookup operation, 703fc1f2750SBernard Iremonger as the probability of matching more than one signature in the same group of 4 keys or of having the bucket in extended state 704fc1f2750SBernard Iremonger (for extendable bucket hash tables only) is relatively small. 705fc1f2750SBernard Iremonger 706fc1f2750SBernard Iremonger**Key Signature Comparison Logic** 707fc1f2750SBernard Iremonger 7088c9a3374SJohn McNamaraThe key signature comparison logic is described in :numref:`table_qos_28`. 709fc1f2750SBernard Iremonger 7108c9a3374SJohn McNamara.. _table_qos_28: 711fc1f2750SBernard Iremonger 7128c9a3374SJohn McNamara.. table:: Lookup Tables for Match, Match_Many and Match_Pos 713fc1f2750SBernard Iremonger 714fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 715fc1f2750SBernard Iremonger | # | mask | match (1 bit) | match_many (1 bit) | match_pos (2 bits) | 716fc1f2750SBernard Iremonger | | | | | | 717fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 718fc1f2750SBernard Iremonger | 0 | 0000 | 0 | 0 | 00 | 719fc1f2750SBernard Iremonger | | | | | | 720fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 721fc1f2750SBernard Iremonger | 1 | 0001 | 1 | 0 | 00 | 722fc1f2750SBernard Iremonger | | | | | | 723fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 724fc1f2750SBernard Iremonger | 2 | 0010 | 1 | 0 | 01 | 725fc1f2750SBernard Iremonger | | | | | | 726fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 727fc1f2750SBernard Iremonger | 3 | 0011 | 1 | 1 | 00 | 728fc1f2750SBernard Iremonger | | | | | | 729fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 730fc1f2750SBernard Iremonger | 4 | 0100 | 1 | 0 | 10 | 731fc1f2750SBernard Iremonger | | | | | | 732fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 733fc1f2750SBernard Iremonger | 5 | 0101 | 1 | 1 | 00 | 734fc1f2750SBernard Iremonger | | | | | | 735fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 736fc1f2750SBernard Iremonger | 6 | 0110 | 1 | 1 | 01 | 737fc1f2750SBernard Iremonger | | | | | | 738fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 739fc1f2750SBernard Iremonger | 7 | 0111 | 1 | 1 | 00 | 740fc1f2750SBernard Iremonger | | | | | | 741fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 742fc1f2750SBernard Iremonger | 8 | 1000 | 1 | 0 | 11 | 743fc1f2750SBernard Iremonger | | | | | | 744fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 745fc1f2750SBernard Iremonger | 9 | 1001 | 1 | 1 | 00 | 746fc1f2750SBernard Iremonger | | | | | | 747fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 748fc1f2750SBernard Iremonger | 10 | 1010 | 1 | 1 | 01 | 749fc1f2750SBernard Iremonger | | | | | | 750fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 751fc1f2750SBernard Iremonger | 11 | 1011 | 1 | 1 | 00 | 752fc1f2750SBernard Iremonger | | | | | | 753fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 754fc1f2750SBernard Iremonger | 12 | 1100 | 1 | 1 | 10 | 755fc1f2750SBernard Iremonger | | | | | | 756fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 757fc1f2750SBernard Iremonger | 13 | 1101 | 1 | 1 | 00 | 758fc1f2750SBernard Iremonger | | | | | | 759fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 760fc1f2750SBernard Iremonger | 14 | 1110 | 1 | 1 | 01 | 761fc1f2750SBernard Iremonger | | | | | | 762fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 763fc1f2750SBernard Iremonger | 15 | 1111 | 1 | 1 | 00 | 764fc1f2750SBernard Iremonger | | | | | | 765fc1f2750SBernard Iremonger +----+------+---------------+--------------------+--------------------+ 766fc1f2750SBernard Iremonger 767fc1f2750SBernard IremongerThe input *mask* hash bit X (X = 0 .. 3) set to 1 if input signature is equal to bucket signature X and set to 0 otherwise. 768fc1f2750SBernard IremongerThe outputs *match*, *match_many* and *match_pos* are 1 bit, 1 bit and 2 bits in size respectively and their meaning has been explained above. 769fc1f2750SBernard Iremonger 7708c9a3374SJohn McNamaraAs displayed in :numref:`table_qos_29`, the lookup tables for *match* and *match_many* can be collapsed into a single 32-bit value and the lookup table for 771fc1f2750SBernard Iremonger*match_pos* can be collapsed into a 64-bit value. 772fc1f2750SBernard IremongerGiven the input *mask*, the values for *match*, *match_many* and *match_pos* can be obtained by indexing their respective bit array to extract 1 bit, 773fc1f2750SBernard Iremonger1 bit and 2 bits respectively with branchless logic. 774fc1f2750SBernard Iremonger 7758c9a3374SJohn McNamara.. _table_qos_29: 776fc1f2750SBernard Iremonger 7778c9a3374SJohn McNamara.. table:: Collapsed Lookup Tables for Match, Match_Many and Match_Pos 778fc1f2750SBernard Iremonger 779fc1f2750SBernard Iremonger +------------+------------------------------------------+-------------------+ 780fc1f2750SBernard Iremonger | | Bit array | Hexadecimal value | 781fc1f2750SBernard Iremonger | | | | 782fc1f2750SBernard Iremonger +------------+------------------------------------------+-------------------+ 783fc1f2750SBernard Iremonger | match | 1111_1111_1111_1110 | 0xFFFELLU | 784fc1f2750SBernard Iremonger | | | | 785fc1f2750SBernard Iremonger +------------+------------------------------------------+-------------------+ 786fc1f2750SBernard Iremonger | match_many | 1111_1110_1110_1000 | 0xFEE8LLU | 787fc1f2750SBernard Iremonger | | | | 788fc1f2750SBernard Iremonger +------------+------------------------------------------+-------------------+ 789fc1f2750SBernard Iremonger | match_pos | 0001_0010_0001_0011__0001_0010_0001_0000 | 0x12131210LLU | 790fc1f2750SBernard Iremonger | | | | 791fc1f2750SBernard Iremonger +------------+------------------------------------------+-------------------+ 792fc1f2750SBernard Iremonger 793fc1f2750SBernard Iremonger 7944a22e6eeSJohn McNamaraThe pseudo-code for match, match_many and match_pos is:: 795fc1f2750SBernard Iremonger 796fc1f2750SBernard Iremonger match = (0xFFFELLU >> mask) & 1; 797fc1f2750SBernard Iremonger 798fc1f2750SBernard Iremonger match_many = (0xFEE8LLU >> mask) & 1; 799fc1f2750SBernard Iremonger 800fc1f2750SBernard Iremonger match_pos = (0x12131210LLU >> (mask << 1)) & 3; 801fc1f2750SBernard Iremonger 802fc1f2750SBernard IremongerSingle Key Size Hash Tables 803fc1f2750SBernard Iremonger""""""""""""""""""""""""""" 804fc1f2750SBernard Iremonger 8058c9a3374SJohn McNamara:numref:`figure_figure37`, :numref:`figure_figure38`, :numref:`table_qos_30` and :numref:`table_qos_31` detail the main data structures used to implement 8-byte and 16-byte key hash tables 806fc1f2750SBernard Iremonger(either LRU or extendable bucket, either with pre-computed signature or "do-sig"). 807fc1f2750SBernard Iremonger 8084a22e6eeSJohn McNamara.. _figure_figure37: 809fc1f2750SBernard Iremonger 8104a22e6eeSJohn McNamara.. figure:: img/figure37.* 811fc1f2750SBernard Iremonger 8124a22e6eeSJohn McNamara Data Structures for 8-byte Key Hash Tables 813fc1f2750SBernard Iremonger 814fc1f2750SBernard Iremonger 8154a22e6eeSJohn McNamara.. _figure_figure38: 816fc1f2750SBernard Iremonger 8174a22e6eeSJohn McNamara.. figure:: img/figure38.* 818fc1f2750SBernard Iremonger 8194a22e6eeSJohn McNamara Data Structures for 16-byte Key Hash Tables 820fc1f2750SBernard Iremonger 821fc1f2750SBernard Iremonger 8228c9a3374SJohn McNamara.. _table_qos_30: 823fc1f2750SBernard Iremonger 8248c9a3374SJohn McNamara.. table:: Main Large Data Structures (Arrays) used for 8-byte and 16-byte Key Size Hash Tables 825fc1f2750SBernard Iremonger 826fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+----------------------+------------------------------------+ 827fc1f2750SBernard Iremonger | # | Array name | Number of entries | Entry size (bytes) | Description | 828fc1f2750SBernard Iremonger | | | | | | 829fc1f2750SBernard Iremonger +===+=========================+==============================+======================+====================================+ 830fc1f2750SBernard Iremonger | 1 | Bucket array | n_buckets (configurable) | *8-byte key size:* | Buckets of the hash table. | 831fc1f2750SBernard Iremonger | | | | | | 832fc1f2750SBernard Iremonger | | | | 64 + 4 x entry_size | | 833fc1f2750SBernard Iremonger | | | | | | 834fc1f2750SBernard Iremonger | | | | | | 835fc1f2750SBernard Iremonger | | | | *16-byte key size:* | | 836fc1f2750SBernard Iremonger | | | | | | 837fc1f2750SBernard Iremonger | | | | 128 + 4 x entry_size | | 838fc1f2750SBernard Iremonger | | | | | | 839fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+----------------------+------------------------------------+ 840fc1f2750SBernard Iremonger | 2 | Bucket extensions array | n_buckets_ext (configurable) | *8-byte key size:* | This array is only created for | 8412fe68f32SJohn McNamara | | | | | extendable bucket tables. | 842fc1f2750SBernard Iremonger | | | | | | 843fc1f2750SBernard Iremonger | | | | 64 + 4 x entry_size | | 844fc1f2750SBernard Iremonger | | | | | | 845fc1f2750SBernard Iremonger | | | | | | 846fc1f2750SBernard Iremonger | | | | *16-byte key size:* | | 847fc1f2750SBernard Iremonger | | | | | | 848fc1f2750SBernard Iremonger | | | | 128 + 4 x entry_size | | 849fc1f2750SBernard Iremonger | | | | | | 850fc1f2750SBernard Iremonger +---+-------------------------+------------------------------+----------------------+------------------------------------+ 851fc1f2750SBernard Iremonger 8528c9a3374SJohn McNamara.. _table_qos_31: 853fc1f2750SBernard Iremonger 8548c9a3374SJohn McNamara.. table:: Field Description for Bucket Array Entry (8-byte and 16-byte Key Hash Tables) 855fc1f2750SBernard Iremonger 856fc1f2750SBernard Iremonger +---+---------------+--------------------+-------------------------------------------------------------------------------+ 857fc1f2750SBernard Iremonger | # | Field name | Field size (bytes) | Description | 858fc1f2750SBernard Iremonger | | | | | 859fc1f2750SBernard Iremonger +===+===============+====================+===============================================================================+ 860fc1f2750SBernard Iremonger | 1 | Valid | 8 | Bit X (X = 0 .. 3) is set to 1 if key X is valid or to 0 otherwise. | 861fc1f2750SBernard Iremonger | | | | | 8622fe68f32SJohn McNamara | | | | Bit 4 is only used for extendable bucket tables to help with the | 863fc1f2750SBernard Iremonger | | | | implementation of the branchless logic. In this case, bit 4 is set to 1 if | 864fc1f2750SBernard Iremonger | | | | next pointer is valid (not NULL) or to 0 otherwise. | 865fc1f2750SBernard Iremonger | | | | | 866fc1f2750SBernard Iremonger +---+---------------+--------------------+-------------------------------------------------------------------------------+ 867fc1f2750SBernard Iremonger | 2 | Next Ptr/LRU | 8 | For LRU tables, this fields represents the LRU list for the current bucket | 868fc1f2750SBernard Iremonger | | | | stored as array of 4 entries of 2 bytes each. Entry 0 stores the index | 869fc1f2750SBernard Iremonger | | | | (0 .. 3) of the MRU key, while entry 3 stores the index of the LRU key. | 870fc1f2750SBernard Iremonger | | | | | 8712fe68f32SJohn McNamara | | | | For extendable bucket tables, this field represents the next pointer (i.e. | 872fc1f2750SBernard Iremonger | | | | the pointer to the next group of 4 keys linked to the current bucket). The | 873fc1f2750SBernard Iremonger | | | | next pointer is not NULL if the bucket is currently extended or NULL | 874fc1f2750SBernard Iremonger | | | | otherwise. | 875fc1f2750SBernard Iremonger | | | | | 876fc1f2750SBernard Iremonger +---+---------------+--------------------+-------------------------------------------------------------------------------+ 877fc1f2750SBernard Iremonger | 3 | Key [0 .. 3] | 4 x key_size | Full keys. | 878fc1f2750SBernard Iremonger | | | | | 879fc1f2750SBernard Iremonger +---+---------------+--------------------+-------------------------------------------------------------------------------+ 880fc1f2750SBernard Iremonger | 4 | Data [0 .. 3] | 4 x entry_size | Full key values (key data) associated with keys 0 .. 3. | 881fc1f2750SBernard Iremonger | | | | | 882fc1f2750SBernard Iremonger +---+---------------+--------------------+-------------------------------------------------------------------------------+ 883fc1f2750SBernard Iremonger 884fc1f2750SBernard Iremongerand detail the bucket search pipeline used to implement 8-byte and 16-byte key hash tables (either LRU or extendable bucket, 885fc1f2750SBernard Iremongereither with pre-computed signature or "do-sig"). 886fc1f2750SBernard IremongerFor each pipeline stage, the described operations are applied to each of the two packets handled by that stage. 887fc1f2750SBernard Iremonger 8884a22e6eeSJohn McNamara.. _figure_figure39: 889fc1f2750SBernard Iremonger 8904a22e6eeSJohn McNamara.. figure:: img/figure39.* 891fc1f2750SBernard Iremonger 8924a22e6eeSJohn McNamara Bucket Search Pipeline for Key Lookup Operation (Single Key Size Hash 8934a22e6eeSJohn McNamara Tables) 8944a22e6eeSJohn McNamara 895fc1f2750SBernard Iremonger 8968c9a3374SJohn McNamara.. _table_qos_32: 897fc1f2750SBernard Iremonger 8988c9a3374SJohn McNamara.. table:: Description of the Bucket Search Pipeline Stages (8-byte and 16-byte Key Hash Tables) 899fc1f2750SBernard Iremonger 900fc1f2750SBernard Iremonger +---+---------------------------+-----------------------------------------------------------------------------+ 901fc1f2750SBernard Iremonger | # | Stage name | Description | 902fc1f2750SBernard Iremonger | | | | 903fc1f2750SBernard Iremonger +===+===========================+=============================================================================+ 904fc1f2750SBernard Iremonger | 0 | Prefetch packet meta-data | #. Select next two packets from the burst of input packets. | 905fc1f2750SBernard Iremonger | | | | 906fc1f2750SBernard Iremonger | | | #. Prefetch packet meta-data containing the key and key signature. | 907fc1f2750SBernard Iremonger | | | | 908fc1f2750SBernard Iremonger +---+---------------------------+-----------------------------------------------------------------------------+ 909fc1f2750SBernard Iremonger | 1 | Prefetch table bucket | #. Read the key signature from the packet meta-data (for extendable bucket | 910fc1f2750SBernard Iremonger | | | hash tables) or read the key from the packet meta-data and compute key | 911fc1f2750SBernard Iremonger | | | signature (for LRU tables). | 912fc1f2750SBernard Iremonger | | | | 913fc1f2750SBernard Iremonger | | | #. Identify the bucket ID using the key signature. | 914fc1f2750SBernard Iremonger | | | | 915fc1f2750SBernard Iremonger | | | #. Prefetch the bucket. | 916fc1f2750SBernard Iremonger | | | | 917fc1f2750SBernard Iremonger +---+---------------------------+-----------------------------------------------------------------------------+ 918fc1f2750SBernard Iremonger | 2 | Prefetch table data | #. Read the bucket. | 919fc1f2750SBernard Iremonger | | | | 920fc1f2750SBernard Iremonger | | | #. Compare all 4 bucket keys against the input key. | 921fc1f2750SBernard Iremonger | | | | 922fc1f2750SBernard Iremonger | | | #. Report input key as lookup hit only when a match is identified (more | 923fc1f2750SBernard Iremonger | | | than one key match is not possible) | 924fc1f2750SBernard Iremonger | | | | 925fc1f2750SBernard Iremonger | | | #. For LRU tables only, use branchless logic to update the bucket LRU list | 926fc1f2750SBernard Iremonger | | | (the current key becomes the new MRU) only on lookup hit. | 927fc1f2750SBernard Iremonger | | | | 928fc1f2750SBernard Iremonger | | | #. Prefetch the key value (key data) associated with the matched key (to | 929fc1f2750SBernard Iremonger | | | avoid branches, this is done on both lookup hit and miss). | 930fc1f2750SBernard Iremonger | | | | 931fc1f2750SBernard Iremonger +---+---------------------------+-----------------------------------------------------------------------------+ 932fc1f2750SBernard Iremonger 933fc1f2750SBernard IremongerAdditional notes: 934fc1f2750SBernard Iremonger 935fc1f2750SBernard Iremonger#. The pipelined version of the bucket search algorithm is executed only if there are at least 5 packets in the burst of input packets. 936fc1f2750SBernard Iremonger If there are less than 5 packets in the burst of input packets, a non-optimized implementation of the bucket search algorithm is executed. 937fc1f2750SBernard Iremonger 938fea1d908SJohn McNamara#. For extendable bucket hash tables only, 939fc1f2750SBernard Iremonger once the pipelined version of the bucket search algorithm has been executed for all the packets in the burst of input packets, 940fc1f2750SBernard Iremonger the non-optimized implementation of the bucket search algorithm is also executed for any packets that did not produce a lookup hit, 941fc1f2750SBernard Iremonger but have the bucket in extended state. 942fc1f2750SBernard Iremonger As result of executing the non-optimized version, some of these packets may produce a lookup hit or lookup miss. 943fc1f2750SBernard Iremonger This does not impact the performance of the key lookup operation, 944fc1f2750SBernard Iremonger as the probability of having the bucket in extended state is relatively small. 945fc1f2750SBernard Iremonger 946fc1f2750SBernard IremongerPipeline Library Design 947fc1f2750SBernard Iremonger----------------------- 948fc1f2750SBernard Iremonger 949fc1f2750SBernard IremongerA pipeline is defined by: 950fc1f2750SBernard Iremonger 951fc1f2750SBernard Iremonger#. The set of input ports; 952fc1f2750SBernard Iremonger 953fc1f2750SBernard Iremonger#. The set of output ports; 954fc1f2750SBernard Iremonger 955fc1f2750SBernard Iremonger#. The set of tables; 956fc1f2750SBernard Iremonger 957fc1f2750SBernard Iremonger#. The set of actions. 958fc1f2750SBernard Iremonger 959fc1f2750SBernard IremongerThe input ports are connected with the output ports through tree-like topologies of interconnected tables. 960fc1f2750SBernard IremongerThe table entries contain the actions defining the operations to be executed on the input packets and the packet flow within the pipeline. 961fc1f2750SBernard Iremonger 962fc1f2750SBernard IremongerConnectivity of Ports and Tables 963fc1f2750SBernard Iremonger~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 964fc1f2750SBernard Iremonger 965fc1f2750SBernard IremongerTo avoid any dependencies on the order in which pipeline elements are created, 966fc1f2750SBernard Iremongerthe connectivity of pipeline elements is defined after all the pipeline input ports, 967fc1f2750SBernard Iremongeroutput ports and tables have been created. 968fc1f2750SBernard Iremonger 969fc1f2750SBernard IremongerGeneral connectivity rules: 970fc1f2750SBernard Iremonger 971fc1f2750SBernard Iremonger#. Each input port is connected to a single table. No input port should be left unconnected; 972fc1f2750SBernard Iremonger 973fc1f2750SBernard Iremonger#. The table connectivity to other tables or to output ports is regulated by the next hop actions of each table entry and the default table entry. 974fc1f2750SBernard Iremonger The table connectivity is fluid, as the table entries and the default table entry can be updated during run-time. 975fc1f2750SBernard Iremonger 976fc1f2750SBernard Iremonger * A table can have multiple entries (including the default entry) connected to the same output port. 977fc1f2750SBernard Iremonger A table can have different entries connected to different output ports. 978fc1f2750SBernard Iremonger Different tables can have entries (including default table entry) connected to the same output port. 979fc1f2750SBernard Iremonger 980fc1f2750SBernard Iremonger * A table can have multiple entries (including the default entry) connected to another table, 981fc1f2750SBernard Iremonger in which case all these entries have to point to the same table. 982fc1f2750SBernard Iremonger This constraint is enforced by the API and prevents tree-like topologies from being created (allowing table chaining only), 983fc1f2750SBernard Iremonger with the purpose of simplifying the implementation of the pipeline run-time execution engine. 984fc1f2750SBernard Iremonger 985fc1f2750SBernard IremongerPort Actions 986fc1f2750SBernard Iremonger~~~~~~~~~~~~ 987fc1f2750SBernard Iremonger 988fc1f2750SBernard IremongerPort Action Handler 989fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^^^^ 990fc1f2750SBernard Iremonger 991fc1f2750SBernard IremongerAn action handler can be assigned to each input/output port to define actions to be executed on each input packet that is received by the port. 992fc1f2750SBernard IremongerDefining the action handler for a specific input/output port is optional (i.e. the action handler can be disabled). 993fc1f2750SBernard Iremonger 994fc1f2750SBernard IremongerFor input ports, the action handler is executed after RX function. For output ports, the action handler is executed before the TX function. 995fc1f2750SBernard Iremonger 996fc1f2750SBernard IremongerThe action handler can decide to drop packets. 997fc1f2750SBernard Iremonger 998fc1f2750SBernard IremongerTable Actions 999fc1f2750SBernard Iremonger~~~~~~~~~~~~~ 1000fc1f2750SBernard Iremonger 1001fc1f2750SBernard IremongerTable Action Handler 1002fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^^^^^ 1003fc1f2750SBernard Iremonger 1004fc1f2750SBernard IremongerAn action handler to be executed on each input packet can be assigned to each table. 1005fc1f2750SBernard IremongerDefining the action handler for a specific table is optional (i.e. the action handler can be disabled). 1006fc1f2750SBernard Iremonger 1007fc1f2750SBernard IremongerThe action handler is executed after the table lookup operation is performed and the table entry associated with each input packet is identified. 1008fc1f2750SBernard IremongerThe action handler can only handle the user-defined actions, while the reserved actions (e.g. the next hop actions) are handled by the Packet Framework. 1009fc1f2750SBernard IremongerThe action handler can decide to drop the input packet. 1010fc1f2750SBernard Iremonger 1011fc1f2750SBernard IremongerReserved Actions 1012fc1f2750SBernard Iremonger^^^^^^^^^^^^^^^^ 1013fc1f2750SBernard Iremonger 1014fc1f2750SBernard IremongerThe reserved actions are handled directly by the Packet Framework without the user being able to change their meaning 1015fc1f2750SBernard Iremongerthrough the table action handler configuration. 1016fc1f2750SBernard IremongerA special category of the reserved actions is represented by the next hop actions, which regulate the packet flow between input ports, 1017fc1f2750SBernard Iremongertables and output ports through the pipeline. 10188c9a3374SJohn McNamara:numref:`table_qos_33` lists the next hop actions. 1019fc1f2750SBernard Iremonger 10208c9a3374SJohn McNamara.. _table_qos_33: 1021fc1f2750SBernard Iremonger 10228c9a3374SJohn McNamara.. table:: Next Hop Actions (Reserved) 1023fc1f2750SBernard Iremonger 1024fc1f2750SBernard Iremonger +---+---------------------+-----------------------------------------------------------------------------------+ 1025fc1f2750SBernard Iremonger | # | Next hop action | Description | 1026fc1f2750SBernard Iremonger | | | | 1027fc1f2750SBernard Iremonger +===+=====================+===================================================================================+ 1028fc1f2750SBernard Iremonger | 1 | Drop | Drop the current packet. | 1029fc1f2750SBernard Iremonger | | | | 1030fc1f2750SBernard Iremonger +---+---------------------+-----------------------------------------------------------------------------------+ 1031fc1f2750SBernard Iremonger | 2 | Send to output port | Send the current packet to specified output port. The output port ID is metadata | 1032fc1f2750SBernard Iremonger | | | stored in the same table entry. | 1033fc1f2750SBernard Iremonger | | | | 1034fc1f2750SBernard Iremonger +---+---------------------+-----------------------------------------------------------------------------------+ 1035fc1f2750SBernard Iremonger | 3 | Send to table | Send the current packet to specified table. The table ID is metadata stored in | 1036fc1f2750SBernard Iremonger | | | the same table entry. | 1037fc1f2750SBernard Iremonger | | | | 1038fc1f2750SBernard Iremonger +---+---------------------+-----------------------------------------------------------------------------------+ 1039fc1f2750SBernard Iremonger 1040fc1f2750SBernard IremongerUser Actions 1041fc1f2750SBernard Iremonger^^^^^^^^^^^^ 1042fc1f2750SBernard Iremonger 1043fc1f2750SBernard IremongerFor each table, the meaning of user actions is defined through the configuration of the table action handler. 1044fc1f2750SBernard IremongerDifferent tables can be configured with different action handlers, therefore the meaning of the user actions 1045fc1f2750SBernard Iremongerand their associated meta-data is private to each table. 1046fc1f2750SBernard IremongerWithin the same table, all the table entries (including the table default entry) share the same definition 1047fc1f2750SBernard Iremongerfor the user actions and their associated meta-data, 1048fc1f2750SBernard Iremongerwith each table entry having its own set of enabled user actions and its own copy of the action meta-data. 10498c9a3374SJohn McNamara:numref:`table_qos_34` contains a non-exhaustive list of user action examples. 1050fc1f2750SBernard Iremonger 10518c9a3374SJohn McNamara.. _table_qos_34: 1052fc1f2750SBernard Iremonger 10538c9a3374SJohn McNamara.. table:: User Action Examples 1054fc1f2750SBernard Iremonger 1055fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1056fc1f2750SBernard Iremonger | # | User action | Description | 1057fc1f2750SBernard Iremonger | | | | 1058fc1f2750SBernard Iremonger +===+===================================+=====================================================================+ 1059fc1f2750SBernard Iremonger | 1 | Metering | Per flow traffic metering using the srTCM and trTCM algorithms. | 1060fc1f2750SBernard Iremonger | | | | 1061fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1062fc1f2750SBernard Iremonger | 2 | Statistics | Update the statistics counters maintained per flow. | 1063fc1f2750SBernard Iremonger | | | | 1064fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1065fc1f2750SBernard Iremonger | 3 | App ID | Per flow state machine fed by variable length sequence of packets | 1066fc1f2750SBernard Iremonger | | | at the flow initialization with the purpose of identifying the | 1067fc1f2750SBernard Iremonger | | | traffic type and application. | 1068fc1f2750SBernard Iremonger | | | | 1069fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1070fc1f2750SBernard Iremonger | 4 | Push/pop labels | Push/pop VLAN/MPLS labels to/from the current packet. | 1071fc1f2750SBernard Iremonger | | | | 1072fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1073fc1f2750SBernard Iremonger | 5 | Network Address Translation (NAT) | Translate between the internal (LAN) and external (WAN) IP | 1074fc1f2750SBernard Iremonger | | | destination/source address and/or L4 protocol destination/source | 1075fc1f2750SBernard Iremonger | | | port. | 1076fc1f2750SBernard Iremonger | | | | 1077fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1078fc1f2750SBernard Iremonger | 6 | TTL update | Decrement IP TTL and, in case of IPv4 packets, update the IP | 1079fc1f2750SBernard Iremonger | | | checksum. | 1080fc1f2750SBernard Iremonger | | | | 1081fc1f2750SBernard Iremonger +---+-----------------------------------+---------------------------------------------------------------------+ 1082cc85c078SFan Zhang | 7 | Sym Crypto | Generate Cryptodev session based on the user-specified algorithm | 1083cc85c078SFan Zhang | | | and key(s), and assemble the cryptodev operation based on the | 1084cc85c078SFan Zhang | | | predefined offsets. | 1085cc85c078SFan Zhang | | | | 1086cc85c078SFan Zhang +---+-----------------------------------+---------------------------------------------------------------------+ 1087fc1f2750SBernard Iremonger 1088fc1f2750SBernard IremongerMulticore Scaling 1089fc1f2750SBernard Iremonger----------------- 1090fc1f2750SBernard Iremonger 1091fc1f2750SBernard IremongerA complex application is typically split across multiple cores, with cores communicating through SW queues. 1092fc1f2750SBernard IremongerThere is usually a performance limit on the number of table lookups 1093fc1f2750SBernard Iremongerand actions that can be fitted on the same CPU core due to HW constraints like: 1094fc1f2750SBernard Iremongeravailable CPU cycles, cache memory size, cache transfer BW, memory transfer BW, etc. 1095fc1f2750SBernard Iremonger 1096fc1f2750SBernard IremongerAs the application is split across multiple CPU cores, the Packet Framework facilitates the creation of several pipelines, 1097fc1f2750SBernard Iremongerthe assignment of each such pipeline to a different CPU core 1098fc1f2750SBernard Iremongerand the interconnection of all CPU core-level pipelines into a single application-level complex pipeline. 1099fc1f2750SBernard IremongerFor example, if CPU core A is assigned to run pipeline P1 and CPU core B pipeline P2, 1100fc1f2750SBernard Iremongerthen the interconnection of P1 with P2 could be achieved by having the same set of SW queues act like output ports 1101fc1f2750SBernard Iremongerfor P1 and input ports for P2. 1102fc1f2750SBernard Iremonger 1103fc1f2750SBernard IremongerThis approach enables the application development using the pipeline, run-to-completion (clustered) or hybrid (mixed) models. 1104fc1f2750SBernard Iremonger 1105fc1f2750SBernard IremongerIt is allowed for the same core to run several pipelines, but it is not allowed for several cores to run the same pipeline. 1106fc1f2750SBernard Iremonger 1107fc1f2750SBernard IremongerShared Data Structures 1108fc1f2750SBernard Iremonger~~~~~~~~~~~~~~~~~~~~~~ 1109fc1f2750SBernard Iremonger 1110fc1f2750SBernard IremongerThe threads performing table lookup are actually table writers rather than just readers. 1111fc1f2750SBernard IremongerEven if the specific table lookup algorithm is thread-safe for multiple readers 1112fc1f2750SBernard Iremonger(e. g. read-only access of the search algorithm data structures is enough to conduct the lookup operation), 1113fc1f2750SBernard Iremongeronce the table entry for the current packet is identified, the thread is typically expected to update the action meta-data stored in the table entry 1114fc1f2750SBernard Iremonger(e.g. increment the counter tracking the number of packets that hit this table entry), and thus modify the table entry. 1115fc1f2750SBernard IremongerDuring the time this thread is accessing this table entry (either writing or reading; duration is application specific), 1116fc1f2750SBernard Iremongerfor data consistency reasons, no other threads (threads performing table lookup or entry add/delete operations) are allowed to modify this table entry. 1117fc1f2750SBernard Iremonger 1118fc1f2750SBernard IremongerMechanisms to share the same table between multiple threads: 1119fc1f2750SBernard Iremonger 1120fc1f2750SBernard Iremonger#. **Multiple writer threads.** 1121fc1f2750SBernard Iremonger Threads need to use synchronization primitives like semaphores (distinct semaphore per table entry) or atomic instructions. 1122fc1f2750SBernard Iremonger The cost of semaphores is usually high, even when the semaphore is free. 1123fc1f2750SBernard Iremonger The cost of atomic instructions is normally higher than the cost of regular instructions. 1124fc1f2750SBernard Iremonger 1125fc1f2750SBernard Iremonger#. **Multiple writer threads, with single thread performing table lookup operations and multiple threads performing table entry add/delete operations.** 1126fc1f2750SBernard Iremonger The threads performing table entry add/delete operations send table update requests to the reader (typically through message passing queues), 1127fc1f2750SBernard Iremonger which does the actual table updates and then sends the response back to the request initiator. 1128fc1f2750SBernard Iremonger 1129fc1f2750SBernard Iremonger#. **Single writer thread performing table entry add/delete operations and multiple reader threads that perform table lookup operations with read-only access to the table entries.** 1130fc1f2750SBernard Iremonger The reader threads use the main table copy while the writer is updating the mirror copy. 1131fc1f2750SBernard Iremonger Once the writer update is done, the writer can signal to the readers and busy wait until all readers swaps between the mirror copy (which now becomes the main copy) and 1132fc1f2750SBernard Iremonger the mirror copy (which now becomes the main copy). 1133fc1f2750SBernard Iremonger 1134fc1f2750SBernard IremongerInterfacing with Accelerators 1135fc1f2750SBernard Iremonger----------------------------- 1136fc1f2750SBernard Iremonger 1137fc1f2750SBernard IremongerThe presence of accelerators is usually detected during the initialization phase by inspecting the HW devices that are part of the system (e.g. by PCI bus enumeration). 1138fc1f2750SBernard IremongerTypical devices with acceleration capabilities are: 1139fc1f2750SBernard Iremonger 1140fc1f2750SBernard Iremonger* Inline accelerators: NICs, switches, FPGAs, etc; 1141fc1f2750SBernard Iremonger 1142cc85c078SFan Zhang* Look-aside accelerators: chipsets, FPGAs, Intel QuickAssist, etc. 1143fc1f2750SBernard Iremonger 1144fc1f2750SBernard IremongerUsually, to support a specific functional block, specific implementation of Packet Framework tables and/or ports and/or actions has to be provided for each accelerator, 1145fc1f2750SBernard Iremongerwith all the implementations sharing the same API: pure SW implementation (no acceleration), implementation using accelerator A, implementation using accelerator B, etc. 1146fc1f2750SBernard IremongerThe selection between these implementations could be done at build time or at run-time (recommended), based on which accelerators are present in the system, 1147fc1f2750SBernard Iremongerwith no application changes required. 11485adff62eSCristian Dumitrescu 11495adff62eSCristian DumitrescuThe Software Switch (SWX) Pipeline 11505adff62eSCristian Dumitrescu---------------------------------- 11515adff62eSCristian Dumitrescu 11525adff62eSCristian DumitrescuThe Software Switch (SWX) pipeline is designed to combine the DPDK performance with the flexibility of the P4-16 language [1]. It can be used either by itself 11535adff62eSCristian Dumitrescuto code a complete software switch or data plane application, or in combination with the open-source P4 compiler P4C [2], acting as a P4C back-end that allows 11545adff62eSCristian Dumitrescuthe P4 programs to be translated to the DPDK SWX API and run on multi-core CPUs. 11555adff62eSCristian Dumitrescu 11565adff62eSCristian DumitrescuThe main features of the SWX pipeline are: 11575adff62eSCristian Dumitrescu 11585adff62eSCristian Dumitrescu* Nothing is hard-wired, everything is dynamically defined: The packet headers (i.e. the network protocols), the packet meta-data, the actions, the tables 11595adff62eSCristian Dumitrescu and the pipeline itself are dynamically defined instead of selected from a predefined set. 11605adff62eSCristian Dumitrescu 11615adff62eSCristian Dumitrescu* Instructions: The actions and the life of the packet through the pipeline are defined with instructions that manipulate the pipeline objects mentioned 11625adff62eSCristian Dumitrescu above. The pipeline is the main function of the packet program, with actions as subroutines triggered by the tables. 11635adff62eSCristian Dumitrescu 11645adff62eSCristian Dumitrescu* Call external plugins: Extern objects and functions can be defined to call functionality that cannot be efficiently implemented with the existing 11655adff62eSCristian Dumitrescu pipeline-oriented instruction set, such as: error detecting/correcting codes, cryptographic operations, meters, statistics counter arrays, heuristics, etc. 11665adff62eSCristian Dumitrescu 11675adff62eSCristian Dumitrescu* Better control plane interaction: Transaction-oriented table update mechanism that supports multi-table atomic updates. Multiple tables can be updated in a 11685adff62eSCristian Dumitrescu single step with only the before-update and the after-update table entries visible to the packets. Alignment with the P4Runtime [3] protocol. 11695adff62eSCristian Dumitrescu 11705adff62eSCristian Dumitrescu* Performance: Multiple packets are in-flight within the pipeline at any moment. Each packet is owned by a different time-sharing thread in 11715adff62eSCristian Dumitrescu run-to-completion, with the thread pausing before memory access operations such as packet I/O and table lookup to allow the memory prefetch to complete. 11725adff62eSCristian Dumitrescu The instructions are verified and translated at initialization time with no run-time impact. The instructions are also optimized to detect and "fuse" 11735adff62eSCristian Dumitrescu frequently used patterns into vector-like instructions transparently to the user. 11745adff62eSCristian Dumitrescu 11755adff62eSCristian DumitrescuThe main SWX pipeline components are: 11765adff62eSCristian Dumitrescu 11775adff62eSCristian Dumitrescu* Input and output ports: Each port instantiates a port type that defines the port operations, e.g. Ethernet device port, PCAP port, etc. The RX interface 11785adff62eSCristian Dumitrescu of the input ports and the TX interface of the output ports are single packet based, with packet batching typically implemented internally by each port for 11795adff62eSCristian Dumitrescu performance reasons. 11805adff62eSCristian Dumitrescu 11815adff62eSCristian Dumitrescu* Structure types: Each structure type is used to define the logical layout of a memory block, such as: packet headers, packet meta-data, action data stored 11825adff62eSCristian Dumitrescu in a table entry, mailboxes of extern objects and functions. Similar to C language structs, each structure type is a well defined sequence of fields, with 11835adff62eSCristian Dumitrescu each field having a unique name and a constant size. 11845adff62eSCristian Dumitrescu 11855adff62eSCristian Dumitrescu* Packet headers: Each packet typically has one or multiple headers. The headers are extracted from the input packet as part of the packet parsing operation, 11865adff62eSCristian Dumitrescu which is likely executed immediately after the packet reception. As result of the extract operation, each header is logically removed from the packet, so 11875adff62eSCristian Dumitrescu once the packet parsing operation is completed, the input packet is reduced to opaque payload. Just before transmission, one or several headers are pushed 11885adff62eSCristian Dumitrescu in front of each output packet through the emit operation; these headers can be part of the set of headers that were previously extracted from the input 11895adff62eSCristian Dumitrescu packet (and potentially modified afterwards) or some new headers whose contents is generated by the pipeline (e.g. by reading them from tables). The format 11905adff62eSCristian Dumitrescu of each packet header is defined by instantiating a structure type. 11915adff62eSCristian Dumitrescu 11925adff62eSCristian Dumitrescu* Packet meta-data: The packet meta-data is filled in by the pipeline (e.g. by reading it from tables) or computed by the pipeline. It is not sent out unless 11935adff62eSCristian Dumitrescu some of the meta-data fields are explicitly written into the headers emitted into the output packet. The format of the packet meta-data is defined by 11945adff62eSCristian Dumitrescu instantiating a structure type. 11955adff62eSCristian Dumitrescu 11965adff62eSCristian Dumitrescu* Extern objects and functions: Used to plug into the pipeline any functionality that cannot be efficiently implemented with the existing pipeline instruction 11975adff62eSCristian Dumitrescu set. Each extern object and extern function has its own mailbox, which is used to pass the input arguments to and retrieve the output arguments from the 11985adff62eSCristian Dumitrescu extern object member functions or the extern function. The mailbox format is defined by instantiating a structure type. 11995adff62eSCristian Dumitrescu 12005adff62eSCristian Dumitrescu* Instructions: The pipeline and its actions are defined with instructions from a predefined instruction set. The instructions are used to receive and 12015adff62eSCristian Dumitrescu transmit the current packet, extract and emit headers from/into the packet, read/write the packet headers, packet meta-data and mailboxes, start table 12025adff62eSCristian Dumitrescu lookup operations, read the action arguments from the table entry, call extern object member functions or extern functions. See the rte_swx_pipeline.h file 12035adff62eSCristian Dumitrescu for the complete list of instructions. 12045adff62eSCristian Dumitrescu 12055adff62eSCristian Dumitrescu* Actions: The pipeline actions are dynamically defined through instructions as opposed to predefined. Essentially, the actions are subroutines of the 12065adff62eSCristian Dumitrescu pipeline program and their execution is triggered by the table lookup. The input arguments of each action are read from the table entry (in case of table 12075adff62eSCristian Dumitrescu lookup hit) or the default table action (in case of table lookup miss) and are read-only; their format is defined by instantiating a structure type. The 12085adff62eSCristian Dumitrescu actions have read-write access to the packet headers and meta-data. 12095adff62eSCristian Dumitrescu 12105adff62eSCristian Dumitrescu* Table: Each pipeline typically has one or more lookup tables. The match fields of each table are flexibly selected from the packet headers and meta-data 12115adff62eSCristian Dumitrescu defined for the current pipeline. The set of table actions is flexibly selected for each table from the set of actions defined for the current pipeline. The 12125adff62eSCristian Dumitrescu tables can be looked at as special pipeline operators that result in one of the table actions being called, depending on the result of the table lookup 12135adff62eSCristian Dumitrescu operation. 12145adff62eSCristian Dumitrescu 12155adff62eSCristian Dumitrescu* Pipeline: The pipeline represents the main program that defines the life of the packet, with subroutines (actions) executed on table lookup. As packets 12165adff62eSCristian Dumitrescu go through the pipeline, the packet headers and meta-data are transformed along the way. 12175adff62eSCristian Dumitrescu 12185adff62eSCristian DumitrescuReferences: 12195adff62eSCristian Dumitrescu 12205adff62eSCristian Dumitrescu[1] P4-16 specification: https://p4.org/specs/ 12215adff62eSCristian Dumitrescu 12225adff62eSCristian Dumitrescu[2] P4-16 compiler: https://github.com/p4lang/p4c 12235adff62eSCristian Dumitrescu 12245adff62eSCristian Dumitrescu[3] P4Runtime specification: https://p4.org/specs/ 1225