..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(C) 2020 Marvell International Ltd.

Graph Library and Inbuilt Nodes
===============================

Graph architecture abstracts the data processing functions as a ``node`` and
``links`` them together to create a complex ``graph`` to enable reusable/modular
data processing functions.

The graph library provides an API to enable graph framework operations such as
create, lookup, dump and destroy on a graph, and node operations such as clone,
edge update, and edge shrink. The API also allows creating a stats
cluster to monitor per-graph and per-node statistics.

Features
--------

Features of the Graph library are:

- Nodes as plugins.
- Support for out of tree nodes.
- Inbuilt nodes for packet processing.
- Multi-process support.
- Low overhead graph walk and node enqueue.
- Low overhead statistics collection infrastructure.
- Support to export the graph as a Graphviz dot file. See ``rte_graph_export()``.
- Allow having another graph walk implementation in the future by segregating
  the fast path (``rte_graph_worker.h``) and slow path code.

Advantages of Graph architecture
--------------------------------

- Memory latency is the enemy of high-speed packet processing. Moving
  similar packet processing code into a node reduces I-cache and D-cache
  misses.
- Exploits the probability that most packets will follow the same nodes in the
  graph.
- Allows SIMD instructions for packet processing of the node.
- The modular scheme allows having reusable nodes for the consumers.
- The modular scheme allows us to abstract the vendor HW specific
  optimizations as a node.

Performance tuning parameters
-----------------------------

- Test with various burst size values (256, 128, 64, 32) using the
  ``RTE_GRAPH_BURST_SIZE`` config option.
  Testing shows that on x86 and arm64 servers the sweet spot is a burst
  size of 256, while on arm64 embedded SoCs it is either 64 or 128.
- Disable node statistics (using the ``RTE_LIBRTE_GRAPH_STATS`` config option)
  if not needed.

Programming model
-----------------

Anatomy of Node:
~~~~~~~~~~~~~~~~

.. _figure_anatomy_of_a_node:

.. figure:: img/anatomy_of_a_node.*

   Anatomy of a node

The node is the basic building block of the graph framework.

A node consists of:

process():
^^^^^^^^^^

The callback function is invoked by the worker thread using the
``rte_graph_walk()`` function when there is data to be processed by the node.
A graph node processes the data in ``process()`` and enqueues it to the next
downstream node using the ``rte_node_enqueue*()`` functions.

Context memory:
^^^^^^^^^^^^^^^

It is memory allocated by the library to store the node-specific context
information. This memory is used by the ``process()``, ``init()`` and
``fini()`` callbacks.

init():
^^^^^^^

The callback function is invoked by ``rte_graph_create()`` when
a node gets attached to a graph.

fini():
^^^^^^^

The callback function is invoked by ``rte_graph_destroy()`` when a
node gets detached from a graph.

Node name:
^^^^^^^^^^

It is the name of the node. When a node registers with the graph library, the
library assigns it an ID of type ``rte_node_t``. Either the ID or the name can
be used to look up the node. ``rte_node_from_name()`` and
``rte_node_id_to_name()`` are the node lookup functions.

nb_edges:
^^^^^^^^^

The number of downstream nodes connected to this node. The ``next_nodes[]``
array stores the downstream node objects. The ``rte_node_edge_update()`` and
``rte_node_edge_shrink()`` functions shall be used to update the
``next_nodes[]`` objects. Consumers of the node APIs are free to update the
``next_nodes[]`` objects until ``rte_graph_create()`` is invoked.

next_nodes[]:
^^^^^^^^^^^^^

The dynamic array that stores the downstream nodes connected to this node.
A downstream node should not be the current node itself or a source node.

Source node:
^^^^^^^^^^^^

Source nodes are static nodes created using ``RTE_NODE_REGISTER`` by passing
``flags`` as ``RTE_NODE_SOURCE_F``.
While performing the graph walk, the ``process()`` function of all the source
nodes is called first, so that these nodes can be used as input nodes for a
graph.

Node creation and registration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* The node implementer creates the node by implementing the ops and attributes
  of ``struct rte_node_register``.

* The library registers the node by invoking ``RTE_NODE_REGISTER`` on library
  load using the constructor scheme. The constructor scheme is used here to
  support multi-process.

Link the Nodes to create the graph topology
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. _figure_link_the_nodes:

.. figure:: img/link_the_nodes.*

   Topology after linking the nodes

Once nodes are available to the program, the application or node public API
functions can link them together to create a complex packet processing graph.

There are multiple strategies to link the nodes.

Method (a):
^^^^^^^^^^^
Provide the ``next_nodes[]`` at node registration time. See ``struct rte_node_register::nb_edges``.
This addresses the static node scheme, where one knows the
``next_nodes[]`` of the node upfront.

Method (b):
^^^^^^^^^^^
Use ``rte_node_edge_get()``, ``rte_node_edge_update()``, ``rte_node_edge_shrink()``
to update the ``next_nodes[]`` links of the node at runtime, but before the graph is created.

Method (c):
^^^^^^^^^^^
Use ``rte_node_clone()`` to clone an already existing node that was created using ``RTE_NODE_REGISTER``.
When ``rte_node_clone()`` is invoked, the library clones all the attributes
of the node and creates a new one. The name of the cloned node shall be
``"parent_node_name-user_provided_name"``.

This method enables the use case of Rx and Tx nodes, where multiple such nodes
need to be cloned based on the number of CPUs available in the system.
The cloned nodes are identical except for the ``"context memory"``.
The context memory holds the port and queue pair information in the case of
Rx and Tx ethdev nodes.

Create the graph object
~~~~~~~~~~~~~~~~~~~~~~~
Now that the nodes are linked, it is time to create a graph by including
the required nodes. The application can provide a set of node patterns to
form a graph object. The ``fnmatch()`` API is used underneath for the pattern
matching to include the required nodes. After the graph is created, no changes
to the nodes or the graph are allowed.

The ``rte_graph_create()`` API shall be used to create the graph.

Example of a graph object creation:

.. code-block:: console

   {"ethdev_rx-0-0", "ip4*", "ethdev_tx-*"}

In the above example, a graph object will be created with the ethdev Rx
node of port 0 and queue 0, all ``ip4*`` nodes in the system,
and the ethdev Tx nodes of all ports.
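
The name matching step can be tried out in isolation with plain POSIX
``fnmatch()``. The sketch below is only an illustration of how the three
patterns above select names from a list; ``select_nodes()`` and the list of
registered names are hypothetical and are not part of the graph library, which
may also do more than name matching when it builds the graph.

.. code-block:: c

    #include <assert.h>
    #include <fnmatch.h>
    #include <stddef.h>

    /*
     * Hypothetical helper: store in selected[] every name that matches at
     * least one pattern and return how many names were selected.
     * selected[] must have room for nb_names entries.
     */
    static size_t
    select_nodes(const char *const *patterns, size_t nb_patterns,
                 const char *const *names, size_t nb_names,
                 const char **selected)
    {
        size_t count = 0;

        assert(patterns != NULL && names != NULL && selected != NULL);
        for (size_t i = 0; i < nb_names; i++) {
            for (size_t j = 0; j < nb_patterns; j++) {
                /* fnmatch() returns 0 when the name matches the pattern. */
                if (fnmatch(patterns[j], names[i], 0) == 0) {
                    selected[count++] = names[i];
                    break; /* do not select the same name twice */
                }
            }
        }
        return count;
    }

With the patterns from the example and a name list that contains
``ethdev_rx-0-0``, ``ethdev_rx-1-0``, ``ip4_lookup``, ``ip4_rewrite``,
``ip6_lookup``, ``ethdev_tx-0``, ``ethdev_tx-1`` and ``pkt_drop``, the helper
selects every name except ``ethdev_rx-1-0``, ``ip6_lookup`` and ``pkt_drop``.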

Multicore graph processing
~~~~~~~~~~~~~~~~~~~~~~~~~~
In the current graph library implementation, the fast path API functions,
specifically ``rte_graph_walk()`` and ``rte_node_enqueue*``,
are designed to work on a single core for better performance.
The fast path API works on a graph object, so the multi-core graph
processing strategy is to create one graph object per worker.

In fast path
~~~~~~~~~~~~
Typical fast-path code looks like the example below, where the application
gets the fast-path graph object using ``rte_graph_lookup()``
on the worker thread and runs ``rte_graph_walk()`` in a tight loop.

.. code-block:: c

    struct rte_graph *graph = rte_graph_lookup("worker0");

    while (!done) {
        rte_graph_walk(graph);
    }

Context update when graph walk in action
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The fast-path object for the node is ``struct rte_node``.

In the slow path, or once the graph walk is in action, the user may need to
update the context of a node and hence needs access to the
``struct rte_node *`` memory.

The ``rte_graph_foreach_node()``, ``rte_graph_node_get()`` and
``rte_graph_node_get_by_name()`` APIs can be used to get the
``struct rte_node *``. The ``rte_graph_foreach_node()`` iterator works on the
``struct rte_graph *`` fast-path graph object, while the others work on the
graph ID or name.

Get the node statistics using graph cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The user may need to know the aggregate stats of a node across
multiple graph objects, especially when each graph object is bound
to a worker thread.

A graph cluster object is provided for statistics.
The ``rte_graph_cluster_stats_create()`` API shall be used to create a
graph cluster with multiple graph objects and ``rte_graph_cluster_stats_get()``
to get the aggregate node statistics.

An example statistics output from ``rte_graph_cluster_stats_get()``:

.. code-block:: diff

    +---------+-----------+-------------+---------------+-----------+---------------+-----------+
    |Node     |calls      |objs         |realloc_count  |objs/call  |objs/sec(10E6) |cycles/call|
    +---------+-----------+-------------+---------------+-----------+---------------+-----------+
    |node0    |12977424   |3322220544   |5              |256.000    |3047.151872    |20.0000    |
    |node1    |12977653   |3322279168   |0              |256.000    |3047.210496    |17.0000    |
    |node2    |12977696   |3322290176   |0              |256.000    |3047.221504    |17.0000    |
    |node3    |12977734   |3322299904   |0              |256.000    |3047.231232    |17.0000    |
    |node4    |12977784   |3322312704   |1              |256.000    |3047.243776    |17.0000    |
    |node5    |12977825   |3322323200   |0              |256.000    |3047.254528    |17.0000    |
    +---------+-----------+-------------+---------------+-----------+---------------+-----------+
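
The idea of aggregating one node's counters across several graph objects can
be modelled in a few lines of plain C. This is a sketch under stated
assumptions, not the library's implementation: ``struct node_stat`` and both
helper functions are hypothetical names, and only the ``calls``, ``objs`` and
``objs/call`` columns of the table are modelled.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical per-graph counters of one node; not the library layout. */
    struct node_stat {
        uint64_t calls; /* number of process() invocations */
        uint64_t objs;  /* number of objects processed */
    };

    /* Sum the counters of the same node over all graphs (one per worker). */
    static struct node_stat
    node_stat_aggregate(const struct node_stat *per_graph, size_t nb_graphs)
    {
        struct node_stat total = {0, 0};

        assert(per_graph != NULL || nb_graphs == 0);
        for (size_t i = 0; i < nb_graphs; i++) {
            total.calls += per_graph[i].calls;
            total.objs += per_graph[i].objs;
        }
        return total;
    }

    /* Derived column: average objects per call, 0 if the node never ran. */
    static double
    node_stat_objs_per_call(const struct node_stat *s)
    {
        return s->calls ? (double)s->objs / (double)s->calls : 0.0;
    }

For the ``node0`` row above, 3322220544 objects over 12977424 calls gives
exactly 256 objects per call, which matches the ``objs/call`` column.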

Node writing guidelines
~~~~~~~~~~~~~~~~~~~~~~~

The ``process()`` function of a node is the fast-path function and needs
to be written carefully to achieve maximum performance.

Broadly speaking, there are two different types of nodes.

Static nodes
~~~~~~~~~~~~
The first kind of nodes are those that have a fixed ``next_nodes[]`` for the
complete burst (like ethdev_rx, ethdev_tx) and are simple to write.
The ``process()`` function can move the obj burst to the next node either using
``rte_node_next_stream_move()`` or using ``rte_node_next_stream_get()`` and
``rte_node_next_stream_put()``.

Intermediate nodes
~~~~~~~~~~~~~~~~~~
The second kind is ``intermediate nodes``, which decide the next node
to send to on a per-packet basis. In these nodes:

* Firstly, there has to be the best possible packet processing logic.

* Secondly, each packet needs to be queued to its next node.

This can be done using the ``rte_node_enqueue_[x1|x2|x4]()`` APIs if the
packets go to a single next node, or ``rte_node_enqueue_next()``, which takes
an array of next nodes.

In a scenario where multiple intermediate nodes are present but, most of the
time, each node uses the same next node for all its packets, the cost of moving
every pointer from the current node's stream to the next node's stream can be
avoided. This is called a home run, and ``rte_node_next_stream_move()`` can be
used to move the stream from the current node to the next node with the least
number of cycles.
Since the copy can be avoided only when all the packets are destined
to the same next node, the node implementation should also handle the worst
case, where every packet goes to a different next node.

Example of intermediate node implementation with home run:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

1. Start with the speculation that ``next_node = node->ctx``.
   This could be the next node the application used in the previous
   call of this node.

2. Get the next_node stream array with the required space using
   ``rte_node_next_stream_get(next_node, space)``.

3. While ``n_left_from > 0`` (i.e. packets are left to be sent), prefetch the
   next pkt_set and process the current pkt_set to find their next node.

4. If all the next nodes of the current pkt_set match the speculated next node,
   just count them as successfully speculated (``last_spec``) so far and
   continue the loop without actually moving them to the next node. If there is
   a mismatch, copy all the pkt_set pointers counted in ``last_spec`` and move
   the current pkt_set to their respective next nodes using
   ``rte_node_enqueue_x1()``.
   Also, one of the next nodes can become the speculated next_node if it is
   more probable. Finally, reset ``last_spec`` to zero.

5. If ``n_left_from != 0``, go to step 3 to process the remaining packets.

6. If ``last_spec == nb_objs``, all the objects passed were successfully
   speculated to a single next node. So, the current stream can be moved to the
   next node using ``rte_node_next_stream_move(node, next_node)``.
   This is the ``home run``, where the memcpy of buffer pointers to the next
   node is avoided.

7. Update ``node->ctx`` with the more probable next node.
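
The steps above can be sketched as a small self-contained model in plain C.
It is an illustration only and does not use the graph library: every
``model_*`` name is hypothetical, the per-edge streams are plain pointer
arrays supplied by the caller and assumed large enough for a burst,
prefetching is left out, and the speculated edge stays fixed, so step 7 is
not modelled.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    #define MODEL_MAX_EDGES 4

    /* Hypothetical packet carrying its already-computed next edge. */
    struct model_pkt {
        uint16_t edge;
    };

    /* Hypothetical stream; not the layout of struct rte_node. */
    struct model_stream {
        void **objs;    /* pointer array currently owned by the stream */
        uint16_t count; /* number of valid entries in objs */
    };

    struct model_node {
        struct model_stream in;                    /* burst to process */
        struct model_stream next[MODEL_MAX_EDGES]; /* one stream per edge */
        uint16_t spec_edge;                        /* plays the role of node->ctx */
        uint32_t home_runs;                        /* bursts moved without copying */
    };

    static void
    model_process(struct model_node *n)
    {
        struct model_stream *spec = &n->next[n->spec_edge];
        void **from = n->in.objs;                  /* start of the uncopied run */
        void **to_next = spec->objs + spec->count; /* step 2 */
        uint16_t nb_objs = n->in.count;
        uint16_t last_spec = 0; /* length of the uncopied matching run */
        uint16_t held = 0;      /* objects already copied to spec */

        if (nb_objs == 0)
            return;

        for (uint16_t i = 0; i < nb_objs; i++) {   /* steps 3 and 5 */
            uint16_t edge = ((struct model_pkt *)n->in.objs[i])->edge;

            assert(edge < MODEL_MAX_EDGES);
            if (edge == n->spec_edge) {
                last_spec++;                       /* step 4: count, no copy */
                continue;
            }
            /* Mismatch: flush the matching run, enqueue this object alone. */
            memcpy(to_next, from, last_spec * sizeof(from[0]));
            from += last_spec;
            to_next += last_spec;
            held += last_spec;
            last_spec = 0;
            n->next[edge].objs[n->next[edge].count++] = from[0];
            from++;
        }

        if (last_spec == nb_objs && spec->count == 0) {
            /* Step 6, home run: swap the arrays instead of copying. */
            void **tmp = spec->objs;

            spec->objs = n->in.objs;
            n->in.objs = tmp;
            spec->count = nb_objs;
            n->home_runs++;
        } else {
            /* Copy the trailing matching run and publish the new count. */
            memcpy(to_next, from, last_spec * sizeof(from[0]));
            spec->count += held + last_spec;
        }
        n->in.count = 0;
    }

In this model a burst whose packets all carry the speculated edge is handed
over by swapping two array pointers, while a burst with a single mismatch
falls back to copying the matching runs around the mismatching packet.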

Graph object memory layout
--------------------------
.. _figure_graph_mem_layout:

.. figure:: img/graph_mem_layout.*

   Memory layout

Understanding the memory layout helps to debug the graph library and
improve the performance if needed.

A graph object consists of a header, a circular buffer to store the pending
streams when walking over the graph, and variable-length memory to store
the ``rte_node`` objects.

``graph_nodes_mem_create()`` creates and populates this memory. Functions
such as ``rte_graph_walk()`` and ``rte_node_enqueue_*`` use this memory
to enable fast-path services.

Inbuilt Nodes
-------------

DPDK provides a set of nodes for data processing. The following sections
document them.

ethdev_rx
~~~~~~~~~
This node does ``rte_eth_rx_burst()`` into the stream buffer passed to it
(src node stream) and does ``rte_node_next_stream_move()`` only when
packets are received. Each ``rte_node`` works on only one Rx port and
queue, which it gets from ``node->ctx``. For each (port X, rx_queue Y),
an ``rte_node`` is cloned from ethdev_rx_base_node as ``ethdev_rx-X-Y`` in
``rte_node_eth_config()``, which also updates ``node->ctx``.
Each graph needs to be associated with a unique ``rte_node`` for a (port, rx_queue).

ethdev_tx
~~~~~~~~~
This node does ``rte_eth_tx_burst()`` for a burst of objs received by it.
It sends the burst to a fixed Tx port and queue, taken from
``node->ctx``. For each (port X), this ``rte_node`` is cloned from
ethdev_tx_node_base as ``ethdev_tx-X`` in ``rte_node_eth_config()``,
which also updates ``node->ctx``.

Since each graph doesn't need more than one Txq per port, a Txq is assigned
to each ``rte_node`` instance based on the graph id. Each graph needs to be
associated with an ``rte_node`` for each (port).

pkt_drop
~~~~~~~~
This node frees all the objects passed to it, treating them as
``rte_mbufs`` that need to be freed.

ip4_lookup
~~~~~~~~~~
This node is an intermediate node that does an LPM lookup for the received
IPv4 packets; the result determines each packet's next node.

On successful LPM lookup, the result contains the ``next_node`` id and
``next-hop`` id with which the packet needs to be further processed.

On LPM lookup failure, objects are redirected to the ``pkt_drop`` node.
``rte_node_ip4_route_add()`` is the control path API to add IPv4 routes.
To achieve a home run, the node uses ``rte_node_next_stream_move()`` as
mentioned in the above sections.

ip4_rewrite
~~~~~~~~~~~
This node gets packets from the ``ip4_lookup`` node, with the next-hop id of
each packet embedded in ``node_mbuf_priv1(mbuf)->nh``. This id is used
to determine the L2 header to be written to the packet before sending
the packet out to a particular ethdev_tx node.
``rte_node_ip4_rewrite_add()`` is the control path API to add next-hop info.

ip6_lookup
~~~~~~~~~~
This node is an intermediate node that does an LPM lookup for the received
IPv6 packets; the result determines each packet's next node.

On successful LPM lookup, the result contains the ``next_node`` ID
and ``next-hop`` ID with which the packet needs to be further processed.

On LPM lookup failure, objects are redirected to the ``pkt_drop`` node.
``rte_node_ip6_route_add()`` is the control path API to add IPv6 routes.
To achieve a home run, the node uses ``rte_node_next_stream_move()``
as mentioned in the above sections.

null
~~~~
This node ignores the set of objects passed to it and reports that all are
processed.

kernel_tx
~~~~~~~~~
This node is an exit node that forwards the packets to the kernel.
It is used to forward any control plane traffic from DPDK to the kernel stack.
It uses a raw socket interface to transmit the packets:
it places the packet's destination IP address in a ``sockaddr_in`` address
structure and uses the ``sendto`` function to send data on the raw socket.
After sending the burst of packets to the kernel,
this node frees the packet buffers.

kernel_rx
~~~~~~~~~
This node is a source node which receives packets from the kernel
and forwards them to any of the intermediate nodes.
It uses the raw socket interface to receive packets from the kernel.
It uses the ``poll`` function to poll the socket fd
for ``POLLIN`` events, reads the packets from the raw socket
into the stream buffer and does ``rte_node_next_stream_move()``
when packets have been received.