..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(C) 2020 Marvell International Ltd.

L3 Forwarding Graph Sample Application
======================================

The L3 Forwarding Graph application is a simple example of packet processing
using the DPDK Graph framework. The application performs L3 forwarding using
the graph framework and nodes written for the graph framework.

Overview
--------

The application demonstrates the use of the graph framework and graph nodes
``ethdev_rx``, ``ip4_lookup``, ``ip4_rewrite``, ``ethdev_tx`` and ``pkt_drop`` in DPDK to
implement packet forwarding.

The initialization is very similar to that of the :doc:`l3_forward`.
There is also additional graph initialization, which creates and configures
a graph object per lcore.
The run-time path is the main thing that differs from the L3 forwarding sample
application: the forwarding logic, starting from Rx, followed by LPM lookup,
TTL update and finally Tx, is implemented inside graph nodes which are
interconnected in the graph framework. The application main loop needs to walk
over the graph using ``rte_graph_walk()``, with one graph object created per worker lcore.
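
Schematically, the graph walked by each worker lcore covers the following path
(a simplified view; both ``ip4_lookup`` and ``ip4_rewrite`` also have an edge to
``pkt_drop`` for packets that cannot be forwarded)::

    ethdev_rx-X-Y -> ip4_lookup -> ip4_rewrite -> ethdev_tx-Z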

The lookup method is as per the implementation of the ``ip4_lookup`` graph node.
The ID of the output interface for the input packet is the next hop returned by
the LPM lookup. The set of LPM rules used by the application is statically
configured and provided to the ``ip4_lookup`` and ``ip4_rewrite`` graph nodes
using the node control APIs ``rte_node_ip4_route_add()`` and ``rte_node_ip4_rewrite_add()``.

In the sample application, only IPv4 forwarding is supported as of now.
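
For illustration, a minimal sketch of these two control API calls; the route,
next hop id and output port used here are hypothetical:

.. code-block:: c

    /* Hypothetical route: 10.0.2.0/24 resolves to next hop 0, which
     * rewrites the Ethernet header and sends the packet out port 0.
     */
    uint8_t rewrite_data[2 * RTE_ETHER_ADDR_LEN]; /* dst MAC + src MAC */
    int ret;

    /* ... fill rewrite_data with the new Ethernet header bytes ... */

    ret = rte_node_ip4_route_add(RTE_IPV4(10, 0, 2, 0), 24, 0,
                                 RTE_NODE_IP4_LOOKUP_NEXT_REWRITE);
    ret = rte_node_ip4_rewrite_add(0, rewrite_data,
                                   2 * RTE_ETHER_ADDR_LEN, 0);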

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``l3fwd-graph`` sub-directory.

Running the Application
-----------------------

The application has a number of command line options similar to l3fwd::

    ./dpdk-l3fwd-graph [EAL options] -- -p PORTMASK
                                   [-P]
                                   --config (port,queue,lcore)[,(port,queue,lcore)]
                                   [--eth-dest=X,MM:MM:MM:MM:MM:MM]
                                   [--enable-jumbo [--max-pkt-len PKTLEN]]
                                   [--no-numa]
                                   [--per-port-pool]

Where,

* ``-p PORTMASK:`` Hexadecimal bitmask of ports to configure.

* ``-P:`` Optional, sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address.
  Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.

* ``--config (port,queue,lcore)[,(port,queue,lcore)]:`` Determines which queues from which ports are mapped to which cores.

* ``--eth-dest=X,MM:MM:MM:MM:MM:MM:`` Optional, Ethernet destination address for port X.

* ``--enable-jumbo:`` Optional, enables jumbo frames.

* ``--max-pkt-len:`` Optional, maximum packet length in decimal (64-9600); valid only when ``--enable-jumbo`` is given.

* ``--no-numa:`` Optional, disables NUMA awareness.

* ``--per-port-pool:`` Optional, uses an independent buffer pool per port. Without this option, a single buffer pool is used for all ports.

For example, consider a dual processor socket platform with 8 physical cores, where cores 0-7 and 16-23 appear on socket 0,
while cores 8-15 and 24-31 appear on socket 1.

To enable L3 forwarding between two ports, assuming that both ports are on the same socket, using two cores, 1 and 2,
which are also on that socket, use the following command:

.. code-block:: console

    ./<build_dir>/examples/dpdk-l3fwd-graph -l 1,2 -n 4 -- -p 0x3 --config="(0,0,1),(1,0,2)"

In this command:

*   The ``-l`` option enables cores 1 and 2.

*   The ``-p`` option enables ports 0 and 1.

*   The ``--config`` option enables one queue on each port and maps each (port,queue) pair to a specific core.
    The following table shows the mapping in this example:

+----------+-----------+-----------+-------------------------------------+
| **Port** | **Queue** | **lcore** | **Description**                     |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+
| 0        | 0         | 1         | Map queue 0 from port 0 to lcore 1. |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+
| 1        | 0         | 2         | Map queue 0 from port 1 to lcore 2. |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+

Refer to the *DPDK Getting Started Guide* for general information on running applications and
the Environment Abstraction Layer (EAL) options.

.. _l3_fwd_graph_explanation:

Explanation
-----------

The following sections provide some explanation of the sample application code.
As mentioned in the overview section, the initialization is similar to that of
the :doc:`l3_forward`. The run-time path, though similar in functionality to that of
:doc:`l3_forward`, has the major part of its implementation in graph nodes provided
via the ``librte_node`` library.
The following sections describe aspects that are specific to the L3 Forwarding
Graph sample application.

Graph Node Pre-Init Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After device configuration and the device Rx and Tx queue setup are complete,
a minimal config of port id, num_rx_queues, num_tx_queues, mempools etc. is
passed to the ``ethdev_*`` node ctrl API ``rte_node_eth_config()``. This leads
to cloning of the ``ethdev_rx`` and ``ethdev_tx`` nodes as ``ethdev_rx-X-Y`` and
``ethdev_tx-X``, where X and Y represent the port id and queue id associated
with them. For example, port 0 with Rx queue 0 results in clones named
``ethdev_rx-0-0`` and ``ethdev_tx-0``. In case of ``ethdev_tx-X`` nodes, the Tx
queue id assigned per instance of the node is the same as the graph id.

These cloned nodes, along with existing static nodes such as ``ip4_lookup`` and
``ip4_rewrite``, will be used in graph creation to associate the nodes with the
lcore specific graph object.

.. code-block:: c

    RTE_ETH_FOREACH_DEV(portid)
    {

        /* ... */
        ret = rte_eth_dev_configure(portid, nb_rx_queue,
                                    n_tx_queue, &local_port_conf);
        /* ... */

        /* Init one Tx queue per (lcore, port) pair */
        queueid = 0;
        for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
            /* ... */
            ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
                                         socketid, txconf);
            /* ... */
            queueid++;
        }

        /* Setup ethdev node config */
        ethdev_conf[nb_conf].port_id = portid;
        ethdev_conf[nb_conf].num_rx_queues = nb_rx_queue;
        ethdev_conf[nb_conf].num_tx_queues = n_tx_queue;
        if (!per_port_pool)
            ethdev_conf[nb_conf].mp = pktmbuf_pool[0];
        else
            ethdev_conf[nb_conf].mp = pktmbuf_pool[portid];
        ethdev_conf[nb_conf].mp_count = NB_SOCKETS;

        nb_conf++;
        printf("\n");
    }

    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
        /* Init Rx queues */
        for (queue = 0; queue < qconf->n_rx_queue; ++queue) {
            /* ... */
            if (!per_port_pool)
                ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd, socketid,
                                             &rxq_conf, pktmbuf_pool[0][socketid]);
            else
                ret = rte_eth_rx_queue_setup(portid, queueid, nb_rxd, socketid,
                                             &rxq_conf, pktmbuf_pool[portid][socketid]);
            /* ... */
        }
    }

    /* Ethdev node config, skip rx queue mapping */
    ret = rte_node_eth_config(ethdev_conf, nb_conf, nb_graphs);

Graph Initialization
~~~~~~~~~~~~~~~~~~~~

Now a graph has to be created with a specific set of nodes for every lcore.
A graph object returned after graph creation is a per lcore object and
cannot be shared between lcores. Since the ``ethdev_tx-X`` node is a per port
node, it can be associated with all the graphs created, as all the lcores
should have Tx capability for every port. But an ``ethdev_rx-X-Y`` node is
created per (port, rx_queue_id), so it should be associated with a graph based
on the application argument ``--config`` specifying the Rx queue mapping to the lcore.

.. note::

    Graph creation will fail if the passed set of shell node patterns
    is not sufficient to meet their inter-dependency, or if even one node is
    not found with a given regex node pattern.

.. code-block:: c

    static const char *const default_patterns[] = {
        "ip4*",
        "ethdev_tx-*",
        "pkt_drop",
    };
    const char **node_patterns;
    uint16_t nb_patterns;

    /* ... */

    /* Create a graph object per lcore with common nodes and
     * lcore specific nodes based on application arguments
     */
    nb_patterns = RTE_DIM(default_patterns);
    node_patterns = malloc((MAX_RX_QUEUE_PER_LCORE + nb_patterns) *
                           sizeof(*node_patterns));
    memcpy(node_patterns, default_patterns,
           nb_patterns * sizeof(*node_patterns));

    memset(&graph_conf, 0, sizeof(graph_conf));

    /* Common set of nodes in every lcore's graph object */
    graph_conf.node_patterns = node_patterns;

    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
        /* ... */

        /* Skip graph creation if no source exists */
        if (!qconf->n_rx_queue)
            continue;

        /* Add rx node patterns of this lcore based on --config */
        for (i = 0; i < qconf->n_rx_queue; i++) {
            graph_conf.node_patterns[nb_patterns + i] =
                                qconf->rx_queue_list[i].node_name;
        }

        graph_conf.nb_node_patterns = nb_patterns + i;
        graph_conf.socket_id = rte_lcore_to_socket_id(lcore_id);

        snprintf(qconf->name, sizeof(qconf->name), "worker_%u", lcore_id);

        graph_id = rte_graph_create(qconf->name, &graph_conf);

        /* ... */

        qconf->graph = rte_graph_lookup(qconf->name);

        /* ... */
    }
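
Since graph creation can fail for the reasons given in the note above, the
return value of ``rte_graph_create()`` should be checked. A minimal sketch of
such a check (the error message text is illustrative):

.. code-block:: c

    graph_id = rte_graph_create(qconf->name, &graph_conf);
    if (graph_id == RTE_GRAPH_ID_INVALID)
        rte_exit(EXIT_FAILURE,
                 "rte_graph_create(): graph_id invalid for lcore %u\n",
                 lcore_id);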

Forwarding Data (Route, Next-Hop) Addition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once graph objects are created, node specific info like routes and rewrite
headers is provided at run-time using the ``rte_node_ip4_route_add()`` and
``rte_node_ip4_rewrite_add()`` APIs.

.. note::

    Since currently the ``ip4_lookup`` and ``ip4_rewrite`` nodes don't support
    lock-less mechanisms (RCU, etc.) to add run-time forwarding data like route
    and rewrite data, the forwarding data is added before the packet processing
    loop is launched on the worker lcores.

.. code-block:: c

    /* Add route to ip4 graph infra */
    for (i = 0; i < IPV4_L3FWD_LPM_NUM_ROUTES; i++) {
        /* ... */

        dst_port = ipv4_l3fwd_lpm_route_array[i].if_out;
        next_hop = i;

        /* ... */
        ret = rte_node_ip4_route_add(ipv4_l3fwd_lpm_route_array[i].ip,
                                     ipv4_l3fwd_lpm_route_array[i].depth, next_hop,
                                     RTE_NODE_IP4_LOOKUP_NEXT_REWRITE);

        /* ... */

        memcpy(rewrite_data, val_eth + dst_port, rewrite_len);

        /* Add next hop for a given destination */
        ret = rte_node_ip4_rewrite_add(next_hop, rewrite_data,
                                       rewrite_len, dst_port);

        RTE_LOG(INFO, L3FWD_GRAPH, "Added route %s, next_hop %u\n",
                route_str, next_hop);
    }

Packet Forwarding using Graph Walk
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now that all the device configuration is done, the graphs are created and the
forwarding data is in place, the worker lcores are launched with the graph main
loop. The graph main loop is very simple in the sense that it needs to
continuously call the non-blocking API ``rte_graph_walk()`` with its lcore
specific graph object that was already created.

.. note::

    ``rte_graph_walk()`` will walk over all the source nodes, i.e. ``ethdev_rx-X-Y``,
    associated with a given graph, receive the available packets and enqueue them
    to the following node ``ip4_lookup``, which in turn enqueues them to the
    ``ip4_rewrite`` node if the LPM lookup succeeds. The ``ip4_rewrite`` node then
    updates the Ethernet header as per the next-hop data and transmits the packet
    via port 'Z' by enqueuing it to the ``ethdev_tx-Z`` node instance in its graph
    object.

.. code-block:: c

    /* Main processing loop */
    static int
    graph_main_loop(void *conf)
    {
        /* ... */

        lcore_id = rte_lcore_id();
        qconf = &lcore_conf[lcore_id];
        graph = qconf->graph;

        RTE_LOG(INFO, L3FWD_GRAPH,
                "Entering main loop on lcore %u, graph %s(%p)\n", lcore_id,
                qconf->name, graph);

        /* Walk over graph until signal to quit */
        while (likely(!force_quit))
            rte_graph_walk(graph);
        return 0;
    }
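
The main loop is typically started on all worker lcores through the EAL launch
API. A minimal sketch, assuming the ``graph_main_loop()`` function shown above:

.. code-block:: c

    /* Launch graph_main_loop() on every worker lcore */
    rte_eal_mp_remote_launch(graph_main_loop, NULL, SKIP_MAIN);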
335