..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2015-2016 Intel Corporation.

Keep Alive Sample Application
=============================

The Keep Alive application is a simple example of a
heartbeat/watchdog for packet processing cores. It demonstrates how
to detect 'failed' DPDK cores and notify a fault management entity
of the failure. Its purpose is to ensure that a core failure
never goes undetected by a management entity.


Overview
--------

The application demonstrates how to protect against 'silent outages'
on packet processing cores. A Keep Alive Monitor Agent Core (main)
monitors the state of packet processing cores (worker cores) by
dispatching pings at a regular time interval (default is 5 ms) and
checking how each core responds. Core states are: Alive, MIA, Dead
or Buried. MIA indicates a missed ping, and Dead indicates two missed
pings within the specified time interval. When a core is Dead, a
callback function is invoked to restart the packet processing core.
A real-life application might use this callback function to notify a
higher-level fault management entity of the core failure so that it
can take the appropriate corrective action.

Note: Only the worker cores are monitored. A local (on the host) mechanism
or agent that supervises the Keep Alive Monitor Agent Core is required
to detect a failure of that core.

Note: This application is based on the :doc:`l2_forward_real_virtual`. As
such, the initialization and run-time paths are very similar to those
of the L2 forwarding application.

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``l2fwd_keep_alive`` sub-directory.

Running the Application
-----------------------

The application has a number of command line options:

.. code-block:: console

    ./<build_dir>/examples/dpdk-l2fwd-keepalive [EAL options] \
            -- -p PORTMASK [-q NQ] [-K PERIOD] [-T PERIOD]

where,

* ``-p PORTMASK``: A hexadecimal bitmask of the ports to configure

* ``-q NQ``: Number of queues (=ports) per lcore (default is 1)

* ``-K PERIOD``: Heartbeat check period in ms (5 ms default; 86400 max)

* ``-T PERIOD``: Statistics are refreshed every PERIOD seconds (0 to
  disable, 10 default, 86400 maximum).

To run the application in a Linux environment with 4 lcores, 16 ports,
8 RX queues per lcore and a ping interval of 10 ms, issue the command:

.. code-block:: console

    ./<build_dir>/examples/dpdk-l2fwd-keepalive -l 0-3 -n 4 -- -q 8 -p ffff -K 10

Refer to the *DPDK Getting Started Guide* for general information on
running applications and the Environment Abstraction Layer (EAL)
options.


Explanation
-----------

The following sections provide some explanation of the
Keep-Alive/'Liveliness' conceptual scheme. As mentioned in the
overview section, the initialization and run-time paths are very
similar to those of the :doc:`l2_forward_real_virtual`.

The Keep-Alive/'Liveliness' conceptual scheme:

* A keep-alive agent runs every N milliseconds.

* DPDK cores respond to the keep-alive agent.

* If the keep-alive agent detects time-outs, it notifies the
  fault management entity through a callback function.

The following sections provide some explanation of the code aspects
that are specific to the Keep Alive sample application.

The keepalive functionality is initialized with a ``struct
rte_keepalive`` and the callback function to invoke in the
case of a timeout.

.. code-block:: c

    rte_global_keepalive_info = rte_keepalive_create(&dead_core, NULL);
    if (rte_global_keepalive_info == NULL)
        rte_exit(EXIT_FAILURE, "keepalive_create() failed");

The function that issues the pings, ``rte_keepalive_dispatch_pings()``,
is configured to run every ``check_period`` milliseconds.

.. code-block:: c

    if (rte_timer_reset(&hb_timer,
            (check_period * rte_get_timer_hz()) / 1000,
            PERIODICAL,
            rte_lcore_id(),
            &rte_keepalive_dispatch_pings,
            rte_global_keepalive_info
            ) != 0 )
        rte_exit(EXIT_FAILURE, "Keepalive setup failure.\n");

The rest of the initialization and run-time path follows
the same paths as the L2 forwarding application. The only
additions to the main processing loop are the mark alive
call and the example random failures.

.. code-block:: c

    /* rte_global_keepalive_info is already a pointer; pass it as-is. */
    rte_keepalive_mark_alive(rte_global_keepalive_info);
    cur_tsc = rte_rdtsc();

    /* Die randomly within 7 secs for demo purposes.. */
    if (cur_tsc - tsc_initial > tsc_lifetime)
        break;

The ``rte_keepalive_mark_alive()`` function simply sets the core state to alive.

.. code-block:: c

    static inline void
    rte_keepalive_mark_alive(struct rte_keepalive *keepcfg)
    {
        keepcfg->live_data[rte_lcore_id()].core_state = RTE_KA_STATE_ALIVE;
    }
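
The worker-side call above only records that a core is alive; the monitor
side decides what silence means. The following is a minimal, DPDK-free
sketch of the Alive/MIA/Dead/Buried progression described in the overview.
It is not the library's implementation: every name in it (``ka_monitor``,
``ka_register``, ``ka_mark_alive``, ``ka_dispatch_pings``, ``ka_record_dead``,
``KA_MAX_CORES``) is hypothetical, and it assumes a separate ``responded``
flag per core rather than writing the state directly as
``rte_keepalive_mark_alive()`` does.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    #define KA_MAX_CORES 8 /* hypothetical limit for this sketch */

    enum ka_state { KA_UNUSED, KA_ALIVE, KA_MIA, KA_DEAD, KA_BURIED };

    typedef void (*ka_dead_cb)(void *ctx, int core);

    struct ka_core {
        enum ka_state state;
        int responded; /* set by the worker, cleared by the monitor */
    };

    struct ka_monitor {
        struct ka_core core[KA_MAX_CORES];
        ka_dead_cb on_dead; /* invoked when a core becomes Dead */
        void *ctx;
    };

    static void ka_init(struct ka_monitor *m, ka_dead_cb cb, void *ctx)
    {
        for (int i = 0; i < KA_MAX_CORES; i++) {
            m->core[i].state = KA_UNUSED;
            m->core[i].responded = 0;
        }
        m->on_dead = cb;
        m->ctx = ctx;
    }

    /* Start monitoring a core; it is assumed Alive until it misses a ping. */
    static void ka_register(struct ka_monitor *m, int core)
    {
        assert(core >= 0 && core < KA_MAX_CORES);
        m->core[core].state = KA_ALIVE;
        m->core[core].responded = 0;
    }

    /* Worker side: called once per iteration of the packet loop. */
    static void ka_mark_alive(struct ka_monitor *m, int core)
    {
        assert(core >= 0 && core < KA_MAX_CORES);
        m->core[core].responded = 1;
    }

    /* Monitor side: one ping round, run from a periodic timer. */
    static void ka_dispatch_pings(struct ka_monitor *m)
    {
        for (int i = 0; i < KA_MAX_CORES; i++) {
            struct ka_core *c = &m->core[i];

            if (c->state == KA_UNUSED || c->state == KA_BURIED)
                continue;
            if (c->responded) { /* answered since the last round */
                c->state = KA_ALIVE;
                c->responded = 0;
                continue;
            }
            switch (c->state) {
            case KA_ALIVE: /* first missed ping */
                c->state = KA_MIA;
                break;
            case KA_MIA: /* second missed ping: report the failure */
                c->state = KA_DEAD;
                if (m->on_dead != NULL)
                    m->on_dead(m->ctx, i);
                break;
            case KA_DEAD: /* already reported; stop tracking */
                c->state = KA_BURIED;
                break;
            default:
                break;
            }
        }
    }

    /* Example callback: store the failed core id in an int supplied as ctx. */
    static void ka_record_dead(void *ctx, int core)
    {
        *(int *)ctx = core;
    }

In use, each worker would call ``ka_mark_alive()`` once per loop iteration
and a periodic timer would call ``ka_dispatch_pings()``. A core that stays
silent for two consecutive rounds triggers the callback once, moves to
Buried on the following round, and is ignored from then on.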