..  BSD LICENSE
    Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Keep Alive Sample Application
=============================

The Keep Alive application is a simple example of a
heartbeat/watchdog for packet processing cores. It demonstrates how
to detect 'failed' DPDK cores and notify a fault management entity
of this failure. Its purpose is to ensure that the failure of a core
does not result in a fault that a management entity cannot detect.


Overview
--------

The application demonstrates how to protect against 'silent outages'
on packet processing cores. A Keep Alive Monitor Agent Core (master)
monitors the state of packet processing cores (worker cores) by
dispatching pings at a regular time interval (default is 5 ms) and
monitoring the state of the cores. Core states are: Alive, MIA, Dead
or Buried. MIA indicates a missed ping, and Dead indicates two missed
pings within the specified time interval. When a core is Dead, a
callback function is invoked to restart the packet processing core.
A real-life application might use this callback function to notify a
higher-level fault management entity of the core failure so that it
can take the appropriate corrective action.

Note: Only the worker cores are monitored. A local (on the host) mechanism
or agent that supervises the Keep Alive Monitor Agent Core is required
to detect its failure.

Note: This application is based on the :doc:`l2_forward_real_virtual`. As
such, the initialization and run-time paths are very similar to those
of the L2 forwarding application.

Compiling the Application
-------------------------

To compile the application:

#. Go to the sample application directory:

   .. code-block:: console

       export RTE_SDK=/path/to/rte_sdk
       cd ${RTE_SDK}/examples/keep_alive

#. Set the target (a default target is used if not specified). For example:

   .. code-block:: console

       export RTE_TARGET=x86_64-native-linuxapp-gcc

   See the *DPDK Getting Started Guide* for possible RTE_TARGET values.

#. Build the application:

   .. code-block:: console

       make

Running the Application
-----------------------

The application has a number of command line options:

.. code-block:: console

    ./build/l2fwd-keepalive [EAL options] \
            -- -p PORTMASK [-q NQ] [-K PERIOD] [-T PERIOD]

where,

* ``-p PORTMASK``: A hexadecimal bitmask of the ports to configure

* ``-q NQ``: The number of queues (=ports) per lcore (default is 1)

* ``-K PERIOD``: Heartbeat check period in ms (5 ms default; 86400 max)

* ``-T PERIOD``: Statistics are refreshed every PERIOD seconds (0 to
  disable, 10 default, 86400 maximum).

To run the application in a linuxapp environment with 4 lcores, 16 ports,
8 RX queues per lcore and a ping interval of 10 ms, issue the command:

.. code-block:: console

    ./build/l2fwd-keepalive -l 0-3 -n 4 -- -q 8 -p ffff -K 10

Refer to the *DPDK Getting Started Guide* for general information on
running applications and the Environment Abstraction Layer (EAL)
options.


Explanation
-----------

The following sections provide some explanation of the
Keep-Alive/'Liveliness' conceptual scheme. As mentioned in the
overview section, the initialization and run-time paths are very
similar to those of the :doc:`l2_forward_real_virtual`.

The Keep-Alive/'Liveliness' conceptual scheme:

* A keep-alive agent runs every N milliseconds.

* DPDK cores respond to the keep-alive agent.

* If the keep-alive agent detects time-outs, it notifies the
  fault management entity through a callback function.

The following sections provide some explanation of the code aspects
that are specific to the Keep Alive sample application.

The keepalive functionality is initialized with a ``struct rte_keepalive``
and the callback function to invoke in the case of a timeout.

.. code-block:: c

    rte_global_keepalive_info = rte_keepalive_create(&dead_core, NULL);
    if (rte_global_keepalive_info == NULL)
        rte_exit(EXIT_FAILURE, "keepalive_create() failed");

The function that issues the pings, ``rte_keepalive_dispatch_pings()``,
is configured to run every ``check_period`` milliseconds.

.. code-block:: c

    if (rte_timer_reset(&hb_timer,
            (check_period * rte_get_timer_hz()) / 1000,
            PERIODICAL,
            rte_lcore_id(),
            &rte_keepalive_dispatch_pings,
            rte_global_keepalive_info
            ) != 0 )
        rte_exit(EXIT_FAILURE, "Keepalive setup failure.\n");

The rest of the initialization and run-time path follows
the same paths as the L2 forwarding application. The only
additions to the main processing loop are the mark alive
functionality and the example random failures.

.. code-block:: c

    /* rte_global_keepalive_info is already a pointer, so pass it as is. */
    rte_keepalive_mark_alive(rte_global_keepalive_info);
    cur_tsc = rte_rdtsc();

    /* Die randomly within 7 secs for demo purposes. */
    if (cur_tsc - tsc_initial > tsc_lifetime)
        break;

The ``rte_keepalive_mark_alive()`` function simply sets the core state to alive.

.. code-block:: c

    static inline void
    rte_keepalive_mark_alive(struct rte_keepalive *keepcfg)
    {
        keepcfg->state_flags[rte_lcore_id()] = ALIVE;
    }
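
To make the core state progression from the overview more concrete, the
following is a minimal, self-contained sketch of how a monitor might move a
core through the Alive, MIA, Dead and Buried states. It is not the DPDK
implementation: every ``sketch_`` name is hypothetical, the separate heartbeat
flag is an assumption made for clarity, and Buried is assumed to mean that the
failure has already been reported. The sketch is single-threaded, whereas a real
monitor and its workers run on different cores and must share the flags safely.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    #define SKETCH_MAX_CORES 4 /* hypothetical limit, for illustration only */

    /* Core states, named after the ones listed in the overview. */
    enum sketch_state {
        SKETCH_ALIVE,
        SKETCH_MIA,    /* one missed ping */
        SKETCH_DEAD,   /* two missed pings; the failure callback fires */
        SKETCH_BURIED  /* assumption: the failure was already reported */
    };

    typedef void (*sketch_failure_cb)(void *data, int core_id);

    struct sketch_keepalive {
        enum sketch_state state[SKETCH_MAX_CORES];
        int heartbeat[SKETCH_MAX_CORES]; /* set by worker, cleared by monitor */
        sketch_failure_cb callback;
        void *callback_data;
    };

    static void
    sketch_init(struct sketch_keepalive *k, sketch_failure_cb cb, void *data)
    {
        for (int i = 0; i < SKETCH_MAX_CORES; i++) {
            k->state[i] = SKETCH_ALIVE;
            k->heartbeat[i] = 0;
        }
        k->callback = cb;
        k->callback_data = data;
    }

    /* Worker side: answer the ping from the main processing loop. */
    static void
    sketch_mark_alive(struct sketch_keepalive *k, int core_id)
    {
        assert(core_id >= 0 && core_id < SKETCH_MAX_CORES);
        k->heartbeat[core_id] = 1;
    }

    /* Monitor side: one periodic check over all cores. */
    static void
    sketch_dispatch_pings(struct sketch_keepalive *k)
    {
        for (int i = 0; i < SKETCH_MAX_CORES; i++) {
            if (k->heartbeat[i]) {
                /* The core answered since the last check. */
                k->heartbeat[i] = 0;
                k->state[i] = SKETCH_ALIVE;
                continue;
            }
            switch (k->state[i]) {
            case SKETCH_ALIVE:
                k->state[i] = SKETCH_MIA;
                break;
            case SKETCH_MIA:
                k->state[i] = SKETCH_DEAD;
                if (k->callback != NULL)
                    k->callback(k->callback_data, i);
                break;
            case SKETCH_DEAD:
                /* Already reported once; do not report again. */
                k->state[i] = SKETCH_BURIED;
                break;
            case SKETCH_BURIED:
                break;
            }
        }
    }

    /* Example callback: count failures in the int pointed to by data. */
    static void
    sketch_count_failure(void *data, int core_id)
    {
        (void)core_id;
        (*(int *)data)++;
    }

In this sketch a worker calls ``sketch_mark_alive()`` from its processing loop
and the monitor calls ``sketch_dispatch_pings()`` once per check period. A core
that stops responding is reported through the callback on its second missed
check and is then moved to Buried so that it is reported only once.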