..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED.
    IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

.. _Environment_Abstraction_Layer:

Environment Abstraction Layer
=============================

The Environment Abstraction Layer (EAL) is responsible for gaining access to low-level resources such as hardware and memory space.
It provides a generic interface that hides the environment specifics from the applications and libraries.
It is the responsibility of the initialization routine to decide how to allocate these resources
(that is, memory space, PCI devices, timers, consoles, and so on).

Typical services expected from the EAL are:

*   Intel® DPDK Loading and Launching:
    The Intel® DPDK and its application are linked as a single application and must be loaded by some means.

*   Core Affinity/Assignment Procedures:
    The EAL provides mechanisms for assigning execution units to specific cores as well as creating execution instances.

*   System Memory Reservation:
    The EAL facilitates the reservation of different memory zones, for example, physical memory areas for device interactions.

*   PCI Address Abstraction: The EAL provides an interface to access PCI address space.

*   Trace and Debug Functions: Logs, dump_stack, panic and so on.

*   Utility Functions: Spinlocks and atomic counters that are not provided in libc.

*   CPU Feature Identification: Determine at runtime if a particular feature, for example, Intel® AVX is supported.
    Determine if the current CPU supports the feature set that the binary was compiled for.

*   Interrupt Handling: Interfaces to register/unregister callbacks to specific interrupt sources.

*   Alarm Functions: Interfaces to set/remove callbacks to be run at a specific time.

EAL in a Linux-userland Execution Environment
---------------------------------------------

In a Linux user space environment, the Intel® DPDK application runs as a user-space application using the pthread library.
PCI information about devices and address space is discovered through the /sys kernel interface and through a module called igb_uio.
Refer to the UIO: User-space drivers documentation in the Linux kernel. This memory is mmap'd in the application.

The EAL performs physical memory allocation using mmap() in hugetlbfs (using huge page sizes to increase performance).
This memory is exposed to Intel® DPDK service layers such as the :ref:`Mempool Library <Mempool_Library>`.

At this point, the Intel® DPDK services layer is initialized; then, through pthread setaffinity calls,
each execution unit is assigned to a specific logical core to run as a user-level thread.

The time reference is provided by the CPU Time-Stamp Counter (TSC) or by the HPET kernel API through an mmap() call.

Initialization and Core Launching
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Part of the initialization is done by the start function of glibc.
A check is also performed at initialization time to ensure that the micro-architecture type chosen in the config file is supported by the CPU.
Then, the main() function is called. The core initialization and launch is done in rte_eal_init() (see the API documentation).
It consists of calls to the pthread library (more specifically, pthread_self(), pthread_create(), and pthread_setaffinity_np()).

.. _pg_figure_2:

**Figure 2.
EAL Initialization in a Linux Application Environment**

.. image3_png has been replaced

|linuxapp_launch|

.. note::

    Initialization of objects, such as memory zones, rings, memory pools, lpm tables and hash tables,
    should be done as part of the overall application initialization on the master lcore.
    The creation and initialization functions for these objects are not multi-thread safe.
    However, once initialized, the objects themselves can safely be used in multiple threads simultaneously.

Multi-process Support
~~~~~~~~~~~~~~~~~~~~~

The Linuxapp EAL allows a multi-process as well as a multi-threaded (pthread) deployment model.
See the :ref:`Multi-process Support <Multi-process_Support>` chapter for more details.

Memory Mapping Discovery and Memory Reservation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The allocation of large contiguous physical memory is done using the hugetlbfs kernel filesystem.
The EAL provides an API to reserve named memory zones in this contiguous memory.
The physical address of the reserved memory for that memory zone is also returned to the user by the memory zone reservation API.

.. note::

    Memory reservations done using the APIs provided by the rte_malloc library are also backed by pages from the hugetlbfs filesystem.
    However, physical address information is not available for the blocks of memory allocated in this way.

Xen Dom0 support without hugetlbfs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The existing memory management implementation is based on the Linux kernel hugepage mechanism.
However, Xen Dom0 does not support hugepages, so a new Linux kernel module, rte_dom0_mm, is added to work around this limitation.

The EAL uses an ioctl interface to ask the rte_dom0_mm kernel module to allocate memory of a specified size
and to retrieve information on all memory segments from the module;
it then uses an mmap interface to map the allocated memory.
For each memory segment, the physical addresses are contiguous within it, but actual hardware addresses are contiguous within 2 MB.

PCI Access
~~~~~~~~~~

The EAL uses the /sys/bus/pci utilities provided by the kernel to scan the content on the PCI bus.

To access PCI memory, a kernel module called igb_uio provides a /dev/uioX device file
that can be mmap'd to obtain access to PCI address space from the application.
It uses the uio kernel feature (userland driver).

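
As an illustration of the sysfs scan described above, the following minimal sketch lists the entries of a sysfs PCI device directory
and parses each entry name as Domain:Bus:Device.Function.
It uses only standard C and POSIX calls; struct pci_addr, parse_pci_addr() and scan_pci_bus() are hypothetical names
introduced for this example and are not part of the EAL API.

.. code-block:: c

    #include <dirent.h>
    #include <stdio.h>

    /* Hypothetical holder for a PCI address (not an EAL type). */
    struct pci_addr {
        unsigned int domain;
        unsigned int bus;
        unsigned int devid;
        unsigned int function;
    };

    /* Parse "DDDD:BB:DD.F" (hexadecimal fields); return 0 on success, -1 on error. */
    static int
    parse_pci_addr(const char *name, struct pci_addr *addr)
    {
        char trailing;

        /* The extra %c detects unexpected characters after the function number. */
        if (sscanf(name, "%x:%x:%x.%x%c", &addr->domain, &addr->bus,
                   &addr->devid, &addr->function, &trailing) != 4)
            return -1;
        return 0;
    }

    /* Print every PCI device listed under path; return the count, or -1 if path cannot be opened. */
    static int
    scan_pci_bus(const char *path)
    {
        DIR *dir = opendir(path);
        struct dirent *entry;
        struct pci_addr addr;
        int count = 0;

        if (dir == NULL)
            return -1;

        while ((entry = readdir(dir)) != NULL) {
            /* Entries that are not PCI addresses, such as "." and "..", are skipped. */
            if (parse_pci_addr(entry->d_name, &addr) != 0)
                continue;
            printf("%04x:%02x:%02x.%x\n", addr.domain, addr.bus,
                   addr.devid, addr.function);
            count++;
        }
        closedir(dir);
        return count;
    }

On a Linux host, calling scan_pci_bus("/sys/bus/pci/devices") prints one line per device found by the kernel.
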

Per-lcore and Shared Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::

    lcore refers to a logical execution unit of the processor, sometimes called a hardware *thread*.

Shared variables are the default behavior.
Per-lcore variables are implemented using *Thread Local Storage* (TLS) to provide per-thread local storage.

Logs
~~~~

A logging API is provided by EAL.
By default, in a Linux application, logs are sent to syslog and also to the console.
However, the log function can be overridden by the user to use a different logging mechanism.

Trace and Debug Functions
^^^^^^^^^^^^^^^^^^^^^^^^^

There are some debug functions to dump the stack in glibc.
The rte_panic() function can voluntarily provoke a SIGABRT,
which can trigger the generation of a core file, readable by gdb.

CPU Feature Identification
~~~~~~~~~~~~~~~~~~~~~~~~~~

The EAL can query the CPU at runtime (using the rte_cpu_get_feature() function) to determine which CPU features are available.

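
The following sketch shows one way an application could perform a comparable runtime check on its own.
It does not call rte_cpu_get_feature(); instead it relies on the GCC/Clang builtin __builtin_cpu_supports(),
which is an assumption about the toolchain and is only available on x86 targets.
The helper names cpu_has_avx() and print_avx_support() are hypothetical.

.. code-block:: c

    #include <stdio.h>

    /*
     * Return 1 if the CPU reports AVX support, 0 if it does not,
     * and -1 when the check is unavailable on this compiler or architecture.
     */
    static int
    cpu_has_avx(void)
    {
    #if (defined(__GNUC__) || defined(__clang__)) && \
        (defined(__x86_64__) || defined(__i386__))
        /* Initialize the compiler's CPU model data before querying it. */
        __builtin_cpu_init();
        return __builtin_cpu_supports("avx") ? 1 : 0;
    #else
        return -1;
    #endif
    }

    /* Report the result of the check in a human-readable form. */
    static void
    print_avx_support(void)
    {
        int ret = cpu_has_avx();

        if (ret < 0)
            printf("AVX check not available on this platform\n");
        else
            printf("AVX is %s\n", ret ? "supported" : "not supported");
    }

A binary compiled for a given feature set could run such a check early in main() and exit cleanly when a required feature is missing.
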

User Space Interrupt and Alarm Handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The EAL creates a host thread to poll the UIO device file descriptors to detect the interrupts.
Callbacks can be registered or unregistered by the EAL functions for a specific interrupt event
and are called in the host thread asynchronously.
The EAL also allows timed callbacks to be used in the same way as for NIC interrupts.

.. note::

    The only interrupts supported by the Intel® DPDK Poll-Mode Drivers are those for link status change,
    i.e. link up and link down notification.

Blacklisting
~~~~~~~~~~~~

The EAL PCI device blacklist functionality can be used to mark certain NIC ports as blacklisted,
so they are ignored by the Intel® DPDK.
The ports to be blacklisted are identified using the PCIe* description (Domain:Bus:Device.Function).

Misc Functions
~~~~~~~~~~~~~~

Locks and atomic operations are per-architecture (i686 and x86_64).

Memory Segments and Memory Zones (memzone)
------------------------------------------

The mapping of physical memory is provided by this feature in the EAL.
As physical memory can have gaps, the memory is described in a table of descriptors,
and each descriptor (called rte_memseg) describes a contiguous portion of memory.

On top of this, the memzone allocator's role is to reserve contiguous portions of physical memory.
These zones are identified by a unique name when the memory is reserved.

The rte_memzone descriptors are also located in the configuration structure.
This structure is accessed using rte_eal_get_configuration().
The lookup (by name) of a memory zone returns a descriptor containing the physical address of the memory zone.

Memory zones can be reserved with specific start address alignment by supplying the align parameter
(by default, they are aligned to cache line size).
The alignment value should be a power of two and not less than the cache line size (64 bytes).
Memory zones can also be reserved from either 2 MB or 1 GB hugepages, provided that both are available on the system.

.. |linuxapp_launch| image:: img/linuxapp_launch.svg

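
The alignment rule stated above (a power of two, not less than the cache line size) can be checked with a few lines of C.
This is a sketch only: CACHE_LINE_SIZE, memzone_align_is_valid() and align_ceil() are hypothetical names used for this example,
not EAL symbols, and the 64-byte value is the one quoted in the text.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Cache line size assumed by this example, in bytes. */
    #define CACHE_LINE_SIZE 64

    /* Return 1 if align is a power of two and at least one cache line, 0 otherwise. */
    static int
    memzone_align_is_valid(size_t align)
    {
        /* A power of two has exactly one bit set, so align & (align - 1) is zero. */
        if (align == 0 || (align & (align - 1)) != 0)
            return 0;
        return align >= CACHE_LINE_SIZE;
    }

    /* Round addr up to the next multiple of align; align must be a power of two. */
    static uintptr_t
    align_ceil(uintptr_t addr, size_t align)
    {
        return (addr + align - 1) & ~((uintptr_t)align - 1);
    }

For instance, memzone_align_is_valid(128) accepts a 128-byte alignment, while 32 (too small) and 96 (not a power of two) are rejected;
align_ceil(100, 64) yields 128, the first 64-byte-aligned address at or after 100.
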