#
a0dede62 |
| 17-Nov-2019 |
Vamsi Attunuru <vattunuru@marvell.com> |
eal/linux: remove KNI restriction on IOVA
Now that KNI supports VA (with kernel versions starting 4.6.0), we can accept IOVA as VA, but KNI must be configured for this. Pass iova_mode when creating KNI netdevs.
So far, the IOVA detection policy forced IOVA as PA when KNI was loaded, whatever the buses' IOVA requirements were.
We can now use IOVA as VA, but this comes with a cost in KNI. When no constraint is expressed by the buses, keep the current behavior of choosing PA.
Note: this change assumes that DPDK is built against the same kernel as the target system's kernel; no objection has been expressed on this topic.
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>
|
#
f43d3dbb |
| 12-Nov-2019 |
David Marchand <david.marchand@redhat.com> |
doc/guides: clean repeated words
Shoot repeated words in all our guides.
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com>
|
#
7911ba04 |
| 18-Oct-2019 |
Phil Yang <phil.yang@arm.com> |
stack: enable lock-free implementation for aarch64
Enable both C11 atomic and non C11 atomic lock-free stack for aarch64.
Introduced a new header to reduce the ifdef clutter across generic and C11 files. The rte_stack_lf_stubs.h contains stub implementations of __rte_stack_lf_count, __rte_stack_lf_push_elems and __rte_stack_lf_pop_elems.
Suggested-by: Gage Eads <gage.eads@intel.com> Suggested-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
|
#
79a0bbe5 |
| 29-Jul-2019 |
Anatoly Burakov <anatoly.burakov@intel.com> |
eal: pick IOVA as PA if IOMMU is not available
When IOMMU is not available, /sys/kernel/iommu_groups will not be populated. This has been the case since at least kernel 3.6, when VFIO support was added. If the directory is empty, EAL should not pick IOVA as VA as the default IOVA mode.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
|
#
bbe29a9b |
| 22-Jul-2019 |
Jerin Jacob <jerinj@marvell.com> |
eal/linux: select IOVA as VA mode for default case
When the bus layer reports the preferred mode as RTE_IOVA_DC then select the RTE_IOVA_VA mode:
- All drivers work in RTE_IOVA_VA mode, irrespective of physical address availability.
- By default, a mempool asks for IOVA-contiguous memory using RTE_MEMZONE_IOVA_CONTIG. This is slow in RTE_IOVA_PA mode and it may affect the application boot time.
Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>
|
#
b76fafb1 |
| 22-Jul-2019 |
David Marchand <david.marchand@redhat.com> |
eal: fix IOVA mode selection as VA for PCI drivers
The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA, which was intended to mean "driver only supports VA" but had been understood as "driver supports both PA and VA" by most net drivers, and was used to let dpdk processes run as non-root (which do not have access to physical addresses on recent kernels).
The check on physical addresses actually closed the gap for those drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this flag can retain its intended meaning. Document explicitly its meaning.
We can check that a driver's requirement with regard to IOVA mode is fulfilled before trying to probe a device.
Finally, document the heuristic used to select the IOVA mode and hope that we won't break it again.
Fixes: 703458e19c16 ("bus/pci: consider only usable devices for IOVA mode")
Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
3855b415 |
| 03-May-2019 |
Anatoly Burakov <anatoly.burakov@intel.com> |
ipc: add warnings about not using IPC with memory API
IPC and memory-related API's should not be mixed because memory relies on IPC internally. Add explicit warnings to IPC API and to the documentation about this.
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
d629b7b5 |
| 26-Apr-2019 |
John McNamara <john.mcnamara@intel.com> |
doc: fix spelling reported by aspell in guides
Fix spelling errors in the guide docs.
Signed-off-by: John McNamara <john.mcnamara@intel.com> Acked-by: Rami Rosen <ramirose@gmail.com>
|
#
e75bc77f |
| 03-Apr-2019 |
Gage Eads <gage.eads@intel.com> |
mempool/stack: add lock-free stack mempool handler
This commit adds support for lock-free (linked list based) stack mempool handler.
In mempool_perf_autotest the lock-based stack outperforms the lock-free handler for certain lcore/alloc count/free count combinations*, however:
- For applications with preemptible pthreads, a standard (lock-based) stack's worst-case performance (i.e. one thread being preempted while holding the spinlock) is much worse than the lock-free stack's.
- Using per-thread mempool caches will largely mitigate the performance difference.
*Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4, running on isolcpus cores with a tickless scheduler. The lock-based stack's rate_persec was 0.6x-3.5x the lock-free stack's.
Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
|
#
1e3380a2 |
| 29-Mar-2019 |
Anatoly Burakov <anatoly.burakov@intel.com> |
mem: do not use lockfiles for single file segments mode
Due to internal glibc limitations [1], DPDK may exhaust internal file descriptor limits when using smaller page sizes, which results in inability to use system calls such as select() by user applications.
Single file segments option stores lock files per page to ensure that pages are deleted when there are no more users, however this is not necessary because the processes will be holding onto the pages anyway because of mmap(). Thus, removing pages from the filesystem is safe even though they may be used by some other secondary process. As a result, single file segments mode no longer stores inordinate amounts of segment fd's, and the above issue with fd limits is solved.
However, this will not work for legacy mem mode. For that, simply document that using bigger page sizes is the only option.
[1] https://mails.dpdk.org/archives/dev/2019-February/124386.html
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
c33a675b |
| 10-Mar-2019 |
Shahaf Shuler <shahafs@mellanox.com> |
bus: introduce device level DMA memory mapping
The DPDK APIs expose 3 different modes to work with memory used for DMA:
1. Use the DPDK owned memory (backed by the DPDK provided hugepages). This memory is allocated by the DPDK libraries, included in the DPDK memory system (memseg lists) and automatically DMA mapped by the DPDK layers.
2. Use memory allocated by the user and register to the DPDK memory systems. Upon registration of memory, the DPDK layers will DMA map it to all needed devices. After registration, allocation of this memory will be done with rte_*malloc APIs.
3. Use memory allocated by the user and not registered to the DPDK memory system. This is for users who want tight control over this memory (e.g. avoid the rte_malloc header). The user should create the memory, register it through the rte_extmem_register API, and call the DMA map function in order to register such memory to the different devices.
The scope of the patch focus on #3 above.
Currently the only way to map external memory is through VFIO (rte_vfio_dma_map). While VFIO is common, there are other vendors which use different ways to map memory (e.g. Mellanox and NXP).
The work in this patch moves the DMA mapping to vendor agnostic APIs. Device level DMA map and unmap APIs were added. Implementation of those APIs was done currently only for PCI devices.
For PCI bus devices, the pci driver can expose its own map and unmap functions to be used for the mapping. In case the driver doesn't provide any, the memory will be mapped, if possible, to IOMMU through VFIO APIs.
Application usage with those APIs is quite simple:
* allocate memory
* call rte_extmem_register on the memory chunk
* take a device, and query its rte_device
* call the device specific mapping function for this device
Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap APIs, leaving the rte device APIs as the preferred option for the user.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
|
#
218c4e68 |
| 06-Mar-2019 |
Bruce Richardson <bruce.richardson@intel.com> |
mk: use linux and freebsd in config names
Rather than using linuxapp and bsdapp everywhere, we can change things to use the more readable terms "linux" and "freebsd" in our build configs. Rather than renaming the configs, we can just duplicate the existing ones under the new names using symlinks, and use the new names exclusively internally. ["make showconfigs" also only shows the new names to keep the list short.] The result is that backward compatibility is kept fully, but any new builds or development can be done using the newer names, i.e. both "make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc" work.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
|
#
91d7846c |
| 06-Mar-2019 |
Bruce Richardson <bruce.richardson@intel.com> |
eal/linux: rename linuxapp to linux
The term "linuxapp" is a legacy one; simply calling the subdirectory "linux" is clearer for all concerned.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
|
#
25c99fbd |
| 06-Mar-2019 |
Bruce Richardson <bruce.richardson@intel.com> |
eal/bsd: rename bsdapp to freebsd
The term "bsdapp" is a legacy one; simply calling the subdirectory "freebsd" is clearer for all concerned.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
|
#
c3568ea3 |
| 19-Feb-2019 |
David Marchand <david.marchand@redhat.com> |
eal: restrict control threads to startup CPU affinity
Spawning the ctrl threads on anything that is not part of the eal coremask is not that polite to the rest of the system, especially when you took good care to pin your processes on cpu resources with tools like taskset (linux) / cpuset (freebsd).
Rather than introduce yet another EAL option to control which CPUs those ctrl threads are created on, let's take the startup CPU affinity as a reference and remove the EAL coremask from it. If no CPU is left, then we default to the master core.
The cpuset is computed once at init before the original cpu affinity is lost.
Introduced a RTE_CPU_AND macro to abstract the differences between linux and freebsd respective macros.
Examples in a 4 cores FreeBSD vm:
$ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
      -- -i --total-num-mbufs=2048
$ procstat -S 1057
  PID    TID COMM             TDNAME           CPU CSID CPU MASK
 1057 100131 testpmd          -                  2    1 2
 1057 100140 testpmd          eal-intr-thread    1    1 0-1
 1057 100141 testpmd          rte_mp_handle      1    1 0-1
 1057 100142 testpmd          lcore-slave-3      3    1 3
$ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
      -- -i --total-num-mbufs=2048
$ procstat -S 1061
  PID    TID COMM             TDNAME           CPU CSID CPU MASK
 1061 100131 testpmd          -                  2    2 2
 1061 100144 testpmd          eal-intr-thread    1    2 1
 1061 100145 testpmd          rte_mp_handle      1    2 1
 1061 100147 testpmd          lcore-slave-3      3    2 3
$ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
      -- -i --total-num-mbufs=2048
$ procstat -S 1065
  PID    TID COMM             TDNAME           CPU CSID CPU MASK
 1065 100131 testpmd          -                  2    2 2
 1065 100148 testpmd          eal-intr-thread    2    2 2
 1065 100149 testpmd          rte_mp_handle      2    2 2
 1065 100150 testpmd          lcore-slave-3      3    2 3
Fixes: d651ee4919cd ("eal: set affinity for control threads") Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
|
#
41328c40 |
| 06-Dec-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
doc: remove note on memory mode limitation in multi-process
Memory mode flags are now shared between primary and secondary processes, so the note in the documentation about this limitation is no longer necessary.
Fixes: 64cdfc35aaad ("mem: store memory mode flags in shared config") Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
bed79418 |
| 20-Dec-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
mem: allow usage of non-heap external memory in multiprocess
Add multiprocess support for externally allocated memory areas that are not added to DPDK heap (and add relevant doc sections).
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
950e8fb4 |
| 20-Dec-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
mem: allow registering external memory areas
The general use-case of using external memory is well covered by existing external memory API's. However, certain use cases require manual management of externally allocated memory areas, so this memory should not be added to the heap. It should, however, be added to DPDK's internal structures, so that API's like ``rte_virt2memseg`` would work on such external memory segments.
This commit adds such an API to DPDK. The new functions allow registering and unregistering externally allocated memory areas, and documentation for them is added as well.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
#
476c847a |
| 14-Dec-2018 |
Jim Harris <james.r.harris@intel.com> |
malloc: add option --match-allocations
SPDK uses the rte_mem_event_callback_register API to create RDMA memory regions (MRs) for newly allocated regions of memory. This is used in both the SPDK NVMe-oF target and the NVMe-oF host driver.
DPDK creates internal malloc_elem structures for these allocated regions. As users malloc and free memory, DPDK will sometimes merge malloc_elems that originated from different allocations that were notified through the registered mem_event callback routine. This results in subsequent allocations that can span across multiple RDMA MRs. This requires SPDK to check each DPDK buffer to see if it crosses an MR boundary, and if so, it would have to add considerable logic and complexity to describe that buffer before it can be accessed by the RNIC. It is somewhat analogous to rte_malloc returning a buffer that is not IOVA-contiguous.
As a malloc_elem gets split and some of these elements get freed, it can also result in DPDK sending an RTE_MEM_EVENT_FREE notification for a subset of the original RTE_MEM_EVENT_ALLOC notification. This is also problematic for RDMA memory regions, since unregistering the memory region is all-or-nothing. It is not possible to unregister part of a memory region.
To support these types of applications, this patch adds a new --match-allocations EAL init flag. When this flag is specified, malloc elements from different hugepage allocations will never be merged. Memory will also only be freed back to the system (with the requisite memory event callback) exactly as it was originally allocated.
Since part of this patch is extending the size of struct malloc_elem, we also fix up the malloc autotests so they do not assume its size exactly fits in one cacheline.
Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
e3e363a2 |
| 22-Nov-2018 |
Thomas Monjalon <thomas@monjalon.net> |
doc: remove PCI-specific details from EAL guide
The PCI bus is an independent driver and not part of EAL as it was in the early days. EAL must be understood as a generic layer.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: John McNamara <john.mcnamara@intel.com>
|
#
075b182b |
| 03-Oct-2018 |
Eric Zhang <eric.zhang@windriver.com> |
eal: force IOVA to a particular mode
This patch uses EAL option "--iova-mode" to force the IOVA mode to a particular value. There exist virtual devices that are not directly attached to the PCI bus, and therefore the auto detection of the IOVA mode based on probing the PCI bus and IOMMU configuration may not report the required addressing mode. Using the EAL option permits the mode to be explicitly configured in this scenario.
Signed-off-by: Eric Zhang <eric.zhang@windriver.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Marko Kovacevic <marko.kovacevic@intel.com>
|
#
66498f0f |
| 02-Oct-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
doc: add external memory feature
Document the addition of external memory support to DPDK.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
4a6e683c |
| 17-Jul-2018 |
Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> |
ring: clarify preemptible nature of algorithm
rte_ring implementation is not preemptible only under certain circumstances. This clarification is helpful for data plane and control plane communication using rte_ring.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
|
#
e4348122 |
| 31-May-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
eal: add option to limit memory allocation on sockets
Previously, it was possible to limit maximum amount of memory allowed for allocation by creating validator callbacks. Although a powerful tool, it's a bit of a hassle and requires modifying the application for it to work with DPDK example applications.
Fix this by adding a new parameter "--socket-limit", with syntax similar to "--socket-mem", which would set per-socket memory allocation limits, and set up a default validator callback to deny all allocations above the limit.
This option is incompatible with legacy mode, as validator callbacks are not supported there.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|
#
b3173932 |
| 25-May-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
doc: update guides for memory subsystem
Document new command-line switches and the principles behind the new memory subsystem. Also, replace outdated malloc heap picture.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
|