Revision tags: v24.11, v24.11-rc4, v24.11-rc3, v24.11-rc2
# b24bbaed | 25-Oct-2024 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: keep per-lcore state in lcore variable
Replace static array of cache-aligned structs with an lcore variable, to slightly benefit code simplicity and performance.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Revision tags: v24.11-rc1
# 34ec2384 | 09-Aug-2024 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: use bitset to represent service flags
Use a multi-word bitset to track which services are mapped to which lcores, allowing the RTE_SERVICE_NUM_MAX compile-time constant to be > 64.
Replace array-of-bytes service-currently-active flags with a more compact multi-word bitset-based representation, reducing memory footprint somewhat.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
# a37e053b | 09-Sep-2024 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: extend service function call statistics
Add two new per-service counters.
RTE_SERVICE_ATTR_IDLE_CALL_COUNT tracks the number of service function invocations where no work was performed.
RTE_SERVICE_ATTR_ERROR_CALL_COUNT tracks the number of invocations resulting in an error.
The semantics of RTE_SERVICE_ATTR_CALL_COUNT remain the same (i.e., counting all invocations, regardless of return value).
The new statistics may be useful for both debugging and profiling (e.g., calculating the average per-call processing latency for non-idle service calls).
Service core tests are extended to cover the new counters, and coverage for RTE_SERVICE_ATTR_CALL_COUNT is improved.
The documentation for the CYCLES attributes is updated to reflect their actual semantics.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v24.07, v24.07-rc4, v24.07-rc3, v24.07-rc2, v24.07-rc1, v24.03, v24.03-rc4, v24.03-rc3, v24.03-rc2
# c6552d9a | 04-Mar-2024 | Tyler Retzlaff <roretzla@linux.microsoft.com>
lib: move alignment attribute on types for MSVC
The current location used for __rte_aligned(a) for alignment of types is not compatible with MSVC. There is only a single location accepted by both toolchains.
The standard offers no alignment facility that interoperates compatibly between C and C++, but portable alignment can be achieved by relocating __rte_aligned(a) to the aforementioned location accepted by all currently supported toolchains.
To allow alignment for both compilers, do the following:
* Expand __rte_aligned(a) to __declspec(align(a)) when building with MSVC.
* Move __rte_aligned from the end of {struct,union} definitions to be between {struct,union} and tag.
The placement between {struct,union} and the tag allows the desired alignment to be imparted on the type regardless of the toolchain being used for all of GCC, LLVM, MSVC compilers building both C and C++.
Note: this move has an additional benefit as Doxygen is not confused anymore like for the rte_event_vector struct definition.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Revision tags: v24.03-rc1
# ae67895b | 08-Dec-2023 | David Marchand <david.marchand@redhat.com>
lib: add more logging helpers
Add helpers for logging messages in libraries instead of calling RTE_LOG() directly. Those helpers take care of adding a \n: this will make the transition to RTE_LOG_LINE trivial.
Note: for the acl and sched libraries, which still have some multi-line debug messages, a direct call to RTE_LOG is kept: this will make it easier to notice such special cases.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Revision tags: v23.11, v23.11-rc4, v23.11-rc3, v23.11-rc2
# 2a7a42a5 | 26-Oct-2023 | Tyler Retzlaff <roretzla@linux.microsoft.com>
eal: use stdatomic API
Replace the use of gcc builtin __atomic_xxx intrinsics with the corresponding rte_atomic_xxx optional stdatomic API.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Revision tags: v23.11-rc1
# 18898c4d | 16-Oct-2023 | Tyler Retzlaff <roretzla@linux.microsoft.com>
eal: use abstracted bit count functions
Use DPDK abstracted bitcount functions instead of gcc __builtin_'s
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
# 3d4e27fd | 25-Aug-2023 | David Marchand <david.marchand@redhat.com>
use abstracted bit count functions
Now that DPDK provides such bit count functions, make use of them.
This patch was prepared with a "brutal" commandline:
$ old=__builtin_clzll; new=rte_clz64; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
$ old=__builtin_clz; new=rte_clz32; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
$ old=__builtin_ctzll; new=rte_ctz64; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
$ old=__builtin_ctz; new=rte_ctz32; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
$ old=__builtin_popcountll; new=rte_popcount64; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
$ old=__builtin_popcount; new=rte_popcount32; git grep -lw $old :^lib/eal/include/rte_bitops.h | xargs sed -i -e "s#\<$old\>#$new#g"
Then inclusion of rte_bitops.h was added where necessary.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Revision tags: v23.07, v23.07-rc4, v23.07-rc3, v23.07-rc2, v23.07-rc1
# 841e87df | 12-May-2023 | Arnaud Fiorini <arnaud.fiorini@polymtl.ca>
eal: add tracepoints to track lcores and services
The tracepoints added are used to track lcore role and status, as well as service mapping and service runstates. These tracepoints are then used in analyses in Trace Compass.
Signed-off-by: Arnaud Fiorini <arnaud.fiorini@polymtl.ca>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v23.03, v23.03-rc4, v23.03-rc3, v23.03-rc2
# f9eb7a4b | 02-Mar-2023 | Tyler Retzlaff <roretzla@linux.microsoft.com>
use atomic intrinsics closer to C11
Use __atomic_fetch_{add,and,or,sub,xor} instead of __atomic_{add,and,or,sub,xor}_fetch when we have no interest in the result of the operation.
This change removes unnecessary code that produced the result of the atomic operation when that result was not used.
It also brings us into closer alignment with the atomics available in the C11 standard and will reduce review effort when they are integrated.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Revision tags: v23.03-rc1, v22.11, v22.11-rc4
# b41e574c | 18-Nov-2022 | David Marchand <david.marchand@redhat.com>
service: fix build with clang 15
This variable is not used.
Bugzilla ID: 1130
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Revision tags: v22.11-rc3, v22.11-rc2
# 329280c5 | 20-Oct-2022 | Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
service: fix early move to inactive status
Assume thread T2 is a service lcore that is in the middle of executing a service function. Also, assume thread T1 concurrently calls rte_service_lcore_stop(), which will set the "service_active_on_lcore" state to false. If thread T1 then calls rte_service_may_be_active(), it can return zero even though T2 is still running the service function. If T1 then proceeds to free data being used by T2, a crash can ensue.
Move the logic that clears the "service_active_on_lcore" state from the rte_service_lcore_stop() function to the service_runner_func() to ensure that we:
- don't let the "service_active_on_lcore" state linger as 1
- don't clear the state early
Fixes: 6550113be62d ("service: fix lingering active status")
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v22.11-rc1
# 809bd244 | 05-Oct-2022 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: tweak cycle statistics semantics
As a part of its service function, a service usually polls some kind of source (e.g., an RX queue, a ring, an eventdev port, or a timer wheel) to retrieve one or more items of work.
In low-load situations, the service framework reports a significant amount of cycles spent for all running services, despite the fact they have performed little or no actual work.
The per-call cycle expenditure for an idle service (i.e., a service currently without pending jobs) is typically very low. Polling an empty ring or RX queue is inexpensive. However, since the service function call frequency on an idle or lightly loaded lcore is going to be very high indeed, the service function calls' cycles add up to a significant amount. The only thing preventing the idle services' cycles counters from making up 100% of the available CPU cycles is the overhead of the service framework itself.
If the RTE_SERVICE_ATTR_CYCLES or RTE_SERVICE_LCORE_ATTR_CYCLES are used to estimate service core load, the cores may look very busy when the system is mostly doing nothing useful at all.
This patch allows for an idle service to indicate that no actual work was performed during a particular service function call (by returning -EAGAIN). In such cases the RTE_SERVICE_ATTR_CYCLES and RTE_SERVICE_LCORE_ATTR_CYCLES values are not incremented.
The convention of returning -EAGAIN for idle services may in the future also be used to have the lcore enter a short sleep, or reduce its operating frequency, in case all services are currently idle.
This change is backward-compatible.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
# 074b4db2 | 05-Oct-2022 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: reduce average case service core overhead
Optimize service loop so that the starting point is the lowest-indexed service mapped to the lcore in question, and terminate the loop at the highest-indexed service.
While the worst-case latency remains the same, this patch significantly reduces the service framework overhead for the average case. In particular, scenarios where an lcore only runs a single service, or multiple services whose id values are close (e.g., three services with ids 17, 18 and 22), show significant improvements.
The worst case is where the lcore has two services mapped to it: one with service id 0 and the other with id 63.
On a service lcore serving a single service, the service loop overhead is reduced from ~190 core clock cycles to ~46, on an Intel Cascade Lake generation Xeon. On weakly ordered CPUs, the gain is larger, since the loop included load-acquire atomic operations.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
# b54ade8f | 05-Oct-2022 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: introduce per-lcore cycles counter
Introduce a per-lcore counter for the total time spent on processing services on that core.
This counter is useful when measuring individual lcore load.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
# eb111cbd | 05-Oct-2022 | Mattias Rönnblom <mattias.ronnblom@ericsson.com>
service: reduce statistics overhead for parallel services
Move the statistics from the service data structure to the per-lcore struct. This eliminates contention for the counter cache lines, which decreases the producer-side statistics overhead for services deployed across many lcores.
Prior to this patch, enabling statistics for a service with a per-service function call latency of 1000 clock cycles deployed across 16 cores on an Intel Xeon 6230N @ 2.3 GHz would incur a cost of ~10000 core clock cycles per service call. After this patch, the statistics overhead is reduced to 22 clock cycles per call.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v22.07, v22.07-rc4
# 99e4e840 | 11-Jul-2022 | Harry van Haaren <harry.van.haaren@intel.com>
service: fix stats race condition for MT safe service
This commit fixes a potential racey-add that could occur if multiple service-lcores were executing the same MT-safe service at the same time, with service statistics collection enabled.
Because multiple threads can run and execute the service, the stats values can have multiple writer threads, resulting in the requirement of using atomic addition for correctness.
Note that when a MT unsafe service is executed, a spinlock is held, so the stats increments are protected. This fact is used to avoid executing atomic add instructions when not required. Regular reads and increments are used, and only the store is specified as atomic, reducing perf impact on e.g. x86 arch.
This patch causes a 1.25x increase in cycle-cost for polling a MT safe service when statistics are enabled. No change was seen for MT unsafe services, or when statistics are disabled.
Fixes: 21698354c832 ("service: introduce service cores concept")
Reported-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v22.07-rc3
# 6550113b | 05-Jul-2022 | Harry van Haaren <harry.van.haaren@intel.com>
service: fix lingering active status
This commit fixes an issue where calling rte_service_lcore_stop() would result in a service's "active on lcore" status becoming stale.
The stale status would result in rte_service_may_be_active() always returning "1", indicating that the service is not certainly stopped.
This is fixed by ensuring the "active on lcore" status of each service is set to 0 when an lcore is stopped.
Fixes: e30dd31847d2 ("service: add mechanism for quiescing")
Fixes: 8929de043eb4 ("service: retrieve lcore active state")
Reported-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Revision tags: v22.07-rc2, v22.07-rc1, v22.03, v22.03-rc4, v22.03-rc3, v22.03-rc2
# 30a1de10 | 15-Feb-2022 | Sean Morrissey <sean.morrissey@intel.com>
lib: remove unneeded header includes
These header includes have been flagged by the iwyu_tool and removed.
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Revision tags: v22.03-rc1, v21.11, v21.11-rc4, v21.11-rc3, v21.11-rc2, v21.11-rc1, v21.08, v21.08-rc4, v21.08-rc3, v21.08-rc2, v21.08-rc1, v21.05, v21.05-rc4, v21.05-rc3, v21.05-rc2, v21.05-rc1
# 99a2dd95 | 20-Apr-2021 | Bruce Richardson <bruce.richardson@intel.com>
lib: remove librte_ prefix from directory names
There is no reason for the DPDK libraries to all have the 'librte_' prefix on their directory names. This prefix makes the directory names longer and also makes it awkward to add features referring to individual libraries in the build (should the lib names be specified with or without the prefix?). Therefore, we can just remove the library prefix and use each library's unique name as the directory name, i.e. 'eal' rather than 'librte_eal'.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>