| 950e8fb4 | 20-Dec-2018 |
Anatoly Burakov <anatoly.burakov@intel.com> |
mem: allow registering external memory areas
The general use case of external memory is well covered by the existing external memory APIs. However, certain use cases require manual management of externally allocated memory areas, so this memory should not be added to the heap. It should, however, be added to DPDK's internal structures, so that APIs like ``rte_virt2memseg`` work on such external memory segments.
This commit adds such an API to DPDK. The new functions allow registering and unregistering externally allocated memory areas, and documentation for them is added as well.
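A minimal usage sketch; the helper below is hypothetical, and the ``rte_extmem_register``/``rte_extmem_unregister`` signatures are assumed to match what later appears in ``rte_memory.h``:

    #include <rte_memory.h>

    /* Register an externally allocated area with DPDK's internal structures
     * (no heap involvement), so that lookups such as rte_virt2memseg()
     * work on it, then unregister it when done. */
    static int
    track_external_area(void *base, size_t len, size_t page_sz)
    {
        /* NULL IOVA table: the area's IOVA addresses are unknown to DPDK. */
        if (rte_extmem_register(base, len, NULL, 0, page_sz) != 0)
            return -1;

        /* ... the area can now be looked up with rte_virt2memseg() ... */

        return rte_extmem_unregister(base, len);
    }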
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
|
| e3e363a2 | 22-Nov-2018 |
Thomas Monjalon <thomas@monjalon.net> |
doc: remove PCI-specific details from EAL guide
The PCI bus is an independent driver and not part of EAL as it was in the early days. EAL must be understood as a generic layer.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: John McNamara <john.mcnamara@intel.com>
|
| 7307cf63 | 22-Oct-2018 |
Ori Kam <orika@mellanox.com> |
ethdev: add raw encapsulation action
Currently the encap/decap actions only support encapsulation of VXLAN and NVGRE L2 packets (L2 encapsulation is where the inner packet has a valid Ethernet header, while L3 encapsulation is where the inner packet does not have an Ethernet header). In addition, the parameter to the encap action is a list of rte_flow items, which results in two extra translations: from the application to the action, and from the action to the NIC. This has a negative impact on insertion performance.
Looking forward, there is going to be a need to support many more tunnel encapsulations, for example MPLSoGRE and MPLSoUDP. Adding each new encapsulation would result in duplicated code; for example, the code for handling NVGRE and VXLAN is exactly the same, and each new tunnel would have the same exact structure.
This patch introduces a raw encapsulation that can support both L2 and L3 tunnel types. In addition, the new encapsulation commands use a raw buffer in order to save the conversion time, both for the application and the PMD.
In order to encapsulate an L3 tunnel type, both actions are needed in the same rule: the decap to remove the L2 header of the original packet, and then the encap command to encapsulate the packet with the tunnel. For L3 decap there is also a need to use both commands in the same flow: first the decap command to remove the outer tunnel header, and then encap to add the L2 header.
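A hedged sketch of how an application might build the two actions for an L3 tunnel rule, using the rte_flow_action_raw_encap/rte_flow_action_raw_decap structures this patch introduces (the header buffer contents and sizes below are placeholders):

    #include <rte_flow.h>

    /* Application-built raw tunnel header (placeholder contents). */
    static uint8_t tunnel_hdr[64];

    static struct rte_flow_action_raw_decap decap = {
        .data = NULL,
        .size = 14,               /* strip the inner Ethernet (L2) header */
    };

    static struct rte_flow_action_raw_encap encap = {
        .data = tunnel_hdr,       /* raw bytes, no rte_flow item list needed */
        .preserve = NULL,
        .size = sizeof(tunnel_hdr),
    };

    /* L3 encapsulation rule: both actions in the same rule, as described above. */
    static const struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_RAW_DECAP, .conf = &decap },
        { .type = RTE_FLOW_ACTION_TYPE_RAW_ENCAP, .conf = &encap },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };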
Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
|
| e605a1d3 | 26-Oct-2018 |
Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> |
hash: add lock-free r/w concurrency
Add lock-free read-write concurrency. This is achieved by the following changes.
1) Add memory ordering to avoid race conditions. The only race condition that can occur is using the key store element before the key write is completed. Hence, while inserting the element, the release memory order is used. Any other race condition is caught by the key comparison. Memory orderings are added only where needed; for example, reads in the writer's context do not need memory ordering as there is a single writer.
key_idx in the bucket entry and pdata in the key store element are used for synchronisation. key_idx is used to release an inserted entry in the bucket to the reader (see the sketch after this list). Use of pdata for synchronisation is required due to the update of an existing entry, wherein only the pdata is updated without updating key_idx.
2) The reader-writer concurrency issue caused by moving keys to their alternative locations during key insert is solved by introducing a global counter (tbl_chng_cnt) indicating a change in the table.
3) Add the flag to enable reader-writer concurrency during run time.
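Below is a minimal sketch of the release/acquire pattern described in item 1; the structure and function names are hypothetical and do not match the actual rte_hash internals:

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    struct key_slot {
        uint8_t key[16];   /* key bytes written by the writer */
        void *pdata;       /* application data attached to the key */
    };

    struct bucket_entry {
        uint32_t key_idx;  /* 0 == empty; otherwise index into the key store + 1 */
    };

    /* Writer: fill the key slot first, then publish it with a release store. */
    static void
    publish_entry(struct bucket_entry *e, struct key_slot *slot, uint32_t idx,
                  const uint8_t *key, void *pdata)
    {
        memcpy(slot->key, key, sizeof(slot->key));
        __atomic_store_n(&slot->pdata, pdata, __ATOMIC_RELEASE);
        /* A reader that observes the new key_idx (acquire) is guaranteed to
         * also observe the completed key/pdata writes above. */
        __atomic_store_n(&e->key_idx, idx, __ATOMIC_RELEASE);
    }

    /* Reader: acquire loads pair with the writer's release stores. */
    static void *
    lookup_entry(const struct bucket_entry *e, struct key_slot *slots,
                 const uint8_t *key)
    {
        uint32_t idx = __atomic_load_n(&e->key_idx, __ATOMIC_ACQUIRE);

        if (idx == 0)
            return NULL;
        struct key_slot *slot = &slots[idx - 1];
        if (memcmp(slot->key, key, sizeof(slot->key)) != 0)
            return NULL;  /* mismatches from racing updates are caught here */
        return __atomic_load_n(&slot->pdata, __ATOMIC_ACQUIRE);
    }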
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
|
| 450f0791 | 19-Oct-2018 |
Liang Ma <liang.j.ma@intel.com> |
power: add traffic pattern aware power control
1. Abstract
For packet processing workloads such as DPDK, polling is continuous. This means CPU cores always show 100% busy, independent of how much work those cores are doing. Accurately determining how busy a core is matters for the following reasons:
* No indication of overload conditions.
* User does not know how much real load is on a system, resulting in wasted energy as no power management is utilized.
Compared to the original l3fwd-power design, instead of going to sleep after detecting an empty poll, the new mechanism just lowers the core frequency. As a result, the application does not stop polling the device, which leads to improved handling of bursts of traffic.
When the system becomes busy, the empty poll mechanism can also increase the core frequency (including turbo) to give best-effort handling of intensive traffic. This provides more flexible and balanced traffic awareness than the standard l3fwd-power application.
2. Proposed solution
The proposed solution focuses on how many empty polls are executed. The fewer the empty polls, the busier the current core is with processing workload, and therefore a higher frequency is needed. A high empty-poll count indicates the current core is not doing any real work, so the frequency can be lowered to save power.
In the current implementation, each core has one empty-poll counter, which assumes one core is dedicated to one queue. This will need to be expanded in the future to support multiple queues per core.
2.1 Power state definition:
LOW: Not currently used, reserved for future use.
MED: the frequency is used to process modest traffic workload.
HIGH: the frequency is used to process busy traffic workload.
2.2 There are two phases to establish the power management system:
a. Initialization/Training phase. The training phase is necessary in order to figure out the system's polling baseline numbers from idle to busy. The highest poll count will be during idle, where all polls are empty. These poll counts will differ between systems due to the many possible processor micro-architecture, cache and device configurations, hence the training phase. In the training phase, traffic is blocked so the training algorithm can average the empty-poll numbers for the LOW, MED and HIGH power states in order to create a baseline. The core's counters are collected every 10ms, and the training phase takes 2 seconds. Training is disabled in the default configuration and the default parameters are applied; the sample app can still trigger training if needed. Once the training phase has been executed once on a system, the application can then be started with the relevant thresholds provided on the command line, allowing the application to start passing traffic immediately.
b. Normal phase. Traffic starts immediately based on the default thresholds, or on user-supplied thresholds passed via the command-line parameters. The run-time poll counts are compared with the baseline and the decision is taken to move to the MED or HIGH power state. The counters are calculated every 10ms.
3. Proposed API
1. rte_power_empty_poll_stat_init(struct ep_params **eptr, uint8_t *freq_tlb, struct ep_policy *policy); used to initialize the power management system.
2. rte_power_empty_poll_stat_free(void); used to free the resources held by the power management system.
3. rte_power_empty_poll_stat_update(unsigned int lcore_id); used to update a specific core's empty poll counter, not thread safe.
4. rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt); used to update a specific core's valid poll counter, not thread safe.
5. rte_power_empty_poll_stat_fetch(unsigned int lcore_id); used to get a specific core's empty poll counter.
6. rte_power_poll_stat_fetch(unsigned int lcore_id); used to get a specific core's valid poll counter.
7. rte_empty_poll_detection(struct rte_timer *tim, void *arg); used to detect empty poll state changes and then take action.
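A hedged sketch of how a worker's polling loop might feed these counters (the helper name and burst size are placeholders; see the sample application mentioned above for the full initialization flow):

    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_ethdev.h>
    #include <rte_power_empty_poll.h>

    /* One iteration of a worker's polling loop, feeding the per-core
     * empty/valid poll counters used by the frequency decisions above. */
    static void
    poll_once(uint16_t port, uint16_t queue)
    {
        struct rte_mbuf *bufs[32];
        unsigned int lcore = rte_lcore_id();
        uint16_t nb = rte_eth_rx_burst(port, queue, bufs, 32);

        if (nb == 0)
            rte_power_empty_poll_stat_update(lcore);   /* empty poll */
        else
            rte_power_poll_stat_update(lcore, nb);     /* valid poll, nb packets */

        /* ... process and free bufs[0..nb-1] ... */
    }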
Signed-off-by: Liang Ma <liang.j.ma@intel.com> Reviewed-by: Lei Yao <lei.a.yao@intel.com> Acked-by: David Hunt <david.hunt@intel.com>
|
| cd85039e | 12-Oct-2018 |
Maxime Coquelin <maxime.coquelin@redhat.com> |
vhost: restrict postcopy live-migration enablement
The postcopy live-migration feature requires the application not to populate the guest memory. As the vhost library cannot prevent the application from doing that (e.g. it cannot prevent the application from calling mlockall()), the feature is disabled by default.
The application should only enable the feature if it does not force the guest memory to be populated.
In case the user passes the RTE_VHOST_USER_POSTCOPY_SUPPORT flag at registration but the feature was not compiled in, registration fails.
For the same reason, the postcopy and dequeue zero copy features are not compatible, so postcopy support is not advertised if dequeue zero copy is requested.
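A minimal sketch of enabling the feature at registration time, assuming the standard rte_vhost_driver_register() call (the helper is hypothetical and error handling is elided):

    #include <rte_vhost.h>

    /* Request postcopy support only when the application does not force
     * guest memory to be populated (e.g. it must not call mlockall()). */
    static int
    register_vhost_socket(const char *path)
    {
        uint64_t flags = RTE_VHOST_USER_POSTCOPY_SUPPORT;

        /* Registration fails here if the library was built without
         * postcopy support. */
        return rte_vhost_driver_register(path, flags);
    }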
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
|