lib/mldev/rte_mldev.h

/* SPDX-License-Identifier: BSD-3-Clause
 * Copyright (c) 2022 Marvell.
 */

#ifndef RTE_MLDEV_H
#define RTE_MLDEV_H

/**
 * @file rte_mldev.h
 *
 * @warning
 * @b EXPERIMENTAL:
 * All functions in this file may be changed or removed without prior notice.
 *
 * ML (Machine Learning) device API.
 *
 * The ML framework is built on the following model:
 *
 *
 *     +-----------------+               rte_ml_[en|de]queue_burst()
 *     |                 |                          |
 *     |     Machine     o------+     +--------+    |
 *     |     Learning    |      |     | queue  |    |    +------+
 *     |     Inference   o------+-----o        |<===o===>|Core 0|
 *     |     Engine      |      |     | pair 0 |         +------+
 *     |                 o----+ |     +--------+
 *     |                 |    | |
 *     +-----------------+    | |     +--------+
 *              ^             | |     | queue  |         +------+
 *              |             | +-----o        |<=======>|Core 1|
 *              |             |       | pair 1 |         +------+
 *              |             |       +--------+
 *     +--------+--------+    |
 *     | +-------------+ |    |       +--------+
 *     | |   Model 0   | |    |       | queue  |         +------+
 *     | +-------------+ |    +-------o        |<=======>|Core N|
 *     | +-------------+ |            | pair N |         +------+
 *     | |   Model 1   | |            +--------+
 *     | +-------------+ |
 *     | +-------------+ |<------> rte_ml_model_load()
 *     | |   Model ..  | |-------> rte_ml_model_info_get()
 *     | +-------------+ |<------- rte_ml_model_start()
 *     | +-------------+ |<------- rte_ml_model_stop()
 *     | |   Model N   | |<------- rte_ml_model_params_update()
 *     | +-------------+ |<------- rte_ml_model_unload()
 *     +-----------------+
 *
 * ML Device: A hardware or software-based implementation of the ML device API for
 * running inferences using a pre-trained ML model.
 *
 * ML Model: An ML model is an algorithm trained over a dataset. A model consists of
 * procedure/algorithm and data/pattern required to make predictions on live data.
 * Once the model is created and trained outside of the DPDK scope, the model can be loaded
 * via rte_ml_model_load() and then started using the rte_ml_model_start() API.
 * The rte_ml_model_params_update() API can be used to update the model parameters, such as
 * weights and biases, without unloading the model using rte_ml_model_unload().
 *
 * ML Inference: ML inference is the process of feeding data to the model via the
 * rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the calculated
 * outputs/predictions from the started model.
 *
 * In all functions of the ML device API, the ML device is designated by an
 * integer >= 0 named as device identifier *dev_id*.
 *
 * The functions exported by the ML device API to setup a device designated by
 * its device identifier must be invoked in the following order:
 *
 *      - rte_ml_dev_configure()
 *      - rte_ml_dev_queue_pair_setup()
 *      - rte_ml_dev_start()
 *
 * A model is required to run the inference operations with the user specified inputs.
 * The application needs to invoke the ML model API in the following order before queueing
 * inference jobs:
 *
 *      - rte_ml_model_load()
 *      - rte_ml_model_start()
 *
 * A model can be loaded on a device only after the device has been configured and can be
 * started or stopped only after a device has been started.
 *
 * The rte_ml_model_info_get() API is provided to retrieve the information related to the model.
 * The information would include the shape and type of input and output required for the inference.
 *
 * Data quantization and dequantization is one of the main aspects in ML domain. This involves
 * conversion of input data from a higher precision to a lower precision data type and vice-versa
 * for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
 * dequantization through rte_ml_io_dequantize(). These APIs have the capability to handle input
 * and output buffers holding data for multiple batches.
 *
 * Two utility APIs rte_ml_io_input_size_get() and rte_ml_io_output_size_get() can be used to get
 * the size of quantized and de-quantized multi-batch input and output buffers.
 *
 * User can optionally update the model parameters with rte_ml_model_params_update() after
 * invoking rte_ml_model_stop() API on a given model ID.
 *
 * The application can invoke, in any order, the functions exported by the ML API to enqueue
 * inference jobs and dequeue inference responses.
 *
 * If the application wants to change the device configuration (i.e., call
 * rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then application must stop the
 * device using rte_ml_dev_stop() API. Likewise, if model parameters need to be updated then
 * the application must call rte_ml_model_stop() followed by rte_ml_model_params_update() API
 * for the given model. The application does not need to call rte_ml_dev_stop() API for
 * any model re-configuration such as rte_ml_model_params_update(), rte_ml_model_unload() etc.
 *
 * Once the device is in the start state after invoking rte_ml_dev_start() API and the model is in
 * start state after invoking rte_ml_model_start() API, then the application can call
 * rte_ml_enqueue_burst() and rte_ml_dequeue_burst() API on the destined device and model ID.
 *
 * Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.
 *
 * Typical application utilisation of the ML API will follow the following
 * programming flow.
 *
 * - rte_ml_dev_configure()
 * - rte_ml_dev_queue_pair_setup()
 * - rte_ml_model_load()
 * - rte_ml_dev_start()
 * - rte_ml_model_start()
 * - rte_ml_model_info_get()
 * - rte_ml_enqueue_burst()
 * - rte_ml_dequeue_burst()
 * - rte_ml_model_stop()
 * - rte_ml_model_unload()
 * - rte_ml_dev_stop()
 * - rte_ml_dev_close()
 *
 * Regarding multi-threading, by default, all the functions of the ML Device API exported by a PMD
 * are lock-free functions which are assumed not to be invoked in parallel on different logical
 * cores on the same target object. For instance, the dequeue function of a poll mode driver cannot
 * be invoked in parallel on two logical cores to operate on the same queue pair. Of course, this
 * function can be invoked in parallel by different logical cores on different queue pairs.
 * It is the responsibility of the user application to enforce this rule.
 */
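
/*
 * Illustrative sketch, not part of the DPDK API: the call-ordering rules from the file
 * comment above, written as a few predicate functions. Every name prefixed with sketch_
 * is hypothetical and the state enums are stand-ins; the library tracks device and model
 * state internally. The sketch assumes a model may be loaded while the device is either
 * configured or started.
 */

```c
#include <stdbool.h>

/* Hypothetical device lifecycle states (stand-in, not a DPDK type). */
enum sketch_dev_state {
	SKETCH_DEV_UNCONFIGURED,
	SKETCH_DEV_CONFIGURED, /* configured, or stopped after a start */
	SKETCH_DEV_STARTED,
	SKETCH_DEV_CLOSED,
};

/* Hypothetical model lifecycle states (stand-in, not a DPDK type). */
enum sketch_model_state {
	SKETCH_MODEL_UNLOADED,
	SKETCH_MODEL_LOADED, /* loaded, or stopped after a start */
	SKETCH_MODEL_STARTED,
};

/* Configure and queue pair setup need a device that is neither started nor closed. */
static bool
sketch_can_configure(enum sketch_dev_state dev)
{
	return dev == SKETCH_DEV_UNCONFIGURED || dev == SKETCH_DEV_CONFIGURED;
}

/* A model can be loaded only after the device has been configured. */
static bool
sketch_can_load_model(enum sketch_dev_state dev)
{
	return dev == SKETCH_DEV_CONFIGURED || dev == SKETCH_DEV_STARTED;
}

/* A model can be started only after the device has been started. */
static bool
sketch_can_start_model(enum sketch_dev_state dev, enum sketch_model_state model)
{
	return dev == SKETCH_DEV_STARTED && model == SKETCH_MODEL_LOADED;
}

/* Enqueue and dequeue need both the device and the model in the started state. */
static bool
sketch_can_enqueue(enum sketch_dev_state dev, enum sketch_model_state model)
{
	return dev == SKETCH_DEV_STARTED && model == SKETCH_MODEL_STARTED;
}
```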

#include <rte_common.h>
#include <rte_log.h>
#include <rte_mempool.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Logging Macro */
extern int rte_ml_dev_logtype;

#define RTE_MLDEV_LOG(level, fmt, args...)                                                         \
	rte_log(RTE_LOG_##level, rte_ml_dev_logtype, "%s(): " fmt "\n", __func__, ##args)

#define RTE_ML_STR_MAX 128
/**< Maximum length of name string */

#define RTE_MLDEV_DEFAULT_MAX 32
/**< Maximum number of devices if rte_ml_dev_init() is not called. */

/* Device operations */

/**
 * Initialize the device array before probing devices. If not called, the first device probed would
 * initialize the array to a size of RTE_MLDEV_DEFAULT_MAX.
 *
 * @param dev_max
 *   Maximum number of devices.
 *
 * @return
 *   0 on success, -rte_errno otherwise:
 *   - ENOMEM if out of memory
 *   - EINVAL if 0 size
 *   - EBUSY if already initialized
 */
__rte_experimental
int
rte_ml_dev_init(size_t dev_max);

/**
 * Get the total number of ML devices that have been successfully initialised.
 *
 * @return
 *   - The total number of usable ML devices.
 */
__rte_experimental
uint16_t
rte_ml_dev_count(void);

/**
 * Check if the device is in ready state.
 *
 * @param dev_id
 *   The identifier of the device.
 *
 * @return
 *   - 0 if the device is not in ready state.
 *   - 1 if the device is in ready state.
 */
__rte_experimental
int
rte_ml_dev_is_valid_dev(int16_t dev_id);

/**
 * Return the NUMA socket to which a device is connected.
 *
 * @param dev_id
 *   The identifier of the device.
 *
 * @return
 *   - The NUMA socket id to which the device is connected
 *   - 0 If the socket could not be determined.
 *   - -EINVAL: if the dev_id value is not valid.
 */
__rte_experimental
int
rte_ml_dev_socket_id(int16_t dev_id);

/**  ML device information */
struct rte_ml_dev_info {
	const char *driver_name;
	/**< Driver name */
	uint16_t max_models;
	/**< Maximum number of models supported by the device.
	 * @see struct rte_ml_dev_config::nb_models
	 */
	uint16_t max_queue_pairs;
	/**< Maximum number of queue pairs supported by the device.
	 * @see struct rte_ml_dev_config::nb_queue_pairs
	 */
	uint16_t max_desc;
	/**< Maximum allowed number of descriptors for queue pair by the device.
	 * @see struct rte_ml_dev_qp_conf::nb_desc
	 */
	uint16_t max_segments;
	/**< Maximum number of scatter-gather entries supported by the device.
	 * @see struct rte_ml_buff_seg  struct rte_ml_buff_seg::next
	 */
	uint16_t min_align_size;
	/**< Minimum alignment size of IO buffers used by the device. */
};

/**
 * Retrieve the information of the device.
 *
 * @param dev_id
 *   The identifier of the device.
 * @param dev_info
 *   A pointer to a structure of type *rte_ml_dev_info* to be filled with the info of the device.
 *
 * @return
 *   - 0: Success, driver updates the information of the ML device
 *   - < 0: Error code returned by the driver info get function.
 */
__rte_experimental
int
rte_ml_dev_info_get(int16_t dev_id, struct rte_ml_dev_info *dev_info);

/** ML device configuration structure */
struct rte_ml_dev_config {
	int socket_id;
	/**< Socket to allocate resources on. */
	uint16_t nb_models;
	/**< Number of models to be loaded on the device.
	 * This value cannot exceed the max_models which is previously provided in
	 * struct rte_ml_dev_info::max_models
	 */
	uint16_t nb_queue_pairs;
	/**< Number of queue pairs to configure on this device.
	 * This value cannot exceed the max_queue_pairs which is previously provided in
	 * struct rte_ml_dev_info::max_queue_pairs
	 */
};

/**
 * Configure an ML device.
 *
 * This function must be invoked first before any other function in the API.
 *
 * ML Device can be re-configured, when in a stopped state. Device cannot be re-configured after
 * rte_ml_dev_close() is called.
 *
 * The caller may use rte_ml_dev_info_get() to get the capability of each resource available for
 * this ML device.
 *
 * @param dev_id
 *   The identifier of the device to configure.
 * @param config
 *   The ML device configuration structure.
 *
 * @return
 *   - 0: Success, device configured.
 *   - < 0: Error code returned by the driver configuration function.
 */
__rte_experimental
int
rte_ml_dev_configure(int16_t dev_id, const struct rte_ml_dev_config *config);
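
/*
 * Illustrative sketch, not part of the DPDK API: checking a requested configuration against
 * the limits reported by rte_ml_dev_info_get() before calling rte_ml_dev_configure(). The
 * sketch_ structures are hypothetical stand-ins that copy only the relevant field names from
 * struct rte_ml_dev_info and struct rte_ml_dev_config, so the block compiles without DPDK.
 * Only the documented upper bounds are checked; whether a driver also rejects zero counts is
 * not stated in this header.
 */

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the capability fields of struct rte_ml_dev_info used here. */
struct sketch_dev_info {
	uint16_t max_models;
	uint16_t max_queue_pairs;
};

/* Stand-in for the matching request fields of struct rte_ml_dev_config. */
struct sketch_dev_config {
	uint16_t nb_models;
	uint16_t nb_queue_pairs;
};

/*
 * Return 0 if the requested counts fit within the reported limits,
 * -EINVAL if a pointer is NULL or a count exceeds its limit.
 */
static int
sketch_config_check(const struct sketch_dev_info *info, const struct sketch_dev_config *conf)
{
	if (info == NULL || conf == NULL)
		return -EINVAL;

	/* nb_models is bounded by max_models. */
	if (conf->nb_models > info->max_models)
		return -EINVAL;

	/* nb_queue_pairs is bounded by max_queue_pairs, not by max_models. */
	if (conf->nb_queue_pairs > info->max_queue_pairs)
		return -EINVAL;

	return 0;
}
```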

/* Forward declaration */
struct rte_ml_op;

/** Callback function called during rte_ml_dev_stop(), invoked once per flushed ML op */
typedef void (*rte_ml_dev_stop_flush_t)(int16_t dev_id, uint16_t qp_id, struct rte_ml_op *op);

/** ML device queue pair configuration structure. */
struct rte_ml_dev_qp_conf {
	uint32_t nb_desc;
	/**< Number of descriptors per queue pair.
	 * This value cannot exceed the max_desc which is previously provided in
	 * struct rte_ml_dev_info::max_desc
	 */
	rte_ml_dev_stop_flush_t cb;
	/**< Callback function called during rte_ml_dev_stop(), invoked once per active ML op.
	 * Value NULL is allowed, in which case callback will not be invoked.
	 * This function can be used to properly dispose of outstanding ML ops from all
	 * queue pairs, for example ops containing memory pointers.
	 * @see rte_ml_dev_stop()
	 */
};

/**
 * Set up a queue pair for a device. This should only be called when the device is stopped.
 *
 * @param dev_id
 *   The identifier of the device.
 * @param queue_pair_id
 *   The index of the queue pair to set up. The value must be in the range [0, nb_queue_pairs - 1]
 * previously supplied to rte_ml_dev_configure().
 * @param qp_conf
 *   The pointer to the configuration data to be used for the queue pair.
 * @param socket_id
 *   The *socket_id* argument is the socket identifier in case of NUMA.
 * The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the memory allocated
 * for the queue pair.
 *
 * @return
 *   - 0: Success, queue pair correctly set up.
 *   - < 0: Queue pair configuration failed.
 */
__rte_experimental
int
rte_ml_dev_queue_pair_setup(int16_t dev_id, uint16_t queue_pair_id,
			    const struct rte_ml_dev_qp_conf *qp_conf, int socket_id);

/**
 * Start an ML device.
 *
 * The device start step consists of setting the configured features and enabling the ML device
 * to accept inference jobs.
 *
 * @param dev_id
 *   The identifier of the device.
 *
 * @return
 *   - 0: Success, device started.
 *   - <0: Error code of the driver device start function.
 */
__rte_experimental
int
rte_ml_dev_start(int16_t dev_id);

/**
 * Stop an ML device. A stopped device cannot accept inference jobs.
 * The device can be restarted with a call to rte_ml_dev_start().
 *
 * @param dev_id
 *   The identifier of the device.
 *
 * @return
 *   - 0: Success, device stopped.
 *   - <0: Error code of the driver device stop function.
 */
__rte_experimental
int
rte_ml_dev_stop(int16_t dev_id);

/**
 * Close an ML device. The device cannot be restarted!
 *
 * @param dev_id
 *   The identifier of the device.
 *
 * @return
 *  - 0 on successfully closing device.
 *  - <0 on failure to close device.
 */
__rte_experimental
int
rte_ml_dev_close(int16_t dev_id);

/** Status of ML operation */
enum rte_ml_op_status {
	RTE_ML_OP_STATUS_SUCCESS = 0,
	/**< Operation completed successfully */
	RTE_ML_OP_STATUS_NOT_PROCESSED,
	/**< Operation has not yet been processed by the device. */
	RTE_ML_OP_STATUS_ERROR,
	/**< Operation completed with error.
	 * Application can invoke rte_ml_op_error_get() to get PMD specific
	 * error code if needed.
	 */
};

/** ML operation's input and output buffer representation as scatter gather list
 */
struct rte_ml_buff_seg {
	rte_iova_t iova_addr;
	/**< IOVA address of segment buffer. */
	void *addr;
	/**< Virtual address of segment buffer. */
	uint32_t length;
	/**< Segment length. */
	uint32_t reserved;
	/**< Reserved for future use. */
	struct rte_ml_buff_seg *next;
	/**< Points to next segment. Value NULL represents the last segment. */
};
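
/*
 * Illustrative sketch, not part of the DPDK API: walking a scatter-gather chain to count its
 * segments and sum their lengths, for example to compare against
 * struct rte_ml_dev_info::max_segments before enqueueing. struct sketch_buff_seg is a
 * hypothetical stand-in that keeps only the length and next fields of struct rte_ml_buff_seg
 * so the block compiles without DPDK.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_ml_buff_seg, reduced to the fields this sketch needs. */
struct sketch_buff_seg {
	uint32_t length;
	struct sketch_buff_seg *next; /* NULL marks the last segment */
};

/*
 * Walk the chain starting at head.
 * Returns the number of segments and stores the summed length in *total_len.
 * Returns 0, leaving *total_len untouched, if the chain is empty or has more
 * than max_segments entries.
 */
static uint16_t
sketch_seg_chain_walk(const struct sketch_buff_seg *head, uint16_t max_segments,
		      uint64_t *total_len)
{
	const struct sketch_buff_seg *seg;
	uint64_t len = 0;
	uint16_t count = 0;

	for (seg = head; seg != NULL; seg = seg->next) {
		if (count == max_segments)
			return 0; /* one more segment than the limit allows */
		count++;
		len += seg->length;
	}

	if (count != 0 && total_len != NULL)
		*total_len = len;

	return count;
}
```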

/**
 * ML Operation.
 *
 * This structure contains data related to performing an ML operation on the buffers using
 * the model specified through model_id.
 */
struct rte_ml_op {
	uint16_t model_id;
	/**< Model ID to be used for the operation. */
	uint16_t nb_batches;
	/**< Number of batches. Minimum value must be one.
	 * Input buffer must hold inference data for each batch contiguously.
	 */
	uint32_t reserved;
	/**< Reserved for future use. */
	struct rte_mempool *mempool;
	/**< Pool from which operation is allocated. */
	struct rte_ml_buff_seg input;
	/**< Input buffer to hold the inference data. */
	struct rte_ml_buff_seg output;
	/**< Output buffer to hold the inference output by the driver. */
	RTE_STD_C11
	union {
		uint64_t user_u64;
		/**< User data as uint64_t.*/
		void *user_ptr;
		/**< User data as void*.*/
	};
	enum rte_ml_op_status status;
	/**< Operation status. */
	uint64_t impl_opaque;
	/**< Implementation specific opaque value.
	 * An implementation may use this field to hold
	 * implementation specific value to share between
	 * dequeue and enqueue operation.
	 * The application should not modify this field.
	 */
} __rte_cache_aligned;

/* Enqueue/Dequeue operations */

/**
 * Enqueue a burst of ML inferences for processing on an ML device.
 *
 * The rte_ml_enqueue_burst() function is invoked to place ML inference
 * operations on the queue *qp_id* of the device designated by its *dev_id*.
 *
 * The *nb_ops* parameter is the number of inferences to process which are
 * supplied in the *ops* array of *rte_ml_op* structures.
 *
 * The rte_ml_enqueue_burst() function returns the number of inferences it
 * actually enqueued for processing. A return value equal to *nb_ops* means that
 * all inferences have been enqueued.
 *
 * @param dev_id
 *   The identifier of the device.
 * @param qp_id
 *   The index of the queue pair on which inferences are to be enqueued for processing.
 * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
 * *rte_ml_dev_configure*.
 * @param ops
 *   The address of an array of *nb_ops* pointers to *rte_ml_op* structures which contain the
 * ML inferences to be processed.
 * @param nb_ops
 *   The number of operations to process.
 *
 * @return
 *   The number of inference operations actually enqueued to the ML device.
 * The return value can be less than the value of the *nb_ops* parameter when the ML device queue
 * is full or if invalid parameters are specified in a *rte_ml_op*.
 */
__rte_experimental
uint16_t
rte_ml_enqueue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);

/**
 * Dequeue a burst of processed ML inference operations from a queue on the ML device.
 * The dequeued operations are stored in *rte_ml_op* structures whose pointers are supplied
 * in the *ops* array.
 *
 * The rte_ml_dequeue_burst() function returns the number of inferences actually dequeued,
 * which is the number of *rte_ml_op* data structures effectively supplied into the *ops* array.
 *
 * A return value equal to *nb_ops* indicates that the queue contained at least *nb_ops*
 * operations, and this is likely to signify that other processed operations remain in the
 * device's output queue. An application implementing a "retrieve as many processed operations as
 * possible" policy can check this specific case and keep invoking the rte_ml_dequeue_burst()
 * function until a value less than *nb_ops* is returned.
 *
 * The rte_ml_dequeue_burst() function does not provide any error notification to avoid
 * the corresponding overhead.
 *
 * @param dev_id
 *   The identifier of the device.
 * @param qp_id
 *   The index of the queue pair from which to retrieve processed inferences.
 * The value must be in the range [0, nb_queue_pairs - 1] previously supplied to
 * rte_ml_dev_configure().
 * @param ops
 *   The address of an array of pointers to *rte_ml_op* structures that must be large enough to
 * store *nb_ops* pointers in it.
 * @param nb_ops
 *   The maximum number of inferences to dequeue.
 *
 * @return
 *   The number of operations actually dequeued, which is the number of pointers
 * to *rte_ml_op* structures effectively supplied to the *ops* array.
 */
__rte_experimental
uint16_t
rte_ml_dequeue_burst(int16_t dev_id, uint16_t qp_id, struct rte_ml_op **ops, uint16_t nb_ops);
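
/*
 * Illustrative sketch, not part of the DPDK API: the "retrieve as many processed operations
 * as possible" policy described for rte_ml_dequeue_burst(), written against a function
 * pointer with the same parameter shape so the block compiles without DPDK. All sketch_
 * names, the burst size of 8 and the stub queue are hypothetical; a real application would
 * pass a wrapper around rte_ml_dequeue_burst() and inspect each op's status.
 */

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_op; /* opaque stand-in for struct rte_ml_op */

/* Same parameter shape as rte_ml_dequeue_burst(). */
typedef uint16_t (*sketch_dequeue_fn)(int16_t dev_id, uint16_t qp_id, struct sketch_op **ops,
				      uint16_t nb_ops);

#define SKETCH_BURST 8

/* Keep dequeueing full bursts until a short burst signals the queue is drained. */
static uint32_t
sketch_drain(sketch_dequeue_fn dequeue, int16_t dev_id, uint16_t qp_id)
{
	struct sketch_op *ops[SKETCH_BURST];
	uint32_t total = 0;
	uint16_t n;

	do {
		n = dequeue(dev_id, qp_id, ops, SKETCH_BURST);
		/* A real application would process ops[0..n-1] here. */
		total += n;
	} while (n == SKETCH_BURST);

	return total;
}

/* Stub queue for demonstration only: a counter of pending completions. */
static uint16_t sketch_pending;

static uint16_t
sketch_stub_dequeue(int16_t dev_id, uint16_t qp_id, struct sketch_op **ops, uint16_t nb_ops)
{
	uint16_t n = sketch_pending < nb_ops ? sketch_pending : nb_ops;
	uint16_t i;

	(void)dev_id;
	(void)qp_id;
	for (i = 0; i < n; i++)
		ops[i] = NULL; /* the stub has no real operations to hand back */
	sketch_pending = (uint16_t)(sketch_pending - n);

	return n;
}
```

/*
 * Design note: when the number of pending operations is an exact multiple of the burst
 * size, this loop makes one extra call that returns 0 before it stops.
 */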

/**
 * Verbose error structure definition.
 */
struct rte_ml_op_error {
	char message[RTE_ML_STR_MAX]; /**< Human-readable error message. */
	uint64_t errcode;	      /**< Vendor specific error code. */
};

/**
 * Get PMD specific error information for an ML op.
 *
 * When an ML operation completes with RTE_ML_OP_STATUS_ERROR as its status,
 * this API can be used to get PMD specific error details.
 *
 * @param[in] dev_id
 *   Device identifier
 * @param[in] op
 *   Handle of ML operation
 * @param[out] error
 *   Address of structure rte_ml_op_error to be filled
 *
 * @return
 *   - Returns 0 on success
 *   - Returns negative value on failure
 */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_op_error_get(int16_t dev_id, struct rte_ml_op *op, struct rte_ml_op_error *error);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/* Statistics operations */
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/** Device statistics. */
*d82cac58SJerin Jacobstruct rte_ml_dev_stats {
*d82cac58SJerin Jacob	uint64_t enqueued_count;
*d82cac58SJerin Jacob	/**< Count of all operations enqueued */
*d82cac58SJerin Jacob	uint64_t dequeued_count;
*d82cac58SJerin Jacob	/**< Count of all operations dequeued */
*d82cac58SJerin Jacob	uint64_t enqueue_err_count;
*d82cac58SJerin Jacob	/**< Total error count on operations enqueued */
*d82cac58SJerin Jacob	uint64_t dequeue_err_count;
*d82cac58SJerin Jacob	/**< Total error count on operations dequeued */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Retrieve the general I/O statistics of a device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param stats
*d82cac58SJerin Jacob *   Pointer to the structure where statistics will be copied.
*d82cac58SJerin Jacob * On error, this location may or may not have been modified.
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0 on success
*d82cac58SJerin Jacob *   - -EINVAL: If invalid parameter pointer is provided.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_stats_get(int16_t dev_id, struct rte_ml_dev_stats *stats);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Reset the statistics of a device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobvoid
*d82cac58SJerin Jacobrte_ml_dev_stats_reset(int16_t dev_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * A name-key lookup element for extended statistics.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * This structure is used to map between names and ID numbers for extended ML device statistics.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobstruct rte_ml_dev_xstats_map {
*d82cac58SJerin Jacob	uint16_t id;
*d82cac58SJerin Jacob	/**< xstat identifier */
*d82cac58SJerin Jacob	char name[RTE_ML_STR_MAX];
*d82cac58SJerin Jacob	/**< xstat name */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Retrieve names of extended statistics of an ML device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[out] xstats_map
*d82cac58SJerin Jacob *   Block of memory to insert id and names into. Must be at least size in capacity.
*d82cac58SJerin Jacob * If set to NULL, function returns required capacity.
*d82cac58SJerin Jacob * @param size
*d82cac58SJerin Jacob *   Capacity of xstats_map (number of name-id maps).
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Positive value on success:
*d82cac58SJerin Jacob *      - The return value is the number of entries filled in the stats map.
*d82cac58SJerin Jacob *      - If xstats_map is set to NULL, the required capacity for xstats_map.
*d82cac58SJerin Jacob *   - Negative value on error:
*d82cac58SJerin Jacob *      - -ENODEV: for invalid *dev_id*.
*d82cac58SJerin Jacob *      - -ENOTSUP: if the device doesn't support this function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_xstats_names_get(int16_t dev_id, struct rte_ml_dev_xstats_map *xstats_map,
*d82cac58SJerin Jacob			    uint32_t size);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Retrieve the value of a single stat by requesting it by name.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param name
*d82cac58SJerin Jacob *   The stat name to retrieve.
*d82cac58SJerin Jacob * @param stat_id
*d82cac58SJerin Jacob *   If non-NULL, the numerical id of the stat will be returned, so that further requests for
*d82cac58SJerin Jacob * the stat can be made using rte_ml_dev_xstats_get(), which is faster as it doesn't need to
*d82cac58SJerin Jacob * scan a list of names for the stat.
*d82cac58SJerin Jacob * @param[out] value
*d82cac58SJerin Jacob *   Must be non-NULL, retrieved xstat value will be stored in this address.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Successfully retrieved xstat value.
*d82cac58SJerin Jacob *   - -EINVAL: invalid parameters.
*d82cac58SJerin Jacob *   - -ENOTSUP: if not supported.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_xstats_by_name_get(int16_t dev_id, const char *name, uint16_t *stat_id, uint64_t *value);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Retrieve extended statistics of an ML device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param stat_ids
*d82cac58SJerin Jacob *   The id numbers of the stats to get. The ids can be fetched from the stat position in the
*d82cac58SJerin Jacob * stat list from rte_ml_dev_xstats_names_get(), or by using rte_ml_dev_xstats_by_name_get().
*d82cac58SJerin Jacob * @param values
*d82cac58SJerin Jacob *   The values for each stat requested by ID.
*d82cac58SJerin Jacob * @param nb_ids
*d82cac58SJerin Jacob *   The number of stats requested.
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Positive value: number of stat entries filled into the values array
*d82cac58SJerin Jacob *   - Negative value on error:
*d82cac58SJerin Jacob *      - -ENODEV: for invalid *dev_id*.
*d82cac58SJerin Jacob *      - -ENOTSUP: if the device doesn't support this function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_xstats_get(int16_t dev_id, const uint16_t *stat_ids, uint64_t *values, uint16_t nb_ids);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Reset the values of the selected xstats of the device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param stat_ids
*d82cac58SJerin Jacob *   Selects specific statistics to be reset. When NULL, all statistics will be reset.
*d82cac58SJerin Jacob * If non-NULL, must point to array of at least *nb_ids* size.
*d82cac58SJerin Jacob * @param nb_ids
*d82cac58SJerin Jacob *   The number of ids available from the *stat_ids* array. Ignored when stat_ids is NULL.
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Successfully reset the statistics to zero.
*d82cac58SJerin Jacob *   - -EINVAL: invalid parameters.
*d82cac58SJerin Jacob *   - -ENOTSUP: if not supported.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_xstats_reset(int16_t dev_id, const uint16_t *stat_ids, uint16_t nb_ids);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/* Utility operations */
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Dump internal information about *dev_id* to the FILE* provided in *fd*.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param fd
*d82cac58SJerin Jacob *   A pointer to a file for output.
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: on success.
*d82cac58SJerin Jacob *   - <0: on failure.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_dump(int16_t dev_id, FILE *fd);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Trigger the ML device self test.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Selftest successful.
*d82cac58SJerin Jacob *   - -ENOTSUP: if the device doesn't support selftest.
*d82cac58SJerin Jacob *   - other values < 0 on failure.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_dev_selftest(int16_t dev_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/* Model operations */
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/** ML model load parameters
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Parameters required to load an ML model.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobstruct rte_ml_model_params {
*d82cac58SJerin Jacob	void *addr;
*d82cac58SJerin Jacob	/**< Address of model buffer */
*d82cac58SJerin Jacob	size_t size;
*d82cac58SJerin Jacob	/**< Size of model buffer */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Load an ML model to the device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Load an ML model to the device with parameters requested in the structure rte_ml_model_params.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] params
*d82cac58SJerin Jacob *   Parameters for the model to be loaded.
*d82cac58SJerin Jacob * @param[out] model_id
*d82cac58SJerin Jacob *   Identifier of the model loaded.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Success, Model loaded.
*d82cac58SJerin Jacob *   - < 0: Failure, Error code of the model load driver function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_load(int16_t dev_id, struct rte_ml_model_params *params, uint16_t *model_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Unload an ML model from the device.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier of the model to be unloaded.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Success, Model unloaded.
*d82cac58SJerin Jacob *   - < 0: Failure, Error code of the model unload driver function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_unload(int16_t dev_id, uint16_t model_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Start an ML model for the given device ID.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Start an ML model to accept inference requests.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier of the model to be started.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Success, Model started.
*d82cac58SJerin Jacob *   - < 0: Failure, Error code of the model start driver function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_start(int16_t dev_id, uint16_t model_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Stop an ML model for the given device ID.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Stopping a model disables it from being used for inference jobs.
*d82cac58SJerin Jacob * All inference jobs must have been completed before model stop is attempted.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier of the model to be stopped.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - 0: Success, Model stopped.
*d82cac58SJerin Jacob *   - < 0: Failure, Error code of the model stop driver function.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_stop(int16_t dev_id, uint16_t model_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Input and output data types. ML models can operate on reduced precision
*d82cac58SJerin Jacob * datatypes to achieve better power efficiency, lower network latency and lower memory footprint.
*d82cac58SJerin Jacob * This enum is used to represent the lower precision integer and floating point types used
*d82cac58SJerin Jacob * by ML models.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobenum rte_ml_io_type {
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_UNKNOWN = 0,
*d82cac58SJerin Jacob	/**< Invalid or unknown type */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_INT8,
*d82cac58SJerin Jacob	/**< 8-bit integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_UINT8,
*d82cac58SJerin Jacob	/**< 8-bit unsigned integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_INT16,
*d82cac58SJerin Jacob	/**< 16-bit integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_UINT16,
*d82cac58SJerin Jacob	/**< 16-bit unsigned integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_INT32,
*d82cac58SJerin Jacob	/**< 32-bit integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_UINT32,
*d82cac58SJerin Jacob	/**< 32-bit unsigned integer */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_FP8,
*d82cac58SJerin Jacob	/**< 8-bit floating point number */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_FP16,
*d82cac58SJerin Jacob	/**< IEEE 754 16-bit floating point number */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_FP32,
*d82cac58SJerin Jacob	/**< IEEE 754 32-bit floating point number */
*d82cac58SJerin Jacob	RTE_ML_IO_TYPE_BFLOAT16
*d82cac58SJerin Jacob	/**< 16-bit brain floating point number. */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Input and output format. This is used to represent the encoding type of multi-dimensional
*d82cac58SJerin Jacob * data used by ML models.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobenum rte_ml_io_format {
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_NCHW = 1,
*d82cac58SJerin Jacob	/**< Batch size (N) x channels (C) x height (H) x width (W) */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_NHWC,
*d82cac58SJerin Jacob	/**< Batch size (N) x height (H) x width (W) x channels (C) */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_CHWN,
*d82cac58SJerin Jacob	/**< Channels (C) x height (H) x width (W) x batch size (N) */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_3D,
*d82cac58SJerin Jacob	/**< Format to represent a 3 dimensional data */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_2D,
*d82cac58SJerin Jacob	/**< Format to represent matrix data */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_1D,
*d82cac58SJerin Jacob	/**< Format to represent vector data */
*d82cac58SJerin Jacob	RTE_ML_IO_FORMAT_SCALAR,
*d82cac58SJerin Jacob	/**< Format to represent scalar data */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Input and output shape. This structure represents the encoding format and dimensions
*d82cac58SJerin Jacob * of the tensor or vector.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * The data can be a 4D / 3D tensor, matrix, vector or a scalar. The number of dimensions used
*d82cac58SJerin Jacob * for the data depends on the format. Unused dimensions must be set to 1.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobstruct rte_ml_io_shape {
*d82cac58SJerin Jacob	enum rte_ml_io_format format;
*d82cac58SJerin Jacob	/**< Format of the data */
*d82cac58SJerin Jacob	uint32_t w;
*d82cac58SJerin Jacob	/**< First dimension */
*d82cac58SJerin Jacob	uint32_t x;
*d82cac58SJerin Jacob	/**< Second dimension */
*d82cac58SJerin Jacob	uint32_t y;
*d82cac58SJerin Jacob	/**< Third dimension */
*d82cac58SJerin Jacob	uint32_t z;
*d82cac58SJerin Jacob	/**< Fourth dimension */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/** Input and output data information structure
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Specifies the type and shape of input and output data.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacobstruct rte_ml_io_info {
*d82cac58SJerin Jacob	char name[RTE_ML_STR_MAX];
*d82cac58SJerin Jacob	/**< Name of data */
*d82cac58SJerin Jacob	struct rte_ml_io_shape shape;
*d82cac58SJerin Jacob	/**< Shape of data */
*d82cac58SJerin Jacob	enum rte_ml_io_type qtype;
*d82cac58SJerin Jacob	/**< Type of quantized data */
*d82cac58SJerin Jacob	enum rte_ml_io_type dtype;
*d82cac58SJerin Jacob	/**< Type of de-quantized data */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/** Model information structure */
*d82cac58SJerin Jacobstruct rte_ml_model_info {
*d82cac58SJerin Jacob	char name[RTE_ML_STR_MAX];
*d82cac58SJerin Jacob	/**< Model name. */
*d82cac58SJerin Jacob	char version[RTE_ML_STR_MAX];
*d82cac58SJerin Jacob	/**< Model version */
*d82cac58SJerin Jacob	uint16_t model_id;
*d82cac58SJerin Jacob	/**< Model ID */
*d82cac58SJerin Jacob	uint16_t device_id;
*d82cac58SJerin Jacob	/**< Device ID */
*d82cac58SJerin Jacob	uint16_t batch_size;
*d82cac58SJerin Jacob	/**< Maximum number of batches that the model can process simultaneously */
*d82cac58SJerin Jacob	uint32_t nb_inputs;
*d82cac58SJerin Jacob	/**< Number of inputs */
*d82cac58SJerin Jacob	const struct rte_ml_io_info *input_info;
*d82cac58SJerin Jacob	/**< Input info array. Array size is equal to nb_inputs */
*d82cac58SJerin Jacob	uint32_t nb_outputs;
*d82cac58SJerin Jacob	/**< Number of outputs */
*d82cac58SJerin Jacob	const struct rte_ml_io_info *output_info;
*d82cac58SJerin Jacob	/**< Output info array. Array size is equal to nb_outputs */
*d82cac58SJerin Jacob	uint64_t wb_size;
*d82cac58SJerin Jacob	/**< Size of model weights and bias */
*d82cac58SJerin Jacob};
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Get ML model information.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model created
*d82cac58SJerin Jacob * @param[out] model_info
*d82cac58SJerin Jacob *   Pointer to a model info structure
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_info_get(int16_t dev_id, uint16_t model_id, struct rte_ml_model_info *model_info);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Update the model parameters without unloading model.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Update model parameters such as weights and bias without unloading the model.
*d82cac58SJerin Jacob * rte_ml_model_stop() must be called before invoking this API.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model created
*d82cac58SJerin Jacob * @param[in] buffer
*d82cac58SJerin Jacob *   Pointer to the model weights and bias buffer.
*d82cac58SJerin Jacob * Size of the buffer is equal to wb_size returned in *rte_ml_model_info*.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_model_params_update(int16_t dev_id, uint16_t model_id, void *buffer);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/* IO operations */
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Get size of quantized and dequantized input buffers.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Calculate the size of buffers required for quantized and dequantized input data.
*d82cac58SJerin Jacob * This API would return the buffer sizes for the number of batches provided and would
*d82cac58SJerin Jacob * consider the alignment requirements as per the PMD. Input sizes computed by this API can
*d82cac58SJerin Jacob * be used by the application to allocate buffers.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model created
*d82cac58SJerin Jacob * @param[in] nb_batches
*d82cac58SJerin Jacob *   Number of batches of input to be processed in a single inference job
*d82cac58SJerin Jacob * @param[out] input_qsize
*d82cac58SJerin Jacob *   Quantized input size pointer.
*d82cac58SJerin Jacob * NULL value is allowed, in which case input_qsize is not calculated by the driver.
*d82cac58SJerin Jacob * @param[out] input_dsize
*d82cac58SJerin Jacob *   Dequantized input size pointer.
*d82cac58SJerin Jacob * NULL value is allowed, in which case input_dsize is not calculated by the driver.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_io_input_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
*d82cac58SJerin Jacob			 uint64_t *input_qsize, uint64_t *input_dsize);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Get size of quantized and dequantized output buffers.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Calculate the size of buffers required for quantized and dequantized output data.
*d82cac58SJerin Jacob * This API would return the buffer sizes for the number of batches provided and would consider
*d82cac58SJerin Jacob * the alignment requirements as per the PMD. Output sizes computed by this API can be used by the
*d82cac58SJerin Jacob * application to allocate buffers.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model created
*d82cac58SJerin Jacob * @param[in] nb_batches
*d82cac58SJerin Jacob *   Number of batches of input to be processed in a single inference job
*d82cac58SJerin Jacob * @param[out] output_qsize
*d82cac58SJerin Jacob *   Quantized output size pointer.
*d82cac58SJerin Jacob * NULL value is allowed, in which case output_qsize is not calculated by the driver.
*d82cac58SJerin Jacob * @param[out] output_dsize
*d82cac58SJerin Jacob *   Dequantized output size pointer.
*d82cac58SJerin Jacob * NULL value is allowed, in which case output_dsize is not calculated by the driver.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_io_output_size_get(int16_t dev_id, uint16_t model_id, uint32_t nb_batches,
*d82cac58SJerin Jacob			  uint64_t *output_qsize, uint64_t *output_dsize);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Quantize input data.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Quantization converts data from a higher precision type to a lower precision type to improve
*d82cac58SJerin Jacob * the throughput and efficiency of the model execution with minimal loss of accuracy.
*d82cac58SJerin Jacob * Types of dequantized data and quantized data are specified by the model.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model
*d82cac58SJerin Jacob * @param[in] nb_batches
*d82cac58SJerin Jacob *   Number of batches in the dequantized input buffer
*d82cac58SJerin Jacob * @param[in] dbuffer
*d82cac58SJerin Jacob *   Address of dequantized input data
*d82cac58SJerin Jacob * @param[out] qbuffer
*d82cac58SJerin Jacob *   Address of quantized input data
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_io_quantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *dbuffer,
*d82cac58SJerin Jacob		   void *qbuffer);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Dequantize output data.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * Dequantization converts data from a lower precision type to a higher precision type.
*d82cac58SJerin Jacob * Types of quantized data and dequantized data are specified by the model.
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param[in] dev_id
*d82cac58SJerin Jacob *   The identifier of the device.
*d82cac58SJerin Jacob * @param[in] model_id
*d82cac58SJerin Jacob *   Identifier for the model
*d82cac58SJerin Jacob * @param[in] nb_batches
*d82cac58SJerin Jacob *   Number of batches in the dequantized output buffer
*d82cac58SJerin Jacob * @param[in] qbuffer
*d82cac58SJerin Jacob *   Address of quantized output data
*d82cac58SJerin Jacob * @param[out] dbuffer
*d82cac58SJerin Jacob *   Address of dequantized output data
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *   - Returns 0 on success
*d82cac58SJerin Jacob *   - Returns negative value on failure
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobint
*d82cac58SJerin Jacobrte_ml_io_dequantize(int16_t dev_id, uint16_t model_id, uint16_t nb_batches, void *qbuffer,
*d82cac58SJerin Jacob		     void *dbuffer);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/* ML op pool operations */
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Create an ML operation pool
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param name
*d82cac58SJerin Jacob *   ML operations pool name
*d82cac58SJerin Jacob * @param nb_elts
*d82cac58SJerin Jacob *   Number of elements in pool
*d82cac58SJerin Jacob * @param cache_size
*d82cac58SJerin Jacob *   Number of elements to cache on lcore, see
*d82cac58SJerin Jacob *   *rte_mempool_create* for further details about cache size
*d82cac58SJerin Jacob * @param user_size
*d82cac58SJerin Jacob *   Size of private data to allocate for user with each operation
*d82cac58SJerin Jacob * @param socket_id
*d82cac58SJerin Jacob *   Socket identifier to allocate memory on
*d82cac58SJerin Jacob * @return
*d82cac58SJerin Jacob *  - On success pointer to mempool
*d82cac58SJerin Jacob *  - On failure NULL
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobstruct rte_mempool *
*d82cac58SJerin Jacobrte_ml_op_pool_create(const char *name, unsigned int nb_elts, unsigned int cache_size,
*d82cac58SJerin Jacob		      uint16_t user_size, int socket_id);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob/**
*d82cac58SJerin Jacob * Free an ML operation pool
*d82cac58SJerin Jacob *
*d82cac58SJerin Jacob * @param mempool
*d82cac58SJerin Jacob *   A pointer to the mempool structure.
*d82cac58SJerin Jacob *   If NULL, the function does nothing.
*d82cac58SJerin Jacob */
*d82cac58SJerin Jacob__rte_experimental
*d82cac58SJerin Jacobvoid
*d82cac58SJerin Jacobrte_ml_op_pool_free(struct rte_mempool *mempool);
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob#ifdef __cplusplus
*d82cac58SJerin Jacob}
*d82cac58SJerin Jacob#endif
*d82cac58SJerin Jacob
*d82cac58SJerin Jacob#endif /* RTE_MLDEV_H */