..  SPDX-License-Identifier: BSD-3-Clause
    Copyright (c) 2022 Marvell.

dpdk-test-mldev Application
===========================

The ``dpdk-test-mldev`` tool is a Data Plane Development Kit (DPDK) application
that allows testing various mldev use cases.
This application has a generic framework to add new mldev based test cases
to verify functionality
and measure the performance of inference execution on DPDK ML devices.


Application and Options
-----------------------

The application has a number of command line options:

.. code-block:: console

   dpdk-test-mldev [EAL Options] -- [application options]


EAL Options
~~~~~~~~~~~

The following are the EAL command-line options that can be used
with the ``dpdk-test-mldev`` application.
See the DPDK Getting Started Guides for more information on these options.

``-c <COREMASK>`` or ``-l <CORELIST>``
   Set the hexadecimal bitmask of the cores to run on.
   The corelist is a list of cores to use.

``-a <PCI_ID>``
   Attach a PCI based ML device.
   Specific to drivers using a PCI based ML device.

``--vdev <driver>``
   Add a virtual mldev device.
   Specific to drivers using an ML virtual device.


Application Options
~~~~~~~~~~~~~~~~~~~

The following are the command-line options supported by the test application.

``--test <name>``
   Name of the test to execute.
   ML tests are divided into three groups: Device, Model and Inference tests.
   Test name should be one of the following supported tests.

   **ML Device Tests** ::

      device_ops

   **ML Model Tests** ::

      model_ops

   **ML Inference Tests** ::

      inference_ordered
      inference_interleave

``--dev_id <n>``
   Set the device ID of the ML device to be used for the test.
   Default value is ``0``.

``--socket_id <n>``
   Set the socket ID of the application resources.
   Default value is ``SOCKET_ID_ANY``.

``--models <model_list>``
   Set the list of model files to be used for the tests.
   The application expects the ``model_list`` in comma separated form
   (e.g. ``--models model_A.bin,model_B.bin``).
   Maximum number of models supported by the test is ``8``.

``--filelist <file_list>``
   Set the list of model, input, output and reference files to be used for the tests.
   The application expects the ``file_list`` to be in comma separated form
   (i.e. ``--filelist <model,input,output>[,reference]``).

   Multiple filelist entries can be specified when running the tests with multiple models.
   Both quantized and dequantized outputs are written to the disk.
   The dequantized output file has the name specified by the user through the ``--filelist`` option.
   A suffix ``.q`` is appended to the quantized output filename.
   Maximum number of filelist entries supported by the test is ``8``.

``--repetitions <n>``
   Set the number of inference repetitions to be executed in the test for each model.
   Default value is ``1``.

``--burst_size <n>``
   Set the burst size to be used when enqueuing / dequeuing inferences.
   Default value is ``1``.

``--queue_pairs <n>``
   Set the number of queue-pairs to be used for inference enqueue and dequeue operations.
   Default value is ``1``.

``--queue_size <n>``
   Set the size of the queue-pair to be created for inference enqueue / dequeue operations.
   The queue size translates into the ``rte_ml_dev_qp_conf::nb_desc`` field during queue-pair creation.
   Default value is ``1``.

``--tolerance <n>``
   Set the tolerance value in percentage to be used for output validation.
   Default value is ``0``.

``--stats``
   Enable reporting device extended stats.

``--debug``
   Enable the tests to run in debug mode.

``--help``
   Print help message.


ML Device Tests
---------------

ML device tests are functional tests to validate ML device API.
Device tests validate the ML device configure, close, start and stop APIs.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for the ``device_ops`` test are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --queue_pairs
   --queue_size


DEVICE_OPS Test
~~~~~~~~~~~~~~~

Device ops test validates the device configuration and reconfiguration support.
The test configures the ML device based on the options
``--queue_pairs`` and ``--queue_size`` specified by the user,
and later reconfigures the ML device with the number of queue pairs and queue size
based on the maximum specified through the device info.


Example
^^^^^^^

Command to run ``device_ops`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=device_ops

Command to run ``device_ops`` test with user options:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=device_ops --queue_pairs <M> --queue_size <N>


ML Model Tests
--------------

Model tests are functional tests to validate ML model API.
Model tests validate the load, start, stop and unload operations on ML models.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for the ``model_ops`` test are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --models

List of model files to be used for the ``model_ops`` test can be specified
through the option ``--models <model_list>`` as a comma separated list.
Maximum number of models supported in the test is ``8``.

.. note::

   * The ``--models <model_list>`` is a mandatory option for running this test.
   * Options not supported by the test are ignored if specified.
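
The comma separated ``--models`` argument can be assembled from a list of files
with a short shell sketch such as the one below.
It is only an illustration: the model file names are hypothetical placeholders,
and the script prints the application options instead of launching the test.

.. code-block:: shell

   # Hypothetical model file names; replace with real model binaries.
   set -- model_1.bin model_2.bin model_3.bin model_4.bin

   # The model_ops test accepts at most 8 models.
   if [ "$#" -gt 8 ]; then
       echo "too many models: $# (maximum is 8)" >&2
       exit 1
   fi

   # Join the names with commas and no spaces;
   # "$*" uses the first character of IFS as the separator.
   old_ifs=$IFS
   IFS=,
   model_list="$*"
   IFS=$old_ifs

   # Print the application options instead of launching the test.
   echo "--test=model_ops --models ${model_list}"

The printed options can be passed after the ``--`` separator
on the ``dpdk-test-mldev`` command line.
The list must not contain spaces, since the shell would otherwise split it into separate arguments.
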


MODEL_OPS Test
~~~~~~~~~~~~~~

The test is a collection of multiple sub-tests,
each with a different order of slow-path operations
when handling `N` models.

**Sub-test A:**
executes the sequence of load / start / stop / unload for a model in order,
followed by the next model.

.. _figure_mldev_model_ops_subtest_a:

.. figure:: img/mldev_model_ops_subtest_a.*

   Execution sequence of model_ops subtest A.

**Sub-test B:**
executes load for all models, followed by a start for all models.
Upon successful start of all models, stop is invoked for all models followed by unload.

.. _figure_mldev_model_ops_subtest_b:

.. figure:: img/mldev_model_ops_subtest_b.*

   Execution sequence of model_ops subtest B.

**Sub-test C:**
loads all models, followed by a start and stop of all models in order.
Upon completion of stop, unload is invoked for all models.

.. _figure_mldev_model_ops_subtest_c:

.. figure:: img/mldev_model_ops_subtest_c.*

   Execution sequence of model_ops subtest C.

**Sub-test D:**
executes load and start for all models available.
Upon successful start of all models, stop is executed for the models.

.. _figure_mldev_model_ops_subtest_d:

.. figure:: img/mldev_model_ops_subtest_d.*

   Execution sequence of model_ops subtest D.


Example
^^^^^^^

Command to run ``model_ops`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=model_ops --models model_1.bin,model_2.bin,model_3.bin,model_4.bin


ML Inference Tests
------------------

Inference tests are a set of tests to validate end-to-end inference execution on an ML device.
These tests execute the full sequence of operations required to run inferences
with one or multiple models.


Application Options
~~~~~~~~~~~~~~~~~~~

Supported command line options for inference tests are the following::

   --debug
   --test
   --dev_id
   --socket_id
   --filelist
   --repetitions
   --burst_size
   --queue_pairs
   --queue_size
   --tolerance
   --stats

List of files to be used for the inference tests can be specified
through the option ``--filelist <file_list>`` as a comma separated list.
A filelist entry has the format
``--filelist <model_file,input_file,output_file>[,reference_file]``
and is used to specify the list of files required to test with a single model.
Multiple filelist entries are supported by the test, one entry per model.
Maximum number of filelist entries supported by the test is ``8``.

When the ``--burst_size <num>`` option is specified for the test,
each enqueue or dequeue burst call tries to enqueue or dequeue
``num`` inferences respectively.

In the inference test, a pair of lcores is mapped to each queue pair.
Minimum number of lcores required for the tests is equal to ``(queue_pairs * 2 + 1)``.

Output validation of inference is enabled only
when a reference file is specified through the ``--filelist`` option.
The application additionally considers the tolerance value
provided through the ``--tolerance`` option during validation.
When the tolerance value is ``0``, the CRC32 hashes of the inference output
and the reference output are compared.
When the tolerance is non-zero, element wise comparison of the output is performed.
Validation is considered successful only
when all the elements of the output tensor are within the tolerance range specified.

Enabling ``--stats`` prints the extended stats supported by the driver.

.. note::

   * The ``--filelist <file_list>`` is a mandatory option for running inference tests.
   * Options not supported by the tests are ignored if specified.
   * Element wise comparison is not supported when
     the output dtype is either fp8, fp16 or bfloat16.
     This is applicable only when the tolerance is greater than zero
     and for pre-quantized models only.


INFERENCE_ORDERED Test
~~~~~~~~~~~~~~~~~~~~~~

This is a functional test for validating the end-to-end inference execution on an ML device.
This test configures the ML device and queue pairs
as per the queue-pair related options (``--queue_pairs`` and ``--queue_size``) specified by the user.
Upon successful configuration of the device and queue pairs,
the first model specified through the filelist is loaded to the device
and inferences are enqueued by a pool of worker threads to the ML device.
Total number of inferences enqueued for the model is equal to the repetitions specified.
A dedicated pool of worker threads dequeues the inferences from the device.
The model is unloaded upon completion of all inferences for the model.
The test continues loading and executing inference requests for all models
specified through the ``--filelist`` option in an ordered manner.

.. _figure_mldev_inference_ordered:

.. figure:: img/mldev_inference_ordered.*

   Execution of inference_ordered on single model.


Example
^^^^^^^

Example command to run ``inference_ordered`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_ordered --filelist model.bin,input.bin,output.bin

Example command to run ``inference_ordered`` test with a specific burst size:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_ordered --filelist model.bin,input.bin,output.bin \
        --burst_size 12

Example command to run ``inference_ordered`` test with multiple queue-pairs and queue size
(4 queue pairs need at least 9 lcores):

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -l 0-8 -a <PCI_ID> -- \
        --test=inference_ordered --filelist model.bin,input.bin,output.bin \
        --queue_pairs 4 --queue_size 16

Example command to run ``inference_ordered`` with output validation using tolerance of ``1%``:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_ordered --filelist model.bin,input.bin,output.bin,reference.bin \
        --tolerance 1.0


INFERENCE_INTERLEAVE Test
~~~~~~~~~~~~~~~~~~~~~~~~~

This is a stress test for validating the end-to-end inference execution on an ML device.
The test configures the ML device and queue pairs
as per the queue-pair related options (``--queue_pairs`` and ``--queue_size``) specified by the user.
Upon successful configuration of the device and queue pairs,
all models specified through the filelist are loaded to the device.
Inferences for multiple models are enqueued by a pool of worker threads in parallel.
Inference execution by the device is interleaved between multiple models.
Total number of inferences enqueued for a model is equal to the repetitions specified.
An additional pool of threads dequeues the inferences from the device.
Models are unloaded upon completion of inferences for all models loaded.

.. _figure_mldev_inference_interleave:

.. figure:: img/mldev_inference_interleave.*

   Execution of inference_interleave on single model.


Example
^^^^^^^

Example command to run ``inference_interleave`` test:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_interleave --filelist model.bin,input.bin,output.bin

Example command to run ``inference_interleave`` test with multiple models:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_interleave --filelist model_A.bin,input_A.bin,output_A.bin \
        --filelist model_B.bin,input_B.bin,output_B.bin

Example command to run ``inference_interleave`` test
with a specific burst size, multiple queue-pairs and queue size
(8 queue pairs need at least 17 lcores):

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -l 0-16 -a <PCI_ID> -- \
        --test=inference_interleave --filelist model.bin,input.bin,output.bin \
        --queue_pairs 8 --queue_size 12 --burst_size 16

Example command to run ``inference_interleave`` test
with multiple models and output validation using tolerance of ``2.0%``:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_interleave \
        --filelist model_A.bin,input_A.bin,output_A.bin,reference_A.bin \
        --filelist model_B.bin,input_B.bin,output_B.bin,reference_B.bin \
        --tolerance 2.0


Debug mode
----------

ML tests can be executed in debug mode by enabling the option ``--debug``.
Execution of tests in debug mode enables additional prints.

When a validation failure is observed, the output from that buffer is written to the disk,
using the same filename convention as when the test passes.
Additionally, the index of the buffer is appended to the filenames.
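
As an illustration, ``--debug`` can be added to an inference test command
that also supplies a reference file, so that a validation failure can be inspected.
The file names below are placeholders and the core mask is only an example:

.. code-block:: console

   sudo <build_dir>/app/dpdk-test-mldev -c 0xf -a <PCI_ID> -- \
        --test=inference_ordered --filelist model.bin,input.bin,output.bin,reference.bin \
        --tolerance 1.0 --debug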