# README for the matrix-vector multiplication demo code

## Synopsis

This program implements the multiplication of a matrix and a vector. It is
written in C and has been parallelized using the Pthreads parallel programming
model. Each thread gets assigned a contiguous set of rows of the matrix to
work on and the results are stored in the output vector.

The code initializes the data, executes the matrix-vector multiplication, and
checks the correctness of the results. In case of an error, a message to this
effect is printed and the program aborts. Otherwise it prints a one-line
message on the screen.

## About this code

This is a standalone program, not a library. It is meant as a simple example to
experiment with gprofng.

## Directory structure

There are four directories:

1. `bindir` - after the build, it contains the executable.

2. `experiments` - after the installation, it contains the executable and
also has an example profiling script called `profile.sh`.

3. `objects` - after the build, it contains the object files.

4. `src` - contains the source code and the makefile to build, install,
and check correct functioning of the executable.

## Code internals

This is the main execution flow:

* Parse the user options.
* Compute the internal settings for the algorithm.
* Initialize the data and compute the reference results needed for the correctness
check.
* Create and execute the threads. Each thread performs the matrix-vector
multiplication on a pre-determined set of rows.
* Verify the results are correct.
* Print statistics and release the allocated memory.

## Installation

The Makefile in the `src` subdirectory can be used to build, install and check the
code.

Use `make` at the command line to (re)build the executable called `mxv-pthreads`.
It will be
stored in the directory `bindir`:

```
$ make
gcc -o ../objects/main.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes main.c
gcc -o ../objects/manage_data.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes manage_data.c
gcc -o ../objects/workload.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes workload.c
gcc -o ../objects/mxv.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes mxv.c
gcc -o ../bindir/mxv-pthreads ../objects/main.o ../objects/manage_data.o ../objects/workload.o ../objects/mxv.o -lm -lpthread
ldd ../bindir/mxv-pthreads
        linux-vdso.so.1 (0x0000ffff9ea8b000)
        libm.so.6 => /lib64/libm.so.6 (0x0000ffff9e9ad000)
        libc.so.6 => /lib64/libc.so.6 (0x0000ffff9e7ff000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffff9ea4e000)
$
```
The `make install` command installs the executable in directory `experiments`.

```
$ make install
Installed mxv-pthreads in ../experiments
$
```
The `make check` command may be used to verify the program works as expected:

```
$ make check
Running mxv-pthreads in ../experiments
mxv: error check passed - rows = 1000 columns = 1500 threads = 2
$
```
The `make clean` command removes the object files from the `objects` directory
and the executable from the `bindir` directory.

The `make veryclean` command implies `make clean`, but also removes the
executable from directory `experiments`.

## Usage

The code takes several options, but all have a default value. If the code is
executed without any options, these defaults will be used.
To get an overview of
all the options supported, and the defaults, use the `-h` option:

```
$ ./mxv-pthreads -h
Usage: ./mxv-pthreads [-m <number of rows>] [-n <number of columns] [-r <repeat count>] [-t <number of threads] [-v] [-h]
  -m - number of rows, default = 2000
  -n - number of columns, default = 3000
  -r - the number of times the algorithm is repeatedly executed, default = 200
  -t - the number of threads used, default = 1
  -v - enable verbose mode, off by default
  -h - print this usage overview and exit
$
```

For more extensive run-time diagnostic messages use the `-v` option.

As an example, these are the options to compute the product of a 2000x1000 matrix
with a vector of length 1000 and use 4 threads. Verbose mode has been enabled:

```
$ ./mxv-pthreads -m 2000 -n 1000 -t 4 -v
Verbose mode enabled
Allocated data structures
Initialized matrix and vectors
Defined workload distribution
Assigned work to threads
Thread 0 has been created
Thread 1 has been created
Thread 2 has been created
Thread 3 has been created
Matrix vector multiplication has completed
Verify correctness of result
Error check passed
mxv: error check passed - rows = 2000 columns = 1000 threads = 4
$
```

## Executing the examples

Directory `experiments` contains the `profile.sh` script. This script
checks that gprofng can be found and that the executable has been installed.

The script will then run a data collection experiment, followed by a series
of invocations of `gprofng display text` to show various views. The results
are printed on stdout.

To include the commands executed in the output of the script, and store the
results in a file called `LOG`, execute the script as follows:

```
$ bash -x ./profile.sh >& LOG
```

## Additional comments

* Compiler-based inlining is disabled to make the call tree look more
interesting. For the same reason, the core multiplication function
`mxv_core` has inlining disabled through the `__attribute__ ((noinline))`
attribute. Of course you're free to change this. It does not affect
the workings of the code.

* This distribution includes a script called `profile.sh`. It is in the
`experiments` directory and meant as an example for (new) users of gprofng.
It can be used to produce profiles at the command line. It is also suitable
as a starting point to develop your own profiling script(s).
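
The row-wise work distribution described in the Synopsis, combined with a `noinline` kernel, can be sketched as follows. This is a minimal illustration and not the code in `src`: only the name `mxv_core` and the `noinline` attribute come from this README. The argument list of `mxv_core`, the `thread_arg_t` structure, and the functions `thread_entry`, `mxv_parallel`, and `mxv_self_check` are hypothetical, and the actual sources may organize the work differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread work descriptor (not taken from the demo sources). */
typedef struct {
    int row_start;     /* first row owned by this thread (inclusive) */
    int row_end;       /* end of the row range (exclusive) */
    int cols;          /* number of columns */
    const double *A;   /* rows x cols matrix, stored row-major */
    const double *b;   /* input vector of length cols */
    double *c;         /* output vector of length rows */
} thread_arg_t;

/* Kernel kept out of line (GCC/Clang attribute) so that it appears as a
   separate function in the call tree. The signature is an assumption. */
void __attribute__ ((noinline))
mxv_core(int row_start, int row_end, int cols,
         const double *A, const double *b, double *c)
{
    for (int i = row_start; i < row_end; i++) {
        double sum = 0.0;
        for (int j = 0; j < cols; j++)
            sum += A[(size_t)i * cols + j] * b[j];
        c[i] = sum;
    }
}

/* Thread start routine: unpack the descriptor and run the kernel. */
static void *thread_entry(void *p)
{
    thread_arg_t *t = p;
    mxv_core(t->row_start, t->row_end, t->cols, t->A, t->b, t->c);
    return NULL;
}

/* Split the rows into contiguous blocks, one block per thread.
   Returns 0 on success and -1 if allocation or thread creation fails. */
int mxv_parallel(int rows, int cols, int num_threads,
                 const double *A, const double *b, double *c)
{
    assert(rows > 0 && cols > 0 && num_threads > 0);

    pthread_t *tid = malloc((size_t)num_threads * sizeof *tid);
    thread_arg_t *arg = malloc((size_t)num_threads * sizeof *arg);
    if (tid == NULL || arg == NULL) {
        free(tid);
        free(arg);
        return -1;
    }

    int chunk = rows / num_threads;   /* base number of rows per thread */
    int rem = rows % num_threads;     /* the first rem threads get one extra row */
    int next = 0, created = 0, status = 0;

    for (int t = 0; t < num_threads; t++) {
        int n = chunk + (t < rem ? 1 : 0);
        arg[t] = (thread_arg_t){ next, next + n, cols, A, b, c };
        next += n;
        if (pthread_create(&tid[t], NULL, thread_entry, &arg[t]) != 0) {
            status = -1;
            break;
        }
        created++;
    }
    /* Wait for every thread that was actually started. */
    for (int t = 0; t < created; t++)
        pthread_join(tid[t], NULL);

    free(tid);
    free(arg);
    return status;
}

/* Self-check: returns the number of wrong entries, or -1 on failure. */
int mxv_self_check(int rows, int cols, int num_threads)
{
    double *A = malloc((size_t)rows * cols * sizeof *A);
    double *b = malloc((size_t)cols * sizeof *b);
    double *c = malloc((size_t)rows * sizeof *c);
    int errors = -1;

    if (A != NULL && b != NULL && c != NULL) {
        for (int j = 0; j < cols; j++)
            b[j] = 1.0;
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                A[(size_t)i * cols + j] = (double)(i + 1);

        if (mxv_parallel(rows, cols, num_threads, A, b, c) == 0) {
            errors = 0;
            /* Row i sums cols copies of (i + 1), which is exact in double. */
            for (int i = 0; i < rows; i++)
                if (c[i] != (double)(i + 1) * cols)
                    errors++;
        }
    }
    free(A);
    free(b);
    free(c);
    return errors;
}
```

Calling `mxv_self_check(2000, 1000, 4)` from a small `main` function uses the same matrix size and thread count as the verbose example in the Usage section. Since each thread writes to a disjoint range of the output vector, the sketch needs no locking. When built with gcc, the `-pthread` option (or `-lpthread` at link time, as in the build output shown earlier) may be needed.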