# Source: llvm-project/flang/examples/FlangOmpReport/yaml_summarizer.py (revision d4a0154902fb9b0611ed857134b26a64a1d5ad1e)
"""YAML Summariser

The flang plugin ``flang-omp-report`` takes one Fortran
file in and returns a YAML report file of the input file.
This becomes an issue when you want to combine the reports
for an entire project into one final report.
The purpose of this Python script is to generate a final YAML
summary from all of the files generated by ``flang-omp-report``.

Currently, it requires ``ruamel.yaml``,
which can be installed with:

    ``pip3 install ruamel.yaml``

By default it scans the directory it is run in
for any YAML files and outputs a summary to
stdout. It can be run as:

    ``python3 yaml_summarizer.py``

Parameters:

    -d   --directory   Specify which directory to scan. Multiple directories can be searched by
                       providing a semicolon-separated list of directories.

    -l   --log         Combine all yaml files into one log (instead of generating a summary)

    -o   --output      Specify a file, or a directory, in which to save the result
                       (a directory receives a file named ``summary.yaml``)

    -r   --recursive   Recursively search directory for all yaml files

Examples:

    ``python3 yaml_summarizer.py -d ~/llvm-project/build/ -r``

    ``python3 yaml_summarizer.py -d "$HOME/llvm-project/build/;$HOME/llvm-project/flang/test/Examples"``

    ``python3 yaml_summarizer.py -l -o ~/examples/report.yaml``

Pseudo-examples:

    Summary:

    $ python3 yaml_summarizer.py file_1.yaml file_2.yaml
    <Unique OMP constructs with their grouped clauses from file_1.yaml and file_2.yaml>

    Constructs are in the form:
    - construct: someOMPConstruct
      count: 8
      clauses:
      - clause: clauseOne
        count: 4
      - clause: clauseTwo
        count: 2
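
The counting behind the summary form can be sketched as follows. This is a
minimal, standalone illustration and not the script's own implementation: the
helper ``summarise`` and the sample records are hypothetical, and the records
are assumed to carry ``construct`` and ``clauses`` keys like the entries the
script reads. Unlike the script, the sketch keeps an empty ``clauses`` list.

```python
def summarise(records):
    # Hypothetical helper: count each construct and each clause attached to it.
    summary = {}
    for record in records:
        entry = summary.setdefault(record["construct"], {"count": 0, "clauses": {}})
        entry["count"] += 1
        for clause in record.get("clauses", []):
            name = clause["clause"]
            entry["clauses"][name] = entry["clauses"].get(name, 0) + 1
    # Convert the working dictionaries into the list form shown above.
    return [
        {
            "construct": construct,
            "count": entry["count"],
            "clauses": [
                {"clause": name, "count": count}
                for name, count in entry["clauses"].items()
            ],
        }
        for construct, entry in summary.items()
    ]


# Made-up sample records, for illustration only.
records = [
    {"construct": "parallel", "line": 3,
     "clauses": [{"clause": "private", "details": "x"}]},
    {"construct": "parallel", "line": 9,
     "clauses": [{"clause": "private", "details": "y"},
                 {"clause": "shared", "details": "z"}]},
    {"construct": "barrier", "line": 15, "clauses": []},
]
result = summarise(records)
```

Here ``result`` holds one entry per unique construct, each with its own count
and a per-clause count, which is the shape of the summary form.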

    Log:

    $ python3 yaml_summarizer.py -l file_1.yaml file_2.yaml
    file_1.yaml
    <OMP clauses and constructs from file_1.yaml>
    file_2.yaml
    <OMP clauses and constructs from file_2.yaml>

    Constructs are in the form:
    - construct: someOMPConstruct
      line: 12
      clauses:
      - clause: clauseOne
        details: 'someDetailForClause'
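
The grouping behind the log form can be sketched in the same spirit. This is a
minimal sketch, assuming every record carries ``file``, ``construct``, ``line``
and ``clauses`` keys; the helper ``group_by_file`` and the file names are
hypothetical and exist only for illustration.

```python
def group_by_file(records):
    # Hypothetical helper: map each source file to the constructs reported for it.
    log = {}
    for record in records:
        entry = {"construct": record["construct"], "line": record["line"]}
        # Keep the output compact: omit "clauses" when there are none.
        if record.get("clauses"):
            entry["clauses"] = record["clauses"]
        log.setdefault(record["file"], []).append(entry)
    return log


# Made-up sample records, for illustration only.
records = [
    {"file": "file_1.f90", "construct": "parallel", "line": 12,
     "clauses": [{"clause": "private", "details": "x"}]},
    {"file": "file_1.f90", "construct": "barrier", "line": 20, "clauses": []},
    {"file": "file_2.f90", "construct": "do", "line": 7, "clauses": []},
]
log = group_by_file(records)
```

Each key of ``log`` is a file name and each value is the list of constructs
found in that file, in the order they were read.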
"""

import sys
import glob
import argparse
from pathlib import Path
from os.path import isdir

from ruamel.yaml import YAML


def find_yaml_files(search_directory: Path, search_pattern: str):
    """
    Find all '.yaml' files and return an iglob iterator to them.

    Keyword arguments:
    search_directory -- Directory to search in. If this is 'None', the current
                        working directory is searched instead.
    search_pattern -- Search pattern for 'iglob' to use for finding '.yaml' files,
                      e.g. '*.yaml' or '**/*.yaml'.
    """
    # @TODO: Currently *all* yaml files are read - regardless of whether they have
    # been generated with 'flang-omp-report' or not. This might result in the script
    # reading files that it should ignore.
    if search_directory:
        return glob.iglob(
            str(search_directory.joinpath(search_pattern)), recursive=True
        )

    # Fall back to the current working directory, as documented above. Prefixing
    # the pattern with "/" would search the filesystem root instead.
    return glob.iglob(str(Path.cwd().joinpath(search_pattern)), recursive=True)


def process_log(data, result: dict):
    """
    Process the data input as a 'log' to the result dictionary. This essentially just
    stitches together all of the input '.yaml' files into one result.

    Keyword arguments:
    data -- Data from yaml.load() for a yaml file. So the type can be 'Any'.
    result -- Dictionary, keyed by source file name, to add the processed data to.
    """
    for datum in data:
        items = result.get(datum["file"], [])
        items.append(
            {
                "construct": datum["construct"],
                "line": datum["line"],
                "clauses": datum["clauses"],
            }
        )
        result[datum["file"]] = items


def add_clause(datum, construct):
    """
    Add clauses to the construct if they're missing.
    Otherwise increment their count by one.

    Keyword arguments:
    datum -- Data construct containing clauses to check.
    construct -- Construct to add or increment clause count.
    """
    to_check = [i["clause"] for i in construct["clauses"]]
    to_add = [i["clause"] for i in datum["clauses"]]
    clauses = construct["clauses"]
    for item in to_add:
        if item in to_check:
            for clause in clauses:
                if clause["clause"] == item:
                    clause["count"] += 1
        else:
            clauses.append({"clause": item, "count": 1})


def process_summary(data, result: list):
    """
    Process the data input as a 'summary' to the 'result' list.

    Keyword arguments:
    data -- Data from yaml.load() for a yaml file. So the type can be 'Any'.
    result -- List of per-construct summaries to add the processed data to.
    """
    for datum in data:
        construct = next(
            (item for item in result if item["construct"] == datum["construct"]), None
        )
        clauses = []
        # Add the construct and clauses to the summary if
        # they haven't been seen before
        if not construct:
            for i in datum["clauses"]:
                clauses.append({"clause": i["clause"], "count": 1})
            result.append(
                {"construct": datum["construct"], "count": 1, "clauses": clauses}
            )
        else:
            construct["count"] += 1

            add_clause(datum, construct)


def clean_summary(result):
    """Cleans the result after processing the yaml files with summary format."""
    # Remove all "clauses" that are empty to keep things compact
    for construct in result:
        if construct["clauses"] == []:
            construct.pop("clauses")


def clean_log(result):
    """Cleans the result after processing the yaml files with log format."""
    for constructs in result.values():
        for construct in constructs:
            if construct["clauses"] == []:
                construct.pop("clauses")


def output_result(yaml: YAML, result, output_file: Path):
    """
    Outputs result to either 'stdout' or to an output file.

    Keyword arguments:
    yaml -- YAML instance used to serialise the result.
    result -- Formatted result to output.
    output_file -- File to output result to. If this is 'None' then result will be
                   written to 'stdout'.
    """
    if output_file:
        with open(output_file, "w+", encoding="utf-8") as file:
            if output_file.suffix == ".yaml":
                yaml.dump(result, file)
            else:
                # 'result' is a list or dict, and file.write() only accepts
                # strings, so convert it before writing.
                file.write(str(result))
    else:
        yaml.dump(result, sys.stdout)


def process_yaml(
    search_directories: list, search_pattern: str, result_format: str, output_file: Path
):
    """
    Reads each yaml file, calls the appropriate format function for
    the file and then outputs the result to either 'stdout' or to an output file.

    Keyword arguments:
    search_directories -- List of directory paths to search for '.yaml' files in.
    search_pattern -- String pattern formatted for use with glob.iglob to find all
                      '.yaml' files.
    result_format -- String representing output format. 'log' selects the log format;
                     any other value selects the summary format.
    output_file -- Path to output file (If value is None, then default to outputting to 'stdout').
    """
    if result_format == "log":
        result = {}
        action = process_log
        clean_report = clean_log
    else:
        result = []
        action = process_summary
        clean_report = clean_summary

    yaml = YAML()

    for search_directory in search_directories:
        for file in find_yaml_files(search_directory, search_pattern):
            with open(file, "r", encoding="utf-8") as yaml_file:
                data = yaml.load(yaml_file)
                action(data, result)

    if clean_report is not None:
        clean_report(result)

    output_result(yaml, result, output_file)


def create_arg_parser():
    """Create and return an argparse.ArgumentParser configured for this script."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-d", "--directory", help="Specify a directory to scan", dest="dir", type=str
    )
    parser.add_argument(
        "-o",
        "--output",
        help="Write to a file instead of stdout",
        dest="output",
        type=str,
    )
    parser.add_argument(
        "-r",
        "--recursive",
        help="Recursive search for .yaml files",
        dest="recursive",
        # A plain flag: 'type=bool' would treat any non-empty value,
        # including "False", as True.
        action="store_true",
    )

    exclusive_parser = parser.add_mutually_exclusive_group()
    exclusive_parser.add_argument(
        "-l",
        "--log",
        help="Modifies report format: Combines the log '.yaml' files into one file.",
        action="store_true",
        dest="log",
    )
    return parser


def parse_arguments():
    """Parses arguments given to script and returns a tuple of processed arguments."""
    parser = create_arg_parser()
    args = parser.parse_args()

    if args.dir:
        search_directory = [Path(path) for path in args.dir.split(";")]
    else:
        search_directory = [Path.cwd()]

    if args.recursive:
        search_pattern = "**/*.yaml"
    else:
        search_pattern = "*.yaml"

    if args.log:
        result_format = "log"
    else:
        result_format = "summary"

    if args.output:
        if isdir(args.output):
            output_file = Path(args.output).joinpath("summary.yaml")
        elif isdir(Path(args.output).resolve().parent):
            output_file = Path(args.output)
        else:
            # Without this branch 'output_file' would be unbound at the return below.
            parser.error(f"cannot write to '{args.output}': no such directory")
    else:
        output_file = None

    return (search_directory, search_pattern, result_format, output_file)


def main():
    """Main function of script."""
    (search_directory, search_pattern, result_format, output_file) = parse_arguments()

    process_yaml(search_directory, search_pattern, result_format, output_file)

    return 0


if __name__ == "__main__":
    sys.exit(main())