Code Heatmaps

BOLT can print code heatmaps built from sampled LBR profiles collected by perf. The output is colored ASCII, intended for display in a color-capable terminal.

Heatmaps can be generated for BOLTed and non-BOLTed binaries. You can use them to compare the code layout before and after optimizations.

To generate a heatmap, start by running your application under perf:

$ perf record -e cycles:u -j any,u -- <executable with args>

or, if you want to monitor one or more existing processes:

$ perf record -e cycles:u -j any,u [-p PID|-a] -- sleep <interval>

Note that, at the moment, profiling with LBR (-j any,u or -b) is required.

Once the run is complete and perf.data has been generated, run llvm-bolt-heatmap:

$ llvm-bolt-heatmap -p perf.data <executable>

By default, the heatmap is written to stdout. Use the -o <heatmapfile> option to write it to a file instead. Each character (block) in the heatmap shows the execution data accumulated for the corresponding 64 bytes of code. You can change this granularity with the -block-size option; for example, set it to 4096 to see code usage grouped by 4K pages. Other useful options are:

-line-size=<uint>   - number of entries per line (default 256)
-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
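
To get a feel for how -block-size and -line-size interact, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not code taken from BOLT: the helper names (bytes_per_line, locate) are hypothetical, and the sketch assumes each heatmap line starts at an address aligned to the span of one line, which the actual tool output may not guarantee.

```python
def bytes_per_line(block_size=64, line_size=256):
    """Bytes of code covered by one heatmap line.

    Defaults mirror the documented defaults: 64-byte blocks, 256 entries per line.
    """
    return block_size * line_size


def locate(address, block_size=64, line_size=256):
    """Return (line_start_address, column) for the block containing `address`.

    Assumption (not confirmed by the tool's documentation): every line begins
    at an address that is a multiple of bytes_per_line().
    """
    span = bytes_per_line(block_size, line_size)
    line_start = address - (address % span)
    column = (address - line_start) // block_size
    return line_start, column


if __name__ == "__main__":
    print(bytes_per_line())           # default settings: 16 KiB per line
    print(bytes_per_line(4096))       # 4K pages: 1 MiB per line
    start, col = locate(0x401000)
    print(hex(start), col)
```

With the defaults, one line therefore covers 64 * 256 = 16384 bytes. With -block-size=4096 and the default line size, one line covers 1 MiB, which is a convenient scale for large binaries.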

If you prefer to look at the data in a browser (or would like to share it that way), you can convert the heatmap file with an ANSI-to-HTML conversion tool. For example:

$ aha -b -f <heatmapfile> > <heatmapfile>.html
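
The steps above can also be strung together from a script. The following is a minimal Python sketch, assuming perf, llvm-bolt-heatmap and aha are all on PATH. The function names and the default file name heatmap.txt are hypothetical; the command-line flags are only the ones shown in this document.

```python
import subprocess


def perf_record_cmd(target_argv):
    """perf invocation from the text: user-space cycles with LBR sampling."""
    return ["perf", "record", "-e", "cycles:u", "-j", "any,u", "--", *target_argv]


def heatmap_cmd(binary, perf_data="perf.data", out_file=None, block_size=None):
    """llvm-bolt-heatmap invocation; -o and -block-size are added only if set."""
    cmd = ["llvm-bolt-heatmap", "-p", perf_data]
    if out_file is not None:
        cmd += ["-o", out_file]
    if block_size is not None:
        cmd.append(f"-block-size={block_size}")
    cmd.append(binary)
    return cmd


def aha_cmd(heatmap_file):
    """aha invocation from the text; the HTML is written to stdout."""
    return ["aha", "-b", "-f", heatmap_file]


def run_pipeline(binary, args, heatmap_file="heatmap.txt"):
    """Profile `binary`, dump the heatmap to a file, then convert it to HTML."""
    subprocess.run(perf_record_cmd([binary, *args]), check=True)
    subprocess.run(heatmap_cmd(binary, out_file=heatmap_file), check=True)
    # Equivalent of the shell redirection `> <heatmapfile>.html`.
    with open(heatmap_file + ".html", "w") as html:
        subprocess.run(aha_cmd(heatmap_file), stdout=html, check=True)
```

Calling run_pipeline("./myapp", ["--some-arg"]) (both values are placeholders) would leave perf.data, heatmap.txt and heatmap.txt.html in the current directory. Running it once on the original binary and once on the BOLTed one, with different heatmap_file values, gives two HTML files to compare side by side.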