|  | .. role:: raw-html(raw) | 
|  | :format: html | 
|  |  | 
|  | ================================= | 
|  | LLVM Code Coverage Mapping Format | 
|  | ================================= | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | LLVM's code coverage mapping format is used to provide code coverage | 
|  | analysis using LLVM's and Clang's instrumentation based profiling | 
|  | (Clang's ``-fprofile-instr-generate`` option). | 
|  |  | 
|  | This document is aimed at those who would like to know how LLVM's code coverage | 
|  | mapping works under the hood. A prior knowledge of how Clang's profile guided | 
|  | optimization works is useful, but not required. For those interested in using | 
|  | LLVM to provide code coverage analysis for their own programs, see the `Clang | 
|  | documentation <https://clang.llvm.org/docs/SourceBasedCodeCoverage.html>`. | 
|  |  | 
|  | We start by briefly describing LLVM's code coverage mapping format and the | 
|  | way that Clang and LLVM's code coverage tool work with this format. After | 
|  | the basics are down, more advanced features of the coverage mapping format | 
|  | are discussed - such as the data structures, LLVM IR representation and | 
|  | the binary encoding. | 
|  |  | 
|  | High Level Overview | 
|  | =================== | 
|  |  | 
|  | LLVM's code coverage mapping format is designed to be a self contained | 
|  | data format that can be embedded into the LLVM IR and into object files. | 
|  | It's described in this document as a **mapping** format because its goal is | 
|  | to store the data that is required for a code coverage tool to map between | 
|  | the specific source ranges in a file and the execution counts obtained | 
|  | after running the instrumented version of the program. | 
|  |  | 
|  | The mapping data is used in two places in the code coverage process: | 
|  |  | 
|  | 1. When clang compiles a source file with ``-fcoverage-mapping``, it | 
|  | generates the mapping information that describes the mapping between the | 
|  | source ranges and the profiling instrumentation counters. | 
|  | This information gets embedded into the LLVM IR and conveniently | 
|  | ends up in the final executable file when the program is linked. | 
|  |  | 
|  | 2. It is also used by *llvm-cov* - the mapping information is extracted from an | 
|  | object file and is used to associate the execution counts (the values of the | 
|  | profile instrumentation counters), and the source ranges in a file. | 
|  | After that, the tool is able to generate various code coverage reports | 
|  | for the program. | 
|  |  | 
|  | The coverage mapping format aims to be a "universal format" that would be | 
|  | suitable for usage by any frontend, and not just by Clang. It also aims to | 
|  | provide the frontend the possibility of generating the minimal coverage mapping | 
|  | data in order to reduce the size of the IR and object files - for example, | 
|  | instead of emitting mapping information for each statement in a function, the | 
|  | frontend is allowed to group the statements with the same execution count into | 
|  | regions of code, and emit the mapping information only for those regions. | 
|  |  | 
|  | Advanced Concepts | 
|  | ================= | 
|  |  | 
|  | The remainder of this guide is meant to give you insight into the way the | 
|  | coverage mapping format works. | 
|  |  | 
|  | The coverage mapping format operates on a per-function level as the | 
|  | profile instrumentation counters are associated with a specific function. | 
|  | For each function that requires code coverage, the frontend has to create | 
|  | coverage mapping data that can map between the source code ranges and | 
|  | the profile instrumentation counters for that function. | 
|  |  | 
|  | Mapping Region | 
|  | -------------- | 
|  |  | 
|  | The function's coverage mapping data contains an array of mapping regions. | 
|  | A mapping region stores the `source code range`_ that is covered by this region, | 
|  | the `file id <coverage file id_>`_, the `coverage mapping counter`_ and | 
|  | the region's kind. | 
|  | There are several kinds of mapping regions: | 
|  |  | 
|  | * Code regions associate portions of source code and `coverage mapping | 
|  | counters`_. They make up the majority of the mapping regions. They are used | 
|  | by the code coverage tool to compute the execution counts for lines, | 
|  | highlight the regions of code that were never executed, and to obtain | 
|  | the various code coverage statistics for a function. | 
|  | For example: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{    </span> <span class='c1'>// Code Region from 1:40 to 9:2</span> | 
|  | <span style='background-color:#4A789C'>                                            </span> | 
|  | <span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                         </span>   <span class='c1'>// Code Region from 3:17 to 5:4</span> | 
|  | <span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);              </span> | 
|  | <span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                                </span>   <span class='c1'>// Code Region from 5:10 to 7:4</span> | 
|  | <span style='background-color:#F6D55D'>    printf("\n");                         </span> | 
|  | <span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                         </span> | 
|  | <span style='background-color:#4A789C'>  return 0;                                 </span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  | * Skipped regions are used to represent source ranges that were skipped | 
|  | by Clang's preprocessor. They don't associate with | 
|  | `coverage mapping counters`_, as the frontend knows that they are never | 
|  | executed. They are used by the code coverage tool to mark the skipped lines | 
|  | inside a function as non-code lines that don't have execution counts. | 
|  | For example: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{               </span> <span class='c1'>// Code Region from 1:12 to 6:2</span> | 
|  | <span style='background-color:#85C1F5'>#ifdef DEBUG             </span>   <span class='c1'>// Skipped Region from 2:1 to 4:2</span> | 
|  | <span style='background-color:#85C1F5'>  printf("Hello world"); </span> | 
|  | <span style='background-color:#85C1F5'>#</span><span style='background-color:#4A789C'>endif                     </span> | 
|  | <span style='background-color:#4A789C'>  return 0;                </span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  | * Expansion regions are used to represent Clang's macro expansions. They | 
|  | have an additional property - *expanded file id*. This property can be | 
|  | used by the code coverage tool to find the mapping regions that are created | 
|  | as a result of this macro expansion, by checking if their file id matches the | 
|  | expanded file id. They don't associate with `coverage mapping counters`_, | 
|  | as the code coverage tool can determine the execution count for this region | 
|  | by looking up the execution count of the first region with a corresponding | 
|  | file id. | 
|  | For example: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int func(int x) </span><span style='background-color:#4A789C'>{                             </span> | 
|  | <span style='background-color:#4A789C'>  #define MAX(x,y) </span><span style='background-color:#85C1F5'>((x) > (y)? </span><span style='background-color:#F6D55D'>(x)</span><span style='background-color:#85C1F5'> : </span><span style='background-color:#F4BA70'>(y)</span><span style='background-color:#85C1F5'>)</span><span style='background-color:#4A789C'>     </span> | 
|  | <span style='background-color:#4A789C'>  return </span><span style='background-color:#7FCA9F'>MAX</span><span style='background-color:#4A789C'>(x, 42);                          </span> <span class='c1'>// Expansion Region from 3:10 to 3:13</span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  | * Branch regions associate instrumentable branch conditions in the source code | 
|  | with a `coverage mapping counter`_ to track how many times an individual | 
|  | condition evaluated to 'true' and another `coverage mapping counter`_ to | 
|  | track how many times that condition evaluated to false.  Instrumentable | 
|  | branch conditions may comprise larger boolean expressions using boolean | 
|  | logical operators.  The 'true' and 'false' cases reflect unique branch paths | 
|  | that can be traced back to the source code. | 
|  | For example: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int func(int x, int y) { | 
|  | <span>  if (<span style='background-color:#4A789C'>(x > 1)</span> || <span style='background-color:#4A789C'>(y > 3)</span>) {</span>  <span class='c1'>// Branch Region from 3:6 to 3:12</span> | 
|  | <span>                             </span><span class='c1'>// Branch Region from 3:17 to 3:23</span> | 
|  | <span>    printf("%d\n", x);              </span> | 
|  | <span>  } else {                                </span> | 
|  | <span>    printf("\n");                         </span> | 
|  | <span>  }</span> | 
|  | <span>  return 0;                                 </span> | 
|  | <span>}</span> | 
|  | </pre>` | 
|  |  | 
|  | * Decision regions associate multiple branch regions with a boolean | 
|  | expression in the source code.  This information also includes the number of | 
|  | bitmap bits needed to represent the expression's executed test vectors as | 
|  | well as the total number of instrumentable branch conditions that comprise | 
|  | the expression.  Decision regions are used to visualize Modified | 
|  | Condition/Decision Coverage (MC/DC) in *llvm-cov* for each boolean | 
|  | expression.  When decision regions are used, control flow IDs are assigned to | 
|  | each associated branch region. One ID represents the current branch | 
|  | condition, and two additional IDs represent the next branch condition in the | 
|  | control flow given a true or false evaluation, respectively.  This allows | 
|  | *llvm-cov* to reconstruct the control flow around the conditions in order to | 
|  | comprehend the full list of potential executable test vectors. | 
|  |  | 
|  | .. _source code range: | 
|  |  | 
|  | Source Range: | 
|  | ^^^^^^^^^^^^^ | 
|  |  | 
|  | The source range record contains the starting and ending location of a certain | 
|  | mapping region. Both locations include the line and the column numbers. | 
|  |  | 
|  | .. _coverage file id: | 
|  |  | 
|  | File ID: | 
|  | ^^^^^^^^ | 
|  |  | 
|  | The file id an integer value that tells us | 
|  | in which source file or macro expansion is this region located. | 
|  | It enables Clang to produce mapping information for the code | 
|  | defined inside macros, like this example demonstrates: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>void func(const char *str) </span><span style='background-color:#4A789C'>{        </span> <span class='c1'>// Code Region from 1:28 to 6:2 with file id 0</span> | 
|  | <span style='background-color:#4A789C'>  #define PUT </span><span style='background-color:#85C1F5'>printf("%s\n", str)</span><span style='background-color:#4A789C'>   </span> <span class='c1'>// 2 Code Regions from 2:15 to 2:34 with file ids 1 and 2</span> | 
|  | <span style='background-color:#4A789C'>  if(*str)                          </span> | 
|  | <span style='background-color:#4A789C'>    </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                            </span> <span class='c1'>// Expansion Region from 4:5 to 4:8 with file id 0 that expands a macro with file id 1</span> | 
|  | <span style='background-color:#4A789C'>  </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                              </span> <span class='c1'>// Expansion Region from 5:3 to 5:6 with file id 0 that expands a macro with file id 2</span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  |  | 
|  | .. _coverage mapping counter: | 
|  | .. _coverage mapping counters: | 
|  |  | 
|  | Counter: | 
|  | ^^^^^^^^ | 
|  |  | 
|  | A coverage mapping counter can represent a reference to the profile | 
|  | instrumentation counter. The execution count for a region with such counter | 
|  | is determined by looking up the value of the corresponding profile | 
|  | instrumentation counter. | 
|  |  | 
|  | It can also represent a binary arithmetical expression that operates on | 
|  | coverage mapping counters or other expressions. | 
|  | The execution count for a region with an expression counter is determined by | 
|  | evaluating the expression's arguments and then adding them together or | 
|  | subtracting them from one another. | 
|  | In the example below, a subtraction expression is used to compute the execution | 
|  | count for the compound statement that follows the *else* keyword: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{   </span> <span class='c1'>// Region's counter is a reference to the profile counter #0</span> | 
|  | <span style='background-color:#4A789C'>                                           </span> | 
|  | <span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                        </span>   <span class='c1'>// Region's counter is a reference to the profile counter #1</span> | 
|  | <span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);             </span><span>   </span> | 
|  | <span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                               </span>   <span class='c1'>// Region's counter is an expression (reference to the profile counter #0 - reference to the profile counter #1)</span> | 
|  | <span style='background-color:#F6D55D'>    printf("\n");                        </span> | 
|  | <span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                        </span> | 
|  | <span style='background-color:#4A789C'>  return 0;                                </span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  |  | 
|  | Finally, a coverage mapping counter can also represent an execution count of | 
|  | of zero. The zero counter is used to provide coverage mapping for | 
|  | unreachable statements and expressions, like in the example below: | 
|  |  | 
|  | :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{                  </span> | 
|  | <span style='background-color:#4A789C'>  return 0;                   </span> | 
|  | <span style='background-color:#4A789C'>  </span><span style='background-color:#85C1F5'>printf("Hello world!\n")</span><span style='background-color:#4A789C'>;   </span> <span class='c1'>// Unreachable region's counter is zero</span> | 
|  | <span style='background-color:#4A789C'>}</span> | 
|  | </pre>` | 
|  |  | 
|  | The zero counters allow the code coverage tool to display proper line execution | 
|  | counts for the unreachable lines and highlight the unreachable code. | 
|  | Without them, the tool would think that those lines and regions were still | 
|  | executed, as it doesn't possess the frontend's knowledge. | 
|  |  | 
|  | Note that branch regions are created to track branch conditions in the source | 
|  | code and refer to two coverage mapping counters, one to track the number of | 
|  | times the branch condition evaluated to "true", and one to track the number of | 
|  | times the branch condition evaluated to "false". | 
|  |  | 
|  | LLVM IR Representation | 
|  | ====================== | 
|  |  | 
|  | The coverage mapping data is stored in the LLVM IR using a global constant | 
|  | structure variable called *__llvm_coverage_mapping* with the *IPSK_covmap* | 
|  | section specifier (i.e. ".lcovmap$M" on Windows and "__llvm_covmap" elsewhere). | 
|  |  | 
|  | For example, let’s consider a C file and how it gets compiled to LLVM: | 
|  |  | 
|  | .. _coverage mapping sample: | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | int foo() { | 
|  | return 42; | 
|  | } | 
|  | int bar() { | 
|  | return 13; | 
|  | } | 
|  |  | 
|  | The coverage mapping variable generated by Clang has 2 fields: | 
|  |  | 
|  | * Coverage mapping header. | 
|  |  | 
|  | * An optionally compressed list of filenames present in the translation unit. | 
|  |  | 
|  | The variable has 8-byte alignment because ld64 cannot always pack symbols from | 
|  | different object files tightly (the word-level alignment assumption is baked in | 
|  | too deeply). | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | @__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [32 x i8] } | 
|  | { | 
|  | { i32, i32, i32, i32 } ; Coverage map header | 
|  | { | 
|  | i32 0,  ; Always 0. In prior versions, the number of affixed function records | 
|  | i32 32, ; The length of the string that contains the encoded translation unit filenames | 
|  | i32 0,  ; Always 0. In prior versions, the length of the affixed string that contains the encoded coverage mapping data | 
|  | i32 3,  ; Coverage mapping format version | 
|  | }, | 
|  | [32 x i8] c"..." ; Encoded data (dissected later) | 
|  | }, section "__llvm_covmap", align 8 | 
|  |  | 
|  | The current version of the format is version 6. | 
|  |  | 
|  | There is one difference between versions 6 and 5: | 
|  |  | 
|  | * The first entry in the filename list is the compilation directory. When the | 
|  | filename is relative, the compilation directory is combined with the relative | 
|  | path to get an absolute path. This can reduce size by omitting the duplicate | 
|  | prefix in filenames. | 
|  |  | 
|  | There is one difference between versions 5 and 4: | 
|  |  | 
|  | * The notion of branch region has been introduced along with a corresponding | 
|  | region kind.  Branch regions encode two counters, one to track how many | 
|  | times a "true" branch condition is taken, and one to track how many times a | 
|  | "false" branch condition is taken. | 
|  |  | 
|  | There are two differences between versions 4 and 3: | 
|  |  | 
|  | * Function records are now named symbols, and are marked *linkonce_odr*. This | 
|  | allows linkers to merge duplicate function records. Merging of duplicate | 
|  | *dummy* records (emitted for functions included-but-not-used in a translation | 
|  | unit) reduces size bloat in the coverage mapping data. As part of this | 
|  | change, region mapping information for a function is now included within the | 
|  | function record, instead of being affixed to the coverage header. | 
|  |  | 
|  | * The filename list for a translation unit may optionally be zlib-compressed. | 
|  |  | 
|  | The only difference between versions 3 and 2 is that a special encoding for | 
|  | column end locations was introduced to indicate gap regions. | 
|  |  | 
|  | In version 1, the function record for *foo* was defined as follows: | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | { i8*, i32, i32, i64 } { i8* getelementptr inbounds ([3 x i8]* @__profn_foo, i32 0, i32 0), ; Function's name | 
|  | i32 3, ; Function's name length | 
|  | i32 9, ; Function's encoded coverage mapping data string length | 
|  | i64 0  ; Function's structural hash | 
|  | } | 
|  |  | 
|  | In version 2, the function record for *foo* was defined as follows: | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | { i64, i32, i64 } { | 
|  | i64 0x5cf8c24cdb18bdac, ; Function's name MD5 | 
|  | i32 9, ; Function's encoded coverage mapping data string length | 
|  | i64 0  ; Function's structural hash | 
|  |  | 
|  | Coverage Mapping Header: | 
|  | ------------------------ | 
|  |  | 
|  | As shown above, the coverage mapping header has the following fields: | 
|  |  | 
|  | * The number of function records affixed to the coverage header. Always 0, but present for backwards compatibility. | 
|  |  | 
|  | * The length of the string in the third field of *__llvm_coverage_mapping* that contains the encoded translation unit filenames. | 
|  |  | 
|  | * The length of the string in the third field of *__llvm_coverage_mapping* that contains any encoded coverage mapping data affixed to the coverage header. Always 0, but present for backwards compatibility. | 
|  |  | 
|  | * The format version. The current version is 6 (encoded as a 5). | 
|  |  | 
|  | .. _function records: | 
|  |  | 
|  | Function record: | 
|  | ---------------- | 
|  |  | 
|  | A function record is a structure of the following type: | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | { i64, i32, i64, i64, [? x i8] } | 
|  |  | 
|  | It contains the function name's MD5, the length of the encoded mapping data for | 
|  | that function, the function's structural hash value, the hash of the filenames | 
|  | in the function's translation unit, and the encoded mapping data. | 
|  |  | 
|  | Dissecting the sample: | 
|  | ^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | Here's an overview of the encoded data that was stored in the | 
|  | IR for the `coverage mapping sample`_ that was shown earlier: | 
|  |  | 
|  | * The IR contains the following string constant that represents the encoded | 
|  | coverage mapping data for the sample translation unit: | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | c"\01\15\1Dx\DA\13\D1\0F-N-*\D6/+\CE\D6/\C9-\D0O\CB\CF\D7K\06\00N+\07]" | 
|  |  | 
|  | * The string contains values that are encoded in the LEB128 format, which is | 
|  | used throughout for storing integers. It also contains a compressed payload. | 
|  |  | 
|  | * The first three LEB128-encoded numbers in the sample specify the number of | 
|  | filenames, the length of the uncompressed filenames, and the length of the | 
|  | compressed payload (or 0 if compression is disabled). In this sample, there | 
|  | is 1 filename that is 21 bytes in length (uncompressed), and stored in 29 | 
|  | bytes (compressed). | 
|  |  | 
|  | * The coverage mapping from the first function record is encoded in this string: | 
|  |  | 
|  | .. code-block:: llvm | 
|  |  | 
|  | c"\01\00\00\01\01\01\0C\02\02" | 
|  |  | 
|  | This string consists of the following bytes: | 
|  |  | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x01`` | The number of file ids used by this function. There is only one file id used by the mapping data in this function.      | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x00`` | An index into the filenames array which corresponds to the file "/Users/alex/test.c".                                   | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x00`` | The number of counter expressions used by this function. This function doesn't use any expressions.                     | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x01`` | The number of mapping regions that are stored in an array for the function's file id #0.                                | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x01`` | The coverage mapping counter for the first region in this function. The value of 1 tells us that it's a coverage        | | 
|  | |          | mapping counter that is a reference to the profile instrumentation counter with an index of 0.                          | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x01`` | The starting line of the first mapping region in this function.                                                         | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x0C`` | The starting column of the first mapping region in this function.                                                       | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x02`` | The ending line of the first mapping region in this function.                                                           | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  | | ``0x02`` | The ending column of the first mapping region in this function.                                                         | | 
|  | +----------+-------------------------------------------------------------------------------------------------------------------------+ | 
|  |  | 
|  | * The length of the substring that contains the encoded coverage mapping data | 
|  | for the second function record is also 9. It's structured like the mapping data | 
|  | for the first function record. | 
|  |  | 
|  | * The two trailing bytes are zeroes and are used to pad the coverage mapping | 
|  | data to give it the 8-byte alignment. | 
|  |  | 
|  | Encoding | 
|  | ======== | 
|  |  | 
|  | The per-function coverage mapping data is encoded as a stream of bytes, | 
|  | with a simple structure. The structure consists of the encoding | 
|  | `types <cvmtypes_>`_ like variable-length unsigned integers, that | 
|  | are used to encode `File ID Mapping`_, `Counter Expressions`_ and | 
|  | the `Mapping Regions`_. | 
|  |  | 
|  | The format of the structure follows: | 
|  |  | 
|  | ``[file id mapping, counter expressions, mapping regions]`` | 
|  |  | 
|  | The translation unit filenames are encoded using the same encoding | 
|  | `types <cvmtypes_>`_ as the per-function coverage mapping data, with the | 
|  | following structure: | 
|  |  | 
|  | ``[numFilenames : LEB128, filename0 : string, filename1 : string, ...]`` | 
|  |  | 
|  | .. _cvmtypes: | 
|  |  | 
|  | Types | 
|  | ----- | 
|  |  | 
|  | This section describes the basic types that are used by the encoding format | 
|  | and can appear after ``:`` in the ``[foo : type]`` description. | 
|  |  | 
|  | .. _LEB128: | 
|  |  | 
|  | LEB128 | 
|  | ^^^^^^ | 
|  |  | 
|  | LEB128 is an unsigned integer value that is encoded using DWARF's LEB128 | 
|  | encoding, optimizing for the case where values are small | 
|  | (1 byte for values less than 128). | 
|  |  | 
|  | .. _CoverageStrings: | 
|  |  | 
|  | Strings | 
|  | ^^^^^^^ | 
|  |  | 
|  | ``[length : LEB128, characters...]`` | 
|  |  | 
|  | String values are encoded with a `LEB value <LEB128_>`_ for the length | 
|  | of the string and a sequence of bytes for its characters. | 
|  |  | 
|  | .. _file id mapping: | 
|  |  | 
|  | File ID Mapping | 
|  | --------------- | 
|  |  | 
|  | ``[numIndices : LEB128, filenameIndex0 : LEB128, filenameIndex1 : LEB128, ...]`` | 
|  |  | 
|  | File id mapping in a function's coverage mapping stream | 
|  | contains the indices into the translation unit's filenames array. | 
|  |  | 
|  | Counter | 
|  | ------- | 
|  |  | 
|  | ``[value : LEB128]`` | 
|  |  | 
|  | A `coverage mapping counter`_ is stored in a single `LEB value <LEB128_>`_. | 
|  | It is composed of two things --- the `tag <counter-tag_>`_ | 
|  | which is stored in the lowest 2 bits, and the `counter data`_ which is stored | 
|  | in the remaining bits. | 
|  |  | 
|  | .. _counter-tag: | 
|  |  | 
|  | Tag: | 
|  | ^^^^ | 
|  |  | 
|  | The counter's tag encodes the counter's kind | 
|  | and, if the counter is an expression, the expression's kind. | 
|  | The possible tag values are: | 
|  |  | 
|  | * 0 - The counter is zero. | 
|  |  | 
|  | * 1 - The counter is a reference to the profile instrumentation counter. | 
|  |  | 
|  | * 2 - The counter is a subtraction expression. | 
|  |  | 
|  | * 3 - The counter is an addition expression. | 
|  |  | 
|  | .. _counter data: | 
|  |  | 
|  | Data: | 
|  | ^^^^^ | 
|  |  | 
|  | The counter's data is interpreted in the following manner: | 
|  |  | 
|  | * When the counter is a reference to the profile instrumentation counter, | 
|  | then the counter's data is the id of the profile counter. | 
|  | * When the counter is an expression, then the counter's data | 
|  | is the index into the array of counter expressions. | 
|  |  | 
|  | .. _Counter Expressions: | 
|  |  | 
|  | Counter Expressions | 
|  | ------------------- | 
|  |  | 
|  | ``[numExpressions : LEB128, expr0LHS : LEB128, expr0RHS : LEB128, expr1LHS : LEB128, expr1RHS : LEB128, ...]`` | 
|  |  | 
|  | Counter expressions consist of two counters as they | 
|  | represent binary arithmetic operations. | 
|  | The expression's kind is determined from the `tag <counter-tag_>`_ of the | 
|  | counter that references this expression. | 
|  |  | 
|  | .. _Mapping Regions: | 
|  |  | 
|  | Mapping Regions | 
|  | --------------- | 
|  |  | 
|  | ``[numRegionArrays : LEB128, regionsForFile0, regionsForFile1, ...]`` | 
|  |  | 
|  | The mapping regions are stored in an array of sub-arrays where every | 
|  | region in a particular sub-array has the same file id. | 
|  |  | 
|  | The file id for a sub-array of regions is the index of that | 
|  | sub-array in the main array e.g. The first sub-array will have the file id | 
|  | of 0. | 
|  |  | 
|  | Sub-Array of Regions | 
|  | ^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | ``[numRegions : LEB128, region0, region1, ...]`` | 
|  |  | 
|  | The mapping regions for a specific file id are stored in an array that is | 
|  | sorted in an ascending order by the region's starting location. | 
|  |  | 
|  | Mapping Region | 
|  | ^^^^^^^^^^^^^^ | 
|  |  | 
|  | ``[header, source range]`` | 
|  |  | 
|  | The mapping region record contains two sub-records --- | 
|  | the `header`_, which stores the counter and/or the region's kind, | 
|  | and the `source range`_ that contains the starting and ending | 
|  | location of this region. | 
|  |  | 
|  | .. _header: | 
|  |  | 
|  | Header | 
|  | ^^^^^^ | 
|  |  | 
|  | ``[counter]`` | 
|  |  | 
|  | or | 
|  |  | 
|  | ``[pseudo-counter]`` | 
|  |  | 
|  | The header encodes the region's counter and the region's kind. A branch region | 
|  | will encode two counters. | 
|  |  | 
|  | The value of the counter's tag distinguishes between the counters and | 
|  | pseudo-counters --- if the tag is zero, than this header contains a | 
|  | pseudo-counter, otherwise this header contains an ordinary counter. | 
|  |  | 
|  | Counter: | 
|  | """""""" | 
|  |  | 
|  | A mapping region whose header has a counter with a non-zero tag is | 
|  | a code region. | 
|  |  | 
|  | Pseudo-Counter: | 
|  | """"""""""""""" | 
|  |  | 
|  | ``[value : LEB128]`` | 
|  |  | 
|  | A pseudo-counter is stored in a single `LEB value <LEB128_>`_, just like | 
|  | the ordinary counter. It has the following interpretation: | 
|  |  | 
|  | * bits 0-1: tag, which is always 0. | 
|  |  | 
|  | * bit 2: expansionRegionTag. If this bit is set, then this mapping region | 
|  | is an expansion region. | 
|  |  | 
|  | * remaining bits: data. If this region is an expansion region, then the data | 
|  | contains the expanded file id of that region. | 
|  |  | 
|  | Otherwise, the data contains the region's kind. The possible region | 
|  | kind values are: | 
|  |  | 
|  | * 0 - This mapping region is a code region with a counter of zero. | 
|  | * 2 - This mapping region is a skipped region. | 
|  | * 4 - This mapping region is a branch region. | 
|  |  | 
|  | .. _source range: | 
|  |  | 
|  | Source Range | 
|  | ^^^^^^^^^^^^ | 
|  |  | 
|  | ``[deltaLineStart : LEB128, columnStart : LEB128, numLines : LEB128, columnEnd : LEB128]`` | 
|  |  | 
|  | The source range record contains the following fields: | 
|  |  | 
|  | * *deltaLineStart*: The difference between the starting line of the | 
|  | current mapping region and the starting line of the previous mapping region. | 
|  |  | 
|  | If the current mapping region is the first region in the current | 
|  | sub-array, then it stores the starting line of that region. | 
|  |  | 
|  | * *columnStart*: The starting column of the mapping region. | 
|  |  | 
|  | * *numLines*: The difference between the ending line and the starting line | 
|  | of the current mapping region. | 
|  |  | 
|  | * *columnEnd*: The ending column of the mapping region. If the high bit is set, | 
|  | the current mapping region is a gap area. A count for a gap area is only used | 
|  | as the line execution count if there are no other regions on a line. | 
|  |  | 
|  | Testing Format | 
|  | ============== | 
|  |  | 
|  | .. warning:: | 
|  | This section is for the LLVM developers who are working on ``llvm-cov`` only. | 
|  |  | 
|  | ``llvm-cov`` uses a special file format (called ``.covmapping`` below) for | 
|  | testing purposes. This format is private and should have no use for general | 
|  | users. As a developer, you can get such files by the ``convert-for-testing`` | 
|  | subcommand of ``llvm-cov``. | 
|  |  | 
|  | The structure of the ``.covmapping`` files follows: | 
|  |  | 
|  | ``[magicNumber : u64, version : u64, profileNames, coverageMapping, coverageRecords]`` | 
|  |  | 
|  | Magic Number and Version | 
|  | ------------------------ | 
|  |  | 
|  | The magic is ``0x6d766f636d766c6c``, which is the ASCII string | 
|  | ``llvmcovm`` in little-endian. | 
|  |  | 
|  | There are two versions for now: | 
|  |  | 
|  | - Version1, encoded as ``0x6174616474736574`` (ASCII string ``testdata``). | 
|  | - Version2, encoded as 1. | 
|  |  | 
|  | The only difference between Version1 and Version2 is in the encoding of the | 
|  | ``coverageMapping`` fields, which is explained later. | 
|  |  | 
|  | Profile Names | 
|  | ------------- | 
|  |  | 
|  | ``profileNames``, ``coverageMapping`` and ``coverageRecords`` are 3 sections | 
|  | extracted from the original binary file. | 
|  |  | 
|  | ``profileNames`` encodes the size, address and the raw data of the section: | 
|  |  | 
|  | ``[profileNamesSize : LEB128, profileNamesAddr : LEB128, profileNamesData : bytes]`` | 
|  |  | 
|  | Coverage Mapping | 
|  | ---------------- | 
|  |  | 
|  | This field is padded with zero bytes to make it 8-byte aligned. | 
|  |  | 
|  | ``coverageMapping`` contains the records of the source files. In version 1, | 
|  | only one record is stored: | 
|  |  | 
|  | ``[padding : bytes, coverageMappingData : bytes]`` | 
|  |  | 
|  | Version 2 relaxes this restriction by encoding the size of | 
|  | ``coverageMappingData`` as a LEB128 number before the data: | 
|  |  | 
|  | ``[coverageMappingSize : LEB128, padding : bytes, coverageMappingData : bytes]`` | 
|  |  | 
|  | The current version is 2. | 
|  |  | 
|  | Coverage Records | 
|  | ---------------- | 
|  |  | 
|  | This field is padded with zero bytes to make it 8-byte aligned. | 
|  |  | 
|  | ``coverageRecords`` is encoded as: | 
|  |  | 
|  | ``[padding : bytes, coverageRecordsData : bytes]`` | 
|  |  | 
|  | The rest data in the file is considered as the ``coverageRecordsData``. |