==========================
Source-based Code Coverage
==========================

.. contents::
   :local:

Introduction
============

This document explains how to use clang's source-based code coverage feature.
It's called "source-based" because it operates on AST and preprocessor
information directly. This allows it to generate very precise coverage data.

Clang ships two other code coverage implementations:

* :doc:`SanitizerCoverage` - A low-overhead tool meant for use alongside the
  various sanitizers. It can provide up to edge-level coverage.

* gcov - A GCC-compatible coverage implementation which operates on DebugInfo.

From this point onwards "code coverage" will refer to the source-based kind.

The code coverage workflow
==========================

The code coverage workflow consists of three main steps:

* Compiling with coverage enabled.

* Running the instrumented program.

* Creating coverage reports.

The next few sections work through a complete, copy-and-paste-friendly example
based on this program:

.. code-block:: cpp

    % cat <<EOF > foo.cc
    #define BAR(x) ((x) || (x))
    template <typename T> void foo(T x) {
      for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    }
    int main() {
      foo<int>(0);
      foo<float>(0);
      return 0;
    }
    EOF

Compiling with coverage enabled
===============================

To compile code with coverage enabled, pass ``-fprofile-instr-generate
-fcoverage-mapping`` to the compiler:

.. code-block:: console

    # Step 1: Compile with coverage enabled.
    % clang++ -fprofile-instr-generate -fcoverage-mapping foo.cc -o foo

Note that linking together code with and without coverage instrumentation is
supported: any uninstrumented code simply won't be accounted for.

Running the instrumented program
================================

The next step is to run the instrumented program. When the program exits it
will write a **raw profile** to the path specified by the ``LLVM_PROFILE_FILE``
environment variable. If that variable is not set, the profile is written
to ``default.profraw`` in the current directory of the program. If
``LLVM_PROFILE_FILE`` contains a path to a non-existent directory, the missing
directory structure will be created. Additionally, the following special
**pattern strings** are rewritten:

* "%p" expands out to the process ID.

* "%h" expands out to the hostname of the machine running the program.

* "%Nm" expands out to the instrumented binary's signature. When this pattern
  is specified, the runtime creates a pool of N raw profiles which are used for
  on-line profile merging. The runtime takes care of selecting a raw profile
  from the pool, locking it, and updating it before the program exits. If N is
  not specified (i.e., the pattern is "%m"), it's assumed that ``N = 1``. N must
  be between 1 and 9. The merge pool specifier can only occur once per filename
  pattern.

.. code-block:: console

    # Step 2: Run the program.
    % LLVM_PROFILE_FILE="foo.profraw" ./foo

Creating coverage reports
=========================

Raw profiles have to be **indexed** before they can be used to generate
coverage reports. This is done using the "merge" tool in ``llvm-profdata``, so
named because it can combine and index profiles at the same time:

.. code-block:: console

    # Step 3(a): Index the raw profile.
    % llvm-profdata merge -sparse foo.profraw -o foo.profdata

There are several ways to render coverage reports. One option is to
generate a line-oriented report:

.. code-block:: console

    # Step 3(b): Create a line-oriented coverage report.
    % llvm-cov show ./foo -instr-profile=foo.profdata

To demangle any C++ identifiers in the output, use:

.. code-block:: console

    % llvm-cov show ./foo -instr-profile=foo.profdata | c++filt -n

This report includes a summary view as well as dedicated sub-views for
templated functions and their instantiations. For our example program, we get
distinct views for ``foo<int>(...)`` and ``foo<float>(...)``. If
``-show-line-counts-or-regions`` is enabled, ``llvm-cov`` displays sub-line
region counts (even in macro expansions):

.. code-block:: none

        20|    1|#define BAR(x) ((x) || (x))
                                ^20     ^2
         2|    2|template <typename T> void foo(T x) {
        22|    3|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
                                        ^22     ^20  ^20^20
         2|    4|}
    ------------------
    | void foo<int>(int):
    |      1|    2|template <typename T> void foo(T x) {
    |     11|    3|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                     ^11     ^10  ^10^10
    |      1|    4|}
    ------------------
    | void foo<float>(float):
    |      1|    2|template <typename T> void foo(T x) {
    |     11|    3|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                     ^11     ^10  ^10^10
    |      1|    4|}
    ------------------

It's possible to generate a file-level summary of coverage statistics (instead
of a line-oriented report) with:

.. code-block:: console

    # Step 3(c): Create a coverage summary.
    % llvm-cov report ./foo -instr-profile=foo.profdata
    Filename                    Regions    Miss   Cover     Functions  Executed
    -----------------------------------------------------------------------
    /tmp/foo.cc                      13       0   100.00%         3    100.00%
    -----------------------------------------------------------------------
    TOTAL                            13       0   100.00%         3    100.00%

A few final notes:

* The ``-sparse`` flag is optional but can result in dramatically smaller
  indexed profiles. This option should not be used if the indexed profile will
  be reused for PGO.

* Raw profiles can be discarded after they are indexed. Advanced use of the
  profile runtime library allows an instrumented program to merge profiling
  information directly into an existing raw profile on disk. The details are
  out of scope.

* The ``llvm-profdata`` tool can be used to merge together multiple raw or
  indexed profiles. To combine profiling data from multiple runs of a program,
  for example:

  .. code-block:: console

    % llvm-profdata merge -sparse foo1.profraw foo2.profdata -o foo3.profdata

Format compatibility guarantees
===============================

* There are no backwards or forwards compatibility guarantees for the raw
  profile format. Raw profiles may be dependent on the specific compiler
  revision used to generate them. It's inadvisable to store raw profiles for
  long periods of time.

* Tools must retain **backwards** compatibility with indexed profile formats.
  These formats are not forwards-compatible: i.e., a tool which uses format
  version X will not be able to understand format version (X+k).

* There is a third format in play: the format of the coverage mappings emitted
  into instrumented binaries. Tools must retain **backwards** compatibility
  with these formats. These formats are not forwards-compatible.

Using the profiling runtime without static initializers
=======================================================

By default the compiler runtime uses a static initializer to determine the
profile output path and to register a writer function. To collect profiles
without using static initializers, do this manually:

* Export an ``int __llvm_profile_runtime`` symbol from each instrumented shared
  library and executable. When the linker finds a definition of this symbol, it
  knows to skip loading the object which contains the profiling runtime's
  static initializer.

* Forward-declare ``void __llvm_profile_initialize_file(void)`` and call it
  once from each instrumented executable. This function parses
  ``LLVM_PROFILE_FILE``, sets the output path, and truncates any existing files
  at that path. To get the same behavior without truncating existing files,
  pass a filename pattern string to ``void __llvm_profile_set_filename(char
  *)``. These calls can be placed anywhere so long as they precede all calls
  to ``__llvm_profile_write_file``.

* Forward-declare ``int __llvm_profile_write_file(void)`` and call it to write
  out a profile. This function returns 0 when it succeeds, and a non-zero value
  otherwise. Calling this function multiple times appends profile data to an
  existing on-disk raw profile.

Drawbacks and limitations
=========================

* Code coverage does not handle unpredictable changes in control flow or stack
  unwinding in the presence of exceptions precisely. Consider the following
  function:

  .. code-block:: cpp

    int f() {
      may_throw();
      return 0;
    }

  If the call to ``may_throw()`` propagates an exception into ``f``, the code
  coverage tool may mark the ``return`` statement as executed even though it is
  not. A call to ``longjmp()`` can have similar effects.
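
As an illustration of the ``LLVM_PROFILE_FILE`` pattern strings described under
"Running the instrumented program", the following is a minimal sketch of how
such a filename pattern could be expanded. It is not the profiling runtime's
implementation: ``expand_profile_pattern`` is a hypothetical helper, the process
ID, hostname, and signature are passed in as plain arguments, and "%Nm" is
replaced by the signature string alone, which is an assumption based only on
the description above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the LLVM profiling runtime): expands the
// documented pattern strings "%p", "%h", and "%Nm" / "%m" in a filename.
std::string expand_profile_pattern(const std::string &pattern, long pid,
                                   const std::string &hostname,
                                   const std::string &signature) {
  std::string out;
  bool seen_merge_pool = false;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    // Copy ordinary characters (and a trailing '%') through unchanged.
    if (pattern[i] != '%' || i + 1 >= pattern.size()) {
      out += pattern[i];
      continue;
    }
    const char next = pattern[i + 1];
    const bool pool_digit =
        std::isdigit(static_cast<unsigned char>(next)) && next != '0';
    if (next == 'p') {
      out += std::to_string(pid);  // "%p": process ID
      i += 1;
    } else if (next == 'h') {
      out += hostname;             // "%h": hostname
      i += 1;
    } else if (next == 'm' ||
               (pool_digit && i + 2 < pattern.size() &&
                pattern[i + 2] == 'm')) {
      // "%m" or "%Nm" with N in 1..9; only one occurrence is allowed.
      assert(!seen_merge_pool && "merge pool specifier may occur only once");
      seen_merge_pool = true;
      out += signature;            // assumption: signature text only
      i += (next == 'm') ? 1 : 2;
    } else {
      out += '%';                  // unknown sequence: keep the '%' as-is
    }
  }
  return out;
}
```

For instance, under these assumptions, calling
``expand_profile_pattern("foo-%p.profraw", 1234, "buildbox", "sig")`` yields
``foo-1234.profraw``, so each process writes its own raw profile.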