<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>SAFECode Users Guide</title>
<link rel="stylesheet" href="llvm.css" type="text/css">
</head>
<body>
<div class="doc_title">
SAFECode Users Guide
</div>
<!-- ********************************************************************** -->
<!-- * Table of Contents * -->
<!-- ********************************************************************** -->
<ul>
<li><a href="#overview">Overview</a></li>
<li><a href="#lto">SAFECode/LLVM-GCC Integration</a></li>
<li><a href="#sc">The sc Tool</a></li>
</ul>
<!-- ********************************************************************** -->
<!-- * Authors * -->
<!-- ********************************************************************** -->
<div class="doc_author">
<p>Written by the LLVM Research Group</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="overview"><b>Overview</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
This manual describes how to use the tools that come with the SAFECode
compiler. There are two primary ways to use SAFECode: it can be integrated
into the system linker via a plugin, or it can be run explicitly on a program
using the sc tool. The first method is the easier one for compiling large
programs with SAFECode; llvm-gcc takes care of compiling the files into a
single bitcode file, and the linker performs the SAFECode compiler
transformations right before code generation.
The second method requires that
you first compile your program to a single LLVM bitcode file before feeding it
into the sc tool; it is harder to use in that you must determine, on your own,
how to get your program compiled into a single LLVM bitcode file. On the other
hand, the sc tool is easier to control through its many command-line options.
</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="lto"><b>SAFECode/LLVM-GCC Integration</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
The easiest way to use SAFECode is through the SAFECode libLTO plugin, which
integrates with the linker (the Mac OS X linker and the Gold linker are
currently supported). When used in this way, the SAFECode transforms are
performed transparently by the linker when it creates the final executable.
</p>
<p>
To install the SAFECode libLTO plugin, follow the steps below:
</p>
<ol>
<li>
Make a backup of your system's currently existing libLTO.so or libLTO.dylib.
</li>
<li>
Copy the SAFECode libLTO.so (Linux) or libLTO.dylib (Mac OS X) over your
system's libLTO.so (or libLTO.dylib, as appropriate).
</li>
<li>
Make a directory somewhere. We'll refer to it as $PREFIX. Make a symbolic
link from $PREFIX/gcc to llvm-gcc; do the same for $PREFIX/g++ and llvm-g++.
This step is needed for some versions of libtool that don't understand
that llvm-gcc is a compiler driver.
</li>
</ol>
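<p>
The three steps above might be scripted roughly as follows. This is only a
sketch: the directory holding the system libLTO (<tt>LTO_DIR</tt>), the path to
the SAFECode-built plugin (<tt>SC_LTO</tt>), and the default value of
<tt>PREFIX</tt> are assumptions made for illustration, not locations documented
by SAFECode. The script prints the commands by default and runs them only when
<tt>DRY_RUN=0</tt> is set; a real run will normally need permission to write to
the system library directory.
</p>

```shell
#!/bin/sh
# Sketch of the installation steps. Dry run by default: commands are printed,
# and executed only when DRY_RUN=0.
# Every path below is an assumption for illustration; adjust to your system.
DRY_RUN="${DRY_RUN:-1}"
LTO_NAME="${LTO_NAME:-libLTO.so}"                 # use libLTO.dylib on Mac OS X
LTO_DIR="${LTO_DIR:-/usr/lib}"                    # assumed location of the system libLTO
SC_LTO="${SC_LTO:-/opt/safecode-obj/$LTO_NAME}"   # assumed path to the SAFECode plugin
PREFIX="${PREFIX:-/opt/safecode-wrappers}"        # directory that will hold the symlinks

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

run cp "$LTO_DIR/$LTO_NAME" "$LTO_DIR/$LTO_NAME.orig"   # step 1: back up the original
run cp "$SC_LTO" "$LTO_DIR/$LTO_NAME"                   # step 2: install the SAFECode plugin
run mkdir -p "$PREFIX"                                  # step 3: gcc and g++ symlinks
run ln -s "$(command -v llvm-gcc)" "$PREFIX/gcc"
run ln -s "$(command -v llvm-g++)" "$PREFIX/g++"
```

<p>
Keeping the backup next to the original (here with a hypothetical
<tt>.orig</tt> suffix) makes it easy to restore the system plugin later.
</p>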
<p>
That's it! Now, when you want to compile a program with SAFECode, you simply
use the <tt>-O4</tt> option to llvm-gcc and link in the SAFECode run-time
libraries. To configure an autoconf-based software package to use SAFECode, do
the following:
</p>
<ol>
<li> Set the environment variable CC to $PREFIX/gcc.</li>
<li> Set the environment variable CXX to $PREFIX/g++.</li>
<li> Set the environment variable CFLAGS to "-O4".</li>
<li> Set the environment variable LDFLAGS to
"-L$SAFECODE/$CONFIG/lib -lsc_dbg_rt -lpoolalloc_bitmap -lstdc++" where:
<ol>
<li> $SAFECODE is the root of the SAFECode object tree.</li>
<li> $CONFIG is the type of build (Debug, Release, or Profile).</li>
</ol>
</li>
<li> Run the configure script.</li>
<li> Type "make" to compile the source code.</li>
</ol>
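<p>
For reference, the settings listed above can be collected in a small shell
script. The default values of <tt>PREFIX</tt>, <tt>SAFECODE</tt>, and
<tt>CONFIG</tt> below are placeholders chosen for this sketch; substitute the
paths of your own installation.
</p>

```shell
#!/bin/sh
# Environment for building an autoconf-based package with SAFECode.
# The default paths are placeholders (assumptions), not documented locations.
PREFIX="${PREFIX:-/opt/safecode-wrappers}"   # directory holding the gcc and g++ symlinks
SAFECODE="${SAFECODE:-/opt/safecode-obj}"    # root of the SAFECode object tree
CONFIG="${CONFIG:-Release}"                  # Debug, Release, or Profile

export CC="$PREFIX/gcc"
export CXX="$PREFIX/g++"
export CFLAGS="-O4"
export LDFLAGS="-L$SAFECODE/$CONFIG/lib -lsc_dbg_rt -lpoolalloc_bitmap -lstdc++"

# Configure and build only when a configure script is present in this directory.
if [ -x ./configure ]; then
    ./configure
    make
fi
```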
</div>
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="sc"><b>The sc Tool</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
The sc tool is the SAFECode compiler. It takes a whole program in LLVM bitcode
form, transforms it to be memory safe, and outputs instrumented LLVM bitcode.
The output of the sc tool can be converted to native code via the LLVM llc tool
and linked with the SAFECode run-time library.
</p>
<p>
The sc tool utilizes <i>whole-program analysis</i>: this
means that you <i><b>must</b></i> compile your program into separate LLVM
bitcode files, link these files together into a single bitcode file, process
the complete bitcode file with sc, and then generate native code.
You <i><b>cannot</b></i> run individual bitcode object files through sc and
then link them together.
</p>
<p>
The steps for compiling a program with SAFECode are as follows:
</p>
<ol>
<li>
Compile the program into a single LLVM bitcode file:
<ol>
<li>Compile each of the program's source files into LLVM bitcode.</li>
<li>Link the LLVM bitcode files together into a single bitcode file.</li>
</ol>
</li>
<li>Process the single bitcode file with the sc tool.</li>
<li>Generate native code for the program.</li>
<li>Link the native code with the SAFECode run-time library.</li>
</ol>
</div>
<!-- *********************************************************************** -->
<div class="doc_subsection">
<a name="sc"><b>Compiling a Program Into a Single Bitcode File</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
Compiling a program into a single bitcode file requires that you first compile
each source file into an LLVM bitcode file and then link these bitcode files
together into a single bitcode file. To compile a source file into an LLVM
bitcode file instead of a native object file, use the -emit-llvm option
to llvm-gcc:
</p>
<ul>
<li><tt>llvm-gcc -emit-llvm -c <i>srcfilename</i>.c</tt></li>
</ul>
<p>
Once all of the source files have been compiled, they can be linked
together into a single bitcode file. The llvm-ld tool can be used to do this.
<!--
other options include the new Gold linker and extensions to the Mac OS X
linker (both of these have been extended to link together LLVM bitcode files).
-->
</p>
<p>
To link files together using llvm-ld, do the following:
</p>
<ul>
<li><tt>llvm-ld -o <i>output_filename</i>.bc <i>file1</i>.bc <i>file2</i>.bc
...</tt></li>
</ul>
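<p>
For a program with several source files, the two commands above can be
generated in a loop. The sketch below only prints the commands rather than
running them. The source file names and <tt>whole_program.bc</tt> are
hypothetical, and the <tt>-o</tt> option (the usual GCC driver option for
naming the output file) is added here on the assumption that you want each
bitcode file to carry a <tt>.bc</tt> suffix.
</p>

```shell
#!/bin/sh
# Dry-run sketch: print the commands that turn a set of C sources into a
# single bitcode file. Source names and whole_program.bc are hypothetical.
SRCS="${SRCS:-main.c util.c}"
BCS=""
for src in $SRCS; do
    bc="${src%.c}.bc"                                  # main.c becomes main.bc
    echo "llvm-gcc -emit-llvm -c $src -o $bc"          # one bitcode file per source
    BCS="$BCS $bc"
done
echo "llvm-ld -o whole_program.bc$BCS"                 # link all bitcode files together
```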
</div>
<!-- *********************************************************************** -->
<div class="doc_subsection">
<a name="sc"><b>Using the sc Tool</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
The sc tool takes as input a bitcode file representing a whole program and
outputs an instrumented bitcode file. Invoke the sc tool as follows:
</p>
<ul>
<li><tt>sc <i>options</i> -o <i>output_filename</i>.bc
<i>input_filename</i>.bc</tt></li>
</ul>
<p>
Options to the sc tool include:
</p>
<ul>
<li>
<tt>-f</tt>:
Overwrite the output file if it already exists. By default, the sc tool will
not overwrite a pre-existing file.
</li>
<li>
<tt>-terminate</tt>:
By default, SAFECode reports errors in a separate log file and permits the
application to continue execution. This option will cause the generated
program to terminate on the first memory safety error.
</li>
<li>
<tt>-pa</tt>:
Indicates which type of pool allocation the sc tool should use. Options
include:
<ul>
<li>
-pa=simple<br>
This option uses a single pool to record bounds information for memory
objects.
</li>
<li>
-pa=multi<br>
This option uses multiple context-insensitive pools to store bounds
information on memory objects.
</li>
<li>
-pa=apa<br>
This option uses full-blown, context-sensitive pools to store information
on memory objects.
</li>
</ul>
</li>
<li>
<tt>-rewrite-oob</tt>:
Permit Out of Bounds (OOB) pointers to be created as long as they are not
dereferenced. By default, SAFECode only permits pointers to move one byte
past the end of the object provided that the pointer is not dereferenced
(this behavior is consistent with the C programming language standard); all
other Out of Bounds pointers are flagged as an error. With this option
enabled, arbitrary Out of Bounds pointers are permitted as long as they are
not dereferenced. This option eases the restrictions on pointer indexing for
programs that are memory safe but do not follow the C standard strictly.
</li>
<li>
<tt>-dpchecks</tt>:
Enable dangling pointer detection. By default, the sc tool outputs a program
that <i>tolerates</i> dangling pointer dereferences. This option ensures,
with additional run-time overhead, that dangling pointer dereferences are
detected at run-time.
</li>
<li>
<tt>-disable-debuginfo</tt>:
Disable debug information. By default, if the program has debug information
compiled into it (i.e., the -g option was used on the llvm-gcc command line
when creating the bitcode input files), then memory safety errors reported at
run-time will attempt to print out the source file name and line number of
where the error occurred. With this option, the processed program will still
catch memory errors but will not attempt to provide detailed information to
help diagnose the error.
</li>
<li>
<tt>-help</tt>:
Prints available options.
</li>
</ul>
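<p>
As an illustration of how these options combine, the sketch below assembles an
sc command line from a few switches and prints it. The file names
<tt>whole_program.bc</tt> and <tt>whole_program.sc.bc</tt> and the variable
names are hypothetical; the flags themselves are the ones described above.
</p>

```shell
#!/bin/sh
# Assemble an sc command line from a few switches and print it.
# File names are hypothetical; the flags are those listed in this section.
POOL="${POOL:-multi}"            # simple, multi, or apa
TERMINATE="${TERMINATE:-yes}"    # stop at the first memory safety error
DPCHECKS="${DPCHECKS:-no}"       # dangling pointer detection (extra run-time cost)

OPTS="-f -pa=$POOL"              # -f: overwrite an existing output file
if [ "$TERMINATE" = "yes" ]; then OPTS="$OPTS -terminate"; fi
if [ "$DPCHECKS" = "yes" ]; then OPTS="$OPTS -dpchecks"; fi

SC_CMD="sc $OPTS -o whole_program.sc.bc whole_program.bc"
echo "$SC_CMD"
```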
</div>
<!-- *********************************************************************** -->
<div class="doc_subsection">
<a name="link"><b>Creating an Executable</b></a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p>
Creating an executable from the output of the sc tool requires that one
generate the native code for the output bitcode and link in the run-time
libraries implementing the memory allocator and the SAFECode run-time checks.
Generating native code can be done using the LLVM llc tool:
</p>
<ul>
<li><tt>llc -f -o <i>output_filename</i>.s <i>output_from_sc</i>.bc</tt></li>
</ul>
<p>
Creating the final executable can be done using GCC. At a minimum, you need to
link in the two SAFECode run-time libraries and the C++ standard library. You
will also need to link in any additional native code libraries that were not
linked in as LLVM bitcode libraries.
Below is an example of creating the final executable. In the example,
<i>Configuration</i> is either Release, Debug, or Profile depending upon
whether you built a Release, Debug, or Profile version of SAFECode:
</p>
<ul>
<li>
<tt>gcc -o <i>executable</i> <i>output_from_llc</i>.s
<i>Configuration</i>/lib/libsc_dbg_rt.a
<i>Configuration</i>/lib/libpoolalloc_bitmap.a
-lstdc++</tt>
</li>
</ul>
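<p>
The two commands in this section can be parameterized by build configuration.
The sketch below prints them for a chosen configuration; the object tree
location <tt>SAFECODE_OBJ</tt> and the file names <tt>prog</tt>,
<tt>prog.s</tt>, and <tt>whole_program.sc.bc</tt> are placeholders assumed for
the example.
</p>

```shell
#!/bin/sh
# Print the code generation and link commands for one build configuration.
# SAFECODE_OBJ and the file names are hypothetical placeholders.
SAFECODE_OBJ="${SAFECODE_OBJ:-/opt/safecode-obj}"
CONFIG="${CONFIG:-Release}"          # Release, Debug, or Profile
LIBDIR="$SAFECODE_OBJ/$CONFIG/lib"   # where the run-time libraries were built

echo "llc -f -o prog.s whole_program.sc.bc"
echo "gcc -o prog prog.s $LIBDIR/libsc_dbg_rt.a $LIBDIR/libpoolalloc_bitmap.a -lstdc++"
```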