
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                       "http://www.w3.org/TR/html4/strict.dtd">
 <html>
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   <title>SAFECode Users Guide</title>
   <link rel="stylesheet" href="llvm.css" type="text/css">
 </head>

 <body>

 <div class="doc_title">
 SAFECode Users Guide
 </div>

 <!-- ********************************************************************** -->
 <!-- * Table of Contents                                                  * -->
 <!-- ********************************************************************** -->
 <ul>
   <li><a href="#overview">Overview</a></li>
   <li><a href="#lto">SAFECode/LLVM-GCC Integration</a></li>
   <li><a href="#sc">The sc Tool</a></li>
 </ul>

 <!-- ********************************************************************** -->
 <!-- * Authors                                                            * -->
 <!-- ********************************************************************** -->
 <div class="doc_author">
   <p>Written by the LLVM Research Group</p>
 </div>


 <!-- *********************************************************************** -->
 <div class="doc_section">
   <a name="overview"><b>Overview</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 This manual describes how to use the tools that come with the SAFECode
 compiler.  There are two primary ways to use SAFECode: it
 can be integrated into the system linker via a plugin, or it can be run
 explicitly on a program using the sc tool.  The first method is the easiest to
 use when compiling large programs with SAFECode; it lets llvm-gcc take care of
 compiling files into a single bitcode file and performs the SAFECode compiler
 transformations right before code generation within the linker.
 The second method requires that
 you first compile your program to a single LLVM bitcode file before feeding it
 into the sc tool; it is harder to use because you must determine, on your own,
 how to get your program compiled into a single LLVM bitcode file.  However,
 the sc tool gives you finer control through its many
 command-line options.
 </p>
 </div>

 <!-- *********************************************************************** -->
 <div class="doc_section">
   <a name="lto"><b>SAFECode/LLVM-GCC Integration</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 The easiest way to use SAFECode is to use the SAFECode libLTO plugin that
 integrates with the linker (the Mac OS X and Gold linkers are currently
 supported).  When used in this way, the SAFECode transforms are performed
 transparently by the linker when it creates the final executable.
 </p>

 <p>
 To install the SAFECode libLTO plugin, follow the steps below:
 </p>

 <ol>
 <li>
 Make a backup of your system's currently existing libLTO.so or libLTO.dylib.
 </li>

 <li>
 Copy the SAFECode libLTO.so (Linux) or libLTO.dylib (Mac OS X) over your
 system's libLTO.so (or libLTO.dylib, as appropriate).
 </li>

 <li>
 Make a directory somewhere.  We'll refer to it as $PREFIX.  Make a symbolic
 link from $PREFIX/gcc to llvm-gcc; do the same for $PREFIX/g++ and llvm-g++.
 This step is needed for some versions of libtool that don't understand
 that llvm-gcc is a compiler driver.
 </li>
 </ol>
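 <p>
 The three installation steps above can be sketched as two small shell
 functions.  This is only an illustration: the function names
 <tt>install_safecode_lto</tt> and <tt>make_driver_links</tt> are hypothetical,
 and every path you pass to them (the location of the system libLTO, the
 SAFECode libLTO, llvm-gcc, and llvm-g++) depends on your installation.
 </p>

```shell
# Sketch only: the function names are hypothetical and all paths are
# supplied by the caller; nothing here is specific to one system.

# install_safecode_lto SYSTEM_LIBLTO SAFECODE_LIBLTO
install_safecode_lto() {
  system_lto="$1"
  safecode_lto="$2"
  # Step 1: keep a backup so the original plugin can be restored later.
  cp -p "$system_lto" "$system_lto.orig" || return 1
  # Step 2: copy the SAFECode plugin over the system one.
  cp "$safecode_lto" "$system_lto"
}

# make_driver_links PREFIX PATH_TO_LLVM_GCC PATH_TO_LLVM_GXX
make_driver_links() {
  prefix="$1"
  llvm_gcc="$2"
  llvm_gxx="$3"
  mkdir -p "$prefix" || return 1
  # Step 3: $PREFIX/gcc points at llvm-gcc, $PREFIX/g++ at llvm-g++.
  ln -sf "$llvm_gcc" "$prefix/gcc"
  ln -sf "$llvm_gxx" "$prefix/g++"
}
```

 <p>
 Defining the functions changes nothing on disk; you would call them yourself
 with the paths for your system, typically with enough privileges to write to
 the directory that holds the system libLTO.
 </p>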

 <p>
 That's it!  Now, when you want to compile a program with SAFECode, you simply
 use the <tt>-O4</tt> option to llvm-gcc and link in the SAFECode run-time
 libraries.  To configure an autoconf-based software package to use SAFECode, do
 the following:
 </p>

 <ol>
 <li> Set the environment variable CC to $PREFIX/gcc.</li>
 <li> Set the environment variable CXX to $PREFIX/g++.</li>
 <li> Set the environment variable CFLAGS to "-O4".</li>
 <li> Set the environment variable LDFLAGS to
      "-L$SAFECODE/$CONFIG/lib -lsc_dbg_rt -lpoolalloc_bitmap -lstdc++" where:
     <ol>
     <li> $SAFECODE is the root of the SAFECode object tree.</li>
     <li> $CONFIG is the type of build (Debug, Release, or Profile).</li>
     </ol>
 </li>
 <li> Run the configure script.</li>
 <li> Type "make" to compile the source code.</li>
 </ol>
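 <p>
 As a rough sketch, the environment setup above might look like this in a
 POSIX shell.  The default values given to <tt>PREFIX</tt>, <tt>SAFECODE</tt>,
 and <tt>CONFIG</tt> are placeholders chosen for illustration, not locations
 that SAFECode itself defines.
 </p>

```shell
# Placeholder defaults (assumptions); override them in your environment.
: "${PREFIX:=$HOME/safecode-bin}"    # directory holding the gcc/g++ symlinks
: "${SAFECODE:=$HOME/safecode-obj}"  # root of the SAFECode object tree
: "${CONFIG:=Release}"               # Debug, Release, or Profile

export CC="$PREFIX/gcc"
export CXX="$PREFIX/g++"
export CFLAGS="-O4"
export LDFLAGS="-L$SAFECODE/$CONFIG/lib -lsc_dbg_rt -lpoolalloc_bitmap -lstdc++"

# Configure and build only when run from a package's source directory.
if [ -x ./configure ]; then
  ./configure && make
else
  echo "no ./configure script in $(pwd); environment is set up only" >&2
fi
```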

 </div>

 <!-- *********************************************************************** -->
 <div class="doc_section">
   <a name="sc"><b>The sc Tool</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 The sc tool is the SAFECode compiler.  It takes a whole program in LLVM bitcode
 form, transforms it to be memory safe, and outputs instrumented LLVM bitcode.
 The output of the sc tool can be converted to native code via the LLVM llc tool
 and linked with the SAFECode run-time library.
 </p>

 <p>
 The sc tool uses <i>whole-program analysis</i>: this
 means that you <i><b>must</b></i> compile your program into separate LLVM
 bitcode files, link these files together into a single bitcode file, process
 the complete bitcode file with sc, and then generate native code.
 You <i><b>cannot</b></i> run individual bitcode object files through sc and
 then link them together.
 </p>

 <p>
 The steps for compiling a program with SAFECode are as follows:
 </p>

 <ol>
   <li>
   Compile the program into a single LLVM bitcode file:
   <ol>
     <li>Compile the program's source files into LLVM bitcode.</li>
     <li>Link the LLVM bitcode files together into a single bitcode file.</li>
   </ol>
   </li>
   <li>Process the single bitcode file with the sc tool.</li>
   <li>Generate native code for the program.</li>
   <li>Link the native code with the SAFECode run-time library.</li>
 </ol>
 </div>

 <!-- *********************************************************************** -->
 <div class="doc_subsection">
   <a name="sc"><b>Compiling a Program Into a Single Bitcode File</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 Compiling a program into a single bitcode file requires that one first compile
 each source file into an LLVM bitcode file and then link these bitcode files
 together into a single bitcode file.  To compile source files into an LLVM
 bitcode file instead of into a native object file, use the -emit-llvm option
 to llvm-gcc:
 </p>

 <ul>
   <li><tt>llvm-gcc -emit-llvm -c <i>srcfilename</i>.c</tt></li>
 </ul>

 <p>
 Once all of the source files have been compiled, they can be linked
 together into a single bitcode file.  The llvm-ld tool can be used to do this.
 <!--
   other options include the new Gold linker and extensions to the Mac OS X
   linker (both of these have been extended to link together LLVM bitcode files).
 -->
 </p>

 <p>
 To link files together using llvm-ld, do the following:
 </p>

 <ul>
   <li><tt>llvm-ld -o <i>output_filename</i>.bc <i>file1</i>.bc <i>file2</i>.bc
 ...</tt></li>
 </ul>
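 <p>
 The two commands above can be combined into a small helper.  The sketch below
 is an illustration, not part of SAFECode: <tt>run</tt>,
 <tt>compile_and_link</tt>, <tt>DRY_RUN</tt>, and the file names are all
 made up for the example.  It passes <tt>-o</tt> to llvm-gcc explicitly so that
 each bitcode file gets a <tt>.bc</tt> name, and by default it only prints the
 commands instead of running them.
 </p>

```shell
# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

# compile_and_link OUTPUT.bc SRC.c...
# Source file names are assumed to contain no whitespace.
compile_and_link() {
  output="$1"
  shift
  bcfiles=""
  for src in "$@"; do
    bc="${src%.c}.bc"
    # Compile one source file to LLVM bitcode.
    run llvm-gcc -emit-llvm -c "$src" -o "$bc" || return 1
    bcfiles="$bcfiles $bc"
  done
  # $bcfiles is left unquoted on purpose: one argument per bitcode file.
  run llvm-ld -o "$output" $bcfiles
}

compile_and_link whole.bc main.c util.c
```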

 </div>

 <!-- *********************************************************************** -->
 <div class="doc_subsection">
   <a name="sc"><b>Using the sc Tool</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 The sc tool takes as input a bitcode file representing a whole program and
 outputs an instrumented bitcode file.  Invoke the sc tool as follows:
 </p>

 <ul>
   <li><tt>sc <i>options</i> -o <i>output_filename</i>.bc
 <i>input_filename</i>.bc</tt></li>
 </ul>

 <p>
 Options to the sc tool include:
 </p>

 <ul>
   <li>
   <tt>-f</tt>:
   Overwrite the output file if it already exists.  By default, the sc tool will
   not overwrite a pre-existing file.
   </li>

   <li>
   <tt>-terminate</tt>:
   By default, SAFECode reports errors in a separate log file and permits the
   application to continue execution.  This option will cause the generated
   program to terminate on the first memory safety error.
   </li>

   <li>
   <tt>-pa</tt>:
   Indicates which type of pool allocation the sc tool should use.  Options
   include:
   <ul>
     <li>
     -pa=simple<br>
     This option uses a single pool to record bounds information for memory
     objects.
     </li>

     <li>
     -pa=multi<br>
     This option uses multiple context-insensitive pools to store bounds
     information on memory objects.
     </li>

     <li>
     -pa=apa<br>
     This option uses full-blown, context-sensitive pools to store information
     on memory objects.
     </li>
   </ul>
   </li>

   <li>
   <tt>-rewrite-oob</tt>:
   Permit Out of Bounds (OOB) pointers to be created as long as they are not
   dereferenced.  By default, SAFECode only permits pointers to move one byte
   past the end of the object provided that the pointer is not dereferenced
   (this behavior is consistent with the C programming language standard); all
   other Out of Bounds pointers are flagged as an error.  With this option
   enabled, arbitrary Out of Bounds pointers are permitted as long as they are
   not dereferenced.  This option eases the restrictions on pointer indexing for
   programs that are memory safe but do not follow the C standard strictly.
   </li>

   <li>
   <tt>-dpchecks</tt>:
   Enable dangling pointer detection.  By default, the sc tool outputs a program
   that <i>tolerates</i> dangling pointer dereferences.  This option ensures,
   with additional run-time overhead, that dangling pointer dereferences are
   detected at run-time.
   </li>

   <li>
   <tt>-disable-debuginfo</tt>:
   Disable debug information.  By default, if the program has debug information
   compiled into it (i.e., the -g option was used on the llvm-gcc command line
   when creating the bitcode input files), then memory safety errors reported at
   run-time will attempt to print out the source file name and line number of
   where the error occurred.  With this option, the processed program will still
   catch memory errors but will not attempt to provide detailed information to
   help diagnose the error.
   </li>

   <li>
   <tt>-help</tt>:
   Prints available options.
   </li>
 </ul>
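 <p>
 To show how the options fit the invocation form given earlier, here is a small
 hypothetical helper, <tt>sc_command</tt>, that assembles an sc command line
 and rejects any <tt>-pa</tt> value other than the three listed above.  It only
 prints the command; it does not run sc.  That the options shown can be
 combined in one invocation is an assumption, and the file names are
 placeholders.
 </p>

```shell
# sc_command PA_MODE INPUT.bc OUTPUT.bc [extra sc options...]
# Prints an sc command line; it does not run sc.
sc_command() {
  pa="$1"
  input="$2"
  output="$3"
  shift 3
  # Only the three documented pool allocation modes are accepted.
  case "$pa" in
    simple|multi|apa) ;;
    *) echo "unknown -pa value: $pa" >&2; return 1 ;;
  esac
  # -f lets sc overwrite an existing output file.
  cmd="sc -f -pa=$pa"
  for opt in "$@"; do
    cmd="$cmd $opt"
  done
  echo "$cmd -o $output $input"
}

# Stop at the first memory safety error and detect dangling pointers.
sc_command multi whole.bc whole.sc.bc -terminate -dpchecks
# Tolerate out-of-bounds pointers that are never dereferenced.
sc_command simple whole.bc whole.sc.bc -rewrite-oob
```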

 </div>

 <!-- *********************************************************************** -->
 <div class="doc_subsection">
   <a name="link"><b>Creating an Executable</b></a>
 </div>
 <!-- *********************************************************************** -->

 <div class="doc_text">

 <p>
 Creating an executable from the output of the sc tool requires that one
 generate the native code for the output bitcode and link in the run-time
 libraries implementing the memory allocator and the SAFECode run-time checks.
 Generating native code can be done using the LLVM llc tool:
 </p>

 <ul>
   <li><tt>llc -f -o <i>output_filename</i>.s <i>output_from_sc</i>.bc</tt></li>
 </ul>

 <p>
 Creating the final executable can be done using GCC.  At a minimum, you need to
 link in the two SAFECode run-time libraries and the C++ standard library.  You
 will also need to link in any additional native code libraries that were not
 linked in as LLVM bitcode libraries.

 Below is an example of creating the final executable.  In the example,
 <i>Configuration</i> is either Release, Debug, or Profile depending upon
 whether you built a Release, Debug, or Profile version of SAFECode:
 </p>

 <ul>
   <li>
   <tt>gcc -o <i>executable</i> <i>output_from_llc</i>.s
   <i>Configuration</i>/lib/libsc_dbg_rt.a
   <i>Configuration</i>/lib/libpoolalloc_bitmap.a
   -lstdc++</tt>
   </li>
 </ul>
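 <p>
 The code generation and link steps can be sketched together as follows.  The
 helper names (<tt>run</tt>, <tt>build_executable</tt>), the
 <tt>DRY_RUN</tt> switch, the file names, and the default values of
 <tt>SAFECODE</tt> and <tt>CONFIG</tt> are assumptions made for this example;
 the library directory is taken to be <tt>$SAFECODE/$CONFIG/lib</tt>, as in the
 linker plugin instructions.  By default the sketch only prints the commands.
 </p>

```shell
# DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

# Assumed locations; override them for your build of SAFECode.
SAFECODE="${SAFECODE:-$HOME/safecode-obj}"
CONFIG="${CONFIG:-Release}"          # Release, Debug, or Profile
LIBDIR="$SAFECODE/$CONFIG/lib"

# build_executable INPUT_FROM_SC.bc EXECUTABLE [extra native libraries...]
build_executable() {
  input="$1"
  exe="$2"
  shift 2
  asm="${input%.bc}.s"
  # Generate native assembly from the instrumented bitcode.
  run llc -f -o "$asm" "$input" || return 1
  # Link with both SAFECode run-time libraries and the C++ standard library,
  # followed by any native libraries the program itself needs.
  run gcc -o "$exe" "$asm" \
    "$LIBDIR/libsc_dbg_rt.a" "$LIBDIR/libpoolalloc_bitmap.a" -lstdc++ "$@"
}

build_executable whole.sc.bc prog -lm
```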