llvm-gcc-4.0/order-files/HOW TO BUILD - llvm-archive - Git at Google

        Creating Order files, i.e., Scatter Loading the Compiler
        --------------------------------------------------------

 This is a brief description on how to generate the cc1*.order files.
 The order files are intended to minimize the number of page-ins of
 the compiler as it is loaded.  If there is enough memory this
 benifits only the first load of the compiler since it will stay
 resident after that.

 Unfortunately it's a manual process since one of the tools requires
 an explicit interrupt from the terminal.  You should only need to
 (re)do the order files if there's any major reorganizations or
 additions to the compilers.

 There are five steps involved to genrrate the order files.

 1. Select test cases.

    These should be "average" compilations to exercise each of the
    cc1* compilers.  They should be large enough to take enough time
    to generate acceptible results.  As of this writing the
    following cases were chosen:

    For cc1        - gcc/c-decl.c (from the compiler sources)
    For cc1plus    - Finder_FE/AboutWindow/AboutWindow.cp
    For cc1obj     - MailViewer/Compose.subproj/MessageEditor.m
    For cc1objplus - devkit/cpp.subproj/Cpp-precomp.mm

    Of these four the cc1objplus test case is not very good.
    Unfortunately there are few .mm files of any significant size.
    If a better one can be found it probably should be used.

 2. Capture the command lines needed to build the chosen files.

    For the selected projects built with with PB set PB's building
    preferences for detailed build logs.  That way you can the
    full command lines you need.  In non-PB projects like gcc the
    command lines are of course echoed on the terminal.

 3. Run selected command lines with -### to get the cc1* lines.

    From the full command lines you need the cc1* lines generated
    by the driver.  The easiest way to get these is to add -###
    to the full command lines captured in step 2.

 4. Prepare to generate the order files

    If you don't already have it you should build a set of cc1*
    compilers with -O2 with symbols.  The easiest way to do this
    is FSF-style but using buildit with build_gcc probably will
    also work.

    In the gcc objects directory you will of course have the cc1*
    compilers.  You need to substitute these in each of the cc1*
    command lines captured in step 3.  You also need to run these
    with ~perf/bin/pcsample to generate the order files.  Thus,
    for each cc1* command line it should have the following in
    the beginning in place of the original cc1* of the step 3
    lines.

    sudo ~perf/bin/pcsample -O -E $gcc3-obj/cc1* ... rest of line...

    Where $gcc3-obj represents whatever the path is to the gcc3
    objects directory and cc1* is one of the cc1* compilers (is
    -B necessary here?).

    Note, you need sudo because pcsample will only run as root.
    Also, if you have a dual processor you need to reboot as in
    single  processor mode.  If you don't pcscample will tell you
    to do that by executing the command,

    nvram boot-args="cpus=1"

 5. Generate the order files

    Run the lines created in step 4.  The order files (cc1*.order)
    will be left in /tmp in the direectory indicated by the summary
    that pcsample displays when you hit cmd-c to stop the pcsample
    execution.  Be sure to run pcsample long enough to compile the
    entire program.

    At this point you now have the order files created.  You place
    them in the order-files directory at the top level of the gcc
    source directory.

    You can also use them to measure he effects of these order files
    on compiler page-ins.  If you do this go to the next step (6).
    Otherwisw you are ready to go.

 6. Creating the compilers with ther order files

    You will need two versions of the cc1* files; the ones from above
    and a set linked with their respective order file.

    From a gcc compiler build extract the command lines that link
    the cc1* files.  Change the -o file to something else, for
    example, cc1 to cc1X.  Then add the following options to the
    link line.  Note, if you build using buildit and build_gcc the
    lines will already be there referencing the order-files
    directory.  Otherwise you need to add,

    -sectorder __TEXT __text $order/cc1*.order -e start

    where $order is the directory containing the order file being
    used and cc1* is of course a reference to a specific order
    file.

  7. Measuring the performance improvement

     You need to have two terminbal windows open; T1 and T2.  The
     execute the follwoing commands on the indicated terminals:

     T2: sudo fs_usage -w > /tmp/fs.out1  (do NOT execute yet)
     T1: ~perf/bin/flushmem  (note this can take a while)
     T2: fs_usage
     T1: use cc1* line originally used to build order file
     T2: ctl-c when cc1* compilation done

     T2: sudo fs_usage -w > /tmp/fs.out2  (do not execute yet)
     T1: ~perf/bin/flushmem
     T2: fs_usage
     T1: use cc1*X line originally used to build order file
     T2: ctl-c when cc1XXX compilation done

     In the first group of commands you use the original cc1* line
     with the command line used to build the order file.  You also
     run fs_usage at the same time to measure the paging behavior.

     The second group of commands is similar but you use the cc1*
     linked with its order file (with the -sectorder stuff mentioned
     in step 6).  For this discussion call this compiler cc1*X.

     In both cases you need to run ~perf/bin/flushmem to make sure
     compiler is flused from the cached.  That way you are
     measuring the initial page-in bechavor as thee compiler is
     loaded.  Warning, the flushmem's sometimes take quite a
     while.

     At this point you should have /tmp/fs.out1 and /tmp/fs.out2.
     You need to extract the page-ins times for the compilers in
     order to sum them up.  The easies way to do this is make the
     data tab delimited for importing into Excel.

     T1: fgrep cc1*  /tmp/fs.out1 | fgrep PAGE_IN | \
         tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-1
     T1: fgrep cc1*X /tmp/fs.out2 | fgrep PAGE_IN | \
         tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-2

     Load each of these into Excel and sum the pagin times to
     determine the percent change.

     Remember that the '*'s in the above illustrations are not really
     a '*'.  It is just a short way to show the general command lines
     where in reality you explicitly specify cc1, cc1plus, cc1obj, or
     cc1objplus.

     Also remember that you are measuring the page-in performance
     improvement on the test cases used to generate the order files
     in the first place.  Thus you should expect that these would
     probably show the greatest improvement.  That is why it is
     important to try to choose representative test cases in the
     first place.  You can try the measurements on other tests.
     But that requires you again extracting the cc1* lines using
     -###.  The above procedure only uses the orignal files since
     the command lines are already handy.
	Creating Order files, i.e., Scatter Loading the Compiler
	--------------------------------------------------------

	This is a brief description on how to generate the cc1*.order files.
	The order files are intended to minimize the number of page-ins of
	the compiler as it is loaded. If there is enough memory this
	benifits only the first load of the compiler since it will stay
	resident after that.

	Unfortunately it's a manual process since one of the tools requires
	an explicit interrupt from the terminal. You should only need to
	(re)do the order files if there's any major reorganizations or
	additions to the compilers.

	There are five steps involved to genrrate the order files.

	1. Select test cases.

	These should be "average" compilations to exercise each of the
	cc1* compilers. They should be large enough to take enough time
	to generate acceptible results. As of this writing the
	following cases were chosen:

	For cc1 - gcc/c-decl.c (from the compiler sources)
	For cc1plus - Finder_FE/AboutWindow/AboutWindow.cp
	For cc1obj - MailViewer/Compose.subproj/MessageEditor.m
	For cc1objplus - devkit/cpp.subproj/Cpp-precomp.mm

	Of these four the cc1objplus test case is not very good.
	Unfortunately there are few .mm files of any significant size.
	If a better one can be found it probably should be used.

	2. Capture the command lines needed to build the chosen files.

	For the selected projects built with with PB set PB's building
	preferences for detailed build logs. That way you can the
	full command lines you need. In non-PB projects like gcc the
	command lines are of course echoed on the terminal.

	3. Run selected command lines with -### to get the cc1* lines.

	From the full command lines you need the cc1* lines generated
	by the driver. The easiest way to get these is to add -###
	to the full command lines captured in step 2.

	4. Prepare to generate the order files

	If you don't already have it you should build a set of cc1*
	compilers with -O2 with symbols. The easiest way to do this
	is FSF-style but using buildit with build_gcc probably will
	also work.

	In the gcc objects directory you will of course have the cc1*
	compilers. You need to substitute these in each of the cc1*
	command lines captured in step 3. You also need to run these
	with ~perf/bin/pcsample to generate the order files. Thus,
	for each cc1* command line it should have the following in
	the beginning in place of the original cc1* of the step 3
	lines.

	sudo ~perf/bin/pcsample -O -E $gcc3-obj/cc1* ... rest of line...

	Where $gcc3-obj represents whatever the path is to the gcc3
	objects directory and cc1* is one of the cc1* compilers (is
	-B necessary here?).

	Note, you need sudo because pcsample will only run as root.
	Also, if you have a dual processor you need to reboot as in
	single processor mode. If you don't pcscample will tell you
	to do that by executing the command,

	nvram boot-args="cpus=1"

	5. Generate the order files

	Run the lines created in step 4. The order files (cc1*.order)
	will be left in /tmp in the direectory indicated by the summary
	that pcsample displays when you hit cmd-c to stop the pcsample
	execution. Be sure to run pcsample long enough to compile the
	entire program.

	At this point you now have the order files created. You place
	them in the order-files directory at the top level of the gcc
	source directory.

	You can also use them to measure he effects of these order files
	on compiler page-ins. If you do this go to the next step (6).
	Otherwisw you are ready to go.

	6. Creating the compilers with ther order files

	You will need two versions of the cc1* files; the ones from above
	and a set linked with their respective order file.

	From a gcc compiler build extract the command lines that link
	the cc1* files. Change the -o file to something else, for
	example, cc1 to cc1X. Then add the following options to the
	link line. Note, if you build using buildit and build_gcc the
	lines will already be there referencing the order-files
	directory. Otherwise you need to add,

	-sectorder __TEXT __text $order/cc1*.order -e start

	where $order is the directory containing the order file being
	used and cc1* is of course a reference to a specific order
	file.

	7. Measuring the performance improvement

	You need to have two terminbal windows open; T1 and T2. The
	execute the follwoing commands on the indicated terminals:

	T2: sudo fs_usage -w > /tmp/fs.out1 (do NOT execute yet)
	T1: ~perf/bin/flushmem (note this can take a while)
	T2: fs_usage
	T1: use cc1* line originally used to build order file
	T2: ctl-c when cc1* compilation done

	T2: sudo fs_usage -w > /tmp/fs.out2 (do not execute yet)
	T1: ~perf/bin/flushmem
	T2: fs_usage
	T1: use cc1*X line originally used to build order file
	T2: ctl-c when cc1XXX compilation done

	In the first group of commands you use the original cc1* line
	with the command line used to build the order file. You also
	run fs_usage at the same time to measure the paging behavior.

	The second group of commands is similar but you use the cc1*
	linked with its order file (with the -sectorder stuff mentioned
	in step 6). For this discussion call this compiler cc1*X.

	In both cases you need to run ~perf/bin/flushmem to make sure
	compiler is flused from the cached. That way you are
	measuring the initial page-in bechavor as thee compiler is
	loaded. Warning, the flushmem's sometimes take quite a
	while.

	At this point you should have /tmp/fs.out1 and /tmp/fs.out2.
	You need to extract the page-ins times for the compilers in
	order to sum them up. The easies way to do this is make the
	data tab delimited for importing into Excel.

	T1: fgrep cc1* /tmp/fs.out1 \| fgrep PAGE_IN \| \
	tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-1
	T1: fgrep cc1*X /tmp/fs.out2 \| fgrep PAGE_IN \| \
	tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-2

	Load each of these into Excel and sum the pagin times to
	determine the percent change.

	Remember that the '*'s in the above illustrations are not really
	a '*'. It is just a short way to show the general command lines
	where in reality you explicitly specify cc1, cc1plus, cc1obj, or
	cc1objplus.

	Also remember that you are measuring the page-in performance
	improvement on the test cases used to generate the order files
	in the first place. Thus you should expect that these would
	probably show the greatest improvement. That is why it is
	important to try to choose representative test cases in the
	first place. You can try the measurements on other tests.
	But that requires you again extracting the cc1* lines using
	-###. The above procedure only uses the orignal files since
	the command lines are already handy.