Creating Order files, i.e., Scatter Loading the Compiler
This is a brief description of how to generate the cc1*.order files.
The order files are intended to minimize the number of page-ins of
the compiler as it is loaded. If there is enough memory, this
benefits only the first load of the compiler, since it will stay
resident after that.
Unfortunately it's a manual process, since one of the tools requires
an explicit interrupt from the terminal. You should only need to
(re)do the order files if there are any major reorganizations of, or
additions to, the compilers.
There are five steps involved in generating the order files.
1. Select test cases.
These should be "average" compilations that exercise each of the
cc1* compilers. They should be large enough to run long enough
to generate acceptable results. As of this writing the
following cases were chosen:
For cc1 - gcc/c-decl.c (from the compiler sources)
For cc1plus - Finder_FE/AboutWindow/AboutWindow.cp
For cc1obj - MailViewer/Compose.subproj/MessageEditor.m
For cc1objplus - devkit/cpp.subproj/
Of these four the cc1objplus test case is not very good.
Unfortunately there are few .mm files of any significant size.
If a better one can be found it probably should be used.
2. Capture the command lines needed to build the chosen files.
For the selected projects built with PB, set PB's building
preferences for detailed build logs. That way you can get the
full command lines you need. In non-PB projects like gcc the
command lines are of course echoed on the terminal.
3. Run selected command lines with -### to get the cc1* lines.
From the full command lines you need the cc1* lines generated
by the driver. The easiest way to get these is to add -###
to the full command lines captured in step 2.
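
As a rough sketch of this step, the cc1* lines can be filtered out of the driver's -### output. The helper name cc1_lines and the file name in the usage comment are placeholders, not part of the original procedure. It relies on the driver printing its commands on stderr, and it assumes each command appears on a single line.

```shell
# Hypothetical helper: keep only the cc1* lines from `-###` driver output
# read on stdin. Matches cc1, cc1plus, cc1obj, and cc1objplus, whether or
# not the driver quotes the path.
cc1_lines() {
    grep -E '(^|[/" ])cc1(plus|obj|objplus)?[" ]'
}

# Usage (placeholder command line; substitute one captured in step 2):
#   gcc -### -c -O2 c-decl.c 2>&1 | cc1_lines
```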
4. Prepare to generate the order files
If you don't already have it you should build a set of cc1*
compilers with -O2 with symbols. The easiest way to do this
is FSF-style but using buildit with build_gcc probably will
also work.
In the gcc objects directory you will of course have the cc1*
compilers. You need to substitute these in each of the cc1*
command lines captured in step 3. You also need to run these
with ~perf/bin/pcsample to generate the order files. Thus each
cc1* command line should begin with the following, in place of
the original cc1* from step 3:
    sudo ~perf/bin/pcsample -O -E $gcc3-obj/cc1* ... rest of line...
where $gcc3-obj represents whatever the path is to the gcc3
objects directory and cc1* is one of the cc1* compilers (is
-B necessary here?).
Note that you need sudo because pcsample will only run as root.
Also, if you have a dual-processor machine you need to reboot in
single-processor mode. If you don't, pcsample will tell you
to do that by executing the command:
    nvram boot-args="cpus=1"
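
The substitution described in this step can be sketched as a small shell function. The name pcsample_line and its arguments are hypothetical. It only prints the rewritten line for inspection and does not run pcsample. It also assumes the compiler path contains no spaces.

```shell
# pcsample_line OBJDIR LINE
# Print LINE (a cc1* command line captured in step 3) with its leading
# compiler path replaced by "sudo ~perf/bin/pcsample -O -E OBJDIR/<name>",
# where <name> is the base name of the original compiler (cc1, cc1plus, ...).
pcsample_line() {
    objdir=$1
    line=$2
    # Strip leading blanks, then split off the first word (the compiler path).
    line=${line#"${line%%[![:space:]]*}"}
    first=${line%% *}
    rest=${line#"$first"}
    # Drop the quotes that -### may put around the path; keep the base name.
    first=${first#\"}
    first=${first%\"}
    name=${first##*/}
    printf 'sudo ~perf/bin/pcsample -O -E %s/%s%s\n' "$objdir" "$name" "$rest"
}

# Usage (the objects directory and the captured line are placeholders):
#   pcsample_line "$HOME/gcc3-obj/gcc" "$captured_cc1_line"
```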
5. Generate the order files
Run the lines created in step 4. The order files (cc1*.order)
will be left in /tmp, in the directory indicated by the summary
that pcsample displays when you hit ctl-c to stop the pcsample
execution. Be sure to let pcsample run long enough to compile the
entire program.
At this point you now have the order files created. You place
them in the order-files directory at the top level of the gcc
source directory.
You can also use them to measure the effects of these order files
on compiler page-ins. If you do this, go on to the next step (6).
Otherwise you are ready to go.
6. Creating the compilers with their order files
You will need two versions of the cc1* files: the ones from above
and a set linked with their respective order files.
From a gcc compiler build, extract the command lines that link
the cc1* files. Change the -o file to something else, for
example, cc1 to cc1X. Then add the following options to the
link line. Note that if you build using buildit and build_gcc the
options will already be there, referencing the order-files
directory. Otherwise you need to add:
    -sectorder __TEXT __text $order/cc1*.order -e start
where $order is the directory containing the order file being
used and cc1* is of course a reference to a specific order file.
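
To make the cc1* placeholder concrete, the following sketch prints the extra link options for one specific compiler. The function name sectorder_opts and the directory in the usage comment are hypothetical. The option text itself is copied from the line above.

```shell
# sectorder_opts ORDERDIR NAME
# Print the extra link options for compiler NAME (cc1, cc1plus, cc1obj, or
# cc1objplus), using the order file ORDERDIR/NAME.order.
sectorder_opts() {
    printf -- '-sectorder __TEXT __text %s/%s.order -e start\n' "$1" "$2"
}

# Usage (placeholder directory):
#   sectorder_opts /path/to/gcc/order-files cc1plus
```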
7. Measuring the performance improvement
You need to have two terminal windows open: T1 and T2. Then
execute the following commands on the indicated terminals:
    T2: sudo fs_usage -w > /tmp/fs.out1   (type it, but do NOT execute yet)
    T1: ~perf/bin/flushmem                (note this can take a while)
    T2: fs_usage                          (now execute the line typed above)
    T1: run the cc1* line originally used to build the order file
    T2: ctl-c when the cc1* compilation is done
    T2: sudo fs_usage -w > /tmp/fs.out2   (type it, but do NOT execute yet)
    T1: ~perf/bin/flushmem
    T2: fs_usage                          (now execute the line typed above)
    T1: run the same line with cc1*X in place of cc1*
    T2: ctl-c when the cc1*X compilation is done
In the first group of commands you use the original cc1* compiler
with the command line used to build the order file. You also
run fs_usage at the same time to measure the paging behavior.
The second group of commands is similar, but you use the cc1*
linked with its order file (with the -sectorder options mentioned
in step 6). For this discussion call this compiler cc1*X.
In both cases you need to run ~perf/bin/flushmem to make sure
the compiler is flushed from the cache. That way you are
measuring the initial page-in behavior as the compiler is
loaded. Warning: flushmem sometimes takes quite a while.
At this point you should have /tmp/fs.out1 and /tmp/fs.out2.
You need to extract the page-in times for the compilers in
order to sum them up. The easiest way to do this is to make the
data tab-delimited for importing into Excel.
    T1: fgrep cc1* /tmp/fs.out1 | fgrep PAGE_IN | \
            tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-1
    T1: fgrep cc1*X /tmp/fs.out2 | fgrep PAGE_IN | \
            tr -s "[:blank:]" '\t' >/tmp/cc1*.pageins-2
Load each of these into Excel and sum the page-in times to
determine the percent change.
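
If Excel is not handy, the same sum and percent change can be computed with awk. This is only a sketch: the helper names are hypothetical, and the column that holds the page-in time is passed in as a parameter because it depends on the fs_usage output format. Check a few lines of the .pageins files by eye to find it.

```shell
# sum_col FILE COL
# Sum numeric field COL (1-based) of a tab-delimited FILE.
sum_col() {
    awk -F '\t' -v col="$2" '{ total += $col } END { printf "%.6f\n", total }' "$1"
}

# percent_change BEFORE AFTER
# Print the change from BEFORE to AFTER as a percentage of BEFORE
# (negative means AFTER is smaller). BEFORE must be non-zero.
percent_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Usage (COL is an assumption; set it to the time column in your data):
#   before=$(sum_col /tmp/cc1plus.pageins-1 "$COL")
#   after=$(sum_col /tmp/cc1plus.pageins-2 "$COL")
#   percent_change "$before" "$after"
```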
Remember that the '*'s in the above illustrations are not really
a '*'. They are just a short way to show the general command lines,
where in reality you explicitly specify cc1, cc1plus, cc1obj, or
cc1objplus.
Also remember that you are measuring the page-in performance
improvement on the test cases used to generate the order files
in the first place. Thus you should expect that these would
probably show the greatest improvement. That is why it is
important to try to choose representative test cases in the
first place. You can try the measurements on other tests,
but that requires extracting the cc1* lines again using
-###. The above procedure only uses the original files since
the command lines are already handy.