==================================
Benchmarking tips
==================================


Introduction
============

When benchmarking a patch, we want to reduce every source of noise as
much as possible. How to do that depends heavily on the OS.

Note that low noise is necessary but not sufficient: it does not
exclude measurement bias.
See, for example, `"Producing Wrong Data Without Doing Anything Obviously Wrong!" by Mytkowicz, Diwan, Hauswirth and Sweeney (ASPLOS 2009) <https://users.cs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf>`_.

General
================================

* Use a high resolution timer, e.g. perf under Linux.

* Run the benchmark multiple times to be able to recognize noise.

* Disable as many processes or services as possible on the target system.

* Disable frequency scaling, turbo boost and address space
  randomization (see the OS-specific sections).

* Link statically if the OS supports it. That avoids any variation that
  might be introduced by loading dynamic libraries. This can be done
  by passing ``-DLLVM_BUILD_STATIC=ON`` to cmake.

* Try to avoid storage. On some systems you can use tmpfs. Putting the
  program, inputs and outputs on tmpfs avoids touching a real storage
  system, which can have high variability.

  To mount it (on Linux and FreeBSD at least)::

    mount -t tmpfs -o size=<XX>g none dir_to_mount

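To recognize noise, look at the spread of the repeated measurements and not
only at their mean. A minimal sketch, not part of LLVM: the function name
``summarize_times`` is hypothetical, and the input is assumed to be one
elapsed time in seconds per line::

    # Hypothetical helper: reads one timing (in seconds) per line from stdin
    # and prints the sample count, mean and relative standard deviation.
    summarize_times() {
        awk '
            NF { n++; sum += $1; sumsq += $1 * $1 }
            END {
                if (n < 2) { print "need at least 2 samples"; exit 1 }
                mean = sum / n
                var = (sumsq - n * mean * mean) / (n - 1)  # sample variance
                if (var < 0) var = 0                       # rounding guard
                printf "n=%d mean=%.6f rel_stddev=%.2f%%\n",
                       n, mean, 100 * sqrt(var) / mean
            }'
    }

    # Usage, assuming times.txt (a hypothetical file) holds one time per line:
    #   summarize_times < times.txt

A relative standard deviation well above the effect you are trying to
measure suggests that more noise has to be removed first. On Linux,
``perf stat -r`` reports a comparable spread on its own.
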
Linux
=====

* Disable address space randomization::

    echo 0 > /proc/sys/kernel/randomize_va_space

* Set scaling_governor to performance::

    for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    do
      echo performance > "$i"
    done

* Use https://github.com/lpechacek/cpuset to reserve CPUs for just the
  program you are benchmarking. If using perf, reserve at least 2 cores
  so that perf runs on one and your program on another::

    cset shield -c N1,N2 -k on

  This moves all threads off N1 and N2. The ``-k on`` means
  that even kernel threads are moved off.

* Disable the SMT sibling of each CPU you will use for the benchmark. The
  siblings of CPU N are listed in
  ``/sys/devices/system/cpu/cpuN/topology/thread_siblings_list``. Disable
  each sibling X (other than N itself) with::

    echo 0 > /sys/devices/system/cpu/cpuX/online

* Run the program with::

    cset shield --exec -- perf stat -r 10 <cmd>

  This will run the command after ``--`` on the isolated CPUs. The
  particular perf command runs the ``<cmd>`` 10 times and reports
  statistics.

With these in place you can expect perf variations of less than 0.1%.
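
The SMT sibling lookup from the list above can be scripted. The following is
a sketch under assumptions rather than an established tool: the function name
``smt_siblings`` and its optional second argument (an override of the sysfs
CPU directory, handy for trying it on a fake tree) are hypothetical. It
relies on the file holding a CPU list such as ``0,4`` or ``2-3``::

    # Hypothetical helper: print the SMT siblings of cpu N, excluding N.
    # The body runs in a subshell so its variables stay local.
    smt_siblings() (
        cpu=$1
        base=${2:-/sys/devices/system/cpu}
        # Split the list on commas, then expand ranges such as "2-3".
        tr ',' '\n' < "$base/cpu$cpu/topology/thread_siblings_list" |
        while IFS=- read -r first last; do
            seq "$first" "${last:-$first}"
        done |
        awk -v cpu="$cpu" '$1 != cpu'
    )

    # Usage (needs root), taking the shielded cpu N1 offline's siblings:
    #   for s in $(smt_siblings N1); do
    #     echo 0 > /sys/devices/system/cpu/cpu$s/online
    #   done

Run it before taking any CPU offline, since the list only contains the
CPUs that are still online.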

Linux Intel
-----------

* Disable turbo mode::

    echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
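
Before starting a run, it can be useful to confirm that the settings above
are actually in effect. A minimal read-only sketch: the function name
``check_bench_env`` and its optional argument (an override of the filesystem
root, used only to try the check against a fake tree) are hypothetical.
Files that do not exist, such as ``no_turbo`` on systems without
``intel_pstate``, are skipped::

    # Hypothetical pre-flight check: print every setting that differs from
    # the value recommended in this document; exit status 1 if any does.
    check_bench_env() (
        root=${1:-}
        status=0
        expect() {   # expect <file> <wanted value>
            [ -r "$1" ] || return 0     # file absent on this system: skip
            actual=$(cat "$1")
            if [ "$actual" != "$2" ]; then
                echo "$1: is '$actual', want '$2'"
                status=1
            fi
        }
        expect "$root/proc/sys/kernel/randomize_va_space" 0
        for g in "$root"/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
        do
            expect "$g" performance
        done
        expect "$root/sys/devices/system/cpu/intel_pstate/no_turbo" 1
        exit "$status"
    )

    # Usage: check_bench_env && cset shield --exec -- perf stat -r 10 <cmd>

The check does not cover the cpuset shield or the SMT siblings, which
still have to be verified separately.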