| ======================== | 
 | Scudo Hardened Allocator | 
 | ======================== | 
 |  | 
 | .. contents:: | 
 |    :local: | 
 |    :depth: 1 | 
 |  | 
 | Introduction | 
 | ============ | 
 |  | 
 | The Scudo Hardened Allocator is a user-mode allocator based on LLVM Sanitizer's | 
 | CombinedAllocator, which aims at providing additional mitigations against heap | 
 | based vulnerabilities, while maintaining good performance. | 
 |  | 
 | Currently, the allocator supports (was tested on) the following architectures: | 
 |  | 
 | - i386 (& i686) (32-bit); | 
 | - x86_64 (64-bit); | 
 | - armhf (32-bit); | 
 | - AArch64 (64-bit); | 
 | - MIPS (32-bit & 64-bit). | 
 |  | 
 | The name "Scudo" has been retained from the initial implementation (Escudo | 
 | meaning Shield in Spanish and Portuguese). | 
 |  | 
 | Design | 
 | ====== | 
 |  | 
 | Allocator | 
 | --------- | 
 | Scudo can be considered a Frontend to the Sanitizers' common allocator (later | 
 | referenced as the Backend). It is split between a Primary allocator, fast and | 
 | efficient, that services smaller allocation sizes, and a Secondary allocator | 
 | that services larger allocation sizes and is backed by the operating system | 
 | memory mapping primitives. | 
 |  | 
 | Scudo was designed with security in mind, but aims at striking a good balance | 
 | between security and performance. It is highly tunable and configurable. | 
 |  | 
 | Chunk Header | 
 | ------------ | 
 | Every chunk of heap memory will be preceded by a chunk header. This has two | 
 | purposes, the first one being to store various information about the chunk, | 
 | the second one being to detect potential heap overflows. In order to achieve | 
 | this, the header will be checksummed, involving the pointer to the chunk itself | 
 | and a global secret. Any corruption of the header will be detected when said | 
 | header is accessed, and the process terminated. | 
 |  | 
 | The following information is stored in the header: | 
 |  | 
 | - the 16-bit checksum; | 
 | - the class ID for that chunk, which is the "bucket" where the chunk resides | 
 |   for Primary backed allocations, or 0 for Secondary backed allocations; | 
 | - the size (Primary) or unused bytes amount (Secondary) for that chunk, which is | 
 |   necessary for computing the size of the chunk; | 
 | - the state of the chunk (available, allocated or quarantined); | 
 | - the allocation type (malloc, new, new[] or memalign), to detect potential | 
 |   mismatches in the allocation APIs used; | 
 | - the offset of the chunk, which is the distance in bytes from the beginning of | 
 |   the returned chunk to the beginning of the Backend allocation; | 
 |  | 
 | This header fits within 8 bytes, on all platforms supported. | 
 |  | 
 | The checksum is computed as a CRC32 (made faster with hardware support) | 
 | of the global secret, the chunk pointer itself, and the 8 bytes of header with | 
 | the checksum field zeroed out. It is not intended to be cryptographically | 
 | strong.  | 
 |  | 
 | The header is atomically loaded and stored to prevent races. This is important | 
 | as two consecutive chunks could belong to different threads. We also want to | 
 | avoid any type of double fetches of information located in the header, and use | 
 | local copies of the header for this purpose. | 
 |  | 
 | Delayed Freelist | 
 | ----------------- | 
 | A delayed freelist allows us to not return a chunk directly to the Backend, but | 
 | to keep it aside for a while. Once a criterion is met, the delayed freelist is | 
 | emptied, and the quarantined chunks are returned to the Backend. This helps | 
 | mitigate use-after-free vulnerabilities by reducing the determinism of the | 
 | allocation and deallocation patterns. | 
 |  | 
 | This feature is using the Sanitizer's Quarantine as its base, and the amount of | 
 | memory that it can hold is configurable by the user (see the Options section | 
 | below). | 
 |  | 
 | Randomness | 
 | ---------- | 
 | It is important for the allocator to not make use of fixed addresses. We use | 
 | the dynamic base option for the SizeClassAllocator, allowing us to benefit | 
 | from the randomness of the system memory mapping functions. | 
 |  | 
 | Usage | 
 | ===== | 
 |  | 
 | Library | 
 | ------- | 
 | The allocator static library can be built from the LLVM build tree thanks to | 
 | the ``scudo`` CMake rule. The associated tests can be exercised thanks to the | 
 | ``check-scudo`` CMake rule. | 
 |  | 
 | Linking the static library to your project can require the use of the | 
 | ``whole-archive`` linker flag (or equivalent), depending on your linker. | 
 | Additional flags might also be necessary. | 
 |  | 
 | Your linked binary should now make use of the Scudo allocation and deallocation | 
 | functions. | 
 |  | 
 | You may also build Scudo like this:  | 
 |  | 
 | .. code:: console | 
 |  | 
 |   cd $LLVM/projects/compiler-rt/lib | 
 |   clang++ -fPIC -std=c++11 -msse4.2 -O2 -I. scudo/*.cpp \ | 
 |     $(\ls sanitizer_common/*.{cc,S} | grep -v "sanitizer_termination\|sanitizer_common_nolibc\|sancov_\|sanitizer_unwind\|sanitizer_symbol") \ | 
 |     -shared -o libscudo.so -pthread | 
 |  | 
 | and then use it with existing binaries as follows: | 
 |  | 
 | .. code:: console | 
 |  | 
 |   LD_PRELOAD=`pwd`/libscudo.so ./a.out | 
 |  | 
 | Clang | 
 | ----- | 
 | With a recent version of Clang (post rL317337), the allocator can be linked with | 
 | a binary at compilation using the ``-fsanitize=scudo`` command-line argument, if | 
 | the target platform is supported. Currently, the only other Sanitizer Scudo is | 
 | compatible with is UBSan (eg: ``-fsanitize=scudo,undefined``). Compiling with | 
 | Scudo will also enforce PIE for the output binary. | 
 |  | 
 | Options | 
 | ------- | 
 | Several aspects of the allocator can be configured on a per process basis | 
 | through the following ways: | 
 |  | 
 | - at compile time, by defining ``SCUDO_DEFAULT_OPTIONS`` to the options string | 
 |   you want set by default; | 
 |  | 
 | - by defining a ``__scudo_default_options`` function in one's program that | 
 |   returns the options string to be parsed. Said function must have the following | 
 |   prototype: ``extern "C" const char* __scudo_default_options(void)``, with a | 
 |   default visibility. This will override the compile time define; | 
 |  | 
 | - through the environment variable SCUDO_OPTIONS, containing the options string | 
 |   to be parsed. Options defined this way will override any definition made | 
 |   through ``__scudo_default_options``. | 
 |  | 
 | The options string follows a syntax similar to ASan, where distinct options | 
 | can be assigned in the same string, separated by colons. | 
 |  | 
 | For example, using the environment variable: | 
 |  | 
 | .. code:: console | 
 |  | 
 |   SCUDO_OPTIONS="DeleteSizeMismatch=1:QuarantineSizeKb=64" ./a.out | 
 |  | 
 | Or using the function: | 
 |  | 
 | .. code:: cpp | 
 |  | 
 |   extern "C" const char *__scudo_default_options() { | 
 |     return "DeleteSizeMismatch=1:QuarantineSizeKb=64"; | 
 |   } | 
 |  | 
 |  | 
 | The following options are available: | 
 |  | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | Option                      | 64-bit default | 32-bit default | Description                                    | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | QuarantineSizeKb            | 256            | 64             | The size (in Kb) of quarantine used to delay   | | 
 | |                             |                |                | the actual deallocation of chunks. Lower value | | 
 | |                             |                |                | may reduce memory usage but decrease the       | | 
 | |                             |                |                | effectiveness of the mitigation; a negative    | | 
 | |                             |                |                | value will fallback to the defaults. Setting   | | 
 | |                             |                |                | *both* this and ThreadLocalQuarantineSizeKb to | | 
 | |                             |                |                | zero will disable the quarantine entirely.     | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | QuarantineChunksUpToSize    | 2048           | 512            | Size (in bytes) up to which chunks can be      | | 
 | |                             |                |                | quarantined.                                   | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | ThreadLocalQuarantineSizeKb | 1024           | 256            | The size (in Kb) of per-thread cache use to    | | 
 | |                             |                |                | offload the global quarantine. Lower value may | | 
 | |                             |                |                | reduce memory usage but might increase         | | 
 | |                             |                |                | contention on the global quarantine. Setting   | | 
 | |                             |                |                | *both* this and QuarantineSizeKb to zero will  | | 
 | |                             |                |                | disable the quarantine entirely.               | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | DeallocationTypeMismatch    | true           | true           | Whether or not we report errors on             | | 
 | |                             |                |                | malloc/delete, new/free, new/delete[], etc.    | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | DeleteSizeMismatch          | true           | true           | Whether or not we report errors on mismatch    | | 
 | |                             |                |                | between sizes of new and delete.               | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 | | ZeroContents                | false          | false          | Whether or not we zero chunk contents on       | | 
 | |                             |                |                | allocation and deallocation.                   | | 
 | +-----------------------------+----------------+----------------+------------------------------------------------+ | 
 |  | 
 | Allocator related common Sanitizer options can also be passed through Scudo | 
 | options, such as ``allocator_may_return_null`` or ``abort_on_error``. A detailed | 
 | list including those can be found here: | 
 | https://github.com/google/sanitizers/wiki/SanitizerCommonFlags. | 
 |  | 
 | Error Types | 
 | =========== | 
 |  | 
 | The allocator will output an error message, and potentially terminate the | 
 | process, when an unexpected behavior is detected. The output usually starts with | 
 | ``"Scudo ERROR:"`` followed by a short summary of the problem that occurred as | 
 | well as the pointer(s) involved. Once again, Scudo is meant to be a mitigation, | 
 | and might not be the most useful of tools to help you root-cause the issue, | 
 | please consider `ASan <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_ | 
 | for this purpose. | 
 |  | 
 | Here is a list of the current error messages and their potential cause: | 
 |  | 
 | - ``"corrupted chunk header"``: the checksum verification of the chunk header | 
 |   has failed. This is likely due to one of two things: the header was | 
 |   overwritten (partially or totally), or the pointer passed to the function is | 
 |   not a chunk at all; | 
 |  | 
 | - ``"race on chunk header"``: two different threads are attempting to manipulate | 
 |   the same header at the same time. This is usually symptomatic of a | 
 |   race-condition or general lack of locking when performing operations on that | 
 |   chunk; | 
 |  | 
 | - ``"invalid chunk state"``: the chunk is not in the expected state for a given | 
 |   operation, eg: it is not allocated when trying to free it, or it's not | 
 |   quarantined when trying to recycle it, etc. A double-free is the typical | 
 |   reason this error would occur; | 
 |  | 
 | - ``"misaligned pointer"``: we strongly enforce basic alignment requirements, 8 | 
 |   bytes on 32-bit platforms, 16 bytes on 64-bit platforms. If a pointer passed | 
 |   to our functions does not fit those, something is definitely wrong. | 
 |  | 
 | - ``"allocation type mismatch"``: when the optional deallocation type mismatch | 
 |   check is enabled, a deallocation function called on a chunk has to match the | 
 |   type of function that was called to allocate it. Security implications of such | 
 |   a mismatch are not necessarily obvious but situational at best; | 
 |  | 
 | - ``"invalid sized delete"``: when the C++14 sized delete operator is used, and | 
 |   the optional check enabled, this indicates that the size passed when | 
 |   deallocating a chunk is not congruent with the one requested when allocating | 
 |   it. This is likely to be a `compiler issue <https://software.intel.com/en-us/forums/intel-c-compiler/topic/783942>`_, | 
 |   as was the case with Intel C++ Compiler, or some type confusion on the object | 
 |   being deallocated; | 
 |  | 
 | - ``"RSS limit exhausted"``: the maximum RSS optionally specified has been | 
 |   exceeded; | 
 |  | 
 | Several other error messages relate to parameter checking on the libc allocation | 
 | APIs and are fairly straightforward to understand. |