
 <!--#include virtual="../../header.incl" -->

 <div class="www_sectiontitle">2013 LLVM Developers' Meeting</div>
 <table>
         <tr><td valign="top">
 <ol>
         <li><a href="#agenda">November 7 - Meeting Agenda</a></li>
         <li><a href="#abstracts">Talk Abstracts</a></li>
         <li><a href="#poster">Poster Abstracts</a></li>
         <li><a href="#bof">BoF Abstracts</a></li>
 </ol>
 </td><td>
 <ul>
   <li><b>What</b>: The seventh general meeting of LLVM Developers and Users.</li>
   <li><b>When</b>: November 6-7, 2013</li>
   <li><b>Where</b>: Le Meridien, San Francisco, CA</li>
 </ul>
 </td></tr></table>

 <p align="center"><h2><b>SPONSORED BY: <a href="http://apple.com">Apple</a>, <a href="http://www.google.com">Google</a>, <a href="http://www.qualcomm.com/quicinc/">QuIC</a>, <a href="http://www.codeplay.com">Codeplay Software</a></b></h2></p>

 <p>The meeting serves as a forum for <a href="http://llvm.org">LLVM</a>,
 <a href="http://clang.llvm.org">Clang</a>, <a href="http://lldb.llvm.org">LLDB</a> and
 other LLVM project developers and users to get acquainted, learn how LLVM is used, and
 exchange ideas about LLVM and its (potential) applications. More broadly, we
 believe the event will be of particular interest to the following people:</p>

 <ul>
 <li>Active developers of projects in the LLVM Umbrella
 (LLVM core, Clang, LLDB, libc++, compiler_rt, klee, dragonegg, lld, etc.).</li>
 <li>Anyone interested in using these as part of another project.</li>
 <li>Compiler, programming language, and runtime enthusiasts.</li>
 <li>Those interested in using compiler and toolchain technology in novel
 and interesting ways.</li>
 </ul>
 <p>
 We also invite you to sign up for the <a href="http://lists.llvm.org/mailman/listinfo/llvm-devmeeting">official Developer Meeting mailing list</a> to be kept informed of updates concerning the meeting.
 </p>

 <div class="www_sectiontitle" id="agenda">November 7 - Meeting Agenda</div>
 <p>
 <table id="devmtg">
   <tr><th>Media</th><th>Talk</th></tr>

   <tr><td><a href="slides/Lattner-WelcomeTalk.pdf">Slides</a><br><a href="videos/Lattner-WelcomeTalk-720.mov">Video</a> (Computer)<br><a href="videos/Lattner-WelcomeTalk-360.mov">Video</a> (Mobile)</td><td><b>Welcome</b><br>Tanya Lattner</td></tr>

   <tr><td><a href="slides/Lattner-LLVM Early Days.pdf">Slides[1]</a> <a href="slides/Adve-LLVM10yr.pdf">Slides[2]</a><br><a href="videos/Lattner-LLVM Early Days-720.mov">Video</a> (Computer)<br><a href="videos/Lattner-LLVM Early Days-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk1">LLVM: 10 years and going strong</a></b><br>Chris Lattner, <i>Apple</i><br>Vikram Adve, <i>University of Illinois, Urbana-Champaign</i></td></tr>

 <tr><td><a href="slides/Zakai-Emscripten.pdf">Slides</a><br><a href="videos/Zakai-Emscripten-720.mov">Video</a> (Computer)<br><a href="videos/Zakai-Emscripten-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk2">Emscripten: Compiling LLVM bitcode to JavaScript
 </a></b><br>Alon Zakai, <i>Mozilla</i></td></tr>
   <tr><td><a href="slides/Koch-FunctionMerging.pdf">Slides</a><br><a href="videos/Koch-CodeSize-720.mov">Video</a> (Computer)<br><a href="videos/Koch-CodeSize-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk3">Code Size Reduction using Similar Function Merging</a></b><br>Tobias Edler von Koch, <i>University of Edinburgh / QuIC</i><br> Pranav Bhandarkar, <i>QuIC</i></td></tr>
   <tr><td><a href="slides/BenchmarkBOFNotes.html">Notes</a></td><td><a href="#bof1"><b>BOF: Performance Tracking &amp; Benchmarking Infrastructure</b></a><br>Kristof Beyls, <i>ARM</i></td></tr>


  <tr><td><a href="slides/Fischer-Julia.html">Slides</a><br><a href="videos/Fischer-Julia-720.mov">Video</a> (Computer)<br><a href="videos/Fischer-Julia-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk4">Julia: An LLVM-based approach to scientific computing</a></b><br>Keno Fischer, <i>Harvard College/MIT CSAIL</i></td></tr>
   <tr><td><a href="slides/Lopes-SMT.pdf">Slides</a><br><a href="videos/Lopes-SMTSolvers-720.mov">Video</a> (Computer)<br><a href="videos/Lopes-SMTSolvers-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk5">Verifying optimizations using SMT solvers</a></b><br>Nuno Lopes, <i>INESC-ID / U. Lisboa</i></td></tr>
   <tr><td><a href="slides/TableGenBOFNotes.html">Notes</a></td><td><a href="#bof2"><b>BOF: TableNextGen</b></a><br>Mihail Popa, <i>ARM</i></td></tr>

 <tr><td><a href="slides/Serebryany-ASAN.pdf">Slides</a><br><a href="videos/Serebryany-ASAN-720.mov">Video</a> (Computer)<br><a href="videos/Serebryany-ASAN-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk6">New Address Sanitizer Features</a></b><br>Kostya Serebryany, <i>Google</i><br> Alexey Samsonov, <i>Google</i></td></tr>
   <tr><td><a href="slides/Stellard-R600.pdf">Slides</a><br><a href="videos/Stellard-R600-720.mov">Video</a> (Computer)<br><a href="videos/Stellard-R600-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk7">A Detailed Look at the R600 Backend</a></b><br>Tom Stellard, <i>Advanced Micro Devices Inc.</i></td></tr>
   <tr><td><a href="slides/DebugBOFNotes.html">Notes</a></td><td><a href="#bof3"><b>BOF: Debug Info</b></a><br>Eric Christopher, <i>Google</i></td></tr>

 <tr><td><a href="slides/Robinson-PS4Toolchain.pdf">Slides</a><br><a href="videos/Robinson-PS4Toolchain-720.mov">Video</a> (Computer)<br><a href="videos/Robinson-PS4Toolchain-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk8">Developer Toolchain for the PlayStation®4</a></b><br>Paul T. Robinson, <i>Sony Computer Entertainment America</i></td></tr>
   <tr><td><a href="slides/Tzannes-ASaP.pdf">Slides</a><br><a href="videos/Tzannes-ASaP-720.mov">Video</a> (Computer)<br><a href="videos/Tzannes-ASaP-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk9">Annotations for Safe Parallelism in Clang</a></b><br>Alexandros Tzannes, <i>University of Illinois, Urbana-Champaign</i></td></tr>
   <tr><td></td><td><a href="#bof4"><b>BOF: Extending the Sanitizer tools and porting them to other platforms</b></a><br>Kostya Serebryany, <i>Google</i><br>Alexey Samsonov, <i>Google</i><br>Evgeniy Stepanov, <i>Google</i></td></tr>

 <tr><td><a href="slides/Rotem-Vectorization.pdf">Slides</a><br><a href="videos/Rotem-Vectorization-720.mov">Video</a> (Computer)<br><a href="videos/Rotem-Vectorization-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk10">Vectorization in LLVM</a></b><br>Nadav Rotem, <i>Apple</i><br> Arnold Schwaighofer, <i>Apple</i></td></tr>
   <tr><td><a href="slides/Kleckner-ClangVisualC++.pdf">Slides</a><br><a href="videos/Kleckner-ClangVisualC++-720.mov">Video</a> (Computer)<br><a href="videos/Kleckner-ClangVisualC++-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk11">Bringing clang and LLVM to Visual C++ users</a></b><br>Reid Kleckner, <i>Google</i></td></tr>
   <tr><td></td><td><a href="#bof5"><b>BOF: High Level Loop Optimization / Polly</b></a><br>Tobias Grosser, <i>INRIA</i><br> Sebastian Pop, <i>QuIC</i><br> Zino Benaissa, <i>QuIC</i></td></tr>

 <tr><td>See Abstracts</td><td><b><a href="#poster">Posters</a></b></td></tr>

 <tr><td><a href="slides/Wanderman-Milne-Cloudera.pdf">Slides</a><br><a href="videos/Wanderman-Milne-Cloudera-720.mov">Video</a> (Computer)<br><a href="videos/Wanderman-Milne-Cloudera-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk12">Building a Modern Database with LLVM</a></b><br>Skye Wanderman-Milne, <i>Cloudera</i></td></tr>
 <tr><td><a href="slides/Riley-DebugginWithLLDB.pdf">Slides</a><br><a href="videos/Riley-DebugginWithLLDB-720.mov">Video</a> (Computer)<br><a href="videos/Riley-DebugginWithLLDB-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk13">Adapting LLDB for your hardware: Remote Debugging the Hexagon DSP</a></b><br>Colin Riley, <i>Codeplay</i></td></tr>
   <tr><td></td><td><a href="#bof6"><b>BOF: Optimizations using LTO</b></a><br>Zino Benaissa, <i>QuIC</i><br>Tony  Linthicum, <i>QuIC</i></td></tr>

   <tr><td><a href="slides/Carruth-PGO.pdf">Slides</a><br><a href="videos/Carruth-PGO-720.mov">Video</a> (Computer)<br><a href="videos/Carruth-PGO-360.mov">Video</a> (Mobile)</td><td><b><a href="#talk14">PGO in LLVM: Status and Current Work</a></b><br>Bob Wilson, <i>Apple</i><br> Chandler Carruth, <i>Google</i><br> Diego Novillo, <i>Google</i></td></tr>
   <tr><td>See Abstracts</td><td><b><a href="#light">Lightning Talks</a></b><br></td></tr>
   <tr><td></td><td><a href="#bof7"><b>BOF: JIT &amp; MCJIT</b></a><br>Andy Kaylor, <i>Intel Corporation</i></td></tr>

 </table>
 </p>


 <div class="www_sectiontitle" id="abstracts">Talk Abstracts</div>

 <p>
 <b><a id="talk1">LLVM: 10 years and going strong
 </a></b><br>
 <i>Chris Lattner - Apple,
 Vikram Adve - University of Illinois, Urbana-Champaign</i><br>
 <a href="slides/Lattner-LLVM Early Days.pdf">Slides[1]</a> <a href="slides/Adve-LLVM10yr.pdf">Slides[2]</a><br>
 <a href="videos/Lattner-LLVM Early Days-720.mov">Video</a> (Computer) <a href="videos/Lattner-LLVM Early Days-360.mov">Video</a> (Mobile)<br>
 Keynote talk celebrating the 10th anniversary of LLVM 1.0.
 </p>

 <p>
 <b><a id="talk2">Emscripten: Compiling LLVM bitcode to JavaScript</a></b><br>
 <i>Alon Zakai - Mozilla</i><br>
 <a href="slides/Zakai-Emscripten.pdf">Slides</a><br>
 <a href="videos/Zakai-Emscripten-720.mov">Video</a> (Computer) <a href="videos/Zakai-Emscripten-360.mov">Video</a> (Mobile)<br>
 Emscripten is an open source compiler that converts LLVM bitcode to JavaScript. JavaScript is a fairly unusual target for compilation, being a high-level dynamic language instead of a low-level CPU assembly, but efficient compilation to JavaScript is useful because of the ubiquity of web browsers which use it as their standard language. This talk will detail how Emscripten utilizes LLVM and clang to convert C/C++ into JavaScript, and cover the specific challenges that compiling to JavaScript entails, such as the lack of goto statements, while on the other hand making other aspects of compilation simpler, for example having native exception handling support. Some such issues are general and have to do with JavaScript itself, but specific challenges with Emscripten's interaction with LLVM will also be described, as well as opportunities for better integration between the projects in the future.
 </p>

 <p>
 <b><a id="talk3">Code Size Reduction using Similar Function Merging</a></b><br>
 <i>Tobias Edler von Koch - University of Edinburgh / QuIC, Pranav Bhandarkar - QuIC</i><br>
 <a href="slides/Koch-FunctionMerging.pdf">Slides</a><br>
 <a href="videos/Koch-CodeSize-720.mov">Video</a> (Computer) <a href="videos/Koch-CodeSize-360.mov">Video</a> (Mobile)<br>
 Code size reduction is a critical goal for compiler optimizations targeting embedded applications. While LLVM continues to improve its performance optimization capabilities, it is currently still lacking a robust set of optimizations specifically targeting code size. In our talk, we will describe an optimization pass that aims to reduce code size by merging similar functions at the IR level. Significantly extending the existing MergeFunctions optimization, the pass is capable of merging multiple functions even if there are minor differences between them. A number of heuristics are used to determine when merging of functions is profitable. Alongside hash tables, these also ensure that compilation time remains at an acceptable level. We will describe our experience of using this new optimization pass to reduce the code size of a significant embedded application at Qualcomm Innovation Center by 2%.
 </p>

 <p>
 <b><a id="talk4">Julia: An LLVM-based approach to scientific computing</a></b><br>
 <i>Keno Fischer - Harvard College/MIT CSAIL</i><br>
 <a href="slides/Fischer-Julia.html">Slides</a><br>
 <a href="videos/Fischer-Julia-720.mov">Video</a> (Computer) <a href="videos/Fischer-Julia-360.mov">Video</a> (Mobile)<br>
 Julia is a new high-level dynamic programming language specifically designed for
 scientific and technical computing, while at the same time not ignoring the
 need for the expressiveness and the power of a modern general purpose
 programming language.
 <br>
 Thanks to LLVM's JIT compilation capabilities, for which Julia was written
 from the ground up, Julia can achieve a level of performance usually reserved
 for compiled programs written in C, C++ or other compiled languages. It thus
 manages to bridge the gap between very high level languages such as MATLAB, R or
 Python usually used for algorithm prototyping and those languages used when
 performance is of the essence, reducing development time and the possibility for
 subtle differences between the prototype and the production algorithms.
 </p>

 <p>
 <b><a id="talk5">Verifying optimizations using SMT solvers</a></b><br>
 <i>Nuno Lopes - INESC-ID / U. Lisboa</i><br>
 <a href="slides/">Slides</a><br>
 <a href="videos/Lopes-SMTSolvers-720.mov">Video</a> (Computer) <a href="videos/Lopes-SMTSolvers-360.mov">Video</a> (Mobile)<br>
 Instcombine and Selection DAG optimizations, although usually simple, can easily hide bugs.
 We've had many cases in the past where these optimizers were producing wrong code in certain corner cases.
 In this talk I'll describe a way to prove the correctness of such optimizations using an off-the-shelf SMT solver (bit-vector theory).  I'll give examples of past bugs found in these optimizations, how to encode them into SMT-Lib 2 format, and how to spot the bugs.
 The encoding to the SMT format, although manual, is straightforward and consumes little time. The verification is then automatic.
 </p>
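 <p>
 As a rough illustration of the kind of check the abstract above describes, the sketch below tests whether a rewrite preserves behavior on every 8-bit input. It uses an exhaustive loop as a stand-in for an SMT solver, so it is not the method presented in the talk, and every function name in it is hypothetical. One rewrite is sound under wrapping arithmetic; the other fails on odd inputs.
 </p>

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `before` and `after` agree on all 256 possible 8-bit inputs.
// This brute-force loop stands in for the solver query; it only scales to tiny bit widths.
template <typename F, typename G>
bool equivalent_on_all_u8(F before, G after) {
    for (unsigned x = 0; x < 256; ++x) {
        const std::uint8_t v = static_cast<std::uint8_t>(x);
        if (before(v) != after(v)) {
            return false;  // counterexample found
        }
    }
    return true;
}

// Candidate rewrite 1: x * 2  ->  x << 1 (both wrap modulo 256).
std::uint8_t mul2(std::uint8_t x) { return static_cast<std::uint8_t>(x * 2u); }
std::uint8_t shl1(std::uint8_t x) { return static_cast<std::uint8_t>(x << 1); }

// Candidate rewrite 2 (deliberately wrong): (x / 2) * 2  ->  x.
std::uint8_t div_then_mul(std::uint8_t x) { return static_cast<std::uint8_t>((x / 2) * 2); }
std::uint8_t identity(std::uint8_t x) { return x; }
```

 <p>
 Calling <code>equivalent_on_all_u8(mul2, shl1)</code> accepts the first rewrite, while <code>equivalent_on_all_u8(div_then_mul, identity)</code> rejects the second. A solver working in bit-vector theory can answer the same question for 32- or 64-bit values, where enumeration is impractical.
 </p>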

 <p>
 <b><a id="talk6">New Address Sanitizer Features</a></b><br>
 <i>Kostya Serebryany - Google,
 Alexey Samsonov - Google</i><br>
 <a href="slides/Serebryany-ASAN.pdf">Slides</a><br>
 <a href="videos/Serebryany-ASAN-720.mov">Video</a> (Computer) <a href="videos/Serebryany-ASAN-360.mov">Video</a> (Mobile)<br>
 AddressSanitizer is a fast memory error detector that uses LLVM for compile-time instrumentation. In this talk we will present several new features in AddressSanitizer.
 <ul>
 <li>Initialization order checker finds bugs where the program behavior depends on the order in which global variables from different modules are initialized.</li>
 <li>Stack-use-after-scope detector finds uses of stack-allocated objects outside of the scope where they are defined.</li>
 <li>Similarly, stack-use-after-return detector finds uses of stack variables after the functions they are defined in have exited.</li>
 <li>LeakSanitizer finds heap memory leaks; it is built on top of AddressSanitizer memory allocator.</li>
 <li>We will also give an update on AddressSanitizer for Linux kernel.
 </li></ul>
 </p>

 <p>
 <b><a id="talk7">A Detailed Look at the R600 Backend</a></b><br>
 <i>Tom Stellard - Advanced Micro Devices Inc.</i><br>
 <a href="slides/Stellard-R600.pdf">Slides</a><br>
 <a href="videos/Stellard-R600-720.mov">Video</a> (Computer) <a href="videos/Stellard-R600-360.mov">Video</a> (Mobile)<br>
 The R600 backend, which targets AMD GPUs, was merged into LLVM prior to
 the 3.3 release.  It is one component of AMD's open source GPU drivers
 which provide support for several popular graphics and compute APIs.
 The backend supports two different generations of GPUs, the older
 VLIW4/VLIW5 architecture and the more recent GCN architecture.  In this
 talk, I will discuss the history of the R600 backend, how it is used,
 and why we chose to use LLVM for our open source drivers.  Additionally,
 I'll give an in-depth look at the backend and its features and present an
 overview of the unique architecture of supported GPUs.  I will describe
 the challenges this architecture presented in writing an LLVM backend and
 the approaches we have taken for instruction selection and scheduling.
 I will also look at the future goals for this backend and areas for
 improvement in the backend as well as core LLVM.
 </p>

 <p>
 <b><a id="talk8">Developer Toolchain for the PlayStation®4</a></b><br>
 <i>Paul T. Robinson - Sony Computer Entertainment America</i><br>
 <a href="slides/Robinson-PS4Toolchain.pdf">Slides</a><br>
 <a href="videos/Robinson-PS4Toolchain-720.mov">Video</a> (Computer) <a href="videos/Robinson-PS4Toolchain-360.mov">Video</a> (Mobile)<br>
 The PlayStation®4 has a developer toolchain centered on Clang as the CPU compiler.  We describe how Clang/LLVM fits into Sony Computer Entertainment's (mostly proprietary) toolchain, focusing on customizations, game-developer experience, and working with the open-source community.
 </p>

 <p>
 <b><a id="talk9">Annotations for Safe Parallelism in Clang</a></b><br>
 <i>Alexandros Tzannes -
 University of Illinois, Urbana-Champaign</i><br>
 <a href="slides/Tzannes-ASaP.pdf">Slides</a><br>
 <a href="videos/Tzannes-ASaP-720.mov">Video</a> (Computer) <a href="videos/Tzannes-ASaP-360.mov">Video</a> (Mobile)<br>
 The Annotations for Safe Parallelism (ASaP) project at UIUC is implementing a static checker in Clang to allow writing provably safe parallel code. ASaP is inspired by DPJ (Deterministic Parallel Java) but unlike it, it does not extend the base language. Instead, we rely on the rich C++11 attribute system to enrich C++ types and to pass information to our ASaP checker. The ASaP checker gives strong guarantees such as race-freedom, *strong* atomicity, and deadlock freedom for commonly used parallelism patterns, and it is at the prototyping stage where we can prove the parallel safety of simple TBB programs. We are evolving ASaP in collaboration with our Autodesk partners who help guide its design in order to solve incrementally complex problems faced by real software teams in industry. In this presentation, I will present an overview of how the checker works, what is currently supported, what we have "in the works", and some discussion about incorporating some of the ideas of the thread safety annotation to assist our analysis.
 </p>

 <p>
 <b><a id="talk10">Vectorization in LLVM</a></b><br>
 <i>Nadav Rotem - Apple, Arnold Schwaighofer - Apple</i><br>
 <a href="slides/Rotem-Vectorization.pdf">Slides</a><br>
 <a href="videos/Rotem-Vectorization-720.mov">Video</a> (Computer) <a href="videos/Rotem-Vectorization-360.mov">Video</a> (Mobile)<br>
 Vectorization is a powerful optimization that can accelerate programs in multiple domains.  Over the last year two new vectorization passes were added to LLVM: the Loop-vectorizer, which vectorizes loops, and the SLP-vectorizer, which combines independent scalar calculations into a vector. Both of these optimizations together show a significant performance increase on many applications.  In this talk we’ll present our work on the vectorizers in the past year.  We’ll discuss the overall architecture of these passes, the cost model for deciding when vectorization is profitable, and describe some interesting design tradeoffs. Finally, we want to talk about some ideas to further improve the vectorization infrastructure.
 </p>
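 <p>
 To make the two code shapes mentioned in the abstract above concrete, here is a small sketch with hypothetical function names. The first function is a counted loop of the kind a loop vectorizer looks for; the second is a run of independent scalar operations on adjacent elements, which is the kind of pattern an SLP vectorizer tries to combine. Whether a compiler actually vectorizes either one depends on the target, the optimization flags, and its cost model.
 </p>

```cpp
#include <cassert>
#include <cstddef>

// Loop shape: each iteration is independent of the others, so several
// iterations can in principle be executed by one vector instruction.
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Straight-line shape: four independent scalar adds on adjacent elements,
// which could in principle be combined into a single 4-wide vector add.
void add4(const float* a, const float* b, float* out) {
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

 <p>
 Both functions compute the same results whether or not they are vectorized; only the generated instructions differ.
 </p>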

 <p>
 <b><a id="talk11">Bringing clang and LLVM to Visual C++ users
 </a></b><br>
 <i>Reid Kleckner - Google</i><br>
 <a href="slides/Kleckner-ClangVisualC++.pdf">Slides</a><br>
 <a href="videos/Kleckner-ClangVisualC++-720.mov">Video</a> (Computer) <a href="videos/Kleckner-ClangVisualC++-360.mov">Video</a> (Mobile)<br>
 This talk covers the work we've been doing to help make clang and LLVM more
 compatible with Microsoft's Visual C++ toolchain.  With a compatible toolchain,
 we can deliver all of the features that clang and LLVM have to offer, such as
 AddressSanitizer.  Perhaps the most important point of compatibility is the C++
 ABI, which is a huge and complicated beast that covers name mangling, calling
 conventions, record layout, vtable layout, virtual inheritance, and more.  This
 talk will go into detail about some of the more interesting parts of the ABI.
 </p>

 <p>
 <b><a id="talk12">Building a Modern Database with LLVM</a></b><br>
 <i>Skye Wanderman-Milne - Cloudera</i><br>
 <a href="slides/Wanderman-Milne-Cloudera.pdf">Slides</a><br>
 <a href="videos/Wanderman-Milne-Cloudera-720.mov">Video</a> (Computer) <a href="videos/Wanderman-Milne-Cloudera-360.mov">Video</a> (Mobile)<br>
 Cloudera Impala is a low-latency SQL query engine for Apache Hadoop. In order to achieve optimal CPU efficiency and query execution times, Impala uses LLVM to perform JIT code generation to take advantage of query-specific information unavailable at compile time. For example, code generation allows us to remove many conditionals (and the associated branch misprediction overhead) necessary for handling multiple types, operators, functions, etc.; inline what would otherwise be virtual function calls; and propagate query-specific constants. These optimizations can reduce overall query time by almost 300%.
 <br>
 In this talk, I'll outline the motivation for using LLVM within Impala and go over some examples and results of JIT optimizations we currently perform, as well as ones we'd like to implement in the future.
 </p>

 <p>
 <b><a id="talk13">Adapting LLDB for your hardware: Remote Debugging the Hexagon DSP
 </a></b><br>
 <i>Colin Riley - Codeplay</i><br>
 <a href="slides/Riley-DebugginWithLLDB.pdf">Slides</a><br>
 <a href="videos/Riley-DebugginWithLLDB-720.mov">Video</a> (Computer) <a href="videos/Riley-DebugginWithLLDB-360.mov">Video</a> (Mobile)<br>
 LLDB is at the stage of development where support is being added for a wide range of hardware devices. Its modular approach means adapting it to debug a new system has a well-defined step-by-step process, which can progress fairly quickly. Presented is a guide of what implementation steps are required to get your hardware supported via LLDB using Remote Debugging, giving examples from work we are doing to support the Hexagon DSP within LLDB.
 </p>

 <p>
 <b><a id="talk14">PGO in LLVM: Status and Current Work
 </a></b><br>
 <i>Bob Wilson - Apple,
 Chandler Carruth - Google,
 Diego Novillo - Google</i><br>
 <a href="slides/Carruth-PGO.pdf">Slides</a><br>
 <a href="videos/Carruth-PGO-720.mov">Video</a> (Computer) <a href="videos/Carruth-PGO-360.mov">Video</a> (Mobile)<br>
 Profile Guided Optimization (PGO) is one of the most fundamental weaknesses in the LLVM optimization portfolio. We have had several attempts to build it, and to this day we still lack a holistic platform for driving optimizations through profiling. This talk will consist of three light-speed crash courses on where PGO is in LLVM, where it needs to be, and how several of us are working to get it there.
 <br>
 First, we will present some motivational background on what PGO is good for and what it isn't. We will cover exactly how profile information interacts with the LLVM optimizations, the strategies we use at a high level to organize and use profile information, and the specific optimizations that are in turn driven by it. Much of this will cover infrastructure as it exists today, with some forward-looking information added into the mix.
 <br>
 Next, we will cover one planned technique for getting profile information into LLVM: AutoProfile. This technique simplifies the use and deployment of PGO by using external profile sources such as Linux perf events or other sample-based external profilers. When available, it has some key advantages: no instrumentation build mode, reduced instrumentation overhead, and more predictable application behavior by using hardware to assist the profiling.
 <br>
 Finally, we will cover an alternate strategy to provide more traditional and detailed profiling through compiler inserted instrumentation. This approach will also strive toward two fundamental goals: resilience of the profile to both source code and compiler changes, and visualization of the profile by developers to understand how their code is being exercised. The second draws obvious parallels with code coverage tools, and the design tries to unify these two use cases in a way that the same infrastructure can drive both.

 </p>

 <div class="www_sectiontitle" id="poster">Poster Abstracts</div>
 <p>
 <b>Finding a few needles in some large haystacks: Identifying missing target optimizations using a superoptimizer</b><br>
 <i>Hal Finkel - Argonne National Laboratory</i><br>
 <a href="slides/Finkel-Poster.pdf">Poster</a><br>
 So you're developing an LLVM backend, and you've added a bunch of TableGen patterns, custom DAG combines and other lowering code; are you done? This poster describes the development of a specialized superoptimizer, applied to the output of the compiler on large codebases, to look for missing optimizations in the PowerPC backend. This superoptimizer extracts potentially-interesting instruction sequences from assembly code, and then uses the open-source CVC4 SMT solver to search for provably-correct shorter alternatives.
 </p>

 <p>
 <b>Intel® AVX-512 Architecture. Comprehensive vector extension for HPC and enterprise</b><br>
 <i>Elena Demikhovsky, Intel® Software and Services Group - Israel</i><br>
 <a href="slides/Demikhovsky-Poster.pdf">Poster</a><br>
 Knights Landing (KNL) is the second generation of the Intel® MIC architecture-based products. KNL will support the Intel® Advanced Vector Extensions 512 instruction set architecture, a significant leap in SIMD support. This new ISA, designed with an unprecedented level of richness, offers a new level of support and opportunities for vectorizing compilers to target efficiently. The poster presents the Intel® AVX-512 ISA and shows how the new capabilities may be used in the LLVM compiler.
 </p>

 <p>
 <b>Fracture: Inverting the Target Independent Code Generator</b><br>
 <i>Richard T. Carback III – Charles Stark Draper Laboratories</i><br>
 <a href="slides/Carback-Poster.pdf">Poster</a><br>
 Fracture is a TableGen backend and associated library that ingests a basic block of target instructions and emits a DAG which resembles the post-legalization phase of LLVM’s SelectionDAG instruction selection process. It leverages the pre-existing target TableGen definitions, without modification, to provide a generic way to abstract LLVM IR efficiently from different target instruction sets. Fracture can speed up a variety of applications and also enable generic implementations of a number of static and dynamic analysis tools. Examples include interactive debuggers or disassemblers that provide LLVM IR representations to users unfamiliar with the instruction set, static analysis algorithms that solve indirect control transfer (ICT) problems modified for IR to use KLEE or other LLVM technologies, and IR-based decompilers or emulators extended to work on machine binaries.
 </p>

 <p>
 <b>Automatic generation of LLVM backends from LISA</b><br>
 <i>Jeroen Dobbelaere - Synopsys</i><br>
 <a href="slides/Dobbelaere-Poster.pdf">Poster</a><br>
 LISA (language for instruction-set architectures) allows for the efficient specification of processor architectures,
 including non-standard, customized architectures. Using a LISA input specification, designers can automatically
 generate an instruction-set simulator, assembler, linker, and debugger interface, as well as RTL.
 <br>
 We have extended LISA to allow for the generation of an LLVM compiler backend tailored to the custom architecture.
 This work includes the development of a new scheduler that is able to handle hazards with high latency and delay slots,
 expanding the applicability of LLVM to a wider range of architectures. The LISA-based design flow allows for rapid
 architectural explorations, profiling dozens of different processor architectures within hours, with the automatic
 generation of an LLVM compiler being a key enabler of this design methodology.
 </p>

 <p>
 <b>clad - Automatic Differentiation with Clang</b><br>
 <i>Violeta Ilieva (Princeton University), CERN; Vassil Vassilev, CERN</i><br>
 <a href="slides/Vassilev-Poster.pdf">Poster</a><br>
 Automatic differentiation (AD) evaluates the derivative of a function specified in a computer program by applying a set of techniques to change the semantics of that function. Unlike other methods for differentiation, such as numerical and symbolic, AD yields machine-precision derivatives even of complicated functions at relatively low processing and storage costs. We would like to present our AD tool, clad - a clang plugin that differentiates C++ functions through implementing source code transformation and employing the chain rule of differential calculus in its forward mode. That is, clad decomposes the original functions into elementary statements and generates their derivatives with respect to the user-defined independent variables. The combination of these intermediate expressions forms additional source code, built through modifying clang’s abstract syntax tree (AST) along the control flow. Compared to other tools, clad has the advantage of relying on clang and llvm modules for parsing the original program. It uses clang's plugin mechanism for constructing the derivative's AST representation, for generating executable code, and for performing global analysis. Thus it results in low maintenance, high compatibility, and excellent performance.
 </p>
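 <p>
 For readers unfamiliar with forward-mode AD, the sketch below shows the chain rule applied to one elementary operation at a time. It uses operator overloading on a hypothetical <code>Dual</code> type, which is a different implementation technique from the source transformation the abstract above describes, and it does not use clad's API. The type and function names are assumptions made for illustration.
 </p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical dual number: a value paired with its derivative with
// respect to the chosen independent variable.
struct Dual {
    double val;  // value of the expression
    double der;  // derivative of the expression
};

// Sum rule: (u + v)' = u' + v'
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }

// Product rule: (u * v)' = u' * v + u * v'
Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Chain rule for sine: sin(u)' = cos(u) * u'
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Example function f(x) = x * x + sin(x), written once and usable with
// either double or Dual arguments.
template <typename T>
T f(T x) { return x * x + sin(x); }

// Seeds the independent variable with derivative 1 and reads back f'(x).
double derivative_of_f(double x) {
    const Dual result = f(Dual{x, 1.0});
    return result.der;
}
```

 <p>
 For this <code>f</code>, <code>derivative_of_f(x)</code> evaluates 2x + cos(x) without any finite-difference approximation. A source-transformation tool instead emits new code for the derivative ahead of time.
 </p>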


 <div class="www_sectiontitle" id="light">Lightning Talk Abstracts</div>
 <p>
 <b>Fixing MC for ARM v7-A: Just a few corner cases – how hard can it be?</b><br>
 <i>Mihail Popa - ARM</i><br>
 <a href="slides/Popa-MCARM.pdf">Slides</a><br>
 <a href="videos/Popa-MCARM-720.mov">Video</a> (Computer) <a href="videos/Popa-MCARM-360.mov">Video</a> (Mobile)<br>
 In 2012, MC Hammer was presented as a testing infrastructure to exhaustively verify the MC layer implementation for the ARM backend. Within ARM we have been working to fix any bugs and we have reached the point where all but one problem has been solved.  Some of the issues discovered in this process have proven to be excessively difficult to fix. The purpose of the presentation is to give a brief rundown of the major headaches and to suggest possible courses of action for improving LLVM infrastructure.
 </p>

 <p>
 <b>VLIW Support in the MC Layer</b><br>
 <i>Mario Guerra - Qualcomm Innovation Center, Incorporated</i><br>
 <a href="slides/Guerra-VLIW.pdf">Slides</a><br>
 <a href="videos/Guerra-VLIW-720.mov">Video</a> (Computer) <a href="videos/Guerra-VLIW-360.mov">Video</a> (Mobile)<br>
 Modern DSP architectures such as Hexagon use VLIW instruction packets, which are not well suited to the single instruction streaming model of the LLVM MC layer. Developing an assembler for Hexagon presents unique challenges in the MC layer, especially since Hexagon leverages an optimizing assembler to achieve maximum performance. It is possible to support VLIW within the MC layer by treating every MC instruction as a bundle, and adding all instructions in a packet as sub instruction operands. Furthermore, subclassing MCInst to create a target-specific type of MCInst allows us to capture packet information that will be used to make optimization decisions prior to emitting the code to object format.
 </p>

 <p>
 <b>Link-Time Optimization without Linker Support</b><br>
 <i>Yunzhong Gao - Sony Computer Entertainment America</i><br>
 <a href="slides/Gao-LTO.pdf">Slides</a><br>
 <a href="videos/Gao-LTO-720.mov">Video</a> (Computer) <a href="videos/Gao-LTO-360.mov">Video</a> (Mobile)<br>
 LLVM's plugin for the Gold linker enables link-time optimization (LTO).  But the toolchain for PlayStation®4 does not include Gold.  Here's how we achieved LTO without a bitcode-aware linker.
 </p>

 <p>
 <b>A comparison of the DWARF debugging information produced by LLVM and GCC</b><br>
 <i>Keith Walker, ARM</i><br>
 <a href="slides/Walker-DWARF.pdf">Slides</a><br>
 <a href="videos/Walker-DWARF-720.mov">Video</a> (Computer) <a href="videos/Walker-DWARF-360.mov">Video</a> (Mobile)<br>
 This talk explores the quality of the DWARF debugging information generated by LLVM by
 comparing it with that produced by GCC for ARM/AArch64 based targets. It highlights where LLVM's debugging information is superior to that generated by GCC
 and also where there are deficiencies and scope for further development.
 I will also explain how these differences translate into good or bad debug experiences
 for users of LLVM.
 </p>

 <p>
 <b>aarch64 neon work</b><br>
 <i>Ana Pazos - QuIC, Jiangning Liu - ARM</i><br>
 <a href="slides/Pazos-Aarch64.pdf">Slides</a><br>
 <a href="videos/Pazos-Aarch64-720.mov">Video</a> (Computer) <a href="videos/Pazos-Aarch64-360.mov">Video</a> (Mobile)<br>
 ARM and Qualcomm are implementing the AArch64 Advanced SIMD (NEON) instruction set. As a joint team, we are implementing all 25 classes of NEON instructions at the MC layer, as well as all ACLE (ARM C Language Extensions) intrinsics at the C level. Our talk will highlight the design choice of a single arm_neon.h for both ARM (AArch32) and AArch64, the choice of appropriate LLVM IR value types for generating SISD instruction classes, improving the quality of the patterns in .td files by reducing the number of LLVM IR intrinsics, and the test categories used to build a robust back end. Finally, we will mention some future plans, such as enabling the machine-instruction-based scheduler and performance tuning.
 </p>

 <p>
 <b>JavaScript JIT with LLVM</b><br>
 <i>Filip Pizlo - Apple Inc.</i><br>
 <a href="slides/Pizlo-JavascriptJIT.pdf">Slides</a><br>
 <a href="videos/Pizlo-JavascriptJIT-720.mov">Video</a> (Computer) <a href="videos/Pizlo-JavascriptJIT-360.mov">Video</a> (Mobile)<br>
 Dynamic languages present unique challenges for compilation, such as the need for type speculation and self-modifying code.  This talk shows how to add support for these features to LLVM and use them to implement a JIT for JavaScript.
 </p>

 <p>
 <b>Debug Info Quick Update</b><br>
 <i>Eric Christopher - Google Inc.</i><br>
 <a href="slides/Christopher-DebugInfo.pdf">Slides</a><br>
 <a href="videos/Christopher-DebugInfo-720.mov">Video</a> (Computer) <a href="videos/Christopher-DebugInfo-360.mov">Video</a> (Mobile)<br>
 A quick update on what's been going on in debug info support since the Euro meeting.
 </p>

 <p>
 <b>lld a linker framework</b><br>
 <i>Shankar Easwaran - Qualcomm Innovation Center</i><br>
 <a href="slides/Easwaran-LLD.pdf">Slides</a><br>
 <a href="videos/Easwaran-LLD-720.mov">Video</a> (Computer) <a href="videos/Easwaran-LLD-360.mov">Video</a> (Mobile)<br>
 The lld project is working towards becoming a production-quality linker targeting the PECOFF, Darwin, and ELF formats, and it is under heavy development. The talk discusses how lld achieves universal linking and how it is moving towards becoming a linker framework that could be an integral part of LLVM. It then covers the new opportunities this opens up for linking, such as lld APIs, symbol resolution improvements, link-time optimization (LTO), and a better user experience through diagnostics and user-driven inputs that control linker behavior.
 </p>

 	<div class="www_sectiontitle" id="bof">BoF Abstracts</div>
 <p>
 <b><a id="bof1">BOF: Performance Tracking & Benchmarking Infrastructure
 </a></b><br>
 <i>Kristof Beyls - ARM</i><br>
 We lack a good public infrastructure for efficiently tracking performance
 improvements and regressions. As a small step towards improving the
 current situation, I propose a BoF to discuss mainly the
 following topics:
 <br>
 (a) What advantages do we want the performance tracking and
    benchmarking infrastructure to give us?
 <br>
 (b) What are the main technical and non-technical challenges we expect
    in setting up such an infrastructure?
 </p>

 <p>
 <b><a id="bof2">BOF: TableNextGen
 </a></b><br>
 <i>Mihail Popa - ARM</i><br>
 TableGen is an essential component of the LLVM ecosystem, and the time has come to consider its evolution. The largest issues are the lack of a formal specification, the mixing of logical concepts, and its unsuitability for automated generation. The aim of this BoF is to gather ideas for an improved specification language that follows the generally accepted criteria for domain-specific languages: well-defined domain meta-models, formally defined semantics, simplicity, expressiveness, and lack of redundancy.
 </p>

 <p>
 <b><a id="bof3">BOF: Debug Info
 </a></b><br>
 <i>Eric Christopher - Google</i><br>
 </p>

 <p>
 <b><a id="bof4">BOF: Extending the Sanitizer tools and porting them to other platforms
 </a></b><br>
 <i>Kostya Serebryany - Google,
 Alexey Samsonov - Google,
 Evgeniy Stepanov - Google</i><br>
 </p>

 <p>
 <b><a id="bof5">BOF: High Level Loop Optimization / Polly
 </a></b><br>
 <i>Tobias Grosser - INRIA,
 Sebastian Pop - QuIC,
 Zino Benaissa - QuIC</i><br>
 Discussions about loop optimizations, both generic ones and the
 polyhedral loop optimizations implemented in Polly. Topics include
 the pass order for high-level loop optimizations, scalar evolution,
 dependence analysis, high-level loop optimizations in core LLVM, the polyhedral infrastructure of Polly, and the isl polyhedral support library.
 </p>

 <p>
 <b><a id="bof6">BOF: Optimizations using LTO
 </a></b><br>
 <i>Zino Benaissa - QuIC</i><br>
 </p>

 <p>
 <b><a id="bof7">BOF: JIT & MCJIT
 </a></b><br>
 <i>Andy Kaylor - Intel Corporation</i><br>

 </p>

 <!-- *********************************************************************** -->
 <hr>

 <!--#include virtual="../../footer.incl" -->