| commit | 45c549857383eade0c47db75951fb9441260653a | |
|---|---|---|
| author | S. VenkataKeerthy <31350914+svkeerthy@users.noreply.github.com> | Fri Aug 29 14:56:56 2025 -0700 |
| committer | GitHub <noreply@github.com> | Fri Aug 29 14:56:56 2025 -0700 |
| tree | d34c8e7778f5599193b9c117e27d5079cda166a8 | |
| parent | e317c7e36f6681e88b62824b243f383dec8b106f | |
[IR2Vec] Refactor vocabulary to use canonical type IDs (#155323)

Refactor IR2Vec vocabulary to use canonical type IDs, improving the embedding representation for LLVM IR types.

The previous implementation used raw Type::TypeID values directly in the vocabulary, which led to redundant entries (e.g., all float variants mapped to "FloatTy" but had separate slots).

This change improves the vocabulary by:

1. Making the type representation more consistent by properly canonicalizing types
2. Reducing vocabulary size by eliminating redundant entries
3. Improving the embedding quality by ensuring similar types share the same representation

(Tracking issue - #141817)
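The idea in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `RawTypeID`, `CanonicalTypeID`, `canonicalize`, and `vocabSlot` are hypothetical names invented for illustration, and the set of types is deliberately reduced. It only shows how several raw type IDs can share one vocabulary slot once they are mapped to a canonical ID.

```cpp
#include <cstddef>

// Hypothetical stand-in for raw type IDs. This is NOT LLVM's Type::TypeID;
// it is a reduced list chosen only to illustrate the redundancy.
enum class RawTypeID { Half, BFloat, Float, Double, Integer, Pointer, Void };

// Hypothetical canonical IDs: one entry per distinct vocabulary slot.
// NumCanonical is a sentinel that gives the number of slots.
enum class CanonicalTypeID { FloatTy, IntegerTy, PointerTy, VoidTy, NumCanonical };

// Map every raw ID to its canonical ID. All floating-point variants
// collapse to FloatTy, so they no longer need separate slots.
constexpr CanonicalTypeID canonicalize(RawTypeID ID) {
  switch (ID) {
  case RawTypeID::Half:
  case RawTypeID::BFloat:
  case RawTypeID::Float:
  case RawTypeID::Double:
    return CanonicalTypeID::FloatTy;
  case RawTypeID::Integer:
    return CanonicalTypeID::IntegerTy;
  case RawTypeID::Pointer:
    return CanonicalTypeID::PointerTy;
  case RawTypeID::Void:
    return CanonicalTypeID::VoidTy;
  }
  return CanonicalTypeID::VoidTy; // Not reached for valid enumerators.
}

// Index into a (hypothetical) vocabulary table that is sized by the
// number of canonical IDs rather than the number of raw IDs.
constexpr std::size_t vocabSlot(RawTypeID ID) {
  return static_cast<std::size_t>(canonicalize(ID));
}
```

In this sketch, seven raw IDs map onto four slots, which mirrors the stated goals of a smaller vocabulary and a shared representation for similar types. The actual mapping and naming used by IR2Vec may differ.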
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours, or regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.