[MLIR][XeGPU] Support partial subgroup lane distribution (#201667) for convert_layout

Add lowering support in XeGPUSgToLaneDistribute for values that are
distributed across only a fraction of the subgroup.

- SgToLaneConvertLayout now lowers a rank-2 xegpu.convert_layout that
  shrinks the lane layout along the outer (distributed) dimension while
  keeping lane_data unchanged (e.g. [16, 1] -> [8, 1]). The
  partial-subgroup case is detected directly in the pattern: equal
  order, rank 2, unit inner lane layout, and a genuinely distributed
  outer lane layout (> 1, which also rules out the degenerate [1, 1]
  layout). Because the data is no longer replicated in every lane, it
  is gathered across lanes and the distributed outer dimension is
  doubled when the lane count is halved.

- The cross-lane gather is factored into a dedicated helper,
  shuffleDataAsLaneLayoutChange(): it bitcasts the source to i32,
  issues gpu.shuffle up to fetch the values from the dropped lanes, and
  concatenates the lane-local and gathered data with vector.shuffle.
  Only halving the lane count (factor of two), rank-2 vectors, and bit
  widths that are a multiple of 32 are supported; other cases fail the
  match.

- SgToLaneVectorExtractStridedSlice now adjusts the effective subgroup
  size when the source lane layout along the distributed dimension is
  smaller than the hardware subgroup size, so slice offsets/sizes are
  scaled correctly (e.g. a subgroup-space offset of 8 maps to a
  distributed offset of 1).

Add a unit test exercising the dpas_mx scale operand path.
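
The offset scaling in the last bullet can be illustrated with a minimal
sketch. This is not the code from the pass: the helper names
effectiveSubgroupSize and toDistributed are hypothetical, and the sketch
assumes lane_data of 1 along the distributed dimension, so that a
subgroup-space offset or size is simply divided by the number of lanes
that actually hold data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the pass): number of lanes that
// actually hold data along the distributed dimension. When the lane
// layout is smaller than the hardware subgroup, the lane layout wins.
int64_t effectiveSubgroupSize(int64_t laneLayoutDim, int64_t hwSubgroupSize) {
  return std::min(laneLayoutDim, hwSubgroupSize);
}

// Hypothetical helper: map a subgroup-space offset or size along the
// distributed dimension to its per-lane (distributed) value.
// Assumes lane_data == 1 along that dimension and an evenly divisible value.
int64_t toDistributed(int64_t sgValue, int64_t effectiveSize) {
  assert(effectiveSize > 0 && sgValue % effectiveSize == 0);
  return sgValue / effectiveSize;
}
```

Under these assumptions, a lane layout of 8 on a 16-wide subgroup gives
an effective size of 8, so a subgroup-space offset of 8 becomes a
distributed offset of 1, matching the example in the commit message.
The same division gives the per-lane outer extent: 16 rows over 16 lanes
is 1 row per lane, and 16 rows over 8 lanes is 2.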
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.