)]}'
{
  "commit": "dafffe262d6d1114fa83ec155241aad4e7793845",
  "tree": "b375f4458f33ad8db8a1c13338c4a4b9fa146f26",
  "parents": [
    "33d5a3b455d3bb0d0487dabb98728aeaa8cba03b"
  ],
  "author": {
    "name": "Rahul Joshi",
    "email": "rjoshi@nvidia.com",
    "time": "Mon Sep 01 13:44:18 2025 -0700"
  },
  "committer": {
    "name": "GitHub",
    "email": "noreply@github.com",
    "time": "Mon Sep 01 13:44:18 2025 -0700"
  },
  "message": "[LLVM][MC][DecoderEmitter] Add support to specialize decoder per bitwidth (#154865)\n\nThis change adds an option to specialize decoders per bitwidth, which\ncan help reduce the (compiled) code size of the decoder code.\n\n**Current state**:\nCurrently, the code generated by the decoder emitter consists of two key\nfunctions: `decodeInstruction` which is the entry point into the\ngenerated code and `decodeToMCInst` which is invoked when a decode op is\nreached while traversing through the decoder table. Both functions are\ntemplated on `InsnType` which is the raw instruction bits that are\nsupplied to `decodeInstruction`.\n\nSeveral backends call `decodeInstruction` with different `InsnType`\ntypes, leading to several template instantiations of these functions in\nthe final code. As an example, AMDGPU instantiates this function with\ntype `DecoderUInt128` for decoding 96/128-bit instructions,\n`uint64_t` for decoding 64-bit instructions, and `uint32_t` for decoding\n32-bit instructions. Since there is just one `decodeToMCInst` in the\ngenerated code, it has code that handles decoding for *all* instruction\nsizes. However, the decoders emitted for different instruction sizes\nrarely have any intersection with each other. That means, in the AMDGPU\ncase, the instantiation with InsnType \u003d\u003d DecoderUInt128 has decoder code\nfor 32/64-bit instructions that is *never exercised*. Conversely, the\ninstantiation with InsnType \u003d\u003d uint64_t has decoder code for\n128/96/32-bit instructions that is never exercised. This leads to\nunnecessary dead code in the generated disassembler binary (that the\ncompiler cannot eliminate by itself).\n\n**New state**:\nWith this change, we introduce an option\n`specialize-decoders-per-bitwidth`. Under this mode, the DecoderEmitter\nwill generate several versions of the `decodeToMCInst` function, one for\neach bitwidth. 
The code is still templated, but will require backends to\nspecify, for each `InsnType` used, the bitwidth of the instruction that\nthe type is used to represent using a type-trait `InsnBitWidth`. This\nwill enable the templated code to choose the right variant of\n`decodeToMCInst`. Under this mode, a particular instantiation will only\nend up instantiating a single generated variant of `decodeToMCInst`,\nwhich will include only those decoders that are applicable to a single\nbitwidth, resulting in elimination of the code duplication through\ninstantiation and a reduction in code size.\n\nAdditionally, under this mode, decoders are uniqued only within a given\nbitwidth (as opposed to across all bitwidths without this option), so\nthe decoder index values assigned are smaller, and consume fewer bytes in\ntheir ULEB128 encoding. As a result, the generated decoder tables can\nalso reduce in size.\n\nAdopt this feature for the AMDGPU and RISCV backends. In a release build,\nthis results in a net 55% reduction in the .text size of\nlibLLVMAMDGPUDisassembler.so and a 5% reduction in the .rodata size. 
For\nRISCV, which today uses a single `uint64_t` type, this results in a 3.7%\nincrease in code size (expected as we instantiate the code 3 times now).\n\nActual measured sizes are as follows:\n```\nBaseline commit: 72c04bb882ad70230bce309c3013d9cc2c99e9a7\nConfiguration: Ubuntu clang version 18.1.3, release build with asserts disabled.\n \nAMDGPU        Before       After      Change\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n.text         612327       275607     55% reduction\n.rodata       369728       351336      5% reduction          \n\nRISCV:\n\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\n.text          47407       49187      3.7% increase   \n.rodata        35768       35839      0.1% increase\n```",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "70762a4a5ebaee66ea97c15c6e5b5ac2e8678257",
      "old_mode": 33188,
      "old_path": "llvm/include/llvm/MC/MCDecoder.h",
      "new_id": "459c8a6a5ea34044f6a454c6ac15f7207e5c797a",
      "new_mode": 33188,
      "new_path": "llvm/include/llvm/MC/MCDecoder.h"
    },
    {
      "type": "modify",
      "old_id": "619ff4e5c73c4091744de4919c9ac54f308e836f",
      "old_mode": 33188,
      "old_path": "llvm/lib/Target/AMDGPU/CMakeLists.txt",
      "new_id": "05295ae73be231fab69f6ea50199a16de5b314e6",
      "new_mode": 33188,
      "new_path": "llvm/lib/Target/AMDGPU/CMakeLists.txt"
    },
    {
      "type": "modify",
      "old_id": "6a2beeed41dfdbd454f1d2383a09e18bcf0a219a",
      "old_mode": 33188,
      "old_path": "llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp",
      "new_id": "80d194afa926be3a970fe599047a941bb78c7553",
      "new_mode": 33188,
      "new_path": "llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp"
    },
    {
      "type": "modify",
      "old_id": "f4d164bf10c3c6be1b31326c70204f9926e7b6b1",
      "old_mode": 33188,
      "old_path": "llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h",
      "new_id": "ded447b6f8d5a75d14fa0817a21fab963da36cc9",
      "new_mode": 33188,
      "new_path": "llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h"
    },
    {
      "type": "modify",
      "old_id": "47329b2c2f4d26ab5cda49c1cb67ec5fa0046c61",
      "old_mode": 33188,
      "old_path": "llvm/lib/Target/RISCV/CMakeLists.txt",
      "new_id": "531238ae85029dd52d578ea0d8ac0943869c729e",
      "new_mode": 33188,
      "new_path": "llvm/lib/Target/RISCV/CMakeLists.txt"
    },
    {
      "type": "modify",
      "old_id": "de1bdb4a8811cccbdc208090f4468866a36f970f",
      "old_mode": 33188,
      "old_path": "llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp",
      "new_id": "c8b89f5192c3dc2a1aab5e5433de7040666b0dd7",
      "new_mode": 33188,
      "new_path": "llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "b4142e983ef776d71065f146c883b65c380bc631",
      "new_mode": 33188,
      "new_path": "llvm/test/TableGen/DecoderEmitterBitwidthSpecialization.td"
    },
    {
      "type": "modify",
      "old_id": "7bed18c19a5138fe75e167cc20269813723f2a70",
      "old_mode": 33188,
      "old_path": "llvm/test/TableGen/DecoderEmitterFnTable.td",
      "new_id": "8929e6da716e6f8850626199c6ca8ce474062753",
      "new_mode": 33188,
      "new_path": "llvm/test/TableGen/DecoderEmitterFnTable.td"
    },
    {
      "type": "modify",
      "old_id": "dbbf866f057e55b7a42b537e0b08984e1fa378e4",
      "old_mode": 33188,
      "old_path": "llvm/test/TableGen/HwModeEncodeDecode3.td",
      "new_id": "5e9ac7d17e45aa6051e1a1e4dcd46d17a2f037dd",
      "new_mode": 33188,
      "new_path": "llvm/test/TableGen/HwModeEncodeDecode3.td"
    },
    {
      "type": "modify",
      "old_id": "769c5895ec3c1e346dd0f46ff1e7d1d5f41b5fbf",
      "old_mode": 33188,
      "old_path": "llvm/test/TableGen/VarLenDecoder.td",
      "new_id": "10e254f7673e63eac90f618e60eccbf8115558b9",
      "new_mode": 33188,
      "new_path": "llvm/test/TableGen/VarLenDecoder.td"
    },
    {
      "type": "modify",
      "old_id": "e4992b9e9e725ecc0137e1bc48d554b32c9c0fc2",
      "old_mode": 33188,
      "old_path": "llvm/utils/TableGen/DecoderEmitter.cpp",
      "new_id": "354c2a788d5b1b688d283faffe23b6dbcdca119d",
      "new_mode": 33188,
      "new_path": "llvm/utils/TableGen/DecoderEmitter.cpp"
    },
    {
      "type": "modify",
      "old_id": "11bc53793650809f1897b172915a62e1abcc8c0f",
      "old_mode": 33188,
      "old_path": "llvm/utils/gn/secondary/llvm/lib/Target/AMDGPU/Disassembler/BUILD.gn",
      "new_id": "9cc98cd8642d64f490ed5b7594a95afc21067d2c",
      "new_mode": 33188,
      "new_path": "llvm/utils/gn/secondary/llvm/lib/Target/AMDGPU/Disassembler/BUILD.gn"
    },
    {
      "type": "modify",
      "old_id": "cb579221fd3669ab65efdf9f5a8aaafc78835d5f",
      "old_mode": 33188,
      "old_path": "llvm/utils/gn/secondary/llvm/lib/Target/RISCV/Disassembler/BUILD.gn",
      "new_id": "447a67af6be7baae2648e60783677f8f7843f8f9",
      "new_mode": 33188,
      "new_path": "llvm/utils/gn/secondary/llvm/lib/Target/RISCV/Disassembler/BUILD.gn"
    }
  ]
}
