tree 09a314e24dfefd871a5f830b3563720884a9fd90
parent 4e63e0457cc1f768c628e71a0786fdb8a6ec271e
author Srinivasa Ravi <srinivasar@nvidia.com> 1747213799 +0530
committer GitHub <noreply@github.com> 1747213799 +0530
gpgsig -----BEGIN PGP SIGNATURE-----
 
 wsFcBAABCAAQBQJoJF3nCRC1aQ7uu5UhlAAAnqoQAFAeD/4Gp8iWQu0EImdQxxdr
 P2h+sJzaUOMJAxVgPGvPOCMPk7pKNWEuDlKjcI4QmuKJsDNQKD+rtJu4A8yLUzen
 8s7FaU7IGMuYwzCwLi9ZOr20VIA7trz3hFJnmQ/YLTzk3yztjDRlQXOl6ZlpHPcG
 j5y17mLsFYyGvrvc/jB4eL8tvh1LvejN6kV1tNmOppqRptGKiXJAuLC6gHqidhUI
 Y5KoMEv3o0C45vlhiaaWm6gp5xDbrrjurT/T021BfdBsIiPxruXaYR027HBmtWJH
 IdVlzk5BYeic3auaZJpVyijIbimKcF9lHapW7KsHJZIbIwAHtYeNt3xAsRYSt1XG
 /PIFl934z3VB7+u0kykPCePJy/FtB9Oiij7skei5XfK2UQwBDnJfIC7HpcrY8Fpx
 x9VSTO5mPFDx3NVAlxv6oVs7wqPmSSVGeGJYzjUZerECGaNqJRXt6On0HUA4fbKJ
 0TC/O884+JVMZzbdtipHnleIRHxqTC3/Yc1whD+gMDx2D5tLL0N0H5m5UgiqK1Cv
 tNx8j/8z2+wfKBVxYX4UggpfN8lLbBtFl8sOn99EqqoV1V1itpguGZ5h/fUG0fFH
 SPuFYFTPtNUogHM3Hqc3xECdNzEGQD+ix8SVpgvZbGwJXBMPuCuGD1KS/0z0dNqb
 BK1OtyGwAJaL0HPHk1D7
 =0LhM
 -----END PGP SIGNATURE-----
 

[NVPTX] Add intrinsics and clang builtins for conversions of f4x2 type (#139244)

This change adds intrinsics and clang builtins for the `cvt` instruction
variants of the FP4 type `.e2m1x2`, introduced in PTX 8.6 for `sm_100a`,
`sm_101a`, and `sm_120a`.
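
For readers unfamiliar with the format, here is a minimal host-side sketch
of how a packed FP4 byte might be decoded. It assumes `.e2m1x2` holds two
E2M1 values (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1,
no Inf/NaN encodings) in one byte. The helper names `decode_e2m1` and
`decode_e2m1x2` are hypothetical and are not part of this change; the
sketch does not call the new builtins and makes no claim about which `cvt`
operand lands in which nibble.

```c
#include <stdint.h>

/* Hypothetical reference helper, not part of this change.
 * Decodes one 4-bit E2M1 code, assuming the layout
 * [sign:1][exponent:2][mantissa:1] with exponent bias 1. */
static float decode_e2m1(uint8_t code) {
    int sign = (code >> 3) & 1;
    int exp  = (code >> 1) & 3;
    int man  = code & 1;
    float mag;

    if (exp == 0) {
        /* Subnormal range: magnitude is 0 or 0.5. */
        mag = 0.5f * (float)man;
    } else {
        /* Normal range: (1 + man/2) * 2^(exp - 1). */
        mag = (1.0f + 0.5f * (float)man) * (float)(1 << (exp - 1));
    }
    return sign ? -mag : mag;
}

/* Hypothetical helper: splits a packed byte into its low and high
 * nibbles and decodes each one independently. */
static void decode_e2m1x2(uint8_t packed, float *lo, float *hi) {
    *lo = decode_e2m1((uint8_t)(packed & 0xF));
    *hi = decode_e2m1((uint8_t)(packed >> 4));
}
```

Under these assumptions the representable magnitudes are 0, 0.5, 1, 1.5, 2,
3, 4, and 6, which is why saturating conversions to this type clamp to a
finite maximum.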

Tests are added in `NVPTX/convert-sm100a.ll` and
`clang/test/CodeGen/builtins-nvptx.c` and verified with ptxas 12.8.0.

PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt