Reland "[X86][MC] Always emit `rep` prefix for `bsf`"

`BMI` new instruction `tzcnt` has better performance than `bsf` on new
processors. Its encoding has a mandatory prefix '0xf3' compared to
`bsf`. If we force emit `rep` prefix for `bsf`, we will gain better
performance when the same code run on new processors.

GCC has already done this way: https://c.godbolt.org/z/6xere6fs1

Fixes #34191

Reviewed By: craig.topper, skan

Differential Revision: https://reviews.llvm.org/D130956

GitOrigin-RevId: 7f648d27a85a98fa077f0968dea081821627d477
4 files changed