[X86] Add rewrite pattern for SSE41/AVX1 roundss/sd + blendps/pd (#172056)

Due to a previous PR (https://github.com/llvm/llvm-project/pull/171227),
operations like `_mm_ceil_sd` compile to suboptimal assembly:
```asm
roundsd xmm1, xmm1, 10
blendpd xmm0, xmm1, 1
```
This PR introduces a rewrite pattern that mitigates this by fusing the two operations.
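
To illustrate why the fusion is safe, here is a minimal scalar model of the two-instruction sequence above next to a single scalar round that takes its upper lane from a separate operand. This is only a sketch: `V2`, `ceil_sd_unfused`, `ceil_sd_fused`, and `check_models_agree` are hypothetical names used for illustration and are not part of the patch, and the exact instruction the pattern emits is not shown here (presumably something like `roundsd xmm0, xmm1, 10`). The model assumes immediate `10` selects rounding toward positive infinity with the precision exception suppressed, and that blend mask `1` takes lane 0 from the source and lane 1 from the destination.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical two-lane model of an XMM register holding doubles.
// lane[0] is the low element, lane[1] the high element.
struct V2 { double lane[2]; };

// Model of the unfused sequence:
//   roundsd xmm1, xmm1, 10  -> round lane 0 of b toward +inf, keep b's lane 1
//   blendpd xmm0, xmm1, 1   -> lane 0 from the rounded value, lane 1 from a
V2 ceil_sd_unfused(V2 a, V2 b) {
    V2 rounded = { { std::ceil(b.lane[0]), b.lane[1] } };
    V2 out = { { rounded.lane[0], a.lane[1] } };
    return out;
}

// Model of one scalar round whose destination already holds a
// (assumed fused form): lane 0 is ceil(b[0]), lane 1 is a's lane 1.
V2 ceil_sd_fused(V2 a, V2 b) {
    V2 out = { { std::ceil(b.lane[0]), a.lane[1] } };
    return out;
}

// Checks that both models produce the same lanes for the given inputs.
void check_models_agree(V2 a, V2 b) {
    V2 u = ceil_sd_unfused(a, b);
    V2 f = ceil_sd_fused(a, b);
    assert(u.lane[0] == f.lane[0]);
    assert(u.lane[1] == f.lane[1]);
}
```

In this model the intermediate upper lane of the rounded value is never observed, which is why the blend can be folded into the round.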

GitOrigin-RevId: a484de1e06efc4c80accfa4d8be97575133cb26b
4 files changed