[LegalizeVectorOps] Improve the placement of ANDs in the ExpandLoad path for non-byte-sized loads.

When we need to merge two adjacent loads the AND mask for the low piece was still sized for the full src element size. But we didn't have that many bits. The upper bits are already zero due to the SRL. So we can skip the AND if we're going to combine with the high bits.

We do need an AND to clear out any bits from the high part. We were anding the high part before combining with the low part, but it looks like ANDing after the OR gets better results.

So we can just emit the final AND after the optional concatentation is done. That will handling skipping before the OR and get rid of extra high bits after the OR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354655 91177308-0d34-0410-b5e6-96231b3b80d8
3 files changed