[X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle

As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ byte shifts to lower shuffles that zero out the ends of a vector and leave a sequential inner section.

Pre-SSSE3, we do this for shuffles with zeros at either end (requiring up to 3 shifts). Once PSHUFB is available, I've limited this to shuffles with a single zeroable end (2 shifts).
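
To illustrate the shift counts, here is a rough scalar model of the idea, not the actual lowering code from this patch. The names Vec, pslldq, psrldq, zeroLowEnd and zeroBothEnds are hypothetical helpers invented for this sketch, and it assumes the simplest case, where the surviving inner bytes stay at their original positions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 128-bit register as 16 bytes (index 0 = lowest byte).
using Vec = std::array<std::uint8_t, 16>;

// Models PSLLDQ: move bytes toward higher indices, filling the low end with zeros.
Vec pslldq(const Vec &v, std::size_t n) {
  Vec r{}; // value-initialized to all zeros
  for (std::size_t i = n; i < 16; ++i)
    r[i] = v[i - n];
  return r;
}

// Models PSRLDQ: move bytes toward lower indices, filling the high end with zeros.
Vec psrldq(const Vec &v, std::size_t n) {
  Vec r{};
  for (std::size_t i = 0; i + n < 16; ++i)
    r[i] = v[i + n];
  return r;
}

// Single zeroable end (2 shifts): drop the low `lo` bytes, then shift back.
Vec zeroLowEnd(const Vec &v, std::size_t lo) {
  return pslldq(psrldq(v, lo), lo);
}

// Both ends zeroable (3 shifts): drop the low bytes, overshoot left to drop
// the high bytes, then shift right to restore the original positions.
Vec zeroBothEnds(const Vec &v, std::size_t lo, std::size_t hi) {
  Vec t = psrldq(v, lo);
  t = pslldq(t, lo + hi);
  return psrldq(t, hi);
}
```

In this model, zeroBothEnds(v, 2, 3) clears bytes 0-1 and 13-15 and leaves bytes 2-12 untouched, using three shifts and no mask constant.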

Differential Revision: https://reviews.llvm.org/D56784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352883 91177308-0d34-0410-b5e6-96231b3b80d8
5 files changed