AMDGPU: Fix offsets for < 4-byte aggregate kernel arguments

We were still using the rounded-down offset and alignment for these
arguments, even though aggregates aren't handled that way: the loaded
value can't be trivially bitcast to an aggregate type.
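
As a rough illustration of the offset choice described above, here is a
minimal sketch. It is not the code from this change: `LoadPlan`,
`planKernArgLoad`, and the 4-byte (dword) rounding are assumptions made for
the example. The idea is that only sub-dword scalars should get the
rounded-down offset, while small aggregates keep their exact offset.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not taken from the LLVM sources.
struct LoadPlan {
  uint64_t Offset;    // byte offset at which the argument is loaded
  uint64_t ShiftBits; // right shift applied after a widened load
};

LoadPlan planKernArgLoad(uint64_t ArgOffset, uint64_t SizeInBytes,
                         bool IsAggregate) {
  // Assumption: only scalars smaller than a dword take the widened-load path,
  // since an aggregate cannot be recovered with a simple shift and bitcast.
  bool WidenLoad = SizeInBytes < 4 && !IsAggregate;
  if (!WidenLoad)
    return {ArgOffset, 0}; // keep the exact offset

  // Round the offset down to a 4-byte boundary and remember how far we moved.
  uint64_t AlignedDown = ArgOffset & ~uint64_t(3);
  return {AlignedDown, (ArgOffset - AlignedDown) * 8};
}
```

Under these assumptions, a 2-byte scalar at offset 6 is loaded from offset 4
and shifted right by 16 bits, while a 2-byte aggregate at offset 6 is loaded
from offset 6 directly.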

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348658 91177308-0d34-0410-b5e6-96231b3b80d8
2 files changed