[AMDGPU] Fix CS scratch setup on pre-GCN3 ASICs

Prior to GCN3, s_load_dword immediate offsets are in dwords rather than bytes.
Thus, the scratch buffer descriptor offset must be converted from bytes to
dwords on pre-GCN3 ASICs.
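
For illustration only, the unit conversion can be sketched as below. This is
not the code from the patch: the Generation enum and encodeSMRDOffset helper
are hypothetical names, and the byte offset of 16 in the usage note is just an
example value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a subtarget generation query (not the LLVM API).
enum class Generation { PreGCN3, GCN3OrLater };

// Convert a byte offset into the units the s_load_dword immediate expects:
// dwords before GCN3, bytes on GCN3 and later.
inline uint32_t encodeSMRDOffset(Generation Gen, uint32_t ByteOffset) {
  // Assumption: the offset is dword-aligned so no bits are lost on pre-GCN3.
  assert(ByteOffset % 4 == 0 && "offset must be dword-aligned");
  return Gen == Generation::PreGCN3 ? ByteOffset / 4 : ByteOffset;
}
```

With this sketch, a byte offset of 16 is encoded as 4 on a pre-GCN3 target and
stays 16 on GCN3 and later.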

Reviewers: nhaehnle, tpr

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D56496

