[StreamExecutor] Add DeviceMemory and kernel arg packing

Add types that represent device memory, and add the code that packs values
of these types when they are passed as arguments to kernel launches.
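
As a rough illustration of the idea only (this is not the code in this
change), device memory argument packing might look like the sketch below.
Every name in it (DeviceMemoryBase, DeviceMemory, PackedArgs, packArgs,
argAddress, argSize) is hypothetical and chosen for the example. It also
assumes that a launch API wants one address and one size per argument, and
that a device memory argument is passed as its opaque device handle.

```cpp
#include <array>
#include <cstddef>

// Hypothetical sketch; none of these names are taken from the actual change.

// Untyped handle to a device allocation: an opaque device address plus the
// allocation size in bytes.
class DeviceMemoryBase {
public:
  DeviceMemoryBase(const void *Handle, std::size_t ByteCount)
      : Handle(Handle), ByteCount(ByteCount) {}

  const void *getHandle() const { return Handle; }
  // Address of the stored handle, which is what gets packed for a launch.
  const void *getHandleAddress() const { return &Handle; }
  std::size_t getByteCount() const { return ByteCount; }

private:
  const void *Handle;
  std::size_t ByteCount;
};

// Typed view of a device allocation holding elements of type T.
template <typename T> class DeviceMemory : public DeviceMemoryBase {
public:
  DeviceMemory(const void *Handle, std::size_t ElementCount)
      : DeviceMemoryBase(Handle, ElementCount * sizeof(T)) {}

  std::size_t getElementCount() const { return getByteCount() / sizeof(T); }
};

// One address and one size per kernel argument.
template <std::size_t N> struct PackedArgs {
  std::array<const void *, N> Addresses;
  std::array<std::size_t, N> Sizes;
};

// Ordinary values are packed by the address and size of the value itself.
template <typename T> const void *argAddress(const T &Arg) { return &Arg; }
template <typename T> std::size_t argSize(const T &) { return sizeof(T); }

// Device memory is packed as its handle: the more specialized overloads win
// for DeviceMemory<T> arguments.
template <typename T> const void *argAddress(const DeviceMemory<T> &Arg) {
  return Arg.getHandleAddress();
}
template <typename T> std::size_t argSize(const DeviceMemory<T> &) {
  return sizeof(void *);
}

// Packs the arguments in order. The result stores pointers into the
// arguments, so they must outlive the returned PackedArgs.
template <typename... Ts>
PackedArgs<sizeof...(Ts)> packArgs(const Ts &...Args) {
  PackedArgs<sizeof...(Ts)> Result{{argAddress(Args)...},
                                   {argSize(Args)...}};
  return Result;
}
```

In this sketch, packArgs(Mem, Scalar) with a DeviceMemory<float> and an int
yields two entries: the first points at the stored device handle and has
size sizeof(void *), and the second points at the int and has size
sizeof(int).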

Differential Revision: https://reviews.llvm.org/D23211

