[AMDGPU] SIFoldOperands: eagerly delete dead copies

This is cheap to implement, means less work for future passes like
MachineDCE, and slightly improves the folding in some cases.

Differential Revision: https://reviews.llvm.org/D100117
diff --git a/llvm/test/CodeGen/AMDGPU/fold-operands-order.mir b/llvm/test/CodeGen/AMDGPU/fold-operands-order.mir
index f8c6547..b23faff 100644
--- a/llvm/test/CodeGen/AMDGPU/fold-operands-order.mir
+++ b/llvm/test/CodeGen/AMDGPU/fold-operands-order.mir
@@ -6,10 +6,7 @@
 # aren't made in users before the def is seen.
 
 # GCN-LABEL: name: mov_in_use_list_2x{{$}}
-# GCN: %2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
-# GCN-NEXT: %3:vgpr_32 = COPY undef %0
-
-# GCN: %1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+# GCN: %3:vgpr_32 = COPY undef %0
 
 
 name: mov_in_use_list_2x