[AMDGPU] Correct rmw atomics s_waitcnt generation

The AMD GPU SIMemoryLegalizer was using the ordering address space
rather than the instruction address space when determining the
s_waitcnt to generate to ensure that a read-modify-write atomic has
completed. This resulted in additional unnecessary counters being
waited on.

Differential Revision: https://reviews.llvm.org/D96743
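
The distinction the message draws can be illustrated with a small standalone sketch. This is not the SIMemoryLegalizer code: the enum, struct, and function names below are hypothetical, and the mapping of address spaces to counters is a simplification (global and scratch tracked by vmcnt, LDS and GDS by lgkmcnt). It only shows why selecting counters from the instruction address space yields a narrower wait than selecting them from the ordering address space.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical address-space bitmask, for illustration only.
enum AddrSpace : std::uint32_t {
  NONE    = 0,
  GLOBAL  = 1u << 0,
  LDS     = 1u << 1,
  SCRATCH = 1u << 2,
  GDS     = 1u << 3,
};

// Build the s_waitcnt text for the counters that track the given
// address spaces. Assumed (simplified) mapping:
//   GLOBAL, SCRATCH -> vmcnt
//   LDS, GDS        -> lgkmcnt
std::string waitcntFor(std::uint32_t spaces) {
  std::string counters;
  if (spaces & (GLOBAL | SCRATCH))
    counters += " vmcnt(0)";
  if (spaces & (LDS | GDS))
    counters += " lgkmcnt(0)";
  // No counter needed means no instruction is emitted.
  return counters.empty() ? std::string() : "s_waitcnt" + counters;
}

// Hypothetical summary of an atomic read-modify-write.
struct AtomicInfo {
  std::uint32_t instrAddrSpace;    // spaces the instruction itself can access
  std::uint32_t orderingAddrSpace; // spaces the memory ordering applies to
};

// Wait that ensures the rmw itself has completed. Only the counters for
// the spaces the instruction can touch are needed, so the instruction
// address space is used here rather than the ordering address space.
std::string waitForRmwCompletion(const AtomicInfo &info) {
  return waitcntFor(info.instrAddrSpace);
}
```

Under these assumptions, an atomic whose instruction address space is only `GLOBAL` but whose ordering covers `GLOBAL | LDS | SCRATCH | GDS` gets `s_waitcnt vmcnt(0)` from `waitForRmwCompletion`, whereas deriving the wait from the ordering address space would also include `lgkmcnt(0)`. That matches the shape of the test change in the diff for `global_atomic_cmpswap`.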

commit: c62b737ad655f189cf76f4324ba04317133d6648
author: Tony Tye <Tony.Tye@amd.com> Tue Feb 16 03:22:34 2021 +0000
committer: Tony Tye <Tony.Tye@amd.com> Wed Feb 17 01:32:29 2021 +0000
tree: abed073242f69a7988b14563c4d915f875f070e5
parent: f456959a9331e628e8214930e6d4dceb34d75ea0
diff --git a/llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll b/llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll
index 76cd337..774270e 100644
--- a/llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll
+++ b/llvm/test/CodeGen/AMDGPU/atomicrmw-nand.ll

@@ -42,7 +42,7 @@
 ; GCN-NEXT:    v_or_b32_e32 v2, -5, v2
 ; GCN-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
 ; GCN-NEXT:    global_atomic_cmpswap v2, v[0:1], v[2:3], off glc
-; GCN-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GCN-NEXT:    s_waitcnt vmcnt(0)
 ; GCN-NEXT:    buffer_wbinvl1_vol
 ; GCN-NEXT:    v_cmp_eq_u32_e32 vcc, v2, v3
 ; GCN-NEXT:    s_or_b64 s[4:5], vcc, s[4:5]