[AMDGPU] AMDGPULateCodeGenPrepare Legacy PM: replace `setPreservesAll()` with `setPreservesCFG()` (#148167)
This PR depends on #148165; the first commit
(90f1d0a881a21a8b4f192622d798c290770fda63) belongs to that PR. The
changes are distinct, so separate PRs seemed like the best option. I
don't have commit access, so I couldn't use user-branches to mark the
dependency.
AMDGPULateCodeGenPrepare actually makes changes that invalidate
Uniformity Analysis, so use `setPreservesCFG()` instead of
`setPreservesAll()`, which wrongly claims that Uniformity Analysis is
preserved.
Note that before #148165, this would still have preserved Uniformity
Analysis, hence the dependency. In addition, `amdgpu/llc-pipeline.ll`
needs to be updated once both changes are in effect, but that update
would make the test fail if the PRs weren't based on one another.
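
To illustrate why the expected pass list in `amdgpu/llc-pipeline.ll` gains an extra `Uniformity Analysis` line, here is a minimal toy model of a legacy-PM-style scheduler. It is a sketch only and does not use LLVM's API: `ToyPass`, `schedule`, and `toyPipeline` are hypothetical names, the intervening `Post-Dominator Tree Construction` step is omitted, and "preserves only the CFG" is approximated as an empty preserved set.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model, NOT LLVM's API: a pass names the analyses it requires and
// either preserves every cached analysis or only the ones it lists.
struct ToyPass {
  std::string Name;
  std::vector<std::string> Required;
  bool PreservesAll;
  std::set<std::string> Preserved; // consulted only if !PreservesAll
};

// Builds the printed schedule: an analysis is (re)computed whenever a pass
// requires it and it is not currently cached.
std::vector<std::string> schedule(const std::vector<ToyPass> &Passes) {
  std::set<std::string> Cached;
  std::vector<std::string> Out;
  for (const ToyPass &P : Passes) {
    for (const std::string &A : P.Required)
      if (Cached.insert(A).second) // true only if A was not cached yet
        Out.push_back(A);
    Out.push_back(P.Name);
    if (!P.PreservesAll) {
      // Drop every cached analysis the pass did not declare as preserved.
      std::set<std::string> Kept;
      for (const std::string &A : Cached)
        if (P.Preserved.count(A))
          Kept.insert(A);
      Cached = Kept;
    }
  }
  return Out;
}

// Hypothetical two-pass pipeline mirroring the names in llc-pipeline.ll.
std::vector<std::string> toyPipeline(bool LatePreservesAll) {
  return schedule({
      {"AMDGPU IR late optimizations", {"Uniformity Analysis"},
       LatePreservesAll, {}},
      {"Unify divergent function exit nodes", {"Uniformity Analysis"},
       false, {}},
  });
}
```

In this model, `toyPipeline(true)` schedules `Uniformity Analysis` once, while `toyPipeline(false)` schedules it a second time directly before `Unify divergent function exit nodes`, which is the shape of the test update in this PR.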
Note on why this hasn't caused issues so far:
AMDGPULateCodeGenPrepare happens to always be immediately followed by
AMDGPUUnifyDivergentExitNodes, which *does* invalidate most analyses,
including Uniformity. UnifyDivergentExitNodes only looks at terminators.
LateCGP seemingly does not replace uniform values with divergent values,
or divergent values with uniform values; it only *inserts new values
that are not looked at by UnifyDivergentExitNodes*. As a result, this
bug remained hidden.
---
I ran `git-clang-format` on my changes. I tested them using the
`check-llvm` target; no unexpected failures occurred after I made the
change to `amdgpu/llc-pipeline.ll`.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
index 523c66c..56113e6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
@@ -545,7 +545,8 @@
AU.addRequired<TargetPassConfig>();
AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<UniformityInfoWrapperPass>();
- AU.setPreservesAll();
+ // Invalidates UniformityInfo
+ AU.setPreservesCFG();
}
bool runOnFunction(Function &F) override;
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 2a5c652..3e17be6 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -255,6 +255,7 @@
; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: AMDGPU IR late optimizations
; GCN-O1-NEXT: Post-Dominator Tree Construction
+; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: Unify divergent function exit nodes
; GCN-O1-NEXT: Dominator Tree Construction
; GCN-O1-NEXT: Cycle Info Analysis
@@ -559,6 +560,7 @@
; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: AMDGPU IR late optimizations
; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
+; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: Unify divergent function exit nodes
; GCN-O1-OPTS-NEXT: Dominator Tree Construction
; GCN-O1-OPTS-NEXT: Cycle Info Analysis
@@ -875,6 +877,7 @@
; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: AMDGPU IR late optimizations
; GCN-O2-NEXT: Post-Dominator Tree Construction
+; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: Unify divergent function exit nodes
; GCN-O2-NEXT: Dominator Tree Construction
; GCN-O2-NEXT: Cycle Info Analysis
@@ -1206,6 +1209,7 @@
; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: AMDGPU IR late optimizations
; GCN-O3-NEXT: Post-Dominator Tree Construction
+; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: Unify divergent function exit nodes
; GCN-O3-NEXT: Dominator Tree Construction
; GCN-O3-NEXT: Cycle Info Analysis