[libc] Simplifies multi implementations and benchmarks
This is a follow-up to D101524. This patch:
- simplifies CPU feature detection and usage,
- flattens target-dependent optimizations so it's obvious which implementations are generated,
- provides an implementation targeting the host (march/mtune=native) for the mem* functions,
- makes sure all implementations are unit tested (provided the host can run them),
- makes sure all implementations are benchmarkable (provided the host can run them).
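
The "provided the host can run them" condition in the last two points can be sketched roughly as follows. This is a minimal illustration, not the patch's code: `NamedImpl`, `host_can_run`, `memcpy_basic`, `memcpy_unsupported`, `all_impls` and `test_runnable_impls` are hypothetical names, and the `host_can_run` flag stands in for whatever CPU feature detection result is actually used.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical signature shared by every memcpy implementation under test.
using MemcpyFn = void *(*)(void *, const void *, std::size_t);

// Hypothetical registry entry: one generated implementation plus a flag
// saying whether the host supports the CPU features it was built for.
struct NamedImpl {
  const char *name;
  MemcpyFn fn;
  bool host_can_run; // assumed to come from CPU feature detection
};

// Byte-by-byte reference implementation, runnable on any host.
void *memcpy_basic(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
  return dst;
}

// Stand-in for an implementation that needs features the host lacks.
// It copies nothing, so it would fail the check if it were ever run.
void *memcpy_unsupported(void *dst, const void *, std::size_t) { return dst; }

// Flat list of implementations, so it is obvious which ones exist.
std::vector<NamedImpl> all_impls() {
  return {{"basic", memcpy_basic, true},
          {"wide_vector", memcpy_unsupported, false}};
}

// Runs the same check on every implementation the host can execute and
// returns the names of those that passed. Unsupported ones are skipped.
std::vector<std::string> test_runnable_impls() {
  std::vector<std::string> passed;
  const char src[] = "0123456789abcdef";
  for (const NamedImpl &impl : all_impls()) {
    if (!impl.host_can_run)
      continue; // skipped, not failed
    char dst[sizeof(src)] = {};
    impl.fn(dst, src, sizeof(src));
    if (std::memcmp(dst, src, sizeof(src)) == 0)
      passed.push_back(impl.name);
  }
  return passed;
}
```

The same loop shape would serve a benchmark driver: iterate over the flat list and time only the entries the host can execute.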
Differential Revision: https://reviews.llvm.org/D101895
GitOrigin-RevId: 541f107871bc9c020925a6e5342542a47c902d12
5 files changed