[LIBC] Add optimized memcpy routine for AArch64

This patch adds an optimized memcpy routine for AArch64, tuned and benchmarked
on Neoverse-N1.

Differential Revision: https://reviews.llvm.org/D92235

GitOrigin-RevId: 369f7de3135a517a69c45084d4b175f7b0d5e6f5
3 files changed