[libc++] Implement std::gcd using the binary version (#77747)

The binary version is four times faster than the current implementation
in my setup, and is generally considered a better implementation.

Code inspired by https://en.algorithmica.org/hpc/algorithms/gcd/ which
itself is inspired by
https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor/
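
For readers unfamiliar with the technique, here is a minimal sketch of the
binary (Stein) GCD on unsigned 64-bit integers. It is an illustration only,
not the code added by this patch: the names `binary_gcd` and `trailing_zeros`
are hypothetical, and the loop-based `trailing_zeros` stands in for a
hardware count-trailing-zeros operation such as C++20's `std::countr_zero`.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: number of trailing zero bits of a NONZERO value.
// A real implementation would likely use std::countr_zero from <bit> (C++20).
inline int trailing_zeros(std::uint64_t x) {
    int n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// Sketch of the binary GCD: replaces the divisions of the Euclidean
// algorithm with shifts and subtractions.
inline std::uint64_t binary_gcd(std::uint64_t a, std::uint64_t b) {
    if (a == 0) return b;
    if (b == 0) return a;

    // Largest power of two dividing both a and b.
    const int shift = trailing_zeros(a | b);

    a >>= trailing_zeros(a);          // make a odd
    do {
        b >>= trailing_zeros(b);      // make b odd
        if (a > b) std::swap(a, b);   // keep a <= b
        b -= a;                       // odd - odd is even (or zero)
    } while (b != 0);

    return a << shift;                // restore the common factor of two
}
```

Each iteration strips at least one bit from `b`, so the loop runs a number
of times bounded by the bit width of the operands.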

Fix #77648

GitOrigin-RevId: 27a062e9ca7c92e89ed4084c3c3affb9fa39aabb
10 files changed