[APInt] Optimize umul_ov

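Replace the two costly udiv() calls with a cheaper sequence: compute
lshr(1) * RHS, shift the product left by one bit, then add RHS back if
the low bit of the LHS is set.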
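
For illustration, the idea can be sketched on plain 64-bit integers. This
is a minimal sketch, not the APInt code from this patch: the names
countLeadingZeros64 and umulOverflowSketch are hypothetical, and the
leading-zero pre-check is an assumption added here so that the
half-width product itself cannot wrap.

```cpp
#include <cstdint>

// Hypothetical helper: number of leading zero bits in V (64 when V == 0).
static unsigned countLeadingZeros64(std::uint64_t V) {
  unsigned N = 0;
  for (std::uint64_t Mask = 1ULL << 63; Mask != 0 && (V & Mask) == 0;
       Mask >>= 1)
    ++N;
  return N;
}

// Hypothetical sketch of a division-free unsigned multiply with overflow
// detection. Returns the wrapped product and sets Overflow.
static std::uint64_t umulOverflowSketch(std::uint64_t LHS, std::uint64_t RHS,
                                        bool &Overflow) {
  // Assumed pre-check: if the operands' bit lengths sum to 66 or more,
  // the product is at least 2^64 and must overflow.
  if (countLeadingZeros64(LHS) + countLeadingZeros64(RHS) + 2 <= 64) {
    Overflow = true;
    return LHS * RHS;
  }

  // (LHS >> 1) * RHS cannot wrap once the pre-check has passed.
  std::uint64_t Res = (LHS >> 1) * RHS;
  // The left shift below drops the top bit, so a set top bit means overflow.
  Overflow = (Res >> 63) != 0;
  Res <<= 1;
  // Add back the contribution of the low bit that the right shift discarded.
  if (LHS & 1) {
    Res += RHS;
    if (Res < RHS) // unsigned wrap-around on the final addition
      Overflow = true;
  }
  return Res;
}
```

The point of the sketch is that overflow is detected with shifts,
one multiply, and one add, without any division.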

On one 64-bit umul_ov benchmark, I measured a clear improvement:
12.8129s -> 3.6257s (roughly 3.5x faster).

Note: there may be some value in special-casing 64-bit (the most common
case) with __builtin_umulll_overflow().
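
A minimal sketch of what such a special case might look like, assuming a
GCC- or Clang-compatible compiler. The wrapper name umul64WithBuiltin is
hypothetical and is not part of this patch.

```cpp
#include <cstdint>

// The builtin operates on unsigned long long; this sketch assumes that
// type is 64 bits wide.
static_assert(sizeof(unsigned long long) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit unsigned long long");

// Hypothetical 64-bit fast path: returns the wrapped product and sets
// Overflow when the mathematical result does not fit in 64 bits.
static std::uint64_t umul64WithBuiltin(std::uint64_t LHS, std::uint64_t RHS,
                                       bool &Overflow) {
  unsigned long long Res = 0;
  // The builtin returns true if the multiplication overflowed.
  Overflow = __builtin_umulll_overflow(LHS, RHS, &Res);
  return Res;
}
```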

Differential Revision: https://reviews.llvm.org/D60669

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358730 91177308-0d34-0410-b5e6-96231b3b80d8
2 files changed