`my_memcmp` didn't work properly when comparing sizes that are not a multiple of 4 bytes.
This happened to work while we used guarded-alloc (which always wrote a guard at the end of each allocation).
Since moving to the lockfree allocator, it could read uninitialized memory.
It also consistently performed ~10-30% worse than glibc's `memcmp`.
The libc `memcmp` is typically well optimized, so there is no need to implement our own.
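
For illustration only, a minimal sketch of the failure mode and the replacement. It assumes the old routine compared in 4-byte words, which is inferred from the description above and not taken from the original code. The names `wordwise_bytes_read` and `buffers_equal` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a compare loop that walks in 4-byte words:
 * it touches `len` rounded up to the next multiple of 4, so for
 * len % 4 != 0 it reads up to 3 bytes past the requested range. */
static size_t wordwise_bytes_read(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/* Replacement sketch: defer to the libc memcmp, which compares
 * exactly the first `len` bytes of both buffers. */
static int buffers_equal(const void *a, const void *b, size_t len)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, len) == 0;
}
```

Under this assumption, comparing a 5-byte allocation would read 8 bytes, which was harmless only while a guard followed every allocation.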