Part of #118145.
Structure the sampling code more like the brush code, processing the
distances for all vertices in a node at the same time. However, because
this is such a common code path, I compromised a bit on the code's
simplicity to improve performance, mostly by avoiding the use of more
local arrays like we often do for brushes, and also by skipping loop
iterations when the factor is zero. For example, avoiding the square root
for filtered vertices alone improved performance by about 5-10%.
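
As a rough illustration of the two shortcuts described above (skipping vertices whose factor is already zero, and taking the square root only for vertices that pass the squared-radius test), here is a minimal standalone sketch. It is not the code from this commit: `Vec3`, `apply_sphere_falloff`, and the linear falloff are hypothetical stand-ins chosen to keep the example self-contained.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

/* Hypothetical minimal vector type, standing in for the real math types. */
struct Vec3 {
  float x, y, z;
};

/* Hypothetical helper: scale each factor by a linear falloff inside a sphere.
 * Vertices that are already masked out or outside the radius never reach
 * the square root. */
inline void apply_sphere_falloff(const std::vector<Vec3> &positions,
                                 const Vec3 &center,
                                 const float radius,
                                 std::vector<float> &factors)
{
  const float radius_sq = radius * radius;
  for (std::size_t i = 0; i < positions.size(); i++) {
    if (factors[i] == 0.0f) {
      /* Already filtered out; no distance is needed at all. */
      continue;
    }
    const float dx = positions[i].x - center.x;
    const float dy = positions[i].y - center.y;
    const float dz = positions[i].z - center.z;
    const float distance_sq = dx * dx + dy * dy + dz * dz;
    if (distance_sq > radius_sq) {
      /* Outside the sphere; the comparison works on squared values. */
      factors[i] = 0.0f;
      continue;
    }
    /* Only the remaining vertices pay for the square root. */
    factors[i] *= 1.0f - std::sqrt(distance_sq) / radius;
  }
}
```

The single loop avoids a temporary distance array, which matches the trade-off mentioned above: slightly less uniform code in exchange for less work per filtered vertex.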
This was the last use of the sculpt brush test functions and structs, so
they can now be removed, completing a large part of the overall refactor.
The newly added `_sq` versions of the distance functions are deduplicated
from the non-squared versions by moving the square roots into a separate
loop at the end. In my testing that had a ~1% performance cost, though
the timing results varied. I hope that smaller nodes will remove
that cost in the future.
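
The deduplication pattern described above could look roughly like the following sketch, where the plain distance function reuses the squared one and applies the square roots in a final pass. The names `calc_distances_sq` and `calc_distances` and the `Vec3` type are hypothetical and only illustrate the idea; they are not the functions added in this commit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

/* Hypothetical minimal vector type, standing in for the real math types. */
struct Vec3 {
  float x, y, z;
};

/* Hypothetical `_sq` variant: fills the output with squared distances. */
inline void calc_distances_sq(const std::vector<Vec3> &positions,
                              const Vec3 &center,
                              std::vector<float> &r_distances_sq)
{
  r_distances_sq.resize(positions.size());
  for (std::size_t i = 0; i < positions.size(); i++) {
    const float dx = positions[i].x - center.x;
    const float dy = positions[i].y - center.y;
    const float dz = positions[i].z - center.z;
    r_distances_sq[i] = dx * dx + dy * dy + dz * dz;
  }
}

/* Non-squared variant: shares the loop above, then takes the square roots
 * in a separate loop at the end. */
inline void calc_distances(const std::vector<Vec3> &positions,
                           const Vec3 &center,
                           std::vector<float> &r_distances)
{
  calc_distances_sq(positions, center, r_distances);
  for (float &distance : r_distances) {
    distance = std::sqrt(distance);
  }
}
```

The second pass over the array is a plausible source of the small cost mentioned above, since the values are written and then read again instead of being finished in one loop.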
Some rough numbers with the brush benchmark file:
Before: 0.471-0.476s
Without the filtered sqrt: ~0.5-0.53s
After: ~0.473-0.475s
Pull Request: https://projects.blender.org/blender/blender/pulls/126058