Avoid copying positions and normals from their source arrays. This is simplified by using separate loops for the original data and accumulate cases, so that each loop reads directly from its own source. In the typical benchmark file, I observed a performance improvement of about 13%, from 0.54s to 0.48s, for a brush stroke affecting most of a 6 million vertex grid.
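To illustrate the general idea, here is a minimal standalone sketch. It is not the code from this change: `float3`, `SourceArrays`, `displace`, and both function names are hypothetical, and the per-vertex operation is a stand-in chosen only to show the loop structure. The first function gathers positions and normals into temporary arrays before a shared loop; the second uses one loop per case and indexes the source arrays directly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 3D vector type.
using float3 = std::array<float, 3>;

// Hypothetical pair of source arrays (either original data or current data).
struct SourceArrays {
  const std::vector<float3> &positions;
  const std::vector<float3> &normals;
};

// Hypothetical per-vertex operation: move a position along its normal.
inline float3 displace(const float3 &position, const float3 &normal, const float strength)
{
  return {position[0] + normal[0] * strength,
          position[1] + normal[1] * strength,
          position[2] + normal[2] * strength};
}

// With copies: gather the affected vertices into temporaries, then run one shared loop.
std::vector<float3> displace_with_copies(const SourceArrays &orig,
                                         const SourceArrays &current,
                                         const std::vector<int> &verts,
                                         const bool accumulate,
                                         const float strength)
{
  const SourceArrays &src = accumulate ? current : orig;
  std::vector<float3> positions(verts.size());
  std::vector<float3> normals(verts.size());
  for (std::size_t i = 0; i < verts.size(); i++) {
    positions[i] = src.positions[verts[i]];
    normals[i] = src.normals[verts[i]];
  }
  std::vector<float3> result(verts.size());
  for (std::size_t i = 0; i < verts.size(); i++) {
    result[i] = displace(positions[i], normals[i], strength);
  }
  return result;
}

// Without copies: one loop per case, each reading its source arrays by vertex index.
std::vector<float3> displace_without_copies(const SourceArrays &orig,
                                            const SourceArrays &current,
                                            const std::vector<int> &verts,
                                            const bool accumulate,
                                            const float strength)
{
  std::vector<float3> result(verts.size());
  if (accumulate) {
    for (std::size_t i = 0; i < verts.size(); i++) {
      const int v = verts[i];
      result[i] = displace(current.positions[v], current.normals[v], strength);
    }
  }
  else {
    for (std::size_t i = 0; i < verts.size(); i++) {
      const int v = verts[i];
      result[i] = displace(orig.positions[v], orig.normals[v], strength);
    }
  }
  return result;
}
```

Both variants return the same values; the second skips the two temporary arrays and the extra pass that fills them, at the cost of duplicating the loop body per case.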