Instead of allocating a vector of the basis weights cache for
each evaluated point, allocate a single vector for all of the
weights. This should reduce memory usage by avoiding the
overhead of storing many vectors. I noticed a small performance
improvement to evaluated position calculation with an order of 5,
which is larger than `Vector`'s default inline buffer capacity.
This change is possible because of previous commits that
made the basis cache for each evaluated point always have
the same "order" size.