7d77caab9b375503a349941d67748a97ef06309f
Mesh shape key data elements are always of the `ShapeKeyPoint` type and only have a `.co` property, which is stored contiguously. However, the elements of the `.data` collection property are dynamically typed, because Legacy Curve shape key data elements use a mix of `ShapeKeyCurvePoint` and `ShapeKeyBezierPoint` elements. The support needed to handle this dynamic/mixed typing disables raw access to the collection and its elements, which makes `foreach_get`/`foreach_set` slower.

`ShapeKey.points` is a new collection property whose element type is fixed as `ShapeKeyPoint`. It uses the `#rna_iterator_array_next` and `#rna_iterator_array_get` collection functions, which enable raw access to the collection (a usage sketch follows at the end of this message).

To complete the raw access to Mesh shape key data, the `.co` property of `ShapeKeyPoint` also needs raw access. To accomplish this, the RNA definition for `ShapeKeyPoint` now uses the `#vec3f` DNA struct, and its `.co` property is set to start from the struct's `x` field.

Lattice shape keys also use `ShapeKeyPoint` and likewise benefit from using `.points` instead of `.data`, though Lattice objects typically have far fewer data elements, so performance is of minimal concern there. On shape keys belonging to Legacy Curves (`bpy.types.Curve`/`bpy.types.SurfaceCurve`), `.points` will always be empty because they do not have `ShapeKeyPoint` elements.

---

**Performance**

The performance increase is specifically for `foreach_get`/`foreach_set`; there is no noticeable difference when iterating through or accessing individual elements of `.points` directly through Python, e.g. `.points[0].co` vs `.data[0].co`.

- `foreach_get` with a Python list is about 2.8 times faster.
- `foreach_get` with an incompatible buffer (NumPy ndarray with `np.double` dtype) is about 1.4 times faster.
- `foreach_get` with a compatible buffer now scales better with larger collections: about 11.7 times faster at 100 elements and about 200 times faster at 100,000 elements, dropping off to about 65 times faster for much larger collections.

The increase in `foreach_set` performance is slightly better than in each `foreach_get` case, but scales the same overall.

- `foreach_set` with a Python list is about 3.8 times faster.
- `foreach_set` with an incompatible buffer (NumPy ndarray with `np.double` dtype) is about 1.45 times faster.
- `foreach_set` with a compatible buffer now scales better with larger collections: about 13.4 times faster at 100 elements and about 220 times faster at 100,000 elements, dropping off to about 70 times faster for much larger collections.

The performance drop-off might be due to hardware/OS specifics of `memcpy`. For me, on Windows 10 with an AMD Ryzen 7 3800X, the drop-off occurs just above 1.5 MiB of data copied (1572888 B copied -> 200x faster; 1572900 B copied -> 75x faster). A benchmark sketch follows below.

Pull Request: https://projects.blender.org/blender/blender/pulls/116637
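Below is a minimal usage sketch of the new property. It assumes a mesh object named "Cube" with a shape key named "Key 1" (both names are hypothetical); since `ShapeKeyPoint.co` is stored as 32-bit floats, a flat `np.float32` buffer counts as a compatible buffer and takes the raw copy path.

```python
import bpy
import numpy as np

# "Cube" and "Key 1" are hypothetical names; any Mesh shape key works.
shape_key = bpy.data.objects["Cube"].data.shape_keys.key_blocks["Key 1"]

# ShapeKeyPoint.co is a 3D float vector, so a flat float32 buffer of
# length 3 * len(points) is a compatible buffer and is copied raw.
co = np.empty(len(shape_key.points) * 3, dtype=np.float32)
shape_key.points.foreach_get("co", co)

# Offset every point and write the result back.
co += 0.1
shape_key.points.foreach_set("co", co)
```

For Mesh and Lattice shape keys, the same calls work on `.data`, so switching existing scripts over to `.points` is a drop-in change.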
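And a rough sketch for reproducing the `foreach_get` comparison; the names are again hypothetical, and absolute timings will vary with hardware, element count, and buffer dtype as described above.

```python
import time

import bpy
import numpy as np

key = bpy.data.objects["Cube"].data.shape_keys.key_blocks["Key 1"]
point_count = len(key.points)

# float32 matches the stored data, so this buffer takes the raw path.
compatible = np.empty(point_count * 3, dtype=np.float32)
# np.double does not match, so it falls back to per-element conversion.
incompatible = np.empty(point_count * 3, dtype=np.double)

def bench(collection, buffer, repeats=100):
    """Average time of collection.foreach_get("co", buffer) over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        collection.foreach_get("co", buffer)
    return (time.perf_counter() - start) / repeats

print("data,   compatible:  ", bench(key.data, compatible))
print("points, compatible:  ", bench(key.points, compatible))
print("data,   incompatible:", bench(key.data, incompatible))
print("points, incompatible:", bench(key.points, incompatible))
```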