The term `looptri` was used ambiguously for both single & arrays.
The term `tri` was also used, causing `tri->tri`.
Use terms:
- `looptris` for an array or when dealing with multiple items.
- `looptri` is used when dealing with a single item.
- `lt` for a single MLoopTri variables & arguments.
This was already a convention but not followed closely.
The Realtime compositor currently relies on the GPU cache in image IDs.
That cache only supports single layer images, so multi-layer images will
be acquired without a cache, introducing significant IO bottlenecks for
the GPU compositor.
This patch ignores the image GPU cache and stores the images in the
static cache manager of the compositor. Draw data was introduced to the
image ID for proper cache invalidation, like other IDs such as masks.
The downside is that the cache will no longer be shared between EEVEE
and the compositor. But realistically, images are not typically shared
between materials and compositors.
This is just a temporary solution until we have proper GPU storage
support for image buffers.
Pull Request: https://projects.blender.org/blender/blender/pulls/115511
This gives better asserts in debug builds through use of Span, more
safety when name convention attributes happen to have different types
or domains, and simpler code in some cases. But the main reasoning is to
avoid relying on the specifics of CustomData more to allow us to replace
it in the future.
See c4446d7924
When the "fully flat" state comes from "sharp_edge" and "sharp_face"
doesn't exist, we need to check for that for every face when extracting
normals. Eventually these loops should be unrolled so we don't have a
function call per face. That would remove the cost of this check.
The build scripts are still referring to gpu tests as being opengl.
Although they can also use Metal or Vulkan. This PR only replaces
the work `opengl` with `gpu` for build options.
Special note is that the windows argument `with_opengl_tests` is
also replaced with `with_gpu_tests` for consistency.
Pull Request: https://projects.blender.org/blender/blender/pulls/116030
All the relevant code is C++ now, so we don't need to complicate things
with the trip through C anymore. We will still need some wrappers, since
opensubdiv is an optional dependency though. The goal is to make it
simpler to remove the unnecessary/costly abstraction levels between
Blender mesh data and the opensubdiv code.
"mesh" reads much better than "me" since "me" is a different word.
There's no reason to avoid using two more characters here. Replacing
all of these at once is better than encountering it repeatedly and
doing the same change bit by bit.
This PR makes it so that locked materials as well as hidden materials will not have their edit points and edit lines visible.
Note: Previously in grease pencil, strokes with hidden materials would still display the edit lines. This behavior is now fixed in GPv3.
Pull Request: https://projects.blender.org/blender/blender/pulls/115740
This allows for parallel processing of refraction.
Also fix a limitation of using AO node in
refraction materials.
This is needed for the grouping of raytracing
passes.
This increases VRAM consumption a bit (8MB for fullHD
frame) but has no impact on performance.
This include a needed fix to the `draw::Texture::swap`.
### Later work
- Limit the memory overhead to the cases where it is needed.
Pull Request: https://projects.blender.org/blender/blender/pulls/115912
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774
Also use const arguments, move a null check from the callback to the
PBVH function, and reorganice the PBVH code to be in a consistent
place in the file and to simplify the logic.
Convert shrinkwrap data arrays to use C++ arrays and BitVector,
use references in "EditMeshData" code, and store both structs
with `std::unique_ptr` instead of a raw allocation.
If I remember correctly, this was needed during development of
89e3ba4e25, but not anymore. Now the face normals are used
if faces are sharp. And if there is a mixture of sharp and smooth faces,
then normals_domain will be Corner anyway.
In a basic test with a 16 million face grid, this saved 30ms every draw
extraction. It also saves about 12 bytes per corner which was cached on
the mesh before.
There's a chance I'm missing something and this will require changes
elsewhere, but it's working well in my testing.
The copied the material index and its smoothness to every grid,
resulting in 4 bytes of memory per base mesh face corner. That's
wasteful, since it's trivial to loop up the original data from the base
mesh attributes as necessary. This way we can also avoid the
dereference.
Automatic memory management and clearer ownership! Requires
removing `MEM_CXX_CLASS_ALLOC_FUNCS` from `MeshRuntime`,
but that's used very inconsistently anyway, and `MeshRuntime` isn't
that large.
Instead of allocating a separate bitmap per grid for the hide status, store
all the bits in a recently added C++ data structure that stores all bits in one
contiguous memory chunk. When nothing is hidden, nothing is allocated
(that saves 32 MB for a 16 million vertex multires sculpt). Intuitively it
could have better performance because of the cache benefits of
contiguous memory, but this is hard to measure. It also has a nicer
API than `BLI_bitmap`.
I discussed this with Sergey in person recently. Most of the changes are
just straightforward refactors. The part that isn't is a change to the "show/hide"
operator to structure it similarly to the mesh handling in 4e66769ec0.
Pull Request: https://projects.blender.org/blender/blender/pulls/115687
This layout is more flexible and polymorphic.
While the worst case is worse (4 + 3 layers),
the common case is more optimized (2 + 2 layers).
The average written closure data is also lower
since we can compact the data for special cases
which are quite frequent.
Some adjustment had to be made in the denoise an
tile classify shaders.
Pull Request: https://projects.blender.org/blender/blender/pulls/115541
Avoid reusing the custom data type enum with additional values. Instead
use std::variant and type names to properly distinguish between custom
and generic attribute requests. Use a Vector to hold the requests.
Also attempt to simplify the string key building process for requests
and groups of requests in batches. Previously for every PBVH node it
would rebuild the key 3 times, now it only does it once. It's hard to
measure, but that process did show up in profiles, so performance is
probably slightly improved when many nodes are handled at once.
This makes so that locked layers don't show the points and the lines aren't visibly selected.
This separate the batch ensure function into two function, one for handling the visible grease pencil batch and one for handling the edit mode overlay batch.
This also makes it so that only selected curves show the points that make up it, (this matches legacy grease pencil).
Pull Request: https://projects.blender.org/blender/blender/pulls/114751
Implement the next phases of bounds improvement design #96968.
Mainly the following changes:
Don't use `Object.runtime.bb` for performance caching volume bounds.
This is redundant with the cache in most geometry data-block types.
Instead, this becomes `Object.runtime.bounds_eval`, and is only used
where it's actually needed: syncing the bounds from the evaluated
geometry in the active depsgraph to the original object.
Remove all redundant functions to access geometry bounds with an
Object argument. These make the whole design confusing, since they
access geometry bounds at an object level.
Use `std::optional<Bounds<float3>>` to pass and store bounds instead
of an allocated `BoundBox` struct. This uses less space, avoids
small heap allocations, and generally simplifies code, since we
usually only want the min and max anyway.
After this, to avoid performance regressions, we should also cache
bounds in volumes, and maybe the legacy curve and GP data types
(though it might not be worth the effort for those legacy types).
Pull Request: https://projects.blender.org/blender/blender/pulls/114933
Texture usage flag `GPU_TEXTURE_USAGE_MIP_SWIZZLE_VIEW`
was originally implemented and used too conservatively for many
cases in which the underlying API flags were not required.
Renaming to `GPU_TEXTURE_USAGE_FORMAT_VIEW` to reflect
the only essential use case for when a texture view is initialized with
a different texture format to the source texture. Texture views can
still be created without this flag when mip range or base level is
adjusted,
This flag is still required by stencil views and internally by the Metal
backend for certain feature support such as SRGB render toggling.
Patch also includes some small changes to the Metal backend to
adapt to this new compatibility and correctly capture all texture view
use-cases.
Related to #115269
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/115300