Avoid rebuilding BVH trees when meshes are copied.
Similar to the other uses of the shared cache system,
this can arbitrarily improve performance when meshes
are copied but not deformed and BVH building is the
main bottleneck. In a simple test file I got a 6x speedup.
The amount of code is also reduced and the system is
much simpler overall, built out of common threading
patterns like `SharedCache` with its double-checked lock.
RAII is used in a few places to simplify memory management
too.
The downside is storing more `SharedCache` items in the
mesh runtime struct. That has a slight cost when copying
a small mesh many times, but we have ideas to improve that
in the future anyway (#104327).
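As a rough illustration of the pattern described above, here is a
minimal sketch of a lazily computed cache that copies can share, guarded
by a double-checked lock. `LazySharedCache`, `ensure` and `tag_dirty` are
hypothetical names for this sketch and are not the actual `SharedCache`
API.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a lazily computed, shareable cache.
// This is an illustration only, not Blender's `SharedCache`.
template<typename T> class LazySharedCache {
  struct Storage {
    std::mutex mutex;
    std::atomic<bool> is_computed{false};
    T data{};
  };
  // Copies of the cache point at the same storage, so a value computed
  // once is reused by every copy.
  std::shared_ptr<Storage> storage_ = std::make_shared<Storage>();

 public:
  // Compute the value on first use. Double-checked lock: a cheap atomic
  // check first, then a second check under the mutex so that only one
  // thread runs `compute`.
  const T &ensure(const std::function<void(T &)> &compute)
  {
    Storage &storage = *storage_;
    if (!storage.is_computed.load(std::memory_order_acquire)) {
      std::lock_guard<std::mutex> lock(storage.mutex);
      if (!storage.is_computed.load(std::memory_order_relaxed)) {
        compute(storage.data);
        storage.is_computed.store(true, std::memory_order_release);
      }
    }
    return storage.data;
  }

  // Called when the owner is modified (for example deformed): detach
  // from the shared storage so the next `ensure` recomputes.
  void tag_dirty()
  {
    storage_ = std::make_shared<Storage>();
  }
};
```

Copying such a cache along with a mesh keeps the already built data
alive, and only a copy that gets tagged dirty pays for a rebuild.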
Pull Request: https://projects.blender.org/blender/blender/pulls/130865
The Lighten blend mode is wrong for factors less than 1, as it is
computed as a weighted maximum using the factor as the weight, while it
should be a simple component-wise maximum.
This is correctly implemented for Compositor, EEVEE, and Cycles. But the
render/material implementation is wrong. So we adjust the implementation
to match the correct one.
The Darken counterpart was already fixed in 1dcf956849, while Lighten
was fixed in 8b7b165ad9 among other patches. This just completes the
fix.
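The difference can be sketched as follows. The first function follows
the weighted maximum described above. The second uses a plain
component-wise maximum and then blends it with the first color by the
factor. How the factor is applied after the maximum is an assumption of
this sketch, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <array>

using Color = std::array<float, 3>;

// Old behavior as described: the factor weights the second color
// inside the maximum.
Color lighten_weighted_max(const Color &a, const Color &b, float fac)
{
  Color result{};
  for (int i = 0; i < 3; i++) {
    result[i] = std::max(a[i], b[i] * fac);
  }
  return result;
}

// Assumed corrected form: unweighted component-wise maximum, then a
// linear blend with the first color using the factor.
Color lighten_componentwise_max(const Color &a, const Color &b, float fac)
{
  Color result{};
  for (int i = 0; i < 3; i++) {
    result[i] = (1.0f - fac) * a[i] + fac * std::max(a[i], b[i]);
  }
  return result;
}
```

With a factor of 1 both versions agree. With a factor of 0.5, a first
color of 0.5 and a second color of 0.75, the weighted version leaves the
color at 0.5, while the component-wise version gives 0.625.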
Pull Request: https://projects.blender.org/blender/blender/pulls/131242
NOTE: This also required some changes to the Cycles code itself, which
now directly includes `BKE_image.hh` instead of declaring a few
prototypes of these functions in its `blender/utils.h` header (due to
C++ function name mangling, this was not working anymore).
Pull Request: https://projects.blender.org/blender/blender/pulls/130174
When changing the render engine, we discard the persistent data
that may have been saved for all render instances that currently
exist. This is to save memory for the new renderer.
When doing so while rendering for F12, `engine_depsgraph_free`
is called after waiting for the render to finish. But this
can be called before the renderer destruction and on the main
thread.
Doing so on the main thread means that the `gpu_context` used
by the renderer cannot be bound for the sake of just receiving
the orphan buffers that the depsgraph holds. This is because
only the worker thread can make the gpu context active.
Binding the draw gpu context in this situation avoids any
possible conflict.
This is basically what the
`DRW_render_context_enable/disable` functions do internally
if the render engine gpu context is null.
Pull Request: https://projects.blender.org/blender/blender/pulls/129982
Blender crashes with a BadAccess X_GLXMakeCurrent error when opening
files that invoke the interactive compositor on file load.
This is caused by the same system GPU context being active in two
threads at the same time. The GPU context for the compositor is created
in the main thread and made current during creation, but it is not
reset to the main GPU context of the drawable afterwards because the
drawable is null. So when the GPU compositor actually executes, it
makes the GPU context current again but in its own thread, causing a
BadAccess error in X11 and potentially other window systems.
So the root cause is that the drawable is nullptr. An attempt to fix
this was committed in 0a70360eb6 but was reverted in 98722773da because
it caused serious issues that were not obvious.
This patch attempts another fix by simply releasing the system GPU
context created when calling `RE_system_gpu_context_ensure`. This is
more robust anyway, because callers do not expect the context to be
bound from an API point of view.
Pull Request: https://projects.blender.org/blender/blender/pulls/129793
Meta-data are missing on Cryptomatte layers in the GPU compositor, so
they do not get saved using the File Output node. This is due to a
use-after-free error where a temporary string is used in the meta-data
population logic. This is fixed by assigning the string to a local
variable instead, so that it outlives its uses.
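The general shape of this kind of bug and its fix can be sketched as
follows, assuming a string view that refers to a temporary.
`make_cryptomatte_key` and `add_meta_data` are hypothetical helpers for
this sketch and not the functions touched by the patch.

```cpp
#include <map>
#include <string>
#include <string_view>

// Hypothetical helper that returns a freshly built string by value.
std::string make_cryptomatte_key(const std::string &layer_name)
{
  return "cryptomatte/" + layer_name + "/name";
}

// Buggy pattern, left commented out on purpose: the view points into a
// temporary that is destroyed at the end of the full expression, so
// reading `key` afterwards is a use after free.
//   std::string_view key = make_cryptomatte_key(layer_name);

// Safe pattern: keep the string alive in a named local for as long as
// the view is used.
void add_meta_data(std::map<std::string, std::string> &meta_data,
                   const std::string &layer_name,
                   const std::string &value)
{
  const std::string key_storage = make_cryptomatte_key(layer_name);
  const std::string_view key = key_storage;
  meta_data[std::string(key)] = value;
}
```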
Thanks to Jorn Visser for finding the cause of the issue.
Blender crashes when changing the compositor execution device. That's
because cached resources that were originally computed for CPU are now
being used for GPU and vice versa, which can be unexpected in code that
uses them.
To fix this, we free and recreate the entire compositor context when the
execution device or precision changes, because it is much easier and
safer to recreate everything as opposed to trying to update the
necessary resources.
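A minimal sketch of this approach, assuming the context records the
settings it was created with. `Context`, `CompositorOwner` and
`ensure_context` are hypothetical names and do not mirror the actual
compositor classes.

```cpp
#include <memory>

enum class Device { CPU, GPU };
enum class Precision { Auto, Full };

// Hypothetical context that owns resources cached for one device and
// one precision.
struct Context {
  Device device;
  Precision precision;
  int cached_resources = 0;

  Context(Device device, Precision precision) : device(device), precision(precision) {}
};

class CompositorOwner {
  std::unique_ptr<Context> context_;

 public:
  // Reuse the context only if both settings match. Otherwise drop it
  // together with everything it cached and start from scratch.
  Context &ensure_context(Device device, Precision precision)
  {
    if (!context_ || context_->device != device || context_->precision != precision) {
      context_ = std::make_unique<Context>(device, precision);
    }
    return *context_;
  }
};
```

Recreating the whole object means no cached resource computed for one
device can leak into execution on the other.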
This patch adds support for passes in the new CPU compositor. This
involves rewriting the get_input_texture method into a get_pass method
that returns a result as opposed to a texture. The result wraps the
cached GPU texture or image buffer depending on the execution device.
The Render Layers node was implemented for CPU execution and a new
utility constructor for the result class was added to determine type and
precision based on GPU texture format. The fallback depth pass that was
retrieved from the viewport frame buffer was removed, as it was a hack
that can no longer be supported due to the use of stencil format.
Pull Request: https://projects.blender.org/blender/blender/pulls/129154
For Texture baking, there is a CPU-bound preparation step needed to
establish a mapping between the Hi-Res and Low-Res Meshes before the
rendering engine can take over and start the bake.
Part of this process now takes advantage of parallel_for,
speeding up this step of the Bake process for many/large textures almost
linearly with the number of CPU cores.
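The idea of splitting an index range into chunks that run on separate
threads can be sketched as follows. `parallel_for_range` is a
hypothetical stand-in built on `std::thread` for illustration. It is not
the parallel_for utility used by the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in: split [0, size) into one contiguous chunk per
// hardware thread and run `fn(begin, end)` on each chunk in parallel.
void parallel_for_range(std::size_t size,
                        const std::function<void(std::size_t, std::size_t)> &fn)
{
  const std::size_t threads_num = std::max<std::size_t>(
      1, std::thread::hardware_concurrency());
  const std::size_t chunk = (size + threads_num - 1) / threads_num;
  std::vector<std::thread> threads;
  for (std::size_t begin = 0; begin < size; begin += chunk) {
    const std::size_t end = std::min(size, begin + chunk);
    threads.emplace_back([&fn, begin, end]() { fn(begin, end); });
  }
  // Wait for every chunk before returning to the caller.
  for (std::thread &thread : threads) {
    thread.join();
  }
}
```

This only works when each index can be processed independently, for
example when every output element is written by exactly one chunk.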
Pull Request: https://projects.blender.org/blender/blender/pulls/128964
Blender sometimes crashes when using the GPU compositor. This is because
compositor render data was accessed before it was updated in the
realtime compositor when detecting the compositing device. So fix that
by first updating compositor data before calling any context methods.
The Legacy Cryptomatte node doesn't work in GPU execution mode if
Precision is set to Auto. That's because the colors picked from the Pick
layer might be in half precision and thus will not match the colors in
the Cryptomatte layers. This is due to the compositor using the
context's precision for Viewer outputs as opposed to the precision of
the image that actually needs to be viewed in the Viewer node.
To fix this, we set the Viewer node precision to be the precision of its
input, that way, the Cryptomatte pick layer will be output in full
precision as intended.
Pull Request: https://projects.blender.org/blender/blender/pulls/128495
To avoid unnecessary looping over listbase items, the function
`BLI_listbase_count_at_most` was used. However, this resulted in an
awkward expression: `BLI_listbase_count_at_most(list, count + 1) == count`.
Replace this with `BLI_listbase_count_is_equal_to(list, count)`.
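The equivalence of the two forms can be sketched on a simplified singly
linked list. `Link`, `ListBase`, `count_at_most` and `count_is_equal_to`
below are simplified stand-ins written for this sketch, not the BLI
implementations.

```cpp
// Simplified stand-ins for a linked list and its items.
struct Link {
  Link *next;
};
struct ListBase {
  Link *first;
};

// Count items, but stop walking once `count_max` is reached.
int count_at_most(const ListBase &list, int count_max)
{
  int count = 0;
  for (const Link *link = list.first; link && count != count_max; link = link->next) {
    count++;
  }
  return count;
}

// True if the list has exactly `count_cmp` items. Stops walking as soon
// as the list is known to be longer than `count_cmp`.
bool count_is_equal_to(const ListBase &list, int count_cmp)
{
  int count = 0;
  for (const Link *link = list.first; link; link = link->next) {
    if (++count > count_cmp) {
      return false;
    }
  }
  return count == count_cmp;
}
```

Both forms visit at most `count + 1` items, but the second states the
intent directly.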
This patch supports the viewer node in the new CPU compositor. To do
that, the context viewer output mechanism was refactored to allow CPU
storage by utilizing the result class as opposed to a GPU texture.
This patch introduces a new experimental option for the new CPU
compositor under development. This is to make development easier such
that it happens directly in main, but the compositor is not expected to
work and will probably crash.
Pull Request: https://projects.blender.org/blender/blender/pulls/125960
Previously, values for `ID.flag` and `ID.tag` used the prefixes `LIB_` and
`LIB_TAG` respectively. This was somewhat confusing because it's not really
related to libraries in general. This patch changes the prefix to `ID_FLAG_` and
`ID_TAG_`. This makes it more obvious what they correspond to, simplifying code.
Pull Request: https://projects.blender.org/blender/blender/pulls/125811
The File Output node doesn't provide an option to save byte formats like
PNG in a space that is not sRGB. This is problematic for data images
like normal maps, which need to be saved as non-color.
This patch adds a Color Space option to the File Output node to allow
users to override the assumed color space. This also adds a new global
Save As Render option that is used if Use Node Format is enabled.
Pull Request: https://projects.blender.org/blender/blender/pulls/124238
The viewport compositor slows down complex scenes even if it has very
simple setups. That's because it internally computes previews, which
involves a fair bit of CPU computation. However, those previews are
never actually written to the original tree, so they are never visible
and the computation is effectively redundant.
To fix this, we double down on disabling previews for the viewport
compositor and avoid any redundant computations in that case.
This continues the cmake modernization effort and introduces support for
allowing our optional dependencies to integrate properly. TBB is added
here as it's proven troublesome to maintain correctly.
Currently the only Blender project which uses the TBB headers directly
is `blenlib`. However, all downstream projects which require blenlib as
their dependency, and wish to properly make use of its threading
facilities, needed to define various TBB items in their CMake files. Not
only is this unnecessary and arcane, but several projects didn't do this
and ended up not using threading as well as producing ODR violations
along the way[1].
This PR makes TBB a modern dependency and exposes it as PUBLIC from
`blenlib`. All downstream projects which depend on blenlib will now
receive everything they require from TBB automatically. This includes
the `WITH_TBB` define, the headers, and the library itself.
[1] blender/blender@05241f47f5
Pull Request: https://projects.blender.org/blender/blender/pulls/124916
This commit moves generated `RNA_blender.h`, `RNA_prototype.h` and
`RNA_blender_cpp.h` headers to become C++ header files.
It also removes the now useless `RNA_EXTERN_C` defines, and just
directly uses the `extern` keyword. `extern "C"` declarations are no
longer needed here.
Pull Request: https://projects.blender.org/blender/blender/pulls/124469
The vector pass and potentially other vectors that store 4 values are
stored wrongly, in particular, the last channel is ignored. To fix this
we identify if a vector pass is 4D and store the information in the
result meta data, then use this information to either save a 3D or a 4D
pass in the File Output node.
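A minimal sketch of the idea, assuming a boolean flag in the meta data
decides how many channels get written. `MetaData`, `VectorPass`,
`is_4d_vector` and `channels_to_save` are hypothetical names for this
sketch.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical meta data carrying the dimensionality of a vector pass.
struct MetaData {
  bool is_4d_vector = false;
};

struct VectorPass {
  std::vector<std::array<float, 4>> pixels;
  MetaData meta_data;
};

// Flatten the pass for saving: four channels per pixel when the pass is
// flagged as 4D, otherwise only the first three.
std::vector<float> channels_to_save(const VectorPass &pass)
{
  const std::size_t channels = pass.meta_data.is_4d_vector ? 4 : 3;
  std::vector<float> output;
  output.reserve(pass.pixels.size() * channels);
  for (const std::array<float, 4> &pixel : pass.pixels) {
    for (std::size_t c = 0; c < channels; c++) {
      output.push_back(pixel[c]);
    }
  }
  return output;
}
```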
This is a partial fix for the GPU compositor only. The complete fix for
the CPU compositor will be submitted separately as it is not
straightforward and will likely require a refactor.
Pull Request: https://projects.blender.org/blender/blender/pulls/124522
This patch adds support for meta data in the GPU compositor much like
the mechanism that already exists in the CPU compositor. Only Cryptomatte
meta data is handled at the moment because that is the only meta data
that the compositor supports.
The is_data member of the result was moved to the meta data structure for
consistency with the CPU compositor.
Fixes #124222.
Pull Request: https://projects.blender.org/blender/blender/pulls/124460
The BLI_spin APIs use a `SpinLock` typedef whose underlying type is
contingent on the presence of `WITH_TBB`. Since our projects did not
consistently define `WITH_TBB`, multiple `SpinLock` types
would end up in our final binary, creating ODR violations.
Pull Request: https://projects.blender.org/blender/blender/pulls/124285
This patch cleans up and refactors the render pipeline compositor render
code to deduplicate code and clarify usage.
The unused this_scene argument was removed, per-node functions were
introduced to simplify loops, C++ Set was used instead of GSet, and
scene change is now detected by any rendered scene in the set.
Pull Request: https://projects.blender.org/blender/blender/pulls/124028
Blender doesn't render the scene even though a Cryptomatte node exists.
That's because Blender only considers Render Layer nodes, but
Cryptomatte nodes can reference scenes as well. This patch fixes that by
taking Cryptomatte nodes into consideration.
Pull Request: https://projects.blender.org/blender/blender/pulls/123814
Blender crashes when rendering a scene strip that references a scene
with a GPU compositor active. This is because when rendering a scene
strip, a new render with a nullptr system GPU context is created for the
scene it references, which is then used for compositing.
Ideally, the strip scene would have its own context, but we can't ensure
its context because we are not in the main thread. The alternative is to
identify the scenes that will be rendered beforehand and set their
renders before starting the job, which doesn't seem like a great
solution. So for now, we just use the DST context in those cases.
Pull Request: https://projects.blender.org/blender/blender/pulls/123057
Currently, during baking each pixel stores a seed input that comes from the
Blender side. However, this is only needed for vertex color baking. For
regular image baking, we can just as well hash the pixel coordinates.
Therefore, we can save some memory (4 bytes per pixel) by splitting the seed
info out into a separate pass and only storing it when needed.
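As an illustration, a per-pixel seed can be derived from the pixel
coordinates with a small integer hash, so nothing needs to be stored.
The hash below is a generic 32-bit mixing function chosen for this
sketch and is not the one used in Cycles. `pixel_seed` and
`seed_bytes_saved` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Generic 32-bit integer mixing function (an assumption of this sketch).
inline std::uint32_t hash_u32(std::uint32_t h)
{
  h ^= h >> 16;
  h *= 0x7FEB352Du;
  h ^= h >> 15;
  h *= 0x846CA68Bu;
  h ^= h >> 16;
  return h;
}

// Derive a seed from the pixel coordinates instead of storing one.
inline std::uint32_t pixel_seed(std::uint32_t x, std::uint32_t y, std::uint32_t width)
{
  return hash_u32(y * width + x);
}

// Memory that a stored 4-byte seed per pixel would take for one image.
constexpr std::size_t seed_bytes_saved(std::size_t width, std::size_t height)
{
  return width * height * sizeof(std::uint32_t);
}
```

For a 4096 x 4096 bake target, dropping the stored seed saves
4096 * 4096 * 4 bytes, which is 64 MiB.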
Pull Request: https://projects.blender.org/blender/blender/pulls/122806