This patch adds compile-time optimizations for cases where the operation
inputs are guaranteed to be non-single values. Pixel load methods now take
an optional template parameter CouldBeSingle, which is false by default. If
the input is not guaranteed to be non-single, it needs to be set to true.
Gives up to a 3x improvement in some nodes.
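A minimal sketch of the idea, not the actual compositor code: the `Result`
struct and `load_pixel` method below are hypothetical stand-ins. Only the
`CouldBeSingle` parameter name and its default come from this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a compositor result, not Blender's actual class.
struct Result {
  std::vector<float> data;
  bool is_single_value = false;

  // CouldBeSingle defaults to false: the caller promises a full buffer, so
  // the single-value check is removed at compile time.
  template<bool CouldBeSingle = false> float load_pixel(std::size_t index) const
  {
    if constexpr (CouldBeSingle) {
      if (is_single_value) {
        // A single value is stored once and returned for every pixel.
        return data[0];
      }
    }
    return data[index];
  }
};
```

Callers that cannot rule out a single value would use `load_pixel<true>(i)`
and pay for the extra branch. Everyone else gets the branch-free path.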
OptiX OSL tests were previously disabled due to a GPU driver bug
resulting in many tests failing unexpectedly.
A new driver version with the fix is now out, so OptiX OSL testing can be
enabled.
This commit also updates the OptiX OSL block list with better comments,
and more tests that are known to fail and need investigating.
Ref: #123012
Pull Request: https://projects.blender.org/blender/blender/pulls/129280
Until the newer Hair Curves system can fully replace particle hair, add
a small test to ensure this continues to work.
Since the hair is exported as cubic bspline curves, we can also use the
same file to test bspline import.
Pull Request: https://projects.blender.org/blender/blender/pulls/131997
On Linux, Cycles HIP has a JIT compilation feature.
This feature is used when Cycles cannot find a precompiled kernel
for your GPU, which is most common when using hardware that wasn't
out at the time that a version of Blender was released.
There were various issues with this JIT compilation system; this commit
aims to solve them. The changes include:
- Enable `WITH_NANOVDB` when Blender is built with NanoVDB.
  - This fixes an issue where VDB objects would not render.
- Enable some extra debug options for developers when desired
  (this is so we match the CUDA implementation of the same feature).
- Reduce the optimization level from -O3 to the default.
  - This is to avoid any extra issues that may occur as a result
    of an increased optimization level that isn't tested with
    precompiled kernels.
- Reduce the optimization level even further to -O1 for Vega.
  - This was done on precompiled kernels to work around some issues,
    so I decided to apply it to JIT kernels as well.
  - Note: Although Vega is not officially supported, this may help
    people that unofficially use Vega.
- Added some previously missing compiler arguments and fixed errors that
  were introduced when enabling these compiler arguments.
- Fixed an issue where JIT compilation would fail if Blender was
  installed in a path that had a space in it.
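As an illustration of the last item, a common way to make a command line
robust against spaces is to quote every path. This is only a sketch under
that assumption: `quote_path` and `build_command` are hypothetical helpers,
the compiler name is a placeholder, and the actual fix in this commit may
work differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: wrap a path in double quotes so that a shell treats
// it as one argument even when it contains spaces. Assumes the path itself
// contains no double quotes.
inline std::string quote_path(const std::string &path)
{
  return "\"" + path + "\"";
}

// Hypothetical command builder; the compiler name and arguments are
// placeholders, not the real JIT command line.
inline std::string build_command(const std::string &compiler,
                                 const std::string &include_dir,
                                 const std::string &source)
{
  return quote_path(compiler) + " -I " + quote_path(include_dir) + " " +
         quote_path(source);
}
```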
Pull Request: https://projects.blender.org/blender/blender/pulls/131853
The area "close" operation is actually an area merge of some area into the
one being closed. This means that screen->active_region can be left
pointing at freed memory. This normally goes unnoticed because
active_region is set very quickly to the new area, but it is an error
nonetheless and is reported by ASAN. This PR sets screen->active_region to
null when merges change the active area.
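A simplified sketch of the pattern, using hypothetical `Screen`, `Area` and
`Region` stand-ins rather than the real DNA structs: clear the cached
pointer before the owning area is freed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-ins for the screen/area/region data.
struct Region {};
struct Area {
  std::vector<std::unique_ptr<Region>> regions;
};
struct Screen {
  std::vector<std::unique_ptr<Area>> areas;
  Region *active_region = nullptr;
};

// Returns true if `region` is owned by `area`.
inline bool area_owns_region(const Area &area, const Region *region)
{
  for (const std::unique_ptr<Region> &r : area.regions) {
    if (r.get() == region) {
      return true;
    }
  }
  return false;
}

// Remove `area` from the screen. The active region is cleared first if it
// belongs to the area that is about to be freed, so it never dangles.
inline void remove_area(Screen &screen, Area *area)
{
  if (screen.active_region && area_owns_region(*area, screen.active_region)) {
    screen.active_region = nullptr;
  }
  for (auto it = screen.areas.begin(); it != screen.areas.end(); ++it) {
    if (it->get() == area) {
      screen.areas.erase(it);
      return;
    }
  }
}
```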
Pull Request: https://projects.blender.org/blender/blender/pulls/131994
This applies upstream PR 13328 to our copy of dpcpp, which enables
building dpcpp on a many-core machine. I upgraded my build environment
and ran into this issue.
No rebuilds required; this is a build-time fix only.
While adding tests I found that metaball export has been broken since
Blender 3.4. It would export each metaball geometry twice.
This looks to have been a side effect of a change to `object_dupli.cc`
which no longer sets the `no_draw` flag for metaballs[1]. With the flag
unset we would end up visiting this particular object twice.
Use a direct check for Metaballs now and add test coverage for the
scenario in general.
[1] eaa87101cd
Pull Request: https://projects.blender.org/blender/blender/pulls/131984
The new `--disable-depsgraph-on-file-load` command-line option, when used
together with `--background` or `--command`, will prevent
building a depsgraph immediately after loading a blendfile.
The goal is to improve the performance of batch-processing of blendfiles by
Python scripts. It is intended to become the default behavior in Blender
5.0.
Scripts requiring evaluated data then need to explicitly ensure that
an evaluated depsgraph is available (e.g. by calling
`depsgraph = context.evaluated_depsgraph_get()`).
------
This disables the call to `wm_event_do_depsgraph` in `wm_file_read_post`.
Some quick performance tests:
* The whole `blendfile_versioning` test suite gains about a 2% speedup.
  These are almost always small and simple blendfiles.
* Loading a Gold production file, however, goes from 26.5s to 3.5s (about
  87% less time) when this new option is specified.
Pull Request: https://projects.blender.org/blender/blender/pulls/131978
Previously, the number of material slots on the geometry (e.g. mesh) was the
ground truth. However, this had limitations in the case when the object had more
material slots than the evaluated geometry. All extra slots on the object were
ignored.
This patch changes the definition so that the number of materials used for
rendering is the maximum of the number of material slots on the geometry and on
the object. This also implies that one always needs a reference to an object
when determining that number, but that was fairly straightforward to achieve in
current code.
This patch also cleans up the material count handling a fair amount by using the
`BKE_object_material_*_eval` API more consistently instead of manually accessing
`totcol`. Cycles uses the same API indirectly through RNA.
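The new definition boils down to taking a maximum. A sketch with
hypothetical simplified types (the real code goes through the
`BKE_object_material_*_eval` API, whose exact signatures are not shown
here):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical simplified types, only the slot counts matter here.
struct Geometry {
  int material_slots = 0;
};
struct Object {
  int material_slots = 0;
};

// Number of materials used for rendering: slots on the object that exceed
// the geometry's slot count are no longer ignored.
inline int used_material_count(const Object &object, const Geometry &geometry)
{
  return std::max(object.material_slots, geometry.material_slots);
}
```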
Pull Request: https://projects.blender.org/blender/blender/pulls/131869
As mentioned in fb6ac24514 and 9a6beb915d, `file_draw_preview()` is a
rather overloaded and confusing function. I'm trying to make it more
readable.
Split out file indicator icon drawing from the preview drawing function;
there's not much reason for it to be there as well. I'd rather keep
functions a bit simpler and more manageable.
Also added some comments and tried to make the logic a bit clearer.
As mentioned in 9a6beb915d, `file_draw_preview()` is a rather
overloaded and confusing function. I'm trying to make it more readable.
Move image scale calculations to a separate function, reducing perceived
complexity of the function and the number of local variables. Also
rename or remove some variables to be more clear, add comments and move
variables closer to where they are used.
The GCC version on the buildbot does not support the attribute on
a class member, resulting in the following warning:
NOD_node_declaration.hh:577:42: warning: ‘maybe_unused’ attribute ignored [-Wattributes]
Use the `UNUSED_VARS` macro instead to solve the original warning
about the member being unused in release builds without introducing
a warning when using an older compiler.
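A sketch of the approach with a hypothetical `EXAMPLE_UNUSED_VARS` macro
and `Declaration` class. Blender's real `UNUSED_VARS` macro is more
elaborate, but the principle is the same: a cast to void counts as a use
on every compiler, whereas attribute support on members varies.

```cpp
#include <cassert>

// Hypothetical stand-in for an "unused variables" macro: casting to void
// counts as a use, which silences unused warnings on all compilers.
#define EXAMPLE_UNUSED_VARS(x) ((void)(x))

// Hypothetical class with a member that is only read inside an assert.
class Declaration {
 public:
  explicit Declaration(int debug_id) : debug_id_(debug_id) {}

  bool is_valid() const
  {
    assert(debug_id_ >= 0);          // Compiled out when NDEBUG is defined.
    EXAMPLE_UNUSED_VARS(debug_id_);  // Keeps release builds warning-free.
    return true;
  }

 private:
  int debug_id_;
};
```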
Pull Request: https://projects.blender.org/blender/blender/pulls/131974
Render tests can still fail. This change disables them until they
are in better shape. This reduces confusion when running Cycles GPU render
tests.
Known issues:
- Render in batch can take forever due to a locking issue
- Headless rendering is still in development
- Particle hair rendering is broken.
Pull Request: https://projects.blender.org/blender/blender/pulls/131964
The type info table for VSE modifiers was initialized to point
to global variables on first use. But really there's no reason to do
that; we can just declare the actual table instead. This is both
shorter and avoids the preprocessor dance (INIT_TYPE macro).
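Roughly, the table can be written as a plain static array. The struct,
enum and callbacks below are hypothetical placeholders, not the actual VSE
modifier types:

```cpp
#include <cassert>

// Hypothetical, simplified type info, not Blender's actual struct.
struct ModifierTypeInfo {
  const char *name;
  void (*apply)(float *pixel);
};

inline void apply_brighten(float *pixel) { *pixel += 0.5f; }
inline void apply_invert(float *pixel) { *pixel = 1.0f - *pixel; }

enum ModifierType { MOD_BRIGHTEN = 0, MOD_INVERT = 1, MOD_NUM_TYPES };

// The table is declared directly, indexed by the enum value. No lazy
// initialization on first use and no helper macro are needed.
static const ModifierTypeInfo modifier_types[MOD_NUM_TYPES] = {
    {"Brighten", apply_brighten},
    {"Invert", apply_invert},
};

// Look up the type info, returning null for out-of-range types.
inline const ModifierTypeInfo *modifier_type_info_get(int type)
{
  if (type < 0 || type >= MOD_NUM_TYPES) {
    return nullptr;
  }
  return &modifier_types[type];
}
```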
Pull Request: https://projects.blender.org/blender/blender/pulls/131958
`file_draw_preview()` does multiple things and is already quite hard to
follow; it needs some improvements. One issue is naming that I always
found made the function unnecessarily confusing. For example `is_icon`
had nothing to do with the `icon` parameter, you'd have to search around
the code a bit to understand what it was actually representing.
Attempt to make variable and function names clearer.
Also reduce variable scope and add a comment.
When baking line art strokes, the object matrix that is used for back
transformation was inverted. It should be `world_to_object` instead of
`object_to_world`. Probably a typo during the GPv3 rewrite.
The compositor leaks memory when the node tree contains unavailable
links. That's because the compositor doesn't ignore those links when
computing the reference counts for outputs. To fix this, check if the
output is logically linked and return 0 in case it isn't.
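A sketch of the fix with hypothetical `Link` and `OutputSocket` types. How
the compositor really decides that a link is unavailable is not shown, and
the `available` field is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified link: `available` is false for links that will
// never be evaluated.
struct Link {
  bool available = true;
};
struct OutputSocket {
  std::vector<Link> links;
};

// An output is logically linked if at least one of its links is available.
inline bool is_logically_linked(const OutputSocket &output)
{
  for (const Link &link : output.links) {
    if (link.available) {
      return true;
    }
  }
  return false;
}

// Reference count for an output. Outputs that are not logically linked get
// 0, so no result is allocated for consumers that never release it.
inline int compute_reference_count(const OutputSocket &output)
{
  if (!is_logically_linked(output)) {
    return 0;
  }
  int count = 0;
  for (const Link &link : output.links) {
    if (link.available) {
      count++;
    }
  }
  return count;
}
```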
This adds initial support for ReBAR capable platforms.
It ensures that when allocating buffers that are not required to be host
visible, the allocator still tries to place them in host visible memory.
When there is space in this memory heap the buffer will be automatically
mapped to host memory.
When a buffer is mapped, staging buffers can be skipped if the buffer was
newly created. In order to make better use of ReBAR the `VKBuffer::create`
function will need to be revisited. It currently hides too many of the
options needed to allocate in the correct memory heap. This change isn't
part of this PR.
Using shader_balls.blend, rendering the first 50 frames in main takes 1516ms.
When using ReBAR it takes 1416ms.
```
Operating system: Linux-6.8.0-49-generic-x86_64-with-glibc2.39 64 Bits, X11 UI
Graphics card: AMD Radeon Pro W7700 (RADV NAVI32) Advanced Micro Devices radv Mesa 24.3.1 - kisak-mesa PPA Vulkan Backend
```
Pull Request: https://projects.blender.org/blender/blender/pulls/131856
The change from ababc2e01b did not actually behave in a way that lets the
caller force-disable overlays [which seems like the intention from
the commit message and also desired behavior for e.g. grease pencil
drawing/reprojection].
Pull Request: https://projects.blender.org/blender/blender/pulls/131861
When running tests `WITH_GTESTS` and `WITH_GPU_DRAW_TESTS` the
GPUShaderCreateInfo's specifically created for the tests could not be
found. This caused the tests to fail on every backend.
This PR fixes this. The root cause was that the name of the compile
directive was incorrect. It should have been `WITH_GTESTS` but was
`WITH_GTEST`.
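This kind of typo is easy to miss because the preprocessor silently treats
an unknown name as undefined. A self-contained illustration, where the
function and the local `#define` are hypothetical and only the two macro
names come from this fix:

```cpp
#include <cassert>

// Stands in for the definition that the build system normally provides.
#define WITH_GTESTS

// Hypothetical registration function counting how many test-only blocks
// were actually compiled in.
inline int registered_test_infos()
{
  int count = 0;
#ifdef WITH_GTEST
  // Misspelled name: never defined, so this block is skipped without any
  // warning or error.
  count += 1;
#endif
#ifdef WITH_GTESTS
  // Correct spelling: this block is compiled.
  count += 1;
#endif
  return count;
}
```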
Pull Request: https://projects.blender.org/blender/blender/pulls/131956
Originally intended to be a code cleanup that makes the code shorter
(part of VSE quality project #130975), but as a side effect many
modifiers are now faster since they no longer do many branches in
the innermost pixel loop.
The main part is apply_modifier_op, which, given a "modifier op"
functor object, instantiates the correct processing function based
on the type of image (byte vs float) and mask (none, byte, float), for
a total of 6 possible cases. In addition, a helper like
apply_and_advance_mask applies the mask based on input and result
in a consistent way across the modifiers, instead of literal
copy-pasted code.
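A heavily simplified sketch of that dispatch. Only the names
`apply_modifier_op` and `apply_and_advance_mask` come from this change. The
signatures, the `Buffer` type and the single-channel pixels are assumptions
made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: an image fills exactly one of the two buffers.
struct Buffer {
  std::vector<uint8_t> bytes;
  std::vector<float> floats;
};

inline float load(uint8_t v) { return float(v) / 255.0f; }
inline float load(float v) { return v; }
inline void store(uint8_t &dst, float v) { dst = uint8_t(v * 255.0f + 0.5f); }
inline void store(float &dst, float v) { dst = v; }

// No mask: the result is used as is.
inline float apply_and_advance_mask(float, float result, std::nullptr_t &)
{
  return result;
}
// Byte or float mask: blend input and result by the mask value, then
// advance the mask pointer to the next pixel.
template<typename MaskT>
inline float apply_and_advance_mask(float input, float result, const MaskT *&mask)
{
  const float t = load(*mask++);
  return input + (result - input) * t;
}

// Inner loop: image and mask types are template parameters, so there are
// no per-pixel type branches.
template<typename ImageT, typename MaskT, typename Op>
void process(ImageT *pixels, MaskT mask, std::size_t size, const Op &op)
{
  for (std::size_t i = 0; i < size; i++) {
    const float input = load(pixels[i]);
    store(pixels[i], apply_and_advance_mask(input, op(input), mask));
  }
}

// Pick the mask case (none, byte, float) once, outside the pixel loop.
template<typename ImageT, typename Op>
void dispatch_mask(ImageT *pixels, std::size_t size, const Buffer *mask, const Op &op)
{
  if (mask == nullptr) {
    process(pixels, nullptr, size, op);
  }
  else if (!mask->bytes.empty()) {
    process(pixels, mask->bytes.data(), size, op);
  }
  else {
    process(pixels, mask->floats.data(), size, op);
  }
}

// Pick the image case (byte, float), giving 2 x 3 = 6 instantiations.
template<typename Op>
void apply_modifier_op(Buffer &image, const Buffer *mask, const Op &op)
{
  if (!image.bytes.empty()) {
    dispatch_mask(image.bytes.data(), image.bytes.size(), mask, op);
  }
  else {
    dispatch_mask(image.floats.data(), image.floats.size(), mask, op);
  }
}
```

With this shape, a modifier only has to provide the functor, for example
`apply_modifier_op(image, &mask, [](float v) { return 1.0f - v; });`.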
Brightness/Contrast, Color Balance, Tonemap modifiers were already
optimized to move branches out of inner loops previously; their
performance remains unchanged. Mask modifier performance remains
unchanged; it is very simple and memory bandwidth limited on my
machine.
Other modifiers, tested on 4K resolution, Win10 / Ryzen 5950X, time
in milliseconds taken to apply the modifier calculation, on a byte
image with no mask:
- Curves: 12.1 -> 7.7ms
- Hue Correct: 24.5 -> 15.8ms
- White Balance: 20.5 -> 13.8ms
Same as above, but on a float image with a byte mask:
- Curves: 13.5 -> 12.3ms
- Hue Correct: 19.7 -> 16.4ms
- White Balance: 19.3 -> 15.9ms
Pull Request: https://projects.blender.org/blender/blender/pulls/131736
Change the BrightRings exr file to not contain nan/inf pixels. Testing for
nan/inf in input just gives too many headaches across different
platforms, and is arguably very much a corner case.
Pull Request: https://projects.blender.org/blender/blender/pulls/131926
Unicode code points were named "ascii" or "cha";
use the term "charcode" as used elsewhere in BLF and FreeType.
Also use char32_t internally and add a utility function to apply the
small-caps flag.
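A sketch of what such a utility could look like. The flag name, the
function name and the ASCII-only mapping are all assumptions for
illustration and the real function in vfont.cc may differ.

```cpp
#include <cassert>

// Hypothetical character flag, not Blender's actual definition.
constexpr unsigned int EXAMPLE_CHAR_SMALLCAPS = 1u << 0;

// Return the charcode to use for layout: with the small-caps flag set,
// lowercase ASCII letters map to their uppercase form. Other code points
// are returned unchanged in this simplified version.
inline char32_t charcode_apply_smallcaps(char32_t charcode, unsigned int flag)
{
  if ((flag & EXAMPLE_CHAR_SMALLCAPS) && charcode >= U'a' && charcode <= U'z') {
    return charcode - (U'a' - U'A');
  }
  return charcode;
}
```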
Resolve by welding overlaps between adjacent underline glyphs.
Besides avoiding overlap, this makes underlines work better when the
text follows a curve.
Also remove the increased underline width which was originally added in [0].
This may have been done to avoid gaps, which is no longer needed.
Further, underlines typically don't extend beyond the glyph's bounds,
so the increased width isn't expected behavior.
[0]: a07394ef2c
This adds a new `bl_use_group_interface` property that can be set on custom node
group types. By default it is `true` to avoid this being a breaking change. If
it's set to `false` some UI elements related to the built-in node group
interface are hidden.
Pull Request: https://projects.blender.org/blender/blender/pulls/131877
Refactoring functions out of the 3D text layout function was
becoming difficult because vfont.cc mixed APIs that use VFont
for different purposes (clipboard, text layout, ID management, etc.).
Split the VFont character and text layout code into its own file.