This avoid recreating the GPU context for each individual
tests. This reduces the overhead drastically.
Excluding static_shaders and texture_pool tests I get for GPUVulkanTest:
`Before: 129 tests from 1 test suite ran. (26304 ms total) `
`After: 129 tests from 1 test suite ran. (6965 ms total) `
Including static_shaders and texture_pool tests I get for GPUMetalTest:
`Before: 124 tests from 1 test suite ran. (54654 ms total)`
`After: 124 tests from 1 test suite ran. (1870 ms total)`
Given the tests are run twice for the workarounds versions, the
speedup can be multiplied by 2.
Overall tests time is still largely dominated by shader compilation time.
However, there is still 3x improvement using this patch:
Including static_shaders and texture_pool tests I get for GPUVulkanTest,
GPUVulkanWorkaroundTest, GPUOpenGLTest, GPUOpenGLWorkaroundTest:
`Before: 516 tests from 4 test suites ran. (318878 ms total)`
`After: 516 tests from 4 test suites ran. (106593 ms total)`
Pull Request: https://projects.blender.org/blender/blender/pulls/138097
This improves the user experience, because the data-flow is not interrupted
temporarily. The automatically inserted links are exactly the same that the
Evaluate Closure node uses when no closure is connected.
Pull Request: https://projects.blender.org/blender/blender/pulls/138125
The corner-cost bias used when calculating the shortest path meant
the path could cross over itself.
Resolve by tagging vertices as well as edges to prevent stepping across
vertices used by edges that have been added to the heap.
This removes the minimum thickness clamping in the shader.
The reason why this was clamped in the first place was to reduce
aliasing artifacts. With the new super sampling method for
rendering, this should no longer be an issue.
Note: This can break visual compatibility, but the previous radii
were arguably "wrong". This essentially fixes this and renders
the strokes with the actual radii from the attribute.
Previous files that might have relied on the clamping will have
to be updated.
Pull Request: https://projects.blender.org/blender/blender/pulls/138119
Metal and Vulkan don't support line smoothing. There is a workaround
implemented. This workaround is only enabled when linesmooth value is
larger than 1. However When using smooth lines it should also be used.
This is fixed by adding a `GPU_line_smooth_get` function for getting the
current line smooth state.
Pull Request: https://projects.blender.org/blender/blender/pulls/138123
This patch removes the Alpha option in the Map UV node. That's because
it currently only has a marginal effect on the result of the node and is
unnoticeable to the user.
The alpha option was supposed to darken the edges of the object that has
the UVs by attenuating the result according to the Laplacian of UV
coordinates. But the attenuation factor was too small to be noticeable.
This option was probably added when UV passes had no anti-aliased alpha
and was therefore needed to smooth the transition at the edges probably.
We asked for user feedback and got feedback from the studio, and it seem
the option is mostly unused.
Pull Request: https://projects.blender.org/blender/blender/pulls/138116
This is implements option 1 of #129309. It contains a few changes:
* Split `BHead8` into `SmallBHead8` and `LargeBHead8`. The latter is the new one
and uses `int64_t` for array sizes instead of just `int`. That applies to to
buffer size in bytes (`len`) and the array size (`nr`).
* The first .blend file header (the first few bytes of the file) are updated
according to #129309.
* Support reading files with that use `BHead4`, `SmallBHead8` and `LargeBHead8`.
* New option in the preferences that controls whether new files are written with
the older `SmallBHead8` or the new `LargeBHead8`. The new file format is
disabled by default. Potential unofficial 32 bit builds (#67184) always write
`BHead4`, but can read all types (in theory anyway, can't test it).
Note that there are other places in Blender that don't fully support arrays this
large. E.g. I noticed that the spreadsheet currently can't scroll all the way
down.
The experimental option can be removed once we are in the 5.0 branch, at which
point only 4.5 will be able to open the files saved with 5.0.
Co-authored-by: Bastien Montagne <bastien@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/129751
- ImBuf reference counting: turn that into just an atomic integer
- Cachefile safety: turn into a mutex, since work under the spinlock
was quite heavy (hashtable creation, other memory allocations)
- Movie clip editor: turn into a mutex, since work under the spinlock
was very heavy (reading files from disk, etc.)
- Mesh intersect: remove the previously commented out spinlock path;
replace BLI mutex with C++ mutex for shorter code
Pull Request: https://projects.blender.org/blender/blender/pulls/137989
Each time when rendering an image a new context is created. When the
context is destroyed it can still contain a render graph, which ownership
isn't transferred back to the device. Resulting in an increase of several
MB per render. The render graph is cleanly destroyed during quiting as
there is a master list of created render graphs.
Fixed by moving the ownership of the render graph back to the device
when a context is unregistered.
Pull Request: https://projects.blender.org/blender/blender/pulls/138095
When using a tablet an increased gizmo hit-radius was used,
afterwards toggling modifier keys could then re-evaluate the
highlighted gizmo with a smaller radius, meaning that pressing modifiers
caused gizmos to become inactive.
Revert [0] while keeping larger radius for both mouse & tablet motion
from 10 to 12 px (scaled by the UI scale).
[0]: a043a0e74d
There is an internal flag which gets set only during USD import,
which allows for custom attributes to be read from Mesh Sequence Cache.
This flag is not shown anywhere in the UI, and is not accessible through
python, which can lead to confusing behavior.
For example, loading a .usd file through `bpy.ops.wm.usd_import` will read
custom attributes, but when `Mesh Sequence Cache` is added manually then
loading the same usd file through `bpy.ops.cachefile.open` will not load
custom attributes.
Further, if the user edits `object.modifiers['MeshSequenceCache'].read_data`
then `MOD_MESHSEQ_READ_ATTRIBUTES` flag will be removed with no way
to bring it back.
Pull Request: https://projects.blender.org/blender/blender/pulls/132475
Calling UV stitch from menu executes `exec` function instead of modal() due
to wrong opcontext. Set `operator_context=INVOKE_REGION_WIN` so that
clicking operator from menu invokes `stitch_invoke()` which later calls
modal operation.
Pull Request: https://projects.blender.org/blender/blender/pulls/138075
There are two way to access input data in geometry nodes: `get_input` and `extract_input`.
They semantially the same as copy and move constructor. Since sometimes we can access
input multiple times in arbitrary context, we can not always use move constructor. But if
execution flow is trivial then `extract_input` should be preferred.
Pull Request: https://projects.blender.org/blender/blender/pulls/138057
Instead of creating a 1D 'data' texture during color attribute baking,
where `width = #face corners` and `height = 1`, this PR uses a square
aspect texture instead. This fixes the reported issue and makes baking
faster since large 1D textures are inefficient with tiled rendering.
See PR for measurements and further details.
Pull Request: https://projects.blender.org/blender/blender/pulls/138047
(when done from python)
When changing those setting via the **UI**, we somehow get the
dependency graph tagged for changes.
If this happens, the update ripples through the following (and we are
all good):
- `DEG_editors_update` (here the depsgraph update is detected via
`DEG_id_type_any_updated`)
- `deg_editors_scene_update`
- `ED_render_scene_update`
- `ED_render_view3d_update`
- `engine_view_update` which finally calls the renderengine `sync_func`
When doing this from python though, the DEG tag is missing, now added
(similar to 4f15c24705)
We could as well just do a broader update though and use
`update_render_engine`
Since it seems `CyclesVisibilitySettings` are only used for the World
nowadays, this PR also corrects the property descriptions accordingly
(might make sense to split this into a separate commit).
Pull Request: https://projects.blender.org/blender/blender/pulls/138093
Since commit 175686f2, the behavior of OSL now matches SVM, so instead of this
separate test this is handled by the regular test and the "does OSL match SVM"
checks.
This changes removes Vulkan from experimental. It should be feature
parity with OpenGL with the exception of USD/Hydra.
Some backports to USD/Hydra developments are being made #133717
and should land soon.
This change only updates UI text that mentions the state of the
backend.
Thanks for the community so far for testing and reporting issues!
Pull Request: https://projects.blender.org/blender/blender/pulls/138086
Blender crashes if the compositor runs from the render pipeline while
the node tree is being drawn. That's because the render pipeline
adjusting the original node tree by calling ntreeCompositTagRender from
a different thread during compositor evaluation, which is unsafe while
the node tree is being drawn on the main thread.
ntreeCompositTagRender seems to update Composite and Texture nodes, but
they don't have update function, so it seems to do nothing in those
cases. It also updates nodes that reference the scene, like the Render
Layers and Cryptomatte nodes, but this seems to be already done in other
places like do_render_compositor_scenes and ntreeCompositUpdateRLayers.
Furthermore, in one of the calls in the render pipeline, it does raw
pointer comparison with the evaluated scene, so the comparison fails and
it does nothing.
Considering the above, it seems this can be omitted from the render
pipeline code.
Pull Request: https://projects.blender.org/blender/blender/pulls/138087
The GHOST_DisplayManager and its implementations are for the most part
unmaintained and almost completely unused in all backends. To clean
things up, and avoid any confusion about how displays are handled in
each respective GHOST backend, this PR completely removes the GHOST
Display Manager, and move the few remaining logic it still held directly
to the corresponding backends.
The backends that were modified (apart from removing the display manager
initialization call from their init) are:
- Win32: `GHOST_SystemWin32::getNumDisplays()` was calling
`m_displayManager->getNumDisplays`, the underlying system metric call
(`GetSystemMetrics(SM_CMONITORS)`) was substituted in place.
- SDL: `GHOST_SystemSDL::createWindow` was calling
`GHOST_DisplayManagerSDL::getCurrentDisplayModeSDL` which returned its
`m_mode` data member by reference. Since none of the
`GHOST_DisplayManagerSDL` member function that modified this data member
were ever called, the variable `memset` initialization call was
substituted in place from the `DisplayManagerSDL` constructor
Pull Request: https://projects.blender.org/blender/blender/pulls/138066
The issue here is that the automatic bump shader in OSL adjusts globals.N,
which is used to define each closure's shading normal, but sd->N remains
as it was (unlike SVM, which sets it from svm_node_set_normal).
This is a problem since some code like the shadow terminator stuff uses sd->N,
so to make behavior consistent the fix is to set it from OSL as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/138092
When allocating a large vertex buffer on NVIDIA it tried to allocate it
on the GPU and host visible. This section is limited in size (256MB).
However when allocating a large vertex buffer it should not have been
chosen.
Detected during investigation of #137909.
It removes the validation error, but it is unclear yet if this solves the'
crash as I wasn't able to reproduce the crash.
Pull Request: https://projects.blender.org/blender/blender/pulls/138079
This PR adds a separate "Grease Pencil" render pass. Once
the "Grease Pencil" option is checked in the passes list, the
Grease Pencil engine will render to a new render pass for various
composition uses.
Notes:
- Occluded Grease Pencil geometry is not rendered.
- In most cases, using an "Alpha Over" with the rest will result
in the same render as the "Combined" output. The exception is
when there are Grease Pencil layers that use a blending mode
that changes the chromaticity of the alpha channel.
Pull Request: https://projects.blender.org/blender/blender/pulls/137638
On the one hand, this improves initialization time since we don't need to
load/compile the full OSL module with all the shading logic if we're only
using a custom camera with SVM shading.
On the other hand, it also fixes a bug I noticed while preparing test scenes:
The AO and Bevel nodes don't work when using custom cameras with SVM on OptiX.
The issue there is that those two are handled by the SHADE_SURFACE_RAYTRACE
kernel, but since that one has intersection logic, we use the OptiX-specific
kernel even if OSL shading is disabled.
However, with the previous unified OSL module, this would mean loading
SHADE_SURFACE_RAYTRACE from kernel_osl.cu, which has `#undef __SVM__` and
therefore doesn't handle them correctly.
With this change, we'll use the kernels from kernel_shader_raytrace.cu in that
case, which do support SVM nodes just fine.
Disk usage of the new kernel_optix_osl_camera.ptx.zst file is 30KB, so this
also doesn't blow up the kernel disk size (and kernel_optix_osl.ptx.zst is
probably smaller by that amount now).
Since it seems that we can mix modules just fine, I'm suspecting that we could
split the modules properly (intersection, SVM shading with raytracing,
OSL shading, OSL camera), instead of the current approach where modules
essentially correspond to feature set tiers and each includes the previous
one's kernels as well - but that's a separate refactor.
Pull Request: https://projects.blender.org/blender/blender/pulls/138021
osl_transform_triple(), osl_transform_dvmdv() and so on are supposed to apply
the given transform in the context of OSL's auto-differentiation system.
Therefore, the given input is a dual vector, containing both the value as v[0]
and its derivatives w.r.t. X and Y in v[1] and v[2].
However, the existing code treats these as a simple list of vectors, applying
the same operation to all three instead of propagating the derivatives.
On top of that, it also treated the given matrix input as if there were three
of them, which isn't the case.
Therefore, this commit replaces the implementation to do the right thing.
The Vector and Normal case are straightforward since the operation is linear,
so applying the same operation to all three vectors works.
The Point case is a bit more complicated, but not too bad when written out.
This bug mostly became apparent when using Object or Camera texture coordinates
with a Bump node, since that node uses OSL differentials and Object/Camera
coordinates are implemented using transform().
I'm pretty sure that all the other builtin functions (e.g. sin) at the bottom
of services_gpu.h have the same problem, but one thing at a time...
Pull Request: https://projects.blender.org/blender/blender/pulls/138045
Blender crashes when rendering a scene that has a file output node with
no inputs. That's because the preview code assumes that nodes would
always have inputs. So we simply guard against this case.
Numpad * is being used for grease pencil object to isolate active layer
while inside edit/paint mode and mouse is over viewport. This PR
includes same keyitem into dopesheet keymap, so that shortcut
can be used in dopesheet editor context.
Pull Request: https://projects.blender.org/blender/blender/pulls/132054
* Add GraphicsInteropDevice to check if interop is possible with device
* Rename GraphcisInterop to GraphicsInteropBuffer
* Include display device type and memory size in GraphicsInteropBuffer
* Unnest graphics interop class to make forward declarations possible
Pull Request: https://projects.blender.org/blender/blender/pulls/137363