This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high".
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D16909
This patch removes the option to select both AMD and Intel GPUs on system that have both. Currently both devices will be selected by default which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D17166
This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use.
Patch authored by Prakash Kamliya.
Reviewed By: brecht
Maniphest Tasks: T103393
Differential Revision: https://developer.blender.org/D17167
The merge down operator was sometimes copying the wrong frame, which altered the animation.
While merging the layers, it is sometimes needed to duplicate a keyframe,
when the lowest layer does not have a keyframe but the highest layer does.
Instead of duplicating the previous keyframe of the lowest layer, the code
was actually duplicating the active frame of the layer which was the current frame in the timeline.
This patch fixes the issue by setting the previous keyframe of the layer as its active frame before duplication.
Related issue: T104371.
Differential Revision: https://developer.blender.org/D17214
Mutex locks for manipulating GHOST_System::m_timerManager from
GHOST_SystemWayland relied on WAYLAND being the only user of the
timer-manager.
This isn't the case as timers are fired from
`GHOST_System::dispatchEvents`.
Resolve by using a separate timer-manager for wayland key-repeat timers.
Resolve a thread safety issue reported by valgrind's helgrind checker,
although I wasn't able to redo the error in practice.
NULL check on the key-repeat timer also needs to lock, otherwise it's
possible the timer is set in another thread before the lock is acquired.
Now all key-repeat timer access which may run from a thread
locks the timer mutex before any checks or timer manipulation.
There were two errors with the function used to convert face sets
to the legacy mesh format for keeping forward compatibility:
- It was moved before `CustomData_blend_write_prepare` so it
operated on an empty span.
- It modified the mesh when it's only supposed to change the copy
of the layers written to the file.
Differential Revision: https://developer.blender.org/D17210
This implements two optimizations:
* If the duplication count is constant, the offsets array can be
filled directly in parallel.
* Otherwise, extracting the counts from the virtual array is parallelized.
But there is still a serial loop over all elements in the end to compute
the offsets.
In the unlikely case an area could not be created OPERATOR_CANCELLED
was returned, this has the same value of WM_HANDLER_HANDLED however
break is logical in this situation and both flags work.
If GetFileAttributesW returns an error, only debug assert if the reason
is file not found.
See D17204 for more details.
Differential Revision: https://developer.blender.org/D17204
Reviewed by Ray Molenkamp
Allow preview region to change cursor as per the selected tool
Reviewed by: campbellbarton, ISS
Differential Revision: https://developer.blender.org/D16878
This makes the code more readable.
In this commit also the `int curr_side_unclamp` member was moved to
`EdgeSlideParams` as it is a common value for all "Containers".
There was an inconsistency between geometry sample nodes and mesh/curve
sample nodes, where the latter didn't have a special "Sample" category,
and we categorized as "Operations", which they were not. Also put the
sample category between "Read" and "Write" since the verb name is
more consistent and sampling is an advanced form of reading.
In some cases panels without headers were drawn too wide in sidebars
with region overlap.
Instead of always using the maximum width the view allows, headerless
panels now also use the width calculated in
`panel_draw_width_from_max_width_get`. The function already returns the
correct width in all cases and also takes care of insetting the panels,
when their backdrop needs to be drawn, which is necessary for headerless
panels, too, when there is region overlap.
Reviewed By: Hans Goudey
Differential Revision: http://developer.blender.org/D17194
This is both a cleanup and a preparation for the Principled v2 changes.
Notable changes:
- Clearcoat weight is now folded into the closure weight, there's no reason
to track this separately.
- There's a general-purpose helper for computing a Closure's albedo, which is
currently used by the denoising albedo and diffuse/gloss/transmission color
passes.
- The d/g/t color passes didn't account for closure albedo before, this means
that e.g. metallic shaders with Principled v2 now have their color texture
included in the glossy color pass. Also fixes T104041 (sheen albedo).
- Instead of precomputing and storing the albedo during shader setup, compute
it when needed. This is technically redundant since we still need to compute
it on shader setup to adjust the sample weight, but the operation is cheap
enough that freeing up the storage seems worth it.
- Future changes (Principled v2) are easier to integrate since the Fresnel
handling isn't all over the place anymore.
- Fresnel handling in the Multiscattering GGX code is still ugly, but since
removing that entirely is the next step, putting effort into cleaning it up
doesn't seem worth it.
- Apart from the d/g/t color passes, no changes to render results are expected.
Differential Revision: https://developer.blender.org/D17101
Useful for validating changes when sampling/noise changes:
- First run with BLENDER_TEST_UPDATE=1 and e.g. CYCLESTEST_SPP_MULTIPLIER=32
- Apply your change
- Run with only CYCLESTEST_SPP_MULTIPLIER=32
- Compare
- Reset the SVN repo
Differential Revision: https://developer.blender.org/D17107
Cycles ignores the size of spot lights, therefore the illuminated area doesn't match the gizmo. This patch resolves this discrepancy.
| Before (Cycles) | After (Cycles) | Eevee
|{F14200605}|{F14200595}|{F14200600}|
This is done by scaling the ray direction by the size of the cone. The implementation of `spot_light_attenuation()` in `spot.h` matches `spot_attenuation()` in `lights_lib.glsl`.
**Test file**:
{F14200728}
Differential Revision: https://developer.blender.org/D17129
Own mistake in d204830107.
For some buttons the type is changed after construction, which means the button
has to be reconstructed. For example number buttons can be turned into number
slider buttons this way. New code was unintentionally overriding the button
type after reconstruction with the old type again.
Use the `SharedCache` concept introduced in D16204 to share lazily
calculated evaluated data between original and multiple evaluated
curves data-blocks. Combined with D14139, this should basically remove
most costs associated with copying large curves data-blocks (though
they add slightly higher constant overhead). The caches should
interact well with undo steps, limiting recalculations on undo/redo.
Options for avoiding the new overhead associated with the shared
caches are described in T104327 and can be addressed separately.
Simple situations affected by this change are using any of the following data
on an evaluated curves data-block without first invalidating it:
- Evaluated offsets (size of evaluated curves)
- Evaluated positions
- Evaluated tangents
- Evaluated normals
- Evaluated lengths (spline parameter node)
- Internal Bezier and NURBS caches
In a test with 4m points and 170k curves, using curve normals in a
procedural setup that didn't change positions procedurally gave 5x
faster playback speeds. Avoiding recalculating the offsets on every
update saved about 3 ms for every sculpt update for brushes that don't
change topology.
Differential Revision: https://developer.blender.org/D17134
This will add a proper modal keymap for the node link drag operator.
It allows the user to customize the keys used to start drag and so on.
Also it gets rid of the custom status bar message.
Differential Revision: https://developer.blender.org/D17190
No user-visible changes expected.
Essentially, this makes it possible to use C++ types like `std::function`
inside `uiBut`. This has plenty of benefits, for example this should help
significantly reducing unsafe `void *` use (since a `std::function` can hold
arbitrary data while preserving types).
----
I wanted to use a non-trivially-constructible C++ type (`std::function`) inside
`uiBut`. But this would mean we can't use `MEM_cnew()` like allocation anymore.
Rather than writing worse code, allow non-trivial construction for `uiBut`.
Member-initializing all members is annoying since there are so many, but rather
safe than sorry. As we use more C++ types (e.g. convert callbacks to use
`std::function`), this should become less since they initialize properly on
default construction.
Also use proper C++ inheritance for `uiBut` subtypes, the old way to allocate
based on size isn't working anymore.
Differential Revision: https://developer.blender.org/D17164
Reviewed by: Hans Goudey
Currently this conversion (which happens when using modifiers in edit
mode, for example) is completely single threaded. It's harder than some
other areas to multithread because BMesh elements don't always know
their indices (and vise versa), and because the dynamic AoS format
used by BMesh makes some typical solutions not helpful.
This patch proposes to split the operation into two steps. The first
updates the indices of BMesh elements and builds tables for easy
iteration later. It also checks if some optional mesh attributes
should be added. The second uses parallel loops over all elements,
copying attribute values and building the Mesh topology.
Both steps process different domains in separate threads (though the
first has to combine faces and loops). Though this isn't proper data
parallelism, it's quite helpful because each domain doesn't affect the
others.
**Timings**
I tested this on a Ryzen 7950x with a 1 million face grid, with no
extra attributes and then with several color attributes and vertex
groups.
| File | Before | After |
| Simple | 101.6 ms | 59.6 ms |
| More Attributes | 149.2 ms | 65.6 ms |
The optimization scales better with more attributes on the BMesh. The
speedup isn't as linear as multithreading other operations, indicating
added overhead. I think this is worth it though, because the user is
usually actively interacting with a mesh in edit mode.
See the differential revision for more timing information.
Differential Revision: https://developer.blender.org/D16249
Whenever a transform operation is activated by gizmo, the gizmo
modal is maintained, but its drawing remains the same even if the
transform mode or constrain is changed.
So update the gizmo according to the mode or constrain set.
NOTE: Currently only 3D view gizmo is affected
The operator has the option to add to selected FCurves instead
of only the active, but it was only exposed in the redo panel.
This patch adds the operator to the right-click menu on FCurve channels,
and to the channel menu in the Graph editor.
Both times with configured to add to selected
instead of only the active FCurve
Revied by Reviewed by: Sybren A. Stüvel
Differential Revision: https://developer.blender.org/D17066
Ref: D17066