Fill face offsets in one multithreaded loop with an offset indices
utility function instead of keeping track of the index and setting
the offset for each face.
Move Auto-Offset toggle from Node Editor View menu
into the Editing > Node Editor section of User Preferences,
to reflect its use as a workflow option not configured
per editor or per file.
Pull Request: https://projects.blender.org/blender/blender/pulls/111589
Change the algorithm to make better use of multiple CPU cores. First
offsets are created by counting the number of elements using each
vertex. Those offsets are used during the next phase that adds indices
to each group in parallel. Atomic increments are used to add elements
to each group. Since the order in each group is non-deterministic,
they are sorted in parallel afterwards.
The performance improvement depends on the number of cores, CPU caches,
memory bandwidth, single threaded performance, and mesh topology. In
our tests, performance improved by 3-4.5x for large grid-like meshes.
See [1] for investigation of this algorithm and potential alternatives.
1. https://hackmd.io/@s0TMIS4lTAGwHVO20ECwpw/build_edge_to_loop_map_tests.
Pull Request: https://projects.blender.org/blender/blender/pulls/110707
The count of faces destroyed could be wrong when more than one groups
of vertices on the same face resulted in the face collapsing.
The solution is to break the loop on the first collapsed face
detection.
Caused by 6ec842c43c.
These checks (shape keys and ID names) cost less than 1ms in Pets
production files, so think they are OK to run systematically.
The library consistency one is way more expensive (~200ms), so keeping
it behind the G_DEBUG_IO debug option for now.
Adds a userpref toggle for the edit mode overlays fresnel.
The edit mode fresnel is only a bit useful in edge cases, like
very dense photogrametry, and the problem is that it causes
more eye strain when modeling for many hours. And it's
benefit on shape readability is small compared to it's negative
impact on selection visibility. It makes the selection color to a
darker less saturated color instead of the theme color, which
leads to worse contrast between the selection and the mesh
or with the background, and also makes the unselected (black)
brighter, also reducing contrast. So it's off by default.
This was split up from https://projects.blender.org/blender/blender/pulls/110097
Pull Request: https://projects.blender.org/blender/blender/pulls/111494
NOTE: This code remains only executed in 'unlikely' case `G_DEBUG_IO` is
enabled. Think this should be systematically done, even though it can
have a non-neglectable cost... Will submit design task first though.
While in a way one could argue calling code could check for such case,
in practice it is way handier for the search code itself to just return
'not found' value in such case, rather than crash!
Add a call to `BKE_main_namemap_validate_and_fix` in the 'general
sanity validation' `after_liblink_merged_bmain_process` function.
This will detect, fix and report invalid ID naming issues (case found in
an early Gold production file).
Enables three options of wireframe color for all shading modes: theme color, object color
and random color. Previously this was exclusive to the wireframe shading mode.
Pull Request: https://projects.blender.org/blender/blender/pulls/111502
Previously `SEQ_modifier_list_copy` in append mode does not ensure
unique strip name, which will result in duplicated names in target
modifier list, then `strip_modifier_remove(name="something")` can remove
the wrong one later on. Now fixed using `BLI_uniquename`.
Pull Request: https://projects.blender.org/blender/blender/pulls/111602
* Using standard NDF and Smith shadowing-masking terms. The previous
`xxxx_opti()` functions were faster to evaluate, but confusing and
error-prone.
* After correcting the BRDF pdf, the prefiltered environment LOD bias
needs to be adjusted to avoid overblurred reflections.
* Corrected the half-vector computation in BTDF evaluation, added check
for invalid configuration due to total internal reflection or `eta == 1`.
* Use `saturate()` instead of `max()` when no division is needed because
the former is faster.
* Indirectly fixes EEVEE-Next refraction denoising.
Pull Request: https://projects.blender.org/blender/blender/pulls/111591
Setting the FPS to 120 caused the FPS to flicker erratically between
130 & 140 FPS.
This also impacted lower frame-rates with 23.98 playing back at 24.03
FPS on my system, 30 FPS played back at 30.13 FPS.
This problem was hidden by the FPS display rounding to an integer.
Regression in 2.5x series (worked in 2.49).
Resolve by clamping the sleep time in the main event loop so the 5ms
sleep doesn't result in sleeping when timers are scheduled to run.
There is still some visible FPS jitter that can be solved by using a
higher resolution sleep interval but that's out of scope for this fix.
Using a higher number of samples (enough samples to account for the last
second or two of playback for e.g.) can be useful when comparing minor
changes in overall playback speed, where the behavior of multi-threaded
operations can make the value jitter with 8 samples (default).
Using fixed-point arithmetic means the average FPS can be updated
by subtracting the oldest FPS sample before adding the new value,
instead of having to average an array of floats every draw.
Increasing the number of samples now only uses a little more memory
(20kb at most).
The error margin from using fixed-point arithmetic is under 0.5
microseconds per frame - more than enough precision for FPS display.
A commented define is included that shows the error margin when enabled.
Since vertex and face normals can be calculated separately, it simplifies
things to further separate the two caches. This makes it easier to use
`SharedCache` to avoid recalculating normals when copying meshes.
Sharing vertex normal caches with meshes with the same positions and
topology allows completely skipping recomputation as meshes are
copied. The effects are similar to e8f4010611, but normals are much
more expensive, so the benefit is larger.
In a simple test changing a large grid's generic attribute with geometry
nodes, I observed a performance improvement from 12 to 17 FPS.
Most real world situations will have smaller changes though.
Completely splitting face and vertex calculation is slightly slower
when face normals aren't already calculated, so I kept the option
to recalculate them together as well.
This simplifies investigating the changes in #105920 which resolve
non-determinism in the vertex normal calculation. If we can make the
topology map creation fast enough, that might allow simplifying this
code more in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/110479
When the mouse cursor is outside the window, it's expected to not have
an active area set in context on such screen level operators. Use the
existing context query that handles this case.
`uiBut` cannot simply be zero-allocated any more and needs proper construction
(so the `std::function` members can be constructed properly for example). I'm
not entirely sure where and how exactly the crash happens (happens in release
builds only, so probably optimization related), but this change must be done
either way and fixes the issue.
Remove the global `SculptThreadedTaskData` struct which contained
the arguments to ALL multi-threaded sculpt functions. Use the C++
threading API instead of the old task API, moving the arguments
previously stored in the shared struct to actual function arguments.
Pull Request: https://projects.blender.org/blender/blender/pulls/111525
The commit f10965dcb8 introduced a regression in the way how the VBOs
are filled by assuming the requested VBO attribute type has the same
alignment as the CPU and is the same on different platforms.
Unfortunately, this turned out to not be the case.
Switch the mask attribute to be float on the GPU, which has a downside
of increased bandwidth to be transferred, but a benefit of less compute
power needed to update the VBO.
The fix is suggested by Clement.
Pull Request: https://projects.blender.org/blender/blender/pulls/111521
This was hard coded to 8, which can still result in a number that
jitters making the overall FPS difficult to measure.
The default is still 8, but this is now a preference that can be
increased for values that don't jitter as much.