Commit Graph

11737 Commits

Author SHA1 Message Date
Campbell Barton
7ebd1f4b79 Cleanup: quiet compiler warnings 2022-07-19 13:32:53 +10:00
Brecht Van Lommel
3407ed5f9b Cleanup: change internal Cycles compact BVH default to match UI 2022-07-18 15:34:13 +02:00
Brecht Van Lommel
c7f788b877 Subdiv: remove unused GPU device choice, fix crash with libepoxy on init
openSubdiv_init() would detect available evaluators before any OpenGL context
exists, causing a crash with libepoxy. This test however is redundant as we
already check the requirements on the Blender side through the GPU API.

To simplify things, completely remove the device detection in the opensubdiv
module and reduce the evaluators to just CPU and GPU. The plan here is to move
to the GPU module abstraction over OpenGL/Metal/Vulkan and so all these
different backends no longer make sense.

This also removes the user preference for OpenSubdiv compute device, which was
not used for the new GPU subdivision implementation.

Ref D15291

Differential Revision: https://developer.blender.org/D15470
2022-07-18 13:59:08 +02:00
Campbell Barton
49babc7caa Cleanup: early exit MEM_lockfree_freeN when called with NULL pointer
Simplify logic for freeing a NULL pointer. While no null-pointer
de-reference was performed, this wasn't as so obvious as the pointer
was passed to MEM_lockfree_allocN_len before checking for NULL.

NOTE: T99744 claimed the a NULL pointer free was a vulnerability,
while I can't see evidence for this - exiting early makes it clearer
the memory isn't accessed.

*Details*

- Add MEMHEAD_LEN macro, avoids redundant NULL check.
- Use "UNLIKELY(..)" hint's for error cases
  (freeing NULL pointer and checking if `leak_detector_has_run`).
2022-07-16 17:23:42 +10:00
Brecht Van Lommel
5152c7c152 Cycles: refactor rays to have start and end distance, fix precision issues
For transparency, volume and light intersection rays, adjust these distances
rather than the ray start position. This way we increment the start distance
by the smallest possible float increment to avoid self intersections, and be
sure it works as the distance compared to be will be exactly the same as
before, due to the ray start position and direction remaining the same.

Fix T98764, T96537, hair ray tracing precision issues.

Differential Revision: https://developer.blender.org/D15455
2022-07-15 18:46:24 +02:00
Brecht Van Lommel
bb376da6df Fix Cycles MetalRT error after recent specialization changes 2022-07-15 18:28:13 +02:00
Brecht Van Lommel
011d3c75a7 Cleanup: compiler warning 2022-07-15 15:20:53 +02:00
Brecht Van Lommel
523bbf7065 Cycles: generalize shader sorting / locality heuristic to all GPU devices
This was added for Metal, but also gives good results with CUDA and OptiX.
Also enable it for future Apple GPUs instead of only M1 and M2, since this has
been shown to help across multiple GPUs so the better bet seems to enable
rather than disable it.

Also moves some of the logic outside of the Metal device code, and always
enables the code in the kernel since other devices don't do dynamic compile.

Time per sample with OptiX + RTX A6000:
                                         new                  old
barbershop_interior                      0.0730s              0.0727s
bmw27                                    0.0047s              0.0053s
classroom                                0.0428s              0.0464s
fishy_cat                                0.0102s              0.0108s
junkshop                                 0.0366s              0.0395s
koro                                     0.0567s              0.0578s
monster                                  0.0206s              0.0223s
pabellon                                 0.0158s              0.0174s
sponza                                   0.0088s              0.0100s
spring                                   0.1267s              0.1280s
victor                                   0.0524s              0.0531s
wdas_cloud                               0.0817s              0.0816s

Ref D15331, T87836
2022-07-15 13:42:47 +02:00
Michael Jones
da4ef05e4d Cycles: Apple Silicon optimization to specialize intersection kernels
The Metal backend now compiles and caches a second set of kernels which are
optimized for scene contents, enabled for Apple Silicon.

The implementation supports doing this both for intersection and shading
kernels. However this is currently only enabled for intersection kernels that
are quick to compile, and already give a good speedup. Enabling this for
shading kernels would be faster still, however this also causes a long wait
times and would need a good user interface to control this.

M1 Max samples per minute (macOS 13.0):

                    PSO_GENERIC  PSO_SPECIALIZED_INTERSECT  PSO_SPECIALIZED_SHADE

barbershop_interior       83.4	            89.5                   93.7
bmw27                   1486.1	          1671.0                 1825.8
classroom                175.2	           196.8                  206.3
fishy_cat                674.2	           704.3                  719.3
junkshop                 205.4	           212.0                  257.7
koro                     310.1	           336.1                  342.8
monster                  376.7	           418.6                  424.1
pabellon                 273.5	           325.4                  339.8
sponza                   830.6	           929.6                 1142.4
victor                    86.7              96.4                   96.3
wdas_cloud               111.8	           112.7                  183.1

Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones

Differential Revision: https://developer.blender.org/D14645
2022-07-15 13:40:04 +02:00
Michael Jones
5653c5fcdd Cycles: keep track of SVM nodes used in kernels
To be used for specialization in Metal, to automatically leave out unused nodes
from the kernel.

Ref D14645
2022-07-15 13:40:04 +02:00
Brecht Van Lommel
79da7f2a8f Cycles: refactor to move part of KernelData definition to template header
To be used for specialization on Metal in a following commit, turning these
members into compile time constants.

Ref D14645
2022-07-15 13:40:04 +02:00
Damien Picard
2e70d5cb98 Render: camera depth of field support for armature bone targets
This is useful when using an armature as a camera rig, to avoid creating and
targetting an empty object.

Differential Revision: https://developer.blender.org/D7012
2022-07-15 13:40:04 +02:00
Brecht Van Lommel
b8ffd43bd2 Cleanup: make format 2022-07-15 13:40:04 +02:00
Campbell Barton
d8094f9212 GHOST/Wayland: partial support for updating the UI scale
Partial support for changing the UI scale while Blender is open.

The scale is set but issues with the window size not updating remain.
2022-07-15 15:42:24 +10:00
Campbell Barton
60f260eb6a GHOST/Wayland: fix error setting the cursor scale
Calculate a scale that's compatible with the cursor size.
Needed so the cursor is always a multiple of scale.
2022-07-15 15:36:21 +10:00
Olivier Maury
1b5db02a02 Fix Cycles MNEE wrong results with area light spread
When the solve is successful, the light sample needs to be updated since the
effective shading point is now on the last refractive interface. Spread was
not taken into account, creating false caustics.

Differential Revision: https://developer.blender.org/D15449
2022-07-14 16:36:38 +02:00
Brecht Van Lommel
28c3739a9b Cleanup: replace state flow macros in the kernel with functions 2022-07-14 16:36:38 +02:00
Brecht Van Lommel
5539fb3121 Cycles: add presets to the Performance panel
With choices Default, Lower Memory and Faster Render. For convenience, and
to help communicate what the various settings do.

Differential Revision: https://developer.blender.org/D15446
2022-07-14 16:36:38 +02:00
Michael Jones
4b1d315017 Cycles: Improve cache usage on Apple GPUs by chunking active indices
This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15331
2022-07-14 14:26:18 +01:00
Campbell Barton
2d04012e57 Cleanup: spelling in comments
Also remove duplicate comments in bmesh_log.h, caused by automated
comment relocation in [0].

[0]: c4e041da23
2022-07-14 22:02:52 +10:00
Campbell Barton
93f74299f0 Cleanup: clang-tidy changes to GHOST_SystemX11
Also remove redundant check.
2022-07-14 21:55:46 +10:00
Campbell Barton
cdd8b96e3b GHOST: remove redundant ascii argument to GHOST_EventKey
Now only the utf8 buffer is used there is no reason to pass both.
2022-07-14 21:54:28 +10:00
Campbell Barton
db80cf6ad7 GHOST/X11: avoid redundant utf8 text lookups for release events
The text representation of release events is never used,
so only calculate this for press events.
2022-07-14 21:10:22 +10:00
Campbell Barton
64e196422e GHOST/X11: Quiet warning about key-release events having their utf8 set
Quiet warning from [0], no functional change as the this information
was always ignored.

Key release events shouldn't have associated text, this was cleared
for wmEvent's, so there is no reason to pass it from GHOST.

[0]: d6fef73ef1
2022-07-14 21:04:16 +10:00
Campbell Barton
b35e33317d Cleanup: update & correct comments for event handling
- Remove references to `ISTEXTINPUT` as any keyboard event with it's
  utf8_buf set can be handled as text input.

- Update references to the key repeat flag.
2022-07-14 16:10:13 +10:00
Campbell Barton
d6fef73ef1 WM: Remove ASCII members from wmEvent & GHOST_TEventKeyData
The `ascii` member was only kept for historic reason as some platforms
didn't support utf8 when it was first introduced.

Remove the `ascii` struct members since many checks used this as a
fall-back for utf8_buf not being set which isn't needed.
There are a few cases where it's convenient to access the ASCII value
of an event (or nil) so a function has been added to do that.

*Details*

- WM_event_utf8_to_ascii() has been added for the few cases an events
  ASCII value needs to be accessed, this just avoids having to do
  multi-byte character checks in-line.

- RNA Event.ascii remains, using utf8_buf[0] for single byte characters.

- GHOST_TEventKeyData.ascii has been removed.

- To avoid regressions non-ASCII Latin1 characters from GHOST are
  converted into multi-byte UTF8, when building X11 without
  XInput & X_HAVE_UTF8_STRING it seems like could still occur.
2022-07-14 15:59:19 +10:00
Campbell Barton
816a73891b GHOST/SDL: pass in utf8 buffer for keyboard events
While GHOST/SDL doesn't support non-ASCII text input,
use the UTF8 buffer to be consistent with all other back-ends.

Move the conversion from SDL_KeyboardEvent to ASCII into a function.

Also only lookup this value on key press (not release).
2022-07-14 15:58:46 +10:00
Campbell Barton
09a74ff8b6 GHOST/SDL: add support for the key repeat flag
Now all ghost back-ends support the key repeat flag
(accessed as WM_EVENT_IS_REPEAT from wmEvent.flag).
2022-07-14 09:46:58 +10:00
Xavier Hallade
5f09440d5a Cycles: Make not-compact BVH the default for embree
Measurements shown on average a 1.08x speedup for a 1.04x increase in
memory usage which is an acceptable trade off for a default setting,
although discoverability of such settings influencing memory usage could
be improved.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15429
2022-07-12 18:40:14 +02:00
Xavier Hallade
47dd42485e Cycles: fix and enable JIT oneAPI CentOS7 builds for drivers 23570+
The current specific CentOS7 workaround we have for AoT, which is to
disable __FAST_MATH__ by using -fhonor-nans, now also fixes the
compilation issue for JIT as well since at least driver 23570.
2022-07-12 15:55:32 +02:00
Brecht Van Lommel
6e426259b4 Fix T99218: light group add button should be disabled when name is empty
Previously it was inactive but still clickable.

Ref D15316
2022-07-11 14:02:38 +02:00
Campbell Barton
a83502f05f Cleanup: remove unused GHOST function getAnyModifiedState.
Remove unused GHOST_WindowManager::getAnyModifiedState()
2022-07-11 10:38:02 +10:00
Campbell Barton
443690604f Fix cursor display size with tablet input in GHOST/Wayland
The scale for tablet cursor surfaces was never set, making them display
larger. Now the outputs scale is set for mouse & tablet cursors.
2022-07-09 22:46:53 +10:00
Campbell Barton
1de14061cb Cleanup: split out wl_buffer creation into a utility function
Simplify logic for initializing the wl_buffer, ensure the cursors
custom data is never heft in a half initialized state.
Also remove the need for multiple calls to close when handling errors.
2022-07-09 22:27:30 +10:00
Campbell Barton
9a1d772339 Cleanup: remove buffer_t in GHOST/Wayland
This was allocated and only used to store the custom cursor data.
Use a pointer & size member instead for simplicity.
2022-07-09 22:27:28 +10:00
Campbell Barton
ef970b7756 Cleanup: split memfd_create into it's own function for Wayland
Avoid ifdef's in cursor loading by creating a memfd_create_sealed
utility function that works irrespective of memfd_create availability.
2022-07-09 22:27:27 +10:00
Campbell Barton
b5d22a8134 Fix resource leaks setting custom cursors in Wayland
- Memory from the prior cursor was never un-mapped.
- posix_fallocate failure left a file handle open..
2022-07-09 22:27:26 +10:00
Brecht Van Lommel
8159e0a666 Curves: use consistent default radius for Cycles, Eevee, Set Curve Radius node
To avoid Cycles not showing any hair by default, and to avoid very slow render
due to many overlaps with the previous 1 meter default in the node.

Fixes T97584, T99319

Differential Revision: https://developer.blender.org/D15405
2022-07-08 16:21:32 +02:00
Xavier Hallade
0f50ae131f Cycles: enable oneAPI in Linux release builds
with a very high min-driver version requirement, placeholder until JIT
CentOS runtime compilation issue gets fixed in a defined version.
min-driver version check can be worked around by setting
CYCLES_ONEAPI_ALL_DEVICES environment variable.
2022-07-08 15:39:13 +02:00
Campbell Barton
754dae6c76 GHOST/Wayland: add logging for listener handlers
Add logging to all Wayland listener callbacks as it can be difficult
to detect the cause of problems.

Using break-points often isn't practical for debugging interactive
windowing / compositor issues

Logging needs to be enabled on the command line, e.g:

blender --log "ghost.wl.*" --log-level 2 --log-show-basename
2022-07-08 19:40:10 +10:00
Campbell Barton
47616992f8 GHOST: use ELEM/ARRAY_SIZE/UNPACK macros to avoid repetition
Also use UNLIKELY macro for checks for very unlikely scenarios.
2022-07-08 19:39:11 +10:00
Campbell Barton
418d82af28 GHOST: add GHOST_utildefines
Add macros from BLI_utildefines, mainly to avoid that avoid repetition
(ELEM, UNPACK*, CLAMP* & ARRAY_SIZE).

Also add macros LIKELY/UNLIKELY as there are quiet a lot of checks
for unlikely situations for GHOST/Wayland (not having a keyboard,
or mouse for e.g.).
2022-07-08 19:36:31 +10:00
Campbell Barton
5d6e7df4a8 GHOST: initialize grab axis for windows
While this didn't cause any user visible bugs, ASAN would report
an error when passing it as an argument.
2022-07-07 21:44:48 +10:00
Campbell Barton
843ad51d18 Cleanup: use arguments for internal wayland cursor grabbing
Pass in arguments for internal grab logic instead of accessing
some values from the window and other values as arguments.
While more verbose it's simpler to reason about.
2022-07-07 16:36:46 +10:00
Xavier Hallade
190ad73590 Cycles oneAPI: Remove direct dependency on Level-Zero
We used it only to access device id for explicitly allowing Arc GPUs.
It made the backend require ze_loader.dll which could be problematic if
we end up using direct linking.
I've replaced filtering based on PCI device id by using other HW properties
instead (EUs, threads per EU), that are now available through Level-Zero.
2022-07-06 18:55:38 +02:00
Xavier Hallade
debb233787 Cleanup: fix comments in oneAPI kernel.cpp 2022-07-06 18:55:38 +02:00
Nikita Sirgienko
0df574b55e Cycles: Improve an occupancy for Intel GPUs
Initially oneAPI implementation have waited after each memory
operation, even if there was no need for this. Now, the implementation
will wait only if it is really necessary - it have improved
performance noticeble for some scenes and a bit for the rest of them.
2022-07-06 17:26:23 +02:00
Campbell Barton
e58e023e1a GHOST/Wayland: support dynamic loading libraries for Wayland
Add intern/wayland_dynload which is used when WITH_GHOST_WAYLAND_DYNLOAD
is enabled (off by default). When enabled, systems without Wayland
installed will fall back to X11.

This allows Blender to dynamically load:
- libwayland-client
- libwayland-cursor
- libwayland-egl
- libdecor-0 (when WITH_GHOST_WAYLAND_LIBDECOR is enabled).
2022-07-06 15:30:47 +10:00
Campbell Barton
d9505831a4 Cleanup: declare local variables static 2022-07-06 15:28:54 +10:00
Campbell Barton
db9e08a0d1 Cleanup: spelling in comments 2022-07-06 15:28:54 +10:00