Commit Graph

14726 Commits

Author SHA1 Message Date
Jeroen Bakker
46cfba075d Subdiv: Split eval shaders
Both eval shaders were implemented in osd_kernel_comp.glsl. This PR
separates them for easier understanding of the shaders.

Pull Request: https://projects.blender.org/blender/blender/pulls/135719
2025-03-10 16:06:00 +01:00
Lukas Stockner
dbe275895e Cleanup: Cycles: Deduplicate OptiX OSL code
Not a big difference for now, but will be nicer for #129495.

Pull Request: https://projects.blender.org/blender/blender/pulls/135049
2025-03-10 13:30:58 +01:00
Jeroen Bakker
40696ca452 SubDiv: Migrate GPU subdivision to use GPU module
Blender already had its own copy of OpenSubDiv containing some local fixes
and code-style. This code still used gl-calls. This PR updates the calls
to use GPU module. This allows us to use OpenSubDiv to be usable on other
backends as well.

This PR was tested on OpenGL, Vulkan and Metal. Metal can be enabled,
but Vulkan requires some API changes to work with loose geometry.

![metal.png](/attachments/bb042c3a-1a87-4140-9958-a80da10d417b)

# Considerations

**ShaderCreateInfo**

intern/opensubdiv now requires access to GPU module. This to create buffers
in the correct context and trigger correct dispatches. ShaderCreateInfo is used
to construct the shader for cross compilation to Metal/Vulkan. However opensubdiv
shader caching structures are still used.

**Vertex buffers vs storage buffers**

Implementation tries to keep as close to the original OSD implementation. If
they used storage buffers for data, we will use GPUStorageBuf. If it uses vertex
buffers, we will use gpu::VertBuf.

**Evaluator const**

The evaluator cannot be const anymore as the GPU module API only allows
updating SSBOs when constructing. API could be improved to support updating
SSBOs.

Current implementation has a change to use reads out of bounds when constructing
SSBOs. An API change is in the planning to remove this issue. This will be fixed in
an upcoming PR. We wanted to land this PR as the visibility of the issue is not
common and multiple other changes rely on this PR to land.

Pull Request: https://projects.blender.org/blender/blender/pulls/135296
2025-03-10 07:31:59 +01:00
Brecht Van Lommel
8076fa10e0 Fix: Cycles BVH2 issue causing Metal crashes, after recent cleanup
Ref 48398b223b
2025-03-07 19:20:25 +01:00
Sergey Sharybin
32d49541c0 Fix #135572: Cycles shadow linking through transparency is broken on GPU
Make the ray self primitives store and restore reliable for cases when
the intersect_shadow kernel is called multiple times:

- Light object and primitive are stored in dedicated fields in the
  state. This adds 2 integers per state.

- The self object and primitive are used from the previous intersection
  when the intersect_shadow is called multiple times.

There is more detailed explanation added in the code.

The issue was introduced by the light refactor to be objects in #134846.

Pull Request: https://projects.blender.org/blender/blender/pulls/135573
2025-03-07 17:16:04 +01:00
Brecht Van Lommel
48398b223b Cleanup: Fix various divisions by zero reported by ASAN
Pull Request: https://projects.blender.org/blender/blender/pulls/135326
2025-03-06 22:34:23 +01:00
Brecht Van Lommel
b75b2e883d Cleanup: Always use fullsize ShaderData on the CPU
This avoids compiler and ASAN warnings. This optimization exist for
GPU rendering.
2025-03-06 22:34:22 +01:00
Brecht Van Lommel
5a9d4fd613 Cleanup: Use default member initializers 2025-03-06 22:34:22 +01:00
Brecht Van Lommel
ab394c8e8d Refactor: Use std::bitset to avoid overflow in device queue logging 2025-03-06 22:34:22 +01:00
Brecht Van Lommel
dd20f31302 Cleanup: Fix clang-tidy warnings from recent OSL changes 2025-03-06 22:34:22 +01:00
Xavier Hallade
91f332e7c6 Cycles: oneAPI: Force normal GRF for integrator_intersect_* kernels
With auto mode, integrator_intersect_subsurface still ended up being
compiled in large GRF mode on Intel Arc B580, while normal GRF provides
the best performance for this kernel.
2025-03-06 22:24:45 +01:00
Falk David
e39c83c881 Merge branch 'blender-v4.4-release' 2025-03-06 21:19:31 +01:00
Jesse Yurkovich
4c38380cc6 Fix #134043: Account for frame_offset in Alembic procedurals
There's two places which perform the frame start and end time
calculation and only one of them was respecting the frame_offset.

Pull Request: https://projects.blender.org/blender/blender/pulls/134149
2025-03-06 20:54:22 +01:00
Jesse Yurkovich
3a0e4fe316 Fix #134039: Alembic procedural proxy mesh should be owned by GeometrySet
When using Alembic procedurals, the Mesh Sequence Cache attempts to
replace the original geometry with a plain old cube. However, it never
frees this new cube geometry. Transfer ownership to the underlying
GeometrySet instead.

Investigating the scenario also showed that the `~AlembicProcedural`
dtor was removing an item from the `nodes` vector while iterating over
it, which triggers debug asserts on at least MSVC. I believe the removal
is unnecessary since this is the dtor and ASAN appears clean now.

Pull Request: https://projects.blender.org/blender/blender/pulls/134085
2025-03-06 20:53:52 +01:00
Xavier Hallade
d1120c51db Cycles: oneAPI: Enable automated GRF mode selection
The default was large GRF mode for all kernels and normal GRF for
intersection kernels.
path_array kernels also benefit from normal GRF, being almost 2x faster
in this mode, as measured on my Arc B580. This translates to a much
smaller 1-3% speedup in overall rendering.
Instead of manually adding them to the list of kernels to compile in
normal GRF mode, I've switched to auto that provides the same result.
2025-03-06 17:47:19 +01:00
Sybren A. Stüvel
15758ab854 Merge remote-tracking branch 'origin/blender-v4.4-release' 2025-03-06 14:13:03 +01:00
Sergey Sharybin
7397e6da29 Fix: Cycles HIP-RT crash
The crash has been introduced by the refactor of lights to be
objects in #134846.

We can make such cases easier to catch at compile time in the
future, but for now applying the minimal patch which solves the
problem without going deeper into refactor.

Pull Request: https://projects.blender.org/blender/blender/pulls/135570
2025-03-06 11:58:13 +01:00
Sergey Sharybin
f89728a5e4 Fix: HIP-RT creates copy of vector<Object *> during build
Is harmless from functional perspective, but uses more resources and
potentially slower than it should be. Although, probably something
hard to measure in practice, but still better not follow this anti-
pattern.

Pull Request: https://projects.blender.org/blender/blender/pulls/135529
2025-03-06 11:57:51 +01:00
Sergey Sharybin
3f6fca4297 Cycles: Enable HIP-RT logging when debug log is on
These logs do not appear to be that noisy and do help nailing down
issues in HIP-RT.

Pull Request: https://projects.blender.org/blender/blender/pulls/135530
2025-03-06 11:57:34 +01:00
Campbell Barton
5b856ba447 Merge branch 'blender-v4.4-release' 2025-03-06 10:35:59 +11:00
Campbell Barton
b85fc32cae Cleanup: spelling & repeated words in comments
Address warnings from check_spelling.py
2025-03-06 10:33:21 +11:00
Bastien Montagne
2900cfa50a MEM_guardedalloc: Add template 'type-safe' versions of MEM_mallocN.
Same thing as for `MEM_callocN<T>` and `MEM_freeN<T>` in dd168a35c5,
allows to reduce type verbosity, and increase type safety.
2025-03-05 19:35:50 +01:00
Bastien Montagne
dd168a35c5 Refactor: Replace MEM_cnew with a type-aware template version of MEM_callocN.
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.

* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.

Similar templates type-safe version of `MEM_mallocN` will be added soon
as well.

Following discussions in !134452.

NOTE: For now static type checking in `MEM_callocN` and related are slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So using checks on trivially
constructible/destructible instead on this compiler/system.

Pull Request: https://projects.blender.org/blender/blender/pulls/134771
2025-03-05 16:35:09 +01:00
Brecht Van Lommel
65889672e2 Fix #135432: Cycles world light and shadow catcher bug after recent changes
Caused by #134846, e813e46327.

Pull Request: https://projects.blender.org/blender/blender/pulls/135440
2025-03-05 16:27:49 +01:00
Brecht Van Lommel
28f7e2ae91 Merge branch 'blender-v4.4-release' 2025-03-04 17:53:08 +01:00
Brecht Van Lommel
a3baf60df4 Fix: Cycles device info uninitialized variable
It's unclear if this caused an actual bug, detected by ASAN.
2025-03-04 17:46:04 +01:00
Lukas Stockner
39a6c222b2 Cycles: Split OSLManager out of OSLShaderManager
This is preparation for #129495.

Currently, all of OSL is managed by the OSLShaderManager. This makes it so that the general OSL setup is handled by a general OSLManager, and both the OSLShaderManager and (in the future) the Camera can use it to manage their scripts.

Pull Request: https://projects.blender.org/blender/blender/pulls/135050
2025-03-04 17:27:10 +01:00
Jeroen Bakker
4a95c0405c Cleanup: Subdiv: Remove wrapper OpenSubdiv_Buffer
`OpenSubdiv_Buffer` is a wrapper that was introduced at the time
that Blender couldn't use CPP directly. It contains a pointer to
a VertBuf and callbacks to use GPU module on that buffer.

This PR replaces OpenSubdiv_Buffer with `blender::gpu::VertBuf` and
removes the wrapper.

NOTE: OpenSubdiv tests are added to blender_test executable to make the
library dependencies not to complicated.

Pull Request: https://projects.blender.org/blender/blender/pulls/135389
2025-03-04 07:51:15 +01:00
Xavier Hallade
90a10dcd50 Cycles: Adjust inlining attributes for oneAPI device
Now ccl_device sets inlining and ccl_device_inline forces inlining.
This matches more closely with what is currently done for cuda and metal
backends.
I've measured from 1% to 6% overall performance improvement in rendering
benchmark scenes on Arc B580, as well as a small decrease in compile
time.
2025-03-03 18:20:02 +01:00
Aras Pranckevicius
cc2c6692c0 Cleanup: Name more IMB things as "byte" or "float" instead of "rect" and "rectFloat"
- IB_rect -> IB_byte_data
- IB_rectfloat -> IB_float_data
- Rename some functions:
	- IMB_get_rect_len -> IMB_get_pixel_count
	- IMB_rect_from_float -> IMB_byte_from_float
	- IMB_float_from_rect_ex -> IMB_float_from_byte_ex
	- IMB_float_from_rect -> IMB_float_from_byte
	- imb_addrectImBuf -> IMB_alloc_byte_pixels
	- imb_freerectImBuf -> IMB_free_byte_pixels
	- imb_addrectfloatImBuf -> IMB_alloc_float_pixels
	- imb_freerectfloatImBuf -> IMB_free_float_pixels
	- imb_freemipmapImBuf -> IMB_free_mipmaps
	- imb_freerectImbuf_all -> IMB_free_all_data
- Remove IB_multiview (not used at all)
- Remove obsolete "module" comments in public IMB headers

Pull Request: https://projects.blender.org/blender/blender/pulls/135348
2025-03-03 17:11:45 +01:00
Sergey Sharybin
ad30bdd470 Merge branch 'blender-v4.4-release' 2025-02-28 19:02:39 +01:00
Sahar A. Kashi
99a487a07c Fix: Cycles HIP-RT curve motion blur and motion pass
Various fixes in the HIP-RT BVH building related on making sure
curves motion blur is supported and is working correctly, as well
as properly handle motion pass configuration when path tracing is
to ignore motion blur (and instead write vector pass).

This PR contains #134797 with fixes needed to fully finish it:
moving commits from that PR here made it easier to ensure all
moving parts are tested without mental overhead.

Fixes #134510

Co-authored-by: Sahar A. Kashi  <sahar.alipourkashi@amd.com>
Co-authored-by: Sergey Sharybin <sergey@blender.org>
Co-authored-by: Brecht Van Lommel <brecht@blender.org>

Pull Request: https://projects.blender.org/blender/blender/pulls/135125
2025-02-28 19:02:03 +01:00
Sean Stirling
5372346978 Cycles: oneAPI: Use linear USM memory for 1D images
Rewrite the ONEAPI Blender texture allocation code to make use of
1D images backed by linear USM memory. This increases parity
with the CUDA implementation and sets the ground work for enabling
host USM allocations in Blender. By enabling this functionality,
previously failing benchmarks are now passing.

Together with the previous commit, no functional changes are expected.
2025-02-28 17:52:41 +01:00
Nikita Sirgienko
dcbc7c1623 Cycles: oneAPI: Remove some texture code from the squished bindless texture commit
This code will be reintroduced back shortly, but under proper credentials.

No functional changes are expected along with the next commit.
2025-02-28 17:51:35 +01:00
Brecht Van Lommel
5caeac4474 Fix: Cycles performance regression after refactor for light objects
Thanks to Xavier Hallade for finding the cause.

Caused by #134846, e813e46327.

Pull Request: https://projects.blender.org/blender/blender/pulls/135309
2025-02-28 17:40:03 +01:00
Brecht Van Lommel
3cf21ceaf4 Merge branch 'blender-v4.4-release' 2025-02-28 13:28:20 +01:00
Brecht Van Lommel
8f00d8b0c8 Fix: Cycles hardware RT is only supported if all multi devices have it
Pull Request: https://projects.blender.org/blender/blender/pulls/135179
2025-02-28 13:21:33 +01:00
Brecht Van Lommel
8ce58f2973 Cycles: Disable HIP-RT and MNEE on RDNA1 generation GPUs
These have bugs in with the latest HIP-RT and HIP SDK, so just disable them
as we do not expect a fix in time, and rolling back would re-introduce other
bugs. As RDNA1 does not have hardware raytracing, it is also less important
to use HIP-RT.

Note that only RDNA2+ is officially supported by HIP, so these GPUs working
at all is somewhat lucky.

Fix #134979
Fix #134978
Fix #134975

Pull Request: https://projects.blender.org/blender/blender/pulls/135179
2025-02-28 13:21:14 +01:00
Campbell Barton
86c0190924 GHOST/Wayland: set the size of custom cursors based on the DPI
HI-DPI screens now select larger custom cursors on Wayland,
previously small cursors would be scaled up.

This only works well when all outputs have the same scaling
as custom-cursors don't support sending multiple sized cursors
to GHOST at once, see code-comments for details.
2025-02-28 22:36:03 +11:00
Sergey Sharybin
d853d994ae Merge branch 'blender-v4.4-release' 2025-02-28 10:42:54 +01:00
Sergey Sharybin
b9c727e255 Fix #135200: Linking linking artifacts with light trees
Ensure that the light tree flattening code does not try to
share node in cases when we need to modify its bit trail.

Pull Request: https://projects.blender.org/blender/blender/pulls/135249
2025-02-28 10:42:29 +01:00
Aras Pranckevicius
02869cc6c8 Color management: Dithering consistency/perf improvements
Float->byte rendered image dithering uses triangle noise algorithm. Keep
the algorithm the same, just make some improvements and fix some issues:

1) The hash function for noise was using "trig" hash from "On generating
random numbers" (Rey 1998), but that is not a great quality hash, plus it
can produce very different results between CPUs/GPUs. Replace it with
"iqint3" (recommended by "Hash Functions for GPU Rendering", JCGT 2020),
which is same performance on GPU, faster on CPU, and much better quality.
This is the same hash as Cycles already uses elsewhere. Also it is purely
integer based, so exactly the same results on all platforms.

2) For the above point, replace `dither_random_value` to take integer
pixel coordinates and adjust calling code accordingly. Some previous
callers were (accidentally?) passing integer coordinates already. Other
places actually get a tiny bit simpler, since they now no longer need an
extra multiplication.

3) The CPU dithering path was wrongly introducing bias, i.e. making the
image lighter. The CPU path also needs dither noise to be in [-1..+1]
range (not [-0.5..+1.5]!) just like GPU path does, since the later
float->byte conversion already does rounding.

4) The CPU dithering path was using thread-slice-local Y coordinate,
meaning the dithering pattern was repeating vertically. The more CPU cores
you use, the worse the repetition.

5) Change the way that uniform noise is converted to triangle noise.
Previous implementation was based on one shadertoy from 2015, change it
to another shadertoy from 2020. The new one fixes issues with the old way,
and it just works on the CPU too, so now both CPU and GPU code paths are
exactly the same.

6) Cleanup: remove DitherContext, just a single float is enough

Performance and image comparisons in the PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/135224
2025-02-27 15:52:45 +01:00
Alaska
88f848bb7a Merge branch 'blender-v4.4-release' 2025-02-27 15:11:03 +13:00
Alaska
b42b5d85ff Cycles: Increase minimum supported HIP GPU driver
After the recent HIP SDK 6.3 update on Windows, the minimum GPU driver
required to use HIP in Cycles has increased.

This commit increases the required driver version listed in the UI and
adds a check to avoid showing HIP devices if they're below a certain
driver version number as they don't work properly.

Pull Request: https://projects.blender.org/blender/blender/pulls/134965
2025-02-27 03:09:37 +01:00
Alaska
fb7b53143e Merge branch 'blender-v4.4-release' 2025-02-27 12:03:30 +13:00
Alaska
d840d249b3 Cycles: Re-enable HIPRT point cloud rendering
Previously point cloud rendering was disabled on the HIPRT backend due
to unexpected performance regressions introduce by it.

With the recent update to HIP SDK 6.3 and HIPRT 2.5, these performance
regressions have been resolved and so this commit re-enables
point cloud rendering on HIPRT.

Pull Request: https://projects.blender.org/blender/blender/pulls/134902
2025-02-27 00:01:35 +01:00
Xavier Hallade
a5d8bd2e29 Cycles: Drop inline hint on light_tree_pdf
Dropping the inlining hint for `light_tree_pdf` and reverting to the
default inlining thresholds for DPC++ compiler gives a ~4% speedup on
classroom and other scenes on Arc B580.

Pull Request: https://projects.blender.org/blender/blender/pulls/135042
2025-02-26 20:14:05 +01:00
Weizhen Huang
7b3f7ccae0 Merge branch 'blender-v4.4-release' 2025-02-26 15:51:55 +01:00
Lukas Stockner
9254532b8b Fix #129306: Cycles: Principled coat doesn't pass furnace test
This implements three improvements to the energy preservation and albedo
scaling logic, which help the Principled BSDF pass the white-furnace test
when using the coat layers at high roughness.

Specifically, at roughness 0.3, the albedo scaling brings it from 60% at
the edge to 95%, and with the energy preservation it's 99.8%.

Pull Request: https://projects.blender.org/blender/blender/pulls/134620
2025-02-26 15:47:21 +01:00
Weizhen Huang
9ee9a2b789 Fix #135145: object visible to volume scatter when ray visibility is off
The comment was added in e0857ad152, and volume scatter visibility is
supported since cdd1d5a93c.

Pull Request: https://projects.blender.org/blender/blender/pulls/135168
2025-02-26 15:45:58 +01:00