Deforming motion blurred point clouds do not render in Cycles
HIP-RT when BVH timesteps != 0 if Blender is launched with
debug memory.
The root cause is that the size of the memory allocated for the
bounding boxes is reported to HIP-RT instead of the number of valid
bounding boxes.
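A minimal sketch of the distinction, assuming a hypothetical `BoundsBuffer` type (neither `Aabb` nor `BoundsBuffer` is the actual Cycles or HIP-RT data structure): the count handed to the BVH builder should be the number of boxes that were written, not the capacity of the allocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned bounding box; not the Cycles or HIP-RT type.
struct Aabb {
  float min[3];
  float max[3];
};

// Hypothetical buffer that over-allocates storage and tracks how many
// entries actually hold valid data.
struct BoundsBuffer {
  std::vector<Aabb> storage;  // allocated capacity, may exceed the valid count
  std::size_t num_valid = 0;  // number of boxes that were written

  explicit BoundsBuffer(std::size_t capacity) : storage(capacity) {}

  // The caller is assumed to never push more boxes than the capacity.
  void push(const Aabb &box) { storage[num_valid++] = box; }

  // Wrong: also counts trailing entries that were never written.
  std::size_t count_from_allocation() const { return storage.size(); }

  // Right: only the valid boxes are reported to the builder.
  std::size_t count_for_builder() const { return num_valid; }
};
```

With a debug allocator that fills unused memory with a pattern, the trailing entries hold garbage, which would explain why the problem only shows up in that configuration.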
Pull Request: https://projects.blender.org/blender/blender/pulls/127432
VDB files would fail to render in HIP-RT because NanoVDB wasn't
enabled when compiling HIP-RT kernels, resulting in NanoVDB textures
not being sampled and a blank result being returned instead.
The fix is to enable NanoVDB when compiling HIP-RT kernels.
Ref: #125086
Pull Request: https://projects.blender.org/blender/blender/pulls/127384
Fix the unnecessary recreation of the denoiser that occurs if
Cycles had fallen back to an alternative denoiser in a previous
iteration (e.g. a fallback from OptiX to OIDN).
This issue occurred because Cycles didn't track that, when it
previously set up the denoising device, it had fallen back to
something else. So it thought the denoising settings had changed
and tried to recreate the denoiser.
The solution is to first compute the settings change due to
the fallback, then check whether the result differs from the current
denoiser, and only then recreate the denoiser device if necessary.
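The order of operations can be sketched as follows. All names here (`DenoiserType`, `DenoiseSettings`, `resolve_fallback`, `needs_recreate`) are hypothetical and only illustrate the idea; they are not the Cycles API.

```cpp
// Hypothetical denoiser kinds and settings; not the Cycles types.
enum class DenoiserType { OptiX, OIDN };

struct DenoiseSettings {
  DenoiserType type;
  bool use_gpu;

  bool operator==(const DenoiseSettings &other) const
  {
    return type == other.type && use_gpu == other.use_gpu;
  }
};

// Resolve what would actually be created for the requested settings,
// applying the same fallback rule that is used at creation time.
DenoiseSettings resolve_fallback(DenoiseSettings requested, bool optix_available)
{
  if (requested.type == DenoiserType::OptiX && !optix_available) {
    requested.type = DenoiserType::OIDN;
  }
  return requested;
}

// Recreate only if the resolved settings differ from the running denoiser.
bool needs_recreate(const DenoiseSettings &current,
                    const DenoiseSettings &requested,
                    bool optix_available)
{
  return !(resolve_fallback(requested, optix_available) == current);
}
```

Comparing the raw request against the running denoiser would report a change on every iteration after a fallback, which is the behavior described above.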
Pull Request: https://projects.blender.org/blender/blender/pulls/125453
The same random number was used for sampling the color channel at each step,
which leads to bias. Fixed by rescaling the random number.
Another possibility would be to scramble `rng_offset` and use a new
random number each time, similar to subsurface scattering, but
rescaling the random number should be faster than computing a new one, and
is preferable here since precision is not very important.
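A minimal sketch of the rescaling idea, assuming equally likely channels for simplicity (the actual kernel may weight channels differently). `ChannelSample` and `sample_channel_and_rescale` are hypothetical names.

```cpp
#include <algorithm>

struct ChannelSample {
  int channel;     // selected channel index in [0, num_channels)
  float rescaled;  // leftover randomness, remapped to roughly [0, 1)
};

// Pick one of `num_channels` equally likely channels from u in [0, 1), then
// stretch the sub-interval that u fell into back to the unit interval so the
// value can drive the next decision without repeating the same choice.
ChannelSample sample_channel_and_rescale(float u, int num_channels)
{
  const float scaled = u * float(num_channels);
  // Clamp guards against rounding pushing the index to num_channels.
  const int channel = std::min(int(scaled), num_channels - 1);
  return {channel, scaled - float(channel)};
}
```

The rescaled value loses a few bits of precision with every reuse, which is the trade-off mentioned above.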
Pull Request: https://projects.blender.org/blender/blender/pulls/127454
IGC 1.0.17384, ocloc 24.31.30508, which:
- add support for Battlemage and Lunar Lake GPUs
- recover from a recent performance regression on Linux
- allow dropping an older work-around
(9d5164d472) and the need for a patched
version on Windows
- ocloc now needs "dg2,mtl" naming for fat binaries.
opencl-clang patches don't get applied anymore by the igc build scripts
when llvm is not a git repository, hence we can also drop the
current patch that disabled patching.
I've only slightly pushed the min-driver-version updates after careful
testing, instead of jumping to the same version as ocloc as we used to.
Pull Request: https://projects.blender.org/blender/blender/pulls/127251
This new version of the graphics compiler solves a performance
regression on Arc, adds support for Battlemage and Lunar Lake GPUs, and
allows dropping an older patch for building fat binaries with broad
compatibility.
This latter change requires using `-device dg2,mtl` naming instead of
passing architecture ids.
Pull Request: https://projects.blender.org/blender/blender/pulls/127371
Changes:
- Renamed `CocoaWindowDelegate` to `BlenderWindowDelegate` and
`CocoaWindow` to `BlenderWindow` for clarity, removing the confusion
between `CocoaWindow` and `WindowCocoa`
- Use idiomatic Objective-C properties instead of raw instance
variables, synthesized with the `m_` prefix instead of the default
Objective-C `_` to be closer to the current GHOST style.
- Use idiomatic Objective-C initWith constructors instead of setter
functions
- Function and initializer call and name adjustments
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126767
cpuProcessorApply_predivide was doing, for each pixel:
- Un-premultiply pixel to straight alpha
- Call OCIO processor on that one pixel
- Premultiply pixel back
This is slow due to the function call overhead alone, and probably
prevents whatever "batch processing SIMD optimizations" OCIO
might have.
Instead, do this:
- Un-premultiply whole input image,
- Call OCIO on the whole image to do whatever it does,
- Premultiply whole image back.
Doing cpuProcessorApply_predivide on a 4K resolution, float4 image
on Ryzen 5950X (Win10/VS2022) on one thread: 128ms -> 69ms
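The three passes can be sketched like this. `ImageProcessor` is a hypothetical stand-in for a processor that handles a whole RGBA float buffer in one call; it is not the OCIO API, and the alpha handling (skipping alpha 0 and 1) is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical whole-buffer processor; not the OCIO API.
using ImageProcessor = std::function<void(float *rgba, std::size_t num_pixels)>;

// Scale RGB of every pixel by 1/alpha (to_straight) or by alpha (otherwise).
// Pixels with alpha 0 or 1 are left untouched.
static void scale_by_alpha(float *rgba, std::size_t num_pixels, bool to_straight)
{
  for (std::size_t i = 0; i < num_pixels; i++) {
    float *px = rgba + i * 4;
    const float alpha = px[3];
    if (alpha == 0.0f || alpha == 1.0f) {
      continue;
    }
    const float factor = to_straight ? 1.0f / alpha : alpha;
    px[0] *= factor;
    px[1] *= factor;
    px[2] *= factor;
  }
}

void apply_predivide_whole_image(std::vector<float> &rgba, const ImageProcessor &process)
{
  const std::size_t num_pixels = rgba.size() / 4;
  scale_by_alpha(rgba.data(), num_pixels, true);   // 1. un-premultiply everything
  process(rgba.data(), num_pixels);                // 2. one call for the whole image
  scale_by_alpha(rgba.data(), num_pixels, false);  // 3. premultiply everything back
}
```

The processor is invoked once per image rather than once per pixel, so it is free to vectorize or batch internally.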
Pull Request: https://projects.blender.org/blender/blender/pulls/127307
Add multiple "hand" mouse cursors. These are mostly needed for Mac,
which needs open, close, and pointing hand cursors. This also adds
similar cursors for Windows, but just for completeness and testing.
Pull Request: https://projects.blender.org/blender/blender/pulls/127164
The device code was disabled for primitives with deformation blur
and the intersection function always returned false, hence no
rendered primitive.
Other than that, there were a few bugs in both the device and host code
(e.g. the order of current and previous times, and the primitive name).
Pull Request: https://projects.blender.org/blender/blender/pulls/127163
Changes:
- Use macros to reduce Cocoa event forwarding boilerplate
- Use @autoreleasepool where necessary
- Use idiomatic Objective-C properties for `COCOA_VIEW_CLASS`
Style Changes:
- Use C-Style Comments
- Use braces for conditional statements
- Use Objective-C dot-notation
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126769
Generate all pointer (mouse) events from the "frame" callback as this
is the intended behavior according to the wl_pointer_listener docs.
In practice it's unlikely users would notice any difference, however
there are potentially subtle differences because events were previously
created before all the pointer data had been collected.
This also has some minor advantages, as each frame event no longer
needs to detect if scrolling is needed or, in the case of motion events,
calculate the time-stamp twice.
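A minimal sketch of the pattern, without any Wayland dependency: the per-event callbacks only record data, and events are generated once in the frame handler. `PointerFrameCollector` and its string-based event list are hypothetical and not GHOST's actual structures.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical collector for one pointer; not GHOST's actual state.
class PointerFrameCollector {
 public:
  // Called from the motion callback: only record the data.
  void on_motion(std::uint32_t time_ms, double x, double y)
  {
    has_motion_ = true;
    x_ = x;
    y_ = y;
    time_ms_ = time_ms;
  }

  // Called from the axis callback: accumulate scrolling.
  void on_axis(std::uint32_t time_ms, double value)
  {
    scroll_ += value;
    time_ms_ = time_ms;
  }

  // Called from the frame callback: all data for this frame is now known,
  // so events are generated here and the pending state is reset.
  std::vector<std::string> on_frame()
  {
    std::vector<std::string> events;
    if (has_motion_) {
      events.push_back("motion");
    }
    if (scroll_ != 0.0) {
      events.push_back("scroll");
    }
    has_motion_ = false;
    scroll_ = 0.0;
    return events;
  }

 private:
  bool has_motion_ = false;
  double x_ = 0.0, y_ = 0.0;
  double scroll_ = 0.0;
  std::uint32_t time_ms_ = 0;
};
```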
Previously, the `AttributeIDRef` wrapper was needed because it also had to
contain a pointer to an `AnonymousAttributeID`. However, since
b279a6d703 this is not necessary anymore.
Therefore we can use "raw" `StringRef` now which reduces the mental overhead
when working with attributes and also simplifies code.
Pull Request: https://projects.blender.org/blender/blender/pulls/127140
Fixes a few issues with point clouds with HIPRT.
1. Crashing when building the BLAS due to an incorrectly sized array.
2. A typo leading to all point cloud intersections being skipped.
3. A typo leading to some motion blurred point clouds rendering
as if they were stationary, or not rendering at all.
Point clouds with deformable motion blur and BVH time steps set to >0
still do not render. Curves seem to have the same issue.
Ref #125086
Pull Request: https://projects.blender.org/blender/blender/pulls/125834
`atan2(0, 0)` is undefined on many platforms. To ensure a consistent
result across platforms, we return `0` in this case.
Note that only the behavior of the shader node `Arctan2` is changed here.
During shading, we might still produce `atan2(0, 0)` internally and
cause different results across platforms, but that usually happens with
single samples and is not obvious, plus checking this condition all the
time is costly. If we later find out that it is indeed necessary to change all
invocations of `atan2(0, 0)`, we could change the wrapper functions
in `metal/compat.h` and `mtl_shader_defines.msl`.
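A guarded wrapper of this kind could look like the sketch below. `safe_atan2` is a hypothetical name and not necessarily the function used in the node implementation.

```cpp
#include <cmath>

// Sketch of a guarded arctangent: returns 0 for the (0, 0) input instead of
// relying on platform-specific behavior. The comparison also matches -0.0f.
inline float safe_atan2(float y, float x)
{
  return (x == 0.0f && y == 0.0f) ? 0.0f : std::atan2(y, x);
}
```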
Pull Request: https://projects.blender.org/blender/blender/pulls/126951
Autoreleasepool:
- Replace the outdated `NSAutoreleasePool` `init`/`drain` mechanism with
the modern `@autoreleasepool {}` block, leading to simpler and
cleaner code and more flexible placement of function returns.
- Add missing `autoreleasepool` in code.
The rule being that in an MRR (Manual Retain-Release / non-automatic
reference counting) environment, "Cocoa expects there to be an
autorelease pool always available. If a pool is not available,
autoreleased objects do not get released and you leak memory"
(quote from Apple Dev Docs).
As we cannot make safe assumptions about function call sites, and
cannot rely on a main autorelease pool like a standard Obj-C
application, every piece of Objective-C code that calls any sort of
Cocoa function should be wrapped in an `@autoreleasepool {}` block for
any internal `autorelease` calls to be honored.
- Add missing `release` / `autorelease`, make correct MRR pairs
A next step would be to start transitioning the Blender Obj-C codebase
from MRR to automatic reference counting (ARC).
Dot-Notation:
- Use Objective-C dot notation to follow modern Objective-C practices,
and provide a more familiar syntax to programmers coming from C/C++,
(`foo.prop` instead of `[foo prop]` for access, `foo.prop = bar`
instead of `[foo setProp:bar]` for setting).
- Exception for singleton class properties / methods
(`[NSPasteboard generalPasteboard]` instead of
`NSPasteboard.generalPasteboard`) and nested method calls that mix
property and methods.
(Example: `[NSApp windowWithWindowNumber:[window_number integerValue]]`
or `[view convertRectToBacking:[view bounds]]`)
Where possible or necessary, the touched functions were simplified or
refactored, in which case the Blender code style was applied. As such
there is some overlap with PR #126770, especially when it comes to const
correctness.
Due to the fact that these two refactors are quite interlinked, and for
easier reviewing / avoiding complicated merge conflicts, they're shipped
in a single PR.
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126771
The kernel used for zeroing memory, introduced when we added the host
memory fallback, didn't expect large inputs, so with these scenes it ran into the
"Provided range is out of integer limits. Pass
`-fno-sycl-id-queries-fit-in-int' to disable range check" error.
This kernel was used instead of memset to avoid some issues with the
free_memory queries not always being updated.
As we can't reproduce these with recent drivers, we now use memset,
which fixes rendering with BVH2.
- Resource pools are shared between multiple swap chains to reduce
code complexity
- Fix issue where activating a new graphical context could still leave
the previous context rendering.
- Known issue: opening files with multiple windows requires a redraw.
Reference: #126499
Pull Request: https://projects.blender.org/blender/blender/pulls/126961
The use of `const` for Objective-C object pointers is not standard and
generally unsound. Unlike a C++ class, which supports const and
non-const methods, an Objective-C object will still respond to mutating
selectors even if its object pointer is const, making the qualifier
semantically useless.
Another problem with const Objective-C objects is that they cannot be
properly passed into other Objective-C object selectors due to type
differences, even if the selector doesn't modify the underlying object.
For consistency with general Objective-C code style guidelines, usages of
the const pointer syntax (`Class *const`) were also removed.
Ref #126772
Pull Request: https://projects.blender.org/blender/blender/pulls/126768
Use a bounding sphere instead of the corners of a bounding box to
compute the subtended angle of a light tree node.
Using the corners of the bounding box was an underestimate in some
scenes, causing some light tree nodes to be incorrectly skipped.
Using the subtended angle of a bounding sphere is an overestimate, but
it covers the entire node and would not skip any valid contribution,
and no other reliable algorithm to compute the minimal enclosing angle
is known to us.
We expect some increase in noise due to the overestimation, but this has
not been observed yet: in our benchmark scenes only a difference in
the noise is visible.
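For illustration, the half-angle subtended by the bounding sphere of a box can be computed as sketched below. This is not the Cycles implementation (which may work with cosines rather than angles); `Vec3`, `vec_distance` and `bounding_sphere_half_angle` are hypothetical names.

```cpp
#include <cmath>

struct Vec3 {
  float x, y, z;
};

static float vec_distance(const Vec3 &a, const Vec3 &b)
{
  const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Half-angle (in radians) of the cone with apex `p` that encloses the
// bounding sphere of the box [bmin, bmax].
float bounding_sphere_half_angle(const Vec3 &p, const Vec3 &bmin, const Vec3 &bmax)
{
  const Vec3 center{0.5f * (bmin.x + bmax.x), 0.5f * (bmin.y + bmax.y), 0.5f * (bmin.z + bmax.z)};
  const float radius = 0.5f * vec_distance(bmin, bmax);  // half of the box diagonal
  const float dist = vec_distance(p, center);
  if (dist <= radius) {
    // The point is inside the sphere: the node covers every direction.
    return 3.14159265f;
  }
  return std::asin(radius / dist);
}
```

Because the sphere contains the whole box, the resulting cone can never miss part of the node, at the cost of being wider than strictly needed.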
Thanks to Weizhen for the suggestion to use the bounding sphere.
Pull Request: https://projects.blender.org/blender/blender/pulls/126625
Add Metallic BSDF Node to the shader editor.
This node can primarily be used to create more realistic looking
metallic materials than the existing Glossy BSDF node.
This commit does not add any new closures to Cycles; it simply exposes
existing closures that were previously hard to access on their own.
- Exposes the F82 fresnel type that is currently used by the
metallic component of the Principled BSDF. Results should match
between the Metallic BSDF and Principled BSDF when using the same
settings.
- Exposes the Physical Conductor fresnel type that was previously
limited to custom OSL scripts. The Conductor fresnel type accepts
IOR and Extinction coefficients to define the appearance of the
material based off real life measurements.
EEVEE only supports the F82 fresnel type, with internal code to convert
the physical conductor inputs into a colour format for F82,
which can lead to noticeable rendering differences with
some configurations.
Pull Request: https://projects.blender.org/blender/blender/pulls/114958
Some of the device memory objects had their host_pointer overwritten
with another CPU-side buffer after allocation. This leads to a leak of
host memory allocated by the device_memory.
There are a few remaining places where the host_pointer is assigned, and
those seem to be fine because the memory was not yet allocated with
an alloc() call.
While the approach in this change is not ideal, it is small and
could potentially be ported to the LTS tracks. A more ideal solution
would be to utilize device_vector::give_data().
Pull Request: https://projects.blender.org/blender/blender/pulls/126788
Since ee1b2f53cc the ffmpeg libraries for Windows x64 are built effectively
without CPU specific SIMD optimizations. `--arch=x64` is not an architecture
that ffmpeg configure understands, so it falls back to the "nothing is known,
turn any architecture specific bits off" code path.
Pull Request: https://projects.blender.org/blender/blender/pulls/126396
When tracing a shadow ray through a volume and no hit is registered, we
consider the whole ray segment to be inside the volume.
However, no hit is also registered when the volume is
invisible to shadow rays. We should explicitly check for this case and skip
rendering the volume segment instead.
Pull Request: https://projects.blender.org/blender/blender/pulls/126139
This PR implements #126353. In short: keep the discard list as part of the swap chain images. This allows
better determination of when resources are actually not in use anymore.
## Resource pool
Resource pools keep track of the resources for a swap chain image.
In Blender this is a bit more complicated due to the way GPUContext works. A single thread can have
multiple contexts. Some of them have a swap chain (GHOST Window), others don't (draw manager). The
resource pool should be shared between the contexts running on the same thread.
When opening multiple windows there are also multiple swap chains to consider.
### Discard pile
Resource handles that are deleted are stored in the discard pile. When we are sure that these
resources are not used on the GPU anymore, they are destroyed.
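A minimal sketch of a per-image discard pile, assuming the caller knows (for example through a fence) when the GPU has finished the previous use of a swap chain image. `SwapChainDiscardPiles` is a hypothetical class and not the Vulkan backend's actual implementation.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch; one discard pile per swap chain image.
class SwapChainDiscardPiles {
 public:
  using Destroy = std::function<void()>;

  explicit SwapChainDiscardPiles(std::size_t num_images) : piles_(num_images) {}

  // Queue a resource for destruction while `image_index` is in use.
  void discard(std::size_t image_index, Destroy destroy)
  {
    piles_[image_index].push_back(std::move(destroy));
  }

  // Call once the GPU is known to have finished the previous use of
  // `image_index` (assumed to be guaranteed by the caller).
  // Returns the number of resources destroyed.
  std::size_t destroy_discarded(std::size_t image_index)
  {
    std::vector<Destroy> pile = std::move(piles_[image_index]);
    piles_[image_index].clear();
    for (Destroy &destroy : pile) {
      destroy();
    }
    return pile.size();
  }

  std::size_t pending(std::size_t image_index) const { return piles_[image_index].size(); }

 private:
  std::vector<std::vector<Destroy>> piles_;
};
```

Tying the pile to the image gives a natural point in time at which everything discarded during that image's previous use can be released.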
### Reusable resources
There are other resources as well like:
- Descriptor sets
- Descriptor pools
## Open issues
There are some limitations that require future PRs to fix including:
- Background rendering
- Handling multiple windows
- Improve CPU/GPU synchronization
- Reuse staging buffers
Pull Request: https://projects.blender.org/blender/blender/pulls/126353