Previously, we used precomputed Gaussian fits to the XYZ CMFs, performed
the spectral integration in that space, and then converted the result
to the RGB working space.
That worked because we're only supporting dielectric base layers for
the thin film code, so the inputs to the spectral integration
(reflectivity and phase) are both constant w.r.t. wavelength.
However, this will no longer work for conductive base layers.
We could handle reflectivity by converting to XYZ, but that won't work
for phase since its effect on the output is nonlinear.
Therefore, it's time to do this properly by performing the spectral
integration directly in the RGB primaries. To do this, we need to:
- Compute the RGB CMFs from the XYZ CMFs and XYZ-to-RGB matrix
- Resample the RGB CMFs to be parametrized by frequency instead of wavelength
- Compute the FFT of the CMFs
- Store it as a LUT to be used by the kernel code
However, there are two optimizations we can make:
- Both the resampling and the FFT are linear operations, as is the
  XYZ-to-RGB conversion. Therefore, we can resample and Fourier-transform
  the XYZ CMFs once, store the result in a precomputed table, and then just
  multiply the entries by the XYZ-to-RGB matrix at runtime.
  - I've included the Python script used to compute the table under
    `intern/cycles/doc/precompute`.
- The reference implementation by the paper authors [1] simply stores the
  real and imaginary parts in the LUT, and then computes
  `cos(shift)*real + sin(shift)*imag`. However, the real and imaginary parts
  are oscillating, so the LUT with linear interpolation is not particularly
  good at representing them. Instead, we can convert the table to
  Magnitude/Phase representation, which is much smoother, and do
  `mag * cos(phase - shift)` in the kernel.
  - Phase needs to be unwrapped to handle the interpolation decently,
    but that's easy.
  - This requires an extra trig operation in the kernel in the dielectric case,
    but for the conductive case we'll actually save three.
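To illustrate the Magnitude/Phase idea, here is a minimal C++ sketch. It is not the Cycles kernel code: `MagPhase`, `to_mag_phase`, `eval_real_imag` and `lookup` are hypothetical names, and the table layout (a 1D table indexed by a normalized coordinate in [0, 1]) is an assumption made for the example. It relies only on the identity `mag * cos(phase - shift) = cos(shift)*real + sin(shift)*imag` and on unwrapping the phase so that linear interpolation never crosses a 2*pi jump.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical LUT entry; names are illustrative, not taken from Cycles.
struct MagPhase {
  float mag;
  float phase;
};

// Convert complex samples to magnitude and unwrapped phase, so that
// neighbouring entries never differ by more than pi.
std::vector<MagPhase> to_mag_phase(const std::vector<std::complex<float>> &samples)
{
  const float two_pi = 6.283185307179586f;
  std::vector<MagPhase> table;
  table.reserve(samples.size());
  float prev = 0.0f;
  for (std::size_t i = 0; i < samples.size(); i++) {
    float phase = std::arg(samples[i]);
    if (i > 0) {
      // Shift by whole turns until we are within pi of the previous entry.
      phase -= two_pi * std::round((phase - prev) / two_pi);
    }
    table.push_back({std::abs(samples[i]), phase});
    prev = phase;
  }
  return table;
}

// Real/imaginary evaluation, as described for the reference implementation.
float eval_real_imag(std::complex<float> c, float shift)
{
  return std::cos(shift) * c.real() + std::sin(shift) * c.imag();
}

// Magnitude/phase evaluation with linear interpolation, x in [0, 1].
float lookup(const std::vector<MagPhase> &table, float x, float shift)
{
  assert(table.size() >= 2 && x >= 0.0f && x <= 1.0f);
  const float pos = x * static_cast<float>(table.size() - 1);
  const std::size_t i = std::min<std::size_t>(static_cast<std::size_t>(pos),
                                              table.size() - 2);
  const float f = pos - static_cast<float>(i);
  const float mag = (1.0f - f) * table[i].mag + f * table[i + 1].mag;
  const float phase = (1.0f - f) * table[i].phase + f * table[i + 1].phase;
  return mag * std::cos(phase - shift);
}
```

At the table entries themselves both forms agree up to rounding. Between entries, the magnitude/phase form interpolates two slowly varying quantities instead of two oscillating ones.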
Rendered output is mostly the same, just slightly different because we're
no longer using the Gaussian approximation.
[1] "A Practical Extension to Microfacet Theory for the Modeling of
Varying Iridescence" by Laurent Belcour and Pascal Barla,
https://belcour.github.io/blog/research/publication/2017/05/01/brdf-thin-film.html
Pull Request: https://projects.blender.org/blender/blender/pulls/140944
Supporting this on the Metallic BSDF will require some extra work,
and on the Glossy BSDF it doesn't make much sense conceptually
(for that kind of shader setup, we'll want to support layering in SVM),
but Glass BSDF just needs to be hooked up so might as well do that.
Pull Request: https://projects.blender.org/blender/blender/pulls/140832
Detect which volume attribute nodes have a linear mapping to their usage
as density / color / temperature in volume shader nodes, and use stochastic
sampling for them.
Pull Request: https://projects.blender.org/blender/blender/pulls/132908
Stochastically turn a tricubic filter into a trilinear one. This
reduces the number of taps from 64 to 8. It combines ideas from
the "Stochastic Texture Filtering" paper and our previous GPU
sampling of 3D textures.
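The tap reduction can be sketched in 1D. This is a simplified illustration, not the Cycles implementation: it assumes cubic B-spline weights, a clamp-to-edge fetch, and the hypothetical names `cubic_taps`, `full_cubic` and `stochastic_cubic`. The four cubic taps are grouped into two linear lookups, and a random number `u` in [0, 1) picks one of them with probability equal to its total weight. Applying the same choice independently per axis in 3D selects one of the 2x2x2 trilinear lookups, which is where 64 taps become 8.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Clamp-to-edge fetch from a 1D array of samples at integer positions.
float fetch(const std::vector<float> &data, int i)
{
  assert(!data.empty());
  const int n = static_cast<int>(data.size());
  return data[i < 0 ? 0 : (i >= n ? n - 1 : i)];
}

// Linear interpolation between samples i and i + 1.
float lerp_fetch(const std::vector<float> &data, int i, float f)
{
  return (1.0f - f) * fetch(data, i) + f * fetch(data, i + 1);
}

struct CubicTaps {
  int base;      // index of the second of the four cubic taps
  float g0, g1;  // total weight of the left and right tap pair
  float f0, f1;  // interpolation fraction inside each pair
};

// Cubic B-spline weights, grouped into two linear lookups.
CubicTaps cubic_taps(float x)
{
  const float fl = std::floor(x);
  const float t = x - fl;
  const float t2 = t * t, t3 = t2 * t, s = 1.0f - t;
  const float w0 = s * s * s / 6.0f;
  const float w1 = (3.0f * t3 - 6.0f * t2 + 4.0f) / 6.0f;
  const float w2 = (-3.0f * t3 + 3.0f * t2 + 3.0f * t + 1.0f) / 6.0f;
  const float w3 = t3 / 6.0f;
  const float g0 = w0 + w1, g1 = w2 + w3;  // both are always > 0
  return {static_cast<int>(fl), g0, g1, w1 / g0, w3 / g1};
}

// Deterministic cubic filter: both linear lookups, weighted.
float full_cubic(const std::vector<float> &data, float x)
{
  const CubicTaps c = cubic_taps(x);
  return c.g0 * lerp_fetch(data, c.base - 1, c.f0) +
         c.g1 * lerp_fetch(data, c.base + 1, c.f1);
}

// Stochastic version: one linear lookup, chosen with probability g0 or g1.
// The estimator weight is g / probability = 1, so the result is unbiased.
float stochastic_cubic(const std::vector<float> &data, float x, float u)
{
  const CubicTaps c = cubic_taps(x);
  return (u < c.g0) ? lerp_fetch(data, c.base - 1, c.f0) :
                      lerp_fetch(data, c.base + 1, c.f1);
}
```

Averaged over `u`, `stochastic_cubic` matches `full_cubic`, but only when the filtered value is used linearly afterwards. That matches the restriction above to usages where stochastic interpolation is known to be valid.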
This is currently only used in a few places where we know stochastic
interpolation is valid or close enough in practice.
* Principled volume density, color and temperature
* Motion blur velocity
On a MacBook Pro M3 with the openvdb_smoke.blend regression test
and cubic sampling, this gives a ~2x speedup for CPU and ~4x speedup
for GPU. However, it also increases noise per sample, usually only a
little. Equal-time renders for this scene show a clear reduction in
noise for both CPU and GPU.
Note we can probably get a bigger speedup with acceptable noise trade-off
using full stochastic sampling, but will investigate that separately.
Pull Request: https://projects.blender.org/blender/blender/pulls/132908
All GPU backends now support NanoVDB, using our own kernel side code
that is easily portable. This simplifies kernel and device code.
Volume bounds are now built from the NanoVDB grid instead of OpenVDB,
to avoid having to keep around the OpenVDB grid after loading.
While this reduces memory usage, it does have a performance impact,
particularly for the Cubic filter. That will be addressed by
another commit.
Pull Request: https://projects.blender.org/blender/blender/pulls/132908
The numeric levels have no obvious meaning. This removes the distinction
between severity and levels; instead, there is a single list of named levels
with defined meaning.
Debug means information that's mainly useful for developers, and trace is for
very verbose code execution tracing.
Pull Request: https://projects.blender.org/blender/blender/pulls/140244
* Add render category, which is automatically enabled when using -f or -a
command line flags for background rendering.
* Add extra logs to mention scene, view layer and frame ahead of time rather
than including it in every line.
* Remaining time was removed from Cycles. It will be added back for animations
at the render pipeline level.
Pull Request: https://projects.blender.org/blender/blender/pulls/140244
* Change order and formatting of messages
* Change WARN to WARNING, don't print INFO
* Change filter matching so that "foo" can be used instead of "foo.*"
* Write timestamp as hh::mm::ss.rrr
* Add memory usage writing
* Add macro to print certain INFO logs without checking level
* Indent multi-line log messages with first line
* Add mutex to avoid garbling multi-line logs
* Enable logging by either setting level or filter
Pull Request: https://projects.blender.org/blender/blender/pulls/140244
* Add own simple logging system to replace glog, which is no longer
maintained by Google.
* When building in Blender, integrate with CLOG and print all messages
through that system instead.
* --log cycles now replaces --debug-cycles. The latter still works but
is no longer documented.
Pull Request: https://projects.blender.org/blender/blender/pulls/140244
This PR fixes running on Wayland compositors that don't support HDR/
color management. It also allows the Blender window to be drawn across
monitor boundaries, with the image transferred and clamped to the monitor
it is being displayed on.
From our point of view, monitor configuration is a compositor/OS
responsibility. This PR tells the compositor that the provided
swapchain image will be using the sRGB white point and transfer
function. The compositor should then take care of performing the
final transfer to the monitor color volume.
The color management protocol doesn't guarantee that every
compositor does this. It is only stated as a recommendation
('should do this').
Pull Request: https://projects.blender.org/blender/blender/pulls/141598
This PR is a more extensive follow on from #123551 (removal of AMD and Intel GPU support).
All supported Apple GPUs have Metal 3 and tier 2 argument buffer support. The invariant resource properties `gpuAddress` and `gpuResourceID` can be written directly into GPU structs once at setup time rather than once per dispatch. More background info can be found in [this article](https://developer.apple.com/documentation/metal/improving-cpu-performance-by-using-argument-buffers?language=objc).
Code changes:
- All code relating to `MTLArgumentEncoder` is removed
- `KernelParamsMetal` updates are directly written into `id<MTLBuffer> launch_params_buffer` which is used for the "static" dispatch arguments
- Dynamic dispatch arguments are small enough to be encoded using the `MTLComputeCommandEncoder.setBytes` function, eliminating the need for cycling temporary arg buffers
Pull Request: https://projects.blender.org/blender/blender/pulls/140671
After spending way too much time looking into the image handle code
because I assumed the issue is some interaction between OSL code using
a texture and the image system that's in SVM mode (which never happened
before the custom camera) with printf debugging because OptiX doesn't
work in debug builds, it turns out the issue was something else entirely:
C++ iterator bullshit.
Specifically, when we remove an entry from services->textures, this
invalidates the iterator, so the code restarts from the beginning.
However, the for-loop still increments the iterator *before* checking
the termination criterion.
If we remove the only element of the map, we:
- Set it = map.begin(), which equals map.end() since it's empty now
- Increment it at the end of the loop iteration
- Compare it == map.end(), which is wrong now since we're past the end
No idea how this didn't blow up sooner, none of this seems camera-specific??
Anyways, the fix is simple - only increment if we didn't restart.
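The pattern can be sketched as follows. This is a simplified stand-in, not the actual Cycles code: `remove_unused` and the `std::map<std::string, bool>` container (where `true` marks an entry to remove) are hypothetical. The iterator is only incremented when the loop did not restart, so the termination check always sees a valid iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Remove every entry flagged as unused, restarting from the beginning after
// each removal. Returns the number of removed entries.
std::size_t remove_unused(std::map<std::string, bool> &textures)
{
  const std::size_t before = textures.size();
  std::size_t removed = 0;
  // Note: no increment in the for-statement itself.
  for (auto it = textures.begin(); it != textures.end();) {
    if (it->second) {
      textures.erase(it);
      // Restart. If the map is now empty, begin() == end() and the loop
      // condition terminates the loop without incrementing past the end.
      it = textures.begin();
      removed++;
    }
    else {
      ++it;  // Only advance if we did not restart.
    }
  }
  assert(textures.size() + removed == before);
  return removed;
}
```

With the increment left in the for-statement, the single-element case would call `++it` on `end()`, which is undefined behavior.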
Pull Request: https://projects.blender.org/blender/blender/pulls/141580
When launching Blender via blender-launcher.exe, the window briefly
displays incorrectly on startup when using the Vulkan backend. This is
caused by not properly handling the GHOST_kWindowStateFullScreen case.
Previously, even if the window state was set to fullscreen, nCmdShow
would default to SW_SHOWNORMAL or SW_SHOWNOACTIVATE. With this fix,
nCmdShow is explicitly set to SW_SHOWMAXIMIZED when the window is in
fullscreen state, preventing the flicker.
Pull Request: https://projects.blender.org/blender/blender/pulls/141518
When running Blender with `--debug-cycles` and the right
verbosity level, Cycles can output "GPU Queue Stats" to the terminal
at the end of rendering detailing how much time was spent in each kernel.
The Metal GPU backend did not support this specific way of gathering the
information. This commit fixes that by adding support for it in the Metal
GPU backend.
Note: This kind of information could already be gathered for the Metal
GPU backend using the `CYCLES_METAL_PROFILING` environment variable,
and this is still the recommended way of gathering that information for
Metal. This change is just to add some consistency between platforms.
Pull Request: https://projects.blender.org/blender/blender/pulls/137592
Detected an incorrect structure type. A property struct was used to
store feature data. This could lead to incorrect values for enabling
descriptorBufferPushDescriptor, which isn't used.
We are now able to make antialiased mouse cursors at any size directly
from SVG sources. Therefore there is no need for the platform-specific
"cur" versions of these cursors. This removes the work required in
duplicating the cursors in this format. Otherwise the results should be
identical.
Pull Request: https://projects.blender.org/blender/blender/pulls/141309
Adding a Flow to a Mantaflow domain could sometimes cause a crash. This
was because the grids are allocated lazily, and as such, getting them
through `PyObject_GetAttrString` will fail when they are not yet
available. As the resulting Python errors were not cleared, Python could
be left in a bad state, leading to crashes. This is avoided by clearing
these errors before returning from `callPythonFunction` when such an
error is raised.
Ref !141364