This uses the same sample classification approach as used for PMJ,
because it turns out to also work equally well with Sobol-Burley.
This also implements a fallback (random classification) that should
work "okay" for other samplers, though there are no other samplers
at the moment.
Differential Revision: https://developer.blender.org/D15845
The multi-dimensional Sobol pattern required us to carefully use as low
dimensions as possible, as quality goes down in higher dimensions. Now that we
have two sampling patterns that are at least as good, there is no need to keep
it around and the implementation can be simplified.
Differential Revision: https://developer.blender.org/D15788
Fix two issues in the previous implementation:
* Only power-of-two prefixes were progressively stratified, not suffixes.
This resulted in unnecessarily increased noise when using non-power-of-two
sample counts.
* In order to try to get away with just a single sample pattern, the code
used a combination of sample index shuffling and Cranley-Patterson rotation.
Index shuffling is normally fine, but due to the sample patterns themselves
not being quite right (as described above) this actually resulted in
additional increased noise. Cranley-Patterson, on the other hand, always
increases noise with randomized (t,s) nets like PMJ02, and should be avoided
with these kinds of sequences.
Addressed with the following changes:
* Replace the sample pattern generation code with a much simpler algorithm
recently published in the paper "Stochastic Generation of (t, s) Sample
Sequences". This new implementation is easier to verify, produces fully
progressively stratified PMJ02, and is *far* faster than the previous code,
being O(N) in the number of samples generated.
* It keeps the sample index shuffling, which works correctly now due to the
improved sample patterns. But it now uses a newer high-quality hash instead
of the original Laine-Karras hash.
* The scrambling distance feature cannot (to my knowledge) be implemented with
any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson
is still used when that feature is enabled. But it is now disabled otherwise,
since it increases noise.
* In place of Cranley-Patterson, multiple independent patterns are generated
and randomly chosen for different pixels and dimensions as described in the
original PMJ paper. In this patch, the pattern selection is done via
hash-based shuffling to ensure there are no repeats within a single pixel
until all patterns have been used.
The combination of these fixes brings the quality of Cycles' PMJ sampler in
line with the previously submitted Sobol-Burley sampler in D15679. They are
essentially indistinguishable in terms of quality/noise, which is expected
since they are both randomized (0,2) sequences.
Differential Revision: https://developer.blender.org/D15746
This patch is a response to T92588 and is implemented
as a Function/Shader node.
This node has support for Float, Vector and Color data types.
For Vector it supports uniform and non-uniform mixing.
For Color it now has the option to remove factor clamping.
It replaces the Mix RGB for Shader and Geometry node trees.
As discussed in T96219, this patch converts existing nodes
in .blend files. The old node is still available in the
Python API but hidden from the menus.
Reviewed By: HooglyBoogly, JacquesLucke, simonthommes, brecht
Maniphest Tasks: T92588
Differential Revision: https://developer.blender.org/D13749
Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley,
2020, Journal of Computer Graphics Techniques.
It is distinct from the existing Sobol sampler in two important ways:
* It is Owen scrambled, which gives it a much better convergence rate in many
situations.
* It uses padding for higher dimensions, rather than using higher Sobol
dimensions directly. In practice this is advantagous because high-dimensional
Sobol sequences have holes in their sampling patterns that don't resolve
until an unreasonable number of samples are taken. (See Burley's paper for
details.)
The pattern reduces noise in some benchmark scenes, however it is also slower,
particularly on the CPU. So for now Progressive Multi-Jittered sampling remains
the default.
Differential Revision: https://developer.blender.org/D15679
This cleans up the OpenGL build flags and linking.
It additionally also removes some dead code.
One of these dead code paths is WITH_X11_ALPHA which actually never was
active even with the build flag on. The call to use this was never
called because the default initializer for GHOST was set to have it off
per default. Nothing called this function with a boolean value to enable it.
These cleanups are needed to support true headless OpenGL rendering.
Without these cleanups libepoxy will fail to load the correct OpenGL
Libraries as we have already linked them to the blender binary.
Reviewed By: Brecht, Campbell, Jeroen
Differential Revision: http://developer.blender.org/D15554
* Store compact ray differentials in ShaderData and compute full differentials
on demand. This reduces register pressure on the GPU.
* Remove BSDF differential code that was effectively doing nothing as the
differential orientation was discarded when making it compact.
This gives a 1-5% speedup with RTX A6000 + OptiX in our benchmarks, with the
bigger speedups in simpler scenes.
Renders appear to be identical except for the Both displacement option that
does both displacement and bump.
Differential Revision: https://developer.blender.org/D15677
Simplifies intersection code a little and slightly improves precision regarding
self intersection.
The parametric texture coordinate in shader nodes is still the same as before
for compatibility.
The Metal backend now compiles and caches a second set of kernels which are
optimized for scene contents, enabled for Apple Silicon.
The implementation supports doing this both for intersection and shading
kernels. However this is currently only enabled for intersection kernels that
are quick to compile, and already give a good speedup. Enabling this for
shading kernels would be faster still, however this also causes a long wait
times and would need a good user interface to control this.
M1 Max samples per minute (macOS 13.0):
PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE
barbershop_interior 83.4 89.5 93.7
bmw27 1486.1 1671.0 1825.8
classroom 175.2 196.8 206.3
fishy_cat 674.2 704.3 719.3
junkshop 205.4 212.0 257.7
koro 310.1 336.1 342.8
monster 376.7 418.6 424.1
pabellon 273.5 325.4 339.8
sponza 830.6 929.6 1142.4
victor 86.7 96.4 96.3
wdas_cloud 111.8 112.7 183.1
Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones
Differential Revision: https://developer.blender.org/D14645
This patch unifies the names of math functions for different data types and uses
overloading instead. The goal is to make it possible to swap out all the float3
variables containing RGB data with something else, with as few as possible
changes to the code. It's a requirement for future spectral rendering patches.
Differential Revision: https://developer.blender.org/D15276
* Rename "texture" to "data array". This has not used textures for a long time,
there are just global memory arrays now. (On old CUDA GPUs there was a cache
for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
these are in a struct.
For non-raw, non-sRGB color spaces, always use half float even if that uses
more memory. Otherwise the precision loss from conversion to scene linear or
sRGB (as natively understood by the texture sampling) can be too much.
This also required a change to do alpha association ourselves instead of OIIO,
because in OIIO alpha multiplication happens before conversion to half float
and that gives too much precision loss.
Ref T68926
Move MNEE to own kernel, separate from shader ray-tracing. This does introduce
the limitation that a shader can't use both MNEE and AO/bevel, but that seems
like the better trade-off for now.
We can experiment with bigger kernel organization changes later.
Differential Revision: https://developer.blender.org/D15070
Regenerate blackbody RGB curve fit to not clamp values, and extend down to
800K since it does now change below 965K.
Note that as before, blackbody is only defined in the range 800K to 12000K
and has a fixed value outside of that. But within that range there should
be no more unnecessary gamut clamping.
This patch makes it possible to change the precision with which to
store volume data in the NanoVDB data structure (as float, half, or
using variable bit quantization) via the previously unused precision
field in the volume data block.
It makes it possible to further reduce memory usage during
rendering, at a slight cost to the visual detail of a volume.
Differential Revision: https://developer.blender.org/D10023
This completes support for tiled texture packing on the Blender / Cycles
side of things.
Most of these changes fall into one of three categories:
- Updating Image handling code to pack/unpack tiled and multi-view images
- Updating Cycles to handle tiled textures through BlenderImageLoader
- Updating OSL to properly handle textures with multiple slots
Differential Revision: https://developer.blender.org/D14395
Partially reverts rB46ae0831134 now that we have a new version of
OSL/OIIO that supports <UVTILE> directly.
Differential Revision: https://developer.blender.org/D14851
evice_update_preprocess is supposed to detect modified attributes and flag the
device_vector for a copy through device_update_flags. However, since object
attributes are only created in device_update_attributes afterwards, they can't
be included in that check.
Change the function that actually updates the device_vector to tag it as
modified as soon as its content gets updated.
Differential Revision: https://developer.blender.org/D14815
Inspired by D12936 and D12929, this patch adds general purpose
"Combine Color" and "Separate Color" nodes to Geometry, Compositor,
Shader and Texture nodes.
- Within Geometry Nodes, it replaces the existing "Combine RGB" and
"Separate RGB" nodes.
- Within Compositor Nodes, it replaces the existing
"Combine RGBA/HSVA/YCbCrA/YUVA" and "Separate RGBA/HSVA/YCbCrA/YUVA"
nodes.
- Within Texture Nodes, it replaces the existing "Combine RGBA" and
"Separate RGBA" nodes.
- Within Shader Nodes, it replaces the existing "Combine RGB/HSV" and
"Separate RGB/HSV" nodes.
Python addons have not been updated to the new nodes yet.
**New shader code**
In node_color.h, color.h and gpu_shader_material_color_util.glsl,
missing methods hsl_to_rgb and rgb_to_hsl are added by directly
converting existing C code. They always produce the same result.
**Old code**
As requested by T96219, old nodes still exist but are not displayed in
the add menu. This means Python scripts can still create them as usual.
Otherwise, versioning replaces the old nodes with the new nodes when
opening .blend files.
Differential Revision: https://developer.blender.org/D14034
* Leave code for building the render delegate against other applications and
their USD libraries to the Cycles repository, since this is not a great fit.
In the Blender repository, always use Blender's USD libraries now that they
include Hydra support.
* Hide non-USD symbols from the hdCycles shared library, to avoid library
version conflicts.
* Share Apple framework linking between the standalone app and plugin.
* Add cycles_hydra module, to be shared between the standalone app and plugin.
* Bring external libs code in sync with standalone repo, adding various missing
libraries.
* Move some cmake include directories to the top level cycles source folder
because we need to control their global order, to ensure we link against the
correct headers with mixed Blender libraries and external USD libraries.
This adds support for rendering motion blur for volumes, using their
velocity field. This works for fluid simulations and imported VDB
volumes. For the latter, the name of the velocity field can be set per
volume object, with automatic detection of velocity fields that are
split into 3 scalar grids.
A new parameter is also added to scale velocity for more artistic control.
Like for Alembic and USD caches, a parameter to set the unit of time in
which the velocity vectors are expressed is also added. For Blender gas
simulations, the velocity unit should always be in seconds, so this is
only exposed for volume objects which may come from external OpenVDB
files.
These parameters are available under the `Render` panels for the fluid
domain and the volume object data properties respectively.
Credits: kernel advection code from Tangent Animation's Blackbird based
on earlier work by Geraldine Chua
Differential Revision: https://developer.blender.org/D14629
Keep the existing Rec.709 fit and convert to other colorspace if needed, it
seems accurate enough in practice, and keeps the same performance for the
default case.
As far as I can see, it makes a lot of sense to have the alpha channel here, it matches the 2.x behavior and also matches what Eevee is doing.
Differential Revision: https://developer.blender.org/D14595
* Add missing GLEW and hgiGL libraries for Hydra
* Fix wrong case sensitive include
* Fix link errors by adding external libs to static Hydra lib
* Work around weird Hydra link error with MAX_SAMPLES
* Use Embree by default for Hydra
* Sync external libs code with standalone
* Update version number to match Blender
* Remove unneeded CLEW/GLEW from test executable
None of this should affect Cycles in Blender.
Ref T96731
Stumbled over the `integrate_surface_volume_only_bounce` kernel
function not returning the right type. The others too showed up as
warnings when building Cycles as a standalone which didn't have
those warnings disabled.
Differential Revision: https://developer.blender.org/D14558