The issue was caused by un-initialized local storage for volume
intersection hits which are supposed to be stored in per-thread
KernelGlobals.
Fix is to make thread_shader() be the same as thread_render() in
respect of KernelGlobals.
Reviewers: brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5230
We want users to go to the current version for their current version
when possible if not point to latest.
/dev should really only be for development related work. End users
should not be browsing /dev unless they are reading about upcoming
features ahead of time.
When compute preemption is available we schedule more work which is more
efficient. However the CUDA driver appears to be incorrectly reporting this as
unavailable, even though it should be supported starting with Windows 10 1803
and Pascal and Turing (10x0 and 20x0) graphics cards.
This reduces render time by about a 25% difference on our benchmark scenes. On
Linux compute preemption appears to be reported correctly.
Previously, bright edges (e.g. caused by rim lighting) would sometimes get
halos around them after denoising.
This change introduces a log(1+x) highlight compression step that is performed
before denoising and reversed afterwards. That way, the denoising algorithm
itself operates in the compressed space and therefore bright edges cause less
numerical issues.
The kernel does not use AVX2 vectorization, and trying to use BVH8 was
leading to an empty scenes.
Fixes T64624: Ctest : Win32 + AVX2 fails virtually all cycles tests
This adds our own OSL texture handle, that has info for OIIO textures or our
own custom texture types. A filename to handle hash map is used for lookups.
This is efficient because it happens at OSL compile time, because the optimizer
can figure out constant strings and replace them with texture handles.
It's effectively always enabled, only not on some unsupported OpenCL devices.
For testing those it's not useful to disable these features. This is replaced
by the more fine grained feature toggles that we have now.
This version fixes various bugs, and there is no need anymore to use both
9.1 and 10.0 for different cards.
There is a bug related to WITH_CYCLES_CUBIN_COMPILER and bump mapping in the
regression tests, so that remains disabled same as it was for CUDA 10.0.
Fix T59286: CUDA bake failing on some cards.
Fix T56858: CUDA 9.2 and 10 issues.
The main goals of this change is faster starting when using foreground
rendering.
This patch will build kernels in parallel to the update process of
the scene. When these optimized kernels are not available (yet) an AO
kernel will be used.
These AO kernels are fast to compile (3-7 seconds) and can be
reused by all scenes. When the final kernels become available we
will switch to these kernels.
In background mode the AO kernels will not be used.
Some kernels are being used during Scene update (displace, background
light). When these kernels are being used the process can halt until
these become available.
Reviewed By: brecht, #cycles
Maniphest Tasks: T61752
Differential Revision: https://developer.blender.org/D4428
The functions that determine the program name + filename of kernels
were missing some base kernels like denoising and base. For completeness
I added those kernels so the function returns the correct results.
This patch will reduce the number of times that we need to
recompile kernels. It does this by (en/dis)abling features
by default. So when the user needs them that the kernels are
already available.
Other features are enabled by default for background and foreground
rendering. When in background rendering the user wants the best
render performance. When in foreground rendering the user wants
the least amount of recompilations.
Enabling volumetrics or subdivision evaluation will still trigger
a recompilation during foreground rendering.
Reviewed By: #cycles, brecht
Differential Revision: https://developer.blender.org/D4485
Part of the cleanup of the OpenCL codebase.
Single program is not effective when using OpenCL, it is slower
to compile and slower during rendering (when used in for example
`barbershop` or `victor`).
Reviewers: brecht, #cycles
Maniphest Tasks: T62267
Differential Revision: https://developer.blender.org/D4481
Displacement and Background kernels are selectively used, but always compiled. This patch will not compile these kernels when they are not needed.
Displacement kernel is only used for true displacement.
Background kernel is only used when there is a (Cycles)Light of type `LIGHT_BACKGROUND`.
Reviewed By: brecht, #cycles
Tags: #cycles
Maniphest Tasks: T61971
Differential Revision: https://developer.blender.org/D4412
The goal of this patch is to have limit the number of times
kernels needs to be compiled and are reused as kernels with
different compile directives can lead to identical same
binaries.
The implementation does this by stripping the compile directives.
and reshuffling kernels so the output is more likely to be the
same.
We focussed on the kernels where it was easy to detect and maintain
(bundle, bake, displace, do_volume and background). More optimizations
could be done but they are probably less obvious.
Merged the data_init and state_buffer_size kernels to split_bundle.
This patch will also remove empty kernels for do_volume and bake
when their features are not enabled.
When using the benchmark files there are less background, bake and
do_volume kernels compiled.
Fix: T61576, T61501, T61466
Reviewed By: brecht, #cycles
Differential Revision: https://developer.blender.org/D4390
Using OpenCL MegaKernel has been slow and therefore not usefull.
This patch will remove the mega kernel from the OpenCL codebase
and the OpenCLDeviceBase class.
T61736: removal of mega kernel
T61703: baking does not work with mega kernel
Tags: #cycles
Differential Revision: https://developer.blender.org/D4383