Commit Graph

789 Commits

Author SHA1 Message Date
Brecht Van Lommel
394a1373a0 Cycles: use OpenCL C 2.0 if available, to improve performance for AMD
Tested with AMD Radeon Pro WX 9100, where it brings performance back to 2.80
level, and combined with recent changes is about 2-15% faster than 2.80 in
our benchmark scenes.

This somehow appears to specifically address the issue where adding more shader
nodes leads to slower runtime. I found no additional speedup by applying this
to change to 2.80 or removing the new shader node code.

Ref T71479

Patch by Jeroen Bakker.

Differential Revision: https://developer.blender.org/D6252
2020-03-24 20:09:36 +01:00
Ray Molenkamp
44c6b6615b OpenCL: Bring back CYCLES_OPENCL_TEST override
Back in 2.79 you could either use the debug panel or an
environment variable to override using OpenCL for unsupported
hardware. Which was rather useful for developers when testing
on NVidia just to be sure the CL kernels at-least build properly.

This broke in rB949ab753bb2

This diff restores testing though the CYCLES_OPENCL_TEST
environment variable.

Differential Revision: https://developer.blender.org/D7202

Reviewers: brecht
2020-03-21 11:55:45 -06:00
Dalai Felinto
2d1cce8331 Cleanup: make format after SortedIncludes change 2020-03-19 09:33:58 +01:00
Brecht Van Lommel
472534d16e Fix memory leak in recent Cycles image texture refactor 2020-03-12 20:30:49 +01:00
Brecht Van Lommel
26bea849cf Cleanup: add device_texture for images, distinct from other global memory
There was too much image texture specific stuff in device_memory, and too
much code duplication between devices.
2020-03-12 17:28:55 +01:00
Brecht Van Lommel
21821601f2 Fix Optix build error on Linux with some compilers 2020-03-11 20:35:38 +01:00
Brecht Van Lommel
f01bc597a8 Cleanup: stop encoding image data type in slot index
This is legacy code from when we had a fixed number of textures.
2020-03-11 17:07:17 +01:00
Brecht Van Lommel
dcdcc23488 Fix T74504: Cycles wrong progress bar with CPU adaptive sampling 2020-03-06 23:46:58 +01:00
Brecht Van Lommel
b31b44c223 Fix error in Cycles Optix adaptive sampling after recent cleanup 2020-03-06 23:46:58 +01:00
Dalai Felinto
c60366c01f Cycles: cleanup warning 2020-03-06 15:27:50 +01:00
Brecht Van Lommel
c8ac760c59 Cleanup: tweak Cycles #includes in preparation for clang-format sorting 2020-03-06 14:44:42 +01:00
Campbell Barton
8574d68aa0 Cleanup: spelling 2020-03-06 11:52:32 +11:00
Patrick Mours
88db9a17ce Fix T74393: Cycles crashes when both OSL and Optix Denoising are enabled
Enabling viewport denoising causes Cycles to use a multi-device, which always returned NULL when
asked for OSL memory and would subsequently crash. This fixes that by returning the correct OSL
memory pointer from the CPU device in the special viewport denoising multi-device.
2020-03-05 16:28:31 +01:00
Stefan Werner
51e898324d Adaptive Sampling for Cycles.
This feature takes some inspiration from
"RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and
"A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination"

The basic principle is as follows:
While samples are being added to a pixel, the adaptive sampler writes half
of the samples to a separate buffer. This gives it two separate estimates
of the same pixel, and by comparing their difference it estimates convergence.
Once convergence drops below a given threshold, the pixel is considered done.

When a pixel has not converged yet and needs more samples than the minimum,
its immediate neighbors are also set to take more samples. This is done in order
to more reliably detect sharp features such as caustics. A 3x3 box filter that
is run periodically over the tile buffer is used for that purpose.

After a tile has finished rendering, the values of all passes are scaled as if
they were rendered with the full number of samples. This way, any code operating
on these buffers, for example the denoiser, does not need to be changed for
per-pixel sample counts.

Reviewed By: brecht, #cycles

Differential Revision: https://developer.blender.org/D4686
2020-03-05 12:21:38 +01:00
Patrick Mours
af54bbd61c Cycles: Rework tile scheduling for denoising
This fixes denoising being delayed until after all rendering has finished. Instead, tile-based
denoising is now part of the "RENDER" task again, so that it is all in one task and does not
cause issues with dedicated task pools where tasks are serialized.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6940
2020-02-28 16:12:29 +01:00
Patrick Mours
0cea9353fd Fix CUDA out of memory error with OptiX viewport denoising on small GPUs
This makes the memory allocation for the denoiser state use the memory allocator in Cycles, which
will evict textures to host memory when there is not enough space on the device. This means the
allocation for the denoiser state won't just fail if there is no more space and instead more space is
made for it to work. Also simplifies code somewhat.
2020-02-28 15:58:17 +01:00
Patrick Mours
a93043153a Cleanup: Remove superfluous "cuda_device_ptr" function 2020-02-25 17:13:59 +01:00
Dalai Felinto
213b4f76ee Cleanup: make format 2020-02-19 18:44:22 +01:00
Patrick Mours
2278aa0da9 Cycles: Add support for adaptive kernel compilation to OptiX device
This modifies the common CUDA implementation for adaptive kernel compilation slightly to support both CUBIN and PTX output (the latter which is then used in the OptiX device). It also fixes adaptive kernel compilation on Windows.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6851
2020-02-17 14:27:44 +01:00
Ray molenkamp
9339dc6dd1 Fix T70685: Cycles crash using WITH_CYCLES_NATIVE_ONLY on Windows
MSVC does not have -march=native, so the kernel gets built without AVX2 and
BVH8 support. The code assumed it to be available and crashed

Differential Revision: https://developer.blender.org/D6082
2020-02-14 13:55:11 +01:00
Patrick Mours
0d750d7c06 Fix OptiX denoising when multiple CUDA streams are active 2020-02-13 15:22:26 +01:00
Patrick Mours
63bde1063f Cleanup: Remove some unnecessary OptiX device code 2020-02-13 15:22:26 +01:00
Campbell Barton
d1bd33407d Cleanup: pass const variables 2020-02-13 14:14:33 +11:00
Patrick Mours
6389471c40 Fix NLM denoiser no longer working with OptiX after recent commit 2020-02-12 15:46:30 +01:00
Brecht Van Lommel
d104c77af8 Fix Cycles compiler warnings after recent commit 2020-02-12 14:29:14 +01:00
Brecht Van Lommel
709012187a Fix Cycles build errors and clang-format after recent commit 2020-02-12 14:14:02 +01:00
Patrick Mours
153e001c74 Cleanup: Move common CUDA/OptiX Cycles device code into separate file
This reduces code duplication between the CUDA and OptiX device implementations: The CUDA device
class is now split into declaration and definition (similar to the OpenCL device) and the OptiX device
class implements that and only overrides the functions it actually has to change, while using the CUDA
implementation for everything else.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6814
2020-02-12 13:11:32 +01:00
Patrick Mours
38589de10c Cycles: Add support for denoising in the viewport
The OptiX denoiser can be a great help when rendering in the viewport, since it is really fast
and needs few samples to produce convincing results. This patch therefore adds support for
using any Cycles denoiser in the viewport also (but only the OptiX one is selectable because
the NLM one is too slow to be usable currently). It also adds support for denoising on a
different device than rendering (so one can e.g. render with the CPU but denoise with OptiX).

Reviewed By: #cycles, brecht

Differential Revision: https://developer.blender.org/D6554
2020-02-11 18:03:43 +01:00
Brecht Van Lommel
d9c5f0d25f Cleanup: split Cycles Hair and Mesh classes, with Geometry base class 2020-02-07 12:18:15 +01:00
Ray Molenkamp
02226ef653 Code_Cleanup_Day/Windows: Clean-up windows API Level.
Not sure when this happened but apparently the lower bar is now windows 7 [1]

This patch bumps to API version to 0x0601 (Win7) and cleans up any uses that
worked around the globally set API version.

[1] https://www.blender.org/download/requirements/

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6758
2020-02-05 20:03:06 -07:00
Sergey Sharybin
6fff73e3f0 Merge branch 'blender-v2.82-release' 2020-01-23 16:59:50 +01:00
Sergey Sharybin
517870a4a1 CMake: Refactor external dependencies handling
This is a more correct fix to the issue Brecht was fixing in D6600.

While the fix in that patch worked fine for linking it broke ASAN
runtime under some circumstances.
For example, `make full debug developer` would compile, but trying
to start blender will cause assert failure in ASAN (related on check
that ASAN is not running already).

Top-level idea: leave it to CMake to keep track of dependency graph.

The root of the issue comes to the fact that target like "blender" is
configured to use a lot of static libraries coming from Blender sources
and to use external static libraries. There is nothing which ensures
order between blender's and external libraries. Only order of blender
libraries is guaranteed.

It was possible that due to a cycle or other circumstances some of
blender libraries would have been passed to linker after libraries
it uses, causing linker errors.

For example, this order will likely fail:

  libbf_blenfont.a libfreetype6.a libbf_blenfont.a

This change makes it so blender libraries are explicitly provided
their dependencies to an external libraries, which allows CMake to
ensure they are always linked against them.

General rule here: if bf_foo depends on an external library it is
to be provided to LIBS for bf_foo.
For example, if bf_blenkernel depends on opensubdiv then LIBS in
blenkernel's CMakeLists.txt is to include OPENSUBDIB_LIBRARIES.

The change is made based on searching for used include folders
such as OPENSUBDIV_INCLUDE_DIRS and adding corresponding libraries
to LIBS ion that CMakeLists.txt. Transitive dependencies are not
simplified by this approach, but I am not aware of any downside of
this: CMake should be smart enough to simplify them on its side.
And even if not, this shouldn't affect linking time.

Benefit of not relying on transitive dependencies is that build
system is more robust towards future changes. For example, if
bf_intern_opensubiv is no longer depends on OPENSUBDIV_LIBRARIES
and all such code is moved to bf_blenkernel this will not break
linking.

The not-so-trivial part is change to blender_add_lib (and its
version in Cycles). The complexity is caused by libraries being
provided as a single list argument which doesn't allow to use
different release and debug libraries on Windows. The idea is:

- Have every library prefixed as "optimized" or "debug" if
  separation is needed (non-prefixed libraries will be considered
  "generic").

- Loop through libraries passed to function and do simple parsing
  which will look for "optimized" and "debug" words and specify
  following library to corresponding category.

This isn't something particularly great. Alternative would be to
use target_link_libraries() directly, which sounds like more code
but which is more explicit and allows to have more flexibility
and control comparing to wrapper approach.

Tested the following configurations on Linux, macOS and Windows:

- make full debug developer
- make full release developer
- make lite debug developer
- make lite release developer

NOTE: Linux libraries needs to be compiled with D6641 applied,
otherwise, depending on configuration, it's possible to run into
duplicated zlib symbols error.

Differential Revision: https://developer.blender.org/D6642
2020-01-23 16:59:18 +01:00
Antonio Vazquez
fb671035be Merge branch 'blender-v2.82-release' 2020-01-23 16:56:26 +01:00
Patrick Mours
26687dda5a Fix T71344: Optix render errors with motion blur and unknown bone constraint relationship
The OptiX SRT motion expects a motion defined by translation,
rotation, shear and scale, but the matrix decomposition code in
Cycles was not able to extract shear information and instead
produced a stretch matrix with the information baked in. This
caused conflicting transforms between traversal and shading
and lead to render artifacts.
This patch changes the matrix decomposition to produce factors
inline with what OptiX expects to fix that.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6605
2020-01-22 15:49:14 +01:00
Patrick Mours
2a638e8c90 Merge branch 'blender-v2.82-release' 2020-01-14 17:49:04 +01:00
Patrick Mours
ff430dea66 Fix rendering motion blur scenes with OptiX failing with CUDA_ERROR_INVALID_CONTEXT
Commit baeb11826b switched memory
allocation for the motion transform to use CUDA directly, instead of going
through abstractions. But no CUDA context was set active before those
were called, so the calls failed. This fixes that by binding a context beforehand.
2020-01-14 17:48:16 +01:00
Patrick Mours
89578a8f6e Fix OptiX acceleration structure failing to build in viewport
The `optixAccelBuild` API throws an error when the property to get compacted size is passed in
without the `OPTIX_BUILD_FLAG_ALLOW_COMPACTION` flag set. This is not currently hit
because `background` is always true (set in `mem_alloc`), but would become an issue once that
is sorted out, so fixing it now to be safe.
2020-01-10 16:03:11 +01:00
Patrick Mours
d5ca72191c Cycles: Add OptiX AI denoiser support
This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D6395
2020-01-08 16:53:11 +01:00
Jeroen Bakker
f5e37af5a8 Cycles/OpenCL: Remove NULL PTR Workaround
In the current OpenCL implementation we have a work-around for platforms
that didn't support NULL pointers. We used to replace all NULLs and
empty arrays with a pointer to a single byte on the OpenCL Device.

During investigation of {T65924} it was asked to remove this work-around
for testing. This change improves the render times.

    SCENE              | BEFORE | AFTER
   --------------------+--------+-------
   bmw27               | 108    | 89
   barbershop_interior | 867    | 673
   classroom           | 270    | 173
   fishy_cat           | 244    | 196
   koro                | 249    | 207
   pavillon_barcelona  | 582    | 414

Note that this change does not fix T65924 it just improves the
rendering performance for OpenCL. We haven't tested this patch on all
platforms so we should keep an eye out on the tracker.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D6391
2019-12-11 11:59:21 +01:00
Patrick Mours
baeb11826b Cycles: Add OptiX acceleration structure compaction
This adds compaction support for OptiX acceleration structures, which reduces the device memory footprint in a post step after building. Depending on the scene this can reduce the amount of used device memory quite a bit and even improve performance (smaller acceleration structure improves cache usage). It's only enabled for background renders to make acceleration structure builds fast in viewport.

Also fixes a bug in the memory management for OptiX acceleration structures: These were held in a dynamic vector of 'device_memory' instances and used the mem_alloc/mem_free functions. However, those keep track of memory instances in the 'cuda_mem_map' via pointers to 'device_memory' (which works fine everywhere else since those are never copied/moved). But in the case of the vector, it may decide to reallocate at some point, which invalidates those pointers and would result in some nasty accesses to invalid memory. So it is not actually safe to move a 'device_memory' object and therefore this removes the move operator overloads again.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D6369
2019-12-09 14:32:12 +01:00
Patrick Mours
8caeae9f40 Fix crash if OptiX context creation fails in Cycles
When encountering an error during context creation, the "OptiXDevice" constructor aborts early.
This means the "cuda_stream" vector is never resized and the destructor iterated over non-existent data.
2019-11-28 17:07:14 +01:00
Patrick Mours
70a32adfeb Fix assert in Cycles memory statistics when using OptiX on multiple GPUs
The acceleration structure built by OptiX may be different between GPUs, so cannot assume the memory size is the same for all.
This fixes that by moving the memory management for all OptiX acceleration structures into the responsibility of each device (was already the case for BLAS previously, now for TLAS too).
2019-11-28 13:57:02 +01:00
Patrick Mours
03cdfc2ff6 Fix potential access to deleted memory in OptiX kernel loading code
Calling "OptiXDevice::load_kernels" multiple times would call "optixPipelineDestroy" on a pipeline
pointer that may have already been deleted previously (since the PIP_SHADER_EVAL pipeline is only
created conditionally).
This change also avoids a CUDA kernel reload every time this is called. The CUDA kernels are
precompiled and don't change, so there is no need to reload them every time.
2019-11-25 18:36:55 +01:00
Sergey Sharybin
4ab0b2b5aa Merge branch 'blender-v2.81-release' 2019-11-07 11:12:44 +01:00
Sergey Sharybin
aa2904ea13 Cycles: Fix strict compiler warning
Pointer used for math arithmetics in assert().
CUDA device pointer is actually an integer type, not a pointer.
2019-11-07 11:06:41 +01:00
Brecht Van Lommel
af9a50bfe6 Merge branch 'blender-v2.81-release' 2019-11-05 17:35:27 +01:00
Ha Hyung-jin
9a9e93e804 Fix T71071: errors when using multiple CUDA/Optix GPUs and host mapped memory
The multi device code did not correctly handle cases where some GPUs store a
resource in device memory and others store it in host mapped memory.

Differential Revision: https://developer.blender.org/D6126
2019-11-05 16:40:55 +01:00
Patrick Mours
200267eb96 Merge branch 'blender-v2.81-release' 2019-10-21 15:53:29 +02:00
Patrick Mours
d0cba5caf4 Fix T70937: Cycles fails in viewport when rendering with OptiX
Was caused by D6068, which did not handle "MEM_PIXELS" memory
when not in background mode. Before that it always fell back to using
generic device memory, so restoring that behavior. In future this
should be changes to use OpenGL interop for optimal performance.
2019-10-21 14:23:45 +02:00
Philipp Oeser
8148bf8cf0 Merge branch 'blender-v2.81-release' 2019-10-18 13:33:20 +02:00