Commit Graph

2101 Commits

Author SHA1 Message Date
Jeroen Bakker
949ab753bb Cycles OpenCL: Remove OpenCL MegaKernel
Using OpenCL MegaKernel has been slow and therefore not usefull.
This patch will remove the mega kernel from the OpenCL codebase
and the OpenCLDeviceBase class.

T61736: removal of mega kernel
T61703: baking does not work with mega kernel

Tags: #cycles

Differential Revision: https://developer.blender.org/D4383
2019-02-20 15:17:22 +01:00
Jeroen Bakker
667033e89e T61463: Separate Baking kernels
Cycles OpenCL: Split baking kernels in own program

Fix T61463. Before this patch baking was part of the base kernels. There
are 3 baking kernels that and all 3 uses shader evaluation. Only for one
of these kernels the functionality was wrapped in the __NO_BAKING__
compile directive.

When you start baking this leads to long compile times. By separating
in individual programs will reduce the compile times.

Also wrapped all baking kernels with __NO_BAKING__ to reduce the
compilation times.

Impact on compilation time

    job   |   scene_name    | previous |  new  | percentage
  --------+-----------------+----------+-------+------------
   T61463 | empty           |    10.63 |  7.27 |         32%
   T61463 | bmw             |    17.91 | 14.24 |         20%
   T61463 | fishycat        |    19.57 | 15.08 |         23%
   T61463 | barbershop      |    54.10 | 48.18 |         11%
   T61463 | classroom       |    17.55 | 14.42 |         18%
   T61463 | koro            |    18.92 | 17.15 |          9%
   T61463 | pavillion       |    17.43 | 14.23 |         18%
   T61463 | splash279       |    16.48 | 15.33 |          7%
   T61463 | volume_emission |    36.22 | 34.19 |          6%

Impact on render time

    job   |   scene_name    | previous |   new   | percentage
  --------+-----------------+----------+---------+------------
   T61463 | empty           |    21.06 |   20.54 |          2%
   T61463 | bmw             |   198.44 |  189.59 |          4%
   T61463 | fishycat        |   394.20 |  388.50 |          1%
   T61463 | barbershop      |  1188.16 | 1185.49 |          0%
   T61463 | classroom       |   341.08 |  339.27 |          1%
   T61463 | koro            |   472.43 |  360.70 |         24%
   T61463 | pavillion       |   905.77 |  902.14 |          0%
   T61463 | splash279       |    55.26 |   54.92 |          1%
   T61463 | volume_emission |    62.59 |   39.09 |         38%

I don't have a grounded explanation why koro and volume_emission is this much
faster; I have done several tests though...

Maniphest Tasks: T61463

Differential Revision: https://developer.blender.org/D4376
2019-02-19 16:34:55 +01:00
Jeroen Bakker
e6f5632eb1 T61513: Refactored Cycles Attribute Retrieval
There is a generic function to retrieve float and float3 attributes
`primitive_attribute_float` and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.

Actually the calling function most of the time already knows where
the data is stored. So we could simplify this by splitting these
functions and remove the check logic.

This patch splits the `primitive_attribute_float?` functions into
`primitive_surface_attribute_float?` and `primitive_volume_attribute_float?`.
What leads to less branching and more optimum kernels.

The original function is still being used by OSL and `svm_node_attr`.

This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.

Impact in compilation times

    job  |   scene_name    | previous |  new  | percentage
  -------+-----------------+----------+-------+------------
  t61513 | empty           |    10.63 | 10.66 |          0%
  t61513 | bmw             |    17.91 | 17.65 |          1%
  t61513 | fishycat        |    19.57 | 17.68 |         10%
  t61513 | barbershop      |    54.10 | 24.41 |         55%
  t61513 | classroom       |    17.55 | 16.29 |          7%
  t61513 | koro            |    18.92 | 18.05 |          5%
  t61513 | pavillion       |    17.43 | 16.52 |          5%
  t61513 | splash279       |    16.48 | 14.91 |         10%
  t61513 | volume_emission |    36.22 | 21.60 |         40%

Impact in render times

    job  |   scene_name    | previous |  new   | percentage
  -------+-----------------+----------+--------+------------
  61513 | empty           |    21.06 |  20.35 |          3%
  61513 | bmw             |   198.44 | 190.05 |          4%
  61513 | fishycat        |   394.20 | 401.25 |         -2%
  61513 | barbershop      |  1188.16 | 912.39 |         23%
  61513 | classroom       |   341.08 | 340.38 |          0%
  61513 | koro            |   472.43 | 471.80 |          0%
  61513 | pavillion       |   905.77 | 899.80 |          1%
  61513 | splash279       |    55.26 |  54.86 |          1%
  61513 | volume_emission |    62.59 |  61.70 |          1%

There is also a possitive impact when using CPU and CUDA, but they are small.

I didn't split the hair logic from the surface logic due to:

* Hair and surface use same attribute types. It was not clear if it could be
  splitted when looking at the code only.
* Hair and surface are quick to compile and to read. So the benefit is quite
  small.

Differential Revision: https://developer.blender.org/D4375
2019-02-19 16:28:25 +01:00
Brecht Van Lommel
9800837b98 Cycles: Support multithreaded compilation of kernels
This patch implements a workaround to get the multithreaded compilation from D2231 working.
So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function.
Depends on D2231.

Patch by lukasstockner97, jbakker, brecht

    job    |   scene_name    | compilation_time
----------+-----------------+------------------
    Baseline | empty           |            22.73
    D2264    | empty           |            13.94
    Baseline | bmw             |            56.44
    D2264    | bmw             |            41.32
    Baseline | fishycat        |            59.50
    D2264    | fishycat        |            45.19
    Baseline | barbershop      |           212.28
    D2264    | barbershop      |           169.81
    Baseline | victor          |            67.51
    D2264    | victor          |            53.60
    Baseline | classroom       |            51.46
    D2264    | classroom       |            39.02
    Baseline | koro            |            62.48
    D2264    | koro            |            49.03
    Baseline | pavillion       |            54.37
    D2264    | pavillion       |            38.82
    Baseline | splash279       |            47.43
    D2264    | splash279       |            37.94
    Baseline | volume_emission |           145.22
    D2264    | volume_emission |           121.10

This patch reduced compilation time as the split kernels and base
kernels are compiled in parallel. In cycles debug mode (256) you can set
unmark the opencl single program file, what reduces the compilation time
even further (bmw 17 seconds, barbershop 53 seconds).

Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97

Reviewed By: brecht

Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli

Differential Revision: https://developer.blender.org/D2264
2019-02-15 08:56:20 +01:00
Brecht Van Lommel
de0e456a6c Cleanup: fix compiler warnings. 2019-02-14 19:39:39 +01:00
Brecht Van Lommel
9886ae6331 Fix T61470: incorrect saturation clamping in recent bugfix.
We should clamp the result after multiplication.
2019-02-14 19:28:44 +01:00
Brecht Van Lommel
ec559912fb Fix T61470: inconsistent HSV node results with saturation > 1.0.
Values outside the 0..1 range produce negative colors, so now clamp to that
range everywhere. Also fixes improper handling of hue > 2.0 in some places.
2019-02-13 17:06:30 +01:00
Lukas Stockner
fccf506ed7 Cycles: animation denoising support in the kernel.
This is the internal implementation, not available from the API or
interface yet. The algorithm takes into account past and future frames,
both to get more coherent animation and reduce noise.

Ref D3889.
2019-02-06 15:18:42 +01:00
Lukas Stockner
c183ac73dc Cycles: tweak outlier detection, preparing for animation denoising.
Ref D3889.
2019-02-06 15:18:38 +01:00
Lukas Stockner
405cacd4cd Cycles: prefilter feature passes separate from denoising.
Prefiltering of feature passes will happen during rendering, which can
then be used for denoising immediately or written as a render pass for
later (animation) denoising.

The number of denoising data passes written is reduced because of this,
leaving out the feature variance passes. The passes are now Normal,
Albedo, Depth, Shadowing, Variance and Intensity.

Ref D3889.
2019-02-06 15:18:29 +01:00
Campbell Barton
8c68ed6df1 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 02:40:00 +11:00
Brecht Van Lommel
d918217d35 OSL: remove fresnel template that was not public domain.
Convention is to only have public domain code templates. Also fixes wrong
license header in Cycles.
2019-01-28 12:04:54 +01:00
Brecht Van Lommel
10fa3b790f Fix T60450: Cycles broken GPU denoising after recent changes. 2019-01-14 11:42:38 +01:00
Brecht Van Lommel
e5a1a9288c Fix T60320: Cycles OpenCL denoising filter errors on some drivers. 2019-01-11 11:25:37 +01:00
Brecht Van Lommel
b486088218 Fix T60320: Cycles OpenCL volume rendering error on some drivers. 2019-01-08 15:59:10 +01:00
Brecht Van Lommel
8491dba0c6 Fix T60300: Cycles SSS render hanging with AMD OpenCL. 2019-01-08 15:37:16 +01:00
Brecht Van Lommel
fffdedbcc1 Fix T54962: Cycles crash using subsurface scattering texture blur. 2019-01-03 17:10:37 +01:00
Brecht Van Lommel
f7e9642da9 Fix T60061: Cycles OSL point density not working.
Add override keywords so we can detect when the function definitions change.
2019-01-02 19:56:49 +01:00
Brecht Van Lommel
8e331c3431 Fix T59565: NaN/crash with zero radius tip of hair curves. 2018-12-21 18:54:45 +01:00
Brecht Van Lommel
765795aed7 Fix macOS buildbot build, wrong CUDA version check. 2018-12-11 14:16:48 +01:00
Brecht Van Lommel
cccc40db51 Fix T57963: Cycles crash using AO for displacement.
Note this is not supported, there exists no geometry at this point, but
it should not crash at least.
2018-12-06 19:50:05 +01:00
Brecht Van Lommel
f5b46daf52 Fix build with old CMake versions. 2018-12-05 12:53:19 +01:00
Brecht Van Lommel
f63da3dcf5 Buildbot: enable support for NVIDIA Turing cards in Cycles (like GTX 20xx).
We currently only build the sm_7x kernels with CUDA 10.0, older cards still
use 9.1 until rendering errors are solved for them.
2018-12-04 16:03:18 +01:00
Brecht Van Lommel
b14ec18601 Cycles: add initial CUDA 10.0 support, but only recommend use for Turing cards.
There may still be rendering errors when used for older graphics cards.
2018-12-04 16:03:18 +01:00
Shane Ambler
5a6f1fa563 Fix T58600: update OSL scripts to work with OSL 1.10.x. 2018-12-03 15:14:21 +01:00
Lukas Stockner
7fa6f72084 Cycles: Add sample-based runtime profiler that measures time spent in various parts of the CPU kernel
This commit adds a sample-based profiler that runs during CPU rendering and collects statistics on time spent in different parts of the kernel (ray intersection, shader evaluation etc.) as well as time spent per material and object.

The results are currently not exposed in the user interface or per Python yet, to see the stats on the console pass the "--cycles-print-stats" argument to Cycles (e.g. "./blender -- --cycles-print-stats").

Unfortunately, there is no clear way to extend this functionality to CUDA or OpenCL, so it is CPU-only for now.

Reviewers: brecht, sergey, swerner

Reviewed By: brecht, swerner

Differential Revision: https://developer.blender.org/D3892
2018-11-29 02:45:24 +01:00
Campbell Barton
e742e0934d Cleanup: trailing space 2018-11-25 08:01:14 +11:00
Sergey Sharybin
968bf0df14 Fix T57811: Render crashes in certain scenes when AO Bounces are used 2018-11-21 14:17:26 +01:00
Sergey Sharybin
6f48bfc7a8 Cycles: Cleanup, use utility function
Replaces inlined platform-specific code.
2018-11-21 13:51:18 +01:00
Sergey Sharybin
65143542af Cycles: Cleanup, reduce indentation level 2018-11-21 12:41:24 +01:00
Sergey Sharybin
700330afe8 Cycles: Cleanup, comments and dead code 2018-11-21 11:33:11 +01:00
Sergey Sharybin
65d01def80 Cycles: Cleanup, CUDA code path is not possible inside AVX2 2018-11-21 11:28:49 +01:00
Sergey Sharybin
cd9ab9d99e Cycles: Cleanup, code style 2018-11-15 17:16:40 +01:00
Sergey Sharybin
65e9388440 Revert "Cycles: Cleanup, move Embree BVH logic to own file"
While we shouldn't have logic in an entry point, and since one should
not be making typos when moving lines around, there is bigger entanglement
issue with BVH host code using kernel function. This is bad violation,
but is tricky to get solved moments before the weekly.

In order to keep things in a (less) broken state than before own cleanup
reverting the changes.

This reverts commit 2bad10be96.
This reverts commit ddabb21d05
2018-11-09 17:54:09 +01:00
Sergey Sharybin
ddabb21d05 Cycles; Cleanup, line length
There are some more sanitization which would be cool to be done
in the neighbourhood of those functions, but that could also happen
later.
2018-11-09 12:31:46 +01:00
Sergey Sharybin
2bad10be96 Cycles: Cleanup, move Embree BVH logic to own file
There is no way we can keep generic entry point functions easy to
follow if we start adding actual logic in them.
2018-11-09 12:28:55 +01:00
Sergey Sharybin
2d98b198e9 Cycles: Cleanup, indentation in preprocessor 2018-11-09 12:12:11 +01:00
Sergey Sharybin
3e76cc494a Cycles: Cleanup, indentation 2018-11-09 12:10:48 +01:00
Sergey Sharybin
203de0bbf0 Cycles: Cleanup, space after (void)
It was used in like 95% of places.
2018-11-09 12:08:51 +01:00
Sergey Sharybin
cb4b5e12ab Cycles: Cleanup, spacing after preprocessor
It is supposed to be two spaces before comment stating which if
else/endif statements corresponds to. Was mainly violated in the
header guards.
2018-11-09 11:34:54 +01:00
Brecht Van Lommel
116be3deff Fix build on 32bit after Embree changes. 2018-11-08 14:58:01 +01:00
Stefan Werner
d3320c5488 Cycles: Rearranged macros in kernel_types.h to fix Embree build. 2018-11-07 15:20:24 +01:00
Brecht Van Lommel
33201a48b0 Fix build with OSL, remove unneeded file after Embree changes. 2018-11-07 14:38:07 +01:00
Stefan Werner
2c5531c0a5 Cycles: Added Embree as BVH option for CPU renders.
Note that this is turned off by default and must be enabled at build time with the CMake WITH_CYCLES_EMBREE flag.
Embree must be built as a static library with ray masking turned on, the `make deps` scripts have been updated accordingly.
There, Embree is off by default too and must be enabled with the WITH_EMBREE flag.

Using Embree allows for much faster rendering of deformation motion blur while reducing the memory footprint.

TODO: GPU implementation, deduplication of data, leveraging more of Embrees features (e.g. tessellation cache).

Differential Revision: https://developer.blender.org/D3682
2018-11-07 12:58:12 +01:00
Brecht Van Lommel
ea8e45de29 Fix assert rendering hair tests on some systems. 2018-11-04 20:25:57 +01:00
Brecht Van Lommel
7c0d37deca Fix build error on Windows 32bit, alignment was wrong. 2018-10-30 11:39:44 +01:00
Stefan Werner
e58c6cf0c6 Cycles: Added Cryptomatte output.
This allows for extra output passes that encode automatic object and material masks
for the entire scene. It is an implementation of the Cryptomatte standard as
introduced by Psyop. A good future extension would be to add a manifest to the
export and to do plenty of testing to ensure that it is fully compatible with other
renderers and compositing programs that use Cryptomatte.

Internally, it adds the ability for Cycles to have several passes of the same type
that are distinguished by their name.

Differential Revision: https://developer.blender.org/D3538
2018-10-28 05:37:41 -04:00
Brecht Van Lommel
c0b3e3daeb Fix T57393: Cycles OSL bevel and AO not working after OSL upgrade. 2018-10-27 15:00:37 +02:00
Lukas Stockner
65b25df801 Cycles: Overhaul ensure_valid_reflection to fix issues with normal- and bumpmapping
This function is supposed to prevent the black artifacts caused by strong normal- or bumpmapping, but failed in some cases.

Now the code correctly handles all test files and previous issues I am aware of and also has extensive comments describing
the algorithm and the math behind it.

Basically, the main problem was that there can be multiple valid solutions that fulfil the reflection angle criterium,
but I had assumed that only one would exist and therefore simply picked the first solution with a positive term in srqt().
Now, the code uses additional validity checks and a simple heuristic to pick the best valid solution.

Additionally, the code messed up very shallow reflections even if the normal map strength was zero due to the constant
limit for the outgoing ray angle, which caused shallow incoming rays to fail the initial test even when reflected directly
on Ng. Now, the code accounts for this by reducing the threshold in the case of a shallow incoming ray, ensuring that at
least N=Ng is always a valid solution.

Reviewers: brecht

Differential Revision: https://developer.blender.org/D3816
2018-10-25 14:50:48 +02:00
Lukas Stockner
15e9d80375 Cycles: Use existing shared temporary memory in reconstruction step of the denoiser
Previously the code allocated its own temporary memory, but it's possible to just use the existing shared one instead.
2018-10-08 22:13:40 +02:00