Commit Graph

2191 Commits

Author SHA1 Message Date
Jeroen Bakker
15edae617f Merge branch 'blender2.7' 2019-02-26 14:07:57 +01:00
Jeroen Bakker
e6099c7e46 T61576: Do Not (Re-)Compile OpenCL kernels
The goal of this patch is to have limit the number of times
kernels needs to be compiled and are reused as kernels with
different compile directives can lead to identical same
binaries.

The implementation does this by stripping the compile directives.
and reshuffling kernels so the output is more likely to be the
same.

We focussed on the kernels where it was easy to detect and maintain
(bundle, bake, displace, do_volume and background). More optimizations
could be done but they are probably less obvious.

Merged the data_init and state_buffer_size kernels to split_bundle.

This patch will also remove empty kernels for do_volume and bake
when their features are not enabled.

When using the benchmark files there are less background, bake and
do_volume kernels compiled.

Fix: T61576, T61501, T61466

Reviewed By: brecht, #cycles

Differential Revision: https://developer.blender.org/D4390
2019-02-26 12:45:26 +01:00
Sergey Sharybin
8986c92b65 Merge branch 'blender2.7' 2019-02-21 15:33:07 +01:00
Jeroen Bakker
a51d08f473 Fix: Missing closing brackets in include 2019-02-21 14:36:51 +01:00
Jeroen Bakker
fab6c5040d Fix: OpenCL Displacement and light sampling
The bake kernels are also used during mesh displacement and light
importance sampling. We disabled the implementation of these kernels
when baking was not enabled.
2019-02-21 08:11:02 +01:00
Sergey Sharybin
9e4d561a8b Merge branch 'blender2.7' 2019-02-20 23:20:43 +01:00
Sergey Sharybin
ccd291aafb Cycles: Fix uninitialized number of hits
Was happening when looking for all intersections for transparent shadow rays
in the case the ray is degenerate.

Still quesitonable whether we should consider this a transparent or opaque
configuraiton. Ideally, we should prevent such rays from happening, but that
is another vector of debugging.
2019-02-20 23:20:07 +01:00
Ray Molenkamp
4ec6b16b4e cycles/opencl: Fix compile error.
added missing quote, introduced in rB15edda3a8e07003bef695cca939744bbea80ad18
2019-02-20 11:44:06 -07:00
Jeroen Bakker
8a4cdda373 Merge branch 'blender2.7' 2019-02-20 15:22:23 +01:00
Jeroen Bakker
949ab753bb Cycles OpenCL: Remove OpenCL MegaKernel
Using OpenCL MegaKernel has been slow and therefore not usefull.
This patch will remove the mega kernel from the OpenCL codebase
and the OpenCLDeviceBase class.

T61736: removal of mega kernel
T61703: baking does not work with mega kernel

Tags: #cycles

Differential Revision: https://developer.blender.org/D4383
2019-02-20 15:17:22 +01:00
Jeroen Bakker
667033e89e T61463: Separate Baking kernels
Cycles OpenCL: Split baking kernels in own program

Fix T61463. Before this patch baking was part of the base kernels. There
are 3 baking kernels that and all 3 uses shader evaluation. Only for one
of these kernels the functionality was wrapped in the __NO_BAKING__
compile directive.

When you start baking this leads to long compile times. By separating
in individual programs will reduce the compile times.

Also wrapped all baking kernels with __NO_BAKING__ to reduce the
compilation times.

Impact on compilation time

    job   |   scene_name    | previous |  new  | percentage
  --------+-----------------+----------+-------+------------
   T61463 | empty           |    10.63 |  7.27 |         32%
   T61463 | bmw             |    17.91 | 14.24 |         20%
   T61463 | fishycat        |    19.57 | 15.08 |         23%
   T61463 | barbershop      |    54.10 | 48.18 |         11%
   T61463 | classroom       |    17.55 | 14.42 |         18%
   T61463 | koro            |    18.92 | 17.15 |          9%
   T61463 | pavillion       |    17.43 | 14.23 |         18%
   T61463 | splash279       |    16.48 | 15.33 |          7%
   T61463 | volume_emission |    36.22 | 34.19 |          6%

Impact on render time

    job   |   scene_name    | previous |   new   | percentage
  --------+-----------------+----------+---------+------------
   T61463 | empty           |    21.06 |   20.54 |          2%
   T61463 | bmw             |   198.44 |  189.59 |          4%
   T61463 | fishycat        |   394.20 |  388.50 |          1%
   T61463 | barbershop      |  1188.16 | 1185.49 |          0%
   T61463 | classroom       |   341.08 |  339.27 |          1%
   T61463 | koro            |   472.43 |  360.70 |         24%
   T61463 | pavillion       |   905.77 |  902.14 |          0%
   T61463 | splash279       |    55.26 |   54.92 |          1%
   T61463 | volume_emission |    62.59 |   39.09 |         38%

I don't have a grounded explanation why koro and volume_emission is this much
faster; I have done several tests though...

Maniphest Tasks: T61463

Differential Revision: https://developer.blender.org/D4376
2019-02-19 16:34:55 +01:00
Jeroen Bakker
15edda3a8e T61463: Separate Baking kernels
Cycles OpenCL: Split baking kernels in own program

Fix T61463. Before this patch baking was part of the base kernels. There
are 3 baking kernels that and all 3 uses shader evaluation. Only for one
of these kernels the functionality was wrapped in the __NO_BAKING__
compile directive.

When you start baking this leads to long compile times. By separating
in individual programs will reduce the compile times.

Also wrapped all baking kernels with __NO_BAKING__ to reduce the
compilation times.

Impact on compilation time

    job   |   scene_name    | previous |  new  | percentage
  --------+-----------------+----------+-------+------------
   T61463 | empty           |    10.63 |  7.27 |         32%
   T61463 | bmw             |    17.91 | 14.24 |         20%
   T61463 | fishycat        |    19.57 | 15.08 |         23%
   T61463 | barbershop      |    54.10 | 48.18 |         11%
   T61463 | classroom       |    17.55 | 14.42 |         18%
   T61463 | koro            |    18.92 | 17.15 |          9%
   T61463 | pavillion       |    17.43 | 14.23 |         18%
   T61463 | splash279       |    16.48 | 15.33 |          7%
   T61463 | volume_emission |    36.22 | 34.19 |          6%

Impact on render time

    job   |   scene_name    | previous |   new   | percentage
  --------+-----------------+----------+---------+------------
   T61463 | empty           |    21.06 |   20.54 |          2%
   T61463 | bmw             |   198.44 |  189.59 |          4%
   T61463 | fishycat        |   394.20 |  388.50 |          1%
   T61463 | barbershop      |  1188.16 | 1185.49 |          0%
   T61463 | classroom       |   341.08 |  339.27 |          1%
   T61463 | koro            |   472.43 |  360.70 |         24%
   T61463 | pavillion       |   905.77 |  902.14 |          0%
   T61463 | splash279       |    55.26 |   54.92 |          1%
   T61463 | volume_emission |    62.59 |   39.09 |         38%

I don't have a grounded explanation why koro and volume_emission is this much
faster; I have done several tests though...

Maniphest Tasks: T61463

Differential Revision: https://developer.blender.org/D4376
2019-02-19 16:33:50 +01:00
Jeroen Bakker
e6f5632eb1 T61513: Refactored Cycles Attribute Retrieval
There is a generic function to retrieve float and float3 attributes
`primitive_attribute_float` and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.

Actually the calling function most of the time already knows where
the data is stored. So we could simplify this by splitting these
functions and remove the check logic.

This patch splits the `primitive_attribute_float?` functions into
`primitive_surface_attribute_float?` and `primitive_volume_attribute_float?`.
What leads to less branching and more optimum kernels.

The original function is still being used by OSL and `svm_node_attr`.

This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.

Impact in compilation times

    job  |   scene_name    | previous |  new  | percentage
  -------+-----------------+----------+-------+------------
  t61513 | empty           |    10.63 | 10.66 |          0%
  t61513 | bmw             |    17.91 | 17.65 |          1%
  t61513 | fishycat        |    19.57 | 17.68 |         10%
  t61513 | barbershop      |    54.10 | 24.41 |         55%
  t61513 | classroom       |    17.55 | 16.29 |          7%
  t61513 | koro            |    18.92 | 18.05 |          5%
  t61513 | pavillion       |    17.43 | 16.52 |          5%
  t61513 | splash279       |    16.48 | 14.91 |         10%
  t61513 | volume_emission |    36.22 | 21.60 |         40%

Impact in render times

    job  |   scene_name    | previous |  new   | percentage
  -------+-----------------+----------+--------+------------
  61513 | empty           |    21.06 |  20.35 |          3%
  61513 | bmw             |   198.44 | 190.05 |          4%
  61513 | fishycat        |   394.20 | 401.25 |         -2%
  61513 | barbershop      |  1188.16 | 912.39 |         23%
  61513 | classroom       |   341.08 | 340.38 |          0%
  61513 | koro            |   472.43 | 471.80 |          0%
  61513 | pavillion       |   905.77 | 899.80 |          1%
  61513 | splash279       |    55.26 |  54.86 |          1%
  61513 | volume_emission |    62.59 |  61.70 |          1%

There is also a possitive impact when using CPU and CUDA, but they are small.

I didn't split the hair logic from the surface logic due to:

* Hair and surface use same attribute types. It was not clear if it could be
  splitted when looking at the code only.
* Hair and surface are quick to compile and to read. So the benefit is quite
  small.

Differential Revision: https://developer.blender.org/D4375
2019-02-19 16:28:25 +01:00
Jeroen Bakker
d6d306441f T61513: Refactored Cycles Attribute Retrieval
There is a generic function to retrieve float and float3 attributes
`primitive_attribute_float` and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.

Actually the calling function most of the time already knows where
the data is stored. So we could simplify this by splitting these
functions and remove the check logic.

This patch splits the `primitive_attribute_float?` functions into
`primitive_surface_attribute_float?` and `primitive_volume_attribute_float?`.
What leads to less branching and more optimum kernels.

The original function is still being used by OSL and `svm_node_attr`.

This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.

Impact in compilation times

    job  |   scene_name    | previous |  new  | percentage
  -------+-----------------+----------+-------+------------
  t61513 | empty           |    10.63 | 10.66 |          0%
  t61513 | bmw             |    17.91 | 17.65 |          1%
  t61513 | fishycat        |    19.57 | 17.68 |         10%
  t61513 | barbershop      |    54.10 | 24.41 |         55%
  t61513 | classroom       |    17.55 | 16.29 |          7%
  t61513 | koro            |    18.92 | 18.05 |          5%
  t61513 | pavillion       |    17.43 | 16.52 |          5%
  t61513 | splash279       |    16.48 | 14.91 |         10%
  t61513 | volume_emission |    36.22 | 21.60 |         40%

Impact in render times

    job  |   scene_name    | previous |  new   | percentage
  -------+-----------------+----------+--------+------------
  61513 | empty           |    21.06 |  20.35 |          3%
  61513 | bmw             |   198.44 | 190.05 |          4%
  61513 | fishycat        |   394.20 | 401.25 |         -2%
  61513 | barbershop      |  1188.16 | 912.39 |         23%
  61513 | classroom       |   341.08 | 340.38 |          0%
  61513 | koro            |   472.43 | 471.80 |          0%
  61513 | pavillion       |   905.77 | 899.80 |          1%
  61513 | splash279       |    55.26 |  54.86 |          1%
  61513 | volume_emission |    62.59 |  61.70 |          1%

There is also a possitive impact when using CPU and CUDA, but they are small.

I didn't split the hair logic from the surface logic due to:

* Hair and surface use same attribute types. It was not clear if it could be
  splitted when looking at the code only.
* Hair and surface are quick to compile and to read. So the benefit is quite
  small.

Differential Revision: https://developer.blender.org/D4375
2019-02-19 16:25:48 +01:00
Brecht Van Lommel
9800837b98 Cycles: Support multithreaded compilation of kernels
This patch implements a workaround to get the multithreaded compilation from D2231 working.
So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function.
Depends on D2231.

Patch by lukasstockner97, jbakker, brecht

    job    |   scene_name    | compilation_time
----------+-----------------+------------------
    Baseline | empty           |            22.73
    D2264    | empty           |            13.94
    Baseline | bmw             |            56.44
    D2264    | bmw             |            41.32
    Baseline | fishycat        |            59.50
    D2264    | fishycat        |            45.19
    Baseline | barbershop      |           212.28
    D2264    | barbershop      |           169.81
    Baseline | victor          |            67.51
    D2264    | victor          |            53.60
    Baseline | classroom       |            51.46
    D2264    | classroom       |            39.02
    Baseline | koro            |            62.48
    D2264    | koro            |            49.03
    Baseline | pavillion       |            54.37
    D2264    | pavillion       |            38.82
    Baseline | splash279       |            47.43
    D2264    | splash279       |            37.94
    Baseline | volume_emission |           145.22
    D2264    | volume_emission |           121.10

This patch reduced compilation time as the split kernels and base
kernels are compiled in parallel. In cycles debug mode (256) you can set
unmark the opencl single program file, what reduces the compilation time
even further (bmw 17 seconds, barbershop 53 seconds).

Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97

Reviewed By: brecht

Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli

Differential Revision: https://developer.blender.org/D2264
2019-02-15 08:56:20 +01:00
Brecht Van Lommel
4ce9785e01 Cycles: Support multithreaded compilation of kernels
This patch implements a workaround to get the multithreaded compilation from D2231 working.
So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function.
Depends on D2231.

Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97

Reviewed By: brecht

Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli

Differential Revision: https://developer.blender.org/D2264
2019-02-15 08:49:25 +01:00
Brecht Van Lommel
7a41c1634b Merge branch 'blender2.7' 2019-02-14 20:00:37 +01:00
Brecht Van Lommel
de0e456a6c Cleanup: fix compiler warnings. 2019-02-14 19:39:39 +01:00
Brecht Van Lommel
9886ae6331 Fix T61470: incorrect saturation clamping in recent bugfix.
We should clamp the result after multiplication.
2019-02-14 19:28:44 +01:00
Brecht Van Lommel
dbd9b7590a Merge branch 'blender2.7' 2019-02-13 19:02:43 +01:00
Brecht Van Lommel
ec559912fb Fix T61470: inconsistent HSV node results with saturation > 1.0.
Values outside the 0..1 range produce negative colors, so now clamp to that
range everywhere. Also fixes improper handling of hue > 2.0 in some places.
2019-02-13 17:06:30 +01:00
Brecht Van Lommel
e21ae0bb26 Merge branch 'blender2.7' 2019-02-06 15:22:53 +01:00
Lukas Stockner
fccf506ed7 Cycles: animation denoising support in the kernel.
This is the internal implementation, not available from the API or
interface yet. The algorithm takes into account past and future frames,
both to get more coherent animation and reduce noise.

Ref D3889.
2019-02-06 15:18:42 +01:00
Lukas Stockner
c183ac73dc Cycles: tweak outlier detection, preparing for animation denoising.
Ref D3889.
2019-02-06 15:18:38 +01:00
Lukas Stockner
405cacd4cd Cycles: prefilter feature passes separate from denoising.
Prefiltering of feature passes will happen during rendering, which can
then be used for denoising immediately or written as a render pass for
later (animation) denoising.

The number of denoising data passes written is reduced because of this,
leaving out the feature variance passes. The passes are now Normal,
Albedo, Depth, Shadowing, Variance and Intensity.

Ref D3889.
2019-02-06 15:18:29 +01:00
Campbell Barton
8c68ed6df1 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 02:40:00 +11:00
Campbell Barton
65ec7ec524 Cleanup: remove redundant, invalid info from headers
BF-admins agree to remove header information that isn't useful,
to reduce noise.

- BEGIN/END license blocks

  Developers should add non license comments as separate comment blocks.
  No need for separator text.

- Contributors

  This is often invalid, outdated or misleading
  especially when splitting files.

  It's more useful to git-blame to find out who has developed the code.

See P901 for script to perform these edits.
2019-02-02 01:36:28 +11:00
Brecht Van Lommel
409a21b32e Merge branch 'blender2.7' 2019-01-28 12:05:51 +01:00
Brecht Van Lommel
d918217d35 OSL: remove fresnel template that was not public domain.
Convention is to only have public domain code templates. Also fixes wrong
license header in Cycles.
2019-01-28 12:04:54 +01:00
Brecht Van Lommel
728f43d585 Merge branch 'blender2.7' 2019-01-14 12:13:10 +01:00
Brecht Van Lommel
10fa3b790f Fix T60450: Cycles broken GPU denoising after recent changes. 2019-01-14 11:42:38 +01:00
Sergey Sharybin
cca35c1013 Merge branch 'blender2.7' 2019-01-11 15:09:46 +01:00
Brecht Van Lommel
e5a1a9288c Fix T60320: Cycles OpenCL denoising filter errors on some drivers. 2019-01-11 11:25:37 +01:00
Sergey Sharybin
ddabad2410 Merge branch 'blender2.7' 2019-01-09 12:56:50 +01:00
Brecht Van Lommel
b486088218 Fix T60320: Cycles OpenCL volume rendering error on some drivers. 2019-01-08 15:59:10 +01:00
Brecht Van Lommel
8491dba0c6 Fix T60300: Cycles SSS render hanging with AMD OpenCL. 2019-01-08 15:37:16 +01:00
Brecht Van Lommel
88e45b8c99 Merge branch 'blender2.7' 2019-01-03 18:32:51 +01:00
Brecht Van Lommel
fffdedbcc1 Fix T54962: Cycles crash using subsurface scattering texture blur. 2019-01-03 17:10:37 +01:00
Brecht Van Lommel
017573495e Merge branch 'blender2.7' 2019-01-02 19:58:26 +01:00
Brecht Van Lommel
f7e9642da9 Fix T60061: Cycles OSL point density not working.
Add override keywords so we can detect when the function definitions change.
2019-01-02 19:56:49 +01:00
Brecht Van Lommel
5216dd5fce Merge branch 'blender2.7' 2018-12-27 10:53:02 +01:00
Brecht Van Lommel
8e331c3431 Fix T59565: NaN/crash with zero radius tip of hair curves. 2018-12-21 18:54:45 +01:00
Brecht Van Lommel
f60018e425 Merge branch 'master' into blender2.8 2018-12-11 15:18:43 +01:00
Brecht Van Lommel
765795aed7 Fix macOS buildbot build, wrong CUDA version check. 2018-12-11 14:16:48 +01:00
Campbell Barton
606223f6a6 Merge branch 'master' into blender2.8 2018-12-07 15:54:17 +11:00
Brecht Van Lommel
cccc40db51 Fix T57963: Cycles crash using AO for displacement.
Note this is not supported, there exists no geometry at this point, but
it should not crash at least.
2018-12-06 19:50:05 +01:00
Brecht Van Lommel
4a21345665 Merge branch 'master' into blender2.8 2018-12-05 12:54:45 +01:00
Brecht Van Lommel
f5b46daf52 Fix build with old CMake versions. 2018-12-05 12:53:19 +01:00
Brecht Van Lommel
b9b88d59dd Merge branch 'master' into blender2.8 2018-12-04 16:35:16 +01:00
Brecht Van Lommel
f63da3dcf5 Buildbot: enable support for NVIDIA Turing cards in Cycles (like GTX 20xx).
We currently only build the sm_7x kernels with CUDA 10.0, older cards still
use 9.1 until rendering errors are solved for them.
2018-12-04 16:03:18 +01:00