Commit Graph

1445 Commits

Author SHA1 Message Date
Brecht Van Lommel
39ae324918 Cycles: remove extended precision hacks, no longer needed with SSE2 requirement.
Differential Revision: https://developer.blender.org/D2079
2016-07-04 18:22:11 +02:00
Brecht Van Lommel
8cc123a387 Fix T48783: OSL render errors after recent refactoring. 2016-07-03 13:08:21 +02:00
Campbell Barton
9f5621bb4a Cleanup: comment blocks 2016-07-02 10:08:33 +10:00
Thomas Dinges
5c249fac9a Fix Cycles OpenCL not taking Extend and Clip extension types into account.
(See T48720).
2016-07-01 23:48:31 +02:00
Sergey Sharybin
23cc453975 Fix T48732: New GGX breaks OpenCL kernel
Make sure we don't perform any implicit address space conversion.

A bit annoying, but less intrusive approaches (like using temp private
variable in .cl kernel) do not work correct here.

Using generic address space will help from code side here, but will
be somewhat slower due to extra things happening as far as i know.
2016-06-28 17:15:35 +05:00
Lukas Stockner
2a69b09b62 Fix T48732 v2: New GGX breaks OpenCL kernel
As far as I can see, the second issue there was that the functions receive a pointer to a member variable of the
ShaderData, which is stored in global memory. However, this means that the pointer points to global memory as well,
therefore OpenCL requires the ccl_addr_space "keyword" in front of the pointer.
With this commit, the OpenCL kernels build on Linux with the Intel CPU OpenCL runtime - however, they already did
without the change and I don't have an AMD card, so I can't really test whether the AMD runtime is happy as well now.
2016-06-26 00:51:16 +02:00
Thomas Dinges
99088f8b55 Fix T48732, OpenCL compile failure after Multiscatter GGX commit.
Use OpenCL "all" builtin type for conversion, according to OpenCL 1.1 spec 6.3e.
2016-06-25 11:14:06 +02:00
Lukas Stockner
23c276832b Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".

Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.

In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.

The downside of this model is that it has no (known) analytic expression for evalation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.

Reviewers: dingto, #cycles, brecht

Reviewed By: dingto, #cycles, brecht

Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel

Differential Revision: https://developer.blender.org/D2002
2016-06-23 22:57:26 +02:00
Lukas Stockner
028ba31903 Fix T48698: Rays from SSS act as diffuse for normal objects but have an undefined type for lamp objects
The problem here was that there are five path types internally (diffuse, glossy, transmission, subsurface and volume scatter), but subsurface isn't exposed to the user.
This caused some weird behaviour - if all four types are disabled on the lamp, Cycles doesn't even try sampling it, but if any type was active, the lamp would illuminate
the cube since none of the options set subsurface to zero.
In the future, it might be reasonable to add subsurface visibility as an option - but for now the weird and inconsistent behaviour can be fixed simply by setting both
diffuse and subsurface to zero if the user disables diffuse visibility.
2016-06-21 20:09:48 +02:00
Lukas Stockner
29ce3dfeb6 Fix T48691: Cycles - OpenCL - HDR Image mapping does not match CUDA rendering
The OpenCL texture code didn't offset the coordinates by half a pixel like the CPU code does.
2016-06-21 00:49:25 +02:00
Lukas Stockner
dfa7ddd4a8 Cycles: Add svm_util_color.h file to CMake
The file wasn't included in CMake and therefore not installed into the addon folder.
2016-06-20 19:07:22 +02:00
Alexander Gavrilov
f7bada00a7 Cycles: add constant folding for more color operation nodes.
Invert, brightness & constrast, separate/combine and Mix RGB blend modes
and clamping.
2016-06-19 20:17:28 +02:00
Alexander Gavrilov
98547e8817 Fix Cycles RGB and Vector Curves node Fac handling. 2016-06-19 20:17:27 +02:00
Brecht Van Lommel
e26eb9c93b Cycles: reduce CUDA stack memory access for Maxwell and up, increasing max registers.
For non-branched path tracing with a GTX 960 and CUDA 7.5, this gives a small reduction
in stack usage but mainly: 8% faster render on BMW, 5% on pabellon, 13% on classroom.
2016-06-19 20:17:26 +02:00
Thomas Dinges
6311a9ff23 Cycles: Support half and half4 textures.
This is an initial commit for half texture support in Cycles.
It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU.

Supported:
* Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO).

ToDo:
Various things like support for inbuilt half textures, GPU... will come later, step by step.

Part of my GSoC 2016.
2016-06-19 17:31:16 +02:00
Sergey Sharybin
7ac126e728 Fix T46492: GGX distribution produces black pixels
The issue was caused by some numerical instability.
2016-06-17 16:30:29 +02:00
Thomas Dinges
a3a7e46318 Cleanup: Remove outdated comment, visibility layers in kernel have been removed. 2016-06-14 16:46:44 +02:00
Brecht Van Lommel
ebdd2e0b6d Cycles: make shader node enums consistently lower case, update OSL shaders accordingly. 2016-06-11 23:50:11 +02:00
Lukas Stockner
654019fa01 Cycles: Fix two numerical issues in the volume code
This hopefully fixes T48383 by avoiding two numerical problems that I found in the volume code.

Reviewers: sergey, dingto, brecht

Reviewed By: sergey, dingto, brecht

Maniphest Tasks: T48383

Differential Revision: https://developer.blender.org/D2051
2016-06-08 03:17:19 +02:00
Sergey Sharybin
b595a692c8 Cycles: Limit degenerated triangle check got CUDA only
OpenCL seems to work fine here, and for some reason that comparison was
giving compilation error on OpenCL here.

Better to compile OpenCL kernel than to be fully robust to weird corner
cases.
2016-06-07 15:48:56 +02:00
Lukas Stockner
7a5a02509b Cycles: Use faster ray-quad-intersection test
The original quad intersection test works by just testing against the two triangles that define the quad.
However, in this case it's actually faster to use the same test that's also used for portals: Determining
the distance to the plane in which the quad lies, calculating the hitpoint and checking whether it's in the
quad by projecting onto the sides.

Reviewers: brecht, sergey, dingto

Reviewed By: dingto

Differential Revision: https://developer.blender.org/D2045
2016-06-06 23:38:50 +02:00
Sergey Sharybin
14f9a5aa1d Fix T48571: Cycles/GPU - A lot of fireflies on SSS+Volume
Was some accumulated precision error happening.
2016-06-06 15:56:22 +02:00
Martijn Berger
50f432b1e0 CMake, minor changes to make Visual studio 2015 use a compatible numpy and
the standard cmake CUDA/NVCC arguments flag allowing 2015 build to use
msvc 2013 for cuda
2016-06-04 11:42:48 +02:00
Sergey Sharybin
f54a98a1c5 Cycles: Simplify check for degenerated faces on GPU
Still not sure how to properly solve the issue, needs some trickery to get
actual optimized values from intersection function (using printf() avoids
some optimization and makes stuff render correct).

For the time being let's just simplify check.
2016-06-03 10:36:04 +02:00
Sergey Sharybin
d2bb0e660b Fix T46207: Slow OpenCL GPU bake and blown out baking Cycles render 2016-05-31 17:48:42 +02:00
Sergey Sharybin
a2aa44370b Fix T48556: Missing transparent shadows on AMD OpenCL
We had transparent shadows disabled for some time because they were causing
drivers to crash. Can't reproduce that issue anymore with current drivers,
so will enable them and see how it goes.
2016-05-31 11:48:07 +02:00
Brecht Van Lommel
9bd2820aaf Code refactor: add separate RGB to BW node and rename some sockets. 2016-05-29 20:30:16 +02:00
Thomas Dinges
2ee063868d Cleanup: Shorten texture variables, tex and image was kinda redundant.
Also make prefix consistent, so it starts with either TEX_NUM or TEX_START, followed by texture type and architecture.
2016-05-27 22:58:33 +02:00
Sergey Sharybin
7dae62cde0 Cycles: Simplify code around debug stats in BVH traversing 2016-05-27 10:55:48 +02:00
Brecht Van Lommel
f2ba13964d Fix T48514: Cycles toon glossy BSDF not respecting reflective caustics option. 2016-05-25 21:13:24 +02:00
Brecht Van Lommel
b49185df99 Cycles CUDA: reduce branched path stack memory by sharing indirect ShaderData.
Saves about 15% for the branched path kernel.
2016-05-25 21:13:24 +02:00
Sergey Sharybin
42b26206c6 Fix T48508: Cycles Regression / Crash 2016-05-24 14:53:34 +02:00
Brecht Van Lommel
999d5a6785 Cycles CUDA: reduce stack memory by reusing ShaderData.
57% less for path and 48% less for branched path.
2016-05-23 22:29:24 +02:00
Sergey Sharybin
2aa4b6045a Cycles: Fix wrong closure counter in feature adaptive kernel
Some closures were missing from calculation, leading to an array
under-allocation, presumable causing memory corruption issues with
emission shaders on OpenCL and was causing issues with Volume 3D
textures with CUDA.

The issue was identified by Thomas Dinges, the patch is different
from the original D2006. See the brief discussion there. Current
approach is similar (or the same) as Brecht suggested.
2016-05-23 14:09:27 +02:00
Brecht Van Lommel
ca03eddfcc Cleanup: remove Cycles layer bits checking in the kernel.
At some point the idea was that we could have an optimization where we could
render multiple render layers without re-exporting the scene, by just updating
the layer bits. We are not doing this now and in practice with the available
render layer control like exclude layers it's not always possible anyway.

This makes it easier to support an arbitrary number of layers in the future
(hopefully this summer), and frees up some useful bits in the kernel.

Reviewed By: sergey, dingto

Differential Revision: https://developer.blender.org/D2020
2016-05-22 17:36:38 +02:00
Brecht Van Lommel
ec51175f1f Code refactor: add generic Cycles node infrastructure.
Differential Revision: https://developer.blender.org/D2016
2016-05-22 17:29:24 +02:00
Thomas Dinges
a5a05fc291 Cycles: Fix long compile time with MSVC.
Compile time per kernel increased alot after recent image commits, re-shuffle some code to fix this.

Patch by "LazyDodo".

Differential Revision: https://developer.blender.org/D2012
2016-05-20 16:50:29 +02:00
Thomas Dinges
c9f1ed1e4c Cycles: Add support for bindless textures.
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (Geforce 6xx and above).
This is used for all 2D/3D textures, data still uses arrays as before.

User benefits:
* No more limits of image textures on Kepler.
 We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4.
 This can be extended further if we need to (just change the define).

* Single channel textures slots (byte and float) are now supported on Kepler as well (1024 slots for each type).

ToDo / Issues:
* 3D textures don't work yet, at least don't show up during render. I have no idea whats wrong yet.
* Dynamically allocate bindless_mapping array?

I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.

Part of my GSoC 2016.

Reviewers: sergey, #cycles, brecht

Subscribers: swerner, jtheninja, brecht, sergey

Differential Revision: https://developer.blender.org/D1999
2016-05-19 13:14:37 +02:00
Sergey Sharybin
f74c7fcca2 Fix T47727: Weird bake results with non integer color values 2016-05-18 15:11:05 +02:00
Sergey Sharybin
792e147e2c Cycles: Fix compilation error of CUDA kernels after recent volume commit
Apparently the code path with malloc() was enabled for CUDA.
2016-05-18 11:15:28 +02:00
Sergey Sharybin
cbe7f9dd03 Cycles: Pole merging for spherical stereo
The idea of pole merge is to fade interocular distance after a certain
altitude to zero when altitude goes closer to a pole. This should prevent
annoyances looking up in the sky or down to the bottom.

Works for both panorama and perspective cameras when Spherical Stereo
is enabled.

Reviewers: dfelinto, brecht

Reviewed By: brecht

Subscribers: sebastian_k

Differential Revision: https://developer.blender.org/D1998
2016-05-18 10:56:57 +02:00
Sergey Sharybin
7b356a8565 Cycles: Reduce amount of malloc() calls from the kernel
This commit makes it so malloc() is only happening once per volume and
once per transparent shadow query (per thread), improving scalability of
the code to multiple CPU cores.

Hard to measure this with a low-bottom i7 here currently, but from quick
tests seems volume sampling gave about 3-5% speedup.

The idea is to store allocated memory in kernel globals, which are per
thread on CPU already.

Reviewers: dingto, juicyfruit, lukasstockner97, maiself, brecht

Reviewed By: brecht

Subscribers: Blendify, nutel

Differential Revision: https://developer.blender.org/D1996
2016-05-18 10:14:24 +02:00
Brecht Van Lommel
08670d3b81 Code refactor: use dynamic shader node array lengths now that OSL supports them. 2016-05-17 21:39:16 +02:00
Thomas Dinges
3c85e1ca1a Cycles: Add support for single channel byte textures.
This way, we also save 3/4th of memory for single channel byte textures (e.g. Bump Maps).

Note: In order for this to work, the texture *must* have 1 channel only.
In Gimp you can e.g. do that via the menu: Image -> Mode -> Grayscale
2016-05-12 14:51:42 +02:00
Thomas Dinges
16ce1b78b0 Cleanup: Remove outdated comment and add new one about slot IDs. 2016-05-11 22:25:48 +02:00
Thomas Dinges
4a4f043bc4 Cycles: Add support for single channel float textures on CPU.
Until now, single channel textures were packed into a float4, wasting 3 floats per pixel. Memory usage of such textures is now reduced by 3/4.
Voxel Attributes such as density, flame and heat benefit from this, but also Bumpmaps with one channel.
This commit also includes some cleanup and code deduplication for image loading.

Example Smoke render from Cosmos Laundromat: http://www.pasteall.org/pic/show.php?id=102972
Memory here went down from ~600MB to ~300MB.

Reviewers: #cycles, brecht

Differential Revision: https://developer.blender.org/D1981
2016-05-11 21:58:34 +02:00
Thomas Dinges
76481eaeff Cycles: Add support for float4 textures on OpenCL.
Title says it all, this adds OpenCL float4 texture support.

There is a bug in the code still, I get a "Out of ressources error" on nvidia hardware here, not sure whats wrong yet.
Will investigate further, but maybe someone else has an idea. :)

Reviewers: #cycles, brecht

Subscribers: brecht, candreacchio

Differential Revision: https://developer.blender.org/D1983
2016-05-10 02:53:50 +02:00
Sergey Sharybin
f616caa315 CMake: Fix compilation error when toolkit gives empty result
Should we also check whether toolkit exist perhaps?
2016-05-09 16:05:02 +02:00
Thomas Dinges
d6555d936c Cleanup: Avoid duplicative defines for CPU textures, use the ones from util_texture.h
Also includes some further byte -> byte4 renaming, missed that in last commit.
2016-05-09 09:16:41 +02:00
Thomas Dinges
9a1e11260c Cleanup: More byte -> byte4 renaming for consistency. 2016-05-09 02:22:01 +02:00