Commit Graph

1993 Commits

Author SHA1 Message Date
Brecht Van Lommel
37d9e65ddf Code cleanup: abstract shadow catcher logic more into accumulation code. 2017-09-13 15:24:14 +02:00
Brecht Van Lommel
f77cdd1d59 Code cleanup: deduplicate some branched and split kernel code.
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
2017-09-13 15:24:14 +02:00
Brecht Van Lommel
c4c450045d Code cleanup: tweak inlining for 2% better CUDA performance with hair. 2017-09-13 15:24:14 +02:00
Mathieu Menuet
659ba012b0 Cycles: change AO bounces approximation to do more glossy and transmission.
Rather than treating all ray types equally, we now always render 1 glossy
bounce and unlimited transmission bounces. This makes it possible to get
good looking results with low AO bounces settings, making it useful to
speed up interior renders for example.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2818
2017-09-12 15:37:35 +02:00
Brecht Van Lommel
de6ecc82ed Fix rare firefly in volume equiangular sampling when sampling short distance. 2017-09-12 12:50:44 +02:00
Brecht Van Lommel
cd6c9e9e5f Cycles: improve sample stratification on area lights for path tracing.
Previously we used a 1D sequence to select a light, and another 2D sequence
to sample a point on the light. For multiple lights this meant each light
would get a random subset of a 2D stratified sequence, which is not
guaranteed to be stratified anymore.

Now we use only a 2D sequence, split into segments along the X axis, one for
each light. The samples that fall within a segment then each are a stratified
sequence, at least in the limit. So for example for two lights, we split up
the unit square into two segments [0,0.5[ x [0,1[ and [0.5,1[ x [0,1[.

This doesn't make much difference in most scenes, mainly helps if you have a
few large area lights or some types of HDR backgrounds.
2017-09-12 12:45:29 +02:00
Brecht Van Lommel
d454a44e96 Fix Cycles bug in RR termination, probability should never be > 1.0.
This causes render differences in some scenes, for example fishy_cat
and pabellon scenes render brighter in a few spots. This is an old
bug, not due to recent RR changes.
2017-09-12 12:43:26 +02:00
Sergey Sharybin
467d92b8f1 Cycles: Tweaks to avoid compilation error of megakernel
Also moved code out of deep-inside ifdef block, otherwise it was quite confusing.
2017-09-12 13:33:46 +05:00
Sergey Sharybin
ae41a08288 Cycles: Attempt to work around compilation of sm_20 and sm_21
Disabled forceinline for those architectures, which seems to be compiling
successfully more often.

There might be ~3% slowdown based on quick tests, but better be rendering
something rather than failing to compile kernels again and again.

Those architectures will be doomed for abandon once we'll switch to toolkit 9.
2017-09-08 18:37:54 +02:00
Brecht Van Lommel
ce1f2e271d Cycles: disable fast math flags, only use a subset.
Empty BVH nodes are set to NaN which must be preserved all the way to the
tnear <= tfar test which can then give false for empty nodes. This needs
strict semantices and careful argument ordering for min() and max(), so
the second argument is used if either of the arguments is NaN.

Fixes T52635: crash in BVH traversal with SSE4.1.

Differential Revision: https://developer.blender.org/D2828
2017-09-08 15:12:37 +02:00
Brecht Van Lommel
c10ea88420 Fix T52660: CUDA volume texture rendering not working on Fermi GPUs. 2017-09-06 18:12:45 +02:00
Brecht Van Lommel
2d407fc288 Fix T52661: mesh light shader using backfacing not working, after new sampling. 2017-09-06 13:51:48 +02:00
Brecht Van Lommel
dd8016f708 Fix T52652: Cycles image box mapping has flipped textures.
This breaks backwards compatibility some in that 3 sides will be mapped
differently now, but difficult to avoid and can be considered a bugfix.
2017-09-06 13:51:45 +02:00
Sergey Sharybin
750e38a526 Cycles: Fix compilation error with CUDA after recent changes 2017-09-05 16:52:45 +02:00
Sergey Sharybin
f01e43fac3 Fix T52433: Volume Absorption color tint
Need to exit the volume stack when shadow ray laves the medium.

Thanks Brecht for review and help in troubleshooting!
2017-09-05 15:48:34 +02:00
Sergey Sharybin
b0bbb5f34f Cycles: Cleanup, style 2017-09-05 12:43:02 +02:00
Sergey Sharybin
018137f762 Cycles: Cleanup, indentation and trailing whitespace 2017-08-31 14:47:49 +02:00
Stefan Werner
68dfa0f1b7 Fixing T52477 - switching from custom ray/triangle intersection code to the one from util_intersection.h. This fixes the bug and makes the code more readable and maintainable. 2017-08-30 11:48:49 +02:00
Lukas Stockner
f9a3d01452 Cycles: Mark pixels with negative values as outliers
If a pixel has negative components, something already went wrong, so the best option is to just ignore it.

Should be good for 2.79.
2017-08-25 17:46:15 +02:00
Brecht Van Lommel
76b74a93a8 Fix Cycles CUDA transparent shadow error after recent fix in c22b52c.
Fishy cat benchmark was rendering with wrong shadows. Cause is unclear,
adding printf or rearranging code seems to avoid this issue, possibly a
compiler bug. This reverts the fix and solves the OSL bug elsewhere.
2017-08-24 03:43:02 +02:00
Brecht Van Lommel
b85d36d811 Code cleanup: remove shader context.
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
2017-08-24 03:43:02 +02:00
Brecht Van Lommel
049932c4c3 Fix panorama render crash with split kernel, due to incorrect buffer pointer.
Also some refactoring to clarify variable usage scope.
2017-08-22 00:41:07 +02:00
Brecht Van Lommel
1d1ddd48db Fix T52470: cycles OpenCL hair rendering not working after recent changes. 2017-08-20 23:32:20 +02:00
Brecht Van Lommel
b5f8063fb9 Cycles: support baking normals plugged into BSDFs, averaged with closure weight. 2017-08-20 16:51:53 +02:00
Brecht Van Lommel
c22b52cd36 Fix T52452: OSL trace broken after shadow catcher recent changes.
We should only early out with any hit in BVH traversal if the only visibility
bits used are opaque shadow. Not when opaque shadow is one of multiple bits.
2017-08-19 18:14:16 +02:00
Brecht Van Lommel
cfa8b762e2 Code cleanup: move rng into path state.
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
2017-08-19 18:14:16 +02:00
Stefan Werner
7a4696197d Cycles: Fix for a division by zero that could happen with solid angle triangle light sampling 2017-08-17 15:07:59 +02:00
Stefan Werner
8141eac2f8 Improved triangle sampling for mesh lights
This implements Arvo's "Stratified sampling of spherical triangles". Similar to how we sample rectangular area lights, this is sampling triangles over their solid angle. It does significantly improve sampling close to the triangle, but doesn't do much for more distant triangles. So I added a simple heuristic to switch between the two methods. Unfortunately, I expect this to add render time in any case, even when it does not make any difference whatsoever. It'll take some benchmarking with various scenes and hardware to estimate how severe the impact is and if it is worth the change.

Reviewers: #cycles, brecht

Reviewed By: #cycles, brecht

Subscribers: Vega-core, brecht, SteffenD

Tags: #cycles

Differential Revision: https://developer.blender.org/D2730
2017-08-17 12:44:32 +02:00
Brecht Van Lommel
dc7fcebb33 Code cleanup: make L_transparent part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
7542282c06 Code cleanup: make DebugData part of PathRadiance. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
fce405059f Code cleanup: make it easier to test only Sobol, CMJ or Pseudorandom. 2017-08-13 01:19:07 +02:00
Brecht Van Lommel
8f97108353 Cycles: optimize CPU split kernel data init. 2017-08-12 20:43:34 +02:00
Brecht Van Lommel
601f94a3c2 Code cleanup: remove unused Cycles random number code. 2017-08-12 20:40:38 +02:00
Brecht Van Lommel
85ad248c36 Code cleanup: fix warning and improve terminology. 2017-08-12 13:18:05 +02:00
Sergey Sharybin
2e25754ecd Cycles: Clarify new argument in PathRadiance 2017-08-11 13:49:50 +02:00
Sergey Sharybin
bd069a89aa Fix T52229: Shadow Catcher artifacts when under transparency
Added some extra tirckery to avoid background being tinted dark with transparent
surface. Maybe a bit hacky, but seems to work fine.
2017-08-11 13:49:50 +02:00
Sergey Sharybin
176ad9ecdd Cycles: Remove ulong usage
This is a bit confusing, especially when one mixes OpenCL code where ulong equals
to uint64_t with CPU side code where ulong is expected to be something else from
the naming.

This commit makes it so we use explicit name, common on all platforms.
2017-08-09 14:08:58 +02:00
Sergey Sharybin
c961737d0f Cycles: Fix compilation error of filter kernels on 32 bit Windows
We don't enable global SSE optimizations in regular kernel, and we
keep those disabled on Linux 32bit.

One possible workaround would be to pass arguments by ccl_ref, but
that is quite a few of code which better be done accurately.
2017-08-08 22:01:17 +02:00
Sergey Sharybin
19d19add1e Cycles: Cleanup, de-duplicate function parameter list
Was only needed to sue const reference on CPU. Now it is done using ccl_ref.
2017-08-08 15:27:25 +02:00
Sergey Sharybin
fd397a7d28 Cycles: Add utility macro ccl_ref
It is defined to & for CPU side compilation, and defined to an empty for any GPU
platform. The idea here is to use this macro instead of #ifdef block with bunch
of duplicated lines just to make it so CPU code is efficient.

Eventually we might switch to references on CUDA as well, but that would require
some intensive testing.
2017-08-08 15:27:25 +02:00
Mai Lavelle
ec8ae4d5e9 Cycles: Pack kernel textures into buffers for OpenCL
Image textures were being packed into a single buffer for OpenCL, which
limited the amount of memory available for images to the size of one
buffer (usually 4gb on AMD hardware). By packing textures into multiple
buffers that limit is removed, while simultaneously reducing the number
of buffers that need to be passed to each kernel.

Benchmarks were within 2%.

Fixes T51554.

Differential Revision: https://developer.blender.org/D2745
2017-08-08 07:12:04 -04:00
Sergey Sharybin
451ccf7396 Cycles: Cleanup, move curve intersection functions to own file
This way curve file becomes much shorter and it's also easier to write a
benchmark application to check performance before/after future changes.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
77a7a7f455 Cycles: Cleanup, trailign whitespace 2017-08-07 20:53:30 +02:00
Sergey Sharybin
95fe9b2617 Cycles: Cleanup, remove bvh prefix from curve functions
Those are nothing to do with BVH, and can be used separately.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
a4bbce8949 Cycles: Fix compilation error on NVidia OpenCL after recent refactor
Still need to verify this is proper thing to do for AMD OpenCL. At least now
i can compile OpenCL kernel on my laptop with sm21 card.
2017-08-07 20:52:24 +02:00
Brecht Van Lommel
fc38276d74 Fix Cycles shadow catcher objects influencing each other.
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.

Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.

Differential Revision: https://developer.blender.org/D2763
2017-08-07 17:54:26 +02:00
Sergey Sharybin
580741b317 Cycles: Cleanup, space after keyword 2017-08-07 14:47:51 +02:00
Brecht Van Lommel
ee77c1e917 Code refactor: use float4 instead of intrinsics for CPU denoise filtering.
Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Brecht Van Lommel
a24fbf3323 Code refactor: add, remove, optimize various SSE functions.
* Remove some unnecessary SSE emulation defines.
* Use full precision float division so we can enable it.
* Add sqrt(), sqr(), fabs(), shuffle variations, mask().
* Optimize reduce_add(), select().

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
Brecht Van Lommel
a8cc0d707e Code refactor: split defines into separate header, changes to SSE type headers.
I need to use some macros defined in util_simd.h for float3/float4, to emulate
SSE4 instructions on SSE2. But due to issues with order of header includes this
was not possible, this does some refactoring to make it work.

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00