Commit Graph

3437 Commits

Author SHA1 Message Date
Dalai Felinto
dfa5b32c8c Merge remote-tracking branch 'origin/master' into blender2.8 2016-10-13 16:42:54 +00:00
Brecht Van Lommel
7f5441b916 Fix T49640: Cycles constant folding incorrect for texture coordinates. 2016-10-12 18:42:38 +02:00
Brecht Van Lommel
21e65d7457 Fix build error with WITH_CYCLES_NATIVE_ONLY and recent AVX2 changes. 2016-10-12 17:35:03 +02:00
Sergey Sharybin
22cdf44101 Cycles: Use const reference for register variables in non-OpenCL code
This is something tested by @LazyDodo and suggested by Maxym to make
MSVC happier.
2016-10-12 14:48:59 +02:00
Sergey Sharybin
e588106d45 Cycles: Use more SSE intrinsics for float3 type
This gives about a 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and solves the slowdown
of the Koro scene mentioned in the previous commit.

The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.

This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.

Another patch based on inspiration from Maxym Dmytrychenko, thanks!
2016-10-12 14:43:00 +02:00
Sergey Sharybin
42aeb608e7 Cycles: Implement AVX2 version of triangle_intersect
This commit basically vectorizes the existing code using AVX2 instructions
(without modifying the algorithm itself). This gives quite nice speedups:

  BMW:        -8%
  Classroom:  -5%
  Cat:        -5%
  Koro:       +1%
  Barcelona:  -8%

That's on a Linux machine; the reported performance improvement on Windows
goes up to 20%.

Not currently sure why Koro is somewhat slower, because it mainly uses
curve intersection tests. Could it be timing noise? Or something to do with
cache utilization, perhaps? In any case, the speedup in other scenes makes
me think that the current state is acceptable for an initial implementation.

This is again inspired by Maxym Dmytrychenko.
2016-10-12 14:11:55 +02:00
Sergey Sharybin
6a4ec3ca43 Cycles: Add new avxf vectorized data type
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.

Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.

Hopefully the copyright message is correct.
2016-10-12 13:54:13 +02:00
Sergey Sharybin
fa62a989b4 Cycles: Enable SSE options of math module for AVX2 kernels
Currently this does not give a measurable difference, but it is required
groundwork for some upcoming further optimization of AVX2 kernels.
2016-10-12 12:54:31 +02:00
Sergey Sharybin
87d08a5dc1 Cycles: Get rid of ifdef-ed noinline policy 2016-10-12 12:15:24 +02:00
Sergey Sharybin
cc95172667 Cycles: Fix use of uninitialized variable in SSS
When ray hits curve segment with SSS shader it was possible to have
uninitialized hit_P variable used for sampling.

It seems this was the cause of the difference between AVX2 and SSE4
render results that gave us a headache here, so now we can revert all
the nasty ifdef-ed inline policies.
2016-10-12 12:12:28 +02:00
Sergey Sharybin
edd9d89673 Cycles: Cleanup, style 2016-10-12 11:54:33 +02:00
Bastien Montagne
6371f8ff8a Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/readfile.c
	source/blender/editors/space_view3d/view3d_draw.c
2016-10-10 12:41:32 +02:00
Lukas Stockner
9ea71bc674 Cycles: Split device_opencl.cpp into multiple files for easier maintenance
There are no user-visible changes, just some internal restructuring.

Differential Revision: https://developer.blender.org/D2231
2016-10-09 15:49:50 +02:00
Lukas Stockner
2dccf5a6e8 Cycles: Fix OpenCL split kernel compilation after recent CUDA 8 performance fix 2016-10-07 18:50:43 +02:00
Dalai Felinto
ae44e24fed Merge remote-tracking branch 'origin/master' into blender2.8 2016-10-03 20:54:22 +00:00
Brecht Van Lommel
b4f9766ed1 Cycles CUDA: make CUDA 8.0 the officially supported version for all platforms. 2016-10-03 22:15:26 +02:00
Brecht Van Lommel
a3abb020e3 Fix Cycles CUDA performance on CUDA 8.0.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.

On benchmark scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2269
2016-10-03 22:15:25 +02:00
Bastien Montagne
55aadccbde Merge branch 'master' into blender2.8 2016-10-03 20:48:00 +02:00
lazydodo
3ee5ce155c [Windows/Cycles/Clang] Fix compilation error with clang-cl on windows 2016-10-02 14:01:23 -06:00
Bastien Montagne
c50ccc8476 Merge branch 'master' into blender2.8 2016-10-02 18:53:01 +02:00
Alexander Gavrilov
40eedd5df9 Cycles: implement partial constant folding for exponentiation.
This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.

The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.

Reviewers: #cycles

Differential Revision: https://developer.blender.org/D2263
2016-10-01 14:37:03 +03:00
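
The partial folding described in commit 40eedd5df9 can be illustrated with a minimal sketch. It is not the Cycles implementation: the function name `fold_power`, the tuple return values, and the use of `None` for a non-constant input are all hypothetical, chosen only to show which identities allow folding when a single argument is known.

```python
from typing import Optional, Tuple


def fold_power(base: Optional[float],
               exponent: Optional[float]) -> Optional[Tuple[str, object]]:
    """Try to fold base ** exponent when only some inputs are constant.

    None means "not a compile-time constant" (hypothetical convention).
    Returns ("constant", value), ("bypass", "base"), or None if nothing folds.
    Full evaluation of two arbitrary constants is deliberately left out.
    """
    # X ^ 0 == 1 for every X, including 0 ^ 0 == 1 as stated in the commit.
    if exponent == 0.0:
        return ("constant", 1.0)
    # 1 ^ X == 1 for every X.
    if base == 1.0:
        return ("constant", 1.0)
    # X ^ 1 == X: the node can be replaced by its base input.
    if exponent == 1.0:
        return ("bypass", "base")
    return None


print(fold_power(None, 0.0))
print(fold_power(None, 1.0))
print(fold_power(None, 2.0))
```

The exponent-zero check comes first so that the 0 ^ 0 case resolves to 1 before any other rule is considered.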
Brecht Van Lommel
20c6d5e3cb Fix MSVC compiler warning due to using */* to start comment. 2016-10-01 01:55:34 +02:00
Julian Eisel
42ed1f0e3c Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/writefile.c
2016-09-30 01:18:41 +02:00
Sergey Sharybin
80837d06de Cycles: Support earlier tile rendering termination on cancel
It will discard the whole tile, but that is still friendlier than an
interface that stays (sort of) fully locked until the tile is fully sampled.

Sorry if this is a PITA to merge for the OpenCL split work, but this issue
was a big bother when collecting benchmarks.
2016-09-29 16:00:25 +02:00
Sergey Sharybin
333366dbcf Cycles: Fix typo in shader cancel routines 2016-09-29 15:48:10 +02:00
Sergey Sharybin
31ebbe40a0 Cycles: Improve OpenCL line information handling
Previously it was falling back to just a path after an #include
statement was finished. Now we fall back to a proper current
file name after dealing with the preprocessor statement.
2016-09-29 10:20:24 +02:00
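
The line-information behaviour described in commit 31ebbe40a0 can be sketched as follows. This is an illustration under assumptions, not the Cycles code: `expand_includes`, the in-memory `sources` dictionary, and the file names are hypothetical. It shows the general technique of emitting a `#line` directive that restores the including file's name and line number once an inlined `#include` ends.

```python
import re
from typing import Dict, List

# Matches lines such as: #include "util.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"\s*$')


def expand_includes(filename: str, sources: Dict[str, str]) -> List[str]:
    """Inline quoted includes and keep compiler line info accurate.

    `sources` maps a file name to its text (a stand-in for reading files).
    No include-cycle detection is done in this sketch.
    """
    out: List[str] = []
    for line_no, line in enumerate(sources[filename].splitlines(), start=1):
        match = INCLUDE_RE.match(line)
        if match and match.group(1) in sources:
            included = match.group(1)
            # Diagnostics inside the included text refer to the included file.
            out.append(f'#line 1 "{included}"')
            out.extend(expand_includes(included, sources))
            # Afterwards, fall back to the current file name and next line.
            out.append(f'#line {line_no + 1} "{filename}"')
        else:
            out.append(line)
    return out


sources = {
    "kernel.cl": '#include "util.h"\nint main_value;',
    "util.h": "int helper;",
}
print("\n".join(expand_includes("kernel.cl", sources)))
```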
Sergey Sharybin
94c919349b Cycles: Cleanup file headers
Some of the files were wrongly attributing code to some other
organizations, and in a few places proper attribution was missing.

This is mainly either a copy-paste error (when a new file was
created from an existing one and the header wasn't updated) or due
to some refactor which split non-original-BF code from purely
BF code.

This should clear up some of the confusion.
2016-09-29 10:11:40 +02:00
Sergey Sharybin
0ec87f1227 Cycles: Cleanup, indentation 2016-09-28 17:05:33 +02:00
Sergey Sharybin
e1bfb89da2 Cycles: Fix compilation error with minimal feature set 2016-09-28 17:03:59 +02:00
Bastien Montagne
8cff9c20ff Merge branch 'master' into blender2.8
WARNING! Full build is broken, alembic has not been merged in correctly and has some references to particle stuff.
Don't have time to tackle this now (and probably would be better if someone knowing what he's doing does it anyway).

Conflicts:
	release/scripts/startup/bl_ui/properties_particle.py
	source/blender/blenkernel/intern/library_remap.c
	source/blender/blenkernel/intern/smoke.c
	source/blender/editors/physics/particle_object.c
	source/blender/editors/physics/physics_intern.h
	source/blender/editors/physics/physics_ops.c
	source/blender/editors/space_outliner/outliner_intern.h
	source/blender/editors/space_view3d/drawvolume.c
	source/blender/makesrna/intern/rna_smoke.c
2016-09-26 17:19:03 +02:00
Lukas Stockner
07de832e22 Cycles: Use correct light sampling PDF for MIS calculation with Branched Path Tracing
The light sampling functions calculate light sampling PDF for the case that the light has been randomly selected out of all lights.
However, since BPT handles lamps and meshlights separately, this isn't the case. So, to avoid a wrong result, the code just included the 0.5 factor in the throughput.

In theory, however, the correction should be made to the sampling probability, which needs to be doubled. Now, for the regular calculation, that's no real difference since the throughput is divided by the pdf.
However, it does matter for the MIS calculation - it's unbiased both ways, but including the factor in the PDF instead of the throughput should give slightly better results.

Reviewers: sergey, brecht, dingto, juicyfruit

Differential Revision: https://developer.blender.org/D2258
2016-09-25 23:16:05 +02:00
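
The argument in commit 07de832e22 can be checked numerically with a small sketch. It rests on assumptions: the power heuristic is used here as the MIS weight, and the function names and sample values are hypothetical. It shows that moving the 0.5 factor from the throughput into a doubled PDF leaves the plain estimate unchanged but changes the MIS weight.

```python
import math


def power_heuristic(pdf_a: float, pdf_b: float) -> float:
    """MIS weight for the strategy with pdf_a (power heuristic, exponent 2)."""
    return (pdf_a * pdf_a) / (pdf_a * pdf_a + pdf_b * pdf_b)


def estimate_factor_in_throughput(f: float, light_pdf: float) -> float:
    """Old approach: keep the PDF as is and scale the throughput by 0.5."""
    return 0.5 * f / light_pdf


def estimate_factor_in_pdf(f: float, light_pdf: float) -> float:
    """New approach: double the sampling PDF instead."""
    return f / (2.0 * light_pdf)


# Hypothetical sample values, chosen only for illustration.
f, light_pdf, bsdf_pdf = 1.0, 0.25, 0.5

# Without MIS the two approaches agree, since throughput is divided by the PDF.
print(estimate_factor_in_throughput(f, light_pdf),
      estimate_factor_in_pdf(f, light_pdf))

# With MIS the weight depends on which PDF is passed in.
print(power_heuristic(light_pdf, bsdf_pdf),
      power_heuristic(2.0 * light_pdf, bsdf_pdf))
```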
Lukas Stockner
0b89b31a18 Cycles: Fix T49411: Multiscatter GGX with zero roughness when Filter Glossy is enabled 2016-09-25 22:09:38 +02:00
Brecht Van Lommel
335ee5ce5a Fix T49310: incorrect Cycles standalone normals with negative scale. 2016-09-25 05:23:52 +02:00
Sergey Sharybin
2372e67dd6 Cycles: Don't sum up memory usage of all devices together for the stats 2016-09-23 12:43:23 +02:00
Julian Eisel
1dfb89d229 Merge branch 'master' into blender2.8
Conflicts:
	intern/ghost/intern/GHOST_ContextCGL.mm
	intern/ghost/intern/GHOST_WindowCocoa.mm
	source/blender/makesrna/intern/rna_main.c
2016-09-23 01:40:19 +02:00
Mai Lavelle
1b2b7cfa20 Cycles: Fix overflow caused by wrong size calculation in Mesh::add_undisplaced 2016-09-22 17:44:22 -04:00
Sergey Sharybin
d84c55f0fa Fix T49417: Cycles crash - can't use 5 Gigabyte Tile EXR texture file
Was an integer overflow issue when calculating offsets.
2016-09-22 17:30:31 +02:00
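
The kind of overflow mentioned in commit d84c55f0fa can be illustrated with a sketch that simulates 32-bit arithmetic. The helper names and image dimensions are hypothetical and not taken from the Cycles source. Python integers do not overflow, so `wrap_int32` imitates the two's-complement wraparound typically observed with a 32-bit signed integer (formally undefined behaviour in C++).

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def pixel_offset_32(x: int, y: int, width: int) -> int:
    """Offset computed as if every intermediate were a 32-bit int."""
    return wrap_int32(wrap_int32(y * width) + x)


def pixel_offset_64(x: int, y: int, width: int) -> int:
    """Offset computed with a wide enough type (e.g. size_t in C++)."""
    return y * width + x


# Hypothetical image width and row, chosen so the index exceeds INT32_MAX.
WIDTH, LAST_ROW = 50_000, 50_000

narrow = pixel_offset_32(0, LAST_ROW, WIDTH)
wide = pixel_offset_64(0, LAST_ROW, WIDTH)
print(wide > INT32_MAX, narrow < 0)
```

A negative or wrapped offset like `narrow` would index outside the texture buffer, which matches the crash described in the report title.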
Sergey Sharybin
622c9ced6c Cycles: Cleanup, whitespace 2016-09-21 14:42:05 +02:00
Sergey Sharybin
166286e6de Cycles: Make code more uniform across two versions of shadow_blocked()
Just to make it easier to research ways of possible code de-duplication.
2016-09-21 11:50:11 +02:00
Sergey Sharybin
e4f7bf6ccb Cycles: Remove out of date comment 2016-09-21 11:48:36 +02:00
Sergey Sharybin
a5f14ad1a2 Cycles: Make regular bvh traversal functions close to each other 2016-09-20 16:58:39 +02:00
Sergey Sharybin
a6db95cd42 Cycles: Re-group ifdef so we check for particular feature only once 2016-09-20 16:58:39 +02:00
Sergey Sharybin
386da0cc77 Cycles: Avoid conversion from bool to uint 2016-09-20 13:00:36 +02:00
Sergey Sharybin
100b2ad775 Cycles: Cleanup code style in split kernel 2016-09-19 16:05:12 +02:00
Sergey Sharybin
5c6a14f4e5 Cycles: More tweaks to make specialized BVH traversal matching 2016-09-19 15:29:37 +02:00
Sergey Sharybin
7901f62a9d Cycles: Avoid redundant intersection pre-calculation 2016-09-19 15:18:27 +02:00
Sergey Sharybin
6ba59660fb Cycles: Cleanup, sync some comments across different traversal 2016-09-19 15:18:27 +02:00
Sergey Sharybin
85f48216ed Cycles: Cleanup, always use parenthesis
Makes it simpler to compare different traversal algorithms.
2016-09-19 15:18:27 +02:00
Sergey Sharybin
2980c6ebae Cycles: Move BVH constants to their own file, so they are easily re-usable 2016-09-19 13:00:41 +02:00
Mai Lavelle
772dab9df1 Cycles: Fix typo that would sometimes result in subsurf modifier being disabled 2016-09-18 22:14:15 -04:00