Commit Graph

6318 Commits

Author SHA1 Message Date
Dalai Felinto
dfa5b32c8c Merge remote-tracking branch 'origin/master' into blender2.8 2016-10-13 16:42:54 +00:00
Brecht Van Lommel
7f5441b916 Fix T49640: Cycles constant folding incorrect for texture coordinates. 2016-10-12 18:42:38 +02:00
Brecht Van Lommel
21e65d7457 Fix build error with WITH_CYCLES_NATIVE_ONLY and recent AVX2 changes. 2016-10-12 17:35:03 +02:00
Sergey Sharybin
22cdf44101 Cycles: Use const reference for register variables in non-OpenCL code
This is something tested by @LazyDodo and suggested by Maxym to make
MSVC happier.
2016-10-12 14:48:59 +02:00
Sergey Sharybin
e588106d45 Cycles: Use more SSE intrinsics for float3 type
This gives about 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and this solves the slowdown
of the koro scene mentioned in the previous commit.

The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.

This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.

Another patch based on inspiration from Maxym Dmytrychenko, thanks!
2016-10-12 14:43:00 +02:00
Sergey Sharybin
42aeb608e7 Cycles: Implement AVX2 version of triangle_intersect
This commit basically vectorizes existing code using AVX2 instructions
(without modifying the algorithm itself). This gives quite nice speedups:

  BMW:        -8%
  Classroom:  -5%
  Cat:        -5%
  Koro:       +1%
  Barcelona:  -8%

That's on a Linux machine; the reported performance improvement on Windows
goes up to 20%.

Not currently sure why Koro is somewhat slower, because it mainly uses
curve intersection tests. Could it be timing noise? Or something with the
cache utilization perhaps? In any case, the speedup in other scenes makes
me think that the current state is acceptable for an initial implementation.

This is again inspired by Maxym Dmytrychenko.
2016-10-12 14:11:55 +02:00
Sergey Sharybin
6a4ec3ca43 Cycles: Add new avxf vectorized data type
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.

Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.

Hopefully the copyright message is correct.
2016-10-12 13:54:13 +02:00
Sergey Sharybin
fa62a989b4 Cycles: Enable SSE options of math module for AVX2 kernels
Currently this does not give a measurable difference, but it is required
groundwork for some upcoming further optimization of AVX2 kernels.
2016-10-12 12:54:31 +02:00
Sergey Sharybin
87d08a5dc1 Cycles: Get rid of ifdef-ed noinline policy 2016-10-12 12:15:24 +02:00
Sergey Sharybin
cc95172667 Cycles: Fix use of uninitialized variable in SSS
When ray hits curve segment with SSS shader it was possible to have
uninitialized hit_P variable used for sampling.

Seems that was the cause of our headache over the difference between AVX2
and SSE4 render results here, so now we can revert all the nasty
ifdef-ed inline policies.
2016-10-12 12:12:28 +02:00
Sergey Sharybin
edd9d89673 Cycles: Cleanup, style 2016-10-12 11:54:33 +02:00
Bastien Montagne
6371f8ff8a Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/readfile.c
	source/blender/editors/space_view3d/view3d_draw.c
2016-10-10 12:41:32 +02:00
Lukas Stockner
9ea71bc674 Cycles: Split device_opencl.cpp into multiple files for easier maintenance
There are no user-visible changes, just some internal restructuring.

Differential Revision: https://developer.blender.org/D2231
2016-10-09 15:49:50 +02:00
Brecht Van Lommel
74e0f900c5 Fix a few compile errors with C++11 on macOS. 2016-10-08 15:03:53 +02:00
Lukas Stockner
2dccf5a6e8 Cycles: Fix OpenCL split kernel compilation after recent CUDA 8 performance fix 2016-10-07 18:50:43 +02:00
Julian Eisel
553b4faac8 Merge branch 'master' into blender2.8 2016-10-07 00:22:21 +02:00
Brecht Van Lommel
5a0f397eaa Fix T49523: very slow normal map tangent computation for rendering in 2.78. 2016-10-06 03:12:04 +02:00
Dalai Felinto
ae44e24fed Merge remote-tracking branch 'origin/master' into blender2.8 2016-10-03 20:54:22 +00:00
Brecht Van Lommel
b4f9766ed1 Cycles CUDA: make CUDA 8.0 the officially supported version for all platforms. 2016-10-03 22:15:26 +02:00
Brecht Van Lommel
a3abb020e3 Fix Cycles CUDA performance on CUDA 8.0.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.

On benchmark scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2269
2016-10-03 22:15:25 +02:00
Brecht Van Lommel
49ad4215ba Fix fluid sim build error with MSVC. 2016-10-03 22:15:24 +02:00
Bastien Montagne
55aadccbde Merge branch 'master' into blender2.8 2016-10-03 20:48:00 +02:00
lazydodo
3ee5ce155c [Windows/Cycles/Clang] Fix compilation error with clang-cl on windows 2016-10-02 14:01:23 -06:00
Bastien Montagne
c50ccc8476 Merge branch 'master' into blender2.8 2016-10-02 18:53:01 +02:00
Brecht Van Lommel
3f9b69287d Fluids: improve multithreaded CPU usage.
Fixes for clang-omp, fewer shared variables, fix some cases of threads writing
to the same memory location. Issue found by Jens Verwiebe, who reports 30%
speedup with 16 core CPU, when using this with a recent clang-omp version.
2016-10-02 16:38:14 +02:00
Alexander Gavrilov
40eedd5df9 Cycles: implement partial constant folding for exponentiation.
This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.

The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.

Reviewers: #cycles

Differential Revision: https://developer.blender.org/D2263
2016-10-01 14:37:03 +03:00
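The partial folding described in the commit above can be illustrated with a small sketch. This is not the Cycles implementation: the function name `fold_power`, the representation of unknown inputs as strings, and the exact set of identities (x ^ 0, x ^ 1, 1 ^ x) are assumptions chosen for illustration. Only the 0 ^ 0 == 1 rule is taken from the commit message.

```python
import math
from typing import Union

# Hypothetical representation: a float is a known constant,
# a str is the name of an input whose value is not known at fold time.
Operand = Union[float, str]


def fold_power(base: Operand, exponent: Operand) -> Operand:
    """Fold base ^ exponent where possible, else return an unfolded expression."""
    base_const = isinstance(base, float)
    exp_const = isinstance(exponent, float)

    if base_const and exp_const:
        # Full folding; math.pow(0.0, 0.0) is 1.0, matching the 0 ^ 0 == 1 rule.
        return math.pow(base, exponent)
    if exp_const and exponent == 0.0:
        return 1.0   # x ^ 0 == 1 for any x, including x == 0
    if exp_const and exponent == 1.0:
        return base  # x ^ 1 == x, so the node collapses to its input
    if base_const and base == 1.0:
        return 1.0   # 1 ^ x == 1
    # Nothing is known that allows folding (note: 0 ^ x is NOT folded,
    # because the result depends on whether x is 0).
    return f"pow({base}, {exponent})"
```

In this sketch a constant base of 0 is deliberately left unfolded, since 0 ^ x is 1 for x == 0 and 0 for positive x.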
Brecht Van Lommel
20c6d5e3cb Fix MSVC compiler warning due to using */* to start comment. 2016-10-01 01:55:34 +02:00
Julian Eisel
42ed1f0e3c Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/writefile.c
2016-09-30 01:18:41 +02:00
Sergey Sharybin
80837d06de Cycles: Support earlier tile rendering termination on cancel
It will discard the whole tile, but it's still kind of more friendly than
an interface that stays fully locked (sort of) until the tile is fully sampled.

Sorry if it causes PITA to merge for the opencl split work, but this issue
was bothering me a lot when collecting benchmarks.
2016-09-29 16:00:25 +02:00
Sergey Sharybin
333366dbcf Cycles: Fix typo in shader cancel routines 2016-09-29 15:48:10 +02:00
Sergey Sharybin
31ebbe40a0 Cycles: Improve OpenCL line information handling
Previously it was falling back to just a path after #include
statement was finished. Now we fall back to a proper current
file name after dealing with the preprocessor statement.
2016-09-29 10:20:24 +02:00
Sergey Sharybin
94c919349b Cycles: Cleanup file headers
Some of the files were wrongly attributing code to some other
organizations, and in a few places proper attribution was missing.

This is mainly either a copy-paste error (when a new file was
created from an existing one and the header wasn't updated) or due
to some refactor which split non-original-BF code from purely
BF code.

Should clear up some of the confusion.
2016-09-29 10:11:40 +02:00
lazydodo
26d7d995db Fix Windows mouse wheel scroll speed
In Windows, the event dispatching code throws away the wheel scroll count value.
No matter how fast you move the wheel, it only generates a single one-notch scroll event.

This patch converts a wheel event into multiple 1-notch wheel events.

This also corrects the handling of smooth-scroll mouse wheels (which can report less than 1-notch wheel movement) by accumulating the small wheel delta values.

Reviewers: djnz, shadowrom, elubie, #platform:_windows, sergey, juicyfruit, brecht

Reviewed By: shadowrom, elubie, #platform:_windows, brecht

Subscribers: dingto, elubie, brachi, brecht

Differential Revision: https://developer.blender.org/D143
2016-09-28 17:22:02 -06:00
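The accumulation scheme described in the commit above can be sketched as follows. This is an illustration, not the GHOST code: the class name `WheelAccumulator` and its `feed` method are hypothetical. The constant 120 is the Windows `WHEEL_DELTA` value for one notch.

```python
from typing import List

WHEEL_DELTA = 120  # Windows reports one full wheel notch as a delta of 120


class WheelAccumulator:
    """Hypothetical helper: turns raw wheel deltas into 1-notch events."""

    def __init__(self) -> None:
        self.remainder = 0  # sub-notch movement carried over between events

    def feed(self, delta: int) -> List[int]:
        """Return one +1 or -1 entry per whole notch accumulated so far."""
        self.remainder += delta
        direction = 1 if self.remainder >= 0 else -1
        notches = abs(self.remainder) // WHEEL_DELTA
        # Keep only the part that did not add up to a full notch.
        self.remainder -= direction * notches * WHEEL_DELTA
        return [direction] * notches
```

A fast scroll reporting a delta of 360 yields three 1-notch events, while a smooth-scroll wheel reporting 40 three times yields a single event on the third report.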
Sergey Sharybin
0ec87f1227 Cycles: Cleanup, indentation 2016-09-28 17:05:33 +02:00
Sergey Sharybin
e1bfb89da2 Cycles: Fix compilation error with minimal feature set 2016-09-28 17:03:59 +02:00
Mike Erwin
5d0de39238 fix Mac build for Xcode < 8
We need a long-term fix, but this will get 2.78 out the door.
2016-09-27 16:16:47 +02:00
Mike Erwin
2ebb367b0f cleanup: spacing & alignment 2016-09-27 14:56:58 +02:00
Bastien Montagne
8cff9c20ff Merge branch 'master' into blender2.8
WARNING! Full build is broken, alembic has not been merged in correctly and has some references to particle stuff.
Don't have time to tackle this now (and probably would be better if someone knowing what he's doing does it anyway).

Conflicts:
	release/scripts/startup/bl_ui/properties_particle.py
	source/blender/blenkernel/intern/library_remap.c
	source/blender/blenkernel/intern/smoke.c
	source/blender/editors/physics/particle_object.c
	source/blender/editors/physics/physics_intern.h
	source/blender/editors/physics/physics_ops.c
	source/blender/editors/space_outliner/outliner_intern.h
	source/blender/editors/space_view3d/drawvolume.c
	source/blender/makesrna/intern/rna_smoke.c
2016-09-26 17:19:03 +02:00
Lukas Stockner
07de832e22 Cycles: Use correct light sampling PDF for MIS calculation with Branched Path Tracing
The light sampling functions calculate light sampling PDF for the case that the light has been randomly selected out of all lights.
However, since BPT handles lamps and meshlights separately, this isn't the case. So, to avoid a wrong result, the code just included the 0.5 factor in the throughput.

In theory, however, the correction should be made to the sampling probability, which needs to be doubled. Now, for the regular calculation, that's no real difference since the throughput is divided by the pdf.
However, it does matter for the MIS calculation - it's unbiased both ways, but including the factor in the PDF instead of the throughput should give slightly better results.

Reviewers: sergey, brecht, dingto, juicyfruit

Differential Revision: https://developer.blender.org/D2258
2016-09-25 23:16:05 +02:00
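The argument in the commit above can be checked with a few lines of arithmetic. This is a sketch with made-up numbers, and it assumes the power heuristic as the MIS weight. The function names are hypothetical and not taken from the Cycles source.

```python
def power_heuristic(pdf_a: float, pdf_b: float) -> float:
    """MIS weight for a sample drawn with pdf_a, competing against pdf_b."""
    return (pdf_a * pdf_a) / (pdf_a * pdf_a + pdf_b * pdf_b)


# Made-up example values (assumptions for illustration only).
f = 1.0          # integrand value for the light sample
light_pdf = 0.25 # pdf as returned for "pick one light out of all lights"
bsdf_pdf = 0.5   # pdf of generating the same direction by BSDF sampling

# Without MIS the two formulations agree:
old_estimate = 0.5 * f / light_pdf    # factor 0.5 folded into the throughput
new_estimate = f / (2.0 * light_pdf)  # factor folded into the pdf (doubled)

# With MIS the weight depends on which pdf is passed in:
old_weight = power_heuristic(light_pdf, bsdf_pdf)        # uses the halved pdf
new_weight = power_heuristic(2.0 * light_pdf, bsdf_pdf)  # uses the corrected pdf
```

With these numbers both estimates are 2.0, but the weight changes from 0.2 to 0.5. The corrected pdf therefore changes how the two sampling strategies are weighted against each other.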
Lukas Stockner
0b89b31a18 Cycles: Fix T49411: Multiscatter GGX with zero roughness when Filter Glossy is enabled 2016-09-25 22:09:38 +02:00
Brecht Van Lommel
335ee5ce5a Fix T49310: incorrect Cycles standalone normals with negative scale. 2016-09-25 05:23:52 +02:00
Mike Erwin
7fc2e333bb small merge fix
Follow-up to rB1dfb89d22930
2016-09-23 18:12:24 +02:00
Sergey Sharybin
2372e67dd6 Cycles: Don't sum up memory usage of all devices together for the stats 2016-09-23 12:43:23 +02:00
Julian Eisel
1dfb89d229 Merge branch 'master' into blender2.8
Conflicts:
	intern/ghost/intern/GHOST_ContextCGL.mm
	intern/ghost/intern/GHOST_WindowCocoa.mm
	source/blender/makesrna/intern/rna_main.c
2016-09-23 01:40:19 +02:00
Mai Lavelle
1b2b7cfa20 Cycles: Fix overflow caused by wrong size calculation in Mesh::add_undisplaced 2016-09-22 17:44:22 -04:00
Sergey Sharybin
d84c55f0fa Fix T49417: Cycles crash - can't use 5 Gigabyte Tile EXR texture file
Was an integer overflow issue when calculating offsets.
2016-09-22 17:30:31 +02:00
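The kind of overflow mentioned in the commit above can be reproduced with a small sketch. Python integers do not overflow, so 32-bit signed arithmetic is simulated by `wrap_int32`. The function names, the offset formula, and the image dimensions are assumptions for illustration, not the actual Cycles code or the file from the report.

```python
def wrap_int32(value: int) -> int:
    """Simulate storing a value in a 32-bit signed integer (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def pixel_offset_32(x: int, y: int, width: int, channels: int) -> int:
    """Offset computed as if the result were held in a 32-bit int."""
    return wrap_int32((y * width + x) * channels)


def pixel_offset_64(x: int, y: int, width: int, channels: int) -> int:
    """Offset computed with a wide enough type (size_t / int64 in C++)."""
    return (y * width + x) * channels


# Hypothetical 40000 x 40000 image with 4 channels: address the last pixel.
wide = pixel_offset_64(39999, 39999, 40000, 4)
narrow = pixel_offset_32(39999, 39999, 40000, 4)
```

Here `wide` is 6399999996, which exceeds the 32-bit range, so `narrow` wraps around to an unrelated value and would address the wrong memory.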
Sergey Sharybin
622c9ced6c Cycles: Cleanup, whitespace 2016-09-21 14:42:05 +02:00
Sergey Sharybin
166286e6de Cycles: Make code more uniform across two versions of shadow_blocked()
Just to make it easier to research ways of possible code de-duplication.
2016-09-21 11:50:11 +02:00
Sergey Sharybin
e4f7bf6ccb Cycles: Remove out of date comment 2016-09-21 11:48:36 +02:00
Sergey Sharybin
a5f14ad1a2 Cycles: Make regular bvh traversal functions close to each other 2016-09-20 16:58:39 +02:00