Commit Graph

677 Commits

Author SHA1 Message Date
Campbell Barton
be40389165 Merge branch 'master' into blender2.8 2018-01-03 23:44:47 +11:00
Brecht Van Lommel
c621832d3d Cycles: CUDA support for rendering scenes that don't fit on GPU.
In that case it can now fall back to CPU memory, at the cost of reduced
performance. For scenes that fit in GPU memory, this commit should not
cause any noticeable slowdowns.

We don't use all physical system RAM, since that can cause OS instability.
We leave at least half of system RAM or 4GB to other software, whichever
is smaller.

For image textures in host memory, performance was maybe 20-30% slower
in our tests (although this is highly hardware and scene dependent). Once
other type of data doesn't fit on the GPU, performance can be e.g. 10x
slower, and at that point it's probably better to just render on the CPU.

Differential Revision: https://developer.blender.org/D2056
2018-01-02 23:50:18 +01:00
Campbell Barton
03a5eccc94 Merge branch 'master' into blender2.8 2017-11-30 18:30:41 +11:00
Lukas Stockner
fa3d50af95 Cycles: Improve denoising speed on GPUs with small tile sizes
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown.

Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.

The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.

To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-30 07:37:08 +01:00
Brecht Van Lommel
84d39ab97b Merge branch 'master' into blender2.8 2017-11-29 18:13:06 +01:00
Maxym Dmytrychenko
7e349f2745 Cycles: improve triangle intersection performance.
Reduces render time by about 1-2% in benchmark scenes.

Differential Revision: https://developer.blender.org/D2911
2017-11-29 18:11:40 +01:00
Dalai Felinto
258292abc0 Merge commit '212a8d9e5ae7^' into blender2.8 2017-11-15 07:07:27 -02:00
Lukas Stockner
d8066fb0f1 Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders 2017-11-14 04:17:54 +01:00
Dalai Felinto
1cb6cea71c Merge remote-tracking branch 'origin/master' into blender2.8 2017-11-13 11:48:48 -02:00
Sergey Sharybin
d1a761c4d4 Cycles: Fix compilation error of standalone application 2017-11-13 10:49:05 +01:00
Sergey Sharybin
42dff6cc2e Cycles: Fix compilation error with OIIO compiled against system PugiXML 2017-11-13 10:42:29 +01:00
Sergey Sharybin
1737887938 Merge branch 'master' into blender2.8 2017-11-10 10:36:46 +01:00
Sergey Sharybin
db7a78a2be Cycles: Fix compilation error with latest OIIO
There was some changes about namespaces, which causes ambiguities.

Replaces using namespace with an explicit symbols we need. Is good idea to NOT
pull in the whole namespace anyway!
2017-11-10 10:04:33 +01:00
Campbell Barton
941484ff81 Merge branch 'master' into blender2.8 2017-11-01 01:27:03 +11:00
Sergey Sharybin
46963f359d Cycles: Bump version number to 1.9.0
This matches Blender Release 2.79.
2017-10-31 13:34:34 +01:00
Brecht Van Lommel
f5456df095 Merge branch 'master' into blender2.8 2017-10-24 02:05:41 +02:00
Brecht Van Lommel
070a668d04 Code refactor: move more memory allocation logic into device API.
* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go
  through device_only_memory, device_vector and device_pixels.
2017-10-24 01:25:19 +02:00
Julian Eisel
147f9585db Merge branch 'master' into blender2.8 2017-10-23 00:04:20 +02:00
Brecht Van Lommel
57a0cb797d Code refactor: avoid some unnecessary device memory copying. 2017-10-21 20:58:28 +02:00
Campbell Barton
6ec43a765b Merge branch 'master' into blender2.8 2017-10-10 01:36:36 +11:00
Brecht Van Lommel
f61c340bc1 Cycles: OpenCL bicubic and tricubic texture interpolation support. 2017-10-08 02:55:44 +02:00
Brecht Van Lommel
23098cda99 Code refactor: make texture code more consistent between devices.
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-07 14:53:14 +02:00
Campbell Barton
ea606a7847 Merge branch 'master' into blender28 2017-10-06 21:25:33 +11:00
Brecht Van Lommel
4537e85584 Fix T53001: more workarounds for crash in AMD compiler with recent drivers. 2017-10-05 17:57:58 +02:00
Sergey Sharybin
128c7c3ba1 Merge branch 'master' into blender2.8 2017-09-22 13:26:49 +05:00
Brecht Van Lommel
18a353dd24 Fix T52368: Cycles OSL trace() failing on Windows 32 bit. 2017-09-20 19:38:08 +02:00
Campbell Barton
572b1a644f Merge branch 'master' into blender2.8 2017-09-05 22:56:03 +10:00
Sergey Sharybin
885c0a5f90 Cycles: Fix compilation warning 2017-09-04 13:28:15 +02:00
Campbell Barton
323a7ab944 Merge branch 'master' into blender2.8 2017-08-31 21:57:38 +10:00
Brecht Van Lommel
1457e5ea73 Fix Cycles Windows render errors with BVH2 CPU rendering.
One problem is that it was always using __mm_blendv_ps emulation even if the
instruction was supported. The other that the emulation function was wrong.

Thanks a lot to Ray Molenkamp for tracking this one down.
2017-08-29 22:55:35 +02:00
Campbell Barton
79111f9246 Merge branch 'master' into blender2.8 2017-08-27 00:51:54 +10:00
Sergey Sharybin
90299e4216 Cycles: Add utility function to query current value of scoped timer 2017-08-25 14:27:34 +02:00
Campbell Barton
f8f6f8f26e Merge branch 'master' into blender2.8 2017-08-25 20:45:16 +10:00
Sergey Sharybin
436d1b4e90 Cycles: FIx issue with -0 being considered a non-finite value 2017-08-24 14:32:56 +02:00
Campbell Barton
0671814e3b Merge branch 'master' into blender2.8 2017-08-24 01:07:09 +10:00
Mai Lavelle
2540741dee Fix implementation of atomic update max and move to a central location
While unlikely to have had any serious effects because of limited use, the
previous implementation was not actually atomic due to a data race and
incorrectly coded CAS loop. We also had duplicates of this code in a few
places, it's now been moved to a single location with all other atomic
operations.
2017-08-23 06:54:25 -04:00
Campbell Barton
bd935b5aed Merge branch 'master' into blender2.8 2017-08-22 18:21:05 +10:00
Brecht Van Lommel
296d74c4b1 Cycles: reorganize Performance panel layout, move viewport BVH type to debug. 2017-08-21 19:05:17 +02:00
Campbell Barton
2332051419 Merge branch 'master' into blender2.8 2017-08-19 21:54:05 +10:00
Brecht Van Lommel
4d428d14af Fix T52443: Cycles OpenCL build error after recent mesh lights changes. 2017-08-19 01:02:55 +02:00
Campbell Barton
230be97284 Merge branch 'master' into blender2.8 2017-08-14 12:13:55 +10:00
Brecht Van Lommel
6919393a51 Fix T52372: CUDA build error after recent changes. 2017-08-12 20:37:06 +02:00
Campbell Barton
22872857d4 Merge branch 'master' into blender2.8 2017-08-13 01:14:55 +10:00
Brecht Van Lommel
d7639d57dc Fix T52368: OSL trace() crash after recent changes. 2017-08-12 14:32:52 +02:00
Campbell Barton
d1328feeb1 Merge branch 'master' into blender2.8 2017-08-11 10:33:39 +10:00
Brecht Van Lommel
267e75158a Fix T52322: denoiser broken on Windows after recent changes.
It's not clear why this only happened on Windows, but the code
was wrong and should do a bitcast here instead of conversion.
2017-08-11 01:09:35 +02:00
Bastien Montagne
e8b6bcd65c Merge branch 'master' into blender2.8
Conflicts:
	source/blender/depsgraph/intern/builder/deg_builder_relations.cc
	source/blender/editors/object/object_add.c
	source/blender/python/intern/bpy_app_handlers.c
2017-08-08 16:43:25 +02:00
Sergey Sharybin
fd397a7d28 Cycles: Add utility macro ccl_ref
It is defined to & for CPU side compilation, and defined to an empty for any GPU
platform. The idea here is to use this macro instead of #ifdef block with bunch
of duplicated lines just to make it so CPU code is efficient.

Eventually we might switch to references on CUDA as well, but that would require
some intensive testing.
2017-08-08 15:27:25 +02:00
Brecht Van Lommel
dc4d850d10 Fix Windows build errors with recent Cycles SIMD refactoring. 2017-08-07 17:54:26 +02:00
Brecht Van Lommel
25b8eb4631 Merge branch 'master' into blender2.8 2017-08-07 17:48:14 +02:00