233 Commits

Author SHA1 Message Date
Dalai Felinto
1cb6cea71c Merge remote-tracking branch 'origin/master' into blender2.8 2017-11-13 11:48:48 -02:00
Brecht Van Lommel
e568c1a975 Fix T53289: CUDA missing textures not showing pink, after recent changes. 2017-11-12 20:45:47 +01:00
Bastien Montagne
7a6ad2901c Merge branch 'master' into blender2.8 2017-11-10 10:13:19 +01:00
Brecht Van Lommel
bd4bea3e98 Cycles: avoid reallocating tile denoising memory many times during render. 2017-11-09 20:28:00 +01:00
Sergey Sharybin
c99481b632 Merge branch 'master' into blender2.8 2017-11-09 10:59:15 +01:00
Mai Lavelle
087331c495 Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable
Goal is to reduce OpenCL kernel recompilations.

Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2775
2017-11-09 01:04:06 -05:00
Brecht Van Lommel
7b1d707481 Merge branch 'master' into blender2.8 2017-11-08 00:20:59 +01:00
Brecht Van Lommel
ff34e48911 Cycles: add an extra CUDA synchronize before rendering.
It should not be needed as far as I know, but just in case it fixes any
of the recent issues like T52572.
2017-11-07 22:35:12 +01:00
Bastien Montagne
91af8f2ae2 Merge branch 'master' into blender2.8
Conflicts:
	intern/cycles/device/device.cpp
	source/blender/blenkernel/intern/library.c
	source/blender/blenkernel/intern/material.c
	source/blender/editors/object/object_add.c
	source/blender/editors/object/object_relations.c
	source/blender/editors/space_outliner/outliner_draw.c
	source/blender/editors/space_outliner/outliner_edit.c
	source/blender/editors/space_view3d/drawobject.c
	source/blender/editors/util/ed_util.c
	source/blender/windowmanager/intern/wm_files_link.c
2017-11-06 18:02:46 +01:00
Brecht Van Lommel
5801ef71e4 Code refactor: device memory cleanups, preparing for mapped host memory. 2017-11-05 15:22:04 +01:00
Brecht Van Lommel
5475314f49 Cycles: reserve CUDA local memory ahead of time.
This way we can log the amount of memory used, and it will be important
for host mapped memory support.
2017-11-05 15:22:04 +01:00
Campbell Barton
d4fe083b35 Merge branch 'master' into blender2.8 2017-11-04 21:45:52 +11:00
Brecht Van Lommel
33b5e8daff Code refactor: replace CUDA array with linear memory for 1D and 2D textures.
This is a prerequisite for getting host memory allocation to work. There appears
to be no support for 3D textures using host memory. The original version of
this code was written by Stefan Werner for D2056.
2017-11-04 02:23:00 +01:00
Brecht Van Lommel
6ec599c682 Fix T53247: mixed CPU + GPU render wrong texture limits. 2017-11-03 20:32:29 +01:00
Brecht Van Lommel
f5456df095 Merge branch 'master' into blender2.8 2017-10-24 02:05:41 +02:00
Brecht Van Lommel
070a668d04 Code refactor: move more memory allocation logic into device API.
* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go
  through device_only_memory, device_vector and device_pixels.
2017-10-24 01:25:19 +02:00
Brecht Van Lommel
aa8b4c5d81 Code refactor: use device_only_memory and device_vector in more places. 2017-10-24 01:25:13 +02:00
Brecht Van Lommel
7ad9333fad Code refactor: store device/interp/extension/type in each device_memory. 2017-10-24 01:03:59 +02:00
Julian Eisel
147f9585db Merge branch 'master' into blender2.8 2017-10-23 00:04:20 +02:00
Brecht Van Lommel
57a0cb797d Code refactor: avoid some unnecessary device memory copying. 2017-10-21 20:58:28 +02:00
Sergey Sharybin
0f8a57de68 Merge branch 'master' into blender2.8 2017-10-19 13:58:01 +02:00
Sergey Sharybin
910dd7fb1b Cycles: Add extra logging in CUDA device detection code 2017-10-19 11:26:10 +02:00
Sergey Sharybin
dc95c79971 Merge branch 'master' into blender2.8 2017-10-11 13:14:16 +05:00
Campbell Barton
6ec43a765b Merge branch 'master' into blender2.8 2017-10-10 01:36:36 +11:00
Brecht Van Lommel
e360d003ea Cycles: schedule more work for non-display and compute preemption CUDA cards.
This change affects CUDA GPUs not connected to a display or connected to a
display but supporting compute preemption so that the display does not
freeze. I couldn't find an official list, but compute preemption seems to be
only supported with GTX 1070+ and Linux (not GTX 1060- or Windows).

This helps improve small tile rendering performance further if there are
sufficient samples x number of pixels in a single tile to keep the GPU busy.
2017-10-08 21:12:16 +02:00
Brecht Van Lommel
cdb0b3b1dc Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading. 2017-10-08 13:17:33 +02:00
Brecht Van Lommel
23098cda99 Code refactor: make texture code more consistent between devices.
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-07 14:53:14 +02:00
Campbell Barton
ea606a7847 Merge branch 'master' into blender28 2017-10-06 21:25:33 +11:00
Brecht Van Lommel
fb99ea79f8 Code refactor: split displace/background into separate kernels, remove luma. 2017-10-05 17:57:58 +02:00
Brecht Van Lommel
49199963bf Fix incorrect CUDA remaining time estimate after previous commit. 2017-10-04 23:25:51 +02:00
Brecht Van Lommel
6da6f8d33f Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.

Differential Revision: https://developer.blender.org/D2856
2017-10-04 21:58:47 +02:00
Brecht Van Lommel
12f4538205 Code refactor: use split variance calculation for mega kernels too.
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
2017-10-04 21:11:14 +02:00
Brecht Van Lommel
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
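The idea in the commit above, replacing a stored per-pixel RNG state with a value computed on demand, can be sketched as follows. This is only an illustration under stated assumptions: `hash_uint`, `pixel_seed`, and the mixing constants are hypothetical and are not taken from the Cycles source.

```cpp
#include <cstdint>

// Hypothetical integer mixer (multiply/xorshift finalizer).
// Every step is invertible, so distinct inputs give distinct outputs.
inline uint32_t hash_uint(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Derive a per-pixel seed from the pixel position and a global seed,
// instead of reading it from a buffer with one entry per pixel.
inline uint32_t pixel_seed(uint32_t x, uint32_t y, uint32_t width, uint32_t seed)
{
    return hash_uint((y * width + x) ^ hash_uint(seed));
}
```

The trade-off matches what the commit message describes: a few extra arithmetic operations per lookup in exchange for not allocating or copying a state buffer.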
Brecht Van Lommel
5b7d6ea54b Code refactor: add WorkTile struct for passing work to kernel.
This makes sharing some code between mega/split in following commits a bit
easier, and also paves the way for rendering multiple tiles later.
2017-10-04 21:11:14 +02:00
Campbell Barton
cc8c064f11 Merge branch 'master' into blender2.8 2017-09-28 03:05:46 +10:00
Brecht Van Lommel
88520dd5b6 Code refactor: simplify CUDA context push/pop.
Makes it possible to call a function like mem_alloc() when the context is
already active. Also fixes some missing pops in case of errors.
2017-09-27 13:43:21 +02:00
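One common way to get the two properties named in the commit above (nested activation is safe, and error paths cannot skip the pop) is a scoped guard. The sketch below uses a plain vector as a stand-in for the driver's context stack; `ContextStack`, `ContextScope`, and `mem_alloc_sketch` are hypothetical names and no real CUDA calls are made.

```cpp
#include <vector>

// Hypothetical stand-in for a driver-side context stack.
struct ContextStack {
    std::vector<int> active;
};

// RAII guard: pushes on construction and pops on destruction, so early
// returns and error paths cannot leave a context pushed.
class ContextScope {
public:
    ContextScope(ContextStack& stack, int context) : stack_(stack)
    {
        stack_.active.push_back(context);
    }
    ~ContextScope()
    {
        stack_.active.pop_back();
    }
    ContextScope(const ContextScope&) = delete;
    ContextScope& operator=(const ContextScope&) = delete;

private:
    ContextStack& stack_;
};

// Can be called whether or not the caller already holds a scope:
// the function pushes its own entry and always pops it again.
inline bool mem_alloc_sketch(ContextStack& stack, int context, bool fail)
{
    ContextScope scope(stack, context);
    if (fail) {
        return false;  // the destructor still pops here
    }
    return true;
}
```

Because each scope only undoes its own push, calling `mem_alloc_sketch` inside an outer `ContextScope` leaves the outer entry in place.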
Campbell Barton
3e555d3d78 Merge branch 'master' into blender2.8 2017-08-21 15:41:03 +10:00
Brecht Van Lommel
43a6cf1504 Cycles: attempt to recover from crashing CUDA/OpenCL drivers on Windows.
I don't know if this will actually work, needs testing. Ref T52064.
2017-08-20 23:18:25 +02:00
Bastien Montagne
e8b6bcd65c Merge branch 'master' into blender2.8
Conflicts:
	source/blender/depsgraph/intern/builder/deg_builder_relations.cc
	source/blender/editors/object/object_add.c
	source/blender/python/intern/bpy_app_handlers.c
2017-08-08 16:43:25 +02:00
Mai Lavelle
ec8ae4d5e9 Cycles: Pack kernel textures into buffers for OpenCL
Image textures were being packed into a single buffer for OpenCL, which
limited the amount of memory available for images to the size of one
buffer (usually 4 GB on AMD hardware). By packing textures into multiple
buffers that limit is removed, while simultaneously reducing the number
of buffers that need to be passed to each kernel.

Benchmarks were within 2%.

Fixes T51554.

Differential Revision: https://developer.blender.org/D2745
2017-08-08 07:12:04 -04:00
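The packing described above can be pictured with a simple first-fit scheme. This is a sketch under assumptions, not the method used in the commit: `Placement`, `Packing`, and `pack_textures` are hypothetical names, sizes are abstract units, and every texture is assumed to fit within one buffer's capacity.

```cpp
#include <cstddef>
#include <vector>

// Where a texture ended up: which buffer, and at what offset inside it.
struct Placement {
    std::size_t buffer;
    std::size_t offset;
};

struct Packing {
    std::vector<Placement> placements;  // one per texture, in input order
    std::vector<std::size_t> used;      // units used in each buffer
};

// First-fit: put each texture into the first buffer with enough room,
// opening a new buffer when none has space.
// Assumes every size is <= capacity.
inline Packing pack_textures(const std::vector<std::size_t>& sizes, std::size_t capacity)
{
    Packing result;
    for (std::size_t size : sizes) {
        std::size_t b = 0;
        while (b < result.used.size() && result.used[b] + size > capacity) {
            b++;
        }
        if (b == result.used.size()) {
            result.used.push_back(0);
        }
        result.placements.push_back(Placement{b, result.used[b]});
        result.used[b] += size;
    }
    return result;
}
```

With several buffers the total texture memory is no longer bounded by a single allocation, and a kernel only needs one pointer per buffer plus a (buffer, offset) pair per texture.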
Bastien Montagne
b282716c3a Merge branch 'master' into blender2.8 2017-08-07 16:16:43 +02:00
Brecht Van Lommel
45dcd20ca9 Cycles: CUDA split performance tweaks, still far from megakernel.
On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.
2017-08-05 14:32:59 +02:00
Luca Rood
bdeeb29482 Merge branch 'master' into blender2.8 2017-07-05 15:50:01 +02:00
Sergey Sharybin
d37dd97e45 Cycles: Pass string by const reference rather than by value
Some of the functions might have been inlined, but for others I don't see
how that was possible (I don't think virtual functions can be inlined here).

In any case, it is better to be explicitly optimal in the code.
2017-07-05 12:27:41 +02:00
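A small illustration of the difference the commit above refers to. The `Device` class and its methods are hypothetical and exist only for this example; they are not the Cycles device interface.

```cpp
#include <cstddef>
#include <string>

// Hypothetical interface, for illustration only.
class Device {
public:
    virtual ~Device() {}

    // By value: a call with an lvalue argument copies the string. When the
    // call goes through the virtual table, the compiler usually cannot
    // inline it and drop the copy.
    virtual std::size_t name_length_by_value(std::string name) const
    {
        return name.size();
    }

    // By const reference: the callee reads the caller's string directly.
    virtual std::size_t name_length_by_ref(const std::string& name) const
    {
        return name.size();
    }
};
```

Both methods return the same result; the second simply avoids a potential allocation and copy on every call.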
Campbell Barton
bb773acd5f Merge branch 'master' into blender2.8 2017-06-09 19:40:47 +10:00
Lukas Stockner
705c43be0b Cycles Denoising: Merge outlier heuristic and confidence interval test
The previous outlier heuristic only checked whether the pixel is more than
twice as bright compared to the 75% quantile of the 5x5 neighborhood.
While this detected fireflies robustly, it also incorrectly marked a lot of
legitimate small highlights as outliers and filtered them away.

This commit adds an additional condition for marking a pixel as a firefly:
In addition to being above the reference brightness, the lower end of the
3-sigma confidence interval has to be below it.
Since the lower end approximates how low the true value of the pixel might be,
this test separates pixels that are supposed to be very bright from pixels that
are very bright due to random fireflies.

Also, since there is now a reliable outlier filter as a preprocessing step,
the additional confidence interval test in the reconstruction kernel is no
longer needed.
2017-06-09 03:46:11 +02:00
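The combined test described in the commit above can be sketched like this. It is a minimal reading of the message, not the denoising kernel itself: the function names, the nearest-rank quantile rule, the use of twice the 75% quantile as the "reference brightness", and the per-pixel `variance` input are assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// 75% quantile of the neighborhood brightness values.
// Assumes a non-empty input; the exact quantile rule is an assumption.
inline float quantile75(std::vector<float> values)
{
    const std::size_t k = (values.size() * 3) / 4;
    std::nth_element(values.begin(), values.begin() + k, values.end());
    return values[k];
}

// A pixel is flagged as a firefly only if both conditions hold:
//  1. it is brighter than the reference (twice the 75% quantile), and
//  2. the lower end of its 3-sigma confidence interval is below the
//     reference, i.e. its true value could plausibly be much darker.
inline bool is_firefly(const std::vector<float>& neighborhood, float value, float variance)
{
    const float reference = 2.0f * quantile75(neighborhood);
    const float lower = value - 3.0f * std::sqrt(variance);
    return value > reference && lower < reference;
}
```

A bright pixel with low variance fails the second condition and is kept, which is how legitimate small highlights survive the filter.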
Bastien Montagne
44f91a9a18 Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/versioning_270.c
2017-05-22 22:49:02 +02:00
Sergey Sharybin
34b689892b Fix T51568: CUDA error in viewport render after fix for OpenCL
Re-loading the module seems to invalidate memory pointers,
which gives an error on the next kernel call.

Not sure how to move memory pointer from one CUDA module to another one,
so for now simply disabling kernel re-load for CUDA devices. Not ideal,
but better than failing render.

Feature-selective option for CUDA is not an official feature anyway.
2017-05-22 12:28:21 +02:00
Sergey Sharybin
38a2bf665b Cycles: Cleanup, style and unused arguments
- Some arguments were inappropriately tagged as unused
  using the (void)foo idiom.

  Only use that idiom in tricky cases, when something
  needs to be ignored in release builds or something
  depends on a tricky ifndef policy.

  For the rest of the cases just use the void foo(int /*bar*/)
  idiom, which ensures the variable is not used. Solves
  confusion and code running out of sync with later
  development.

- Applied the proper unused idiom to some arguments.

- Added braces to make code easier to follow, tricky
  indentation with ifdef, uh.
2017-05-20 05:21:27 -07:00
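The two ways of marking unused arguments discussed in the commit above look roughly like this. The functions `scale` and `half` are hypothetical and exist only to show the idioms.

```cpp
#include <cassert>

// Common case: comment out the parameter name. The body cannot refer to
// it, so the "unused" marking cannot drift out of sync with the code.
inline int scale(int value, int /*flags*/)
{
    return value * 2;
}

// Tricky case: the parameter is only read inside an assert, which is
// compiled out when NDEBUG is defined. Here (void)foo is appropriate.
inline int half(int value, int expected_parity)
{
    assert(value % 2 == expected_parity);
    (void)expected_parity;  // silences the warning in release builds
    return value / 2;
}
```

With the `(void)foo` form, nothing stops later code from starting to use the variable while the cast stays behind, which is the confusion the commit message mentions.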
Bastien Montagne
1f46da922a Merge branch 'master' into blender2.8
Conflicts:
	source/blender/blenloader/intern/versioning_270.c
	source/blender/depsgraph/intern/depsgraph_tag.cc
	source/blender/editors/mask/mask_draw.c
2017-05-19 09:36:14 +02:00