Commit Graph

595 Commits

Author SHA1 Message Date
Campbell Barton
d09920687c Merge branch 'master' into blender2.8 2018-05-02 12:46:14 +02:00
Lukas Stockner
16c05161e7 Cycles: Cleanup: Remove double semicolons 2018-04-29 09:28:41 +02:00
Clément Foucault
9ff8195535 OpenGL: Remove remaining instances of GL_RGBA16F_ARB.
There is no need for it now that we use opengl 3.3. Use GL_RGBA16F instead.
2018-04-24 12:48:43 +02:00
Sergey Sharybin
eaf8608ba5 Merge branch 'master' into blender2.8 2018-04-04 09:54:50 +02:00
Mai Lavelle
8c4d28cdb9 Fix T54400: Some GCN 1 cards available to select for use with Cycles
Hainan was missing from the list of GCN 1 cards.
2018-04-03 23:15:07 -04:00
Campbell Barton
2bc952fdb6 Merge branch 'master' into blender2.8 2018-02-18 22:33:05 +11:00
Brecht Van Lommel
fee4b646c4 Cycles: tweak CUDA messages and avoid build errors with existing sm_2x configs. 2018-02-18 00:53:25 +01:00
Brecht Van Lommel
1dcd7db73d Code cleanup: remove some more unused code after recent CUDA changes. 2018-02-18 00:53:03 +01:00
Thomas Dinges
9e717c0495 Cycles: Remove Fermi texture code.
This should be the last Fermi removal commit, unless I missed something.
It's been a pleasure Fermi!
2018-02-17 22:56:58 +01:00
Thomas Dinges
2eaf90b305 Cycles: Remove Fermi support from CMake and update runtime checks in device_cuda.cpp.
Fermi code in Cycles kernel and texture system are coming next.
2018-02-17 16:15:07 +01:00
Brecht Van Lommel
ade2aaba09 Merge branch 'master' into blender2.8 2018-02-07 17:17:24 +01:00
Brecht Van Lommel
1dafe759ed Update CUEW to latest version
This brings separate initialization for libcuda and libnvrtc, which
fixes Cycles nvrtc compilation not working on build machines without
CUDA hardware available.

Differential Revision: https://developer.blender.org/D3045
2018-02-07 11:53:01 +01:00
Campbell Barton
5376c739f5 Merge branch 'master' into blender2.8 2018-02-06 23:06:23 +11:00
Brecht Van Lommel
ce3e0afe59 Fix T54001: AMD OpenCL fails with certain resolutions, after recent changes.
We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers,
previous change in this code was incorrect. Renamed the function now to
make the specific purpose of this alignment clear, it's not required for
data types in general.
2018-02-05 22:19:49 +01:00
Campbell Barton
eeb621566a Merge branch 'master' into blender2.8 2018-02-04 10:46:34 +11:00
Ray Molenkamp
36c1122b96 msvc: Use source folder structure for project file.
This patch changes the huge list of projects in visual studio into a nice tree matching the source folder structure. see D2823 for details.

Differential Revision: http://developer.blender.org/D2823
2018-02-03 16:38:27 -07:00
Ray Molenkamp
a5052770b8 cycles: Add an nvrtc based cubin cli compiler.
nvcc is very picky regarding compiler versions, severely limiting the compiler we can use, this commit adds a nvrtc based compiler that'll allow us to build the cubins even if the host compiler is unsupported. for details see D2913.

Differential Revision: http://developer.blender.org/D2913
2018-02-03 10:59:09 -07:00
Sergey Sharybin
0d64857c3f Merge branch 'master' into blender2.8 2018-01-30 14:32:27 +01:00
Brecht Van Lommel
fb941679bb Fix Cycles allocating too much device memory, after recent memory refactoring.
Spotted by Ha Hyung-jin, thanks!
2018-01-29 17:07:08 +01:00
Brecht Van Lommel
41cc2ae626 Merge branch 'master' into blender2.8 2018-01-23 13:19:32 +01:00
Brecht Van Lommel
4a3ddd8a7a Fix Cycles assert when resizing rendererd viewport. 2018-01-23 13:07:25 +01:00
Campbell Barton
fc1fd2704a Merge branch 'master' into blender2.8 2018-01-23 11:45:39 +11:00
Sergey Sharybin
2f79d1c058 Cycles: Replace use_qbvh boolean flag with an enum-based property
This was we can introduce other types of BVH, for example, wider ones, without
causing too much mess around boolean flags.

Thoughs:

- Ideally device info should probably return bitflag of what BVH types it
  supports.

  It is possible to implement based on simple logic in device/ and mesh.cpp,
  rest of the changes will stay the same.

- Not happy with workarounds in util_debug and duplicated enum in kernel.
  Maybe enbum should be stores in kernel, but then it's kind of weird to include
  kernel types from utils. Soudns some cyclkic dependency.

Reviewers: brecht, maxim_d33

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D3011
2018-01-22 17:19:20 +01:00
Dalai Felinto
4d0bb7de64 Merge remote-tracking branch 'origin/master' into blender2.8 2018-01-19 17:01:48 -02:00
Sergey Sharybin
fa91b43e8c Cycles: Make it more proper check on vectorization flags from DebugFlags
Mimics to checks in system_cpu_support() checks.
2018-01-19 15:48:42 +01:00
Dalai Felinto
d9858d5897 Merge remote-tracking branch 'origin/master' into blender2.8 2018-01-19 12:46:23 -02:00
Sergey Sharybin
ccec1e7667 Cycles: Cleanup, stop using debug flags in system utilities
Debug flags are to be controlling render behavior, nothing to do with low level
system utilities.

it was simple to hack, but logically is wrong. Lets do things where they are
supposed to be done!
2018-01-19 15:22:32 +01:00
Sergey Sharybin
8e1dd7ed81 Cycles: Remove unneeded include statements
Also try to move them from headers to implementation files as much as possible.
2018-01-19 15:19:45 +01:00
Brecht Van Lommel
0fe41009f0 Fix T53830: Cycles OpenCL debug assert on macOS,
This was probably harmless besides some unnecessary memory usage due to
aligning allocations too much.
2018-01-19 11:35:07 +01:00
Campbell Barton
9c91c75ea6 Merge branch 'master' into blender2.8 2018-01-11 13:24:41 +11:00
Brecht Van Lommel
d0892a6648 Fix issue with moving CUDA memory to host and multiple devices.
This is not expected to fix all issues. Also adds some more details
to error reporting to investigate failures.
2018-01-11 00:00:48 +01:00
Brecht Van Lommel
0f4b46cee6 Fix T53692: OpenCL multi GPU rendering not using all GPUs.
Ensure each OpenCL device has a unique ID even if the hardware ID is not
unique for some reason.
2018-01-11 00:00:48 +01:00
Campbell Barton
be40389165 Merge branch 'master' into blender2.8 2018-01-03 23:44:47 +11:00
Brecht Van Lommel
c621832d3d Cycles: CUDA support for rendering scenes that don't fit on GPU.
In that case it can now fall back to CPU memory, at the cost of reduced
performance. For scenes that fit in GPU memory, this commit should not
cause any noticeable slowdowns.

We don't use all physical system RAM, since that can cause OS instability.
We leave at least half of system RAM or 4GB to other software, whichever
is smaller.

For image textures in host memory, performance was maybe 20-30% slower
in our tests (although this is highly hardware and scene dependent). Once
other type of data doesn't fit on the GPU, performance can be e.g. 10x
slower, and at that point it's probably better to just render on the CPU.

Differential Revision: https://developer.blender.org/D2056
2018-01-02 23:50:18 +01:00
Brecht Van Lommel
6699454fb6 Cycles: make CUDA code a bit more robust to host/device alloc failures.
Fixes a few corner cases found while stress testing host mapped memory.
2018-01-02 23:46:19 +01:00
Sergey Sharybin
9f0d067c2e Merge branch 'master' into blender2.8 2017-12-21 11:17:34 +01:00
Sergey Sharybin
5650fe77e4 Cycles: Cleanup, indentation 2017-12-20 17:42:50 +01:00
Campbell Barton
7ca8af4cc8 Merge branch 'master' into blender2.8 2017-12-06 16:51:37 +11:00
Lukas Stockner
2069102c56 Cycles: Fix constness for load_kernels in device_cpu.cpp 2017-12-06 00:00:18 +01:00
Campbell Barton
03a5eccc94 Merge branch 'master' into blender2.8 2017-11-30 18:30:41 +11:00
Lukas Stockner
fa3d50af95 Cycles: Improve denoising speed on GPUs with small tile sizes
Previously, the NLM kernels would be launched once per offset with one thread per pixel.
However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown.

Therefore, the kernels are now launched in a single call that handles all offsets at once.
This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory.
On the other hand, of course, the smaller tiles significantly reduce the size of the memory.

The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum.
I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere.

To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now.
Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.
2017-11-30 07:37:08 +01:00
Brecht Van Lommel
d992240bfa Fix unneeded legacy OpenGL call in Cycles viewport drawing. 2017-11-24 00:12:48 +01:00
Julian Eisel
7f96323cd0 Merge branch 'master' into blender2.8 2017-11-19 13:16:14 +01:00
Lukas Stockner
40f528a7da Cycles: Add per-tile render time debug pass
Reviewers: sergey, brecht

Differential Revision: https://developer.blender.org/D2920
2017-11-17 16:40:24 +01:00
Dalai Felinto
1cb6cea71c Merge remote-tracking branch 'origin/master' into blender2.8 2017-11-13 11:48:48 -02:00
Brecht Van Lommel
e568c1a975 Fix T53289: CUDA missing textures not showing pink, after recent changes. 2017-11-12 20:45:47 +01:00
Mai Lavelle
e389ae9dca Cycles: Set error if a split kernel fails to load
To help catch cases where adding a new kernel is missed for one of the
device implementations.
2017-11-11 01:01:14 -05:00
Bastien Montagne
7a6ad2901c Merge branch 'master' into blender2.8 2017-11-10 10:13:19 +01:00
Brecht Van Lommel
bd4bea3e98 Cycles: avoid reallocating tile denoising memory many times during render. 2017-11-09 20:28:00 +01:00
Sergey Sharybin
c99481b632 Merge branch 'master' into blender2.8 2017-11-09 10:59:15 +01:00