Commit Graph

155 Commits

Author SHA1 Message Date
Sergey Sharybin
333366dbcf Cycles: Fix typo in shader cancel routines 2016-09-29 15:48:10 +02:00
Sergey Sharybin
91e0a16f2f Cycles: Use XDG's .cache folder for cached kernels
Basically just moves cached kernels from ~/.config/blender/BLENDER_VERSION to
~/.cache/cycles/kernels. This has following benefits:

- Follows XDG specification more closely,
  not as if it's totally crucial or measurable by users, but still nice.

- Prevents unexpected sizes of config folder, makes disk space used in more
  predictable for users way.

- Allows to share kernels across multiple Blender versions,
  which makes it easier debugging at the times close to release.

- "Copy Previous Settings" operator will no longer be copying possibly
  gigabytes of cached kernels, which used to lead to really nast disk usage
  and annoying delays of copying settings.

- In the future we can have some smart logic to clear old unused cached
  kernels.

Currently only done for Linux and OSX. Windows still follows old "cache"
folder logic, but it's not really important for now because we don't
support kernel compilation on this platform yet.

Reviewers: dingto, juicyfruit, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2197
2016-09-12 09:39:05 +02:00
Thomas Dinges
9d236ac06c Cycles: Enable half float support (4 channels and 1 channel) on CUDA.
Atm OpenEXR half files benefit from this and will use only 1/2 of the memory now. More space for HDRs!

Part of my GSoC 2016.
2016-08-11 22:47:53 +02:00
Thomas Dinges
c2a7317d1f CUDA: We don't support Toolkits < 7.5, update error message. 2016-08-09 11:41:25 +02:00
Sergey Sharybin
b416168d85 Cycles: Cleanup, trailing whitespace 2016-08-02 14:09:34 +02:00
Sergey Sharybin
7b8b16a18c Cycles: Some cleanup in CUDA device file 2016-08-02 14:09:34 +02:00
Sergey Sharybin
ad48f13099 Cycles: Include NVCC compiler flags into md5 hash
This way we can easily switch between toolkits without worrying
whether some kernel was compiled with old or new CUDA toolkit.

It's also now possible to switch machine architecture and have
proper cached kernel detected. Not as if it happens every day,
but i did such a bitness switch back in the days :)
2016-08-02 14:09:34 +02:00
Sergey Sharybin
6353ecb996 Cycles: Tweaks to support CUDA 8 toolkit
All the changes are mainly giving explicit tips on inlining functions,
so they match how inlining worked with previous toolkit.

This make kernel compiled by CUDA 8 render in average with same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But slowdown is within 1% so far.

On a positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
2016-08-01 15:54:29 +02:00
Sergey Sharybin
b406b7be00 Cycles: Mark which CUDA device is used for display
It is really handy to know which one is display when having two cards of
same type in the machine.
2016-06-03 11:52:08 +02:00
Mai Lavelle
4388b29e98 Cycles: Add human readable sizes to debug output
Some of these values can get quite large and are hard to read, adding this
makes it easy to read them at a glance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2039
2016-05-31 06:13:54 -04:00
Brecht Van Lommel
f7c28a66e2 Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
Thomas Dinges
dedc995018 Cycles / CUDA: Don't use bundled kernel if Adaptive is enforced by the user. 2016-05-19 16:32:57 +02:00
Thomas Dinges
c9f1ed1e4c Cycles: Add support for bindless textures.
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (Geforce 6xx and above).
This is used for all 2D/3D textures, data still uses arrays as before.

User benefits:
* No more limits of image textures on Kepler.
 We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4.
 This can be extended further if we need to (just change the define).

* Single channel textures slots (byte and float) are now supported on Kepler as well (1024 slots for each type).

ToDo / Issues:
* 3D textures don't work yet, at least don't show up during render. I have no idea whats wrong yet.
* Dynamically allocate bindless_mapping array?

I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.

Part of my GSoC 2016.

Reviewers: sergey, #cycles, brecht

Subscribers: swerner, jtheninja, brecht, sergey

Differential Revision: https://developer.blender.org/D1999
2016-05-19 13:14:37 +02:00
Thomas Dinges
29a17d54da Fix CUDA MEMCPY condition, it should only copy 3D, 2D or 1D.
Found by Brecht, thanks!
2016-05-17 00:37:34 +02:00
Thomas Dinges
1c46ecd86b Cleanup: Remove unneded (void) line, we don't have ifdefs here anymore. 2016-05-07 15:55:28 +02:00
Thomas Dinges
734d1aec3f Cycles: Make CUDA adaptive feature compile a Debug flag.
If the CUDA Toolkit is installed and the user is on Linux,
adaptive, feature based CUDA runtime compile is now possible to enable via:

* Environment flag CYCLES_CUDA_ADAPTIVE_COMPILE or
* Debug menu (Debug value 256) in the Cycles UI.
2016-05-06 23:13:33 +02:00
Thomas Dinges
3807bcb3a8 Cleanup: Rename texture slots to float4 and byte, to distinguish from future float (single channel) and half_float slots.
Should be no functional changes, tested CPU and CUDA.
2016-05-06 14:37:35 +02:00
Sergey Sharybin
e50d229273 Fix T47794: Point density sometime seems stretched when rendered on GPU 2016-04-20 14:42:19 +02:00
Sergey Sharybin
b20f12d835 Cycles: Some typo fixes 2016-03-12 15:01:20 +05:00
Sergey Sharybin
8cab327316 Cycles: Make CUDA 7.5 officially recommended
This was a hard decision, because going newer CUDA toolkit makes
rendering up to 5% slower. But on another hand, it solves major
speed regressions (up to 30%) with branched path tracing on a
top level cards.

Neither of those regressions have a meaningful and sane workaround
from the code itself.

Toolkit 6.5 could still be used, but it's no longer recommended one.
2016-02-17 15:18:56 +01:00
Sergey Sharybin
1c4f21f85e Cycles: Initial support of 3D textures for CUDA rendering
Supports both smoke/fire and point density textures now.

Reduces number of textures available for sm_20 and sm_21, but you have
to compromise somewhere on such a limited hardware.

Currently limited to linear interpolation only, and decoupled ray
marching is not supported yet. Think those could be considered just a
further improvement.

Some quick example:

  https://developer.blender.org/F282934

Code is minimal and we can fully consider it a fix for missing
support of 3D textures with CUDA.

Reviewers: lukasstockner97, brecht, juicyfruit, dingto

Reviewed By: brecht, juicyfruit, dingto

Subscribers: mib2berlin

Differential Revision: https://developer.blender.org/D1806
2016-02-15 21:26:29 +01:00
Sergey Sharybin
28604c46a1 Cycles: Make Blender importer more forward compatible
Basically the idea is to make code robust against extending
enum options in the future by falling back to a known safe
default setting when RNA is set to something unknown.

While this approach solves the issues similar to T47377,
but it wouldn't really help when/if any of the RNA values
gets ever deprecated and removed. There'll be no simple
solution to that apart from defining explicit mapping from
RNA value to Cycles one.

Another part which isn't so great actually is that we now
have to have some enum guards and give some explicit values
to the enum items, but we can live with that perhaps.

Reviewers: dingto, juicyfruit, lukasstockner97, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D1785
2016-02-12 15:27:33 +01:00
Sergey Sharybin
3aa74828ab Cycles: Cleanup, indentation and braces 2016-02-03 15:00:55 +01:00
Sergey Sharybin
52e34ffe33 Cycles: Pass missing shader filter argument to CUDA and OpenCL kernels 2016-01-19 22:53:19 +01:00
Sergey Sharybin
55926ad298 Cycles: Fix string compiler warnings after recent changes 2016-01-14 17:04:56 +05:00
Thomas Dinges
3ba9742be2 Cycles: Remove the experimental CUDA kernel.
This commit removes the experimental CUDA kernel, making SSS and CMJ
regular features.

Several improvements have been made in the past few
weeks (thanks Sergey!) which make SSS render several times faster (2-3x
compared to 2.76b) on the GPU, and the increased VRAM usage has also been
fixed. Therefore the experimental kernel is no longer needed.

Differential Revision: https://developer.blender.org/D1726

Manual has been updated: too:
https://www.blender.org/manual/render/cycles/features.html
2016-01-14 12:56:08 +01:00
Sergey Sharybin
2af7637f20 Cycles: Add option to directly link against CUDA libraries
The main purpose of such linking is to make Blender compatible with
NVidia's debuggers and profilers which are doing some LD_PRELOAD
magic to intercept some function calls. Such magic conflicts with
our CUDA wrangler magic and causes segmentation faults.

The option is disabled by default, so there's no affect on any of
artists.

In order to make Blender linked directly against CUDA library use
the WITH_CUDA_DYNLOAD CMake option (it's marked as advanced).
2016-01-14 12:27:22 +05:00
Sergey Sharybin
3918c8b9a5 Cycles: Optionally output luminance from the shader evaluation kernel
This makes it possible to move some parts of evaluation from host to the device
and hopefully reduce memory usage by avoid having full RGBA buffer on the host.

Reviewers: juicyfruit, lukasstockner97, brecht

Reviewed By: lukasstockner97, brecht

Differential Revision: https://developer.blender.org/D1702
2015-12-30 19:04:04 +05:00
Mike Erwin
6006173f4a OpenGL: use sized texture internal formats
Maybe this is pedantic but I read it’s best to explicitly set the
desired component size.

Also append “_ARB” to float texture formats since those need an
extension in GL 2.1.
2015-12-08 01:19:55 -05:00
Sergey Sharybin
2ae7593700 Cycles: Avoid having two consequence getenv() calls 2015-11-28 21:05:12 +05:00
Mike Erwin
291afea8cc OpenGL: clean up use of old extensions 2015-11-24 02:21:07 -05:00
Sergey Sharybin
c08727ebab Cycles: Quick experiment with using feature-adaptive kernels for CUDA
Gives few percent of memory improvement for regular feature set kernel
and could give significant memory improvement for Experimental kernel.
It could also give some degree of performance improvement, but this I
didn't really measure reliably yet.

Code is ifdef-ed for now, since it's only working on Linux and requires
CUDA toolkit to be installed (other platform only use precompiled
kernels).

This is just an experiment for now and a base for the proper feature
support in the future (with runtime compilation using CUDA 7?).
2015-11-21 22:16:01 +05:00
Sergey Sharybin
52d7bc624a Cycles: Make CUDA device internally operate with requested features
This just replaces internal argument `experimental` with `requested_features`
making it possible to access particular requested settings when building
kernels.
2015-11-21 21:49:00 +05:00
Sergey Sharybin
4690281b17 Cycles: Add implementation of clip extension mode
For now there's no OpenCL support, it'll come later.
2015-07-28 14:36:08 +02:00
Sergey Sharybin
3fba620858 Cycles: Prepare for more image extension types support
Basically just replace boolean periodic flag with extension type enum in the
device API.
2015-07-28 14:14:24 +02:00
Sergey Sharybin
4d74180b9f Cycles: Fix for wrong device enumeration in CUDA
it is the same issue as described in the previous commit, original changes
in this area were wrong and only worked on a bugger optimus driver which
simply appeared to work by co-incident and in fact used wrong device..
2015-06-27 15:13:08 +02:00
Sergey Sharybin
63dd554ff1 Cycles: Don't show pre-sm_20 CUDA cards in the device list 2015-06-20 17:34:12 +02:00
Sergey Sharybin
28f798f86e Cycles: Initial support for OpenCL capabilities reports
For now it's just generic information, still need to expose memory, workgorup
sizes and so on.
2015-06-05 14:17:30 +02:00
Sergey Sharybin
399a27b261 Cycles: Code cleanup, spaces around keyword and brace 2015-06-01 19:49:52 +05:00
Sergey Sharybin
2c503d8303 Cycles: Restructure kernel files organization
Since the kernel split work we're now having quite a few of new files, majority
of which are related on the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.

Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.

Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.

There should be no functional changes.

Reviewers: juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1314
2015-05-22 16:31:34 +05:00
Sergey Sharybin
2ab909a88c Cycles: Make experimental kernel build option more generic
Previously it was explicitly mentioning it's NVidia kernel related option,
but in fact it's also handy for the OpenCL kernel.
2015-05-15 13:22:47 +05:00
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
Antony Riakiotakis
e38f914421 Cycles: use vertex buffers when possible to draw tiles on the screen.
Not terribly necessary in this case, since we are just drawing a quad,
but makes blender overall more GL 3.x core ready.
2015-05-11 16:28:41 +02:00
Antony Riakiotakis
5588a51c9c Cycles OpenGL: Don't use full matrix transform when we can just use
simple addition.
2015-05-11 13:10:19 +02:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're gonna to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Sergey Sharybin
7ea7c2aab2 Cycles: Fix inconsistent command line used for runtime kernel compilation
Basically build-time compiled kernels were using --fast-math (which is correct)
but run-time compiled did not.
2015-02-02 15:00:21 +05:00
Sergey Sharybin
77e6f2212f Cycles: Allow paths customization via environment variables
This is for development and test environment setup only, not for
regular users usage hence no mentioning in the man page needed.
2015-02-02 02:02:10 +05:00