Commit Graph

5429 Commits

Author SHA1 Message Date
Sergey Sharybin
2a5c1fc9cc Cycles: Delay shooting SSS indirect rays
The idea is to delay shooting indirect rays for the SSS sampling and
trace them after the main integration loop was finished.

This reduces GPU stack usage even further and brings it down to around
652MB (comparing to 722MB before the change and 946MB with previous
stable release).

This also solves the speed regression happened in the previous commit
and now simple SSS scene (SSS suzanne on the floor) renders in 0:50
(comparing to 1:16 with previous commit and 1:03 with official release).
2015-11-25 13:01:22 +05:00
Sergey Sharybin
8bca34fe32 Cysles: Avoid having ShaderData on the stack
This commit introduces a SSS-oriented intersection structure which is replacing
old logic of having separate arrays for just intersections and shader data and
encapsulates all the data needed for SSS evaluation.

This giver a huge stack memory saving on GPU. In own experiments it gave 25%
memory usage reduction on GTX560Ti (722MB vs. 946MB).

Unfortunately, this gave some performance loss of 20% which only happens on GPU.
This is perhaps due to different memory access pattern. Will be solved in the
future, hopefully.

Famous saying: won in memory - lost in time (which is also valid in other way
around).
2015-11-25 13:01:22 +05:00
Mike Erwin
e6fff424db OpenGL: set geometry shader input length implicitly
Input array length is implicitly set at link time, based on the geometry
shader's layout. Specifying the wrong value here is an error; specifying
no value is the same as getting it right. (inspired by a recent codegen
change)
2015-11-25 01:49:07 -05:00
Bastien Montagne
14221521fb Fix previous own fix - second message was actually OK, first one had bad comma placement...
Thanks to psy-fi for the head-up.
2015-11-24 15:36:49 +01:00
Bastien Montagne
0b422900c8 Fix broken windows 'MessageBox' calls (UI messages).
Reported by Bzzt_Ploink on IRC.
2015-11-24 15:14:22 +01:00
Sergey Sharybin
fa6bdfd622 Cycles: Support per-render layer world AO settings
This is sort of extension of existing Use Environment option which now allows to
disable AO on the render layer basis.

Useful in cases like disabling AO for the background because it might make it
too flat and so.

Reviewers: juicyfruit, dingto, brecht

Reviewed By: brecht

Subscribers: eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D1633
2015-11-24 13:21:40 +05:00
Mike Erwin
ef5fff4adc OpenGL: when checking GL version, assume >= 2.1
Mostly glBlendFunc related.
2015-11-24 02:34:54 -05:00
Mike Erwin
291afea8cc OpenGL: clean up use of old extensions 2015-11-24 02:21:07 -05:00
Brecht Van Lommel
880258a0db Fix T46848: OpenNL crash on Windows due to uninintialized variables. 2015-11-23 18:20:32 +01:00
Sergey Sharybin
f021d97e8f Fix T46842: Removing World is missing AO update in viewport render 2015-11-23 17:44:52 +05:00
Mike Erwin
f997449f84 OpenSubdiv: support OpenGL 3.x
GLSL 130, 140, 150 with extensions as needed.

Similar logic to my recent gpu_extensions changes.

Partially fixes T46706. Matcaps now work with OpenSubdiv, as do basic
materials. Anything with UV coordinates is still broken.
2015-11-23 03:35:16 -05:00
Campbell Barton
4fe413a419 CMake: use -Wshadow warning for C source
C source now builds without shadowing, enable with GCC by default.
2015-11-23 17:43:55 +11:00
Brecht Van Lommel
d28431a648 OpenNL: make the API thread safe by always passing context.
Previously two laplacian smooth or deform modifiers executing
simultaneously could crash.
2015-11-22 22:49:43 +01:00
Brecht Van Lommel
47ce2d7bef OpenNL: significantly simplify code using Eigen / STL. 2015-11-22 22:49:03 +01:00
Brecht Van Lommel
e6c58df74e OpenNL: replace SuperLU by Eigen SparseLU solver.
Performance is roughly the same because it's using the same COLAMD ordering
and supernodal LU factorization algorithms. Solve results also appear to be
identical.
2015-11-22 22:49:03 +01:00
Brecht Van Lommel
5e52433371 OpenNL: convert source file to C++, remove some unused functions. 2015-11-22 22:49:03 +01:00
Antony Riakiotakis
db1f0e3616 Error out on Windows if driver does not support OpenGL 2.1 with an error
messagebox.
2015-11-22 20:53:57 +01:00
Antony Riakiotakis
8623d75b46 Add check for OpenGL version 2.1 on linux.
Unfortunately there's no easy way to show a messagebox here, so just
print a warning on fstderr and exit. If we don't call exit() here we get
crashes on other blender systems (python, opensubdiv) and it can get
tricky to track the initialization state here, so just using exit()
should do the trick for now.
2015-11-22 18:14:46 +01:00
Sergey Sharybin
56ead9d34b Cycles: Make branched path tracer covered with requested features
This gives few percent extra memory saving for the CUDA kernel when
using regular path tracing.

Still more like an experiment, but will be handy in the future.
2015-11-22 13:54:51 +05:00
Lukas Stockner
5c5df9dc18 Cycles: Save one transform inversion in the camera sync
Summary: By calculating the Camera-to-Screen-Matrix first, one inversion can be saved in the Camera sync.
It won't really improve speed and/or precision, it's mainly a small cleanup.

Reviewers: sergey, dingto

Subscribers:
2015-11-22 00:58:40 +01:00
Sergey Sharybin
f547bf2f10 Cycles: Make requested features struct aware of subsurface BSDF
This way we'll be able to disable SSS for the scene-adaptive kernel.
2015-11-21 23:00:29 +05:00
Sergey Sharybin
c08727ebab Cycles: Quick experiment with using feature-adaptive kernels for CUDA
Gives few percent of memory improvement for regular feature set kernel
and could give significant memory improvement for Experimental kernel.
It could also give some degree of performance improvement, but this I
didn't really measure reliably yet.

Code is ifdef-ed for now, since it's only working on Linux and requires
CUDA toolkit to be installed (other platform only use precompiled
kernels).

This is just an experiment for now and a base for the proper feature
support in the future (with runtime compilation using CUDA 7?).
2015-11-21 22:16:01 +05:00
Sergey Sharybin
f8ab3fd30f Cycles: add utility function to calculate MD5 hash of a given string 2015-11-21 22:07:59 +05:00
Sergey Sharybin
52d7bc624a Cycles: Make CUDA device internally operate with requested features
This just replaces internal argument `experimental` with `requested_features`
making it possible to access particular requested settings when building
kernels.
2015-11-21 21:49:00 +05:00
Sergey Sharybin
a106da7f1d Cycles: Move build options constructions to DeviceRequestedFeatures
This way it's easier to re-use requested features logic across multiple
device implementations.
2015-11-21 21:42:31 +05:00
Sergey Sharybin
9aafec1ce1 Cycles: Avoid multiple spaces in OpenCL build options
This should solve some compilation errors with compilation on OSX.
2015-11-21 21:33:08 +05:00
Sergey Sharybin
7e71be261b Cycles: Fix filter glossy being broken after recent changes
Basically we can not use sharp closure as a substitude when filter glossy is
used. This is because we can not blur sharp reflection/refraction.

This is quite quick and not really clean implementation. Not really happy
with manual handling of original settings, but this is as good as we can do
in the quick patch. It's a good acknowledgment and we now can re-consider
some aspects of graph simplification to make such cases more natively
supported.

P.S. This failure would have been shown by our regression tests, so please,
bother a bit to run Cycles's test sweep before doing such optimizations.
2015-11-20 18:18:27 +05:00
Campbell Barton
ae8e4d3718 Cleanup: redundant 'break', minor edits 2015-11-19 22:52:13 +11:00
Thomas Dinges
eeed28fefc Fix T46818, crash with Glossy node on Windows. 2015-11-19 08:43:43 +01:00
Lukas Stockner
8dea06565f Cycles: Add Blackman-Harris filter, fix Gaussian filter
This commit adds the Blackman-Harris windows function as a pixel filter to Cycles. On some cases, such as wireframes or high-frequency textures,
Blackman-Harris can give subtle but noticable improvements over the Gaussian window.
Also, the gaussian window was truncated too early, which degraded quality a bit, therefore the evaluation region is now three times as wide.
To avoid artifacts caused by the wider curve, the filter table size is increased to 1024.

Reviewers: #cycles

Differential Revision: https://developer.blender.org/D1453
2015-11-18 20:50:06 +01:00
Thomas Dinges
0639ba8ea5 Cycles / Shader graph: Fallback to Sharp closures for very small roughness.
We fallback to Sharp closures for Glossy, Glass and Refraction nodes now, in case the Roughness input is disconnected and 0 (< 1e-4f to be exact).
This way we gain a few percentages of performance, in case the user did not manually set the closure type to "Sharp" in the UI.

Sharp will probably be removed from the UI as a followup, not needed anymore with this internal optimization.

Original idea by Lukas Stockner(Differential Revision: https://developer.blender.org/D1439), code implementation by myself.
2015-11-18 18:47:56 +01:00
Thomas Dinges
836c69c92f Cleanup: Add some notes in code for upcoming graph simplification process. 2015-11-18 17:20:39 +01:00
Thomas Dinges
38bbc920a6 Cycles: Add utility functions to get a ShaderInput / ShaderOutput by name. 2015-11-18 17:12:26 +01:00
Campbell Barton
dc14629b26 GHOST: rename suffix X11 to Unix for non X11 files
We may use these for Wayland or SDL back-ends.
2015-11-16 21:57:05 +11:00
Mike Erwin
21195a9ea4 check compute shader support for OpenSubdiv
Built into OpenGL 4.3, or 4.2 plus ARB_compute_shader extension.
2015-11-15 23:15:00 -05:00
Mike Erwin
f9e8de0b26 minor cleanup: spelling/wording 2015-11-14 13:48:15 -05:00
Mike Erwin
f34cb5ab5a tweak GL extension check for OpenSubdiv drawing
Once we adopt GL 3.2 across Blender, the check will be:

return GLEW_VERSION_4_0 || GLEW_ARB_gpu_shader5;
2015-11-14 13:43:39 -05:00
Mike Erwin
46478ad2bc enable OpenSubdiv Transform Feedback on Intel
Fixed extension check.

This feature requires ARB_texture_buffer_object which was subsumed into
OpenGL 3.0. Intel driver on Windows doesn't claim to support this
extension, but GL version is > 3 so it actually does work.
2015-11-14 13:34:41 -05:00
Mike Erwin
175f8e49e7 enable OpenSubdiv on Intel graphics
Tested working on Haswell i5-4670 running Windows 10.
2015-11-14 13:27:09 -05:00
Sergey Sharybin
b3184640fe OpenSubdiv: Enable GLSL Compute for AMD devices
There was a bug in OpenSubdiv library which is now fixed by updating library in
the build environments.

Still requires testing, but now being at the beginning of release cycle is
should be safe to simply enable option and let everyone to test :)
2015-11-10 12:16:18 +05:00
Campbell Barton
cfbbf72d89 Revert "Increase CMake minimum version to 3.0"
This reverts commit ff3cf93405.

Turns out distros only a year old still use CMake 2.8x
2015-11-10 02:53:10 +11:00
Campbell Barton
ff3cf93405 Increase CMake minimum version to 3.0
This allows us to use newer features of CMake, and less hassles having to test & support older versions.
2015-11-09 23:37:53 +11:00
Jörg Müller
8c25c1c484 Fix T46683: High pitch sound artifact on import of 48k audio
The bug header is wrong, the file contains the high pitched sound, but the bug that existed was that animation rendering did not use the high quality resampler, while audio mixdown does.
Blender uses the low quality resampler to be as little CPU consuming as possible.
2015-11-03 19:34:33 +01:00
Sergey Sharybin
537bd0eb51 Fix T46671: Cycles assert with CMJ sample function
With current formulation of cmj_fast_div_pow2() it should always return 0
in the case of first argument is zero and no assert really needed anymore.
2015-11-03 18:49:27 +05:00
Sergey Sharybin
9bce104c8c Cycles: Partially revert previous commit
Apparently removing kernel arguments broke NVidia OpenCL.

Needs more investigation, for the time being revering changes which caused problem.
2015-11-01 21:01:12 +05:00
Sergey Sharybin
dc9e0b819b Cycles: Remove unused argument from the split kernel functions
Should be no functional changes, just simplifies operation with kernels.
2015-11-01 17:22:42 +05:00
Sergey Sharybin
84e8b05e97 Cycles: Minor code style cleanup 2015-11-01 15:40:17 +05:00
Sergey Sharybin
a43e087fb8 Cycles: Add split kernel headers to project generation 2015-10-31 04:08:19 +05:00
Sergey Sharybin
cb1cb63d40 Cycles: Fixes for few typos in OpenCL kernel 2015-10-30 23:31:24 +05:00
Sergey Sharybin
537f41250f Cycles: Fix typo in split kernel
Shadow blocked kernel was using wrong array for storing intersection.
2015-10-29 21:52:56 +05:00