Commit Graph

237 Commits

Author SHA1 Message Date
Sergey Sharybin
2c503d8303 Cycles: Restructure kernel files organization
Since the kernel split work we're now having quite a few of new files, majority
of which are related on the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.

Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.

Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.

There should be no functional changes.

Reviewers: juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1314
2015-05-22 16:31:34 +05:00
Thomas Dinges
a934730368 Cycles: Remove TM / R and whitespace from OpenCL device names.
Was already done for CPU devices, now we also do this for OpenCL.
2015-05-21 23:43:18 +02:00
Sergey Sharybin
d4c676e81b Cycles: CYCLES_OPRNCL_DEBUG now affects on split kernel as well 2015-05-21 14:30:33 +05:00
Sergey Sharybin
f18d77b874 Cycles: Restore some lost custom cflags passed to the kernel compilation
They were lost during simplification of kernel loading but might be rather
crucial for the performance.

Also made it so cflags are shared across kernels. Surely it might lead to
some unwanted kernel re-compilation but at the same time they might easily
run out of sync with the changes in kernel and so.
2015-05-21 14:05:53 +05:00
Sergey Sharybin
148ed4e05e Cycles: Cleanup, synchronize name across file name, program and kernel names 2015-05-20 23:10:07 +05:00
Sergey Sharybin
6f48df45ee Cycles: Simplify code around kernel loading 2015-05-20 23:10:07 +05:00
Sv. Lockal
88acb3c599 Fix T44707: cycles border render regression 2015-05-18 11:37:19 +10:00
Thomas Dinges
105b87a3f7 Cycles: Enable advanced shading on AMD / OpenCL.
That is needed for Motion Blur and Render Passes to work properly.
I hope there are no nasty side effects, but we need to test this.
2015-05-17 19:29:33 +02:00
Thomas Dinges
14c2bc53c0 Cleanup: Typos, typos everywhere. :D 2015-05-17 18:32:31 +02:00
Campbell Barton
daeb3069cf Cleanup: typos 2015-05-17 16:09:32 +10:00
Campbell Barton
31e96cbf96 Cleanup: style, spelling 2015-05-15 23:38:53 +10:00
Sergey Sharybin
c2b9f78415 Cycles: Pass __KERNEL_EXPERIMENTAL__ to OpenCL split kernels
Experimental feature set id currently unavailable for megakernel, it'll
require some changes to the cache system to distinguish cached regular
kernels from cached experimental kernels.

Currently unused, but some features will be enabled soon.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
2ab909a88c Cycles: Make experimental kernel build option more generic
Previously it was explicitly mentioning it's NVidia kernel related option,
but in fact it's also handy for the OpenCL kernel.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
960d7df56f Cycles: Pass device compute capabilities to kernel via build options
This way it's possible to do device-selective feature disabling/enabling.
Currently only supported for NVidia devices via OpenCL extension.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
03f9d5a4cf Cycles: Cleanup, move build options string calculation into the device class
This way it's easier to access platform name, device ID and other stuff which
might be needed to define build options.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
3c10ec96b5 Cycles: Enable object motion blur on Intel OpenCL platform
This required allocating some memory related on object transform needed
by ShaderData and currently it is done for all the platforms. Since we're
targeting full feature-complete platforms this is rather acceptable at
this point and in the future we'll do selective NO_HAIR/NO_SSS/NO_BLUR
kernels.

This is experimental still and in fact there're some major issues on
NVidia platform and it's not really clear if it's a bug in compiler,
some uninitizlied variable or other kind of issue.
2015-05-15 00:48:12 +05:00
Sergey Sharybin
03565218d5 Cycles: Various fixes
Some stupid fixes like spaces around operator and missing semicolon,
plus fix for wrong detecting of ShaderData SOA size. Thar was harmless
since there's only one closure array, but still better to fix this.
2015-05-15 00:42:05 +05:00
Sergey Sharybin
f6c6dd44de Cycles: Remove meaningless ifdef checks for features in device_opencl
This file was actually checking for features enabled on CPU and surely all
of them were enabled, so removing them does not cause any difference.

ideally we'll need to do runtime feature detection and just pass some stuff
as NULL to the kernel, or maybe also have variadic kernel entry points which
is also possible quite easily.
2015-05-14 23:44:19 +05:00
Sergey Sharybin
93867ae549 Cycles: Cleanup: use generic utility function to set kernel arguments 2015-05-13 19:56:24 +05:00
Sergey Sharybin
51a6bc8faa Cycles: Inline sizeof of elements needed for the split kernel
No need to store them in the class, they're unlikely to be changed
and if they do change we're in big trouble anyway.

More appropriate approach would be then to typedef this things in
kernel_types.h, but still use inlined sizeof(),
2015-05-13 19:56:24 +05:00
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
Antony Riakiotakis
e38f914421 Cycles: use vertex buffers when possible to draw tiles on the screen.
Not terribly necessary in this case, since we are just drawing a quad,
but makes blender overall more GL 3.x core ready.
2015-05-11 16:28:41 +02:00
Antony Riakiotakis
5588a51c9c Cycles OpenGL: Don't use full matrix transform when we can just use
simple addition.
2015-05-11 13:10:19 +02:00
Sergey Sharybin
3a2c0ccdd0 Cycles: Correction to opencl whitelist check
Was using platform as a device id accidentally.
2015-05-10 20:02:06 +05:00
Sergey Sharybin
136d7a4f62 Cycles: Only whitelist AMD GPU devices in the OpenCL section
Only those ones are priority for now, all the rest are still testable
if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
2015-05-09 23:40:26 +05:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Sergey Sharybin
f680c1b54a Cycles: Communicate number of closures and nodes feature set to the device
This way device can actually make a decision of how it can optimize the kernel
in order to make it most efficient.
2015-05-09 19:28:00 +05:00
Sergey Sharybin
b3299bace0 Cycles: Pass requested tile size to the device via device task
This is currently unused but crucial for things like calculating amount of
device memory required to deal with the tasks.

Maybe not really best place to store it, but consider it good enough for now.
2015-05-09 19:09:07 +05:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're gonna to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
Sergey Sharybin
35812e65f4 Cycles: Fix compilation error on windows after recent logging changes 2015-04-10 22:35:10 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00
Martijn Berger
f01456aaa4 Optionally use c++11 stuff instead of boost in cycles where possible. We do and continue to depend on boost though
Reviewers: dingto, sergey

Reviewed By: sergey

Subscribers: #cycles

Differential Revision: https://developer.blender.org/D1185
2015-03-29 22:12:40 +02:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Sergey Sharybin
7f406a53c7 Cycles: Cleanup for indentation in device_cpu.cpp
Perhaps became broken after rather recent change about which entry point
to kernel to use.
2015-02-19 19:05:04 +05:00
Sergey Sharybin
7ea7c2aab2 Cycles: Fix inconsistent command line used for runtime kernel compilation
Basically build-time compiled kernels were using --fast-math (which is correct)
but run-time compiled did not.
2015-02-02 15:00:21 +05:00
Sergey Sharybin
77e6f2212f Cycles: Allow paths customization via environment variables
This is for development and test environment setup only, not for
regular users usage hence no mentioning in the man page needed.
2015-02-02 02:02:10 +05:00
Sergey Sharybin
a922be9270 Cycles: Repot CPU and CUDA capabilities to system info operator
For CPU it gives available instructions set (SSE, AVX and so).

For GPU CUDA it reports most of the attribute values returned by
cuDeviceGetAttribute(). Ideally we need to only use set of those
which are driver-specific (so we don't clutter system info with
values which we can get from GPU specifications and be sure they
stay the same because driver can't affect on them).
2015-01-06 14:13:21 +05:00
Sergey Sharybin
1369bd562c Cycles: Fix compilation error on AVX platforms with -arch-native
Was a conflict in headers between clew and util_optimization.h.
2015-01-03 00:11:28 +05:00
Sergey Sharybin
9e2e408323 Cycles: Add logging to OSL and CUDA initialization/compilation
This is what was handy troubleshooting issues in the studio,
plus this is exactly the same thing which would be helpful
when solving issues with paths to compiled shaders and cubins
for standalone repository.
2015-01-01 01:31:08 +05:00
Sergey Sharybin
4497b6ac84 Cycles: Synchronize changes with standalone repository
This changes were done in original commit of the standalone Cycles repository
and needed here for easier patch synchronization.
2015-01-01 01:31:07 +05:00
Thomas Dinges
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
Campbell Barton
c07f6c02b3 Docs: reference the new manual 2014-12-08 11:18:58 +01:00
Thomas Dinges
e3a6f1c152 Cycles: Remove workaround for missing sm_52 kernel, now we require it for Maxwell cards. 2014-12-02 13:45:39 +01:00
Bastien Montagne
c14d34322b Fix typo breaking compilation with SSE2.
Spotted by sybrenstuvel (Sybren Stüvel), thanks!
2014-11-02 23:01:09 +01:00
Thomas Dinges
4ff8744669 Cycles / CUDA: Better fix for missing sm_52 kernel, in case user compiles himself. 2014-10-30 11:42:59 +01:00
Martijn Berger
4b33667b93 Deduplicate some code by using a function pointer to the real kernel
This has no performance impact what so ever and is already used in the adaptive sampling patch
2014-10-30 10:23:44 +01:00
Sergey Sharybin
e556670b36 Cycles: Do cuda pointer arithmetic in integers, don't use pointer arithmetic
This should hopefully fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765187
2014-10-14 17:54:41 +02:00
Jason Wilkins
8d084e8c8f Ghost Context Refactor
https://developer.blender.org/D643
Separates graphics context creation from window code in Ghost so that they can vary separately.
2014-10-07 15:47:32 -05:00
Sergey Sharybin
cd6129d1ff Cycles: Workaround dead-slow expf() on 64bit linux
Single precision exponent on 64bit linux tends to be order of magnitude slower
than double precision version even with single<->double precision conversion.

Some feedback in the mailing lists also suggests that logf() is also slow, but
this i didn't confirm here in the studio yet.

Depending on the shader setup it gives ~3% with the secret agent shot and up to
around 15% with the bmw scene here.
2014-10-06 12:36:46 +02:00