682 Commits

Author SHA1 Message Date
Sergey Sharybin
da34136de1 Cycles: Check for validity of the tiles arrays in progressive refine
In certain configurations (for example when start resolution is set to a small
value for background render and progressive refine is enabled) the number of
tiles might change in the tile manager. This situation will confuse the
progressive refine feature and likely cause a crash.

We might also add some settings verification in the session constructor, but
having an assert with brief explanation about what's wrong should already be
much better than nothing.
2015-05-19 12:42:07 +05:00
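A minimal sketch of the kind of check da34136de1 describes. The `TileBuffers` holder and both helper functions are hypothetical names, not taken from the commit; the point is only that the tile count reported by the tile manager is compared against the allocated arrays before any tile is indexed, so a mismatch fails with a readable message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-tile buffers kept by progressive refine.
struct TileBuffers {
    std::vector<int> buffers;  // one entry per tile allocated up front
};

// True when the tile arrays still match what the tile manager reports.
inline bool tile_arrays_valid(const TileBuffers &tiles, std::size_t num_tiles_in_manager)
{
    return tiles.buffers.size() == num_tiles_in_manager;
}

// Returns the buffer for a tile, asserting first so that a mismatch stops
// with an explanation instead of an out-of-bounds access.
inline int &acquire_tile_buffer(TileBuffers &tiles,
                                std::size_t num_tiles_in_manager,
                                std::size_t tile_index)
{
    assert(tile_arrays_valid(tiles, num_tiles_in_manager) &&
           "Number of tiles changed after progressive refine buffers were allocated");
    assert(tile_index < tiles.buffers.size());
    return tiles.buffers[tile_index];
}
```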
Sergey Sharybin
f868be6295 Cycles: Check for whether update/write callbacks are set prior to calling them
This changes the progressive refine part, regular update was already checking
for whether callbacks are set.
2015-05-19 12:42:07 +05:00
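The guard described in f868be6295 could look roughly like the sketch below. `RenderCallbacks` and `notify_tile_update` are hypothetical names used for illustration; the commit does not say how the callbacks are stored, so `std::function` is an assumption.

```cpp
#include <functional>

// Hypothetical session-like object; the member names are illustrative only.
struct RenderCallbacks {
    std::function<void(int)> update_tile_cb;  // may be left unset
    std::function<void(int)> write_tile_cb;   // may be left unset
};

// Invokes the update callback only when one has been assigned.
// Returns true if the callback ran.
inline bool notify_tile_update(const RenderCallbacks &cb, int tile_index)
{
    if (!cb.update_tile_cb) {
        return false;  // nothing registered, skip the call
    }
    cb.update_tile_cb(tile_index);
    return true;
}
```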
Sergey Sharybin
0a60c7d8ee Cycles: Fix missing camera-in-volume update when using certain render layers configurations 2015-05-14 19:08:13 +05:00
Thomas Dinges
0e80eb82e0 Cycles: Resize light_data after possible light removal. 2015-05-14 01:13:40 +02:00
Thomas Dinges
67eb2c7897 Cycles: Remove Emission shaders from the graph if color or strength is 0. 2015-05-14 01:13:40 +02:00
Sergey Sharybin
f0f481031c Fix T44616: Cycles crashes loading 42k by 21k textures
Simple integer overflow issue.

TODO(sergey): Check on CPU cubic sampling, it might also need size_t.
2015-05-12 18:48:55 +05:00
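The overflow behind T44616 can be illustrated with a short sketch. The dimensions 42000 x 21000 and the 4 channels are assumed example values, the helper names are hypothetical, and the code assumes a platform with a 64-bit `size_t`. The pixel count alone (882,000,000) still fits in a 32-bit int, but the element count with 4 channels (3,528,000,000) does not.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Offset of a pixel's first channel, computed entirely in size_t so the
// intermediate products do not wrap the way 32-bit int arithmetic would.
inline std::size_t pixel_offset(std::size_t x, std::size_t y,
                                std::size_t width, std::size_t channels)
{
    return (y * width + x) * channels;
}

// True when the element count of an image no longer fits in a 32-bit int.
inline bool exceeds_int32(std::size_t width, std::size_t height, std::size_t channels)
{
    const std::uint64_t count = static_cast<std::uint64_t>(width) * height * channels;
    return count > static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}
```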
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related to the AMD megakernel split,
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else whom we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all features are supported yet: there is no motion blur, camera blur,
SSS or volumetrics for now. Transparent shadows are also disabled on AMD
devices because of a compiler bug.

This kernel also only implements regular path tracing; supporting the
branched one will take a while. Branched path tracing is still exposed in the
interface, which is a bit misleading, and it will be hidden there soon.

More features will be enabled once they're ported to the split kernel and
tested.

Neither the regular CPU nor the CUDA code path changes: they generate exactly
the same code as before, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Sergey Sharybin
f680c1b54a Cycles: Communicate number of closures and nodes feature set to the device
This way device can actually make a decision of how it can optimize the kernel
in order to make it most efficient.
2015-05-09 19:28:00 +05:00
Sergey Sharybin
6fc1669679 Cycles: Initial work towards selective nodes support compilation
The goal is to be able to compile the kernel with only the nodes which are
actually needed to render the current scene, hence improving performance of
the kernel.

The idea is:

- Have a few node groups, starting with a group which contains nodes that are
  used really often, and then a couple of groups which extend this one.

- Have feature-based node disabling, so it's possible to disable nodes related
  to features which are not used with the currently used node group.

This commit only lays down the routines needed for this approach; the actual
split will happen later, after gathering statistics from a bunch of production
scenes.
2015-05-09 19:22:16 +05:00
Sergey Sharybin
17c95d0a96 Cycles: Add utility function to count maximum number of closures used by session
This will be used by the split kernel in order to compile the most optimal kernel.

Maximum number of closures is actually being cached in the session, so viewport
rendering will not trigger kernel re-loading when number of closures goes down.
2015-05-09 19:17:49 +05:00
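The caching behaviour described in 17c95d0a96 can be sketched as a maximum that only ever grows. `ClosureCountCache` is a hypothetical name, and using the return value as a "kernel must be reloaded" signal is an assumption made for illustration.

```cpp
// Hypothetical cache of the largest closure count seen so far in a session.
struct ClosureCountCache {
    int max_closure = 0;

    // Records the closure count of the current scene state.
    // Returns true only when the cached maximum grew, i.e. when a kernel
    // compiled for the old maximum would no longer be sufficient.
    bool update(int current_closures)
    {
        if (current_closures <= max_closure) {
            return false;  // count went down or stayed the same: keep the kernel
        }
        max_closure = current_closures;
        return true;
    }
};
```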
Sergey Sharybin
5068f7dc01 Cycles: Add utility function to graph to query number of closures used in it
Currently unused but will be needed soon for the split kernel work.
2015-05-09 19:13:32 +05:00
Sergey Sharybin
b3299bace0 Cycles: Pass requested tile size to the device via device task
This is currently unused but crucial for things like calculating the amount of
device memory required to deal with the tasks.

Maybe not really the best place to store it, but consider it good enough for now.
2015-05-09 19:09:07 +05:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had the experimental flag passed to the device's
load_kernel(), which was all fine. But since we're going to have some extra
parameters passed there, it makes sense to wrap them into a single struct,
which will make it easier to pass stuff around.
2015-05-09 19:05:49 +05:00
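A sketch of the refactoring 0e4ddaadd4 describes, under stated assumptions: `RequestedFeatures`, its fields and the `Device` type below are hypothetical and only show the shape of the change, a single struct argument in place of a growing list of positional parameters.

```cpp
// Hypothetical bundle of capabilities requested from a device; the field
// names are illustrative and not taken from the commit.
struct RequestedFeatures {
    bool experimental = false;  // previously the only argument
    int max_closure = 0;        // example of an extra parameter added later
};

// Hypothetical device interface taking the whole bundle at once.
struct Device {
    RequestedFeatures loaded;

    bool load_kernel(const RequestedFeatures &requested)
    {
        loaded = requested;  // a real device would compile/load kernels here
        return true;
    }
};
```

Adding another requested capability then means adding a field, without touching every `load_kernel()` call site.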
Sergey Sharybin
7eac672e4f Cycles: Set default closure values to some of the nodes
Previously it was only set at compilation time, which is all fine but does
not let us check which closure the node corresponds to prior to the
compilation.
2015-05-09 19:04:09 +05:00
Thomas Dinges
900fc43bb4 Cleanup: Remove unused ray type flags.
They were added for completeness, but it seems we don't need them.
2015-05-08 12:10:26 +02:00
Sv. Lockal
7201f6d14c Cycles: Use curve approximation for blackbody instead of lookup table
Now we calculate the color in the range 800..12000 using the approximation a/t + bt + c for R and G and ((at + b)t + c)t + d for B.
The maximum absolute RGB error of the non-LUT function is less than 0.0001, which is enough to get the same 8 bit/channel color as OSL, with a noticeable performance difference.
However, there is a slight visible difference from the previous non-OSL implementation because of its lookup table interpolation and an off-by-one mistake.
The previous implementation gave black outside of the soft range (t > 12000); now it gives the same color as for 12000.

Also blackbody node without input connected is being converted to value input at shader compile time.

Reviewers: dingto, sergey

Reviewed By: dingto

Subscribers: nutel, brecht, juicyfruit

Differential Revision: https://developer.blender.org/D1280
2015-05-05 06:11:54 +00:00
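The two curve forms named in 7201f6d14c can be written down directly. The fitted coefficients are not given in the commit message, so the structs below take them as inputs and any values passed in are placeholders; all names are hypothetical.

```cpp
#include <algorithm>

// Coefficient holders; the real fitted values are not part of the commit
// message, so callers supply their own.
struct RationalCoeffs { float a, b, c; };     // for a/t + b*t + c
struct CubicCoeffs { float a, b, c, d; };     // for ((a*t + b)*t + c)*t + d

// Form used for the R and G channels.
inline float eval_rational(const RationalCoeffs &k, float t)
{
    return k.a / t + k.b * t + k.c;
}

// Form used for the B channel, evaluated with Horner's scheme.
inline float eval_cubic(const CubicCoeffs &k, float t)
{
    return ((k.a * t + k.b) * t + k.c) * t + k.d;
}

// Temperatures above the fitted range reuse the value at 12000.
inline float clamp_temperature(float t)
{
    return std::min(t, 12000.0f);
}
```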
Sergey Sharybin
e5f3193df3 Cycles: Fix wrong order in object flags calculations
Object flags depend on the bounding box, which is only available after
mesh synchronization.

This was broken since 7fd4c44, which happened quite close to the release
and oddly enough was not spotted by anyone. Render test is coming for this.

Was spotted by Thomas Dinges while working on another patch.
2015-04-30 01:09:48 +05:00
Lukas Stockner
f478c2cfbd Cycles: Added support for light portals
This patch adds support for light portals: objects that help sampling the
environment light, therefore improving convergence. Using them for other
lights in a unidirectional pathtracer is virtually useless.

The sampling is done with the area-preserving code already used for area lamps.
MIS is used both for combination of different portals and for combining portal-
and envmap-sampling.

The direction of portals is considered, they aren't used if the sampling point
is behind them.

Reviewers: sergey, dingto, #cycles

Reviewed By: dingto, #cycles

Subscribers: Lapineige, nutel, jtheninja, dsisco11, januz, vitorbalbio, candreacchio, TARDISMaker, lichtwerk, ace_dragon, marcog, mib2berlin, Tunge, lopataasdf, lordodin, sergey, dingto

Differential Revision: https://developer.blender.org/D1133
2015-04-28 01:30:16 +05:00
Sergey Sharybin
ae7d84dbc1 Cycles: Use native saturate function for CUDA
This is more of a workaround for the CUDA optimizer, which can't optimize
clamp(x, 0, 1) into a single instruction and uses 4 instructions instead.

Original patch by @lockal with own modification:

  Don't make changes outside of the kernel. They don't make any difference
  anyway and term saturate() has a bit different meaning outside of kernel.

This gives a speedup of around 2% in the Barcelona file, but in more complex
shader setups with lots of clamping math nodes the speedup could be much larger.

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D1224
2015-04-28 00:38:32 +05:00
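A sketch of the idea in ae7d84dbc1, assuming a helper named `kernel_saturate` (a hypothetical name). When compiled as CUDA device code it uses the `__saturatef` intrinsic; in plain C++ it falls back to an ordinary clamp so the same source builds on both sides.

```cpp
#include <algorithm>

// Clamps a value to [0, 1].
#ifdef __CUDA_ARCH__
// Device code: let the hardware saturate in a single instruction.
__device__ inline float kernel_saturate(float a)
{
    return __saturatef(a);
}
#else
// Host code: plain min/max fallback with the same result for finite inputs.
inline float kernel_saturate(float a)
{
    return std::min(std::max(a, 0.0f), 1.0f);
}
#endif
```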
Thomas Dinges
bc160d8a85 Cleanup: Code style. 2015-04-26 00:42:26 +02:00
Thomas Dinges
8dd055cd47 Cleanup: Update Lookup table comments. 2015-04-26 00:06:38 +02:00
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of the inefficient memory usage caused by the BVH
boundbox part being unused by leaf nodes but still being allocated for them.
The split saves 6 float4 values per leaf node for QBVH and 3 float4 values
per leaf node for regular BVH.

This translates into the following memory savings for 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
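The per-leaf figures in 828abaf11c translate into bytes as follows, assuming a float4 is four 32-bit floats (16 bytes). The function name and the leaf counts used with it are hypothetical; the commit does not state how many leaf nodes the test scene has.

```cpp
#include <cstddef>

// Assumes a float4 is four 32-bit floats.
constexpr std::size_t kFloat4Bytes = 4 * sizeof(float);

// Bytes saved when each leaf node drops `float4_saved_per_leaf` float4 values.
constexpr std::size_t leaf_savings_bytes(std::size_t num_leaves,
                                         std::size_t float4_saved_per_leaf)
{
    return num_leaves * float4_saved_per_leaf * kFloat4Bytes;
}
```

With these assumptions a QBVH leaf saves 96 bytes and a regular BVH leaf saves 48 bytes.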
Sergey Sharybin
cd44449578 Cycles: Synchronize images after building mesh BVH
This way memory overhead caused by the BVH building is not so visible and peak
memory usage will be reduced.

Implementing this idea is not so straightforward actually, because we need to
synchronize images used for true displacement before meshes. Detecting whether
an image is used for true displacement is not so straightforward either, so for
now all displacement types will synchronize the images used for them.

Such change brings memory usage from 4.1G to 4.0G with the 01_01_01_D scene
from gooseberry. With 01_01_01_G scene it's 7.6G vs. 6.8G (before and after
the patch).

Reviewers: campbellbarton, juicyfruit, brecht

Subscribers: eyecandy

Differential Revision: https://developer.blender.org/D1217
2015-04-20 17:29:51 +05:00
Dalai Felinto
394c5318c6 Bake-API: reduce memory footprint when baking more than one object (Fix T41092)
Combine all the highpoly pixel arrays into a single array with a lookup
object_id for each of the highpoly objects.

Note: This changes the Bake API, external engines should refer to the
bake_api.c for the latest API.

Many thanks to Sergey Sharybin for the complete review, suggested
changes and feedback. (you rock!)

Reviewers: sergey

Subscribers: pildanovak, marcclintdion, monio, metalliandy, brecht

Maniphest Tasks: T41092

Differential Revision: https://developer.blender.org/D772
2015-04-17 12:25:37 -03:00
Campbell Barton
d1f9fcaabc Cleanup: style 2015-04-13 22:08:51 +10:00
Sergey Sharybin
aac0df956f Cycles: Cleanup, make more clear what camera utility functions are private/public 2015-04-10 16:25:35 +05:00
Sergey Sharybin
e073562f80 Cycles: Make transform from viewplane a generic utility function 2015-04-10 15:53:14 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00
Sergey Sharybin
7ea4163e1e Cycles: Fix BVH counter on mesh updates 2015-04-09 22:23:59 +05:00
Sergey Sharybin
cca4405437 Cycles: Fix wrong render result in certain configuration of render layer's surface/hair
Some synchronization was missing in cases when only one of those settings
was disabled.

Also added a render test for such configurations now.
2015-04-09 21:22:48 +05:00
Sergey Sharybin
09a746b857 Cycles: Cleanup, typos 2015-04-08 01:15:38 +05:00
Sergey Sharybin
d0aae79505 Cycles: More instant feedback on progressive rendering for first sample
The main purpose of this change is to make the material preview appear more
quickly after shader tweaks.
2015-04-06 19:28:25 +05:00
Sergey Sharybin
b5f58c1ad9 Cycles: Experiment with making previews more interactive
There were two major problems with the interactivity of material previews:

- Beckmann tables were re-generated on every material tweak.
  This is because preview scene is not set to be persistent, so re-triggering
  the render leads to the full scene re-sync.

- Images could take rather noticeable time to load with OIIO from the disk
  on every tweak.

This patch addresses these two issues in the following way:

- Beckmann tables are now static on CPU memory.

  They're only a couple of hundred kilobytes, so I wouldn't expect this to be
  an issue. And they're needed for almost every render anyway.

  This actually also makes the blackbody table static, but it's even smaller
  than the Beckmann table.

  Not totally happy with this approach, but the others seem to complicate
  things quite a bit with all this render engine lifetime and so on.

- For preview rendering all images are considered to be built-in. This means
  that instead of going through OIIO, which re-loads images on every re-render,
  they come from the ImBuf cache, which is fully manageable from the Blender
  side, and unused images get freed later.

  This would make it impossible to have mipmapping with OSL for now, but we'll
  be working on that later anyway and I don't think mipmaps are really so
  crucial for the material preview.

  This seems to be a better alternative to making the preview scene persistent,
  because of much better memory control from the Blender side.

Reviewers: brecht, juicyfruit, campbellbarton, dingto

Subscribers: eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D1132
2015-04-06 19:22:17 +05:00
Sergey Sharybin
3639a70eae Fix T44222: Crash using pointiness attribute for volume shaders
This attribute is not really supported for volumes, so it gets converted to
constant 0 at shader compile time.

TODO: We should consider doing the same for tangent attribute in order to save
some annoying checks at tracing time.
2015-04-06 14:11:28 +05:00
Martijn Berger
f01456aaa4 Optionally use C++11 features instead of Boost in Cycles where possible. We still depend on Boost, though
Reviewers: dingto, sergey

Reviewed By: sergey

Subscribers: #cycles

Differential Revision: https://developer.blender.org/D1185
2015-03-29 22:12:40 +02:00
Sergey Sharybin
e1bcc2d779 Cycles: Code cleanup, sky model
For as long as code stays in official folders it should follow
our code style.
2015-03-28 00:28:37 +05:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing,
especially when you work on both the Cycles and Blender sides.

Shouldn't cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
3d305b5a37 Cycles: Code cleanup, make strict flags happy about disabled OSL 2015-03-27 19:10:36 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Sergey Sharybin
948bc66a00 Cycles: Improve readability of dumped graphs 2015-03-17 21:15:17 +05:00
Sergey Sharybin
a43d00d51e Cycles: Fix displacement code creating cyclic dependencies in graph
The Bump result was passed to the set_normal node, and then set_normal was
connected to all unconnected Normal inputs, including the one from the
original Bump node, causing cycles.
2015-03-17 19:39:09 +05:00
Sergey Sharybin
60df4d10ff Fix T43999: MIS for environment broken after multi-threading commit
Typo in task start row calculation.
2015-03-16 13:31:27 +05:00
Thomas Dinges
cdb47b9dfc Cycles: Make Background MIS building threaded
Use multiple threads for building the MIS table if the
resolution is higher than 512.
Also replace the division by cdf_total with a multiplication by its inverse,
cdf_total_inv. This gives a further speedup.

On my Macbook (8 CPU threads) this improves the time to build the table:
Resolution 4096: From 0.16s to 0.03s
Resolution 8096: From 0.61s to 0.11s

This especially helps to reduce the scene update time, when tweaking world
shader while viewport rendering is running.

Patch by Sergey and myself.

Differential Revision: https://developer.blender.org/D1159
2015-03-12 13:50:11 +01:00
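The division-to-multiplication change in cdb47b9dfc can be sketched as below. The names `cdf_total` and `cdf_total_inv` come from the commit message; the function itself and its container type are assumptions made for illustration. Note that multiplying by a precomputed reciprocal may differ from a true division in the last bit for some inputs.

```cpp
#include <vector>

// Normalizes a running sum in place by multiplying with the reciprocal of the
// total, so the division happens once instead of once per element.
inline void normalize_cdf(std::vector<float> &cdf)
{
    if (cdf.empty() || cdf.back() == 0.0f) {
        return;  // nothing to normalize
    }
    const float cdf_total = cdf.back();
    const float cdf_total_inv = 1.0f / cdf_total;
    for (float &value : cdf) {
        value *= cdf_total_inv;
    }
}
```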
Sergey Sharybin
888d810185 Cycles: Use lower progressive update timeout for preview rendering
This way previews are refreshed at the same rate as the job expects,
giving more instant feedback on the changes.
2015-02-21 17:30:29 +05:00
Sergey Sharybin
a97bc1bedf Fix T43755: Wireframe attribute doesn't work with displace
This attribute missed derivatives calculation.

Not totally sure what the proper approach for algebraic derivative
calculation is, so calculating them by definition. This isn't the fastest
way to do it in this case and could be replaced with some smarter magic
in the wireframe calculation loop.

At least the currently implemented approach is better than nothing.
2015-02-21 17:30:29 +05:00
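Computing a derivative "by definition", as a97bc1bedf puts it, is read here as a finite difference: evaluate the function at an offset position and subtract the value at the original position. This reading is an assumption, and `finite_difference` is a hypothetical helper, not code from the commit.

```cpp
#include <functional>

// Approximates the change of f across one step dx as f(x + dx) - f(x),
// i.e. the difference that the definition of a derivative is built on.
// Costs one extra evaluation of f per derivative.
inline float finite_difference(const std::function<float(float)> &f, float x, float dx)
{
    return f(x + dx) - f(x);
}
```

The extra evaluation per derivative is why the message calls this approach correct but not the fastest.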
Thomas Dinges
7bd4c78a1a Cleanup: Put all Bump dx/dy code in the beginning here, same as with other nodes. 2015-02-21 12:55:19 +01:00
Sergey Sharybin
de4dcda545 Fix T43651: New pointiness attribute doesn't work with displacement
Simple fix: just make pointiness aware of bump offset.
2015-02-20 17:20:24 +05:00
Sergey Sharybin
0f652501c7 Cycles: Reduce memory used by background light update
Simple change: just get rid of intermediate data a bit earlier, before
the final pixels array is allocated. This gives memory savings of around 30%
during light update (this is about 60 MB in the frank sheep file
I'm using here).

This isn't really visible to artists a lot, because the actual spike happens
on BVH construction. But it doesn't mean we shouldn't be accurate with
memory usage in other areas.
2015-02-19 18:18:04 +05:00
Thomas Dinges
fa9311c9a4 Cleanup: Update comments and make it more clear what volume interpolation is for. 2015-02-16 22:11:41 +01:00