test2

Author	SHA1	Message	Date
Sergey Sharybin	de3ee3c6e8	Cycles: Fix compilation error of CUDA kernel. Was caused by previous commit.	2018-09-28 15:02:44 +02:00
Sergey Sharybin	b030277e79	Cycles: Fix crash with BVH8 on certain scenes. The crash was caused by the BVH traversal stack overflowing. That overflow was caused by lots of false-positive intersections for rays originating at a non-finite location. Not sure why those rays exist in the first place; this is to be investigated separately. This commit moves the pre-SSE4.1 check to a higher-level function and enables it for all microarchitectures.	2018-09-28 13:57:50 +02:00
Sergey Sharybin	8f9a6b1bab	Cycles: Cleanup	2018-09-27 14:49:37 +02:00
Sergey Sharybin	e51f51d55d	Cycles: Cleanup, use explicit comparison with NULL	2018-08-31 12:28:12 +02:00
Sergey Sharybin	8ee76535da	Fix T56626: Cycles ambient occlusion only local: crash. Was caused by missing NULL pointer check in BVH8.	2018-08-31 12:14:36 +02:00
Sergey Sharybin	73f2056052	Cycles: Add BVH8 and packeted triangle intersection. This is an initial implementation of the BVH8 optimization structure and packeted triangle intersection. The aim is to get faster ray-to-scene intersection checks. Timings per scene (BVH4, BVH8): barbershop_interior: 10:24.94, 10:10.74; bmw27: 02:41.25, 02:38.83; classroom: 08:16.49, 07:56.15; fishy_cat: 04:24.56, 04:17.29; koro: 06:03.06, 06:01.45; pavillon_barcelona: 09:21.26, 09:02.98; victor: 23:39.65, 22:53.71. As memory goes, peak usage rises by about 4.7% in complex scenes. Note that BVH8 is disabled when using OSL; this is because the OSL kernel does not get per-microarchitecture optimizations and hence always considers BVH3 is used. Original BVH8 patch from Anton Gavrikov. Batched triangles intersection from Victoria Zhislina. Extra work and tests and fixes from Maxym Dmytrychenko.	2018-08-29 15:03:09 +02:00
Brecht Van Lommel	5261cd233c	Fix Cycles crash rendering mix of instanced and non-instanced volumes.	2018-08-05 12:05:10 +02:00
Lukas Stockner	799779d432	Cycles: change Ambient Occlusion shader to output colors. This means the shader can now be used for procedural texturing. New settings on the node are Samples, Inside, Local Only and Distance. Original patch by Lukas with further changes by Brecht. Differential Revision: https://developer.blender.org/D3479	2018-06-15 22:16:06 +02:00
Sergey Sharybin	16017178b2	Revert "Cycles: Cleanup: Don't use return on function returning void". Not sure why exactly it is called a cleanup; the code was much more clear and robust against possible missing return statements, which are MANDATORY. A missing return statement will: - Cause two different BVH traversals to be run. This is not happening currently, but if more BVH layouts are added, it will become a problem. - It is already causing assert() statements to fail, since functions are no longer returning when they are supposed to. If there is any measurable reason to keep this change, let me know. Otherwise just stick to reliable/tested/robust code. This reverts commit `ba65f7093b`.	2018-06-07 11:57:57 +02:00
Lukas Stockner	ba65f7093b	Cycles: Cleanup: Don't use return on function returning void	2018-06-04 00:07:17 +02:00
Brecht Van Lommel	b66efbecf4	Code refactor: make Transform always affine, dropping last row. This saves a little memory and copying in the kernel by storing only a 4x3 matrix instead of a 4x4 matrix. We already did this in a few places, and those don't need to be special exceptions anymore now.	2018-03-10 04:54:05 +01:00
Stefan Werner	f3010e98c3	Code refactor: use KernelShader and KernelParticle instead of float arrays. Original patch by Stefan with modifications by Brecht.	2018-03-10 04:54:04 +01:00
Sergey Sharybin	2f79d1c058	Cycles: Replace use_qbvh boolean flag with an enum-based property. This way we can introduce other types of BVH, for example, wider ones, without causing too much mess around boolean flags. Thoughts: - Ideally device info should probably return a bitflag of what BVH types it supports. It is possible to implement based on simple logic in device/ and mesh.cpp, rest of the changes will stay the same. - Not happy with workarounds in util_debug and the duplicated enum in kernel. Maybe the enum should be stored in kernel, but then it's kind of weird to include kernel types from utils. Sounds like some cyclic dependency. Reviewers: brecht, maxim_d33 Reviewed By: brecht Differential Revision: https://developer.blender.org/D3011	2018-01-22 17:19:20 +01:00
Brecht Van Lommel	f79f386731	Code refactor: rename subsurface to local traversal, for reuse.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	ce1f2e271d	Cycles: disable fast math flags, only use a subset. Empty BVH nodes are set to NaN which must be preserved all the way to the tnear <= tfar test which can then give false for empty nodes. This needs strict semantics and careful argument ordering for min() and max(), so the second argument is used if either of the arguments is NaN. Fixes T52635: crash in BVH traversal with SSE4.1. Differential Revision: https://developer.blender.org/D2828	2017-09-08 15:12:37 +02:00
Sergey Sharybin	b0bbb5f34f	Cycles: Cleanup, style	2017-09-05 12:43:02 +02:00
Brecht Van Lommel	76b74a93a8	Fix Cycles CUDA transparent shadow error after recent fix in `c22b52c`. Fishy cat benchmark was rendering with wrong shadows. Cause is unclear, adding printf or rearranging code seems to avoid this issue, possibly a compiler bug. This reverts the fix and solves the OSL bug elsewhere.	2017-08-24 03:43:02 +02:00
Brecht Van Lommel	c22b52cd36	Fix T52452: OSL trace broken after shadow catcher recent changes. We should only early out with any hit in BVH traversal if the only visibility bits used are opaque shadow. Not when opaque shadow is one of multiple bits.	2017-08-19 18:14:16 +02:00
Sergey Sharybin	95fe9b2617	Cycles: Cleanup, remove bvh prefix from curve functions. Those have nothing to do with BVH, and can be used separately.	2017-08-07 20:53:30 +02:00
Brecht Van Lommel	fc38276d74	Fix Cycles shadow catcher objects influencing each other. Since all the shadow catchers are already assumed to be in the footage, the shadows they cast on each other are already in the footage too. So don't just let shadow catchers skip self, but all shadow catchers. Another justification is that it should not matter if the shadow catcher is modeled as one object or multiple separate objects, the resulting render should be the same. Differential Revision: https://developer.blender.org/D2763	2017-08-07 17:54:26 +02:00
Sergey Sharybin	be17445714	Cycles: Cleanup, indentation	2017-03-29 15:41:56 +02:00
Sergey Sharybin	30bed91b78	Cycles: Fix compilation error with visibility flag disabled	2017-03-29 14:28:45 +02:00
Sergey Sharybin	0579eaae1f	Cycles: Make all #include statements relative to cycles source directory. The idea is to make include statements more explicit and obvious where the file is coming from, additionally reducing the chance of the wrong header being picked up. For example, it was not obvious whether bvh.h was referring to builder or traversal, whether node.h is a generic graph node or a shader node, and cases like that. Surely this might look obvious for the active developers, but after some time of not touching the code it becomes less obvious where a file is coming from. This was briefly mentioned in T50824 and it seems @brecht is fine with such explicitness, but we need to agree with all active developers before committing this. Please note that this patch is lacking changes related to GPU/OpenCL support. This will be solved if/when we all agree this is a good idea to move forward. Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner Reviewed By: lukasstockner97, maiself, nirved, dingto Subscribers: brecht Differential Revision: https://developer.blender.org/D2586	2017-03-29 13:41:11 +02:00
Sergey Sharybin	6ea54fe9ff	Cycles: Switch to reformulated Pluecker ray/triangle intersection. The intention of this commit is to address issues mentioned in the reports T43865, T50164 and T50452. The code is based on Embree code with some extra vectorization to speed up single ray to single triangle intersection. Unfortunately, such a fix is not coming for free. There is some slowdown for AVX2 processors, mainly due to different vectorization code, which caused a different number of instructions to be executed and different instructions-per-cycle counters. But on the other hand this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster. The performance is as follows (2.78c AVX2, 2.78c AVX, Patch AVX2, Patch AVX): BMW: 05:21.09, 06:05.34, 05:32.97 (+3.5%), 05:34.97 (-8.5%); Classroom: 16:55.36, 18:24.51, 17:10.41 (+1.4%), 17:15.87 (-6.3%); Fishy Cat: 08:08.49, 08:36.26, 08:09.19 (+0.2%), 08:12.25 (-4.7%); Koro: 11:22.54, 11:45.24, 11:13.25 (-1.5%), 11:43.81 (-0.3%); Barcelone: 14:18.32, 16:09.46, 14:15.20 (-0.4%), 14:25.15 (-10.8%). On GPU the performance is about 1.5-2% slower in my tests on GTX1080, but afraid we can't do much as a part of this change here and consider it a price to pay for a more proper intersection check. Made in collaboration with Maxym Dmytrychenko, big thanks to him! Reviewers: brecht, juicyfruit, lukasstockner97, dingto Differential Revision: https://developer.blender.org/D1574	2017-03-28 17:26:47 +02:00
Sergey Sharybin	d14e39622a	Cycles: First implementation of shadow catcher. It uses the idea of accumulating all possible light reachable across the light path (without taking shadow blocking into account) and accumulating the total shaded light across the path. Dividing the second figure by the first one seems to give a good estimate of the shadow. In fact, to my knowledge, it's something really similar to what is happening in the denoising branch, so we are aligned here, which is good. The workflow is as follows: - Create an object which matches the real-life object on which the shadow is to be caught. - Create an approximately similar material on that object. This is needed to make indirect light properly affect CG objects in the scene. - Mark the object as Shadow Catcher in the Object properties. Ideally, after doing that it will be possible to render the image and simply alpha-over it on top of real footage.	2017-03-27 10:46:03 +02:00
Sergey Sharybin	85a5fbf2ce	Cycles: Workaround incorrect SSS with CUDA toolkit 8.0.61	2017-03-24 10:08:18 +01:00
Sergey Sharybin	ba8c7d2ba1	Cycles: Use SSE-optimized version of triangle intersection for motion triangles. The title says it all actually. Gives up to 10% speedup on test scenes here on i7-6800K. Render times on GPU are unreliable here, but there might be some slowdown caused by watertight nature of intersections.	2017-03-23 17:58:03 +01:00
Sergey Sharybin	f8a999c965	Cycles: Move triangle intersection precalc to an util file. This is preparation work for the followup commit which will move remaining parts of Woop intersection logic to a utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	59fd21296a	Cycles: Cleanup, extra semicolon and space	2017-03-10 15:38:30 +01:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel. Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00
Sergey Sharybin	930186d3df	Cycles: Optimize sorting of transparent intersections on CUDA	2017-02-13 18:24:45 +01:00
Sergey Sharybin	21dbfb7828	Cycles: Fix wrong transparent shadows with CUDA. Was a bug in recent optimization commit.	2017-02-13 18:22:10 +01:00
Sergey Sharybin	04cf1538b5	Cycles: Fix compilation error on OpenCL	2017-02-08 14:00:48 +01:00
Sergey Sharybin	9830eeb44b	Cycles: Implement record-all transparent shadow function for GPU. The idea is to record all possible transparent intersections when shooting a transparent ray on GPU (similar to what we were doing on CPU already). This avoids the need to do whole ray-to-scene intersection queries for each intersection and speeds up cases like transparent hair a lot, at the cost of extra memory. This commit is base ground for now and this feature is kept disabled until some further tweaks.	2017-02-08 14:00:48 +01:00
Sergey Sharybin	bc096e1eb8	Cycles: Split ShaderData object and shader flags We started to run out of bits there, so now we separate flags which came from __object_flags and which are either runtime or coming from __shader_flags. Rule now is: SD_OBJECT_* flags are to be tested against new object_flags field of ShaderData, all the rest flags are to be tested against flags field of ShaderData. There should be no user-visible changes, and time difference should be minimal. In fact, from tests here can only see hardly measurable difference and sometimes the new code is somewhat faster (all within a noise floor, so hard to tell for sure). Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself Differential Revision: https://developer.blender.org/D2428	2017-01-23 12:56:55 +01:00
Sergey Sharybin	b9311b5e5a	Cycles: Make object flag names more obvious that they are object and not shader	2017-01-23 12:14:17 +01:00
Sergey Sharybin	1ad04c7d65	Cycles: Store time in BVH nodes. This way we can stop traversing a BVH node early on. Gives about 2-2.5x render time improvement with 3 BVH steps. Hopefully this gives no measurable performance loss for scenes with a single BVH step. Traversal is currently only implemented for QBVH, meaning old CPUs and GPU do not benefit from this change.	2017-01-20 12:46:18 +01:00
Sergey Sharybin	e5a665fe24	Cycles: Fix wrong transparent shadows for motion blur hair. This was a missing bit from `b53ce9a`.	2017-01-13 16:14:57 +01:00
Sergey Sharybin	b53ce9a1d0	Cycles: Prepare BVH traversal code to work with multiple curve primitives per node	2017-01-12 18:20:19 +01:00
Sergey Sharybin	f12f906dd9	Cycles: Correct assert() for cases when there are multiple curves per BVH node	2017-01-12 17:38:27 +01:00
Sergey Sharybin	53fa389802	Cycles: Use dedicated debug passes for traversed nodes and intersection tests. This way it's more clear whether some issue is caused by lots of geometry in the node or by lots of "transparent" BVH nodes.	2017-01-12 13:44:35 +01:00
Sergey Sharybin	8c761ff838	Cycles: Use new SSE version of offset calculation for all QBVH flavors. Gives up to ~1% speedup again. While it seems to be small, it is still nice since the code now is actually cleaner than it used to be before.	2016-10-25 15:27:50 +02:00
Sergey Sharybin	f7cf2f659a	Cycles: Move QBVH near/far offset calculation to a utility function. Just preparing for a new optimization to be used in all traversal implementations. Should be no measurable difference.	2016-10-25 15:08:33 +02:00
Sergey Sharybin	064caae7b2	Cycles: BVH-related SSE optimization. Several ideas here: - Optimize calculation of near_{x,y,z} in a way that does not require 3 if() statements per update, which avoids the negative effect of wrong branch prediction. - Optimization of direction clamping for BVH. - Optimization of point/direction transform. Brings ~1.5% speedup again depending on a scene (unfortunately, this speedup can't be summed across all previous commits because the speedup of each of the changes varies from scene to scene, but it still seems to be a nice solid speedup of a few percent on Linux, and a bigger speedup was reported on Windows). Once again, thanks Maxym for inspiration! Still TODO: We have multiple places where we need to calculate near x,y,z indices in BVH; for now it's only done for main BVH traversal. Will try to move this calculation to a utility function and see if that can be easily re-used across all the BVH flavors.	2016-10-25 14:47:34 +02:00
Brecht Van Lommel	a3abb020e3	Fix Cycles CUDA performance on CUDA 8.0. Mostly this is making inlining match CUDA 7.5 in a few performance critical places. The end result is that performance is now better than before, possibly due to less register spilling or other CUDA 8.0 compiler improvements. On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory usage is reduced a little too. Reviewed By: sergey Differential Revision: https://developer.blender.org/D2269	2016-10-03 22:15:25 +02:00
Sergey Sharybin	94c919349b	Cycles: Cleanup file headers. Some of the files were wrongly attributing code to some other organizations and in a few places proper attribution was missing. This is mainly either a copy-paste error (when a new file was created from an existing one and the header wasn't updated) or due to some refactor which split non-original-BF code from purely BF code. Should solve some confusion around.	2016-09-29 10:11:40 +02:00
Sergey Sharybin	a5f14ad1a2	Cycles: Make regular bvh traversal functions close to each other	2016-09-20 16:58:39 +02:00
Sergey Sharybin	a6db95cd42	Cycles: Re-group ifdef so we check for particular feature only once	2016-09-20 16:58:39 +02:00
Sergey Sharybin	386da0cc77	Cycles: Avoid conversion from bool to uint	2016-09-20 13:00:36 +02:00
Sergey Sharybin	5c6a14f4e5	Cycles: More tweaks to make specialized BVH traversal matching	2016-09-19 15:29:37 +02:00
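
Note on commit ce1f2e271d above: its message says empty BVH nodes are set to NaN and that min()/max() must return their second argument when either argument is NaN, so the NaN survives up to the tnear <= tfar test. The following is a minimal scalar sketch of that idea, not the Cycles code itself. The names Ray, Box, min_second_on_nan, max_second_on_nan, make_empty_box and ray_hits_box are hypothetical and chosen only for illustration. It assumes the compiler is not allowed to drop NaN handling (for example, no -ffast-math).

```cpp
#include <cassert>
#include <limits>

// Hypothetical helpers: return the SECOND argument whenever either
// argument is NaN, because any comparison involving NaN is false.
inline float min_second_on_nan(float a, float b) { return (a < b) ? a : b; }
inline float max_second_on_nan(float a, float b) { return (a > b) ? a : b; }

// Illustrative types, not the Cycles data structures.
struct Ray {
    float origin[3];
    float inv_dir[3];  // 1 / direction, per axis
    float tmin, tmax;  // valid parametric range of the ray
};

struct Box {
    float lo[3];
    float hi[3];
};

// An "empty" node in this sketch has all bounds set to NaN.
inline Box make_empty_box() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    return Box{{nan, nan, nan}, {nan, nan, nan}};
}

// Slab test. The value derived from the box is always passed as the
// second argument, so a NaN bound propagates into tnear and tfar.
inline bool ray_hits_box(const Ray& ray, const Box& box) {
    float tnear = ray.tmin;
    float tfar = ray.tmax;
    for (int axis = 0; axis < 3; ++axis) {
        const float t0 = (box.lo[axis] - ray.origin[axis]) * ray.inv_dir[axis];
        const float t1 = (box.hi[axis] - ray.origin[axis]) * ray.inv_dir[axis];
        tnear = max_second_on_nan(tnear, min_second_on_nan(t0, t1));
        tfar = min_second_on_nan(tfar, max_second_on_nan(t0, t1));
    }
    // With NaN bounds both values are NaN and the comparison is false,
    // so an empty node is never reported as hit.
    return tnear <= tfar;
}
```

Swapping the argument order would silently discard the NaN: in this sketch max_second_on_nan(NaN, 0.0f) returns 0.0f, which is why the message stresses careful argument ordering.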

67 Commits