test2

Author	SHA1	Message	Date
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs which results in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Mathieu Menuet	83e80db56e	Fix T53349: AO bounces not working correctly with OpenCL.	2017-11-26 15:53:00 +01:00
Brecht Van Lommel	e50ed90e4d	Fix T53348: Cycles difference between gradient texture on CPU and GPU.	2017-11-23 17:14:04 +01:00
Brecht Van Lommel	d77f1d6538	Fix T53313: bevel shader with transmission render artifacts.	2017-11-22 01:59:21 +01:00
Stefan Werner	58a15b2bfe	Cycles: Fixed compilation of CUDA kernels. Follow-up fix for my last commit.	2017-11-21 10:43:40 +01:00
Mai Lavelle	d8f80fbe72	Cycles: Fix OSL brick node after recent fix	2017-11-21 04:30:12 -05:00
Stefan Werner	1febc85855	Cycles: Workaround for performance loss with the CUDA 9.0 SDK. CUDA 9.0.176 apparently caused some slow down on high-end Pascal cards that can be mitigated by increasing the number of registers. See https://developer.blender.org/F1142667 for a detailed comparison.	2017-11-21 10:29:11 +01:00
Mai Lavelle	9325b9bf15	Fix T53365: OpenCL has wrong shading of brick texture Looks like some weird compiler difference with signed vs unsigned ints.	2017-11-21 00:42:55 -05:00
Brecht Van Lommel	d089875c4c	Fix build with OSL 1.9.x, automatically aligns to 16 bytes now.	2017-11-20 23:24:24 +01:00
Sergey Sharybin	51e2844387	Cycles: Fix wrong behavior of sharpness in Cubic SSS Was giving a difference when using sharpness of 1.0 and 0.999 even though the results were expected to be really close to each other. This SSS profile will probably be removed in the future in favor of the more physically based Burley, but for the time being don't see anything wrong with fixing existing code.	2017-11-20 11:40:55 +01:00
Lukas Stockner	40f528a7da	Cycles: Add per-tile render time debug pass Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D2920	2017-11-17 16:40:24 +01:00
Lukas Stockner	a0c02e4d1b	Cycles: Add Volume Direct and Volume Indirect passes for volume-scattered light No color pass because it's hard to define what to use as color in a volume. Reviewers: sergey, brecht Differential Revision: https://developer.blender.org/D2903	2017-11-17 16:39:45 +01:00
Lukas Stockner	f78e963858	Cycles: Refactor PassType from bitflag to index in order to allow for more passes	2017-11-17 16:34:19 +01:00
Mai Lavelle	470b4cb62f	Cycles: Fix crash with split branched path tracing ShaderData memory was getting clobbered in the branched path code paths. Was caused by `087331c495`	2017-11-16 04:59:31 -05:00
Lukas Stockner	212a8d9e5a	Cycles: Make per-object random value output also work for Lamps	2017-11-14 04:17:54 +01:00
Lukas Stockner	d8066fb0f1	Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders	2017-11-14 04:17:54 +01:00
Brecht Van Lommel	a466d7ae24	Cycles: better distance sampling for chromatic volume extinction. Previously we picked one of the RGB channels with equal probability, but this works poorly in a dense volume after many bounces. Now we take into account the throughput and single scattering albedo. This makes it a little more practical to do brute force SSS with volumes, but is still very inefficient because we do direct light sampling at every volume bounce even when inside an opaque mesh. In theory there could be a light inside the mesh so we can't automatically disable direct lighting.	2017-11-10 01:37:10 +01:00
Brecht Van Lommel	21a535840d	Fix T53270: crash with multiscatter GGX after recent refactoring. In fact this was an existing issue when exceeding the number of available closure, but it's more common now that we set the number to 0 for shadows and emission	2017-11-09 20:28:00 +01:00
Mai Lavelle	087331c495	Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable Goal is to reduce OpenCL kernel recompilations. Currently viewport renders are still set to use 64 closures as this seems to be faster and we don't want to cause a performance regression there. Needs to be investigated. Reviewed By: brecht Differential Revision: https://developer.blender.org/D2775	2017-11-09 01:04:06 -05:00
Brecht Van Lommel	26f39e6359	Cycles: add bevel shader, for raytrace based rounded edges. The algorithm averages normals from nearby surfaces. It uses the same sampling strategy as BSSRDFs, casting rays along the normal and two orthogonal axes, and combining the samples with MIS. The main concern here is that we are introducing raytracing inside shader evaluation, which could be quite bad for GPU performance and stack memory usage. In practice it doesn't seem so bad though. Note that using this feature can easily slow down renders 20%, and that if you care about performance then it's better to use a bevel modifier. Mainly this is useful for baking, and for cases where the mesh topology makes it difficult for the bevel modifier to work well. Differential Revision: https://developer.blender.org/D2803	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	f79f386731	Code refactor: rename subsurface to local traversal, for reuse.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	d0af56fe3b	Cycles: antialias normal baking if the mesh has a bump map.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	e74b229342	Fix incorrect MIS weights in Cycles with multiple lights. This causes some difference in the classroom scene, where ray visibility tricks are used and break the MIS balance. Otherwise there doesn't seem to be much effect, but better to use the right formulas. Problem originally identified by Lukas.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	8a72be7697	Cycles: reduce closure memory usage for emission/shadow shader data. With a Titan Xp, reduces path trace local memory from 1092MB to 840MB. Benchmark performance was within 1% with both RX 480 and Titan Xp. Original patch was implemented by Sergey. Differential Revision: https://developer.blender.org/D2249	2017-11-05 20:48:33 +01:00
Brecht Van Lommel	c571be4e05	Code refactor: sum transparent and absorption weights outside closures.	2017-11-05 18:13:44 +01:00
Brecht Van Lommel	2c02a04c46	Code refactor: remove emission and background closures, sum directly.	2017-11-05 18:13:44 +01:00
Brecht Van Lommel	cac3d4d166	Cycles: fix inefficient attribute map storage, saves 615MB in victor scene.	2017-11-05 18:00:48 +01:00
Brecht Van Lommel	5801ef71e4	Code refactor: device memory cleanups, preparing for mapped host memory.	2017-11-05 15:22:04 +01:00
Sergey Sharybin	71f46bc367	Cycles: Add utility function to distinguish between scatter and absorption volume ID	2017-11-01 11:10:51 +01:00
Sergey Sharybin	5d7138c08a	Cycles: Cleanup, make it more obvious what preprocessor belongs to	2017-11-01 11:10:10 +01:00
Sergey Sharybin	7f45acee80	Cycles: Cleanup, delete trailing whitespace	2017-11-01 11:06:55 +01:00
Brecht Van Lommel	bbc7eb8ae5	Cycles: restore SOBOL_SKIP hack, for some cases where it helps still.	2017-10-29 16:44:20 +01:00
Brecht Van Lommel	171c4e982f	Cycles: use AO factor to let user adjust intensity of AO bounces. We are already using the AO distance, so might as well offer this extra control over the intensity. Useful when an interior scene is supposed to be significantly darker than the background shader.	2017-10-25 21:46:23 +02:00
Brecht Van Lommel	7ad9333fad	Code refactor: store device/interp/extension/type in each device_memory.	2017-10-24 01:03:59 +02:00
Brecht Van Lommel	d85a0a722e	Fix part of T53038: principled BSDF clearcoat weight has no effect with 0 roughness.	2017-10-18 23:35:54 +02:00
Campbell Barton	99520e3f92	Cleanup: use 'e' prefix for enum typedefs Convention was only followed loosely, apply to DNA where changes aren't likely to conflict. (Skipped ModifierType for eg).	2017-10-17 13:49:20 +11:00
Brecht Van Lommel	2e50add164	Fix OpenCL performance regression after cubic interpolation. Reorganize code to reduce register pressure.	2017-10-15 17:46:50 +02:00
Sergey Sharybin	8d73ba58b6	Cycles: Fix compilation of sm_20 and sm_21 kernels Was broken since the bicubic commit for GPU support.	2017-10-10 12:26:02 +05:00
Brecht Van Lommel	cdb0b3b1dc	Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading.	2017-10-08 13:17:33 +02:00
Brecht Van Lommel	f61c340bc1	Cycles: OpenCL bicubic and tricubic texture interpolation support.	2017-10-08 02:55:44 +02:00
Brecht Van Lommel	c040dedc12	Fix incorrect MIS with principled BSDF and specular roughness 0.	2017-10-07 22:10:02 +02:00
Brecht Van Lommel	d7eabc6765	Code cleanup: simplify cmake kernel install.	2017-10-07 15:32:20 +02:00
Brecht Van Lommel	2d92988f6b	Cycles: CUDA bicubic and tricubic texture interpolation support. While cubic interpolation is quite expensive on the CPU compared to linear interpolation, the difference on the GPU is quite small.	2017-10-07 15:30:57 +02:00
Brecht Van Lommel	23098cda99	Code refactor: make texture code more consistent between devices. * Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels//kernel__image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.	2017-10-07 14:53:14 +02:00
Sergey Sharybin	837383ac78	Cycles: Cleanup, indentation	2017-10-06 19:33:59 +05:00
Sergey Sharybin	a950af8e24	Fix T53012: Shadow catcher creates artifacts on contact area The issue was caused by light sample being evaluated to nan at some point. This is root of the cause which is to be fixed, but is very hard to trace down especially via ssh (the issue only happens on AVX2 release build). Will give it a closer look when back to my AVX2 machine. For until then this is a good check to have anyway, it corresponds to what's happening in regular radiance sum.	2017-10-06 17:27:34 +05:00
Sergey Sharybin	0d3c8d0701	Cycles: Cleanup, indentation and wrapping	2017-10-06 16:54:37 +05:00
Brecht Van Lommel	4537e85584	Fix T53001: more workarounds for crash in AMD compiler with recent drivers.	2017-10-05 17:57:58 +02:00
Brecht Van Lommel	fb99ea79f8	Code refactor: split displace/background into separate kernels, remove luma.	2017-10-05 17:57:58 +02:00
Brecht Van Lommel	6da6f8d33f	Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL. The work size is still very conservative, and this doesn't help for progressive refine. For that we will need to render multiple tiles at the same time. But this should already help for denoising renders that require too much memory with big tiles, and just generally soften the performance dropoff with small tiles. Differential Revision: https://developer.blender.org/D2856	2017-10-04 21:58:47 +02:00

1915 Commits