griefith/test

Author	SHA1	Message	Date
Brecht Van Lommel	7a6967cbe6	Fix mistake in previous fix for T53600, shows we really need a smarter solution.	2017-12-29 00:07:49 +01:00
Brecht Van Lommel	948515c21a	Fix T53600: Cycles shader mixing issue with principled BSDF and zero weights. SVM nodes need to read all data to get the right offset for the following node. This is quite weak, a more generic solution would be good in the future.	2017-12-25 23:59:20 +01:00
Lukas Stockner	bf1dc39679	Fix T53567: Negative pixel values causing artifacts with denoising. Now negative color values are clamped to zero before the actual denoising.	2017-12-21 14:24:23 +01:00
Sergey Sharybin	2e8914549b	Cycles: Fix difference in image Clip extension method between CPU and GPU. Our own implementation behaved differently from OSL and GPU: on the border pixels OSL and CUDA were interpolating with black, but we were clamping the coordinate. This partially fixes the issue reported in T53452. A similar change should perhaps also be done for 3D interpolation, but this is to be investigated separately.	2017-12-08 12:03:11 +01:00
Sergey Sharybin	f31fb4a014	Cycles: Cleanup, split 2D interpolation function	2017-12-08 11:22:04 +01:00
Lukas Stockner	fa3d50af95	Cycles: Improve denoising speed on GPUs with small tile sizes. Previously, the NLM kernels would be launched once per offset with one thread per pixel. However, with the smaller tile sizes that are now feasible, there wasn't enough work to fully occupy GPUs, which resulted in a significant slowdown. Therefore, the kernels are now launched in a single call that handles all offsets at once. This has two downsides: Memory accesses to accumulating buffers are now atomic, and more importantly, the temporary memory now has to be allocated for every shift at once, increasing the required memory. On the other hand, of course, the smaller tiles significantly reduce the size of the memory. The main bottleneck right now is the construction of the transformation - there is nothing to be parallelized there, one thread per pixel is the maximum. I tried to parallelize the SVD implementation by storing the matrix in shared memory and launching one block per pixel, but that wasn't really going anywhere. To make the new code somewhat readable, the handling of rectangular regions was cleaned up a bit and commented, it should be easier to understand what's going on now. Also, some variables have been renamed to make the difference between buffer width and stride more apparent, in addition to some general style cleanup.	2017-11-30 07:37:08 +01:00
Mathieu Menuet	83e80db56e	Fix T53349: AO bounces not working correctly with OpenCL.	2017-11-26 15:53:00 +01:00
Brecht Van Lommel	e50ed90e4d	Fix T53348: Cycles difference between gradient texture on CPU and GPU.	2017-11-23 17:14:04 +01:00
Brecht Van Lommel	d77f1d6538	Fix T53313: bevel shader with transmission render artifacts.	2017-11-22 01:59:21 +01:00
Stefan Werner	58a15b2bfe	Cycles: Fixed compilation of CUDA kernels. Follow-up fix for my last commit.	2017-11-21 10:43:40 +01:00
Mai Lavelle	d8f80fbe72	Cycles: Fix OSL brick node after recent fix	2017-11-21 04:30:12 -05:00
Stefan Werner	1febc85855	Cycles: Workaround for performance loss with the CUDA 9.0 SDK. CUDA 9.0.176 apparently caused some slow down on high-end Pascal cards that can be mitigated by increasing the number of registers. See https://developer.blender.org/F1142667 for a detailed comparison.	2017-11-21 10:29:11 +01:00
Mai Lavelle	9325b9bf15	Fix T53365: OpenCL has wrong shading of brick texture. Looks like some weird compiler difference with signed vs unsigned ints.	2017-11-21 00:42:55 -05:00
Brecht Van Lommel	d089875c4c	Fix build with OSL 1.9.x, automatically aligns to 16 bytes now.	2017-11-20 23:24:24 +01:00
Sergey Sharybin	51e2844387	Cycles: Fix wrong behavior of sharpness in Cubic SSS. Was giving a difference when using sharpness of 1.0 and 0.999 even though the results were expected to be really close to each other. This SSS profile will probably be removed in the future in favor of the more physically based Burley, but for the time being don't see anything wrong with fixing existing code.	2017-11-20 11:40:55 +01:00
Lukas Stockner	40f528a7da	Cycles: Add per-tile render time debug pass. Reviewers: sergey, brecht. Differential Revision: https://developer.blender.org/D2920	2017-11-17 16:40:24 +01:00
Lukas Stockner	a0c02e4d1b	Cycles: Add Volume Direct and Volume Indirect passes for volume-scattered light. No color pass because it's hard to define what to use as color in a volume. Reviewers: sergey, brecht. Differential Revision: https://developer.blender.org/D2903	2017-11-17 16:39:45 +01:00
Lukas Stockner	f78e963858	Cycles: Refactor PassType from bitflag to index in order to allow for more passes	2017-11-17 16:34:19 +01:00
Mai Lavelle	470b4cb62f	Cycles: Fix crash with split branched path tracing. ShaderData memory was getting clobbered in the branched path code paths. Was caused by `087331c495`.	2017-11-16 04:59:31 -05:00
Lukas Stockner	212a8d9e5a	Cycles: Make per-object random value output also work for Lamps	2017-11-14 04:17:54 +01:00
Lukas Stockner	d8066fb0f1	Cycles: Refactor closure roughness detection to fix a potential bug with Denoising of specular shaders	2017-11-14 04:17:54 +01:00
Brecht Van Lommel	a466d7ae24	Cycles: better distance sampling for chromatic volume extinction. Previously we picked one of the RGB channels with equal probability, but this works poorly in a dense volume after many bounces. Now we take into account the throughput and single scattering albedo. This makes it a little more practical to do brute force SSS with volumes, but is still very inefficient because we do direct light sampling at every volume bounce even when inside an opaque mesh. In theory there could be a light inside the mesh so we can't automatically disable direct lighting.	2017-11-10 01:37:10 +01:00
Brecht Van Lommel	21a535840d	Fix T53270: crash with multiscatter GGX after recent refactoring. In fact this was an existing issue when exceeding the number of available closures, but it's more common now that we set the number to 0 for shadows and emission.	2017-11-09 20:28:00 +01:00
Mai Lavelle	087331c495	Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable. Goal is to reduce OpenCL kernel recompilations. Currently viewport renders are still set to use 64 closures as this seems to be faster and we don't want to cause a performance regression there. Needs to be investigated. Reviewed By: brecht. Differential Revision: https://developer.blender.org/D2775	2017-11-09 01:04:06 -05:00
Brecht Van Lommel	26f39e6359	Cycles: add bevel shader, for raytrace based rounded edges. The algorithm averages normals from nearby surfaces. It uses the same sampling strategy as BSSRDFs, casting rays along the normal and two orthogonal axes, and combining the samples with MIS. The main concern here is that we are introducing raytracing inside shader evaluation, which could be quite bad for GPU performance and stack memory usage. In practice it doesn't seem so bad though. Note that using this feature can easily slow down renders 20%, and that if you care about performance then it's better to use a bevel modifier. Mainly this is useful for baking, and for cases where the mesh topology makes it difficult for the bevel modifier to work well. Differential Revision: https://developer.blender.org/D2803	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	f79f386731	Code refactor: rename subsurface to local traversal, for reuse.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	d0af56fe3b	Cycles: antialias normal baking if the mesh has a bump map.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	e74b229342	Fix incorrect MIS weights in Cycles with multiple lights. This causes some difference in the classroom scene, where ray visibility tricks are used and break the MIS balance. Otherwise there doesn't seem to be much effect, but better to use the right formulas. Problem originally identified by Lukas.	2017-11-07 22:35:12 +01:00
Brecht Van Lommel	8a72be7697	Cycles: reduce closure memory usage for emission/shadow shader data. With a Titan Xp, reduces path trace local memory from 1092MB to 840MB. Benchmark performance was within 1% with both RX 480 and Titan Xp. Original patch was implemented by Sergey. Differential Revision: https://developer.blender.org/D2249	2017-11-05 20:48:33 +01:00
Brecht Van Lommel	c571be4e05	Code refactor: sum transparent and absorption weights outside closures.	2017-11-05 18:13:44 +01:00
Brecht Van Lommel	2c02a04c46	Code refactor: remove emission and background closures, sum directly.	2017-11-05 18:13:44 +01:00
Brecht Van Lommel	cac3d4d166	Cycles: fix inefficient attribute map storage, saves 615MB in victor scene.	2017-11-05 18:00:48 +01:00
Brecht Van Lommel	5801ef71e4	Code refactor: device memory cleanups, preparing for mapped host memory.	2017-11-05 15:22:04 +01:00
Sergey Sharybin	71f46bc367	Cycles: Add utility function to distinguish between scatter and absorption volume ID	2017-11-01 11:10:51 +01:00
Sergey Sharybin	5d7138c08a	Cycles: Cleanup, make it more obvious what preprocessor belongs to	2017-11-01 11:10:10 +01:00
Sergey Sharybin	7f45acee80	Cycles: Cleanup, delete trailing whitespace	2017-11-01 11:06:55 +01:00
Brecht Van Lommel	bbc7eb8ae5	Cycles: restore SOBOL_SKIP hack, for some cases where it helps still.	2017-10-29 16:44:20 +01:00
Brecht Van Lommel	171c4e982f	Cycles: use AO factor to let user adjust intensity of AO bounces. We are already using the AO distance, so might as well offer this extra control over the intensity. Useful when an interior scene is supposed to be significantly darker than the background shader.	2017-10-25 21:46:23 +02:00
Brecht Van Lommel	7ad9333fad	Code refactor: store device/interp/extension/type in each device_memory.	2017-10-24 01:03:59 +02:00
Brecht Van Lommel	d85a0a722e	Fix part of T53038: principled BSDF clearcoat weight has no effect with 0 roughness.	2017-10-18 23:35:54 +02:00
Campbell Barton	99520e3f92	Cleanup: use 'e' prefix for enum typedefs. Convention was only followed loosely, apply to DNA where changes aren't likely to conflict. (Skipped ModifierType, for example.)	2017-10-17 13:49:20 +11:00
Brecht Van Lommel	2e50add164	Fix OpenCL performance regression after cubic interpolation. Reorganize code to reduce register pressure.	2017-10-15 17:46:50 +02:00
Sergey Sharybin	8d73ba58b6	Cycles: Fix compilation of sm_20 and sm_21 kernels. Was broken since the bicubic commit for GPU support.	2017-10-10 12:26:02 +05:00
Brecht Van Lommel	cdb0b3b1dc	Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading.	2017-10-08 13:17:33 +02:00
Brecht Van Lommel	f61c340bc1	Cycles: OpenCL bicubic and tricubic texture interpolation support.	2017-10-08 02:55:44 +02:00
Brecht Van Lommel	c040dedc12	Fix incorrect MIS with principled BSDF and specular roughness 0.	2017-10-07 22:10:02 +02:00
Brecht Van Lommel	d7eabc6765	Code cleanup: simplify cmake kernel install.	2017-10-07 15:32:20 +02:00
Brecht Van Lommel	2d92988f6b	Cycles: CUDA bicubic and tricubic texture interpolation support. While cubic interpolation is quite expensive on the CPU compared to linear interpolation, the difference on the GPU is quite small.	2017-10-07 15:30:57 +02:00
Brecht Van Lommel	23098cda99	Code refactor: make texture code more consistent between devices. * Use common TextureInfo struct for all devices, except CUDA fermi. * Move image sampling code to kernels//kernel__image.h files. * Use arrays for data textures on Fermi too, so device_vector<Struct> works.	2017-10-07 14:53:14 +02:00
Sergey Sharybin	837383ac78	Cycles: Cleanup, indentation	2017-10-06 19:33:59 +05:00

1920 Commits