test2

Author	SHA1	Message	Date
Brecht Van Lommel	394a1373a0	Cycles: use OpenCL C 2.0 if available, to improve performance for AMD. Tested with AMD Radeon Pro WX 9100, where it brings performance back to 2.80 level, and combined with recent changes is about 2-15% faster than 2.80 in our benchmark scenes. This somehow appears to specifically address the issue where adding more shader nodes leads to slower runtime. I found no additional speedup by applying this change to 2.80 or removing the new shader node code. Ref T71479 Patch by Jeroen Bakker. Differential Revision: https://developer.blender.org/D6252	2020-03-24 20:09:36 +01:00
Ray Molenkamp	44c6b6615b	OpenCL: Bring back CYCLES_OPENCL_TEST override Back in 2.79 you could either use the debug panel or an environment variable to override using OpenCL for unsupported hardware, which was rather useful for developers when testing on NVidia just to be sure the CL kernels at least build properly. This broke in rB949ab753bb2. This diff restores testing through the CYCLES_OPENCL_TEST environment variable. Differential Revision: https://developer.blender.org/D7202 Reviewers: brecht	2020-03-21 11:55:45 -06:00
Dalai Felinto	2d1cce8331	Cleanup: `make format` after SortedIncludes change	2020-03-19 09:33:58 +01:00
Brecht Van Lommel	26bea849cf	Cleanup: add device_texture for images, distinct from other global memory There was too much image texture specific stuff in device_memory, and too much code duplication between devices.	2020-03-12 17:28:55 +01:00
Brecht Van Lommel	f01bc597a8	Cleanup: stop encoding image data type in slot index This is legacy code from when we had a fixed number of textures.	2020-03-11 17:07:17 +01:00
Stefan Werner	51e898324d	Adaptive Sampling for Cycles. This feature takes some inspiration from "RenderMan: An Advanced Path Tracing Architecture for Movie Rendering" and "A Hierarchical Automatic Stopping Condition for Monte Carlo Global Illumination" The basic principle is as follows: While samples are being added to a pixel, the adaptive sampler writes half of the samples to a separate buffer. This gives it two separate estimates of the same pixel, and by comparing their difference it estimates convergence. Once convergence drops below a given threshold, the pixel is considered done. When a pixel has not converged yet and needs more samples than the minimum, its immediate neighbors are also set to take more samples. This is done in order to more reliably detect sharp features such as caustics. A 3x3 box filter that is run periodically over the tile buffer is used for that purpose. After a tile has finished rendering, the values of all passes are scaled as if they were rendered with the full number of samples. This way, any code operating on these buffers, for example the denoiser, does not need to be changed for per-pixel sample counts. Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4686	2020-03-05 12:21:38 +01:00
Patrick Mours	af54bbd61c	Cycles: Rework tile scheduling for denoising This fixes denoising being delayed until after all rendering has finished. Instead, tile-based denoising is now part of the "RENDER" task again, so that it is all in one task and does not cause issues with dedicated task pools where tasks are serialized. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6940	2020-02-28 16:12:29 +01:00
Patrick Mours	153e001c74	Cleanup: Move common CUDA/OptiX Cycles device code into separate file This reduces code duplication between the CUDA and OptiX device implementations: The CUDA device class is now split into declaration and definition (similar to the OpenCL device) and the OptiX device class implements that and only overrides the functions it actually has to change, while using the CUDA implementation for everything else. Reviewed By: brecht Differential Revision: https://developer.blender.org/D6814	2020-02-12 13:11:32 +01:00
Patrick Mours	38589de10c	Cycles: Add support for denoising in the viewport The OptiX denoiser can be a great help when rendering in the viewport, since it is really fast and needs few samples to produce convincing results. This patch therefore adds support for using any Cycles denoiser in the viewport also (but only the OptiX one is selectable because the NLM one is too slow to be usable currently). It also adds support for denoising on a different device than rendering (so one can e.g. render with the CPU but denoise with OptiX). Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D6554	2020-02-11 18:03:43 +01:00
Jeroen Bakker	f5e37af5a8	Cycles/OpenCL: Remove NULL PTR Workaround In the current OpenCL implementation we have a work-around for platforms that didn't support NULL pointers. We used to replace all NULLs and empty arrays with a pointer to a single byte on the OpenCL Device. During investigation of {T65924} it was asked to remove this work-around for testing. This change improves the render times. SCENE \| BEFORE \| AFTER --------------------+--------+------- bmw27 \| 108 \| 89 barbershop_interior \| 867 \| 673 classroom \| 270 \| 173 fishy_cat \| 244 \| 196 koro \| 249 \| 207 pavillon_barcelona \| 582 \| 414 Note that this change does not fix T65924 it just improves the rendering performance for OpenCL. We haven't tested this patch on all platforms so we should keep an eye out on the tracker. Reviewed By: sergey Differential Revision: https://developer.blender.org/D6391	2019-12-11 11:59:21 +01:00
Jeroen Bakker	b9ed30c25c	Cycles: OpenCL Separate Compilation Debug Flag OpenCL Parallel compilation only works inside Blender. When using cycles in a different setup (standalone or other software) it failed compiling kernels as they don't have the appropriate Python API and command line arguments. This change introduces a `running_inside_blender` debug flag, that triggers out of process compilation of the kernels. Compilation still happens in a subthread, which enables the preview kernels and compilation of the kernels during BVH building. Reviewed By: brecht Differential Revision: https://developer.blender.org/D5439	2019-08-30 13:53:23 +02:00
Campbell Barton	2425401a59	Cleanup: spelling	2019-08-04 12:51:44 +10:00
Campbell Barton	cd6b49f995	Cleanup: spelling	2019-07-07 15:38:41 +10:00
Campbell Barton	c47d669f24	Cleanup: comments (long lines) in cycles	2019-05-01 21:41:07 +10:00
Campbell Barton	e12c08e8d1	ClangFormat: apply to source, most of intern Apply clang format as proposed in T53211. For details on usage and instructions for migrating branches without conflicts, see: https://wiki.blender.org/wiki/Tools/ClangFormat	2019-04-17 06:21:24 +02:00
Campbell Barton	5ef4b0438c	Cleanup: trailing space	2019-03-19 15:08:16 +11:00
Brecht Van Lommel	9873005ecd	Cleanup: simplify kernel features definition. No functional changes, logic here got too complex after many changes over the years.	2019-03-17 12:01:19 +01:00
Brecht Van Lommel	e17f7af0ce	Cleanup: remove Cycles advanced shading features toggle. It's effectively always enabled, only not on some unsupported OpenCL devices. For testing those it's not useful to disable these features. This is replaced by the more fine grained feature toggles that we have now.	2019-03-17 01:58:39 +01:00
Jeroen Bakker	2f6257fd7f	Cycles/OpenCL: Compile Kernels During Scene Update The main goal of this change is faster startup when using foreground rendering. This patch will build kernels in parallel to the update process of the scene. When these optimized kernels are not available (yet) an AO kernel will be used. These AO kernels are fast to compile (3-7 seconds) and can be reused by all scenes. When the final kernels become available we will switch to these kernels. In background mode the AO kernels will not be used. Some kernels are being used during Scene update (displace, background light). When these kernels are being used the process can halt until these become available. Reviewed By: brecht, #cycles Maniphest Tasks: T61752 Differential Revision: https://developer.blender.org/D4428	2019-03-15 16:18:21 +01:00
Jeroen Bakker	6237743111	Cycles/OpenCL: Added missing opencl programs The functions that determine the program name + filename of kernels were missing some base kernels like denoising and base. For completeness I added those kernels so the function returns the correct results.	2019-03-15 08:11:28 +01:00
Jeroen Bakker	298dabc79b	Cycles/OpenCL: Reduce How Often Kernel Recompilations Are Needed This patch will reduce the number of times that we need to recompile kernels. It does this by (en/dis)abling features by default, so that when the user needs them the kernels are already available. Other features are enabled by default for background and foreground rendering. When in background rendering the user wants the best render performance. When in foreground rendering the user wants the least amount of recompilations. Enabling volumetrics or subdivision evaluation will still trigger a recompilation during foreground rendering. Reviewed By: #cycles, brecht Differential Revision: https://developer.blender.org/D4485	2019-03-12 14:06:45 +01:00
Jeroen Bakker	02a7e875d7	Cycles OpenCL: Remove single program Part of the cleanup of the OpenCL codebase. Single program is not effective when using OpenCL, it is slower to compile and slower during rendering (when used in for example `barbershop` or `victor`). Reviewers: brecht, #cycles Maniphest Tasks: T62267 Differential Revision: https://developer.blender.org/D4481	2019-03-08 16:31:35 +01:00
Jeroen Bakker	76442e676e	Codestyle: comments	2019-03-08 08:56:16 +01:00
Brecht Van Lommel	564d252d60	Cleanup: remove unnecessary assert.	2019-02-26 20:01:20 +01:00
Ray Molenkamp	ff304d3665	Cycles: Fix build error introduced by rBdabe5cd31add8aa55b9ad4bce1b591ed4e98f1a1	2019-02-26 08:32:41 -07:00
Jeroen Bakker	dabe5cd31a	T61971: Compilation Displacement/Background Kernel Displacement and Background kernels are selectively used, but always compiled. This patch will not compile these kernels when they are not needed. Displacement kernel is only used for true displacement. Background kernel is only used when there is a (Cycles)Light of type `LIGHT_BACKGROUND`. Reviewed By: brecht, #cycles Tags: #cycles Maniphest Tasks: T61971 Differential Revision: https://developer.blender.org/D4412	2019-02-26 14:06:25 +01:00
Jeroen Bakker	e6099c7e46	T61576: Do Not (Re-)Compile OpenCL kernels The goal of this patch is to limit the number of times kernels need to be compiled and to reuse them, as kernels with different compile directives can lead to identical binaries. The implementation does this by stripping the compile directives and reshuffling kernels so the output is more likely to be the same. We focused on the kernels where it was easy to detect and maintain (bundle, bake, displace, do_volume and background). More optimizations could be done but they are probably less obvious. Merged the data_init and state_buffer_size kernels to split_bundle. This patch will also remove empty kernels for do_volume and bake when their features are not enabled. When using the benchmark files there are fewer background, bake and do_volume kernels compiled. Fix: T61576, T61501, T61466 Reviewed By: brecht, #cycles Differential Revision: https://developer.blender.org/D4390	2019-02-26 12:45:26 +01:00
Brecht Van Lommel	f1304c973f	Fix T61810: Cycles OpenCL denoising broken after recent changes.	2019-02-21 16:47:04 +01:00
Jeroen Bakker	6e9dca2214	Codestyle: Indentation	2019-02-21 08:52:04 +01:00
Brecht Van Lommel	fda79dbd79	Cleanup: fix compiler warning.	2019-02-20 16:39:12 +01:00
Jeroen Bakker	949ab753bb	Cycles OpenCL: Remove OpenCL MegaKernel Using OpenCL MegaKernel has been slow and therefore not useful. This patch will remove the mega kernel from the OpenCL codebase and the OpenCLDeviceBase class. T61736: removal of mega kernel T61703: baking does not work with mega kernel Tags: #cycles Differential Revision: https://developer.blender.org/D4383	2019-02-20 15:17:22 +01:00
Jeroen Bakker	667033e89e	T61463: Separate Baking kernels Cycles OpenCL: Split baking kernels in own program Fix T61463. Before this patch baking was part of the base kernels. There are 3 baking kernels and all 3 use shader evaluation. Only for one of these kernels the functionality was wrapped in the __NO_BAKING__ compile directive. When you start baking this leads to long compile times. Separating them into individual programs reduces the compile times. Also wrapped all baking kernels with __NO_BAKING__ to reduce the compilation times. Impact on compilation time job \| scene_name \| previous \| new \| percentage --------+-----------------+----------+-------+------------ T61463 \| empty \| 10.63 \| 7.27 \| 32% T61463 \| bmw \| 17.91 \| 14.24 \| 20% T61463 \| fishycat \| 19.57 \| 15.08 \| 23% T61463 \| barbershop \| 54.10 \| 48.18 \| 11% T61463 \| classroom \| 17.55 \| 14.42 \| 18% T61463 \| koro \| 18.92 \| 17.15 \| 9% T61463 \| pavillion \| 17.43 \| 14.23 \| 18% T61463 \| splash279 \| 16.48 \| 15.33 \| 7% T61463 \| volume_emission \| 36.22 \| 34.19 \| 6% Impact on render time job \| scene_name \| previous \| new \| percentage --------+-----------------+----------+---------+------------ T61463 \| empty \| 21.06 \| 20.54 \| 2% T61463 \| bmw \| 198.44 \| 189.59 \| 4% T61463 \| fishycat \| 394.20 \| 388.50 \| 1% T61463 \| barbershop \| 1188.16 \| 1185.49 \| 0% T61463 \| classroom \| 341.08 \| 339.27 \| 1% T61463 \| koro \| 472.43 \| 360.70 \| 24% T61463 \| pavillion \| 905.77 \| 902.14 \| 0% T61463 \| splash279 \| 55.26 \| 54.92 \| 1% T61463 \| volume_emission \| 62.59 \| 39.09 \| 38% I don't have a grounded explanation why koro and volume_emission are this much faster; I have done several tests though... Maniphest Tasks: T61463 Differential Revision: https://developer.blender.org/D4376	2019-02-19 16:34:55 +01:00
Brecht Van Lommel	8138eb0dfe	Fix Cycles OpenCL multithreaded compilation not working on Windows.	2019-02-19 13:48:56 +01:00
Brecht Van Lommel	9800837b98	Cycles: Support multithreaded compilation of kernels This patch implements a workaround to get the multithreaded compilation from D2231 working. So far, it only works for Blender, not for Cycles Standalone. Also, I have only tested the Linux codepath in the helper function. Depends on D2231. Patch by lukasstockner97, jbakker, brecht job \| scene_name \| compilation_time ----------+-----------------+------------------ Baseline \| empty \| 22.73 D2264 \| empty \| 13.94 Baseline \| bmw \| 56.44 D2264 \| bmw \| 41.32 Baseline \| fishycat \| 59.50 D2264 \| fishycat \| 45.19 Baseline \| barbershop \| 212.28 D2264 \| barbershop \| 169.81 Baseline \| victor \| 67.51 D2264 \| victor \| 53.60 Baseline \| classroom \| 51.46 D2264 \| classroom \| 39.02 Baseline \| koro \| 62.48 D2264 \| koro \| 49.03 Baseline \| pavillion \| 54.37 D2264 \| pavillion \| 38.82 Baseline \| splash279 \| 47.43 D2264 \| splash279 \| 37.94 Baseline \| volume_emission \| 145.22 D2264 \| volume_emission \| 121.10 This patch reduced compilation time as the split kernels and base kernels are compiled in parallel. In cycles debug mode (256) you can unmark the opencl single program file, which reduces the compilation time even further (bmw 17 seconds, barbershop 53 seconds). Reviewers: brecht, dingto, sergey, juicyfruit, lukasstockner97 Reviewed By: brecht Subscribers: Loner, jbakker, candreacchio, 3dLuver, LazyDodo, bliblubli Differential Revision: https://developer.blender.org/D2264	2019-02-15 08:56:20 +01:00
Lukas Stockner	fccf506ed7	Cycles: animation denoising support in the kernel. This is the internal implementation, not available from the API or interface yet. The algorithm takes into account past and future frames, both to get more coherent animation and reduce noise. Ref D3889.	2019-02-06 15:18:42 +01:00
Lukas Stockner	405cacd4cd	Cycles: prefilter feature passes separate from denoising. Prefiltering of feature passes will happen during rendering, which can then be used for denoising immediately or written as a render pass for later (animation) denoising. The number of denoising data passes written is reduced because of this, leaving out the feature variance passes. The passes are now Normal, Albedo, Depth, Shadowing, Variance and Intensity. Ref D3889.	2019-02-06 15:18:29 +01:00
Sergey Sharybin	bb0d812d98	Cycles: Disable OpenCL on macOS This is unfortunate, but the number of bugs in this configuration keeps growing, and almost all of them are caused by bugs in the OpenCL compiler. The compiler is not likely to be fixed, since Apple declared OpenCL deprecated. This evil commit is aimed to keep officially supported features of Blender in a good working and stable state.	2018-12-07 14:37:47 +01:00
Brecht Van Lommel	a8b8da5567	Fix T58183: crash with CPU + GPU rendering after profiling changes. Multi-device was not passing along profiler to the CPU.	2018-11-29 23:43:27 +01:00
Sergey Sharybin	203de0bbf0	Cycles: Cleanup, space after (void) It was used in like 95% of places.	2018-11-09 12:08:51 +01:00
Sergey Sharybin	cb4b5e12ab	Cycles: Cleanup, spacing after preprocessor It is supposed to be two spaces before comment stating which if else/endif statements corresponds to. Was mainly violated in the header guards.	2018-11-09 11:34:54 +01:00
Sergey Sharybin	e0cc3e9809	Cycles: Fix wrong BVH used when disabling AVX2 in debug settings Mainly useful for debugging. Previously, when AVX2 was disabled in the debug panel but BVH layout was kept on BVH8 nothing was rendered. Needed to make it so supported BVH layout mask for devices is queried in "dynamic", so it is possible to use DebugFlags there.	2018-10-31 11:46:52 +01:00
Lukas Stockner	7920ebd157	Cycles: Fix NLM denoising kernels zeroing the wrong buffer on OpenCL Since my temporary buffer commit (about a month ago), the OpenCL device was zeroing the wrong buffer, leading to completely wrong filtered feature passes and therefore significantly lower-quality results than CPU and CUDA.	2018-10-09 00:14:29 +02:00
Lukas Stockner	7aaeb06fb6	Cycles: Clean up extra minus in previous commit Forgot to add that change, sorry for the noise.	2018-10-08 22:22:05 +02:00
Lukas Stockner	15e9d80375	Cycles: Use existing shared temporary memory in reconstruction step of the denoiser Previously the code allocated its own temporary memory, but it's possible to just use the existing shared one instead.	2018-10-08 22:13:40 +02:00
Alex Fuller	107f1c0a2b	Fix Cycles half float pragma for strict OpenCL compilers (like ROCm). Differential Revision: https://developer.blender.org/D3669	2018-09-03 12:03:56 +02:00
Lukas Stockner	94efc651d4	Cycles Denoiser: Allocate a single temporary buffer for the entire denoising process With small tiles, the repeated allocations on GPUs can actually slow down the denoising quite a lot. Allocating the buffer just once reduces rendertime for the default cube with 16x16 tiles and denoising on a mobile 1050 from 22.7sec to 14.0sec.	2018-08-25 12:23:52 -07:00
Sergey Sharybin	658a9c6cf5	Cycles: Cleanup, style I wouldn't mind changing style to have space after keyword, but there was no official code style change proposed.	2018-08-24 14:36:18 +02:00
Stefan Werner	a9700e7ad2	Fix T56359: Uninitialized variable in Cycles OpenCL could cause crashes.	2018-08-14 22:51:53 +02:00
fclem	e20a0798dc	Cycles: Append compute units for RX Vega card names Makes it more clear whether compute device is Vega 56 or Vega 64.	2018-08-09 15:51:23 +02:00
Stefan Werner	df30b50f2f	Cycles: Enabled half precision textures for OpenCL devices that support the cl_khr_fp16 extension.	2018-07-06 11:42:34 +02:00
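
The adaptive sampling commit (51e898324d) describes comparing two estimates of each pixel and spreading "needs more samples" to immediate neighbors with a 3x3 box filter. A minimal sketch of that idea follows; it is not the Cycles implementation. The function names, the relative-error metric, and the list-of-lists buffers are assumptions made for illustration only.

```python
# Illustrative sketch only: names and the error metric are hypothetical,
# not taken from the Cycles source.

def pixel_error(sum_all, sum_half, n_samples):
    """Relative difference between the full estimate and the half-sample estimate.

    sum_all  : accumulated value of all n_samples samples
    sum_half : accumulated value of the half of the samples written to the
               separate buffer (n_samples is assumed to be even)
    """
    full = sum_all / n_samples
    half = sum_half / (n_samples / 2)
    # Assumed normalization; the small floor avoids division by zero.
    return abs(full - half) / max(full, 1e-4)


def converged_mask(sum_all, sum_half, n_samples, threshold):
    """Return a 2D list of booleans: True where a pixel is considered done.

    A pixel that has not converged also marks its immediate neighbors as
    not done (3x3 box filter), as the commit message describes.
    """
    height, width = len(sum_all), len(sum_all[0])
    done = [
        [pixel_error(sum_all[y][x], sum_half[y][x], n_samples) < threshold
         for x in range(width)]
        for y in range(height)
    ]
    result = [[True] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if done[y][x]:
                continue
            # Spread the "needs more samples" state over the 3x3 neighborhood.
            for ny in range(max(0, y - 1), min(height, y + 2)):
                for nx in range(max(0, x - 1), min(width, x + 2)):
                    result[ny][nx] = False
    return result


# Usage: a 5x5 tile after 4 samples, where only the center pixel is noisy.
all_buf = [[4.0] * 5 for _ in range(5)]
half_buf = [[2.0] * 5 for _ in range(5)]
half_buf[2][2] = 3.0
mask = converged_mask(all_buf, half_buf, n_samples=4, threshold=0.1)
```

In this example only the center pixel fails the threshold, but the box filter keeps its eight neighbors active as well, while the outer ring of the tile is treated as done.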


139 Commits