griefith/test

Author	SHA1	Message	Date
Brecht Van Lommel	3407ed5f9b	Cleanup: change internal Cycles compact BVH default to match UI	2022-07-18 15:34:13 +02:00
Brecht Van Lommel	5152c7c152	Cycles: refactor rays to have start and end distance, fix precision issues For transparency, volume and light intersection rays, adjust these distances rather than the ray start position. This way we increment the start distance by the smallest possible float increment to avoid self intersections, and can be sure it works, as the distance compared against will be exactly the same as before, due to the ray start position and direction remaining the same. Fix T98764, T96537, hair ray tracing precision issues. Differential Revision: https://developer.blender.org/D15455	2022-07-15 18:46:24 +02:00
Brecht Van Lommel	bb376da6df	Fix Cycles MetalRT error after recent specialization changes	2022-07-15 18:28:13 +02:00
Brecht Van Lommel	011d3c75a7	Cleanup: compiler warning	2022-07-15 15:20:53 +02:00
Brecht Van Lommel	523bbf7065	Cycles: generalize shader sorting / locality heuristic to all GPU devices This was added for Metal, but also gives good results with CUDA and OptiX. Also enable it for future Apple GPUs instead of only M1 and M2, since this has been shown to help across multiple GPUs so the better bet seems to enable rather than disable it. Also moves some of the logic outside of the Metal device code, and always enables the code in the kernel since other devices don't do dynamic compile. Time per sample with OptiX + RTX A6000: new old barbershop_interior 0.0730s 0.0727s bmw27 0.0047s 0.0053s classroom 0.0428s 0.0464s fishy_cat 0.0102s 0.0108s junkshop 0.0366s 0.0395s koro 0.0567s 0.0578s monster 0.0206s 0.0223s pabellon 0.0158s 0.0174s sponza 0.0088s 0.0100s spring 0.1267s 0.1280s victor 0.0524s 0.0531s wdas_cloud 0.0817s 0.0816s Ref D15331, T87836	2022-07-15 13:42:47 +02:00
Michael Jones	da4ef05e4d	Cycles: Apple Silicon optimization to specialize intersection kernels The Metal backend now compiles and caches a second set of kernels which are optimized for scene contents, enabled for Apple Silicon. The implementation supports doing this both for intersection and shading kernels. However this is currently only enabled for intersection kernels that are quick to compile, and already give a good speedup. Enabling this for shading kernels would be faster still, however this also causes long wait times and would need a good user interface to control this. M1 Max samples per minute (macOS 13.0): PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE barbershop_interior 83.4 89.5 93.7 bmw27 1486.1 1671.0 1825.8 classroom 175.2 196.8 206.3 fishy_cat 674.2 704.3 719.3 junkshop 205.4 212.0 257.7 koro 310.1 336.1 342.8 monster 376.7 418.6 424.1 pabellon 273.5 325.4 339.8 sponza 830.6 929.6 1142.4 victor 86.7 96.4 96.3 wdas_cloud 111.8 112.7 183.1 Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones Differential Revision: https://developer.blender.org/D14645	2022-07-15 13:40:04 +02:00
Michael Jones	5653c5fcdd	Cycles: keep track of SVM nodes used in kernels To be used for specialization in Metal, to automatically leave out unused nodes from the kernel. Ref D14645	2022-07-15 13:40:04 +02:00
Brecht Van Lommel	79da7f2a8f	Cycles: refactor to move part of KernelData definition to template header To be used for specialization on Metal in a following commit, turning these members into compile time constants. Ref D14645	2022-07-15 13:40:04 +02:00
Damien Picard	2e70d5cb98	Render: camera depth of field support for armature bone targets This is useful when using an armature as a camera rig, to avoid creating and targeting an empty object. Differential Revision: https://developer.blender.org/D7012	2022-07-15 13:40:04 +02:00
Brecht Van Lommel	b8ffd43bd2	Cleanup: make format	2022-07-15 13:40:04 +02:00
Olivier Maury	1b5db02a02	Fix Cycles MNEE wrong results with area light spread When the solve is successful, the light sample needs to be updated since the effective shading point is now on the last refractive interface. Spread was not taken into account, creating false caustics. Differential Revision: https://developer.blender.org/D15449	2022-07-14 16:36:38 +02:00
Brecht Van Lommel	28c3739a9b	Cleanup: replace state flow macros in the kernel with functions	2022-07-14 16:36:38 +02:00
Brecht Van Lommel	5539fb3121	Cycles: add presets to the Performance panel With choices Default, Lower Memory and Faster Render. For convenience, and to help communicate what the various settings do. Differential Revision: https://developer.blender.org/D15446	2022-07-14 16:36:38 +02:00
Michael Jones	4b1d315017	Cycles: Improve cache usage on Apple GPUs by chunking active indices This patch partitions the active indices into chunks prior to sorting by material in order to tradeoff some material coherence for better locality. On Apple Silicon GPUs (particularly higher end M1-family GPUs), we observe overall render time speedups of up to 15%. The partitioning is implemented by repeating the range of `shader_sort_key` for each partition, and encoding a "locator" key which distributes the indices into sorted chunks. Reviewed By: brecht Differential Revision: https://developer.blender.org/D15331	2022-07-14 14:26:18 +01:00
Campbell Barton	2d04012e57	Cleanup: spelling in comments Also remove duplicate comments in bmesh_log.h, caused by automated comment relocation in [0]. [0]: `c4e041da23`	2022-07-14 22:02:52 +10:00
Xavier Hallade	5f09440d5a	Cycles: Make not-compact BVH the default for embree Measurements showed on average a 1.08x speedup for a 1.04x increase in memory usage, which is an acceptable trade-off for a default setting, although discoverability of such settings influencing memory usage could be improved. Reviewed By: brecht Differential Revision: https://developer.blender.org/D15429	2022-07-12 18:40:14 +02:00
Xavier Hallade	47dd42485e	Cycles: fix and enable JIT oneAPI CentOS7 builds for drivers 23570+ The current specific CentOS7 workaround we have for AoT, which is to disable __FAST_MATH__ by using -fhonor-nans, now also fixes the compilation issue for JIT as well since at least driver 23570.	2022-07-12 15:55:32 +02:00
Brecht Van Lommel	6e426259b4	Fix T99218: light group add button should be disabled when name is empty Previously it was inactive but still clickable. Ref D15316	2022-07-11 14:02:38 +02:00
Brecht Van Lommel	8159e0a666	Curves: use consistent default radius for Cycles, Eevee, Set Curve Radius node To avoid Cycles not showing any hair by default, and to avoid very slow render due to many overlaps with the previous 1 meter default in the node. Fixes T97584, T99319 Differential Revision: https://developer.blender.org/D15405	2022-07-08 16:21:32 +02:00
Xavier Hallade	0f50ae131f	Cycles: enable oneAPI in Linux release builds with a very high min-driver version requirement, placeholder until JIT CentOS runtime compilation issue gets fixed in a defined version. min-driver version check can be worked around by setting CYCLES_ONEAPI_ALL_DEVICES environment variable.	2022-07-08 15:39:13 +02:00
Xavier Hallade	190ad73590	Cycles oneAPI: Remove direct dependency on Level-Zero We used it only to access device id for explicitly allowing Arc GPUs. It made the backend require ze_loader.dll which could be problematic if we end up using direct linking. I've replaced filtering based on PCI device id by using other HW properties instead (EUs, threads per EU), that are now available through Level-Zero.	2022-07-06 18:55:38 +02:00
Xavier Hallade	debb233787	Cleanup: fix comments in oneAPI kernel.cpp	2022-07-06 18:55:38 +02:00
Nikita Sirgienko	0df574b55e	Cycles: Improve occupancy for Intel GPUs Initially, the oneAPI implementation waited after each memory operation, even if there was no need for this. Now the implementation waits only if it is really necessary, which has improved performance noticeably for some scenes and slightly for the rest of them.	2022-07-06 17:26:23 +02:00
Xavier Hallade	41c10ac84a	Cycles: fix support for multiple Intel GPUs Identical Intel GPUs ended up with the same id. Added PCI BDF to the id to make it unique.	2022-07-01 11:20:00 +02:00
Xavier Hallade	0554537c3c	Cleanup: add missing license headers in Cycles oneAPI implementation	2022-07-01 10:13:07 +02:00
Brecht Van Lommel	fbcc00d10d	Fix broken Cycles performance benchmark after recent logging changes Ensure full render report is printed with default verbosity.	2022-06-30 19:51:50 +02:00
Andrii Symkin	f00d9e80ae	Cycles: add more math functions for float4 Add more math functions for float4 to make them on par with float3 ones. It makes it possible to change the types of float3 variables to float4 without additional work. Differential Revision: https://developer.blender.org/D15318	2022-06-30 16:25:21 +02:00
Campbell Barton	feeb8310c8	Cleanup: format	2022-06-30 12:14:23 +10:00
Campbell Barton	b6c28002ac	Cleanup: spelling in comments	2022-06-30 12:14:22 +10:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarily focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries, except for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00
Brecht Van Lommel	c257443192	Fix Cycles assert with mix weights outside of 0..1 range This could result in wrong skipping of SVM nodes in the graph. Now make the logic consistent with the clamping in the OSL implementation and constant folding. Thanks to Christophe Hery for finding the problem and providing the fix.	2022-06-28 19:13:57 +02:00
Sayak Biswas	abfa09752f	Cycles: enable Vega GPU/APU support Enables Vega and Vega II GPUs as well as Vega APU, using changes in HIP code to support 64-bit waves and a new HIP SDK version. Tested with Radeon WX9100, Radeon VII GPUs and Ryzen 7 PRO 5850U with Radeon Graphics APU. Ref T96740, T91571 Differential Revision: https://developer.blender.org/D15242	2022-06-28 18:35:43 +02:00
Brecht Van Lommel	9b6e86ace1	Cycles: stop Metal rendering on command buffer error If there is an error we should stop rendering, instead of finishing with a wrong render result or reporting a wrong benchmark time. Ref T96519 Differential Revision: https://developer.blender.org/D15287	2022-06-24 16:51:56 +02:00
Brecht Van Lommel	a5ff46e0fc	Cleanup: make format	2022-06-23 19:28:39 +02:00
Xavier Hallade	633c2f07da	Cycles: switch primitive.h inline hints to forceinline This change helps decrease Intel GPU binaries compile time by 5-10 minutes without impacting other backends. Reviewed By: sergey, brecht Differential Revision: http://developer.blender.org/D15273	2022-06-23 18:36:48 +02:00
Andrii Symkin	c2a2f3553a	Cycles: unify math functions names This patch unifies the names of math functions for different data types and uses overloading instead. The goal is to make it possible to swap out all the float3 variables containing RGB data with something else, with as few as possible changes to the code. It's a requirement for future spectral rendering patches. Differential Revision: https://developer.blender.org/D15276	2022-06-23 15:02:53 +02:00
Michael Jones	d8e9647ae2	Cycles: Add diagnostic tracing of MTLLibrary compilation time Reviewed By: sergey Differential Revision: https://developer.blender.org/D15268	2022-06-23 10:06:20 +01:00
Michael Jones	532b33973b	Cycles: Tidy of KernelData patchup code Reviewed By: sergey Differential Revision: https://developer.blender.org/D15267	2022-06-22 22:38:00 +01:00
Michael Jones	328a911379	Cycles: Distinguish Apple GPUs by core count This patch suffixes Apple GPU device names with `(GPU - # cores)` so that variant GPUs with the same chipset can be distinguished. Currently benchmark scores for these M1 family GPUs are being incorrectly merged: - M1: 7 or 8 cores - M1 Pro: 14 or 16 cores - M1 Max: 24 or 32 cores - M1 Ultra: 48 or 64 cores Reviewed By: brecht, sergey Differential Revision: https://developer.blender.org/D15257	2022-06-22 22:32:56 +01:00
Brecht Van Lommel	ff1883307f	Cleanup: renaming and consistency for kernel data * Rename "texture" to "data array". This has not used textures for a long time, there are just global memory arrays now. (On old CUDA GPUs there was a cache for textures but not global memory, so we used to put all data in textures.) * For CUDA and HIP, put globals in KernelParams struct like other devices. * Drop __ prefix for data array names, no possibility for naming conflict now that these are in a struct.	2022-06-20 12:30:48 +02:00
Brecht Van Lommel	2c1bffa286	Cleanup: add verbose logging category names instead of numbers And use them more consistently than before.	2022-06-17 14:08:14 +02:00
Brecht Van Lommel	24246d9870	Cleanup: replace uint4 by AttributeMap struct	2022-06-17 14:08:14 +02:00
Michael Jones	19e0b60f3e	Cycles: MetalDeviceQueue - capture of multiple dispatches, and some tidying This patch adds a new mode of gpu capture (env var `CYCLES_DEBUG_METAL_CAPTURE_SAMPLES`) to capture a block of dispatches between "reset" calls. It also fixes member data naming inconsistencies and adds some missing OS version checks. Screenshot showing .gputrace capture in Xcode 14.0 beta (using `CYCLES_DEBUG_METAL_CAPTURE_SAMPLES="1"` and `CYCLES_DEBUG_METAL_CAPTURE_LIMIT="10"`): {F13155703} Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15179	2022-06-13 13:42:07 +01:00
Sergey Sharybin	0fddff027e	Cleanup: Unused but set variable in Cycles Metal profiler	2022-06-09 10:20:26 +02:00
Aaron Carlisle	a632260828	Cleanup: Removed unused variable	2022-06-08 22:28:46 -04:00
Michael Jones	4412e14708	Cycles: Useful Metal backend debug & profiling functionality This patch adds some useful debugging & profiling env vars to the Metal backend: - `CYCLES_METAL_PROFILING`: output a per-kernel timing report at the end of the render - `CYCLES_METAL_DEBUG`: enable per-dispatch tracing (very verbose) - `CYCLES_DEBUG_METAL_CAPTURE_KERNEL`: enable programatic .gputrace capture for a specified kernel index Here's an example of the timing report with `CYCLES_METAL_PROFILING` enabled: ``` --------------------------------------------------------------------------------------------------- Kernel name Total threads Dispatches Avg. T/D Time Time% --------------------------------------------------------------------------------------------------- integrator_init_from_camera 657,407,232 161 4,083,274 0.24s 0.51% integrator_intersect_closest 1,629,288,440 681 2,392,494 15.18s 32.12% integrator_intersect_shadow 751,652,291 470 1,599,260 5.80s 12.28% integrator_shade_background 304,612,074 263 1,158,220 1.16s 2.45% integrator_shade_surface 1,159,764,041 676 1,715,627 20.57s 43.52% integrator_shade_shadow 598,885,847 418 1,432,741 1.27s 2.69% integrator_queued_paths_array 2,969,650,130 805 3,689,006 0.35s 0.74% integrator_queued_shadow_paths_array 593,936,619 379 1,567,115 0.14s 0.29% integrator_terminated_paths_array 22,205,417 155 143,260 0.05s 0.10% integrator_sorted_paths_array 2,517,140,043 676 3,723,579 1.65s 3.50% integrator_compact_paths_array 648,912,748 155 4,186,533 0.03s 0.07% integrator_compact_states 20,872,687 155 134,662 0.14s 0.29% integrator_terminated_shadow_paths_array 374,100,675 438 854,111 0.16s 0.33% integrator_compact_shadow_paths_array 503,768,657 438 1,150,156 0.05s 0.10% integrator_compact_shadow_states 37,664,941 202 186,460 0.23s 0.50% integrator_reset 25,165,824 6 4,194,304 0.06s 0.12% film_convert_combined_half_rgba 3,110,400 6 518,400 0.00s 0.01% prefix_sum 676 676 1 0.19s 0.40% 
--------------------------------------------------------------------------------------------------- 6,760 47.27s 100.00% --------------------------------------------------------------------------------------------------- ``` Reviewed By: brecht Differential Revision: https://developer.blender.org/D15044	2022-06-07 11:08:39 +01:00
Campbell Barton	263371dc4e	Cleanup: spelling in comments, additional white space	2022-06-07 15:01:03 +10:00
Brecht Van Lommel	da45c12bef	Merge branch 'blender-v3.2-release'	2022-06-03 19:02:46 +02:00
Patrick Mours	34f94a02f3	Fix use of OpenGL interop breaking in Hydra viewports that do not support it Rendering directly to a resource using OpenGL interop and Hgi doesn't work in Houdini, since it never uses the resulting resource (it does not call `HdRenderBuffer::GetResource`). But since doing that simultaneously disables mapping (`HdRenderBuffer::Map` is not implemented then), nothing was displayed. To fix this, keep track of whether a Hydra viewport does support displaying a Hgi resource directly, by checking whether `HdRenderBuffer::GetResource` is ever called and only enable use of OpenGL interop if that is the case. Differential Revision: https://developer.blender.org/D15090	2022-06-03 18:56:30 +02:00
Dalai Felinto	e7156be86e	Merge remote-tracking branch 'origin/blender-v3.2-release'	2022-06-03 16:13:51 +02:00

7166 Commits