griefith/test

Author	SHA1	Message	Date
Michael Jones	482fb791ce	Fix #105100 : Metal using wrong kernels in multi-pass renders This fixes issue [#105100](https://projects.blender.org/blender/blender/issues/105100) where multi-pass renders can be incorrect due to kernels using stale specialisation constants (e.g. when rendering Pokedstudio). This patch adds a new group of md5 hashes (`global_defines_md5`) to track whether the injected block of #defines is stale and regenerate the source string as appropriate. It also renames the existing group of md5 hashes from `source_md5` to `kernels_md5` to clarify that these refer to a specific kernel set rather than just the source (which might build an arbitrarily large number of kernel sets). Pull Request #105103	2023-02-23 11:07:28 +01:00
Brecht Van Lommel	6583acb880	Fix Cycles MetalRT access of macOS 11 features when unavailable After recent changes in `2d994de`. Pull Request #104976	2023-02-21 12:03:21 +01:00
Brecht Van Lommel	6a0b1eae8c	Fix #104097 : re-enable Cycles AMD Vega support The internal compiler error appears to be gone. Unclear why it appeared in the first place and why it's gone now. Just random kernel code changes causing it. Pull Request #104719	2023-02-13 22:53:08 +01:00
Campbell Barton	91346755ce	Cleanup: use '#' prefix for issues instead of 'T' Match the convention from Gitea instead of Phabricator's T for tasks.	2023-02-12 14:56:05 +11:00
Michael Jones (Apple)	01480229b1	Cycles: Fix MetalRT checkbox not hooked up to device on AMD (Follow on from D17043) On AMD Navi2 devices the MetalRT checkbox was not hooked up properly and had no effect. This patch fixes it. Co-authored-by: Michael Jones <michael_p_jones@apple.com> Pull Request #104520	2023-02-10 10:55:39 +01:00
Lucas Tadeu	a1282ab015	Fix Cycles debug build error after host fallback changes Introduced in dcfb6df9ce6. Co-authored-by: Lucas Tadeu Teixeira <lucas@lucastadeu.com> Pull Request #104454	2023-02-08 19:27:40 +01:00
Campbell Barton	a99022e22d	Cleanup: spelling in comments	2023-02-07 14:17:01 +11:00
Nikita Sirgienko	6dcfb6df9c	Cycles: Abstract host memory fallback for GPU devices Host memory fallback in CUDA and HIP devices is almost identical. We remove duplicated code and create a shared generic version that other devices (oneAPI) will be able to use. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17173	2023-02-06 22:19:32 +01:00
Michael Jones	2d994de77c	Cycles: MetalRT optimisation for subsurface intersection queries This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering. Patch authored by Marco Giordano. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17153	2023-02-06 19:12:29 +00:00
Patrick Mours	f2538c7173	Fix T104335: MNEE + OptiX OSL results in illegal address error The OptiX pipeline created for OSL was missing sufficient continuation stack to handle the MNEE ray generation program.	2023-02-06 15:06:52 +01:00
Michael Jones	654e1e901b	Cycles: Use local atomics for faster shader sorting (enabled on Metal) This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high". Reviewed By: brecht Differential Revision: https://developer.blender.org/D16909	2023-02-06 11:18:26 +00:00
Michael Jones	be0912a402	Cycles: Prevent use of both AMD and Intel Metal devices at same time This patch removes the option to select both AMD and Intel GPUs on systems that have both. Currently both devices will be selected by default, which results in crashes and other poorly understood behaviour. This patch adds precedence for using any discrete AMD GPU over an integrated Intel one. This can be overridden with CYCLES_METAL_FORCE_INTEL. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17166	2023-02-06 11:13:33 +00:00
Michael Jones	0a3df611e7	Fix T103393: Cycles: Undefine __LIGHT_TREE__ on Metal/AMD to fix perf This patch fixes T103393 by undefining `__LIGHT_TREE__` on Metal/AMD as it has an unexpected & major impact on performance even when light trees are not in use. Patch authored by Prakash Kamliya. Reviewed By: brecht Maniphest Tasks: T103393 Differential Revision: https://developer.blender.org/D17167	2023-02-06 11:12:34 +00:00
Campbell Barton	266d8de687	Cleanup: spelling in comments	2023-02-03 12:41:01 +11:00
Xavier Hallade	8afcecdf1f	Cycles: update Intel Graphics compiler to 101.4032 on Windows A noticeable (>5%) performance regression in oneAPI backend came with `a501a2dbff`. Updating to latest graphics compiler from driver 101.4032 fixes it. I've tested it with current min-supported drivers and it runs well but since compatibility of graphics compiler with older drivers isn't guaranteed, I'm also bumping the min-supported driver versions. If end-users consider latest drivers too fresh to switch to (version isn't released as stable on Linux as of today but should be before Blender 3.5 release), CYCLES_ONEAPI_ALL_DEVICES=1 env variable can be used. Intel Graphics Compiler on Linux will be updated in a later commit so we can then close D16984. Reviewed By: sergey, LazyDodo	2023-01-23 19:36:34 +01:00
Brecht Van Lommel	8e56ded86d	Cycles: temporarily disable AMD Vega GPU rendering due to compiler bug To make daily builds pass while we figure this out. Ref T104097	2023-01-23 17:30:12 +01:00
Brecht Van Lommel	fe552bf236	Cleanup: make format	2023-01-19 22:48:05 +01:00
Michael Jones	e270a198a5	Cycles: Markup to disable specialisation of kernel data fields (Metal) This patch adds markup to specify that certain kernel data constants should not be specialised. Currently it is used for `tabulated_sobol_sequence_size` and `sobol_index_mask` which change frequently based on the aa sample count, trash the shader cache, and have little bearing on performance. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16968	2023-01-19 17:57:42 +00:00
Michael Jones	08b3426df9	Cycles: Occupancy tuning for new higher end M2 machines This patch adds occupancy tuning for the newly announced high-end M2 machines, giving 10-15% render speedup over a pre-tuned build. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17037	2023-01-19 17:56:40 +00:00
Brecht Van Lommel	a84a8a528d	Cycles: remove SSE3 and AVX kernel optimization levels While keeping SSE2, SSE4.1 and AVX2. This does not affect hardware support, it only slightly reduces performance for some older CPUs. To reduce maintenance cost and improve compile times. Differential Revision: https://developer.blender.org/D16978	2023-01-16 17:53:36 +01:00
Campbell Barton	63c985e0f7	Cleanup: format	2023-01-09 18:56:54 +11:00
Campbell Barton	14fc02f91d	Cleanup: spelling in comments	2023-01-06 14:00:36 +11:00
Brecht Van Lommel	87f7b630b5	Cleanup: make format	2023-01-05 19:43:19 +01:00
Michael Jones	a7cc6e015c	Cycles: Additional Metal kernel specialisation exposed through UI This patch adds a new "Kernel Optimization Level" dropdown menu to control Metal kernel specialisation. Currently this defaults to "full" optimisation, on the assumption that the changes proposed in D16371 will address usability concerns around app responsiveness and shader cache housekeeping. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16514	2023-01-04 23:36:52 +00:00
Chris Blackbourn	496d736adc	Cleanup: format	2023-01-05 11:21:51 +13:00
Jacques Lucke	2540a52f91	Cleanup: quiet unused parameter warning	2023-01-04 17:30:55 +01:00
Michael Jones	77c3e67d3d	Cycles: Improved render start/stop responsiveness on Metal All kernel specialisation is now performed in the background regardless of kernel type, meaning that the first render will be visible a few seconds sooner. The only exception is during benchmark warm up, in which case we wait for all kernels to be cached. When stopping a render, we call a new `cancel()` method on the device which causes any outstanding compilation work to be cancelled, and we destroy the device in a detached thread so that any stale queued compilations can be safely purged without blocking the UI for longer than necessary. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16371	2023-01-04 16:00:53 +00:00
Nikita Sirgienko	858fffc2df	Cycles: oneAPI: add support for SYCL host task This functionality is related only to debugging of SYCL implementation via single-threaded CPU execution and is disabled by default. Host device has been deprecated in SYCL 2020 spec and we removed it in `305b92e05f`. Since this is still very useful for debugging, we're restoring a similar functionality here through SYCL 2020 Host Task.	2023-01-03 20:47:24 +01:00
Campbell Barton	2ac6e26c25	Cleanup: cmake formatting	2022-12-17 13:33:27 +11:00
Patrick Mours	a8530d31c2	Fix T103258: Deleting a shader with OptiX OSL results in an illegal address error Materials without connections to the output node would crash with OSL in OptiX, since the Cycles `OSLCompiler` generates an empty shader group reference for them, which resulted in the OptiX device implementation setting an empty SBT entry for the corresponding direct callables, which then crashed when calling those direct callables was attempted in `osl_eval_nodes`. This fixes that by setting the SBT entries for empty shader groups to a dummy direct callable that does nothing.	2022-12-16 15:41:21 +01:00
Patrick Mours	c9eb583460	Fix T103257: Enabling or disabling viewport denoising while using OptiX OSL results in an error Switching viewport denoising causes kernels to be reloaded with a new feature mask, which would destroy the existing OptiX pipelines. But OSL kernels were not reloaded as well, leaving the shading pipeline uninitialized and therefore causing an error when it is later attempted to execute it. This fixes that by ensuring OSL kernels are always reloaded when the normal kernels are too.	2022-12-16 14:04:03 +01:00
Hallam Roberts	a501a2dbff	Images: add mirror extension type This adds a new mirror image extension type for shaders and geometry nodes (next to the existing repeat, extend and clip options). See D16432 for a more detailed explanation of `wrap_mirror`. This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`. It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT` flag must be set for the `MIRROR` flag to have an effect. Differential Revision: https://developer.blender.org/D16432	2022-12-14 19:27:29 +01:00
Thomas Dinges	54aec4629e	Cleanup: Remove unused code in Cycles * preempt_attr was copied from CUDA, but not used in HIP. * Remove shadowed variable before conditional in EnvironmentTextureNode code. Differential Revision: https://developer.blender.org/D16741	2022-12-12 18:15:41 +01:00
Michael Jones	2dc51fccb8	Fix T101787, T102786. Cycles: Improved out-of-memory messaging on Metal This patch adds a new `max_working_set_exceeded()` check on Metal so that we can display a "System is out of GPU memory" message to the user. Without this, we get obtuse "CommandBuffer failed" errors at render time due to exceeding the size limit of resident resources. Likely fix for T101787 & T102786. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16713	2022-12-07 13:56:21 +00:00
Weizhen Huang	ee89f213de	Cycles: improve many lights sampling using light tree Uses a light tree to more effectively sample scenes with many lights. This can significantly reduce noise, at the cost of a somewhat longer render time per sample. Light tree sampling is enabled by default. It can be disabled in the Sampling > Lights panel. Scenes using light clamping or ray visibility tricks may render differently as these are biased techniques that depend on the sampling strategy. The implementation is currently disabled on AMD HIP. This is planned to be fixed before the release. Implementation by Jeffrey Liu, Weizhen Huang, Alaska and Brecht Van Lommel. Ref T77889	2022-12-05 16:09:03 +01:00
Brecht Van Lommel	009f7de619	Cleanup: use better matching integer types for graphics interop handle Ref D16042	2022-12-01 15:55:48 +01:00
Nikita Sirgienko	f07b09da27	Cycles: Improve oneAPI backend support for non-Intel platforms	2022-11-25 17:46:59 +01:00
Nikita Sirgienko	412642865d	Cleanup: Resolve a warning for the ambiguity on the parenthesis in oneAPI code No functional changes.	2022-11-24 18:05:02 +01:00
Patrick Mours	a859837cde	Cleanup: Move OptiX denoiser code from device into denoiser class Cycles already treats denoising fairly separate in its code, with a dedicated `Denoiser` base class used to describe denoising behavior. That class has been fully implemented for OIDN (`denoiser_oidn.cpp`), but for OptiX was mostly empty (`denoiser_optix.cpp`) and denoising was instead implemented in the OptiX device. That meant denoising code was split over various files and directories, making it a bit awkward to work with. This patch moves the OptiX denoising implementation into the existing `OptiXDenoiser` class, so that everything is in one place. There are no functional changes, code has been mostly moved as-is. To retain support for potential other denoiser implementations based on a GPU device in the future, the `DeviceDenoiser` base class was kept and slightly extended (and its file renamed to `denoiser_gpu.cpp` to follow similar naming rules as `path_trace_work_*.cpp`). Differential Revision: https://developer.blender.org/D16502	2022-11-15 15:50:01 +01:00
Michael Jones	b0e2e45496	Cycles: Enable MetalRT pointclouds & other fixes Code authored by Marco Giordano. This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs: - Incorrect kernel hashing - Missing specialisation constants - Incorrect visibility filtering - Missing null pointer check Reviewed By: brecht Differential Revision: https://developer.blender.org/D16499	2022-11-14 16:39:18 +00:00
Michael Jones	2c596319a4	Cycles: Cache only up to 5 kernels of each type on Metal This patch adapts D14754 for the Metal backend. Kernels of the same type are already organised into subdirectories which simplifies type matching. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16469	2022-11-11 18:10:29 +00:00
Patrick Mours	e6b38deb9d	Cycles: Add basic support for using OSL with OptiX This patch generalizes the OSL support in Cycles to include GPU device types and adds an implementation for that in the OptiX device. There are some caveats still, including simplified texturing due to lack of OIIO on the GPU and a few missing OSL intrinsics. Note that this is incomplete and missing an update to the OSL library before being enabled! The implementation is already committed now to simplify further development. Maniphest Tasks: T101222 Differential Revision: https://developer.blender.org/D15902	2022-11-09 15:30:21 +01:00
Chris Blackbourn	4b57bc4e5d	Cleanup: format	2022-11-09 08:30:18 +13:00
Brecht Van Lommel	b539d425f0	Merge branch 'blender-v3.4-release'	2022-11-08 19:47:55 +01:00
Gon Solo	c306ccb67f	Fix Cycles error with runtime compilation when there is no path to OptiX SDK If no OPTIX_ROOT is set, nvcc fails to compile because there is a stray "-I" in the arguments. Detect if the include path is empty and act accordingly. Differential Revision: https://developer.blender.org/D16308	2022-11-08 19:40:57 +01:00
Michael Jones	74140d41b1	Cycles: Apple GPU threadgroup tuning This patch tunes maximum threads-per-threadgroup and threads-per-block for faster renders on Apple GPUs. Appropriate tuning is selected based on the GPU architecture (M1 or M2). We see a benchmark uplift of around 5-10% on M1 family chips. Similar uplift is expected on M2 with upcoming OS changes. (Ref T101931) Reviewed By: brecht Maniphest Tasks: T101931 Differential Revision: https://developer.blender.org/D16299	2022-11-07 10:00:46 +00:00
Campbell Barton	6377d00a61	Cleanup: cmake comment line length	2022-11-03 12:11:08 +11:00
Xavier Hallade	454dd3f7f0	Cycles: fix up logic in oneAPI devices filtering CYCLES_ONEAPI_ALL_DEVICES environment variable wasn't working as intended after `305b92e05f`.	2022-10-27 23:09:14 +02:00
Michael Jones	8dd7b5b26b	Cycles: Metal integrator state size tuning This patch tunes the integrator state sizing for Metal (`num_concurrent_states` and `num_concurrent_busy_states`). On all GPU architectures, we adjust the busy:total states ratio to be 1:4 which gives better rendering performance than the previous 1:16 ratio (independent of total state count). This gives a small performance uplift (e.g. 2-3% on M1 Ultra). Additionally for M2 architectures, we double the overall state size if there is available headroom. Inclusive of the first change, we can expect uplift of close to 10% in future, as this results in larger dispatch sizes and minimises work submission overheads. In order to make an accurate determination of available headroom, we defer the calculation of `num_concurrent_states` and `num_concurrent_busy_states` until the time of integrator state allocation (i.e. after all of the scene data has been allocated). We also refactor `alloc_integrator_soa` to calculate an exact single-state-size in a first pass, right before allocating the integrator SoA buffers in a second pass. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16313	2022-10-24 17:14:33 +01:00
Sergey Sharybin	2c108d5503	Avoid re-compilation of oneAPI AoT kernels when configuration changes Buildbot infrastructure relies on the fact that it can enable and disable `WITH_CYCLES_<COMPUTE>_BINARIES` without affecting speed of incremental builds. This allows buildbot to skip GPU kernels when doing CI regression tests which do not need GPU kernels, and it allows moving GPU kernel compilation to a separate step where all the resources are available to the GPU kernel builders. For the oneAPI compute, enabling and disabling AoT kernels has much higher implications due to the kernels being a part of the device implementation from the build target perspective. This change makes it so different target names are used for JIT and AoT configurations, which allows CMake to more fully benefit from "caching" the compiled result. The end goal of this change is to make it so sequential builds of the same code base on the buildbot happen super fast. The Blender binary still needs to be re-linked when the oneAPI AoT option is toggled, but that's already the case in the buildbot due to the WITH_BUILDINFO. Differential Revision: https://developer.blender.org/D16312	2022-10-21 17:17:51 +02:00


1118 Commits