griefith/test

Author	SHA1	Message	Date
Campbell Barton	ac447ba1a3	Cleanup: clang-format, trailing space	2021-11-30 10:15:17 +11:00
Campbell Barton	76471dbd5e	Cleanup: capitalize NOTE tag	2021-11-30 10:15:17 +11:00
Campbell Barton	4e45265dc6	Cleanup: spelling in comments & strings	2021-11-30 10:15:17 +11:00
Brecht Van Lommel	4fac3be146	Fix Cycles OptiX doing a bit too much work for almost opaque curve shadows Found in D13353, likely has no significant impact in performance.	2021-11-29 18:41:37 +01:00
Michael Jones	f613c4c095	Cycles: MetalRT support (kernel side) This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code: - MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`. - MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers. - Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` (for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13353	2021-11-29 15:20:26 +00:00
Michael Jones	eb7827e797	Cycles: Fix film convert address space mismatch on Metal This patch fixes an address space mismatch in the film convert kernels on Metal. The `film_get_pass_pixel_...` functions take a `ccl_private` result pointer, but the film convert kernels pass a `ccl_global` memory pointer. Specialising the pass-fetch functions with templates results in compilation errors on Visual Studio, so instead this patch just adds an intermediate local on Metal. Reviewed By: brecht Differential Revision: https://developer.blender.org/D13350	2021-11-26 13:58:48 +00:00
Sergey Sharybin	5ffb9b6dc4	Merge remote-tracking branch 'origin/blender-v3.0-release'	2021-11-25 10:22:32 +01:00
Sergey Sharybin	726bc3a46b	Fix T93155: Approximate shadow catcher displayed wrong on CPU and GPU Was happening during rendering, causing visual artifacts when doing CPU+GPU rendering, and giving different in-progress results on different devices. The root of the issue comes down to the fact that math used in the approximate shadow catcher calculation might have resulted in negative alpha channel, and negative values for display are handled differently on CPU and GPU. Such difference in handling is caused by an approximate conversion used on the CPU for performance reasons. This change makes it so no negative alpha is generated by the approximate shadow catcher. Not sure if we need some explicit clamping somewhere to deal with possible negative values coming from somewhere else. The shadow catcher cornell box tests are to be updated for the new code, but the new result seems to be more accurate. Differential Revision: https://developer.blender.org/D13354	2021-11-25 10:17:52 +01:00
William Leeson	c49d2cbe92	Merge branch 'blender-v3.0-release' to bring in D13042: Fix performance decrease with Scrambling Distance on	2021-11-25 09:41:03 +01:00
Alaska	b41c72b710	Fix performance decrease with Scrambling Distance on With the current code in master, devices without hardware accelerated ray tracing see a measurable performance decrease when comparing scrambling distance on vs off. From testing, this performance decrease comes from the large tile sizes scheduled in `tile.cpp`. This patch attempts to address the performance decrease by using different algorithms to calculate the tile size for devices with hardware accelerated ray traversal and devices without: large tile sizes for hardware accelerated devices and small tile sizes for others. Most of this code is based on proposals from @brecht and @leesonw Reviewed By: brecht, leesonw Differential Revision: https://developer.blender.org/D13042	2021-11-25 09:32:26 +01:00
Patrick Mours	7a97e925fd	Cycles: Add support for building with OptiX 7.4 SDK and use built-in catmull-rom curve type Some enum names were changed/removed in OptiX 7.4, so some changes are necessary to make things compile still. In addition, OptiX 7.4 also adds built-in support for catmull-rom curves, so it is no longer necessary to convert the catmull-rom data to cubic bsplines first, and has endcaps disabled by default now, so can remove the special handling via any-hit programs that filtered them out before. Differential Revision: https://developer.blender.org/D13351	2021-11-24 16:33:04 +01:00
Sergey Sharybin	db450c9320	Merge branch 'blender-v3.0-release'	2021-11-23 16:38:30 +01:00
Sergey Sharybin	70424195a8	Cycles: Fix possible access to non-initialized light sample in volume Happened in barbershop file where number of bounces to the light was reached. Differential Revision: https://developer.blender.org/D13336	2021-11-23 16:38:15 +01:00
Brecht Van Lommel	48c2b4012f	Merge branch 'blender-v3.0-release'	2021-11-22 21:06:10 +01:00
Brecht Van Lommel	29681f186e	Fix T93283: Cycles render error with CUDA CPU + GPU after recent optimization BVH2 triangle intersection was broken on the GPU since packed floats can't be loaded directly into SSE. The better long term solution for performance would be to build a BVH2 for GPU and Embree for CPU, similar to what we do for OptiX.	2021-11-22 21:02:46 +01:00
Brecht Van Lommel	e2b736aa40	Fix part of T93278: transparent glass option not working with environment pass	2021-11-22 20:58:09 +01:00
Brecht Van Lommel	06a2e2b28c	Merge branch 'blender-v3.0-release'	2021-11-19 18:05:17 +01:00
Brecht Van Lommel	1b686c60b5	Fix T93046: Cycles world volume rendering very slow in OptiX with some scenes With very long ray distance, OptiX ends up traversing many BVH nodes due to a feature that improves precision. However this causes very slow rendering. We now avoid generating such long rays by rejecting the few samples that have long ray distances and very low probability of being generated. This should not meaningfully affect render results. Thanks to Sergey and Patrick for the investigation.	2021-11-19 17:42:22 +01:00
Brecht Van Lommel	1b94c53aa6	Cleanup: fix typos in comments and docs Contributed by luzpaz. Differential Revision: https://developer.blender.org/D10447	2021-11-19 13:02:16 +01:00
Brecht Van Lommel	167ee8f2c7	Merge branch 'blender-v3.0-release'	2021-11-18 19:37:48 +01:00
Brecht Van Lommel	fd2a155d06	Fix T91797: Cycles volume rendering artifact with overlapping volumes With the new volume rendering code this was no longer accurate, we always need to use a new dimension for the next volume segment.	2021-11-18 19:27:37 +01:00
Sybren A. Stüvel	ada6742601	Merge remote-tracking branch 'origin/blender-v3.0-release'	2021-11-18 17:58:26 +01:00
Brecht Van Lommel	f0be276514	Fix T93082: Cycles baking not handling transparency correctly For baking, replace transparent BSDF with holdout. This ensures no objects behind are baked, and that the baked image has alpha.	2021-11-18 17:13:16 +01:00
Michael Jones	d1f944c186	Cycles: declare constants at program scope on Metal MSL requires that constant address space literals be declared at program scope. This patch moves the `blackbody_table_r/g/b` and `cie_colour_match` constants into separate files so they can be declared at the appropriate scope. Ref T92212 Differential Revision: https://developer.blender.org/D13241	2021-11-18 14:38:05 +01:00
Michael Jones	d19e35873f	Cycles: several small fixes and additions for MSL This patch contains many small leftover fixes and additions that are required for Metal-enablement: - Address space fixes and a few other small compile fixes - Addition of missing functionality to the Metal adapter headers - Addition of various scattered `__KERNEL_METAL__` blocks (e.g. for atomic support & maths functions) Ref T92212 Differential Revision: https://developer.blender.org/D13263	2021-11-18 14:38:02 +01:00
Brecht Van Lommel	c0d52db783	Merge branch 'blender-v3.0-release'	2021-11-18 14:33:43 +01:00
Brecht Van Lommel	bd2e3bb7bd	Fix T93045: Cycles HIP not rendering OpenVDB volumes Build HIP kernels with NanoVDB, and patch NanoVDB to work with HIP. This is a header only library so no rebuild is needed. The changes are being submitted upstream to openvdb, so this patch should be temporary. Thanks Thomas for help testing this.	2021-11-18 13:24:56 +01:00
Brecht Van Lommel	fa7a6d67a8	Fix Cycles CUDA/HIP compiler error after recent changes	2021-11-17 19:56:18 +01:00
Sebastian Herholz	d9bc8f189c	Cycles: add build option to enable a debugging feature for MIS This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds cycles with a feature that allows debugging/selecting the direct-light sampling strategy. The same option may later be used to add other debugging features that could affect performance in release builds. The three options are: * Forward path tracing (e.g., via BSDF or phase function) * Next-event estimation * Multiple importance sampling combination of the previous two methods Such a feature is useful for debugging different light sampling, evaluation, and pdf methods (e.g., for light sources and BSDFs). Differential Revision: https://developer.blender.org/D13152	2021-11-17 18:03:56 +01:00
Brecht Van Lommel	063ad8635e	Cycles: reduce triangle memory usage with packed_float3 Depends on D13243 Differential Revision: https://developer.blender.org/D13244	2021-11-17 17:29:41 +01:00
Brecht Van Lommel	9937d5379c	Cycles: add packed_float3 type for storage Introduce a packed_float3 type for smaller storage that is exactly 3 floats, instead of 4. For computation float3 is still used since it can use SIMD instructions. Ref T92212 Differential Revision: https://developer.blender.org/D13243	2021-11-17 17:29:41 +01:00
Hans Goudey	c9fb08e075	Merge branch 'blender-v3.0-release'	2021-11-16 14:55:13 -06:00
Brecht Van Lommel	7293c1b357	Fix T93106: Cycles SSS not working with normals pointing inside	2021-11-16 19:44:45 +01:00
Michael Jones	64003fa4b0	Cycles: Adapt volumetric lambda functions to work on MSL This patch adapts the existing volumetric read/write lambda functions for Metal. Lambda expressions are not supported on MSL, so two new macros `VOLUME_READ_LAMBDA` and `VOLUME_WRITE_LAMBDA` have been defined with a default implementation which, on Metal, is overridden to use inline function objects. This patch also removes the last remaining mention of the now-unused `ccl_addr_space`. Ref T92212 Reviewed By: leesonw Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13234	2021-11-16 13:42:23 +00:00
Campbell Barton	1143bf281a	Cleanup: spelling in comments, comment block formatting	2021-11-13 13:07:13 +11:00
Campbell Barton	acc800d24d	Cleanup: clang-format	2021-11-13 12:47:18 +11:00
Brecht Van Lommel	1b55b911f2	Merge branch 'blender-v3.0-release'	2021-11-12 20:04:05 +01:00
Brecht Van Lommel	b4d9b8b7f8	Fix T91893, T92455: wrong transmission pass with hair and multiscatter glass We need to increase GPU memory usage a bit. Unfortunately we can't get away with writing either reflection or transmission passes because these BSDFs may scatter in either direction but still must be in a fixed reflection or transmission category to match up with the color passes.	2021-11-12 20:03:46 +01:00
Brecht Van Lommel	ef0b8d6306	Fix T92002: no Cycles combined baking support for filter settings	2021-11-12 20:03:46 +01:00
Sergey Sharybin	ce395c84a3	Merge branch 'blender-v3.0-release'	2021-11-11 15:29:35 +01:00
Sergey Sharybin	d26d3cfe19	Fix T92868: Cycles catcher with transparency crashes The issue was caused by splitting happening twice. Fixed by checking for the split flag, which is assigned to both states during split. The tricky part was to write catcher data at the moment of split: the transparency and shadow catcher sample count is to be accumulated at that point. Now it is happening in the `intersect_closest` kernel. The downside is that render buffer is to be passed to the kernel, but the benefit is that extra split bounce check is not needed now. Had to move the passes write to shadow catcher header, since include of `film/passes.h` causes all the fun of requirement to have BSDF data structures available. Differential Revision: https://developer.blender.org/D13177	2021-11-11 15:21:35 +01:00
Andrii	c63e735f6b	Cycles: Add sample offset option This patch exposes the sampling offset option to Blender. It is located in the "Sampling > Advanced" panel. For example, this can be useful to parallelize rendering and distribute different chunks of samples for each computer to render. --- I also had to add this option to `RenderWork` and `RenderScheduler` classes so that the sample count in the status string can be calculated correctly. Reviewed By: leesonw Differential Revision: https://developer.blender.org/D13086	2021-11-11 09:39:25 +01:00
Brecht Van Lommel	3fa86f4b28	Merge branch 'blender-v3.0-release'	2021-11-10 20:19:09 +01:00
Brecht Van Lommel	6b0008129e	Fix T92972: Cycles HIP wrong render display after a recent refactor It's unclear why this fails. Maybe the size of half4 is not the expected 8 bytes and adjacent pixels are overwritten. Or there is some bug in the HIP compiler writing a struct into global memory, which we probably don't do elsewhere in the kernel. Thanks to Thomas, William and Jeroen for helping investigate this.	2021-11-10 20:03:07 +01:00
Patrick Mours	f565620435	Fix T92985: CUDA errors with Cycles film convert kernels rB3a4c8f406a3a3bf0627477c6183a594fa707a6e2 changed the macros that create the film convert kernel entry points, but in the process accidentally changed the parameter definition to one of those (which caused CUDA launch and misaligned address errors) and changed the implementation as well. This restores the correct implementation from before. In addition, the `ccl_gpu_kernel_threads` macro did not work as intended and caused the generated launch bounds to end up with an incorrect input for the second parameter (it was set to "thread_num_registers", rather than the result of the block number calculation). I'm not entirely sure why, as the macro definition looked sound to me. Decided to simply go with two separate macros instead, to simplify and solve this. Also changed how state is captured with the `ccl_gpu_kernel_lambda` macro slightly, to avoid a compiler warning (expression has no effect) that otherwise occurred. Maniphest Tasks: T92985 Differential Revision: https://developer.blender.org/D13175	2021-11-10 15:49:50 +01:00
Michael Jones	3a4c8f406a	Cycles: Adapt shared kernel/device/gpu layer for MSL This patch adapts the shared kernel entrypoints so that they can be compiled as MSL (Metal Shading Language). Where possible, the adaptations avoid changes in common code. In MSL, kernel function inputs are explicitly bound to resources. In the case of argument buffers, we declare a struct containing the kernel arguments, accessible via device pointer. This differs from CUDA and HIP where kernel function arguments are declared as traditional C-style function parameters. This patch adapts the entrypoints declared in kernel.h so that they can be translated via a new `ccl_gpu_kernel_signature` macro into the required parameter struct + kernel entrypoint pairing for MSL. MSL buffer attribution must be applied to function parameters or non-static class data members. To allow universal access to the integrator state, kernel data, and texture fetch adapters, we wrap all of the shared kernel code in a `MetalKernelContext` class. This is achieved by bracketing the appropriate kernel headers with "context_begin.h" and "context_end.h" on Metal. When calling deeper into the kernel code, we must reference the context class (e.g. `context.integrator_init_from_camera`). This extra prefixing is performed by a set of defines in "context_end.h". These will require explicit maintenance if entrypoints change. We invite discussion on more maintainable ways to enforce correctness. Lambda expressions are not supported on MSL, so a new `ccl_gpu_kernel_lambda` macro generates an inline function object and optionally captures any required state. This yields the same behaviour. This approach is applied to all parallel_... implementations which are templated by operation. The lambda expressions in the film_convert... kernels don't adapt cleanly to use function objects. However, these entrypoints can be macro-generated more concisely to avoid lambda expressions entirely, instead relying on constant folding to handle the pixel/channel conversions. A separate implementation of `gpu_parallel_active_index_array` is provided for Metal to work around some subtle differences in SIMD width, and also to encapsulate some required thread parameters which must be declared as explicit entrypoint function parameters. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13109	2021-11-09 21:43:10 +00:00
Brecht Van Lommel	5f44298280	Fix T92645: Cycles OSL crash due to use of uninitialized pointer Thanks to Ilja Razinkov for identifying the problem and solution.	2021-11-09 15:29:41 +01:00
Patrick Mours	440a3475b8	Cycles: Improve OptiX denoising with dark images and fix crash when denoiser is destroyed Adds a pass before denoising that calculates the intensity of the image, which can be passed into the OptiX denoiser for more optimal results for very dark or very bright images. In addition this also fixes a crash that sometimes occurred on exit. The OptiX denoiser object has to be destroyed before the OptiX device context object (since it references that). But in C++ the destructor function of a class is called before its fields are destructed, so "~OptiXDevice" was always called before "OptiXDevice::~Denoiser" and therefore "optixDeviceContextDestroy" was called before "optixDenoiserDestroy", hence the crash. Differential Revision: https://developer.blender.org/D13160	2021-11-09 14:49:00 +01:00
Brecht Van Lommel	c56cf50bd0	Fix T92876: Cycles incorrect volume emission + absorption handling	2021-11-09 13:04:58 +01:00
Brecht Van Lommel	97ff37bf54	Cycles: perform CPU film reading in the kernel, to use AVX2 half conversion Adds a bunch of CPU kernel functions to process one row of pixels, and uses those instead of calling unoptimized implementations. Fixes T92598	2021-11-05 22:04:36 +01:00


2704 Commits