test2

Author	SHA1	Message	Date
Sybren A. Stüvel	ada6742601	Merge remote-tracking branch 'origin/blender-v3.0-release'	2021-11-18 17:58:26 +01:00
Brecht Van Lommel	f0be276514	Fix T93082: Cycles baking not handling transparency correctly For baking, replace transparent BSDF with holdout. This ensures no objects behind are baked, and that the baked image has alpha.	2021-11-18 17:13:16 +01:00
Sebastian Herholz	d9bc8f189c	Cycles: add build option to enable a debugging feature for MIS This patch adds a CMake option "WITH_CYCLES_DEBUG" which builds Cycles with a feature that allows debugging/selecting the direct-light sampling strategy. The same option may later be used to add other debugging features that could affect performance in release builds. The three options are: * Forward path tracing (e.g., via BSDF or phase function) * Next-event estimation * Multiple importance sampling combination of the previous two methods Such a feature is useful for debugging different light sampling, evaluation, and pdf methods (e.g., for light sources and BSDFs). Differential Revision: https://developer.blender.org/D13152	2021-11-17 18:03:56 +01:00
Brecht Van Lommel	9937d5379c	Cycles: add packed_float3 type for storage Introduce a packed_float3 type for smaller storage that is exactly 3 floats, instead of 4. For computation float3 is still used since it can use SIMD instructions. Ref T92212 Differential Revision: https://developer.blender.org/D13243	2021-11-17 17:29:41 +01:00
Hans Goudey	c9fb08e075	Merge branch 'blender-v3.0-release'	2021-11-16 14:55:13 -06:00
Brecht Van Lommel	7293c1b357	Fix T93106: Cycles SSS not working with normals pointing inside	2021-11-16 19:44:45 +01:00
Michael Jones	64003fa4b0	Cycles: Adapt volumetric lambda functions to work on MSL This patch adapts the existing volumetric read/write lambda functions for Metal. Lambda expressions are not supported on MSL, so two new macros `VOLUME_READ_LAMBDA` and `VOLUME_WRITE_LAMBDA` have been defined with a default implementation which, on Metal, is overridden to use inline function objects. This patch also removes the last remaining mention of the now-unused `ccl_addr_space`. Ref T92212 Reviewed By: leesonw Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13234	2021-11-16 13:42:23 +00:00
Brecht Van Lommel	1b55b911f2	Merge branch 'blender-v3.0-release'	2021-11-12 20:04:05 +01:00
Brecht Van Lommel	b4d9b8b7f8	Fix T91893, T92455: wrong transmission pass with hair and multiscatter glass We need to increase GPU memory usage a bit. Unfortunately we can't get away with writing either reflection or transmission passes because these BSDFs may scatter in either direction but still must be in a fixed reflection or transmission category to match up with the color passes.	2021-11-12 20:03:46 +01:00
Brecht Van Lommel	ef0b8d6306	Fix T92002: no Cycles combined baking support for filter settings	2021-11-12 20:03:46 +01:00
Sergey Sharybin	ce395c84a3	Merge branch 'blender-v3.0-release'	2021-11-11 15:29:35 +01:00
Sergey Sharybin	d26d3cfe19	Fix T92868: Cycles catcher with transparency crashes The issue was caused by splitting happening twice. Fixed by checking for the split flag, which is assigned to both states during split. The tricky part was to write catcher data at the moment of split: the transparency and shadow catcher sample count is to be accumulated at that point. Now it is happening in the `intersect_closest` kernel. The downside is that the render buffer has to be passed to the kernel, but the benefit is that the extra split bounce check is no longer needed. Had to move the pass writes to the shadow catcher header, since including `film/passes.h` brings in the requirement to have BSDF data structures available. Differential Revision: https://developer.blender.org/D13177	2021-11-11 15:21:35 +01:00
Andrii	c63e735f6b	Cycles: Add sample offset option This patch exposes the sampling offset option to Blender. It is located in the "Sampling > Advanced" panel. For example, this can be useful to parallelize rendering and distribute different chunks of samples for each computer to render. --- I also had to add this option to `RenderWork` and `RenderScheduler` classes so that the sample count in the status string can be calculated correctly. Reviewed By: leesonw Differential Revision: https://developer.blender.org/D13086	2021-11-11 09:39:25 +01:00
Brecht Van Lommel	6c24cafecc	Fix T92876: Cycles incorrect volume emission + absorption handling	2021-11-09 13:13:56 +01:00
Brecht Van Lommel	c56cf50bd0	Fix T92876: Cycles incorrect volume emission + absorption handling	2021-11-09 13:04:58 +01:00
Brecht Van Lommel	d1a9425a2f	Fix T91733, T92486: Cycles wrong shadow catcher with volumes Changes: * After hitting a shadow catcher, re-initialize the volume stack taking into account shadow catcher ray visibility. This ensures that volume objects are included in the stack only if they are shadow catchers. * If there is a volume to be shaded in front of the shadow catcher, the split is now performed in the shade_volume kernel after volume shading is done. * Previously the background pass behind a shadow catcher was done as part of the regular path, now it is done as part of the shadow catcher path. For a shadow catcher path with volumes and visible background, operations are done in this order now: * intersect_closest * shade_volume * shadow catcher split * intersect_volume_stack * shade_background * shade_surface The world volume is currently assumed to be CG, that is it does not exist in the footage. We may consider adding an option to control this, or change the default. With a volume object this control is already possible. This includes refactoring to centralize the logic for next kernel scheduling in intersect_closest.h. Differential Revision: https://developer.blender.org/D13093	2021-11-05 20:50:19 +01:00
Brecht Van Lommel	5c34e34195	Fix part of T91797: Cycles CPU and GPU render differences with camera inside volume	2021-11-04 19:03:49 +01:00
Brecht Van Lommel	ffe115d1a8	Fix T92450: Cycles wrong render with overlapping glass, transparency and volumes We need to store the continuation probability used to make the termination decision in intersect_closest, instead of recomputing it in shade_surface. Because otherwise a shade_volume in between can change the throughput and change the probability.	2021-11-04 16:39:49 +01:00
William Leeson	0b060905d9	Fix T92575: Cycles black pixels when rendering with > 65k samples Differential Revision: https://developer.blender.org/D13039	2021-11-01 08:36:50 +01:00
Brecht Van Lommel	f2cc38a62b	Fix T92255: Cycles Christensen-Burley render errors with scaled objects	2021-10-28 21:53:30 +02:00
Brecht Van Lommel	673984b222	Fix T92158: Cycles crash with Fast GI and area light MIS	2021-10-28 21:33:52 +02:00
Brecht Van Lommel	fd25e883e2	Cycles: remove prefix from source code file names Remove prefix of filenames that is the same as the folder name. This used to help when #includes were using individual files, but now they are always relative to the cycles root directory and so the prefixes are redundant. For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.	2021-10-26 15:37:04 +02:00
Brecht Van Lommel	d7d40745fa	Cycles: changes to source code folders structure * Split render/ into scene/ and session/. The scene/ folder now contains the scene and its nodes. The session/ folder contains the render session and associated data structures like drivers and render buffers. * Move top level kernel headers into new folders kernel/camera/, kernel/film/, kernel/light/, kernel/sample/, kernel/util/ * Move integrator related kernel headers into kernel/integrator/ * Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/ For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.	2021-10-26 15:36:39 +02:00
Brecht Van Lommel	75704091fc	Cycles: add additive AO support through Fast GI settings Add a Fast GI Method, either Replace for the existing behavior, or Add to add ambient occlusion like the old world settings. This replaces the old Ambient Occlusion settings in the world properties.	2021-10-26 14:56:43 +02:00
Brecht Van Lommel	be558d2d97	Fix T92363: OptiX fails with ambient occlusion node, after recent changes This triggered a compiler bug where it does not handle the sub.s16 PTX instruction. Instead refactor the code so we don't need to do uint16_t subtraction at all. Also update OptiX device to remove the AO pass direct callable. Thanks Patrick Mours for figuring this out.	2021-10-21 21:25:34 +02:00
Brecht Van Lommel	df00463764	Cycles: add shadow path compaction for GPU rendering Similar to main path compaction that happens before adding work tiles, this compacts shadow paths before launching kernels that may add shadow paths. Only do it when more than 50% of space is wasted. It's not a clear win in all scenes, some are up to 1.5% slower. Likely caused by different order of scheduling kernels having an unpredictable performance impact. Still feels like compaction is just the right thing to avoid cases where a few shadow paths can hold up a lot of main paths. Differential Revision: https://developer.blender.org/D12944	2021-10-21 15:38:03 +02:00
Brecht Van Lommel	52c5300214	Cleanup: some renaming to better distinguish main and shadow paths	2021-10-20 17:50:31 +02:00
Brecht Van Lommel	cccfa597ba	Cycles: make ambient occlusion pass take into account transparency again Taking advantage of the new decoupled main and shadow paths. For CPU we just store two nested structs in the integrator state, one for direct light shadows and one for AO. For the GPU we restrict the number of shade surface states to be executed based on available space in the shadow paths queue. This also helps improve performance in benchmark scenes with an AO pass, since it is no longer needed to use the shader raytracing kernel there, which has worse performance. Differential Revision: https://developer.blender.org/D12900	2021-10-20 17:50:31 +02:00
Brecht Van Lommel	fd77a28031	Cycles: bake transparent shadows for hair These transparent shadows can be expensive to evaluate. Especially on the GPU they can lead to poor occupancy when only some pixels require many kernel launches to trace and evaluate many layers of transparency. Baked transparency allows tracing a single ray in many cases by accumulating the throughput directly in the intersection program without recording hits or evaluating shaders. Transparency is baked at curve vertices and interpolated; for most shaders this will look practically the same as actual shader evaluation. Fixes T91428, performance regression with spring demo file due to transparent hair, and makes it render significantly faster than Blender 2.93. Differential Revision: https://developer.blender.org/D12880	2021-10-19 15:11:09 +02:00
Brecht Van Lommel	d06828f0b8	Cycles: avoid intermediate stack array for writing shadow intersections Helps save one OptiX payload and is a bit more efficient. Differential Revision: https://developer.blender.org/D12909	2021-10-19 15:10:55 +02:00
Brecht Van Lommel	943e73b07e	Cycles: decouple shadow paths from main path on GPU The motivation for this is twofold. It improves performance (5-10% on most benchmark scenes), and will help to bring back transparency support for the ambient occlusion pass. * Duplicate some members from the main path state in the shadow path state. * Add shadow paths incrementally to the array similar to what we do for the shadow catchers. * For the scheduling, allow running shade surface and shade volume kernels as long as there is enough space in the shadow paths array. If not, execute shadow kernels until it is empty. * Add IntegratorShadowState and ConstIntegratorShadowState typedefs that can be different between CPU and GPU. For GPU both main and shadow paths just have an integer for SoA access. But with CPU it's a different pointer type so we get type safety checks in code shared between CPU and GPU. * For CPU, add a separate IntegratorShadowStateCPU struct embedded in IntegratorShadowState. * Update various functions to take the shadow state, and make SVM take either type of state using templates. Differential Revision: https://developer.blender.org/D12889	2021-10-19 15:09:29 +02:00
Brecht Van Lommel	41eba47a87	Revert "Cycles: optimize volume stack copying for shadow catcher/compaction" This reverts commit `3065d26097`. Causing crashes in the spring scene.	2021-10-18 22:38:33 +02:00
Brecht Van Lommel	a9cb330815	Cleanup: minor refactoring in preparation of main and shadow path decoupling Ref D12889	2021-10-18 19:02:10 +02:00
Brecht Van Lommel	2430f75279	Cycles: reduce GPU state memory a little * isect Ng is no longer needed for shadows, for main path needed for SSS only * Reduce rng_offset and queued_kernel to 16 bits Ref D12889	2021-10-18 19:02:10 +02:00
Brecht Van Lommel	3065d26097	Cycles: optimize volume stack copying for shadow catcher/compaction Only copy the number of items used instead of the max items. Ref D12889	2021-10-18 19:02:10 +02:00
Brecht Van Lommel	fc4b1fede3	Cleanup: consistently use uint32_t for path flag	2021-10-18 19:02:10 +02:00
Brecht Van Lommel	1df3b51988	Cycles: replace integrator state argument macros * Rename struct KernelGlobals to struct KernelGlobalsCPU * Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs that every device can define in its own way. * Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and replace with these new typedefs. * Add explicit state argument to INTEGRATOR_STATE and similar macros In preparation for decoupling main and shadow paths. Differential Revision: https://developer.blender.org/D12888	2021-10-18 19:02:10 +02:00
Brecht Van Lommel	2ba7c3aa65	Cleanup: refactor to make number of channels for shader evaluation variable	2021-10-15 15:42:44 +02:00
Michael Jones	a0f269f682	Cycles: Kernel address space changes for MSL This is the first of a sequence of changes to support compiling Cycles kernels as MSL (Metal Shading Language) in preparation for a Metal GPU device implementation. MSL requires that all pointer types be declared with explicit address space attributes (device, thread, etc...). There is already precedent for this with Cycles' address space macros (ccl_global, ccl_private, etc...), therefore the first step of MSL-enablement is to apply these consistently. Line-for-line this represents the largest change required to enable MSL. Applying this change first will simplify future patches as well as offering the emergent benefit of enhanced descriptiveness. The vast majority of deltas in this patch fall into one of two cases: - Ensuring ccl_private is specified for thread-local pointer types - Ensuring ccl_global is specified for device-wide pointer types Additionally, the ccl_addr_space qualifier can be removed. Prior to Cycles X, ccl_addr_space was used as a context-dependent address space qualifier, but now it is either redundant (e.g. in struct typedefs), or can be replaced by ccl_global in the case of pointer types. Associated function variants (e.g. lcg_step_float_addrspace) are also redundant. In cases where address space qualifiers are chained with "const", this patch places the address space qualifier first. The rationale for this is that the choice of address space is likely to have the greater impact on runtime performance and overall architecture. The final part of this patch is the addition of a metal/compat.h header. This is partially complete and will be extended in future patches, paving the way for the full Metal implementation. Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D12864	2021-10-14 16:14:43 +01:00
Sergey Sharybin	aa46459543	Fix shadow catcher behind transparent object on GPU The assumption about absent shadow path was wrong. The rest of the changes are to ensure shadow paths are finished prior to the split, so that they write to the proper passes. The issue was caught by running regression tests on OptiX. Differential Revision: https://developer.blender.org/D12857	2021-10-14 09:39:38 +02:00
Campbell Barton	c1c6c11ca6	Cleanup: spelling in comments	2021-10-12 17:55:02 +11:00
Brecht Van Lommel	a94343a8af	Cycles: improve SSS Fresnel and retro-reflection in Principled BSDF For details see the "Extending the Disney BRDF to a BSDF with Integrated Subsurface Scattering" paper. We split the diffuse BSDF into a lambertian and retro-reflection component. The retro-reflection component is always handled as a BSDF, while the lambertian component can be replaced by a BSSRDF. For the BSSRDF case, we compute Fresnel separately at the entry and exit points, which may have different normals. As the scattering radius decreases this converges to the BSDF case. A downside is that this increases noise for subsurface scattering in the Principled BSDF, due to some samples going to the retro-reflection component. However the previous logic (also in 2.93) was simply wrong, using a nonsensical view direction vector at the exit point. We use an importance sampling weight estimate for the retro-reflection to try to better balance samples between the BSDF and BSSRDF. Differential Revision: https://developer.blender.org/D12801	2021-10-11 18:22:54 +02:00
Brecht Van Lommel	73a05ff9e8	Cycles: restore Christensen-Burley SSS There is not enough time before the release to improve Random Walk to handle all cases this was used for, so restore it for now. Since there is no more path splitting in cycles-x, this can increase noise in non-flat areas for the same number of samples, though fewer rays will be traced also. This is fundamentally a trade-off we made in the new design and why Random Walk is a better fit. However the importance resampling we do now does help to reduce noise. Differential Revision: https://developer.blender.org/D12800	2021-10-11 18:22:54 +02:00
Brecht Van Lommel	736be7cf58	Fix T91997: Cycles glass + SSS not rendering correctly	2021-10-08 16:11:02 +02:00
Sergey Sharybin	f01c4f27f9	Fix Cycles speed regression after dynamic volume stack change Only copy the required part of the volume stack instead of the entire stack. Solves the time regression introduced by D12759 and avoids the need to implement volume stack calculation that exactly matches what the path tracing will do (and potentially makes scenes with a lot of volumes and a tiny bit of deeply nested ones render faster). Still need to look into the memory aspect of the regression, but that is for a separate patch. Ref T92014 Maniphest Tasks: T92014 Differential Revision: https://developer.blender.org/D12790	2021-10-08 15:44:03 +02:00
Campbell Barton	de07bf2b13	Cleanup: spelling	2021-10-08 13:23:19 +11:00
Brecht Van Lommel	04857cc8ef	Cycles: fully decouple triangle and curve primitive storage from BVH2 Previously the storage here was optimized to avoid indirections in BVH2 traversal. This helps improve performance a bit, but makes performance and memory usage of Embree and OptiX BVHs a bit worse also. It also adds code complexity in other parts of the code. Now decouple triangle and curve primitive storage from BVH2. * Reduced peak memory usage on all devices * Bit better performance for OptiX and Embree * Bit worse performance for CUDA * Simplified code: Intersection.prim/object now matches ShaderData.prim/object; no more offset manipulation for mesh displacement before a BVH is built; remove primitive packing code and flags for Embree and OptiX; curve segments are now stored in a KernelCurve struct * Also happens to fix a bug in baking with incorrect prim/object Fixes T91968, T91770, T91902 Differential Revision: https://developer.blender.org/D12766	2021-10-06 17:52:04 +02:00
Sergey Sharybin	0194e54fd3	Fix compilation error with MSVC MSVC does not support variable size array definition. Use maximum possible stack, similar to the GPU case. Not expected to have user-measurable difference.	2021-10-06 16:51:07 +02:00
Sergey Sharybin	c6275da852	Fix T91922: Cycles artifacts with high volume nested level Make volume stack allocated conditionally, potentially based on the actual nested level of objects in the scene. Currently the nested level is estimated by number of volume objects. This is a non-expensive check which is probably enough in practice to get almost perfect memory usage and performance. The conditional allocation is a bit tricky. For the CPU we declare and define maximum possible volume stack, because there are only that many integrator states on the CPU. On the GPU we declare outer SoA to have all volume stack elements, but only allocate actually needed ones. The actually used volume stack size is passed as a pre-processor, which seems to be easiest and fastest for the GPU state copy. There seems to be no speed regression in the demo files on RTX6000. Note that scenes with high nested level of volume will now be slower but correct. Differential Revision: https://developer.blender.org/D12759	2021-10-06 15:46:32 +02:00
Sergey Sharybin	f806bd8261	Fix T91861: Black environment behind shadow catcher Always sample the background pass behind the shadow catcher (if the pass exists, of course), regardless of whether the shadow catcher will be used as approximate or accurate. Allows combining accurate shadows into an environment map. Differential Revision: https://developer.blender.org/D12747	2021-10-04 15:07:32 +02:00

58 Commits