griefith/test

Author	SHA1	Message	Date
Campbell Barton	cccc2c77c5	Cleanup: consistent for C-style comment blocks	2025-08-08 07:37:33 +10:00
Campbell Barton	e8501d2f54	Cleanup: grammar corrections, minor improvements to wording Also back-tick quote some code references in comments to differentiate them from English text.	2025-08-06 00:20:39 +00:00
Lukas Stockner	793040ad1c	Cycles: Improve parameter packing for the Principled BSDF The Principled BSDF has a ton of inputs, and the previous SVM code just always allocated stack space for all of them. This results in a ton of additional NODE_VALUE_x SVM nodes, which slow down execution. However, this is not really needed for two reasons: - First, many inputs are only used conditionally. For example, if the subsurface weight is zero, none of the other subsurface inputs are used. - Many of the inputs have a "usual" value that they will have in most materials, so if they happen to have that value we can just indicate that by not allocating space for them. This is a bit similar to the standard "pack the fixed value and provide a stack offset if there's a link" pattern, except that the fixed value is a constant in the code and we allocate a NODE_VALUE_x if a different fixed value is used. Therefore, this PR re-implements the parameter packing in a more efficient way: - If we can determine that a component is disabled, all conditional inputs are disconnected (to avoid generating upstream nodes). - If we can determine that a component is disabled, we skip allocating all conditional inputs on the stack. - The inputs for which a reasonable "usual" value exists are changed to respect that, and to only be allocated if they differ. - param1 and param2 (which are fixed-value-packed as on all BSDF nodes) are used to store IOR and roughness, which have a decent chance to be fixed values. - The parameter packing is more aggressive about using uchar4, which allows to get rid of two SVM nodes while still storing the same inputs. 
The result is a considerable speedup in scenes that make heavy use of the Principled BSDF:

| Scene | CPU speedup | OptiX speedup |
| --- | --- | --- |
| attic | 5% | 9% |
| bistro | 5% | 8% |
| junkshop | 5% | 10% |
| monster | 3% | 4% |
| spring | 1% | 6% |

Pull Request: https://projects.blender.org/blender/blender/pulls/143910	2025-08-04 18:34:58 +02:00
Amogh Shivaram	ff4d840cf8	Cycles: Add polarized Fresnel function for conductors This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131). For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence. I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5-3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU. There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`: - one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance - there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined Pull Request: https://projects.blender.org/blender/blender/pulls/143903	2025-08-04 15:36:36 +02:00
Lukas Stockner	3107d1f962	Cycles: Improve parameter packing for BSDFs and emission This replaces `stack_assign` with `stack_assign_if_linked`, which should save a few SVM nodes for constant parameters. Running benchmarks (all scenes in the benchmark repo, 3 runs, median value for each) shows 1.0% improvement on CPU and 1.5% on OptiX. Not huge, but fairly consistent (all between -0.2% and 3.0%). Pull Request: https://projects.blender.org/blender/blender/pulls/143404	2025-08-04 15:19:40 +02:00
Campbell Barton	a3bf386d43	Cleanup: use full sentences in text editor code-comments Also minor improvements, clarifications.	2025-08-02 13:33:05 +10:00
Weizhen Huang	1667d69d3b	Cleanup: Cycles: use `constexpr` in kernel instead of lambda and macro guard. Should be possible after `ce0ae95ed3` Pull Request: https://projects.blender.org/blender/blender/pulls/143723	2025-08-01 14:06:13 +02:00
Campbell Barton	2c27d2be54	Cleanup: grammar corrections, minor improvements to wording	2025-08-01 21:41:24 +10:00
Hugh Delaney	930a942dd0	Refactor: Cycles: Move block sizes into common header This change puts all the block size macros in the same common header, so they can be included in host side code without needing to also include the kernels that are defined in the device headers that contained these values. This change also removes a magic number used to enqueue a kernel, which happened to agree with the GPU_PARALLEL_SORT_BLOCK_SIZE macro. Pull Request: https://projects.blender.org/blender/blender/pulls/143646	2025-08-01 13:26:02 +02:00
Weizhen Huang	f8eae6b58a	Fix: Cycles: Division by zero in Oren-Nayar shader `Eavg` can still be 1 for very small roughness, causing division by zero when computing `Ems`. A roughness of 1e-5 gives an `Eavg` of 0.999998, seems reasonable. Pull Request: https://projects.blender.org/blender/blender/pulls/143637	2025-07-30 16:57:00 +02:00
Patrick Mours	6487395fa5	Cycles: Add linear curve shape Add new "Linear 3D Curves" option in the Curves panel in the render properties. This renders curves as linear segments rather than smooth curves, for faster render time at the cost of accuracy. On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth curves, due to hardware acceleration. On NVIDIA Ada there is still a 3x speedup, and CPU and other GPU backends will also render this faster. A difference with smooth curves is that these have end caps, as this was simpler to implement and they are usually helpful anyway. In the future this functionality will also be used to properly support the CURVE_TYPE_POLY on the new curves object. Pull Request: https://projects.blender.org/blender/blender/pulls/139735	2025-07-29 17:05:01 +02:00
Brecht Van Lommel	f03ac5ec4b	Fix #142876 : Cycles crash with OSL and interactive updates Update use_shading, use_camera and the shading system pointers in the same location, so that when the render is interrupted they are in a consistent state. The added null pointer checks are not strictly needed, but just in case it goes out of sync for another reason. Pull Request: https://projects.blender.org/blender/blender/pulls/143467	2025-07-28 18:43:57 +02:00
Weizhen Huang	ea45c776fd	Cycles: introduce dual types to replace some uses of dfdx/dfdy/differentials. No functional change expected. Pull Request: https://projects.blender.org/blender/blender/pulls/143178	2025-07-28 17:34:24 +02:00
Weizhen Huang	345d23bff8	Cleanup: Cycles: add more float3 util functions and vectorize `wrap` and `safe_fmod`.	2025-07-28 17:34:21 +02:00
Weizhen Huang	48777385c2	Cleanup: Cycles: simplify computation of dPdx and dPdy `sd->dPdu`, `sd->dPdv`, `sd->du` and `sd->dv` are computed from `sd->dP` by constructing a local frame, so both results are the same, subject to some numerical differences. This avoids constructing the local frame again, so might be faster.	2025-07-28 17:34:21 +02:00
Weizhen Huang	f9a65ebbea	Cleanup: Cycles: Deduplicate SVM bump functions	2025-07-28 17:34:21 +02:00
Sergey Sharybin	dcae48d1d3	Cycles: Add Portal Depth light pass information It allows implementing tricks based on knowing whether the path ever came through a portal or not, and even something more advanced based on the number of portals. The main current objective is for strokes shading: the stroke shader uses Ray Portal BSDF to place the ray at the center of the stroke and point it in the direction of the surface it is generated for. This gives the stroke a single color which matches the shading of the original object. For this use case to work the ray bounced from the original surface should ignore the strokes, which is now possible by using the Portal Depth input and mixing with the Transparent BSDF. It also helps to make shading look better when there are multiple stroke layers. A solution of using portal depth is chosen over a single flag due to various factors: - Last time we looked into it, it was a bit tricky to implement as a flag due to us running out of bits. - It feels like a more flexible solution, even though it is a bit hard to come up with a 100% compelling setup for it. - It needs to be slightly different from the current "Is Foo" flags, and be more "Is Portal Descendant" or something. An extra uint16 is added to the state to count the portal depth, but it is only allocated for scenes that use Ray Portal BSDF. Portal BSDF still increments Transparent bounce, as it is required to have some "limiting" factor so that the ray does not get moved to different places in the scene indefinitely. Ref #125213 Pull Request: https://projects.blender.org/blender/blender/pulls/143107	2025-07-25 18:09:38 +02:00
Weizhen Huang	3a1fbe17b9	Fix: OSL: Attribute "generated" not available for World and Point Cloud When "generated" is required via Attribute Node, it is not available for World and Point Cloud. Make OSL match the SVM behavior to use the object coordinates (see `svm/attribute.h`). Pull Request: https://projects.blender.org/blender/blender/pulls/143198	2025-07-25 15:39:06 +02:00
Brecht Van Lommel	47f9b7a98e	Fix #142022 : Cycles undisplaced normal not available Previously with adaptive subdivision this happened to work with the N attribute, but that was not meant to be undisplaced. This adds a new undisplaced_N attribute specifically for this purpose. For backwards compatibility in Blender 4.5, this also keeps N undisplaced. But that will be changed in 5.0. Pull Request: https://projects.blender.org/blender/blender/pulls/142090	2025-07-24 18:16:25 +02:00
Michael Jones	f3485cc925	Cycles: MetalRT: Only use extended limits if needed (revisited) Currently MetalRT renders always use extended limits, which is needed to correctly render scenes where the max primitive count can exceed 2^28 or the instance count can exceed 2^24. This patch adopts Metal best practices of only enabling this flag if it is needed. This PR is similar to #133364, but there are some notable differences: 1) The old PR made an overly optimistic assumption that all the relevant visibility bits could be squeezed into 8 bits. This new PR adopts the same approach that Optix takes of using 8 bits as a primary HW filter, and checking the full 32 bit mask inside the SW intersection handler. ~~2) I moved the scene scanning check from Scene into MetalDevice. This avoids platform specific details leaking into platform agnostic areas.~~ ~~3) In live viewport mode, we always use extended limits in case we tip over the threshold.~~ _EDIT:_ 2) The limits are scanned in `Scene::update_kernel_features`, and given to the device by a new `set_bvh_limits` method which returns true if the BVH and kernels need to be reloaded. Pull Request: https://projects.blender.org/blender/blender/pulls/142401	2025-07-24 13:27:20 +02:00
Weizhen Huang	bb689687a7	Fix #140892 : Cycles: equiangular sampling numerical issue When `delta` is significantly larger than the difference between `tmin` and `tmax`, the precision of `theta_a` and `theta_b` reduces, resulting in banding artefacts. For equiangular sampling, we use uniform sampling to fix this case. For light tree, we use the equality atan(a) + atan(b) = atan2(a + b, 1 - a*b). Pull Request: https://projects.blender.org/blender/blender/pulls/142845	2025-07-23 12:56:11 +02:00
Thomas Dinges	ce0ae95ed3	Cycles: Bump minimum supported CUDA architecture to sm_50 Pull Request: https://projects.blender.org/blender/blender/pulls/142212	2025-07-21 19:49:21 +02:00
Weizhen Huang	5a27edcf79	Fix #141136 : Cycles Hair black artifacts with direct coloring. Fixed by clamping negative input colors. Pull Request: https://projects.blender.org/blender/blender/pulls/142667	2025-07-21 12:15:04 +02:00
Nikita Sirgienko	9875836519	Cycles: oneAPI: Compile only needed device binaries in multi-GPU case The code of the "oneapi_load_kernels" function before this modification was loading kernels and compiling them, if needed, for all devices in the associated GPU context. This makes sense for a single-GPU execution scenario, as well as for a scenario with multiple identical GPUs, but in cases where Blender users have several different GPUs in a render, the previous implementation would compile all kernels for all devices for each device, unnecessarily doing the same work multiple times. Because of this, I am changing the implementation so that compilation now happens only once per used device, and only for that device, ensuring that no unnecessary work is done. No render performance changes are expected.	2025-07-19 14:15:36 +02:00
Sebastian Herholz	20e0fed7da	Cycles: Fixing wrong PDF evaluation when BSDF closures are excluded by the light source Pull Request: https://projects.blender.org/blender/blender/pulls/142323	2025-07-18 17:13:54 +02:00
Michael Jones	8077384e3a	Cycles: Improve Metal kernel specialisation This improves the existing scene specialisation mechanism by replacing "kernel_data.kernel_features" with a function constant. It doesn't cause any additional compilation requests, but allows the backend compiler to eliminate more dead code. An additional compiler hint is provided for dead-stripping "volume_stack_enter_exit" which results in slightly faster rendering of non-volumetric scenes. Pull Request: https://projects.blender.org/blender/blender/pulls/142235	2025-07-18 11:18:43 +02:00
Campbell Barton	9d41b04aec	Cleanup: quiet warnings, typo	2025-07-18 12:03:53 +10:00
Brecht Van Lommel	73fe848e07	Fix: Cycles log levels conflict with macros on some platforms In particular DEBUG, but prefix all of them to be sure. Pull Request: https://projects.blender.org/blender/blender/pulls/141749	2025-07-10 19:44:14 +02:00
Lukas Stockner	eaa5f63ba2	Cycles: Replace thin-film basis function approximation with accurate LUTs Previously, we used precomputed Gaussian fits to the XYZ CMFs, performed the spectral integration in that space, and then converted the result to the RGB working space. That worked because we're only supporting dielectric base layers for the thin film code, so the inputs to the spectral integration (reflectivity and phase) are both constant w.r.t. wavelength. However, this will no longer work for conductive base layers. We could handle reflectivity by converting to XYZ, but that won't work for phase since its effect on the output is nonlinear. Therefore, it's time to do this properly by performing the spectral integration directly in the RGB primaries. To do this, we need to: - Compute the RGB CMFs from the XYZ CMFs and XYZ-to-RGB matrix - Resample the RGB CMFs to be parametrized by frequency instead of wavelength - Compute the FFT of the CMFs - Store it as a LUT to be used by the kernel code However, there are two optimizations we can make: - Both the resampling and the FFT are linear operations, as is the XYZ-to-RGB conversion. Therefore, we can resample and Fourier-transform the XYZ CMFs once, store the result in a precomputed table, and then just multiply the entries by the XYZ-to-RGB matrix at runtime. - I've included the Python script used to compute the table under `intern/cycles/doc/precompute`. - The reference implementation by the paper authors [1] simply stores the real and imaginary parts in the LUT, and then computes `cos(shift)*real + sin(shift)*imag`. However, the real and imaginary parts are oscillating, so the LUT with linear interpolation is not particularly good at representing them. Instead, we can convert the table to Magnitude/Phase representation, which is much smoother, and do `mag * cos(phase - shift)` in the kernel. - Phase needs to be unwrapped to handle the interpolation decently, but that's easy. 
- This requires an extra trig operation in the kernel in the dielectric case, but for the conductive case we'll actually save three. Rendered output is mostly the same, just slightly different because we're no longer using the Gaussian approximation. [1] "A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence" by Laurent Belcour and Pascal Barla, https://belcour.github.io/blog/research/publication/2017/05/01/brdf-thin-film.html Pull Request: https://projects.blender.org/blender/blender/pulls/140944	2025-07-09 22:10:28 +02:00
Lukas Stockner	cf92af3ac4	Cycles: Support Thin Film iridescence in the Glass BSDF Supporting this on the Metallic BSDF will require some extra work, and on the Glossy BSDF it doesn't make much sense conceptually (for that kind of shader setup, we'll want to support layering in SVM), but Glass BSDF just needs to be hooked up so might as well do that. Pull Request: https://projects.blender.org/blender/blender/pulls/140832	2025-07-09 22:07:24 +02:00
Brecht Van Lommel	13ab5067ce	Cycles: Detect volume attribute nodes that can use stochastic sampling Detect which volume attributes nodes have a linear mapping to their usage as density / color / temperature in volume shader nodes, and use stochastic sampling for them. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	646dc7fe4d	Cycles: Use stochastic sampling to speed up tricubic volume filter Stochastically turn a tricubic filter into a trilinear one. This reduces the number of taps from 64 to 8. It combines ideas from the "Stochastic Texture Filtering" paper and our previous GPU sampling of 3D textures. This is currently only used in a few places where we know stochastic interpolation is valid or close enough in practice. * Principled volume density, color and temperature * Motion blur velocity On a MacBook Pro M3 with the openvdb_smoke.blend regression test and cubic sampling, this gives a ~2x speedup for CPU and ~4x speedup for GPU. However it also increases noise, usually only a little. Equal time renders for this scene show a clear reduction in noise for both CPU and GPU. Note we can probably get a bigger speedup with acceptable noise trade-off using full stochastic sampling, but will investigate that separately. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	4c25b49875	Refactor: Cycles: Deduplicate 3D texture sampling between devices Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	b6c4233b28	Refactor: Cycles: Remove now unused 3D image texture support Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	7978799e6f	Cycles: Always render volume as NanoVDB All GPU backends now support NanoVDB, using our own kernel side code that is easily portable. This simplifies kernel and device code. Volume bounds are now built from the NanoVDB grid instead of OpenVDB, to avoid having to keep around the OpenVDB grid after loading. While this reduces memory usage, it does have a performance impact, particularly for the Cubic filter. That will be addressed by another commit. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	fb4e3c8167	Refactor: Cycles: Remove distinction between severity and verbosity Only use LOG() and LOG_IS_ON() macros, no more VLOG_. Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:24 +02:00
Michael Jones	b4be954856	Cycles: Simplify Metal backend with direct bindless resource encoding This PR is a more extensive follow on from #123551 (removal of AMD and Intel GPU support). All supported Apple GPUs have Metal 3 and tier 2 argument buffer support. The invariant resource properties `gpuAddress` and `gpuResourceID` can be written directly into GPU structs once at setup time rather than once per dispatch. More background info can be found in [this article](https://developer.apple.com/documentation/metal/improving-cpu-performance-by-using-argument-buffers?language=objc). Code changes: - All code relating to `MTLArgumentEncoder` is removed - `KernelParamsMetal` updates are directly written into `id<MTLBuffer> launch_params_buffer` which is used for the "static" dispatch arguments - Dynamic dispatch arguments are small enough to be encoded using the `MTLComputeCommandEncoder.setBytes` function, eliminating the need for cycling temporary arg buffers Pull Request: https://projects.blender.org/blender/blender/pulls/140671	2025-07-08 23:20:16 +02:00
Lukas Stockner	bfcfe730ed	Cleanup: Cycles: Move F82 Fresnel model into helper function	2025-07-08 01:23:33 +02:00
Campbell Barton	776dbe942c	Cleanup: spelling (make check_spelling_*)	2025-06-22 11:34:32 +00:00
Weizhen Huang	2f7797dd4d	Merge branch 'blender-v4.5-release'	2025-06-20 14:20:00 +02:00
weizhen	bf9836da65	Fix: Cycles not building with OptiX 9.0 As suggested by @pmoursnv Was throwing errors like `identifier "half" is undefined`. Pull Request: https://projects.blender.org/blender/blender/pulls/140676	2025-06-20 14:19:43 +02:00
Brecht Van Lommel	17bda2cf3f	Cycles: Enable multi-bounce random walk subsurface scattering Multi-bounce was mainly disabled for disk sampling where the probability of hitting something is relatively low even with high albedo, but this is not so much an issue with random walk. This reduces darkening artifacts at the cost of some extra render time. The difference is mainly visible when using a high radius. Pull Request: https://projects.blender.org/blender/blender/pulls/140665	2025-06-19 20:04:49 +02:00
Lukas Stockner	8eb94f7c6f	Merge branch 'blender-v4.5-release'	2025-06-19 20:04:29 +02:00
Lukas Stockner	8f00a00283	Fix #138188 : camera_shader_random_sample returns zero if DOF is off	2025-06-19 20:03:03 +02:00
Lukas Stockner	49ae867de4	Fix #139870 : Cycles: Some objects with normal maps leak light This was broken by !138632, the refactor of the microfacet code to no longer check the "geometric normal", which in reality was the smoothed normal. Since the logic is now the same for all closure types, it seemed weird that the light leak only affects Microfacet closures, not Diffuse. Turns out that for diffuse closures, the relevant paths were rejected by the initial hemisphere check in the smooth bump terminator code, which also incorporates the smoothed but non-bump/normal-mapped normal sd->N. So, we can detect and prevent the new light leaks by extending this check to all closure types for the eval case. Sampling already has stricter checks, so this doesn't apply there. With this change, we can revert the two test cases back to their pre-refactor version. In hindsight it was a mistake to just shrug off these changes as okay, I should have looked closer into the difference. Pull Request: https://projects.blender.org/blender/blender/pulls/140415	2025-06-19 19:20:06 +02:00
Alaska	b561c78f93	Nodes: Remove legacy combine/separate nodes In Blender 3.3 (1) the individual combine and separate color nodes were combined together into a single combine/separate color node. To ensure legacy addons still worked, the old nodes were left in Blender, but hidden from the Add menus. It has been nearly 3 years since that change was made, most if not all addons should have been updated by now. So this commit removes these hidden legacy nodes. (1) blender/blender@82df48227b Pull Request: https://projects.blender.org/blender/blender/pulls/135376	2025-06-17 15:36:33 +02:00
marcopavanello	ab21755aaf	Shaders: Remove old Preetham and Hosek sky texture models Remove old Preetham and Hosek-Wilkie sky models, which are less accurate. The Nishita improved model has been available for long enough. Pull Request: https://projects.blender.org/blender/blender/pulls/139923	2025-06-16 14:36:18 +02:00
Brecht Van Lommel	b920f6f1a7	Shaders: Remove point density texture node This is replaced by geometry nodes, where volumes can now be generated from point clouds and meshes with more control, and more efficient rendering as a sparse volume. No backwards compatibility is provided, as this would be complicated, and probably this feature was not used much in the past few years. This node was supported in Cycles only, not by EEVEE. Pull Request: https://projects.blender.org/blender/blender/pulls/140292	2025-06-16 12:06:02 +02:00
Campbell Barton	63600f806b	Cleanup: spelling in comments (make check_spelling_*)	2025-06-13 11:23:28 +10:00
Aras Pranckevicius	68111db969	Nodes: Speedup Voronoi by changing the hash function The 2D->2D, 3D->3D, 4D->4D hash functions used in Voronoi node were using quite an expensive hash function. Switch these to dedicated 2D/3D/4D hash functions (pcg2d, pcg3d, pcg4d) -- these are still very good quality, but the hash function itself is 3x-4x faster. Which makes Voronoi node calculation overall be around 2x faster. In some cases when using OSL, the speedup is even larger. This visibly changes output of the Voronoi noise however. The actual noise "behaves" the same, just if someone was depending on the noise pattern being exactly like it was before, this will change the pattern. Images, more performance results and details wrt OSL are in the PR. Pull Request: https://projects.blender.org/blender/blender/pulls/139520	2025-06-12 20:07:52 +02:00
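
The arctangent identity quoted in the equiangular sampling fix (bb689687a7), atan(a) + atan(b) = atan2(a + b, 1 - a*b), can be checked numerically. The following is a minimal Python sketch, not the Cycles kernel code; the helper names `atan_sum` and `max_identity_error` and the sample values are assumptions chosen for illustration.

```python
import math


def atan_sum(a: float, b: float) -> float:
    """Return atan(a) + atan(b) with a single atan2 call.

    Uses the identity atan(a) + atan(b) = atan2(a + b, 1 - a*b).
    """
    return math.atan2(a + b, 1.0 - a * b)


def max_identity_error(samples) -> float:
    """Largest absolute difference between both sides of the identity."""
    return max(
        abs(math.atan(a) + math.atan(b) - atan_sum(a, b))
        for a in samples
        for b in samples
    )


if __name__ == "__main__":
    # Hypothetical sample values, including pairs where a*b == 1.
    values = [-50.0, -2.0, -0.5, 0.0, 0.25, 1.0, 3.0, 40.0]
    print(max_identity_error(values))
```

A difference of two angles follows by negating one argument: atan(b) - atan(a) equals `atan_sum(b, -a)`.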


3907 Commits
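
The magnitude/phase LUT representation described in eaa5f63ba2 rests on the fact that `cos(shift)*real + sin(shift)*imag` equals `mag * cos(phase - shift)` when `mag = hypot(real, imag)` and `phase = atan2(imag, real)`. The Python sketch below only illustrates that equivalence for a single table entry; the function names and the sample entry are hypothetical, and it does not reproduce the actual table, interpolation, or phase unwrapping.

```python
import math


def eval_real_imag(real: float, imag: float, shift: float) -> float:
    """Evaluate one entry stored as real and imaginary parts."""
    return math.cos(shift) * real + math.sin(shift) * imag


def to_mag_phase(real: float, imag: float) -> tuple[float, float]:
    """Convert one entry to magnitude/phase form."""
    return math.hypot(real, imag), math.atan2(imag, real)


def eval_mag_phase(mag: float, phase: float, shift: float) -> float:
    """Evaluate one entry stored as magnitude and phase."""
    return mag * math.cos(phase - shift)


if __name__ == "__main__":
    real, imag = 0.3, -0.8  # hypothetical table entry
    mag, phase = to_mag_phase(real, imag)
    for shift in (0.0, 0.7, 2.5, -4.0):
        difference = eval_real_imag(real, imag, shift) - eval_mag_phase(mag, phase, shift)
        print(shift, abs(difference))
```

Both forms give the same value at an exact table entry. The commit's argument is that the magnitude/phase table is smoother, so linear interpolation between entries represents it better.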