test2

Author	SHA1	Message	Date
Xavier Hallade	aeb103fb50	Cycles: Pack uint3/int3 structs for oneAPI This recently changed after a fix in `28f93d5443` but we get better performance by ensuring int3 is packed instead. Packing int3 currently gives a 7% speedup when rendering wdas_cloud on Intel Arc B580. Pull Request: https://projects.blender.org/blender/blender/pulls/145593	2025-09-08 09:22:32 +02:00
Jesse Yurkovich	96e7242678	Cycles: Tesselate adaptive subdivision meshes in parallel Meshes that require adaptive subdivision are currently tesselated one at a time. Change this part of device update to be done in parallel. To remove the possibility of the status message going backwards, a mutex was required to keep that portion of the loop atomic. Results for the loop in question: On one particular scene with over 300 meshes requiring tesselation, the update time drops from ~16 seconds to ~3 seconds. The attached synthetic test drops from ~9 seconds down to ~1 second. Pull Request: https://projects.blender.org/blender/blender/pulls/145220	2025-08-28 20:22:14 +02:00
Campbell Barton	c45ee0eb98	Cleanup: quiet compiler warnings Suppressing "null-pointer-subtraction" was needed for clang but caused a warning with GCC.	2025-08-20 11:18:29 +10:00
Brecht Van Lommel	c7e2368d6c	Fix #144528: Cycles renders OpenVDB grids with rotation wrong Pull Request: https://projects.blender.org/blender/blender/pulls/144825	2025-08-19 21:39:30 +02:00
Brecht Van Lommel	28f93d5443	Fix #144569: Cycles NanoVDB rendering broken with oneAPI Wrong assumption about packed_int3, and not caught because the assert was in the wrong place. Pull Request: https://projects.blender.org/blender/blender/pulls/144803	2025-08-19 18:41:53 +02:00
Brecht Van Lommel	2615cecf10	Refactor: Cycles: Align log levels with CLOG WORK -> DEBUG DEBUG, STATS -> TRACE Pull Request: https://projects.blender.org/blender/blender/pulls/144490	2025-08-18 20:22:44 +02:00
Weizhen Huang	df496eb894	Cycles: use one-tap stochastic interpolation for volume It has ~1.2x speed-up on CPU and ~1.5x speed-up on GPU (tested on Metal M2 Ultra). Individual samples are noisier, but equal time renders are mostly better. Note that volume emission renders differently than before. Pull Request: https://projects.blender.org/blender/blender/pulls/144451	2025-08-14 15:22:44 +02:00
Weizhen Huang	a4f8e0bfa2	Cycles: Use RGBE for denoised guiding buffers to reduce memory usage Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2025-08-13 10:28:50 +02:00
Weizhen Huang	5cb6014efd	Cycles: Volume Scattering Probability Guiding Guide the probability to scatter in or transmit through the volume. Only applied for primary rays. Co-authored-by: Brecht Van Lommel <brecht@blender.org>	2025-08-13 10:28:50 +02:00
Weizhen Huang	b2b2d9a4f3	Cycles: Render volume by ray marching through octrees One octree per volume per shader based on the density. In preparation for the null scattering	2025-08-13 10:28:50 +02:00
Campbell Barton	77d6960d24	Cleanup: quiet GCC warning for pointer subtraction Ref !144032	2025-08-06 20:31:14 +00:00
Campbell Barton	e8501d2f54	Cleanup: grammar corrections, minor improvements to wording Also back-tick quote some code references in comments to differentiate them from English text.	2025-08-06 00:20:39 +00:00
Amogh Shivaram	ff4d840cf8	Cycles: Add polarized Fresnel function for conductors This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131). For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence. I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5-3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU. There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`: - one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance - there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined Pull Request: https://projects.blender.org/blender/blender/pulls/143903	2025-08-04 15:36:36 +02:00
Weizhen Huang	a7042ca30c	Fix: warning template-id-cdtor on gcc	2025-07-29 10:41:17 +02:00
Weizhen Huang	ea45c776fd	Cycles: introduce dual types to replace some uses of dfdx/dfdy/differentials. No functional change expected. Pull Request: https://projects.blender.org/blender/blender/pulls/143178	2025-07-28 17:34:24 +02:00
Weizhen Huang	345d23bff8	Cleanup: Cycles: add more float3 util functions and vectorize `wrap` and `safe_fmod`.	2025-07-28 17:34:21 +02:00
Weizhen Huang	9404db8c7c	Fix #141388: Cycles: CPU/GPU difference in `pow` function with 0 base If the base is 0 and the exponent is non-zero, return 0 for both CPU and GPU. Pull Request: https://projects.blender.org/blender/blender/pulls/142678	2025-07-21 14:45:30 +02:00
Brecht Van Lommel	f38b4323f9	Fix: Build error with NDEBUG after recent fix for log level macro	2025-07-10 21:10:36 +02:00
Brecht Van Lommel	73fe848e07	Fix: Cycles log levels conflict with macros on some platforms In particular DEBUG, but prefix all of them to be sure. Pull Request: https://projects.blender.org/blender/blender/pulls/141749	2025-07-10 19:44:14 +02:00
Campbell Barton	ce7561982a	Cleanup: use conventional license formatting Quiet "make check_licenses" warning.	2025-07-10 00:38:11 +00:00
Lukas Stockner	eaa5f63ba2	Cycles: Replace thin-film basis function approximation with accurate LUTs Previously, we used precomputed Gaussian fits to the XYZ CMFs, performed the spectral integration in that space, and then converted the result to the RGB working space. That worked because we're only supporting dielectric base layers for the thin film code, so the inputs to the spectral integration (reflectivity and phase) are both constant w.r.t. wavelength. However, this will no longer work for conductive base layers. We could handle reflectivity by converting to XYZ, but that won't work for phase since its effect on the output is nonlinear. Therefore, it's time to do this properly by performing the spectral integration directly in the RGB primaries. To do this, we need to: - Compute the RGB CMFs from the XYZ CMFs and XYZ-to-RGB matrix - Resample the RGB CMFs to be parametrized by frequency instead of wavelength - Compute the FFT of the CMFs - Store it as a LUT to be used by the kernel code However, there's two optimizations we can make: - Both the resampling and the FFT are linear operations, as is the XYZ-to-RGB conversion. Therefore, we can resample and Fourier-transform the XYZ CMFs once, store the result in a precomputed table, and then just multiply the entries by the XYZ-to-RGB matrix at runtime. - I've included the Python script used to compute the table under `intern/cycles/doc/precompute`. - The reference implementation by the paper authors [1] simply stores the real and imaginary parts in the LUT, and then computes `cos(shift)*real + sin(shift)*imag`. However, the real and imaginary parts are oscillating, so the LUT with linear interpolation is not particularly good at representing them. Instead, we can convert the table to Magnitude/Phase representation, which is much smoother, and do `mag * cos(phase - shift)` in the kernel. - Phase needs to be unwrapped to handle the interpolation decently, but that's easy. - This requires an extra trig operation in the kernel in the dielectric case, but for the conductive case we'll actually save three. Rendered output is mostly the same, just slightly different because we're no longer using the Gaussian approximation. [1] "A Practical Extension to Microfacet Theory for the Modeling of Varying Iridescence" by Laurent Belcour and Pascal Barla, https://belcour.github.io/blog/research/publication/2017/05/01/brdf-thin-film.html Pull Request: https://projects.blender.org/blender/blender/pulls/140944	2025-07-09 22:10:28 +02:00
Brecht Van Lommel	4c25b49875	Refactor: Cycles: Deduplicate 3D texture sampling between devices Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	b6c4233b28	Refactor: Cycles: Remove now unused 3D image texture support Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	7978799e6f	Cycles: Always render volume as NanoVDB All GPU backends now support NanoVDB, using our own kernel side code that is easily portable. This simplifies kernel and device code. Volume bounds are now built from the NanoVDB grid instead of OpenVDB, to avoid having to keep around the OpenVDB grid after loading. While this reduces memory usage, it does have a performance impact, particularly for the Cubic filter. That will be addressed by another commit. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 21:04:38 +02:00
Brecht Van Lommel	8cf031ba95	Fix: Wrong Cycles NanoVDB memory alignment on Windows This was not a problem in practice so far, but will be with upcoming changes. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 20:59:27 +02:00
Brecht Van Lommel	8111152c67	Refactor: Cycles: Add some OpenVDB and NanoVDB functions to util OpenVDB to NanoVDB was moved, a new NanoVDB to OpenVDB mask grid was added for future use. Some redundant CMake code was simplified. Pull Request: https://projects.blender.org/blender/blender/pulls/132908	2025-07-09 20:59:27 +02:00
Brecht Van Lommel	cf36acbc0c	Refactor: Cycles: Replace remaining fprintf with logging Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:25 +02:00
Brecht Van Lommel	b9d7bab6e6	Refactor: Cycles: Add comments to explain the logging API Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:25 +02:00
Brecht Van Lommel	fb4e3c8167	Refactor: Cycles: Remove distinction between severity and verbosity Only use LOG() and LOG_IS_ON() macros, no more VLOG_. Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:24 +02:00
Brecht Van Lommel	8392ca915b	Cycles: Remove glog dependency, redirect logs to CLOG * Add own simple logging system to replace glog, which is no longer maintained by Google. * When building in Blender, integrate with CLOG and print all messages through that system instead. * --log cycles now replaces --debug-cycles. The latter still works but is no longer documented. Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:24 +02:00
Brecht Van Lommel	cf7f276d49	Refactor: Cycles: Tweak logging to prepare for dropping glog * Implement own simple ScopedMockLog * Always use names instead of numbers * Avoid logging in header files Pull Request: https://projects.blender.org/blender/blender/pulls/140244	2025-07-09 20:59:24 +02:00
Sergey Sharybin	9ace788faf	Merge branch 'blender-v4.5-release'	2025-07-02 10:42:01 +02:00
Michael Jones	681eed7e4d	Fix #135659: Some types of motion are incorrect at low step counts with MetalRT Following #136253, this PR enables decomposed MetalRT motion interpolation on macOS 15.6. The bounding box issue is fixed in the latest macOS 15.6 beta (24G5054d). Pull Request: https://projects.blender.org/blender/blender/pulls/141207	2025-07-02 10:41:42 +02:00
Aras Pranckevicius	68111db969	Nodes: Speedup Voronoi by changing the hash function The 2D->2D, 3D->3D, 4D->4D hash functions used in Voronoi node were using quite an expensive hash function. Switch these to dedicated 2D/3D/4D hash functions (pcg2d, pcg3d, pcg4d) -- these are still very good quality, but the hash function itself is 3x-4x faster. Which makes Voronoi node calculation overall be around 2x faster. In some cases when using OSL, the speedup is even larger. This visibly changes output of the Voronoi noise however. The actual noise "behaves" the same, just if someone was depending on the noise pattern being exactly like it was before, this will change the pattern. Images, more performance results and details wrt OSL are in the PR. Pull Request: https://projects.blender.org/blender/blender/pulls/139520	2025-06-12 20:07:52 +02:00
Brecht Van Lommel	04e325029f	Revert "Cycles: Guiding cleaning up and refactoring the guiding code" This reverts commit `5abf42012d` in the blender-v4.5-release branch to work around HIP compiler issues. It will remain in the main branch. Ref blender/blender#139836	2025-06-11 15:47:06 +02:00
Brecht Van Lommel	501b4641f6	Revert "Cleanup: Unused arguments in Cycles kernel" This reverts commit `0e7a696819` in the blender-v4.5-release branch to work around HIP compiler issues. It will remain in the main branch. Ref blender/blender#139836	2025-06-11 15:47:06 +02:00
Campbell Barton	07121d44ae	Cleanup: use braces (follow own style guide)	2025-06-11 09:05:26 +00:00
Brecht Van Lommel	0e7a696819	Cleanup: Unused arguments in Cycles kernel And add back the compiler flag that hid them. Pull Request: https://projects.blender.org/blender/blender/pulls/139497	2025-05-27 21:30:45 +02:00
Lukas Stockner	507267393e	Cleanup: Cycles: Restructure camera viewplane calculation This started with investigating a render issue that appears to be caused by GCC 15. From what I can tell, it was caused by `viewplane = (viewplane) * bcam->zoom;`. I'm not entirely sure what the root cause is (potentially pointer aliasing?), but the restructured code works fine now. Pull Request: https://projects.blender.org/blender/blender/pulls/139416	2025-05-26 22:24:20 +02:00
Michael Jones	8dd9aeb11e	Cycles: Fix occasional failure in path_create_directories This PR adds a global mutex to `path_create_directories` to fix a thread-safety issue which can occur when concurrently creating multiple subdirectories with common stems. Pull Request: https://projects.blender.org/blender/blender/pulls/139266	2025-05-22 16:06:51 +02:00
Sebastian Herholz	5abf42012d	Cycles: Guiding cleaning up and refactoring the guiding code In detail: - Direct accesses of state attributes are replaced with the INTEGRATOR_STATE and INTEGRATOR_STATE_WRITE macros. - Unified the checks for the __PATH_GUIDING__ define to use #if defined(__PATH_GUIDING__). - Even if __PATH_GUIDING__ is defined, we now check if the feature is enabled using if ((kernel_data.kernel_features & KERNEL_FEATURE_PATH_GUIDING)) {. This is important for later GPU ports. - The kernel usage of the guiding field, surface, and volume sampling distributions is wrapped behind macros for each specific device (atm only CPU). This will make it easier for a GPU port later.	2025-05-22 13:46:30 +02:00
Brecht Van Lommel	fc686ff257	Fix #139002: Cycles particle object instance appears in center of scene The particle system generates some particles with NaN values. The set_if_different mechanism skipped copying those due to a refactor in the matrix equality test. Revert that part of `689633d802` for now. A better solution would be to improve handling of NaNs in Cycles, and to find and fix the cause of the NaN in the particle system. Pull Request: https://projects.blender.org/blender/blender/pulls/139238	2025-05-22 01:10:19 +02:00
Brecht Van Lommel	59b4842117	Cycles: Adaptive subdivision triangular patches There is a corner case where one side of a quad needs splitting and the other side has only one segment. Previously this would produce either gaps or, after recent changes to stitch together geometry, uninitialized memory. Now solve this by splitting into triangular patches, as suggested in the DiagSplit paper. These triangular patches can be further subdivided themselves. Dicing has special cases for 1 or 2 segments on edges. For more segments it works the same as quad dicing: a regular inner triangle grid stitched to the outer edges. Fix #136973: Inconsistent results with adaptive subdivision Pull Request: https://projects.blender.org/blender/blender/pulls/139062	2025-05-19 12:04:11 +02:00
Campbell Barton	b3dfde88f3	Cleanup: spelling in comments (check_spelling_* target) Also uppercase acronyms: API, UTF & ASCII.	2025-05-17 10:17:37 +10:00
Weizhen Huang	1f01a1aee9	Cleanup: remove unnecessary `defined(__KERNEL_METAL__)` The top level guard is already `#ifndef __KERNEL_METAL__`, additional guard is not only unnecessary but also confusing.	2025-05-05 18:35:24 +02:00
Campbell Barton	43af16a4c1	Cleanup: spelling in comments, correct comment block formatting Also use doxygen comments more consistently.	2025-05-01 11:44:33 +10:00
Lukas Stockner	8bc9f174d3	Fix: Cycles: Wrong derivative handling in OptiX OSL transform() osl_transform_triple(), osl_transform_dvmdv() and so on are supposed to apply the given transform in the context of OSL's auto-differentiation system. Therefore, the given input is a dual vector, containing both the value as v[0] and its derivatives w.r.t. X and Y in v[1] and v[2]. However, the existing code treats these as a simple list of vectors, applying the same operation to all three instead of propagating the derivatives. On top of that, it also treated the given matrix input as if there were three of them, which isn't the case. Therefore, this commit replaces the implementation to do the right thing. The Vector and Normal case are straightforward since the operation is linear, so applying the same operation to all three vectors works. The Point case is a bit more complicated, but not too bad when written out. This bug mostly became apparent when using Object or Camera texture coordinates with a Bump node, since that node uses OSL differentials and Object/Camera coordinates are implemented using transform(). I'm pretty sure that all the other builtin functions (e.g. sin) at the bottom of services_gpu.h have the same problem, but one thing at a time... Pull Request: https://projects.blender.org/blender/blender/pulls/138045	2025-04-28 12:46:54 +02:00
Brecht Van Lommel	b174e5f0d1	Cycles: Vulkan CUDA graphics interop * Using CUDA external memory * Checks that device UUID matches Vulkan Pull Request: https://projects.blender.org/blender/blender/pulls/137363	2025-04-28 11:38:56 +02:00
Campbell Barton	c90e8bae0b	Cleanup: spelling in comments & replace some use of single quotes Previously spell checker ignored text in single quotes however this meant incorrect spelling was ignored in text where it shouldn't have been. In cases single quotes were used for literal strings (such as variables, code & compiler flags), replace these with back-ticks. In cases they were used for UI labels, replace these with double quotes. In cases they were used to reference symbols, replace them with doxygens symbol link syntax (leading hash). Apply some spelling corrections & tweaks (for check_spelling_* targets).	2025-04-26 11:17:13 +00:00
Sergey Sharybin	30b962b3d8	Cycles: Optimize 3d and 4d noise The goal is to reduce the effect of the fmod() used in the noise code, which was initially reported in the comment: https://projects.blender.org/blender/blender/pulls/119884#issuecomment-1258902 Basic idea is to benefit from SIMD vectorization on CPU. Tested on Linux i9-11900K and macOS on M2 Ultra, in both cases performance after this change is very close to what it could be with the fmod() commented out (the call itself, `p = p + precision_correction`). On macOS the penalty of fmod() was about 10%, on Linux it was closer to 30% when built with GCC-13. With Linux builds from the buildbot it is more like 18%. The optimization is only done for 3d and 4d noise. It might be possible to gain some performance improvement for 1d and 2d cases, but the approach would need to be different: we'd need to optimize scalar version fmodf(). Maybe tricks with integer cast will be faster (since we are a bit optimistic in the kernel and do not guarantee exact behavior in extreme cases such as NaN inputs). Pull Request: https://projects.blender.org/blender/blender/pulls/137109	2025-04-09 13:40:10 +02:00
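
Note on commit eaa5f63ba2 (thin-film LUTs): the message describes replacing a real/imaginary table lookup with a magnitude/phase one. The following is a minimal standalone sketch of why the two evaluations agree, not the Cycles kernel code. All struct and function names are hypothetical, and it assumes the reference expression is `cos(shift)*real + sin(shift)*imag`. With real = mag*cos(phase) and imag = mag*sin(phase), the angle-difference identity gives mag*cos(phase - shift).

```cpp
#include <cmath>

// Hypothetical table entry in the real/imaginary form used by the
// reference implementation mentioned in the commit message.
struct LutEntry {
  float real;
  float imag;
};

// Hypothetical table entry in magnitude/phase form.
struct PolarEntry {
  float mag;
  float phase;
};

// Convert one entry to magnitude/phase. A real table would also need the
// phase unwrapped across neighbouring entries before interpolation; that
// step is not shown here.
inline PolarEntry to_polar(const LutEntry e)
{
  return {std::hypot(e.real, e.imag), std::atan2(e.imag, e.real)};
}

// Evaluation in real/imaginary form: two trig calls per entry.
inline float eval_cartesian(const LutEntry e, const float shift)
{
  return std::cos(shift) * e.real + std::sin(shift) * e.imag;
}

// Evaluation in magnitude/phase form: one trig call per entry.
inline float eval_polar(const PolarEntry p, const float shift)
{
  return p.mag * std::cos(p.phase - shift);
}
```

Calling `eval_cartesian(e, shift)` and `eval_polar(to_polar(e), shift)` for the same entry and shift should agree up to floating-point rounding. The sketch only covers a single entry; the smoothness benefit under linear interpolation that the commit describes is a property of the whole table and is not demonstrated here.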

1327 Commits