test2

Author	SHA1	Message	Date
Sergey Sharybin	36559fd89f	Fix #136811 : HIP-RT performance regression in 4.5 Reduce the register pressure and branching in the switch() by using subclass and cast from void* to the base class. This ensures intersection functions are not inlined multiple times, bringing performance back. Alternative could be to avoid functions (they are quite large) but that only partially resolves the performance regression. Pull Request: https://projects.blender.org/blender/blender/pulls/136823	2025-04-01 17:59:44 +02:00
Campbell Barton	42ad772a1f	Cleanup: spelling & repeated terms (make check_spelling_*) Also use comment blocks for English text.	2025-03-27 01:13:34 +00:00
Sergey Sharybin	2ab231d802	Refactor: Pass proper KernelGlobals HIP-RT functions do have access to kg, and it was used inconsistently: some functions were passed actual kg, other were passed nullptr. This change makes it consistent and passes kg everywhere. Pull Request: https://projects.blender.org/blender/blender/pulls/136503	2025-03-26 11:07:06 +01:00
Sergey Sharybin	709371b278	Refactor: Avoid creation of local copy of RaySelfPrimitives	2025-03-26 11:07:04 +01:00
Sergey Sharybin	888c7e1df9	Cleanup: Avoid redundant data fetch	2025-03-26 11:07:04 +01:00
Sergey Sharybin	3d882acee2	Cleanup: Else after return	2025-03-26 11:07:04 +01:00
Sergey Sharybin	b2dd523d0d	Cleanup: Avoid default hit initialization The entire object is assigned later on, no need to initialize it.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	323e27d825	Cleanup: Remove redundant assignment The payload stores pointers, no need to restore pointer of the function argument to the same value.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	e92a8042c3	Refactor: Payload for shadow intersection and filter in HIP-RT The code before this change was relying on the ShadowPayload having the same "header" as RayPayload for some of the primitive types (curve, motion triangle, point): intersection functions were shared between "regular" and shadow rays (shadow in this case is shadow_all), but extra filter function was used for shadow rays. This is fragile if someone changes one of these structures. What is worse is that compiler might actually decide to shuffle things in some structs, or remove unused fields. This change also solves confusion about ShadowPayload::prim_type seemingly only being assigned to PRIMITIVE_NONE. With time it is not impossible that compiler will also see this, and constant-fold some checks, or even remove the field. If that happens then the render result will be wrong. Maybe it is already happening as there are some GPU and driver and optimization flag specific bugs in the area. It is unclear whether it was causing any actual problem: W7800 seems to render all hair correctly on Linux.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	cdb3f34944	Cleanup: Use full name for the primitive_type Makes it extra clear locally type of what the variable contains: primitive, ray, or something else.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	72542f3bb4	Cleanup: Follow Blender style and use more const Also make some style decisions more consistent: for example, the way how stop/continue search return value is commented. Prefer lower vertical space for those.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	bf9c95f164	Cleanup: Move payload type cast to caller in HIP-RT Mainly readability purposes: - Having variables called local_payload is ambiguous: does it refer to LocalPayload type or to a variable being local in a function? - Some of the functions are used for different ray types, so having the type cast in intersectFunc and filterFunc makes it easier to scan. For the latter: now it is more obvious that Curve_Intersect_Shadow expects RayPayload, but Curve_Filter_Shadow expects ShadowPayload. It might not be a problem currently as ShadowPayload has the same "header" as RayPayload, but it might change in the future. Also, compiler might optimize fields out from one but not from the other.	2025-03-26 11:07:04 +01:00
Sergey Sharybin	3daaf21bab	Cleanup: Remove unused function argument in HIP-RT	2025-03-26 11:07:04 +01:00
Sergey Sharybin	5ce4e91a80	Fix #136319 : Incorrect transparent bounce count with spatial splits The transparent bounce test was too optimistic in regards to the intersection being considered. The check needs to happen after it has been validated that it is not duplicate. It was already the case for Metal and HIP-RT, but not for Embree and BVH2. Tests updated by: Alaska <Alaskayou01@gmail.com> Pull Request: https://projects.blender.org/blender/blender/pulls/136325	2025-03-22 04:51:42 +01:00
Sergey Sharybin	50180283e9	Fix #117527 : Spatial split leads to artifacts on transparent shadows The reason for this to happen is because when spatial split is used the same intersection could be recorded twice (via different BVH nodes). This change introduces check for the intersection being already recorded, similar to the check in the local BVH. The check is done during BVH intersection which allows to properly ignore intersections even for the maximum bounce number check. A faster approach would be to do such filtering after sorting, but then we can not keep bounce check in the BVH code consistent with and without spatial splits. Intuitively it seems that it should be possible to merge the new loop with the one that checks for which intersection to keep. But it is not so trivial in practice: it doesn't run for all intersections, and also it is formulated in a way that updates isect_index for the next record. Pull Request: https://projects.blender.org/blender/blender/pulls/136251	2025-03-21 13:56:50 +01:00
Sergey Sharybin	bf65b64708	Refactor: De-duplicate local intersection reservoir sampling logic The code which was checking whether local intersection is to be recorded, and under which index was duplicated for triangles, motion triangles, and HIP-RT triangle filter function. This change moves the common logic to a utility function which is reused from all the places mentioned above. Pull Request: https://projects.blender.org/blender/blender/pulls/136244	2025-03-20 17:19:31 +01:00
Sergey Sharybin	7165146fb2	Cleanup: More spelling fixes in comments	2025-03-20 10:37:09 +01:00
Sergey Sharybin	ae4f6026dc	Cleanup: Spelling in comments	2025-03-20 10:36:12 +01:00
Bastien Montagne	dd98cede18	Merge branch 'blender-v4.4-release'	2025-03-14 18:20:26 +01:00
Sahar A. Kashi	9ad3b74867	Fix: SSS and Motion Blur or Curves not working on HIP-RT This change fixes the remaining failing tests with SSS when using HIP-RT. This includes crash when SSS is used on curves, and objects with motion blur and SSS rendering black. The root cause for both cases was the fact that traversal was always assuming regular BVH (built for triangles), while curves and motion triangles are using custom primitives, which requires specialized BVH traversal. This change includes: - Early output from `scene_intersect_local()` for non-triangle and non-motion-triangle primitives. This fixes `sss_hair.blend` test, and also avoids unnecessary BVH traversal when the local intersection is requested from curve object. The same early-output could be added to other BVH traversal implementation. - Use `hiprtGeomCustomTraversalAnyHitCustomStack` for motion triangles primitives. This fixes motion blur on objects with SSS render black. Fixes #135856 Co-authored-by: Sahar A. Kashi <sahar.alipourkashi@amd.com> Co-authored-by: Sergey Sharybin <sergey@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/135943	2025-03-14 18:17:54 +01:00
Sergey Sharybin	977a334f6f	Merge branch 'blender-v4.4-release'	2025-03-12 19:24:01 +01:00
Sergey Sharybin	a3eb0faa3f	Fix: Incorrect ray time used for HIP-RT local intersections It was always hard-coded to be 0. It does not seem to result in any extra tests passing, but they are probably not sophisticated enough. Noticed while looking into details for the #135856. Pull Request: https://projects.blender.org/blender/blender/pulls/135878	2025-03-12 19:23:38 +01:00
Xavier Hallade	90a10dcd50	Cycles: Adjust inlining attributes for oneAPI device Now ccl_device sets inlining and ccl_device_inline forces inlining. This matches more closely with what is currently done for cuda and metal backends. I've measured from 1% to 6% overall performance improvement in rendering benchmark scenes on Arc B580, as well as a small decrease in compile time.	2025-03-03 18:20:02 +01:00
Alaska	fb7b53143e	Merge branch 'blender-v4.4-release'	2025-02-27 12:03:30 +13:00
Alaska	d840d249b3	Cycles: Re-enable HIPRT point cloud rendering Previously point cloud rendering was disabled on the HIPRT backend due to unexpected performance regressions introduced by it. With the recent update to HIP SDK 6.3 and HIPRT 2.5, these performance regressions have been resolved and so this commit re-enables point cloud rendering on HIPRT. Pull Request: https://projects.blender.org/blender/blender/pulls/134902	2025-02-27 00:01:35 +01:00
Lukas Stockner	8cb5e05c48	Cleanup: Cycles: Deduplicate kernel attribute code using templating The attribute handling code in the kernel is currently highly duplicated since it needs to handle five different data types and we couldn't use templates back then. We can now, so might as well make use of it and get rid of ~1000 lines. There are also some small fixes for the GPU OSL code: - Wrong derivative for .w component when converting float2/float3->float4 - Different conversion for float2->float (CPU averages, GPU used to take .x) - Removed useless code for converting to float2, not used by OSL Pull Request: https://projects.blender.org/blender/blender/pulls/134694	2025-02-20 19:28:45 +01:00
Sahar A. Kashi	6363181af9	Cycles: HIP-RT 2.5 integration and gfx12 support This change brings the following improvements on the user level - Support of GPUs with gfx12 architecture - New HIP-RT library which in addition to the gfx12 support brings various bug-fixes. The known limitation of gfx12 is that OpenImageDenoiser does not yet support this GPU architecture. This means that while Cycles will use the full advantage of the gfx12 (including hardware accelerated ray-tracing), denoising will only be possible on CPU, or secondary gfx11 or below GPU. This is something that requires a change in OIDN and it is too late to do it for Blender 4.4, but it is something to look forward to for Blender 4.5. The gfx12 changes for the pre-compiled kernels are rather trivial, so they come together (in the same PR) with the bigger HIP-RT change. On the development side this change brings the following improvements: - One step compile and link (much simpler CMake rules) - Embedding BVH binaries in hiprt dll (which makes it easier to package and load, without relying on special path configuration) Co-authored-by: Sahar Kashi <sahar.kashi@amd.com> Co-authored-by: Sergey Sharybin <sergey@blender.org> Co-authored-by: Brecht Van Lommel <brecht@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/133129	2025-02-20 17:34:14 +01:00
Nikita Sirgienko	2bab4ae370	Cycles: oneAPI: Optimize texture access by using GPU HW sampler The current usage of software-based texture operations in the oneAPI implementation puts additional register pressure on the GPU compiler during register allocation. And it also creates code that requires maintenance. This commit is intended to address this situation by utilizing a recently productized SYCL bindless texture API to enable HW-based texture operations using Intel GPUs' hardware sampler. This currently translates to 1-11% rendering speedups (scene-specific) on my Arc A770 and Arc B580. At the moment, there are small performance regressions with NanoVDB texture operations on Arc B580 and small performance regressions in shade surface MNEE and Raytrace kernels on Arc A770, but they look recoverable and will be handled in the future. Pull Request: https://projects.blender.org/blender/blender/pulls/133457	2025-02-12 21:47:34 +01:00
Nikita Sirgienko	a0b7ad436b	Cleanup: Cycles: oneAPI: Switch to non-experimental work item API There is now a non-experimental API for this_work_item functionality, so let's use it for better code quality and also to avoid the deprecation warning during compilation. No functional or performance changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/133472	2025-02-12 21:46:22 +01:00
Patrick Mours	5810c94f95	Cycles: Add Blackwell to Cycles CUDA binaries architectures Enables building of a Cubin for GPUs based on Blackwell architecture if CUDA toolkit version 12.8 or higher is installed. Only added sm_120 to the default set, since it is the one relevant for consumer GPUs (RTX 5090 etc.) that are generally used with Blender. Pull Request: https://projects.blender.org/blender/blender/pulls/134170	2025-02-10 14:55:28 +01:00
Brecht Van Lommel	f2bf9d747e	Cleanup: Cycles: Remove some unused kernel entry points on CPU	2025-01-13 10:07:37 +01:00
Brecht Van Lommel	2bf6d0fd71	Cleanup: Cycles: Remove unnecessary SSE4.2 CPU kernel This is the minimum requirement, so just the regular kernel already includes these instructions if supported by the CPU architecture.	2025-01-13 10:07:37 +01:00
Brecht Van Lommel	9971648783	Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:30 +01:00
Brecht Van Lommel	a8654a1dbe	Refactor: Cycles: Make CPU kernel globals storage more sane Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:27 +01:00
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Brecht Van Lommel	dd51c8660b	Refactor: Cycles: Add const keyword where possible, using clang-tidy Check was misc-const-correctness, combined with readability-isolate-declaration as suggested by the docs. Temporarily clang-format "QualifierAlignment: Left" was used to get consistency with the prevailing order of keywords. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:20 +01:00
Brecht Van Lommel	71b8ecdd84	Cleanup: Cycles: Remove workaround for slow expf in glibc < 2.16 We're on 2.28 now, and were already on 2.17 for many years before that. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:03 +01:00
Brecht Van Lommel	3a57b97eba	Cleanup: Cycles: Remove unneeded oneAPI double emulation for NanoVDB Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:59 +01:00
Brecht Van Lommel	d0c2e68e5f	Refactor: Cycles: Automated clang-tidy fixups in Cycles * Use .empty() and .data() * Use nullptr instead of 0 * No else after return * Simple class member initialization * Add override for virtual methods * Include C++ instead of C headers * Remove some unused includes * Use default constructors * Always use braces * Consistent names in definition and declaration * Change typedef to using Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:55 +01:00
Brecht Van Lommel	5c46063607	Refactor: Cycles: Make kernel headers work by themselves Shuffle around some code and add more includes so that individual header files compile without errors. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:50 +01:00
Brecht Van Lommel	3c2a6fbb9c	Refactor: Cycles: Use nullptr instead of NULL Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:43 +01:00
Brecht Van Lommel	4453ca25b4	Fix: Cycles table precompute app build failure	2024-12-31 00:50:44 +01:00
Thomas Dinges	1be75e86aa	Cleanup: replace floatX_to_floatY() with make_floatY() Now that function overloads are usable on all GPUs, replace the former explicit functions. Pull Request: https://projects.blender.org/blender/blender/pulls/132067	2024-12-19 09:41:55 +01:00
Thomas Dinges	22e16ca096	Cycles: add make_float4(float3 a, float b) type This resolves a todo from the code. Part of the Quality Project. Pull Request: https://projects.blender.org/blender/blender/pulls/131915	2024-12-17 09:11:08 +01:00
Michael Jones	8fe2e37dd0	Fix #130641 : MetalRT: Motion Blur (render errors) This PR fixes #130641. The bug was caused by a missing self-object constraint when performing SSS on motion blur scenes. scene_intersect_local tests were erroneously hitting other objects, and out of range primitive IDs were causing spurious downstream behavior. Pull Request: https://projects.blender.org/blender/blender/pulls/131156	2024-12-03 20:24:36 +01:00
Weizhen Huang	e2d7681fe6	Cleanup: Cycles: remove unused `ccl_loop_no_unroll` Was added in `6121c28501` to ensure compiling on OpenCL, now the definition is empty on all platforms Pull Request: https://projects.blender.org/blender/blender/pulls/131100	2024-11-28 16:37:01 +01:00
Nikita Sirgienko	2aa9203f2f	Cycles: Reintroduce noinline keyword for oneAPI device In `891d71a4d4` this keyword was dropped due to performance regression after `fdc2962beb`, but currently code does not experience this performance degradation, and in fact there is minor performance improvement on Lunar Lake GPUs, along with an expected improvement in compile time. However, this change brings a minor performance regression to shade_surface kernel on Intel Arc and Meteor Lake GPUs, which will be solved later by disabling this keyword for these platforms only. Pull Request: https://projects.blender.org/blender/blender/pulls/130299	2024-11-15 12:09:37 +01:00
Campbell Barton	1b320d5205	Merge branch 'blender-v4.3-release'	2024-10-25 08:03:11 +11:00
Michael Jones	029cd1f739	Cycles: Remove invalid use of MetalRT accept_any_intersection in scene_intersect_local This PR fixes a latent issue arising from invalid use of `accept_any_intersection(true)` when performing SSS ray-stepping with MetalRT. The comment incorrectly states that "we can optimize and accept the first hit", but to guarantee correct behaviour in future we need to request the closest hit.	2024-10-24 10:42:59 +01:00
Xavier Hallade	b614953971	Cycles: oneAPI: fix Linux compilation with fno-honor-nans Previously, when compiling on Rocky Linux 8 with fno-honor-nans, compile time was more than 5x longer than expected, and there was an unresolved symbol to __sqrtf_finite in GPU binaries. Once defining sqrtf in compat.h, both issues are effectively gone, this was certainly due to problematic interactions with build system's math library headers. So we can remove current workaround of defining fhonor-nans, and now have the same set of flags on both Windows and Linux.	2024-10-04 17:50:24 +02:00
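
Commits 50180283e9 and 5ce4e91a80 above describe the ordering used for transparent shadow hits when spatial splits are enabled: first reject an intersection that was already recorded through another BVH node, and only then apply the bounce limit. The following is a minimal sketch of that ordering, not the Cycles implementation. The `Hit` struct, the `RecordResult` enum, and both function names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified hit record (not the Cycles Intersection struct).
struct Hit {
  int object;
  int prim;
  float t;
};

enum class RecordResult { Duplicate, Recorded, LimitReached };

// Returns true if the same (object, prim) pair has already been recorded.
// With spatial splits one primitive can be referenced by several BVH nodes,
// so the same intersection may be reported more than once.
inline bool is_duplicate_hit(const std::vector<Hit> &recorded, const Hit &candidate)
{
  for (const Hit &hit : recorded) {
    if (hit.object == candidate.object && hit.prim == candidate.prim) {
      return true;
    }
  }
  return false;
}

// Duplicate check comes first, so a repeated hit never counts towards
// the limit. Only a new hit is tested against max_hits.
inline RecordResult record_transparent_hit(std::vector<Hit> &recorded,
                                           const Hit &candidate,
                                           const std::size_t max_hits)
{
  if (is_duplicate_hit(recorded, candidate)) {
    return RecordResult::Duplicate;
  }
  if (recorded.size() >= max_hits) {
    return RecordResult::LimitReached;
  }
  recorded.push_back(candidate);
  return RecordResult::Recorded;
}
```

The linear scan keeps the sketch short. The commit message notes that filtering after sorting would be faster, but it would make the bounce check behave differently with and without spatial splits.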

302 Commits