test2

Author	SHA1	Message	Date
Brecht Van Lommel	f2bf9d747e	Cleanup: Cycles: Remove some unused kernel entry points on CPU	2025-01-13 10:07:37 +01:00
Brecht Van Lommel	2bf6d0fd71	Cleanup: Cycles: Remove unnecessary SSE4.2 CPU kernel This is the minimum requirement, so just the regular kernel already includes these instructions if supported by the CPU architecture.	2025-01-13 10:07:37 +01:00
Xavier Hallade	ce463bd6b1	Cycles: oneAPI: optimize device<->host copies There is a large overhead when doing copies between a device and non-USM host memory. Using the prepare/release API avoids it, as presented in the optimization guide: https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2025-0/optimizing-data-transfers.html This currently translates to a 4-5% overall rendering speedups on my Arc B580 in most scenes. Pull Request: https://projects.blender.org/blender/blender/pulls/132859	2025-01-09 21:00:12 +01:00
Stefan Werner	a79d95099f	Cycles: Fix OneAPI crash after unique_ptr refactor Memory was freed too early, probably a typo.	2025-01-07 09:37:47 +01:00
Michael Jones	fd06944d15	Fix #131458 : Cycles Metal workaround for binary archives crash There is a macOS bug that causes `[binaryArchive serializeToURL]` to crash sometimes. The fix is coming in macOS 15.4. Pull Request: https://projects.blender.org/blender/blender/pulls/132688	2025-01-06 14:12:22 +01:00
Brecht Van Lommel	d48e73977c	Fix: Build errors on Linux/GCC after recent Cycles refactoring	2025-01-03 11:52:13 +01:00
Brecht Van Lommel	9971648783	Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:30 +01:00
Brecht Van Lommel	a8654a1dbe	Refactor: Cycles: Make CPU kernel globals storage more sane Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:27 +01:00
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Brecht Van Lommel	dd51c8660b	Refactor: Cycles: Add const keyword where possible, using clang-tidy Check was misc-const-correctness, combined with readability-isolate-declaration as suggested by the docs. Temporarily clang-format "QualifierAlignment: Left" was used to get consistency with the prevailing order of keywords. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:20 +01:00
Brecht Van Lommel	689633d802	Refactor: Cycles: Avoid unsafe memcpy and memcmp Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:15 +01:00
Brecht Van Lommel	d9150484a2	Cleanup: Cycles: Remove some unnecessary #if 0 and #if 1 Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:09 +01:00
Brecht Van Lommel	60bec183cb	Refactor: Cycles: Replace foreach() by range based for loops Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:05 +01:00
Brecht Van Lommel	d0c2e68e5f	Refactor: Cycles: Automated clang-tidy fixups in Cycles * Use .empty() and .data() * Use nullptr instead of 0 * No else after return * Simple class member initialization * Add override for virtual methods * Include C++ instead of C headers * Remove some unused includes * Use default constructors * Always use braces * Consistent names in definition and declaration * Change typedef to using Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:55 +01:00
Brecht Van Lommel	5c46063607	Refactor: Cycles: Make kernel headers work by themselves Shuffle around some code and add more includes so that individual header files compile without errors. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:50 +01:00
Brecht Van Lommel	f53e13411b	Refactor: Cycles: Use #pragma once Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:45 +01:00
Brecht Van Lommel	3c2a6fbb9c	Refactor: Cycles: Use nullptr instead of NULL Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:43 +01:00
Brecht Van Lommel	4e777476b5	Refactor: Cycles: Replace std::bind by lambdas Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:35 +01:00
salipourto	4e5a9c5dfb	Cycles: Handling SDK/ROCm 6+ lack of backward compatibility with pre ROCm 6 This commit introduces proper handling of ROCm 5 and ROCm 6 runtimes on Linux, based on the version of the ROCm compiler used at build time. Previously, HIPEW (the HIP equivalent of Cuda Wrangler) defaulted to loading the ROCm 5 runtime. If ROCm 5 was unavailable, it would attempt to load ROCm 6. However, ROCm 6 introduces changes in certain structures and functions that are not backward compatible, leading to potential issues when kernels compiled with the ROCm 6 compiler are executed on the ROCm 5 runtime. ### Summary of Changes: Separation of Structures and Functions: Structures and functions are now separated into hipew5 and hipew6 to accommodate the differences between ROCm versions. Build-Time Version Detection: The ROCm version is determined during build time, and the corresponding hipew5 or hipew6 is included accordingly. Runtime Default to ROCm 6: By default, HIPEW now loads the ROCm 6 runtime and includes hipew6 (Linux only). JIT Compilation Behavior: Since ROCm 6 is the default version, JIT compilation is supported only when the ROCm 6 compiler is detected at runtime. HIP-RT Update: HIP-RT has been updated to load the ROCm 6 runtime by default. These changes ensure compatibility and stability when switching between ROCm versions, avoiding issues caused by runtime and compiler mismatches. Co-authored-by: Alaska <alaskayou01@gmail.com> Co-authored-by: Sergey Sharybin <sergey@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/130153	2024-12-17 16:19:36 +01:00
Alaska	c42894a695	Fix: Various issues with Cycles HIP JIT compilation On Linux, Cycles HIP has a JIT compilation feature. This feature is used when Cycles cannot find a precompiled kernel for your GPU, which is most common when using hardware that wasn't out at the time that version of Blender was released. There were various issues with this JIT compilation system; this commit aims to solve them. The changes include: - Enable `WITH_NANOVDB` when Blender is built with NanoVDB. - This fixes an issue where VDB objects would not render. - Enable some extra debug options for developers when desired (This is so we match the CUDA implementation of the same feature). - Reduce the optimization level from -O3 to the default. - This is to avoid any extra issues that may occur as a result of an increased optimization level that isn't tested with precompiled kernels. - Reduce the optimization level even further to -O1 for Vega. - This was done on precompiled kernels to work around some issues, so I decided to apply it to JIT kernels as well. - Note: Although Vega is not officially supported, this may help people that unofficially use Vega. - Added some previously missing compiler arguments and fixed errors that were introduced when enabling these compiler arguments. - Fixed an issue where JIT compilation would fail if Blender was installed in a path that had a space in it. Pull Request: https://projects.blender.org/blender/blender/pulls/131853	2024-12-17 01:02:39 +01:00
Lukas Stockner	0de1cea5c5	Cycles: Use fused OptiX OSL programs Based on #123377 by @brecht, but Gitea doesn't like the rebase, so here's a new PR. The purpose here is to switch to fused OptiX programs for OSL execution on CUDA. On the one hand, this makes the code simpler, but there's also another advantage: how memory allocation is managed. OSL shaders need memory to store intermediate values, but how much is needed depends on the complexity of the shader. With the split program approach, Cycles had to provide that memory, so we had to allocate a certain amount (2 KiB, to be precise) statically and show an error if the shader would need more. If the shader used less (which is the case for the vast majority), the memory was just wasted. By switching to fused kernels, OSL knows the required amount during JIT codegen, so it can allocate only what's required, which avoids this waste. One still needs to set a maximum, and in theory, OSL would also support spilling over into a Cycles-provided alternative memory region. However, we currently don't implement that. Instead, we default to the same 2048 limit as before and let advanced users override it via the CYCLES_OSL_GROUPDATA_ALLOC environment variable if really needed. Co-authored-by: Brecht Van Lommel <brecht@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/130149	2024-11-26 23:58:32 +01:00
Patrick Mours	6f0ed29378	Cycles: Add OptiX 8.1 support The function table symbol declared in the headers was renamed starting in OptiX 8.1, from `g_optixFunctionTable` to `g_optixFunctionTable_<ABI version>`. This adds support for that by using the new macro for the name when available (after OptiX 8.1) and falling back to the old name when it is not (before OptiX 8.1). Pull Request: https://projects.blender.org/blender/blender/pulls/130451	2024-11-18 17:20:49 +01:00
Sergey Sharybin	aec4ba39b9	Merge branch 'blender-v4.3-release'	2024-11-04 17:54:52 +01:00
Michael Jones	d1368883ed	Cycles: MetalRT: Fix logic bug when deciding if HW RT should be used Don't try to use MetalRT by default unless the device explicitly reports that RT is supported. We shouldn't just rely on an assumption that it's supported for M3 and beyond, ad infinitum. Pull Request: https://projects.blender.org/blender/blender/pulls/129688	2024-11-04 17:54:12 +01:00
Clément Foucault	47f7aaa2cc	Merge branch 'blender-v4.3-release'	2024-11-01 12:16:38 +01:00
Jason Fielder	7fbc9e9428	Fix: Metal: Memory leaks identified by Instruments and Xcode memory graph. Running Xcode memory graphs and the Instruments tools revealed memory leaks caused, in the main, by over-retained objects. This removes the unnecessary 'retains' and adds some asserts to guard against over-retaining in the future. There are a few memory leaks remaining involving PyUnicode_DecodeUTF8 but I am unable to identify the cause of these at this time. Authored by Apple: James McCarthy Pull Request: https://projects.blender.org/blender/blender/pulls/129117	2024-11-01 11:56:51 +01:00
Sergey Sharybin	175e46bb51	Merge branch 'blender-v4.3-release'	2024-10-31 17:22:08 +01:00
Patrick Mours	5804a1cc2c	Fix #124200 : OptiX error when updating 3D curves in viewport rendering Changing 3D curve properties while viewport rendering was active resulted in an error, because Cycles would attempt to update the acceleration structure containing the curves, but that acceleration structure was built without the `OPTIX_BUILD_FLAG_ALLOW_UPDATE` flag allowing updates. This fixes that by adding the flag to all curve build inputs. Ideally could just use the same flags as for other build inputs and differentiate between viewport and final rendering (based on `bvh_type`), but that's not currently an option since the same flags have to be specified to query the curve intersection module in `load_kernels()`, where that differentiation is not known. See also commit `5c6053ccb1`. Pull Request: https://projects.blender.org/blender/blender/pulls/129634	2024-10-31 17:21:30 +01:00
Sergey Sharybin	981ab904ba	Merge branch 'blender-v4.3-release'	2024-10-31 16:05:22 +01:00
Alaska	c2f93e0f68	Cycles: Remove support for Vega in Cycles AMD HIP backend This commit removes support for Vega GPUs from the AMD HIP backend of Cycles. This is being done as: - AMD no longer provides official support for Vega GPUs in their ROCm software. - Vega GPUs have rendering artifacts on all supported platforms, and as a result of the reduction of support from AMD, these are unlikely to be fixed. Rendering artifacts include: - The incorrect shading of volumes (Windows and Linux) - Missing intersections on many meshes with HIPRT - Crashing when rendering subsurface scattering materials (Linux) - And more. Pull Request: https://projects.blender.org/blender/blender/pulls/129523	2024-10-31 16:04:54 +01:00
Weizhen Huang	34b95fe3f6	Cleanup: Cycles: use existing utility functions for geometry types Pull Request: https://projects.blender.org/blender/blender/pulls/129552	2024-10-30 16:45:56 +01:00
Xavier Hallade	2cfe69c07d	Cycles: Fix error handling of BVH transfer to device Previously, in case of a failure during BVH transfer, when running out of memory for example, we could get an error such as "BVH failed to migrate to the GPU due to Embree library error (no error)", because embree error status was actually reset before being queried. This commit fixes its propagation. Pull Request: https://projects.blender.org/blender/blender/pulls/129022	2024-10-15 10:31:30 +02:00
Sergey Sharybin	6c3f3a7fb6	Fix: Proper forward declaration for friend class Turns out it is possible to have code to pick up wrong class when defining a friend: ``` intern\cycles\device/memory.h(255): warning C4099: 'GPUDevice': type name first seen using 'struct' now seen using 'class' source\blender\gpu\GPU_platform.hh(69): note: see declaration of 'GPUDevice' ``` Now made it so the classes have forward declaration in the CCL namespace, avoiding possible conflict with the classes with the same name in the global namespace. Pull Request: https://projects.blender.org/blender/blender/pulls/128485	2024-10-04 09:56:54 +02:00
Sergey Sharybin	95f361ac31	Fix: Cycles occasional crash after Metal render Happens for renders from command line, when kernel specialization thread is still working after the allocators on the Blender side have been deinitialized. Add an explicit deinitialization, which ensures all Cycles worker and cache threads are finished before the allocators are deinitialized. This should solve occasional crashes when running regression tests for Metal or Metal-RT. Pull Request: https://projects.blender.org/blender/blender/pulls/128239	2024-09-27 14:39:49 +02:00
Sergey Sharybin	b96a7b7204	Fix #127622 : 4.1 splash screen won't render with MetalRT The commit which made the issue to be more easily discoverable is `4651f8a08f`. The fix is similar to #127114. Pull Request: https://projects.blender.org/blender/blender/pulls/128173	2024-09-26 13:39:22 +02:00
Sahar A. Kashi	26ed4d3892	Cycles: Linux Support for HIP-RT This change switches Cycles to an opensource HIP-RT library which implements hardware ray-tracing. This library is now used on both Windows and Linux. While there should be no noticeable changes on Windows, on Linux this adds support for hardware ray-tracing on AMD GPUs. The majority of the change is typical platform code to add new library to the dependency builder, and a change in the way how ahead-of-time (AoT) kernels are compiled. There are changes in Cycles itself, but they are rather straightforward: some APIs changed in the opensource version of the library. There are a couple of extra files which are needed for this to work: hiprt02003_6.1_amd.hipfb and oro_compiled_kernels.hipfb. There are some assumptions in the HIP-RT library about how they are available. Currently they follow the same rule as AoT kernels for oneAPI: - On Windows they are next to blender.exe - On Linux they are in the lib/ folder Performance comparison on Ubuntu 22.04.5: ``` GPU: AMD Radeon PRO W7800 Driver: amdgpu-install_6.1.60103-1_all.deb main hip-rt attic 0.1414s 0.0932s barbershop_interior 0.1563s 0.1258s bistro 0.2134s 0.1597s bmw27 0.0119s 0.0099s classroom 0.1006s 0.0803s fishy_cat 0.0248s 0.0178s junkshop 0.0916s 0.0713s koro 0.0589s 0.0720s monster 0.0435s 0.0385s pabellon 0.0543s 0.0391s sponza 0.0223s 0.0180s spring 0.1026s 1.5145s victor 0.1901s 0.1239s wdas_cloud 0.1153s 0.1125s ``` Co-authored-by: Brecht Van Lommel <brecht@blender.org> Co-authored-by: Ray Molenkamp <github@lazydodo.com> Co-authored-by: Sergey Sharybin <sergey@blender.org> Pull Request: https://projects.blender.org/blender/blender/pulls/121050	2024-09-24 14:35:24 +02:00
salipourto	17ddca5017	Fix #127240 : Deforming motion blurred point clouds do not render under certain conditions Deforming motion blurred point clouds do not render in Cycles HIP-RT when BVH timesteps != 0 if Blender is launched with debug memory. The root cause is that the size of allocated memory for the bounding boxes is reported to HIP-RT not the number of valid bounding boxes. Pull Request: https://projects.blender.org/blender/blender/pulls/127432	2024-09-12 16:47:13 +02:00
salipourto	4bfee1936c	Fix #126749 : HIP-RT Memory leak in Cycles viewport hiprtScene object wasn't being freed between frames. Pull Request: https://projects.blender.org/blender/blender/pulls/127473	2024-09-12 15:05:08 +02:00
Xavier Hallade	a1182e07b1	Build: upgrade Intel Graphics Compiler and ocloc on Linux IGC 1.0.17384, ocloc 24.31.30508, which: - add support for Battlemage and Lunar Lake GPUs - recover from recent performance regression on Linux - allow dropping the older workaround (`9d5164d472`) and the need for a patched version on Windows - ocloc now needs "dg2,mtl" naming for fat binaries. opencl-clang patches don't get applied anymore by igc build scripts when llvm is not a git repository, hence we can also drop the current patch disabling patching. I've only slightly pushed min-driver-version updates after careful testing, instead of jumping to the same version as ocloc as we used to. Pull Request: https://projects.blender.org/blender/blender/pulls/127251	2024-09-12 09:11:56 +02:00
Xavier Hallade	56db2d393d	Cycles: oneAPI: use ocloc 101.5972 on Windows This new version of the graphics compiler solves a performance regression on Arc, adds support for Battlemage and Lunar Lake GPUs, and allows to drop older patch to build fat binaries with broad compatibility. This latter change requires using -device dg2,mtl naming instead of passing architecture ids. Pull Request: https://projects.blender.org/blender/blender/pulls/127371	2024-09-11 17:34:13 +02:00
salipourto	d4597e20b6	Fix #127131 : Deforming motion blurred point clouds do not render in Cycles HIP-RT when BVH timesteps != 0 The device code was disabled for primitives with deformation blur and the intersection function always returned false, hence no rendered primitive. Other than that, there were a few bugs on both device and host codes (e.g., the order of current and previous times and the primitive name.) Pull Request: https://projects.blender.org/blender/blender/pulls/127163	2024-09-06 12:27:17 +02:00
Nikita Sirgienko	94c9898f41	Fix #124811 : Cycles: oneAPI: no hair strands in viewport with Embree oneAPI kernels preloading logic was letting un-needed kernels to be compiled without features, which would then miss when these kernels were needed later. Pull Request: https://projects.blender.org/blender/blender/pulls/127114	2024-09-04 11:08:00 +02:00
Alaska	8cf4d47fe2	Fix: Improve Cycles point clouds in HIPRT Fixes a few issues with point clouds with HIPRT. 1. Crashing when building the BLAS due to an incorrect sized array. 2. A typo leading to all point cloud intersections being skipped. 3. A typo leading to some motion blurred point clouds rendering as if they were stationary, or not rendering at all. Pointclouds, with deformable motion blur, with BVH time steps set to >0 still do not render. Curves seem to have the same issue. Ref #125086 Pull Request: https://projects.blender.org/blender/blender/pulls/125834	2024-09-03 16:31:41 +02:00
Sergey Sharybin	92733a9415	Fix: Cycles memory leak in HIP-RT Some of the device memory objects had their host_pointer overwritten with another CPU-side buffer after allocation. This leads to a leak of host memory allocated by the device_memory. There are a few remaining places where the host_pointer is assigned and those seem to be fine because the memory was not yet allocated with an alloc() call. While the approach in this change is not ideal, it is small and potentially could be ported to the LTS tracks. A more ideal solution would be to utilize device_vector::give_data(). Pull Request: https://projects.blender.org/blender/blender/pulls/126788	2024-08-27 12:46:54 +02:00
Patrick Mours	013a2ce765	Cycles: Change OptiX curve vertex data generation to use more compact representation OptiX has accepted Catmull-Rom curve data natively since OptiX 7.4, but due to the previous conversion to B-Spline code, the format that data is fed to OptiX wasn't optimal. Each curve segment was put in the vertex buffer as four independent control points, even though continuous segments actually share control points between each other. This patch compacts that so shared control points only occur once in the vertex buffer. This compact form uses less memory and also allows OptiX to easily identify segments that belong together into a curve (those where the step between indices is one). Pull Request: https://projects.blender.org/blender/blender/pulls/125899	2024-08-15 15:00:56 +02:00
Sergey Sharybin	ce6454d02f	Fix #126005 : x64 Blender on Apple Silicon doesn't render properly in Cycles GPU The GPU packed state is a static check from the Cycles core perspective, and it is disabled for non-Apple Silicon GPUs. However, the Metal kernel always used packed integrator. This change makes it so the Host and Device side checks for the Host CPU are aligned, and that Device-side packed state check does not differ from the Host side. Pull Request: https://projects.blender.org/blender/blender/pulls/126082	2024-08-08 16:01:23 +02:00
Alaska	50ba7a3033	Fix build failure after recent HIP-RT change Fixes build failure after `6b848a9993` Pull Request: https://projects.blender.org/blender/blender/pulls/125936	2024-08-06 02:41:11 +02:00
Alaska	6b848a9993	Fix: Crash with deforming motion blurred meshes in HIPRT Fixes a crash that can occur if motion blur was on, there is a deforming mesh in the scene with deformable motion blur turned on, with BVH time steps set >0. Render results in my test scene appear to match CPU Embree. Pull Request: https://projects.blender.org/blender/blender/pulls/125854	2024-08-05 18:38:28 +02:00
Xavier Hallade	cee4ad4518	Refactor: Cycles: oneAPI: Simplify num_concurrent_states() Deduplicated code by reusing num_concurrent_busy_states().	2024-07-18 15:46:17 +02:00
Xavier Hallade	c8421a0007	Cycles: set num_sort_partition_elements to 65536 for simd16+ Intel GPUs Intel(R) Data Center GPU Max greatly benefits from this change since its bigger simd width leads to a greater execution divergence.	2024-07-18 15:15:00 +02:00
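The compact curve vertex layout described in commit `013a2ce765` can be illustrated with a minimal sketch. This is not the Cycles or OptiX code: `ControlPoint`, `expand_segments` and `compact_segment_indices` are hypothetical names, and the sketch assumes a cubic curve where each segment reads four consecutive control points, identified by the index of its first point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical control point (position and radius); not the Cycles or OptiX type.
struct ControlPoint {
  float x, y, z, radius;
};

// Expanded layout: every segment stores its own four control points,
// so points shared by neighbouring segments are duplicated.
std::vector<ControlPoint> expand_segments(const std::vector<ControlPoint> &curve)
{
  assert(curve.size() >= 4);
  const std::size_t num_segments = curve.size() - 3;
  std::vector<ControlPoint> vertices;
  vertices.reserve(num_segments * 4);
  for (std::size_t s = 0; s < num_segments; s++) {
    for (std::size_t k = 0; k < 4; k++) {
      vertices.push_back(curve[s + k]);
    }
  }
  return vertices;
}

// Compact layout: the control points are stored once, and each segment is
// described only by the index of its first control point. Consecutive
// segments of one curve then have start indices that differ by exactly one.
std::vector<unsigned int> compact_segment_indices(const std::size_t num_control_points,
                                                  const unsigned int first_vertex)
{
  assert(num_control_points >= 4);
  const std::size_t num_segments = num_control_points - 3;
  std::vector<unsigned int> indices;
  indices.reserve(num_segments);
  for (std::size_t s = 0; s < num_segments; s++) {
    indices.push_back(first_vertex + static_cast<unsigned int>(s));
  }
  return indices;
}
```

For a curve with N control points, the expanded form stores 4 * (N - 3) vertices while the compact form stores N, which is where the memory saving mentioned in the commit message comes from.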

1375 Commits