test2

Author	SHA1	Message	Date
Lukas Stockner	8cb5e05c48	Cleanup: Cycles: Deduplicate kernel attribute code using templating The attribute handling code in the kernel is currently highly duplicated since it needs to handle five different data types and we couldn't use templates back then. We can now, so might as well make use of it and get rid of ~1000 lines. There are also some small fixes for the GPU OSL code: - Wrong derivative for .w component when converting float2/float3->float4 - Different conversion for float2->float (CPU averages, GPU used to take .x) - Removed useless code for converting to float2, not used by OSL Pull Request: https://projects.blender.org/blender/blender/pulls/134694	2025-02-20 19:28:45 +01:00
Brecht Van Lommel	57ff24cb99	Refactor: Cycles: Add const keyword to more function parameters Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:24 +01:00
Brecht Van Lommel	dd51c8660b	Refactor: Cycles: Add const keyword where possible, using clang-tidy Check was misc-const-correctness, combined with readability-isolate-declaration as suggested by the docs. Temporarily clang-format "QualifierAlignment: Left" was used to get consistency with the prevailing order of keywords. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:23:20 +01:00
Brecht Van Lommel	d0c2e68e5f	Refactor: Cycles: Automated clang-tidy fixups in Cycles * Use .empty() and .data() * Use nullptr instead of 0 * No else after return * Simple class member initialization * Add override for virtual methods * Include C++ instead of C headers * Remove some unused includes * Use default constructors * Always use braces * Consistent names in definition and declaration * Change typedef to using Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:55 +01:00
Brecht Van Lommel	5c46063607	Refactor: Cycles: Make kernel headers work by themselves Shuffle around some code and add more includes so that individual header files compile without errors. Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:50 +01:00
Brecht Van Lommel	3c2a6fbb9c	Refactor: Cycles: Use nullptr instead of NULL Pull Request: https://projects.blender.org/blender/blender/pulls/132361	2025-01-03 10:22:43 +01:00
Thomas Dinges	1be75e86aa	Cleanup: replace floatX_to_floatY() with make_floatY() Now that function overloads are usable on all GPUs, replace the former explicit functions. Pull Request: https://projects.blender.org/blender/blender/pulls/132067	2024-12-19 09:41:55 +01:00
Michael Jones	8fe2e37dd0	Fix #130641 : MetalRT: Motion Blur (render errors) This PR fixes #130641. The bug was caused by a missing self-object constraint when performing SSS on motion blur scenes. scene_intersect_local tests were erroneously hitting other objects, and out of range primitive IDs were causing spurious downstream behavior. Pull Request: https://projects.blender.org/blender/blender/pulls/131156	2024-12-03 20:24:36 +01:00
Weizhen Huang	e2d7681fe6	Cleanup: Cycles: remove unused `ccl_loop_no_unroll` Was added in `6121c28501` to ensure compiling on OpenCL, now the definition is empty on all platforms Pull Request: https://projects.blender.org/blender/blender/pulls/131100	2024-11-28 16:37:01 +01:00
Michael Jones	029cd1f739	Cycles: Remove invalid use of MetalRT accept_any_intersection in scene_intersect_local This PR fixes a latent issue arising from invalid use of `accept_any_intersection(true)` when performing SSS ray-stepping with MetalRT. The comment incorrectly states that "we can optimize and accept the first hit", but to guarantee correct behaviour in future we need to request the closest hit.	2024-10-24 10:42:59 +01:00
Alaska	c8340cf754	Cycles: Remove AMD and Intel GPU support from Metal backend This is because with the addition of new features to Cycles, these GPUs experienced significant performance regressions and bugs, all stemming from bugs in the Metal GPU driver/compiler. The only reasonable way to work around these issues was to disable parts of Cycles code on these GPUs to avoid the driver/compiler bugs. This resulted in increased development time maintaining these platforms while being unable to deliver feature parity with other GPU backends. It has been decided that this development time is better spent maintaining platforms that are still actively maintained by hardware/software vendors, and so AMD and Intel GPU support will be removed from the Metal backend for Cycles. Pull Request: https://projects.blender.org/blender/blender/pulls/123551	2024-06-26 17:16:20 +02:00
Lukas Stockner	f3f05f945c	Cycles: Add missing make_uintX definitions for Metal	2024-06-05 03:04:04 +02:00
Michael Jones	5be30b7d2b	Cycles: "Struct-of-array-of-packed-structs" for parts of the integrator state On an M3 MacBook Pro, this change increases the benchmark score by 8% (with classroom seeing a path-tracing speedup of 15%). The integrator state is currently stored using struct-of-arrays, with one array per field. Such fine grained separation can result in poor GPU cache utilisation in cases where multiple fields of the same parent struct are accessed together. This PR changes the layout of the `ray`, `isect`, `subsurface`, and `shadow_ray` structs so that the data is interleaved (per parent struct) instead of separate. To try and keep this change localised, I encapsulated the layout change by extending the integrator state access macros, however maybe we want to do this more explicitly? (e.g. by updating every bit of code that accesses these parts of the state). Feedback welcome. Pull Request: https://projects.blender.org/blender/blender/pulls/122015	2024-06-04 14:53:30 +02:00
Michael Jones	5508b41a40	Cycles: MetalRT optimisations (scene_intersect_shadow + random_walk) This PR contains optimisations and a general tidy-up of the MetalRT backend. - Currently `scene_intersect` is used for both normal and (opaque) shadow rays, however the usage patterns are different enough to warrant specialisation. Shadow intersection tests (flagged with `PATH_RAY_SHADOW_OPAQUE`) only need a bool result, but need a larger "self" payload in order to exclude hits against target lights. By specialising we can minimise the payload size in each case (which helps performance) and avoid some dynamic branching. This PR introduces a new `scene_intersect_shadow` function which is specialised in Metal, and currently redirects to `scene_intersect` in the other backends. - Currently `scene_intersect_local` is implemented for worst-case payload requirements as demanded by `subsurface_disk` (where `max_hits` is 4). The random_walk case only demands 1 hit result which we can retrieve directly from the intersector object (rather than stashing it in the payload). By specialising, we significantly reduce the payload size for random_walk queries, which has a big impact on performance. Additionally, we only need to use a custom intersection function for the first ray test in a random walk (for self-primitive filtering), so this PR forces faster `opaque` intersection testing for all but the first random walk test. - Currently `scene_intersect_volume` has a lot of redundant code to handle non-triangle primitives despite volumes only being enclosed by trimeshes. This PR removes this code. Additionally, this PR tidies up the convoluted intersection function linking code, removes some redundant intersection handlers, and uses more consistent naming of intersection functions. On an M3 MacBook Pro, these changes give 2-3% performance increase on typical scenes with opaque trimesh materials (e.g. barbershop, classroom, junkshop), but can give over 15% performance increase for certain scenes using random walk SSS (e.g. monster). Pull Request: https://projects.blender.org/blender/blender/pulls/121397	2024-05-10 16:38:02 +02:00
Michael Jones	99f5433445	Cycles: Dormant fixes for adaptive feature compilation This PR fixes the (currently unused) scene-based selective feature compilation macros. These feature based macros haven't been used for a few years, and enabling them currently results in compilation errors. The only functional change in this PR is in geom/primitive.h where undef-ing `__HAIR__` had exposed an inconsistency in how pointcloud attributes were being fetched. Using the more general `primitive_surface_attribute_float4` (instead of `curve_attribute_float4`) fixed a compilation error that occurred when rendering pointcloud unit test scenes with adaptive compilation enabled. Pull Request: https://projects.blender.org/blender/blender/pulls/121216	2024-04-30 12:56:22 +02:00
Weizhen Huang	418acfe8bb	Cleanup: remove unused function parameters This is not a complete list of all the unused parameters in kernel, but those I touch often, so I am more confident that it's safe to delete them.	2024-04-17 18:49:00 +02:00
Weizhen Huang	b81b0308fd	Fix: `WITH_CYCLES_DEBUG` flag not enabled on Metal seems to be enabled on other GPUs already Pull Request: https://projects.blender.org/blender/blender/pulls/119701	2024-03-20 16:42:42 +01:00
Campbell Barton	617f7b76df	Cleanup: comment block formatting	2024-01-08 11:31:43 +11:00
Brecht Van Lommel	6cdb43195e	Refactor: replace NanoVDB kernel side implementation by own code The NanoVDB headers are not compatible with Metal due to missing address space qualifiers. We currently have a big patch for NanoVDB header files, which is difficult to update for OpenVDB 11. Instead extract a few hundred lines of code from NanoVDB to do just what we need. Pull Request: https://projects.blender.org/blender/blender/pulls/115992	2023-12-10 19:37:36 +01:00
Brecht Van Lommel	8ba474dc4f	Refactor: replace NanoVDB SampleFromVoxels by own code This makes the GPU tricubic implementation more efficient. The dense grid code implemented this in terms of trilinear lookups that are hardware accelerated, but for NanoVDB this just causes unnecessary voxel reads. Instead match the CPU code. Pull Request: https://projects.blender.org/blender/blender/pulls/115992	2023-12-10 19:37:36 +01:00
Campbell Barton	58ea0e051f	Cleanup: spelling in comments	2023-11-09 09:54:28 +11:00
Campbell Barton	6bba008325	Cleanup: format	2023-11-09 09:34:49 +11:00
Michael Jones	051ce95628	Cycles: Use Metal Program Scope Global Built-ins on macOS >= 14.0 This PR simplifies the kernel entrypoints by using [Metal Program Scope Global Built-ins](https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf) when available (macOS >= 14.0). Pull Request: https://projects.blender.org/blender/blender/pulls/114535	2023-11-07 11:20:16 +01:00
Michael Jones	1c1c6ac457	Cycles: Fix last failing unit test (T39823) on MetalRT This PR fixes T39823, the sole failing unit test when running with MetalRT. It does so by implementing and binding a missing intersection handler (`__anyhit__cycles_metalrt_volume_test_tri`) which is required for `scene_intersect_volume` (as used by `integrator_volume_stack_update_for_subsurface`) to work as intended. This scene exposed the error as it uses subsurface scattering on a sphere which is intersected by volume. Pull Request: https://projects.blender.org/blender/blender/pulls/112876	2023-09-25 22:41:27 +02:00
Campbell Barton	b7f3e0d84e	Cleanup: spelling & punctuation in comments Also remove some unhelpful/redundant comments.	2023-09-14 13:25:24 +10:00
Harley Acheson	092b568a90	Cleanup: Make format Formatting changes resulting from Make Format	2023-09-13 11:03:43 -07:00
Michael Jones	6c98cb73ac	Cycles: Use new MetalRT curve primitives for 3D curves and ribbons This patch updates the experimental MetalRT code path to use new [curve primitives](https://developer.apple.com/videos/play/wwdc2023/10128/) which were recently added in macOS 14. This replaces the previous custom box intersection implementation, allowing the driver to better optimise curve acceleration structures for the GPU. On existing hardware, this can speed up MetalRT renders by up to 40% for scenes that use hair / curve primitives extensively. The MetalRT option will only be available on macOS >= 14, and requires Xcode >= 15 to build (otherwise the option will be compiled out). Authored by Marco Giordano, Michael Jones, and Jason Fielder --- Before / after render times (M1 Max MacBook Pro, macOS 14 beta, MetalRT enabled), given as custom box intersection / MetalRT curve primitives / speedup: fishy_cat 111.5 / 80.5 / 1.39; koro 114.4 / 86.7 / 1.32; sinosauropteryx 291.8 / 279.2 / 1.05; spring 142.3 / 142.2 / 1.00; victor 442.7 / 347.7 / 1.27. --- Pull Request: https://projects.blender.org/blender/blender/pulls/111795	2023-09-13 16:02:49 +02:00
Campbell Barton	9e41eccc6e	Cleanup: spelling in comments	2023-09-08 17:12:29 +10:00
Sergey Sharybin	7e4a51329b	Fix shadow linking for Cycles Metal RT The shadow intersection kernels need to perform extra checks to see whether an object is really considered a blocker. Pull Request: https://projects.blender.org/blender/blender/pulls/112012	2023-09-06 15:25:30 +02:00
Campbell Barton	1f01a64403	Cleanup: spelling in comments	2023-09-06 14:23:01 +10:00
Sergey Sharybin	71b4a97cbc	Refactor: De-duplicate Metal RT self intersection checks Use the common BVH utilities header for this. Added a special type qualifier ccl_ray_data which is defined to ccl_private for all platforms but Metal. On Metal it is defined to ray_data. The tricky part is that the BVH utilities are wrapped into the Metal context class. In some of the BVH functions the context has been already constructed, but it wasn't done in all the callbacks. From a quick render test of the Junkshop benchmark scene there is no render time difference. No functional changes are expected. Pull Request: https://projects.blender.org/blender/blender/pulls/111967	2023-09-05 17:21:49 +02:00
Sergey Sharybin	7365f0b094	Cleanup: Cover .metal files with `make format` Pull Request: https://projects.blender.org/blender/blender/pulls/111930	2023-09-05 09:59:47 +02:00
Sergey Sharybin	c59c97c947	Cleanup: Ensure correct order of headers in Metal kernel Explicitly split into groups of headers, so that clang-format does not ruin the required order of headers.	2023-09-05 09:59:41 +02:00
Campbell Barton	0caf227530	License headers: use SPDX-FileCopyrightText for .inl and .osl files	2023-08-04 13:24:17 +10:00
Campbell Barton	c12994612b	License headers: use SPDX-FileCopyrightText in intern/cycles	2023-06-14 16:53:23 +10:00
Michael Jones	d0467a277a	Cycles: MetalRT: Don't apply local object transform if it is baked This patch fixes the failing shader/bevel unit test when MetalRT is enabled. The ray was being transformed into local object space even when the SD_OBJECT_TRANSFORM_APPLIED flag was set. Pull Request: https://projects.blender.org/blender/blender/pulls/107292	2023-04-24 15:20:21 +02:00
Michael Jones	5f61eca7af	Cycles: Exploit non-uniform threadgroup sizes on Metal This patch replaces `dispatchThreadgroups` with `dispatchThreads` which takes care of non-uniform threadgroup bounds. This allows us to remove the bounds guards in the integrator kernel entry points. Pull Request: https://projects.blender.org/blender/blender/pulls/106217	2023-03-29 21:46:11 +02:00
Michael Jones	944a5854c6	Cycles: Fix MetalRT shadow all hit bug This patch fixes a MetalRT issue where viable shadow hits are discounted based on the false assumption that hits are ordered by distance. With this patch, the following unit tests now pass: - openvdb smoke - shadow catcher pt transparent lamp only 0.8 - shadow catcher pt transparent lamp only 1.0 Pull Request: https://projects.blender.org/blender/blender/pulls/106276	2023-03-29 20:20:07 +02:00
Julian Eisel	30e517c3ca	Merge branch 'blender-v3.5-release'	2023-03-15 13:07:26 +01:00
Michael Jones	089e8a1887	Cycles: Fix Metal API validation error (use uint instead of ushort) This PR fixes an error that is given when Metal API validation is enabled. The compute grid can exceed 65536 threads so `ushort` is not sufficient for `metal_grid_id [[threadgroup_position_in_grid]]`. This PR also fixes OS version warnings ([Cycles Metal: Unguarded access to newer macOS features #105630](https://projects.blender.org/blender/blender/issues/105630)) Pull Request: https://projects.blender.org/blender/blender/pulls/105763	2023-03-14 22:05:55 +01:00
William Leeson	6c03339e48	Cycles: reduce mesh memory usage by unflattening This improves mesh upload speeds and reduces the size of the scene data, which allows larger scenes to be rendered. The meshes in Cycles are currently stored as flattened meshes, where each triangle is stored as a set of 3 vertices. Flattening writes out the vertices in a list according to the index buffer. This uses a lot of memory and for current hardware does not provide a noticeable benefit. This change unflattens the mesh by using the mesh's vertex and index buffers directly and skips the flattening. This change allows for larger scenes and also a reduction in the sizes of the meshes. Further, it results in a decrease in the amount of time it takes to upload the data to a GPU. This is especially important when multiple GPUs are used in a single machine. Pull Request #105173	2023-02-27 10:39:19 +01:00
Brecht Van Lommel	02c2970983	Cycles: add NanoVDB support for Metal on Apple Silicon Contributed by Yulia Kuznetcova at Apple. NanoVDB is patched to add the address spaces required by Metal. We hope that in the future Metal will support the generic address space. For AMD and Intel this is currently not available since it causes a performance regression also on scenes without volumes. Pull Request #104837	2023-02-21 15:03:52 +01:00
Michael Jones	2d994de77c	Cycles: MetalRT optimisation for subsurface intersection queries This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering. Patch authored by Marco Giordano. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17153	2023-02-06 19:12:29 +00:00
Michael Jones	654e1e901b	Cycles: Use local atomics for faster shader sorting (enabled on Metal) This patch adds two new kernels: SORT_BUCKET_PASS and SORT_WRITE_PASS. These replace PREFIX_SUM and SORTED_PATHS_ARRAY on supported devices (currently implemented on Metal, but will be trivial to enable on the other backends). The new kernels exploit sort partitioning (see D15331) by sorting each partition separately using local atomics. This can give an overall render speedup of 2-3% depending on architecture. As before, we fall back to the original non-partitioned sorting when the shader count is "too high". Reviewed By: brecht Differential Revision: https://developer.blender.org/D16909	2023-02-06 11:18:26 +00:00
Campbell Barton	79c82fc1c5	Cleanup: trailing space	2023-01-31 15:49:04 +11:00
Hallam Roberts	a501a2dbff	Images: add mirror extension type This adds a new mirror image extension type for shaders and geometry nodes (next to the existing repeat, extend and clip options). See D16432 for a more detailed explanation of `wrap_mirror`. This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`. It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT` flag must be set for the `MIRROR` flag to have an effect. Differential Revision: https://developer.blender.org/D16432	2022-12-14 19:27:29 +01:00
Michael Jones	b0e2e45496	Cycles: Enable MetalRT pointclouds & other fixes Code authored by Marco Giordano. This fixes pointcloud rendering on MetalRT and some other subtle MetalRT bugs: - Incorrect kernel hashing - Missing specialisation constants - Incorrect visibility filtering - Missing null pointer check Reviewed By: brecht Differential Revision: https://developer.blender.org/D16499	2022-11-14 16:39:18 +00:00
Patrick Mours	e6b38deb9d	Cycles: Add basic support for using OSL with OptiX This patch generalizes the OSL support in Cycles to include GPU device types and adds an implementation for that in the OptiX device. There are some caveats still, including simplified texturing due to lack of OIIO on the GPU and a few missing OSL intrinsics. Note that this is incomplete and missing an update to the OSL library before being enabled! The implementation is already committed now to simplify further development. Maniphest Tasks: T101222 Differential Revision: https://developer.blender.org/D15902	2022-11-09 15:30:21 +01:00
Lukas Stockner	e2a93e9c7c	Fix T94136: Cycles: No Hair Shadows with Transparent BSDF	2022-10-20 04:47:21 +02:00
Morteza Mostajab	e6902d19a0	Cycles: Allow Intel GPUs under Metal Known Issues: - Command buffer failures when using binary archives (binary archives are disabled for Intel GPUs as a workaround) - Wrong texture sampler being applied (to be addressed in the future) Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D16253	2022-10-19 17:09:38 +01:00

88 Commits