Commit Graph

1278 Commits

Author SHA1 Message Date
Sergey Sharybin
30b962b3d8 Cycles: Optimize 3d and 4d noise
The goal is to reduce the effect of the fmod() used in the noise code,
which was initially reported in the comment:

    https://projects.blender.org/blender/blender/pulls/119884#issuecomment-1258902

The basic idea is to benefit from SIMD vectorization on the CPU.

Tested on Linux (i9-11900K) and macOS (M2 Ultra). In both cases, performance
after this change is very close to what it could be with the fmod() commented
out (the call itself, `p = p + precision_correction`).

On macOS the penalty of fmod() was about 10%, on Linux it was closer to 30%
when built with GCC-13. With Linux builds from the buildbot it is more like 18%.

The optimization is only done for 3d and 4d noise. It might be possible to
gain some performance improvement for 1d and 2d cases, but the approach would
need to be different: we'd need to optimize the scalar version of fmodf(). Maybe
tricks with integer cast will be faster (since we are a bit optimistic in the
kernel and do not guarantee exact behavior in extreme cases such as NaN inputs).
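
As a rough illustration of why a simplified fmod() can vectorize well, here is a minimal sketch. It is not the code from this commit: `approx_fmod` and `approx_fmod_array` are hypothetical names, and the truncation-based formula is only one common way to replace the library call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not the Cycles implementation: remainder with the sign
// of x, computed as x - trunc(x / y) * y. Unlike std::fmod it is not exact for
// very large quotients, and NaN or zero divisors are not handled specially.
inline float approx_fmod(const float x, const float y)
{
  return x - std::trunc(x / y) * y;
}

// Apply the helper to n values. The loop body is branch-free, which is the
// kind of code compilers can often turn into SIMD instructions.
inline void approx_fmod_array(const float *x, const float y, float *out, const std::size_t n)
{
  for (std::size_t i = 0; i < n; i++) {
    out[i] = approx_fmod(x[i], y);
  }
}
```

The trade-off is the one mentioned above: exact behavior in extreme cases is given up in exchange for simpler arithmetic.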

Pull Request: https://projects.blender.org/blender/blender/pulls/137109
2025-04-09 13:40:10 +02:00
Sergey Sharybin
5b0ed683a0 Cycles: Make select() and mask() for vectorized float work on CPU and GPU
Pull Request: https://projects.blender.org/blender/blender/pulls/137148
2025-04-08 17:04:18 +02:00
Michael Jones
326d5bca03 Cycles: Support Decomposed MetalRT motion interpolation
Currently MetalRT interpolates the transformation matrix on a per-element
basis, which leads to issues like #135659.

This change adds an implementation of decomposed (Scale/Rotate/Translate)
motion interpolation, matching the behavior of BVH2 and other HW-RT.
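
The difference between the two approaches can be shown with a small 2D sketch. It is only an illustration under simplifying assumptions (rotation only, hypothetical `Mat2` type and function names), not the MetalRT or Cycles code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2x2 matrix type used only for this illustration.
struct Mat2 {
  float m00, m01, m10, m11;
};

inline Mat2 rotation(const float angle)
{
  const float c = std::cos(angle);
  const float s = std::sin(angle);
  return {c, -s, s, c};
}

// Per-element interpolation: blend every matrix entry independently.
inline Mat2 lerp_elements(const Mat2 &a, const Mat2 &b, const float t)
{
  return {a.m00 + (b.m00 - a.m00) * t,
          a.m01 + (b.m01 - a.m01) * t,
          a.m10 + (b.m10 - a.m10) * t,
          a.m11 + (b.m11 - a.m11) * t};
}

// Decomposed interpolation: blend the rotation angle, then rebuild the matrix.
inline Mat2 lerp_decomposed(const float angle_a, const float angle_b, const float t)
{
  return rotation(angle_a + (angle_b - angle_a) * t);
}

inline float determinant(const Mat2 &m)
{
  return m.m00 * m.m11 - m.m01 * m.m10;
}
```

Halfway through a 180 degree rotation, the per-element blend collapses to a matrix with a determinant close to zero, while the decomposed blend is still a proper rotation.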

Using this interpolation requires macOS 15 and Xcode 16.
On older platforms and compilers the old interpolation is used.

Currently there are no changes for the user by default, and the feature is
only available via the CYCLES_METALRT_PCMI environment variable. This is
because there are some issues with complex motion paths that need to be looked
into. Having the code available makes it easier to do further debugging.

Ref #135659

Authored by Emma Liu

Pull Request: https://projects.blender.org/blender/blender/pulls/136253
2025-04-03 16:24:04 +02:00
Brecht Van Lommel
e394fd191b Refactor: Cycles: Sync various build fixes from the standalone repository
Pull Request: https://projects.blender.org/blender/blender/pulls/136576
2025-03-27 22:07:50 +01:00
Pierre Pontier
178b0cbff9 Cleanup: Fix warnings about comparing int and size_t
Pull Request: https://projects.blender.org/blender/cycles/pulls/24
2025-03-27 22:07:50 +01:00
Brecht Van Lommel
e054dc3eeb Refactor: Cycles: Add concurrent vector, concurrent set from TBB
Pull Request: https://projects.blender.org/blender/blender/pulls/136411
2025-03-24 09:42:25 +01:00
Brecht Van Lommel
07b60c189b Cycles: Perform attribute subdivision on the host side
* Add SubdAttributeInterpolation class for linear attribute interpolation.
* Dicing computes ptex UV and face ID for interpolation.
* Simplify mesh storage of subd primitive counts
* Remove kernel code for subd attribute interpolation
* Remove patch table packing and upload
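
As a rough picture of linear interpolation from a ptex UV, here is a sketch for a single quad face. The `QuadAttribute` type and the corner ordering are assumptions made for this example; they are not taken from `SubdAttributeInterpolation`.

```cpp
#include <cassert>

// Hypothetical quad face storing one float attribute value per corner,
// ordered (0,0), (1,0), (1,1), (0,1) in ptex UV space.
struct QuadAttribute {
  float c00, c10, c11, c01;
};

// Bilinear interpolation of the corner values at ptex coordinate (u, v).
inline float interpolate_linear(const QuadAttribute &q, const float u, const float v)
{
  const float bottom = q.c00 + (q.c10 - q.c00) * u;
  const float top = q.c01 + (q.c11 - q.c01) * u;
  return bottom + (top - bottom) * v;
}
```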

The old optimization adds a fair amount of complexity to the kernel, affecting
performance even when not using the feature. It's also not that useful as it
does not work for UVs that need special interpolation. With this simpler code
it should be easier to make it feature complete.

Pull Request: https://projects.blender.org/blender/blender/pulls/135681
2025-03-11 20:58:07 +01:00
Lukas Stockner
8cb5e05c48 Cleanup: Cycles: Deduplicate kernel attribute code using templating
The attribute handling code in the kernel is currently highly duplicated since
it needs to handle five different data types and we couldn't use templates
back then.
We can now, so might as well make use of it and get rid of ~1000 lines.
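
The kind of deduplication meant here can be sketched as follows. `Float3` and `attribute_interpolate` are hypothetical stand-ins chosen for the example, not the kernel's actual types or functions.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel vector type; not the Cycles definition.
struct Float3 {
  float x, y, z;
};

inline Float3 operator+(const Float3 &a, const Float3 &b)
{
  return {a.x + b.x, a.y + b.y, a.z + b.z};
}

inline Float3 operator-(const Float3 &a, const Float3 &b)
{
  return {a.x - b.x, a.y - b.y, a.z - b.z};
}

inline Float3 operator*(const Float3 &a, const float f)
{
  return {a.x * f, a.y * f, a.z * f};
}

// One template replaces a separate hand-written function per data type.
template<typename T> inline T attribute_interpolate(const T &a, const T &b, const float t)
{
  return a + (b - a) * t;
}
```

The same function body then serves `float`, `Float3`, and any other type that provides the three operators.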

There are also some small fixes for the GPU OSL code:
- Wrong derivative for .w component when converting float2/float3->float4
- Different conversion for float2->float (CPU averages, GPU used to take .x)
- Removed useless code for converting to float2, not used by OSL

Pull Request: https://projects.blender.org/blender/blender/pulls/134694
2025-02-20 19:28:45 +01:00
Bastien Montagne
48e26c3afe MEM_guardedalloc: Refactor to add more type-safety.
The main goal of these changes is to improve static (i.e. build-time)
checks on whether a given data can be allocated and freed with `malloc`
and `free` (C-style), or requires proper C++-style construction and
destruction (`new` and `delete`).

* Add new `MEM_malloc_arrayN_aligned` API.
* Make `MEM_freeN` a template function in C++, which does static assert on
  type triviality.
* Add `MEM_SAFE_DELETE`, similar to `MEM_SAFE_FREE` but calling
  `MEM_delete`.
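
The idea of a free function that rejects non-trivial types at build time can be sketched like this. `checked_free` and `is_freeable` are hypothetical names, and the sketch uses plain `std::free` rather than the guarded allocator, so it only shows the static check and not the real `MEM_freeN`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <type_traits>

// Hypothetical check: only trivial types may be released without a destructor call.
template<typename T> constexpr bool is_freeable()
{
  return std::is_trivial<T>::value;
}

// Hypothetical C-style free wrapper that fails to compile for non-trivial types.
template<typename T> inline void checked_free(T *ptr)
{
  static_assert(std::is_trivial<T>::value,
                "Type needs construction/destruction, use new/delete style APIs");
  std::free(ptr);
}

struct PlainData {
  int count;
  float weight;
};

struct OwnsString {
  std::string name;
};
```

Calling `checked_free` on an `OwnsString *` would be rejected by the compiler, which is the kind of invalid call such a check is meant to surface.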

The changes to `MEM_freeN` were painful but useful, as they already allowed
fixing a bunch of invalid calls in the existing codebase.

It also highlighted a fair amount of places where it is called to free incomplete
type pointers, which is likely a sign of badly designed code (there should
rather be an API to destroy and free these data then, if the data type is not fully
publicly exposed). For now, these are 'worked around' by explicitly casting the
freed pointers to `void *` in these cases - which also makes them easy to search for.
Some of these will be addressed separately (see blender/blender!134765).

Finally, MSVC seems to consider structs defining new/delete operators (e.g. by
using the `MEM_CXX_CLASS_ALLOC_FUNCS` macro) as non-trivial. This does not
seem to follow the definition of type triviality, so for now static type checking in
`MEM_freeN` has been disabled for Windows. We'll likely have to do the same
with type-safe `MEM_[cm]allocN` API being worked on in blender/blender!134771

Based on ideas from Brecht in blender/blender!134452

Pull Request: https://projects.blender.org/blender/blender/pulls/134463
2025-02-20 10:37:10 +01:00
Sergey Sharybin
55e45123f5 Merge branch 'blender-v4.4-release' 2025-02-18 13:12:04 +01:00
Sergey Sharybin
24fbd71a56 Fix #130829: Incorrect render result with light trees
The original report stumbled upon this issue with a more tricky
configuration where light linking is combined with light trees.
However, the actual contributing factor was a mesh with an emission
shader that is not assigned to any triangles. This triggered a
bug in BoundBox::transformed(), which converted non-valid bounds
to valid bounds by performing per-corner growing.
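
How growing from the corners of an empty box can produce a seemingly valid box is easiest to see in one dimension. The `Bounds1D` type and both functions below are hypothetical and reduced to a scale plus offset transform; they are not the `BoundBox` code.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>

// Hypothetical 1D bounds. The default state is "empty": min > max.
struct Bounds1D {
  float min = FLT_MAX;
  float max = -FLT_MAX;

  bool valid() const
  {
    return min <= max;
  }

  void grow(const float p)
  {
    min = std::min(min, p);
    max = std::max(max, p);
  }
};

// Naive version: transform both "corners" and grow a new box from them.
// For empty input the corners are +/-FLT_MAX, so the result looks valid.
inline Bounds1D transformed_naive(const Bounds1D &b, const float scale, const float offset)
{
  Bounds1D result;
  result.grow(b.min * scale + offset);
  result.grow(b.max * scale + offset);
  return result;
}

// Checked version: empty bounds stay empty.
inline Bounds1D transformed_checked(const Bounds1D &b, const float scale, const float offset)
{
  if (!b.valid()) {
    return Bounds1D();
  }
  return transformed_naive(b, scale, offset);
}
```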

Additionally fix incorrect handling of shared nodes which only
worked for leaf nodes. This was due to how the measure was
accumulated: it is possible that add() is called with an empty
measure.

Pull Request: https://projects.blender.org/blender/blender/pulls/134699
2025-02-18 13:11:46 +01:00
Brecht Van Lommel
416d2c6115 Merge branch 'blender-v4.4-release' 2025-02-12 17:26:38 +01:00
Brecht Van Lommel
44e9e3bfb2 Fix: Build error with latest MSVC 2022 2025-02-12 17:18:55 +01:00
Weizhen Huang
b76fbb285e Cycles: Change the integration measure in Huang Hair from gamma to h
To align better with the pixel and reduce the samples needed.

The paper was using gamma because the jacobian |d_gamma/d_h| approaches
infinity at the boundaries, but it seems that clamping at 0.999 is
enough for numerical stability.
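
To see why the boundaries are problematic, assume the common hair parameterization h = sin(gamma), so that |d_gamma/d_h| = 1 / sqrt(1 - h^2). The sketch below clamps h before evaluating that expression; the function name is hypothetical and what exactly gets clamped in the kernel is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumption for this sketch: h = sin(gamma), hence
// |d_gamma/d_h| = 1 / sqrt(1 - h^2), which diverges as |h| approaches 1.
// Clamping |h| to a limit below 1 keeps the value finite.
inline float gamma_jacobian(const float h, const float h_limit = 0.999f)
{
  const float h_clamped = std::min(std::max(h, -h_limit), h_limit);
  return 1.0f / std::sqrt(1.0f - h_clamped * h_clamped);
}
```

With a limit of 0.999 the factor stays below roughly 22.4 instead of growing without bound.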

In practice I did not notice a change in the noise level, but it
simplifies the range computation and renders faster due to reduced
sample amount.

Co-authored-by: Olivier Maury <omaury@meta.com>

Ref: !129616

Pull Request: https://projects.blender.org/blender/blender/pulls/134130
2025-02-10 14:58:26 +01:00
Weizhen Huang
f22bfc46d1 Cleanup: Cycles: use utility struct Interval to improve readability 2025-02-05 18:44:03 +01:00
Brecht Van Lommel
f80f97ca0d Refactor: Cycles: Rename rcp to reciprocal
To avoid symbol conflicts with upcoming HIP changes. Also remove
unused implementations for float4 and float8.

Pull Request: https://projects.blender.org/blender/blender/pulls/134045
2025-02-04 18:59:24 +01:00
Brecht Van Lommel
0e8a7c751a Refactor: Cycles: Simplify util_guarded_mem_alloc/free calls
Pull Request: https://projects.blender.org/blender/blender/pulls/132912
2025-01-29 14:11:47 +01:00
Brecht Van Lommel
612cb61199 Cleanup: Cycles: Use simpler make_float3 for single value 2025-01-13 10:07:37 +01:00
Brecht Van Lommel
2bf6d0fd71 Cleanup: Cycles: Remove unnecessary SSE4.2 CPU kernel
This is the minimum requirement, so just the regular kernel already
includes these instructions if supported by the CPU architecture.
2025-01-13 10:07:37 +01:00
Brecht Van Lommel
c3c05559d6 OpenImageIO: Compatibility with version 3.0
Pull Request: https://projects.blender.org/blender/blender/pulls/132654
2025-01-06 17:21:11 +01:00
Campbell Barton
d2d754be3f Cleanup: spelling in comments (make check_spelling*)
- Back-tick quote math expressions to differentiate them
  from English.
- Use doxygen code blocks for TEX expressions.
2025-01-04 16:26:39 +11:00
Hans Goudey
05603e0a64 Fix: Build error from missing include in Cycles after recent cleanups 2025-01-03 09:54:43 -05:00
Brecht Van Lommel
988c1798ac Refactor: Cycles: Replace new/delete with unique_ptr also for nodes
Using new unique_ptr_vector utility class.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:32 +01:00
Brecht Van Lommel
9971648783 Refactor: Cycles: Replace new/delete by unique_ptr, in simple cases
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:30 +01:00
Brecht Van Lommel
57ff24cb99 Refactor: Cycles: Add const keyword to more function parameters
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:24 +01:00
Brecht Van Lommel
dd51c8660b Refactor: Cycles: Add const keyword where possible, using clang-tidy
Check was misc-const-correctness, combined with readability-isolate-declaration
as suggested by the docs.

Temporarily clang-format "QualifierAlignment: Left" was used to get consistency
with the prevailing order of keywords.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:20 +01:00
Brecht Van Lommel
f2c13cb639 Refactor: Cycles: Work around strange clang-tidy behavior in transform.h
Get rid of somewhat unusual include.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:17 +01:00
Brecht Van Lommel
689633d802 Refactor: Cycles: Avoid unsafe memcpy and memcmp
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:15 +01:00
Brecht Van Lommel
da5251f06c Cleanup: Cycles: Remove unused math_matrix.h
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:13 +01:00
Brecht Van Lommel
60bec183cb Refactor: Cycles: Replace foreach() by range based for loops
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:23:05 +01:00
Brecht Van Lommel
d0c2e68e5f Refactor: Cycles: Automated clang-tidy fixups in Cycles
* Use .empty() and .data()
* Use nullptr instead of 0
* No else after return
* Simple class member initialization
* Add override for virtual methods
* Include C++ instead of C headers
* Remove some unused includes
* Use default constructors
* Always use braces
* Consistent names in definition and declaration
* Change typedef to using
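
For readers unfamiliar with these checks, the sketch below shows what code looks like after several of them are applied. All names are made up for the example and none of it is taken from Cycles.

```cpp
#include <cassert>
#include <string>
#include <vector>

// "Change typedef to using"
using NameList = std::vector<std::string>;

class Node {
 public:
  // "Use default constructors"
  Node() = default;
  virtual ~Node() = default;

  virtual int num_inputs() const
  {
    return 0;
  }
};

class MixNode : public Node {
 public:
  // "Add override for virtual methods"
  int num_inputs() const override
  {
    return 2;
  }

  const Node *parent() const
  {
    return parent_;
  }

 private:
  // "Simple class member initialization" and "Use nullptr instead of 0"
  Node *parent_ = nullptr;
};

// "Use .empty() and .data()", "No else after return", "Always use braces"
inline const std::string *first_name(const NameList &names)
{
  if (names.empty()) {
    return nullptr;
  }
  return names.data();
}
```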

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:55 +01:00
Brecht Van Lommel
4951356ebc Refactor: Cycles: Stop using entire OIIO namespace
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:52 +01:00
Brecht Van Lommel
5c46063607 Refactor: Cycles: Make kernel headers work by themselves
Shuffle around some code and add more includes so that individual
header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:50 +01:00
Brecht Van Lommel
7db0bc2e64 Refactor: Cycles: Make math and type headers work by themselves
Remove separate impl.h headers, shuffle around some code and add more
includes so that individual header files compile without errors.

Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:47 +01:00
Brecht Van Lommel
f53e13411b Refactor: Cycles: Use #pragma once
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:45 +01:00
Brecht Van Lommel
3c2a6fbb9c Refactor: Cycles: Use nullptr instead of NULL
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:43 +01:00
Brecht Van Lommel
4e777476b5 Refactor: Cycles: Replace std::bind by lambdas
Pull Request: https://projects.blender.org/blender/blender/pulls/132361
2025-01-03 10:22:35 +01:00
Brecht Van Lommel
fe368edbb3 Fix: Cycles lite build failure without Pugixml 2024-12-31 00:50:44 +01:00
Thomas Dinges
1be75e86aa Cleanup: replace floatX_to_floatY() with make_floatY()
Now that function overloads are usable on all GPUs, replace the former explicit functions.

Pull Request: https://projects.blender.org/blender/blender/pulls/132067
2024-12-19 09:41:55 +01:00
Thomas Dinges
22e16ca096 Cycles: add make_float4(float3 a, float b) type
This resolves a todo from the code. Part of the Quality Project.

Pull Request: https://projects.blender.org/blender/blender/pulls/131915
2024-12-17 09:11:08 +01:00
Aras Pranckevicius
35d7477371 Cycles: fix accuracy issues in fast_sin/fast_cos/fast_sincos
Most of these originate from OIIO of about 10 years ago. Integrate
the upstream fix from OIIO:
https://github.com/AcademySoftwareFoundation/OpenImageIO/commit/88feb65fc992

Cover them with unit tests. Before the fix, fast_sinf(1.57085085f)
was returning 0.0 instead of 1.0 as expected.

Revert previous hair workaround (a16879a5f0)

Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/131957
2024-12-16 10:05:47 +01:00
Weizhen Huang
27fc091be8 Fix #131723: Cycles volume not sampling channels with zero extinction
The original paper uses the single scattering albedo `sigma_s/sigma_t`
to pick a channel for sampling the scattering distance. However, this
only considers the situation where there is scattering inside the volume.
If some channel has an extinction coefficient of zero, the light passes
through without attenuation for that channel. We assign such channel
with a weight of 1 instead of 0 to make sure it can be sampled.
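
The weighting rule described above can be sketched as follows. The `Spectrum3` type, the function names, and the normalization into a probability are assumptions for the example; only the "weight of 1 for zero extinction" rule comes from the description.

```cpp
#include <cassert>

// Hypothetical RGB container used only for this illustration.
struct Spectrum3 {
  float r, g, b;
};

// Single scattering albedo sigma_s / sigma_t, except that a channel with
// zero extinction gets a weight of 1 so that it can still be sampled.
inline float channel_weight(const float sigma_s, const float sigma_t)
{
  return (sigma_t == 0.0f) ? 1.0f : sigma_s / sigma_t;
}

inline Spectrum3 channel_weights(const Spectrum3 &sigma_s, const Spectrum3 &sigma_t)
{
  return {channel_weight(sigma_s.r, sigma_t.r),
          channel_weight(sigma_s.g, sigma_t.g),
          channel_weight(sigma_s.b, sigma_t.b)};
}

// Normalize the weights into a channel selection probability (assumed step).
inline Spectrum3 channel_pdf(const Spectrum3 &w)
{
  const float sum = w.r + w.g + w.b;
  if (sum == 0.0f) {
    return {1.0f / 3.0f, 1.0f / 3.0f, 1.0f / 3.0f};
  }
  return {w.r / sum, w.g / sum, w.b / sum};
}
```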

Pull Request: https://projects.blender.org/blender/blender/pulls/131741
2024-12-13 10:27:53 +01:00
Weizhen Huang
13fb28581b Refactor: Cycles: Share function between volume scattering and shadowing 2024-12-06 16:23:00 +01:00
Weizhen Huang
e2d7681fe6 Cleanup: Cycles: remove unused ccl_loop_no_unroll
It was added in 6121c28501 to ensure compilation on OpenCL.
Now the definition is empty on all platforms.

Pull Request: https://projects.blender.org/blender/blender/pulls/131100
2024-11-28 16:37:01 +01:00
Falk David
1d571a810f Fix: Cycles: Compiler warning
The `ProjectionTransform` object has no trivial copy-assignment operator.
This results in the following warning on `gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0`:
```
/.../blender-git/blender/intern/cycles/kernel/../util/projection.h: In function ‘ccl::ProjectionTransform ccl::projection_inverse(ProjectionTransform)’:
/.../blender-git/blender/intern/cycles/kernel/../util/projection.h:219:9: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘ccl::ProjectionTransform’ {aka ‘struct ccl::ProjectionTransform’} with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
  219 |   memcpy(&tfmR, R, sizeof(R));
      |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~
/.../blender-git/blender/intern/cycles/kernel/../util/projection.h:67:16: note: ‘ccl::ProjectionTransform’ {aka ‘struct ccl::ProjectionTransform’} declared here
   67 | typedef struct ProjectionTransform {
      |                ^~~~~~~~~~~~~~~~~~~
```
To fix the warning, cast the pointer to `(void *)`.

Pull Request: https://projects.blender.org/blender/blender/pulls/130321
2024-11-15 15:24:49 +01:00
weizhen
43187cf174 Fix: Cycles: Compile error on GPU
Missing function qualifiers. Oversight in 93a34b1077
2024-11-12 15:02:59 +01:00
Weizhen Huang
93a34b1077 Refactor: Cycles: add helper struct Interval
To improve readability

Pull Request: https://projects.blender.org/blender/blender/pulls/130156
2024-11-12 12:06:09 +01:00
Patrick Mours
d0dd587b60 Fix #108372: GPU implementation of OSL matrix intrinsic functions
All the OSL matrix functions had been implemented using the
`Transform` utility of Cycles, but that's built around a 4x3 matrix,
when the OSL matrix functions are working with 4x4 matrices.
This resulted in them not producing results consistent with the
CPU implementation.
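
Why a 4x3 representation cannot stand in for a 4x4 matrix is shown by the sketch below. The types are hypothetical simplifications (three stored rows with an implied last row of 0, 0, 0, 1) and not the actual `Transform` or `ProjectionTransform` definitions.

```cpp
#include <cassert>

struct Vec4 {
  float x, y, z, w;
};

// Full 4x4 matrix.
struct Matrix4x4 {
  float m[4][4];
};

// Affine matrix storing three rows; the last row is implied to be (0, 0, 0, 1).
struct Affine3x4 {
  float m[3][4];
};

inline Vec4 apply(const Matrix4x4 &t, const Vec4 &v)
{
  const float in[4] = {v.x, v.y, v.z, v.w};
  float out[4];
  for (int row = 0; row < 4; row++) {
    out[row] = 0.0f;
    for (int col = 0; col < 4; col++) {
      out[row] += t.m[row][col] * in[col];
    }
  }
  return {out[0], out[1], out[2], out[3]};
}

inline Vec4 apply(const Affine3x4 &t, const Vec4 &v)
{
  const float in[4] = {v.x, v.y, v.z, v.w};
  float out[3];
  for (int row = 0; row < 3; row++) {
    out[row] = 0.0f;
    for (int col = 0; col < 4; col++) {
      out[row] += t.m[row][col] * in[col];
    }
  }
  // Implied last row (0, 0, 0, 1): w passes through unchanged.
  return {out[0], out[1], out[2], v.w};
}

// Dropping the last row loses any projective part of the matrix.
inline Affine3x4 truncate(const Matrix4x4 &t)
{
  Affine3x4 result;
  for (int row = 0; row < 3; row++) {
    for (int col = 0; col < 4; col++) {
      result.m[row][col] = t.m[row][col];
    }
  }
  return result;
}
```

Any matrix whose last row differs from (0, 0, 0, 1) gives a different result once it is squeezed into the three-row form.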

This fixes that by making use of the `ProjectionTransform` utility
of Cycles instead, because it's built around a 4x4 matrix. Since
matrix inversion is required, I had to make a few more utility
functions available on the GPU (except Metal, due to use of
references/pointers without specification) that were previously
CPU-only.

Co-authored-by: Brecht Van Lommel <brecht@blender.org>

Pull Request: https://projects.blender.org/blender/blender/pulls/110102
2024-11-04 17:59:29 +01:00
Jeroen Bakker
a62fa40b58 Merge branch 'blender-v4.3-release' 2024-10-10 11:28:53 +02:00
Anthony Roberts
ef58d4ae26 Windows: Switch to ProcessorNameString for CPU identification on ARM64
This probably should always have been the value used, really.

Now, instead of reporting `Qualcomm Technologies Inc`, it reports the more informative `Snapdragon(R) X Elite - X1E78100 - Qualcomm(R) Oryon(TM) CPU` on a Thinkpad T14s Gen6 device.

Pull Request: https://projects.blender.org/blender/blender/pulls/128808
2024-10-10 10:37:17 +02:00