The 2D->2D, 3D->3D and 4D->4D hashes in the Voronoi node used quite an
expensive hash function. Switch these to dedicated 2D/3D/4D hash
functions (pcg2d, pcg3d, pcg4d). These are still very good quality, but
the hash function itself is 3x-4x faster, which makes the Voronoi node
calculation around 2x faster overall. In some cases when using OSL, the
speedup is even larger.
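For illustration, a minimal sketch of a pcg3d-style 3D->3D integer hash,
following the commonly published PCG3D construction. Whether Blender's
`pcg3d` matches this line for line is an assumption, and `to_unit_floats`
is a hypothetical helper that is not part of the commit.

```python
MASK32 = 0xFFFFFFFF  # Python ints are unbounded, so emulate uint32 wraparound


def pcg3d(x, y, z):
    """Hash three 32-bit integers into three 32-bit integers."""
    # Independent linear congruential step per component.
    x = (x * 1664525 + 1013904223) & MASK32
    y = (y * 1664525 + 1013904223) & MASK32
    z = (z * 1664525 + 1013904223) & MASK32
    # First mixing round: each component absorbs the other two.
    x = (x + y * z) & MASK32
    y = (y + z * x) & MASK32
    z = (z + x * y) & MASK32
    # Fold the high bits into the low bits.
    x ^= x >> 16
    y ^= y >> 16
    z ^= z >> 16
    # Second mixing round.
    x = (x + y * z) & MASK32
    y = (y + z * x) & MASK32
    z = (z + x * y) & MASK32
    return x, y, z


def to_unit_floats(h):
    """Hypothetical helper: map hashed integers to floats in [0, 1)."""
    return tuple(c / 2**32 for c in h)
```

All three outputs come from one pass over the inputs, which is why a
dedicated N-D hash can be cheaper than calling a scalar hash once per
output component.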
However, this visibly changes the output of the Voronoi noise. The
noise "behaves" the same, but anyone who depended on the exact previous
noise pattern will see a different pattern.
Images, more performance results and details wrt OSL are in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/139520
This feature was missing from EEVEE for ages, although it
was in fact very easy to implement.
EEVEE implements the sample override like the default
`Use` value in Cycles: it always overrides the sample
count if not 0. A new option for changing this
behavior, as in Cycles, can be added later while
at the same time making the option more understandable
and moving its value to Blender's DNA.
This PR moves the UI panel to the Blender side to
be shared between Cycles and EEVEE.
Pull Request: https://projects.blender.org/blender/blender/pulls/140219
A degenerate triangle could produce a tangent that is antiparallel to
the normal, causing the mapped normal to be zero, which becomes NaN when
normalized in `object_normal_transform()`. Fixed by falling back to
the unperturbed normal in this case.
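A minimal sketch of the fallback idea, not the actual Cycles code:
`safe_normalize` and its `eps` threshold are hypothetical names chosen
for illustration.

```python
import math


def safe_normalize(v, fallback, eps=1e-12):
    """Normalize v, or return fallback when v has no usable length."""
    length = math.sqrt(sum(c * c for c in v))
    # `not (length > eps)` is also true for NaN, so a NaN input falls back too.
    if not (length > eps):
        return fallback
    return tuple(c / length for c in v)


# A zero mapped normal falls back to the unperturbed normal instead of NaN.
unperturbed = (0.0, 0.0, 1.0)
print(safe_normalize((0.0, 0.0, 0.0), unperturbed))
```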
Fixes an assertion in the attic benchmark scene.
Pull Request: https://projects.blender.org/blender/blender/pulls/140135
This commit implements #125759.
It removes:
* Support for building Blender on big endian systems.
* Support for opening blendfiles written from a big endian system.
It keeps:
* Support to generate thumbnails from big endian blendfiles.
* BE support in `extern` or `intern` libraries, including Cycles.
* Support to open big endian versions of third party file formats:
- PLY files.
- Some image files (cineon, ...).
Pull Request: https://projects.blender.org/blender/blender/pulls/140138
In blender-v4.5 some problematic commits were reverted, but for 5.0 we will
keep the changes and wait for the HIP SDK to be upgraded and hopefully fix
these issues.
Ref blender/blender#139836
Instead of allowing leaks when parsing arguments, always clean up before
calling exit(). This impacts the -a (animation player), --help & --version
arguments, as well as scripts executed via --python, which meant tests
that ran scripts could leak memory without raising an error as intended.
This avoids having to suppress warnings & rationalize in code-comments
when leaking memory is/isn't acceptable. Any leaks from the
animation-player are now reported as well.
This change exposed leaks: !140182, !140116.
Ref !140098
This prevents the use of unaligned data types in
vertex formats. These formats are not supported on many
platforms.
This simplifies the `GPUVertexFormat` class a lot, as
we do not need packing shenanigans anymore and just
compute the vertex stride.
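As a rough sketch of why this is simpler: once every attribute is
required to be aligned, the stride is just the sum of the attribute
sizes. The 4-byte alignment value, the `(name, component_size,
component_count)` tuple layout and the `vertex_stride` name are
assumptions for illustration, not the `GPUVertexFormat` API.

```python
ALIGNMENT = 4  # assumed alignment requirement, in bytes


def vertex_stride(attrs):
    """Sum attribute sizes, rejecting any attribute that is not aligned.

    attrs: iterable of (name, component_size_bytes, component_count).
    """
    stride = 0
    for name, comp_size, comp_count in attrs:
        attr_bytes = comp_size * comp_count
        if attr_bytes % ALIGNMENT != 0:
            raise ValueError(f"attribute {name!r} is {attr_bytes} bytes, not aligned")
        # No padding or packing is needed: offsets stay aligned by construction.
        stride += attr_bytes
    return stride


# 3 x float32 position (12 bytes) + 4 x uint8 color (4 bytes).
print(vertex_stride([("pos", 4, 3), ("color", 1, 4)]))
```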
The old enums are kept for progressive porting of the
backends and user code.
This will break compatibility with python addons.
TODO:
- [x] Deprecation warning for PyGPU (4.5)
- [x] Deprecate matrix attributes
- [x] Error handling for PyGPU (5.0)
- [x] Backends
  - [x] Metal
  - [x] OpenGL
  - [x] Vulkan
Pull Request: https://projects.blender.org/blender/blender/pulls/138846
e934792169 introduced a workaround for NVIDIA drivers failing to
allocate swapchain images for minimized windows. The fix was to clamp
the size to 1,1. With this clamping, the AMD driver seems to tell
Blender that the created swapchain is suboptimal and needs to be
recreated. This results in swapchains being recreated over and over, as
they are all considered suboptimal.
This PR limits the clamping to NVIDIA drivers only.
Pull Request: https://projects.blender.org/blender/blender/pulls/140112
When there is no UV, we call the function `map_to_sphere()` to create
a temporary UV for computing the tangent. A triangle can have vertices
whose u coordinates cross the line where u wraps from 1 to 0. In this
case, just computing the difference of the u coordinates results in the
wrong triangle area.
To fix this problem, we compute distance in toroidal (wrap around)
space.
This is safe for coordinates generated by the `map_to_sphere()` function,
because it is not supposed to map the positions of a triangle to u
coordinates that span more than 0.5.
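A minimal sketch of the wrap-around difference, assuming u lies in
[0, 1) and only u wraps. The names `wrapped_delta` and `uv_signed_area`
are hypothetical and this is not the Cycles implementation.

```python
def wrapped_delta(u0, u1):
    """Shortest signed difference u1 - u0 on a circle of circumference 1."""
    d = u1 - u0
    # Subtracting the nearest integer maps d into [-0.5, 0.5].
    return d - round(d)


def uv_signed_area(uv0, uv1, uv2):
    """Signed UV triangle area, with u differences taken in wrapped space."""
    du1 = wrapped_delta(uv0[0], uv1[0])
    du2 = wrapped_delta(uv0[0], uv2[0])
    dv1 = uv1[1] - uv0[1]
    dv2 = uv2[1] - uv0[1]
    return 0.5 * (du1 * dv2 - du2 * dv1)


# A small triangle straddling the seam where u wraps from 1 to 0.
print(uv_signed_area((0.95, 0.4), (0.05, 0.4), (0.0, 0.5)))
```

With plain subtraction the same triangle would appear to span almost
the whole u range and get the wrong sign. The wrapped form relies on
the stated guarantee that a triangle never spans more than 0.5 in u.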
Pull Request: https://projects.blender.org/blender/blender/pulls/139880
Descriptor sets/pools are known to be troublesome as they don't match
how GPUs work, or how applications want to work, adding more complexity
than needed. This results in quite an overhead when allocating and
deallocating descriptor sets.
This PR will use descriptor buffers when they are available. Most platforms
support descriptor buffers. When not available descriptor pools/sets
will be used.
Although this is a feature, I would like to land it in 4.5 due to the API changes.
This makes it easier to fix issues when 4.5 is released.
The feature can easily be disabled by setting the feature to false if it has
too many problems.
Pull Request: https://projects.blender.org/blender/blender/pulls/138266
This requires a minimum driver version of 535, however most devices
were already requiring 570 due to the CUDA toolkit version.
The update is required to be able to use an API function for correct
stack size calculation.
Code for older API versions has been removed.
Fix #138185: OSL custom camera errors with OptiX
Pull Request: https://projects.blender.org/blender/blender/pulls/139801
This is required to make ray differentials work correctly for OSL custom
cameras.
But it also lets us simplify the implementation, and makes the OSL
functionality more complete, such as implementing all noise types.
Pull Request: https://projects.blender.org/blender/blender/pulls/138161
Keep around the dummy BVH for lights, even if it serves no purpose for now.
Previously I assumed it was not needed, but there is some device specific
code that assumes it exists, and not much point trying to refactor that now
when in the future we actually want to create a BVH for lights.
Pull Request: https://projects.blender.org/blender/blender/pulls/139798
With these changes, we can now mark devices which are expected to work as
performant as possible, and devices which were not optimized for some reason.
For example, because the device was released after the Blender release,
making it impossible for developers to optimize for devices in already
released unchangeable code. This is primarily relevant for the LTS versions,
which are supported for two years and require proper communication about
optimization status for the new devices released during this time.
This is implemented for oneAPI devices. Other device types currently are
marked as optimized for compatibility with old behavior, but may implement
the same in the future.
Pull Request: https://projects.blender.org/blender/blender/pulls/139751
calloc is generally faster than zeroing separately after a regular
allocation. Our allocator API exposed an allocation call with "calloc"
in the name that didn't actually use "calloc" because it had an
alignment argument (there is no standardized calloc-with-alignment
provided by the OS). However, we can still use calloc internally if
the alignment fits within the default. That just aligns the function
better with performance expectations.
Pull Request: https://projects.blender.org/blender/blender/pulls/139749
`xkb_compose_state_get_utf8` may return multiple characters. While
this isn't supported, prevent a buffer overflow & report a warning.
Co-authored-by: Phoenix Katsch <phoenixkatsch@gmail.com>
Ref: !114612
e.g. stands for "exempli gratia" in Latin which means "for example".
The best way to make sure it makes sense when writing is to just expand
it to "for example". In these cases where the text was "for e.g.", that
leaves us with "for for example" which makes no sense. This commit fixes
all 110 cases, mostly just replacing the words with "for example",
but also restructuring the text a bit more in a few cases, mostly by
moving "e.g." to the beginning of a list in parentheses.
Pull Request: https://projects.blender.org/blender/blender/pulls/139596
Several small speedups for Voronoi node (no behavior change). This
affects Cycles and CPU execution of Voronoi node e.g. in Compositor.
- F1 mode: when evaluating distance for Voronoi cells, use a faster
distance estimation, and only do final distance calculation on the
resulting closest cell. This is only really relevant for the default
Euclidean distance, where this saves a square root per evaluated cell
(in 3D Voronoi case saves 26 square roots; in 4D case saves 80 square
roots).
- N-Sphere Radius mode: speedup by doing squared distance calculations.
We only need to find the closest one, so again doing the square root
per cell is not needed here.
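The idea behind both items can be sketched as follows: compare squared
distances while searching, then take a single square root for the
winner. The function name and the plain list of candidate points are
assumptions for illustration, not the node's actual code.

```python
import math


def closest_distance(p, candidates):
    """Euclidean distance from p to its nearest candidate point."""
    best_sq = math.inf
    for q in candidates:
        # Squared distance preserves ordering, so no sqrt is needed per candidate.
        d_sq = sum((a - b) ** 2 for a, b in zip(p, q))
        if d_sq < best_sq:
            best_sq = d_sq
    # One square root for the closest candidate only.
    return math.sqrt(best_sq)


print(closest_distance((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]))
```

In the 3D case the search visits 27 neighboring cells, so deferring the
square root to the end is where the 26 saved square roots come from.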
Something like 5%-10% speedup for F1 3D Voronoi; more performance details
in the PR.
Pull Request: https://projects.blender.org/blender/blender/pulls/139490
This started with investigating a render issue that appears to be caused by
GCC 15. From what I can tell, it was caused by
`*viewplane = (*viewplane) * bcam->zoom;`.
I'm not entirely sure what the root cause is (potentially pointer aliasing?),
but the restructured code works fine now.
Pull Request: https://projects.blender.org/blender/blender/pulls/139416
On some platforms there were validation errors. They were caused by
platforms that returned a different number of swapchain images than were
requested. In that case the semaphores can get out of sync.
The current mechanism isn't future proof, as the maximum number of images
is statically defined.
For this change the present semaphores are also separated from the frames
to better support out of order swapchain images.
Pull Request: https://projects.blender.org/blender/blender/pulls/139446
With OIDN extending its support for new AMD devices, Blender source
code needs to be updated accordingly to reflect these OIDN changes
in the hipSupportsDeviceOIDN function. This function represents
OIDN support on AMD, allowing Blender to know this information
beforehand and avoid unnecessary errors caused by attempts to denoise
on unsupported devices, as happened before the introduction of
the hipSupportsDeviceOIDN function.
Pull Request: https://projects.blender.org/blender/blender/pulls/139413
The GHOST backend didn't use logging. This PR adds initial ghost.vulkan
logging and improves the reporting of logging in Vulkan.
Logging can be enabled with `blender --log "gpu.vulkan,ghost.vulkan" --log-level 2`.
It shows the optional extensions that are enabled and information about swap chain
events.
Pull Request: https://projects.blender.org/blender/blender/pulls/139437
Unlike OpenGL and Metal, this handle is not shared, but rather Cycles
has to take ownership of it. This required a fair amount of refactoring
to ensure the handle is closed, ownership is properly transferred, and
the handle is recreated once when the pixel buffer is modified.