This implements the proposal from #124512. For that it contains the following
changes:
* Remove the global override of `new`/`delete` when `WITH_CXX_GUARDEDALLOC` was
enabled.
* Always use `MEM_CXX_CLASS_ALLOC_FUNCS` where it is currently used. This used
to be guarded by `WITH_CXX_GUARDEDALLOC` in some but not all cases. This means
that a few classes which didn't use our guarded allocator by default before,
are now using it.
Pull Request: https://projects.blender.org/blender/blender/pulls/130181
NOTE: This also required some changes to Cycles code itself, who is now
directly including `BKE_image.hh` instead of declaring a few prototypes
of these functions in its `blender/utils.h` header (due to C++ functions
names mangling, this was not working anymore).
Pull Request: https://projects.blender.org/blender/blender/pulls/130174
This commits fixes a bug where the new macOS Image Editor image copy
paste feature was correctly pasting images from macOS to Blender, but
wrongly copying image from Blender to other software upside-down.
This was fixed by porting the row inversion logic present in
GHOST_SystemCocoa::getClipboardImage to the putClipboardImage function.
Pull Request: https://projects.blender.org/blender/blender/pulls/129938
This commits fixes a bug where the new macOS Image Editor image copy
paste feature was correctly pasting images from macOS to Blender, but
wrongly copying image from Blender to other software upside-down.
This was fixed by porting the row inversion logic present in
GHOST_SystemCocoa::getClipboardImage to the putClipboardImage function.
Pull Request: https://projects.blender.org/blender/blender/pulls/129938
This change makes it so only kernels of the same vendor are compiled in
parallel. For example for the release builds it will be:
1. All CUDA kernels
2. All OptiX kernels
3. All HIP kernels
4. All OneAPI kernels
This potentially leads to a lower CPU utilization, but it makes it much
easier to manage memory usage and tweak per-vendor concurrency.
The goal of this change is to solve occasional out-of-memory during the
GPU kernels compilation step on the CI/CD farm.
This change also includes tweaks to the prallel jobs for HIP-RT and
oneAPI. The tweak is based on measuring apparent memory usage peak on
Linux when doing single-thread compilation, and giving some safe margin
from the available memory on the buildbot.
Pull Request: https://projects.blender.org/blender/blender/pulls/129945
All the OSL matrix functions had been implemented using the
`Transform` utility of Cycles, but that's built around a 4x3 matrix,
when the OSL matrix functions are working with 4x4 matrices.
This resulted in them not producing results consistent with the
CPU implementation.
This fixes that by making use of the `ProjectionTransform` utility
of Cycles instead, because it's built around a 4x4 matrix. Since
matrix inversion is required, I had to make a few more utility
functions available on the GPU (except Metal, due to use of
references/pointers without specification) that were previously
CPU-only.
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/110102
Don't try to use MetalRT by default unless the device explicitly reports that RT is supported. We shouldn't just rely on an assumption that it's supported for M3 and beyond, ad infinitum.
Pull Request: https://projects.blender.org/blender/blender/pulls/129688
6c03339e48 moved from
rtcSetNewGeometryBuffer to rtcSetSharedGeometryBuffer but kept the
additional padding of 1 element in the function call.
It was previously used for over-allocating, to allow 16-byte reads of
all accessed elements, as Embree requires.
With rtcSetSharedGeometryBuffer, this argument led to an out-of-bounds
read as memory was already allocated without padding.
float3 is already 16-bytes so there is no need for padding, hence we
remove it.
We can also note that now, even when using rtcSetSharedGeometryBuffer,
over-allocating is not needed as it's done and functional on Embree side
since v3.6.
Pull Request: https://projects.blender.org/blender/blender/pulls/129643
This commit adds Epoxy as an explicit library requirement for
`intern/opensubdiv`, which uses it in `gl_compute_evaluator.cc`.
This fixes build errors where lite builds that additionally enabled
OpenSubdiv would fail to link due to missing Epoxy symbols. This
problem did not occur in regular release builds due to other CMake
modules adding Epoxy to the library link path in place of OpenSubdiv.
Pull Request: https://projects.blender.org/blender/blender/pulls/129740
This commit adds Epoxy as an explicit library requirement for
`intern/opensubdiv`, which uses it in `gl_compute_evaluator.cc`.
This fixes build errors where lite builds that additionally enabled
OpenSubdiv would fail to link due to missing Epoxy symbols. This
problem did not occur in regular release builds due to other CMake
modules adding Epoxy to the library link path in place of OpenSubdiv.
Pull Request: https://projects.blender.org/blender/blender/pulls/129740
Running Xcode memory graphs and the Instruments tools revealed
memory leaks caused, in the main, by over-retained objects.
This removes the unnecessary 'retains' and adds some asserts
to guard against over-retaining in the future.
There are a few memory leaks remaining involving PyUnicode_DecodeUTF8
but I am unable to identify the cause of these at this time.
Authored by Apple: James McCarthy
Pull Request: https://projects.blender.org/blender/blender/pulls/129117
Similar to 5e46e3d28a.
This commit replaces the C-API version of `OpenSubdiv_Evaluator`
with direct calls to `EvalOutputAPI`. This removes a level of indirection,
theoretically reducing function call overhead, but also making the whole
system easier to understand and easier to modify. The downside is
further spread of `WITH_OPENSUBDIV` into the code, but I think that
can be improved in the future relatively easily once more of this sort
of change is finished.
Pull Request: https://projects.blender.org/blender/blender/pulls/128278
Changing 3D curve properties while viewport rendering was active
resulted in an error, because Cycles would attempt to update the
acceleration structure containing the curves, but that acceleration
structure was built without the
`OPTIX_BUILD_FLAG_ALLOW_UPDATE` flag allowing updates. This
fixes that by adding the flag to all curve build inputs.
Ideally could just use the same flags as for other build inputs and
differentiate between viewport and final rendering (based on
`bvh_type`), but that's not currently an option since the same flags
have to be specified to query the curve intersection module in
`load_kernels()`, where that differentiation is not known. See also
commit 5c6053ccb1.
Pull Request: https://projects.blender.org/blender/blender/pulls/129634
This commit removes support for Vega GPUs from the AMD HIP backend of
Cycles. This is being done as:
- AMD no longer provides official support for Vega GPUs in their
ROCm software.
- Vega GPUs have rendering artifacts on all supported platforms,
and as a result of the reduction of support from AMD, are unlikely
to be fixed. Rendering artifacts include.
- The incorrect shading of volumes (Windows and Linux)
- Missing intersections on many meshes with HIPRT
- Crashing rendering subsurface scattering materials (Linux)
- And more.
Pull Request: https://projects.blender.org/blender/blender/pulls/129523
The `osl_wavelength_color_vf` intrinsic was missing an implementation for
OptiX, causing a link error when attempting to load OSL shaders using the
wavelength node.
Pull Request: https://projects.blender.org/blender/blender/pulls/129372
For now, PointerRNA is made non-trivial by giving explicit default
values to its members.
Besides of BPY python binding code, the change is relatively trivial.
The main change (besides the creation/deletion part) is the replacement
of `memset` by zero-initialized assignment (using `{}`).
makesrna required changes are quite small too.
The big piece of this PR is the refactor of the BPY RNA code.
It essentially brings back allocation and deletion of the BPy_StructRNA,
BPy_Pointer etc. python objects into 'cannonical process', using `__new__`,
and `__init__` callbacks (and there matching CAPI functions).
Existing code was doing very low-level manipulations to create these
data, which is not really easy to understand, and AFAICT incompatible
with handling C++ data that needs to be constructed and destructed.
Unfortunately, similar change in destruction code (using `__del__` and
matching `tp_finalize` CAPI callback) is not possible, because of technical
low-level implementation details in CPython (see [1] for details).
`std::optional` pointer management is used to encapsulate PointerRNA
data. This allows to keep control on _when_ actual RNA creation is done,
and to have a safe destruction in `tp_dealloc` callbacks.
Note that a critical change in Blender's Python API will be that classes
inherinting from `bpy_struct` etc. will now have to properly call the
base class `__new__` and/or `__init__`if they define them.
Implements #122431.
[1] https://discuss.python.org/t/cpython-usage-of-tp-finalize-in-c-defined-static-types-with-no-custom-tp-dealloc/64100
In volume segment, the minimal angle formed by the emitter bounding cone
axis and the vector pointing from the cluster centroid to any point on
the ray is computed via `dot(bcone.axis, point_to_centroid)`, see Fig.8.
in paper.
For distant light this angle is 0, but due to numerical issues this is
not always true. Therefore explicitly assign `-bcone.axis` to
`point_to_centroid` in this case.
Pull Request: https://projects.blender.org/blender/blender/pulls/129489
Currently the number of shadow paths is multiplied by the ratio of
0.5f which would half the number of paths. However, the index can
never be smaller than the number of paths so the shadow paths will
always be compacted.
Pull Request: https://projects.blender.org/blender/blender/pulls/125048