* Rename "texture" to "data array". This has not used textures for a long time,
there are just global memory arrays now. (On old CUDA GPUs there was a cache
for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
these are in a struct.
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.
Ref D14069, T95597
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
There are two things achieved by this change:
- No possible downcast of size_t to int when calculating motion steps.
- Disambiguate call to `min()` which was for some reason considered
ambiguous on 32bit platforms `min(int, unsigned int)`.
- Do the same for the `max()` call to keep them symmetrical.
On an implementation side the `min()` is defined for a fixed width
integer type to disambiguate uint from size_t on 32bit platforms,
and yet be able to use it for 32bit operands on 64bit platforms without
upcast.
This ended up in a bit bigger change as the conditional compile-in of
functions is easiest if the functions is templated. Making the functions
templated required to remove the other source of ambiguity which is
`algorithm.h` which was pulling min/max from std.
Now it is the `math.h` which is the source of truth for min/max.
It was only one place which was relying on `algorithm.h` for these
functions, hence the choice of `math.h` as the safest and least
intrusive.
Fixes 32bit platforms (such as i386) in Debian package build system.
Differential Revision: https://developer.blender.org/D14062
There are two things achieved by this change:
- No possible downcast of size_t to int when calculating motion steps.
- Disambiguate call to min() which was for some reason considered
ambiguous on 32bit platforms `min(int, unsigned int)`.
On an implementation side the `min()` is defined for a fixed width
integer type to disambiguate uint from size_t on 32bit platforms,
and yet be able to use it for 32bit operands on 64bit platforms without
upcast.
Fixes 32bit platforms (such as i386) in Debian package build system.
Differential Revision: https://developer.blender.org/D13992
Make the Embree RTC_SCENE_FLAG_COMPACT flag optional and enabled per default.
Disabling it makes CPU rendering a bit faster in some scenes at the cost of a higher memory usage.
Barbershop renders about 3% faster, victor about 4% on CPU with compact BVH disabled.
Differential Revision: https://developer.blender.org/D13592
This add support for rendering of the point cloud object in Blender, as a native
geometry type in Cycles that is more memory and time efficient than instancing
sphere meshes. This can be useful for rendering sand, water splashes, particles,
motion graphics, etc.
Points are currently always rendered as spheres, with backface culling. More
shapes are likely to be added later, but this is the most important one and can
be customized with shaders.
For CPU rendering the Embree primitive is used, for GPU there is our own
intersection code. Motion blur is suppored. Volumes inside points are not
currently supported.
Implemented with help from:
* Kévin Dietrich: Alembic procedural integration
* Patrick Mourse: OptiX integration
* Josh Whelchel: update for cycles-x changes
Ref T92573
Differential Revision: https://developer.blender.org/D9887
This fixes crash T94022 when selecting live viewport render with both GPU & CPU devices selected. It is caused by incorrect `KernelBVHLayout` assignment. Similar to `BVH_LAYOUT_MULTI_OPTIX` for Optix, this patch adds a `BVH_LAYOUT_MULTI_METAL` to correctly redirect to the correct Metal BVH layout type.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D13561
This patch adds the Metal host-side code:
- Add all core host-side Metal backend files (device_impl, queue, etc)
- Add MetalRT BVH setup files
- Integrate with Cycles device enumeration code
- Revive `path_source_replace_includes` in util/path (required for MSL compilation)
This patch also includes a couple of small kernel-side fixes:
- Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/)
- include "work_stealing.h" inside the Metal context class because it accesses state now
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13423
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:
- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.
Ref T92212
Reviewed By: brecht
Maniphest Tasks: T92212
Differential Revision: https://developer.blender.org/D13353
Happens when device runs out of memory and Cycles is moving some
textures to the host memory.
The delayed memory free for OptiX BVH was moving data from one
device_memory to another, leaving the original device memory in
an invalid state. This was ruining the allocation map in the CUDA
device which is using pointer to the device_memory.
This change makes it so the memory pointer is stolen from BVH
into the delayed memory free list.
Additionally, forbid copying and moving instances of device_memory
and added sanity checks in the device implementation.
Differential Revision: https://developer.blender.org/D13316
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
* Split render/ into scene/ and session/. The scene/ folder now contains the
scene and its nodes. The session/ folder contains the render session and
associated data structures like drivers and render buffers.
* Move top level kernel headers into new folders kernel/camera/, kernel/film/,
kernel/light/, kernel/sample/, kernel/util/
* Move integrator related kernel headers into kernel/integrator/
* Move OSL shaders from kernel/shaders/ to kernel/osl/shaders/
For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
These transparent shadows can be expansive to evaluate. Especially on the
GPU they can lead to poor occupancy when only some pixels require many kernel
launches to trace and evaluate many layers of transparency.
Baked transparency allows tracing a single ray in many cases by accumulating
the throughput directly in the intersection program without recording hits
or evaluating shaders. Transparency is baked at curve vertices and
interpolated, for most shaders this will look practically the same as actual
shader evaluation.
Fixes T91428, performance regression with spring demo file due to transparent
hair, and makes it render significantly faster than Blender 2.93.
Differential Revision: https://developer.blender.org/D12880
* Rename struct KernelGlobals to struct KernelGlobalsCPU
* Add KernelGlobals, IntegratorState and ConstIntegratorState typedefs
that every device can define in its own way.
* Remove INTEGRATOR_STATE_ARGS and INTEGRATOR_STATE_PASS macros and
replace with these new typedefs.
* Add explicit state argument to INTEGRATOR_STATE and similar macros
In preparation for decoupling main and shadow paths.
Differential Revision: https://developer.blender.org/D12888
Previously the storage here was optimized to avoid indirections in BVH2
traversal. This helps improve performance a bit, but makes performance
and memory usage of Embree and OptiX BVHs a bit worse also. It also adds
code complexity in other parts of the code.
Now decouple triangle and curve primitive storage from BVH2.
* Reduced peak memory usage on all devices
* Bit better performance for OptiX and Embree
* Bit worse performance for CUDA
* Simplified code:
** Intersection.prim/object now matches ShaderData.prim/object
** No more offset manipulation for mesh displacement before a BVH is built
** Remove primitive packing code and flags for Embree and OptiX
** Curve segments are now stored in a KernelCurve struct
* Also happens to fix a bug in baking with incorrect prim/object
Fixes T91968, T91770, T91902
Differential Revision: https://developer.blender.org/D12766
It's unclear why this code was added in the first place, but it seems
unnecessary, it can be restored if we find this breaks something.
The Embree docs mention that the same primitive may be hit multiple times, but
my understanding is that about e.g. curves where both the frontside and backside
may be hit. However those hits would be at different distances.
The context for this change is that we want to add an optimization where we
can immediately update throughput for transparent shadows instead of recording
intersections, and avoid duplicate would require extra work. However there is
an Embree example that does something similar without worrying about duplicate
hits either.
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycleshttps://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if
that is not possible anymore because out-of-memory (since OptiX acceleration structures may
not be allocated in host memory). It also fixes high peak memory usage during OptiX
acceleration structure building.
Reviewed By: brecht
Maniphest Tasks: T85985
Differential Revision: https://developer.blender.org/D10535
Changing the geometry in the current scene caused the primitive offsets for all geometry to
change, but the values would not be updated in all bottom-level BVH structures. Rendering
artifacts and crashes where the result. This fixes that by ensuring all BVH structures are
updated when the primitive offsets change.
Adds support for building multiple BVH types in order to support using both CPU and OptiX
devices for rendering simultaneously. Primitive packing for Embree and OptiX is now
standalone, so it only needs to be run once and can be shared between the two. Additionally,
BVH building was made a device call, so that each device backend can decide how to
perform the building. The multi-device for instance creates a special multi-BVH that holds
references to several sub-BVHs, one for each sub-device.
Reviewed By: brecht, kevindietrich
Differential Revision: https://developer.blender.org/D9718
This encapsulates Node socket members behind a set of specific methods;
as such it is no longer possible to directly access Node class members
from exporters and parts of Cycles.
The methods are defined via the NODE_SOCKET_API macros in `graph/
node.h`, and are for getting or setting a specific socket's value, as
well as querying or modifying the state of its update flag.
The setters will check whether the value has changed and tag the socket
as modified appropriately. This will let us know how a Node has changed
and what to update, which is the first concrete step toward a more
granular scene update system.
Since the setters will tag the Node sockets as modified when passed
different data, this patch also removes the various modified methods
on Nodes in favor of Node::is_modified which checks the sockets'
update flags status.
Reviewed By: brecht
Maniphest Tasks: T79174
Differential Revision: https://developer.blender.org/D8544
This avoids recomputing the BVH for geometries that do not have changes in topology but whose vertices are modified (like a simple character animation), and gives up to 40% speedup for BVH building.
This is only available for viewport renders at the moment.
Reviewed By: pmoursnv, brecht
Differential Revision: https://developer.blender.org/D9353