NanoVDB is a platform-independent sparse volume data structure that makes it possible to
use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles,
replacing the previous usage of dense 3D textures.
Since it has a big impact on memory usage and performance, and also changes the OpenVDB
branch used for the rest of Blender, this is not enabled by default yet. That will
happen only after 2.82 has been branched off. To enable it, build both the dependencies and
Blender itself with the "WITH_NANOVDB" CMake option.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8794
The current way of setting the compute device makes sense for local
use, but for headless rendering it is a massive pain to get Cycles
to use the correct device, usually involving entire Python scripts.
Therefore, this patch adds a simple command-line option to Blender
for specifying the type of device that should be used. If the option
is present, the settings in the user preferences and the scene are
ignored, and instead all devices matching the specified type are used.
Differential Revision: https://developer.blender.org/D9086
Gathers information about the time spent in the various managers or objects (Film, Camera, etc.) being updated in Scene::device_update.
The stats include the total time spent in the device_update methods as well as time spent in subroutines (e.g. bvh build, displacement, etc.).
This does not qualify as a full-blown profiler, but is useful for identifying potential bottleneck areas.
The stats can be enabled and printed by passing `--cycles-print-stats` on the command line to Cycles, or `-- --cycles-print-stats` to Blender.
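As an illustration of the kind of bookkeeping described above, here is a minimal sketch of a named-entry time accumulator with a scoped timer. The names `UpdateTimeStats` and `ScopedUpdateTimer` are hypothetical and chosen for this example only; this is not the Cycles implementation.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container: one (name, seconds) entry per timed update step.
struct UpdateTimeStats {
  std::vector<std::pair<std::string, double>> entries;

  void add_entry(const std::string &name, double seconds)
  {
    assert(seconds >= 0.0);
    entries.emplace_back(name, seconds);
  }

  // Sum of all recorded times, in seconds.
  double total() const
  {
    double sum = 0.0;
    for (const auto &entry : entries) {
      sum += entry.second;
    }
    return sum;
  }
};

// Hypothetical RAII helper: records the elapsed time when it goes out of scope.
class ScopedUpdateTimer {
 public:
  ScopedUpdateTimer(UpdateTimeStats &stats, std::string name)
      : stats_(stats), name_(std::move(name)), start_(std::chrono::steady_clock::now())
  {
  }

  ~ScopedUpdateTimer()
  {
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start_;
    stats_.add_entry(name_, elapsed.count());
  }

 private:
  UpdateTimeStats &stats_;
  std::string name_;
  std::chrono::steady_clock::time_point start_;
};
```

Placing a `ScopedUpdateTimer` at the top of an update function or subroutine would add one entry per call, which is enough to spot the slowest steps without a full profiler.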
Reviewed By: brecht
Maniphest Tasks: T79174
Differential Revision: https://developer.blender.org/D8596
On platforms where `unsigned long` is 32 bits wide (such as MSVC),
1ul << n is still a 32 bit integer regardless of the value of n.
Given that the target here is 64 bits, the upper 32 bits will
always be zero. Using 1ull will yield the expected result.
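A minimal sketch of the portable form of this shift follows. The helper name `bit_mask_64` is hypothetical; the point is only that the left operand must already be 64 bits wide before shifting.

```cpp
#include <cassert>
#include <cstdint>

// `1ull` is an unsigned long long, which is at least 64 bits wide everywhere.
static_assert(sizeof(1ull << 1) >= 8, "shift result must be at least 64 bits wide");

// Hypothetical helper: returns a 64-bit value with only bit n set.
// Shifting by 64 or more would be undefined behavior, hence the assert.
inline std::uint64_t bit_mask_64(unsigned n)
{
  assert(n < 64);
  return std::uint64_t{1} << n;
}
```

Using `std::uint64_t{1}` instead of a literal suffix makes the intended width explicit and independent of the platform's `long` size.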
This adds a new option to simplify volumes in the viewport.
The setting can be found in the Simplify panel in the render properties.
Volume objects use OpenVDB grids, which are sparse. For rendering,
we have to convert sparse grids to dense grids (for now). Those require
significantly more memory. Therefore, it's often a good idea to reduce
the resolution of volumes in the viewport.
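To make the memory argument concrete, here is a small sketch of the arithmetic for a dense grid. It assumes one value of `bytes_per_voxel` bytes per voxel and a resolution that is reduced uniformly on every axis; the function name is hypothetical, and how the Simplify setting maps to a resolution is not covered here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: memory needed by a dense grid of the given resolution.
inline std::uint64_t dense_grid_bytes(std::uint64_t res_x,
                                      std::uint64_t res_y,
                                      std::uint64_t res_z,
                                      std::uint64_t bytes_per_voxel)
{
  assert(bytes_per_voxel > 0);
  return res_x * res_y * res_z * bytes_per_voxel;
}
```

Under these assumptions, a 256x256x256 grid of 4-byte values takes 64 MiB, while halving the resolution on each axis to 128x128x128 takes 8 MiB, an eighth of the memory.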
Reviewers: brecht
Differential Revision: https://developer.blender.org/D9040
Ref T73201.
Running Blender on Ampere cards was already possible with ptx; this fix is
needed to support building CUDA binaries.
Note that the CUDA version used for official Blender builds is still 10. This
change merely makes it possible for those using CUDA 11 to specify sm_8x
kernels and have them compiled.
Found by Milan Jaros.
At the user level, this fixes a dead-lock of OpenCL rendering on Intel Iris GPUs.
Note that this patch does not include a change to the logic which allows
or disallows OpenCL platforms to be used; that will happen after the
kernel fix is known to be fine for the currently officially supported
platforms.
The dead-lock was caused by incorrect use of memory barriers: as per the
OpenCL specification, a barrier must be executed by all work-items in the
work-group. This means that the following code is invalid:

  void foo() {
    if (some_condition) {
      return;
    }
    barrier(CLK_LOCAL_MEM_FENCE);
  }

  void bar() {
    foo();
  }

The Cycles code described this as invalid only on the CPU, while in
fact it is invalid as per the specification. On the implementation side,
this change removes the ifdefs around the CPU-only barrier logic and
implements similar logic in the shader setup kernel.
Tested on NUC8i7HVK NUC.
The root cause of the dead-lock was identified by Max Dmitrichenko.
There is no measurable difference in performance of currently supported
OpenCL platforms.
Differential Revision: https://developer.blender.org/D9039
`SOCKET_OFFSETOF` was added in the initial commit {rBec51175f1fd6c91d5}
when `offsetof` [1] was not supported well enough. GCC and LLVM
have supported it since C++17.
Two other changes: the type and size checks can also be done without
creating an invalid address.
[1] https://cppreference.com/w/cpp/types/offsetof
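As a rough illustration, the sketch below shows how an offset, a type check and a size check can all be written without forming a pointer to a non-existent object. `ExampleNode` and `strength_offset` are hypothetical names for this example and do not reflect the actual Cycles macros.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical node type standing in for a real Cycles node.
struct ExampleNode {
  int samples;
  float strength;
};

// Type check: decltype on the member name works in an unevaluated context.
static_assert(std::is_same<decltype(ExampleNode::strength), float>::value,
              "socket type must match the member type");

// Size check: sizeof on the member name needs no object either.
static_assert(sizeof(ExampleNode::strength) == sizeof(float),
              "socket size must match the member size");

// Offset of the member, obtained through the standard macro.
inline std::size_t strength_offset()
{
  return offsetof(ExampleNode, strength);
}
```

None of these expressions dereference or construct an address from a null or dummy pointer, which is the pattern that hand-written offset macros typically rely on.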
Reviewed By: campbellbarton, brecht
Maniphest Tasks: T81100
Differential Revision: https://developer.blender.org/D9042
These flags are meant for detecting which sockets have changed, so that in
the future we can have more granular updates.
`Node` now stores an `update_flags` member which is modified every time
a socket is changed through `Node::set`. The flags are OR-able bits
stored in `SocketType` instances. Each `SocketType` stores a unique bit
out of the 64 bits of a `uint64_t`; the bit position
corresponds to the index of the socket in the `Node`'s sockets array +
1, so the socket at index 5 will have the 6th bit set as its flag. This
limits us to 64 sockets per `Node`, which should be plenty for the current
set of `Node`s that we have.
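The flag scheme described above can be sketched as follows. `ExampleNode`, `socket_update_flag` and the member functions are hypothetical names used for illustration; only `update_flags` is taken from the description, and the real `Node` and `SocketType` classes may differ.

```cpp
#include <cassert>
#include <cstdint>

// Flag for the socket at the given index: index 5 gives the 6th bit (value 32).
inline std::uint64_t socket_update_flag(int socket_index)
{
  assert(socket_index >= 0 && socket_index < 64);
  return std::uint64_t{1} << socket_index;
}

// Hypothetical stand-in for a node that tracks which sockets were modified.
struct ExampleNode {
  std::uint64_t update_flags = 0;

  // Called whenever a socket value is set.
  void mark_socket_modified(int socket_index)
  {
    update_flags |= socket_update_flag(socket_index);
  }

  bool socket_is_modified(int socket_index) const
  {
    return (update_flags & socket_update_flag(socket_index)) != 0;
  }

  // Called once the pending changes have been handled.
  void clear_modified()
  {
    update_flags = 0;
  }
};
```

Because each socket owns one bit, several changes can be combined with a bitwise OR and tested individually later.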
This does not change the behavior of other parts of Cycles.
This is part of T79131.
Reviewed By: brecht
Maniphest Tasks: T79131
Differential Revision: https://developer.blender.org/D8644
Ref {D8855}
Unix and Apple platform files use find_package(OpenSubdiv), which sets
`OPENSUBDIV_INCLUDE_DIR` as an advanced variable, as well as
`OPENSUBDIV_INCLUDE_DIRS`, which is the one that should normally be used.
Windows sets `OPENSUBDIV_INCLUDE_DIR`, which is used by the rest
of the code.
This patch renames it to `_DIRS` everywhere, for consistency with other
similar variables.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D8917
Makes it easier to access timing information about how long Cycles
spent synchronizing objects from the evaluated depsgraph on the Blender
side to its own structures.
This timer does not include time spent evaluating render depsgraph.
The ffmpeg, guardedalloc and blenlib tests are quite isolated, and putting them in
their own executables separate from blender_test is faster for development than
linking the entire blender_test executable.
For Cycles, this also bundles all the unit tests into one executable.
Ref T79958
Differential Revision: https://developer.blender.org/D8714
Since this buffer is used as an array of 12 32-bit integers, and C++
`string` expects a NULL-terminated C-string, we need an extra char to
ensure the last one is always NULL.
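A minimal sketch of the idea, assuming the 12 integers hold 48 bytes of character data with no guaranteed terminator. The function name `string_from_words` is hypothetical and not the function touched by this change.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: converts 12 32-bit words (48 bytes) of character data
// into a std::string. The buffer has one extra byte so it is always terminated.
inline std::string string_from_words(const std::uint32_t (&words)[12])
{
  char buf[sizeof(words) + 1];
  std::memcpy(buf, words, sizeof(words));
  buf[sizeof(words)] = '\0';  // Guarantees termination even if all 48 bytes are used.
  return std::string(buf);
}
```

Without the extra byte, constructing a `std::string` from a fully used buffer would read past its end while searching for the terminator.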
See D8906. Found while investigating T80657.
Commit 009971ba7a changed it so Cycles creates a separate
Embree device for each Cycles device, but missed the multi-device case. A multi-device with
an Embree BVH can occur when CPU rendering is used with OptiX denoising. BVH creation then
failed to get a valid pointer to the Embree device and crashed. This fixes that by providing the
correct device pointer in the multi-device case as well.
Caused by f04260d8c6.
Cycles' CMake defines macros with the same names as Blender's, which
override the Blender ones. There is, however, a small difference in the
re-defined `remove_cc_flag()`: the Cycles version only takes one flag at
a time. So I guess Blender's calls to it would only result in the first
flag being removed.
Of course Cycles shouldn't override any Blender macros, but I'll leave
that up to Brecht to fix properly.
* Support precompiled libraries on Linux
* Add license headers
* Refactoring to deduplicate code
Includes work by Ray Molenkamp and Grische for precompiled libraries.
Ref D8769
The current 1D Voronoi implementation for the Distance to Edge option
computes the distance to the cells instead. This patch fixes that and
computes the distance to the edge.
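For illustration, in 1D the edges of a Voronoi cell are the midpoints between its feature point and the neighbouring feature points. The sketch below assumes the three point positions are passed in explicitly instead of being derived from a hash; the function name is hypothetical and this is not the Cycles kernel code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: distance from position w to the nearest edge of the
// 1D Voronoi cell around mid_point, given its left and right neighbours.
inline float distance_to_edge_1d(float w, float left_point, float mid_point, float right_point)
{
  assert(left_point <= mid_point && mid_point <= right_point);
  // Cell edges lie halfway between neighbouring feature points.
  const float left_edge = (left_point + mid_point) / 2.0f;
  const float right_edge = (mid_point + right_point) / 2.0f;
  return std::min(std::fabs(left_edge - w), std::fabs(right_edge - w));
}
```

For points at -0.5, 0.5 and 1.5, the edges are at 0.0 and 1.0, so a sample at 0.25 is 0.25 away from the nearest edge, whereas its distance to the nearest point is also 0.25 only by coincidence of this symmetric layout.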
Reviewed By: JacquesLucke, brecht
Differential Revision: https://developer.blender.org/D8634
This change enables the developer option `WITH_CYCLES_NATIVE_ONLY`
for MSVC. This allows a developer to build just the Cycles
CPU kernel for their specific system rather than all kernels,
speeding up development.
Other platforms have had this option for years, but MSVC lacks
the compiler switch to target the host architecture, so it
always builds all kernels.
This change uses a small helper program to detect the required
flags.
Only AVX/AVX2 are tested, for the following reasons:
- SSE2 is enabled by default and requires no flags
- SSE3/4 have no specific build flags for MSVC
- AVX512 is not yet supported by Cycles
Differential Revision: https://developer.blender.org/D8775
Reviewed by: brecht, sergey
Before, Cycles was using a shared Embree device across all instances.
This could result in crashes when viewport rendering and material
preview were using Cycles simultaneously.
Fixes issue T80042
Maniphest Tasks: T80042
Differential Revision: https://developer.blender.org/D8772
Problem: the Blender synchronization process creates and tags nodes for usage. It does
this by directly adding and removing nodes from the scene data. If some node is not tagged
as used at the end of a synchronization, it then deletes the node from the scene. This poses
a problem when it comes to supporting procedural nodes, which can create other nodes not known
to the Blender synchronization system, which would then remove them.
Nodes now have a NodeOwner, which is set after creation. Those owners for now are the Scene
for scene level nodes and ShaderGraph for shader nodes. Instead of creating and deleting
nodes using `new` and `delete` explicitly, we now use the `create_node` and `delete_node` methods
found on the owners. `delete_node` will assert that the owner is the right one.
Whenever a scene level node is created or deleted, the appropriate node manager is tagged for
an update, freeing BlenderSync and other software exporters from this responsibility.
Concerning BlenderSync, the `id_maps` do not explicitly manipulate scene data anymore; they
only keep track of which nodes are used, employing the scene to create and delete them. To
achieve this, the ParticleSystem is now a Node, although it does not have any sockets.
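The ownership scheme described above can be sketched roughly as follows. All `Example*` types are hypothetical stand-ins; only the names `create_node` and `delete_node` come from the description, and their real signatures may differ. A single `needs_update` flag stands in for tagging the appropriate node manager.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct ExampleOwner {
  virtual ~ExampleOwner() = default;
};

struct ExampleNode {
  const ExampleOwner *owner = nullptr;  // Set by the owner right after creation.
  virtual ~ExampleNode() = default;
};

struct ExampleLight : ExampleNode {};

// Hypothetical scene that owns its nodes and tags itself for update on changes.
class ExampleScene : public ExampleOwner {
 public:
  template<typename T> T *create_node()
  {
    auto node = std::make_unique<T>();
    node->owner = this;
    T *raw = node.get();
    nodes_.push_back(std::move(node));
    needs_update_ = true;
    return raw;
  }

  void delete_node(ExampleNode *node)
  {
    assert(node != nullptr && node->owner == this);  // Only the owner may delete.
    auto it = std::find_if(nodes_.begin(), nodes_.end(), [node](const auto &n) {
      return n.get() == node;
    });
    if (it != nodes_.end()) {
      nodes_.erase(it);
      needs_update_ = true;
    }
  }

  std::size_t node_count() const { return nodes_.size(); }
  bool needs_update() const { return needs_update_; }
  void clear_update() { needs_update_ = false; }

 private:
  std::vector<std::unique_ptr<ExampleNode>> nodes_;
  bool needs_update_ = false;
};
```

Routing creation and deletion through the owner means callers no longer have to remember to tag anything for an update themselves.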
This is part of T79131.
Reviewed By: #cycles, brecht
Maniphest Tasks: T79131
Differential Revision: https://developer.blender.org/D8540
We forgot to update this code as part of D3108. I'd like to include this in 2.90;
it's entirely broken now, so it can't really get any worse.
Differential Revision: https://developer.blender.org/D8738