Commit Graph

113 Commits

Author SHA1 Message Date
Campbell Barton
c12994612b License headers: use SPDX-FileCopyrightText in intern/cycles 2023-06-14 16:53:23 +10:00
Xavier Hallade
23de320878 Cycles: fix multi-device rendering with oneAPI and Hardware Raytracing
Only Embree CPU BVH was built in the multi-device case. However, one
Embree GPU BVH is needed per GPU, so we now reuse the same logic as in
the other backends.

Pull Request: https://projects.blender.org/blender/blender/pulls/107992
2023-05-22 15:26:58 +02:00
Sahar A. Kashi
557a245dd5 Cycles: add HIP RT device, for AMD hardware ray tracing on Windows
HIP RT enables AMD hardware ray tracing on RDNA2 and above, and falls back to a
to shader implementation for older graphics cards. It offers an average 25%
sample rendering rate improvement in Cycles benchmarks, on a W6800 card.

The ray tracing feature functions are accessed through HIP RT SDK, available on
GPUOpen. HIP RT traversal functionality is pre-compiled in bitcode format and
shipped with the SDK.

This is not yet enabled as there are issues to be resolved, but landing the
code now makes testing and further changes easier.

Known limitations:
* Not working yet with current public AMD drivers.
* Visual artifact in motion blur.
* One of the buffers allocated for traversal has a static size. Allocating it
  dynamically would reduce memory usage.
* This is for Windows only currently, no Linux support.

Co-authored-by: Brecht Van Lommel <brecht@blender.org>

Ref #105538
2023-04-25 20:19:43 +02:00
Brecht Van Lommel
9cfc7967dd Cycles: use SPDX license headers
* Replace license text in headers with SPDX identifiers.
* Remove specific license info from outdated readme.txt, instead leave details
  to the source files.
* Add list of SPDX license identifiers used, and corresponding license texts.
* Update copyright dates while we're at it.

Ref D14069, T95597
2022-02-11 17:47:34 +01:00
Michael Jones
e688c927eb Fix T94022: Both options GPU/CPU checked under preferences cause viewport render crash. (ARM/Metal)
This fixes crash T94022 when selecting live viewport render with both GPU & CPU devices selected. It is caused by incorrect `KernelBVHLayout` assignment. Similar to `BVH_LAYOUT_MULTI_OPTIX` for Optix, this patch adds a `BVH_LAYOUT_MULTI_METAL` to correctly redirect to the correct Metal BVH layout type.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D13561
2021-12-13 22:34:48 +00:00
Michael Jones
9558fa5196 Cycles: Metal host-side code
This patch adds the Metal host-side code:

- Add all core host-side Metal backend files (device_impl, queue, etc)
- Add MetalRT BVH setup files
- Integrate with Cycles device enumeration code
- Revive `path_source_replace_includes` in util/path (required for MSL compilation)

This patch also includes a couple of small kernel-side fixes:

- Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/)
- include "work_stealing.h" inside the Metal context class because it accesses state now

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13423
2021-12-07 15:52:21 +00:00
Michael Jones
f613c4c095 Cycles: MetalRT support (kernel side)
This patch adds MetalRT support to Cycles kernel code. It is mostly additive in nature or confined to Metal-specific code, however there are a few areas where this interacts with other code:

- MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have `__KERNEL_GPU_RAYTRACING__` instead of `__KERNEL_OPTIX__`.
- MetalRT doesn't support primitive offsetting (as with `primitiveIndexOffset` in Optix), so we define and populate a new kernel texture, `__object_prim_offset`, containing per-object primitive / curve-segment offsets. This is referenced and applied in MetalRT intersection handlers.
- Two new BVH layout enum values have been added: `BVH_LAYOUT_METAL` and `BVH_LAYOUT_MULTI_METAL_EMBREE` for XPU mode). Some host-side enum case handling has been updated where it is trivial to do so.

Ref T92212

Reviewed By: brecht

Maniphest Tasks: T92212

Differential Revision: https://developer.blender.org/D13353
2021-11-29 15:20:26 +00:00
Brecht Van Lommel
fd25e883e2 Cycles: remove prefix from source code file names
Remove prefix of filenames that is the same as the folder name. This used
to help when #includes were using individual files, but now they are always
relative to the cycles root directory and so the prefixes are redundant.

For patches and branches, git merge and rebase should be able to detect the
renames and move over code to the right file.
2021-10-26 15:37:04 +02:00
Brecht Van Lommel
8119f0aad2 Cycles: refactor intrinsic functions implementation
* Add processor independent fallbacks
* Use uint32_t and uint64_t types
* Remove unused functions
* Better comments and less indentation

Ref D8237, T78710
2021-02-17 16:26:24 +01:00
Patrick Mours
bfb6fce659 Cycles: Add CPU+GPU rendering support with OptiX
Adds support for building multiple BVH types in order to support using both CPU and OptiX
devices for rendering simultaneously. Primitive packing for Embree and OptiX is now
standalone, so it only needs to be run once and can be shared between the two. Additionally,
BVH building was made a device call, so that each device backend can decide how to
perform the building. The multi-device for instance creates a special multi-BVH that holds
references to several sub-BVHs, one for each sub-device.

Reviewed By: brecht, kevindietrich

Differential Revision: https://developer.blender.org/D9718
2020-12-11 13:24:29 +01:00
Kévin Dietrich
31a620b942 Cycles API: encapsulate Node socket members
This encapsulates Node socket members behind a set of specific methods;
as such it is no longer possible to directly access Node class members
from exporters and parts of Cycles.

The methods are defined via the NODE_SOCKET_API macros in `graph/
node.h`, and are for getting or setting a specific socket's value, as
well as querying or modifying the state of its update flag.

The setters will check whether the value has changed and tag the socket
as modified appropriately. This will let us know how a Node has changed
and what to update, which is the first concrete step toward a more
granular scene update system.

Since the setters will tag the Node sockets as modified when passed
different data, this patch also removes the various modified methods
on Nodes in favor of Node::is_modified which checks the sockets'
update flags status.

Reviewed By: brecht

Maniphest Tasks: T79174

Differential Revision: https://developer.blender.org/D8544
2020-11-04 13:03:33 +01:00
Dalai Felinto
4a8146eb8f Cycles: silence unused variable warning 2020-10-29 18:33:29 +01:00
Stefan Werner
009971ba7a Cycles: Separate Embree device for each CPU Device.
Before, Cycles was using a shared Embree device across all instances.
This could result in crashes when viewport rendering and material
preview were using Cycles simultaneously.

Fixes issue T80042

Maniphest Tasks: T80042

Differential Revision: https://developer.blender.org/D8772
2020-09-01 21:00:55 +02:00
Brecht Van Lommel
d1ef5146d7 Cycles: remove SIMD BVH optimizations, to be replaced by Embree
Ref T73778

Depends on D8011

Maniphest Tasks: T73778

Differential Revision: https://developer.blender.org/D8012
2020-06-22 13:28:01 +02:00
Brecht Van Lommel
ace3268482 Cleanup: minor refactoring around DeviceTask 2020-06-22 13:06:47 +02:00
Brecht Van Lommel
41866070d3 Cleanup: remove unused flag 2020-06-04 14:37:20 +02:00
Giovanni Remigi
19b46b2fca Fix Cycles crash in BVH8 build due to out of bounds memory access
Differential Revision: https://developer.blender.org/D7114
2020-03-11 17:35:11 +01:00
Brecht Van Lommel
d9c5f0d25f Cleanup: split Cycles Hair and Mesh classes, with Geometry base class 2020-02-07 12:18:15 +01:00
Patrick Mours
a2b52dc571 Cycles: add Optix device backend
This uses hardware-accelerated raytracing on NVIDIA RTX graphics cards.

It is still currently experimental. Most features are supported, but a few
are still missing like baking, branched path tracing and using CPU memory.
https://wiki.blender.org/wiki/Reference/Release_Notes/2.81/Cycles#NVIDIA_RTX

For building with Optix support, the Optix SDK must be installed. See here for
build instructions:
https://wiki.blender.org/wiki/Building_Blender/CUDA

Differential Revision: https://developer.blender.org/D5363
2019-09-13 11:50:11 +02:00
Patrick Mours
f6da680946 Cycles: refactor of BVH building to prepare for Optix
Ref D5363
2019-08-26 17:39:57 +02:00
Campbell Barton
e12c08e8d1 ClangFormat: apply to source, most of intern
Apply clang format as proposed in T53211.

For details on usage and instructions for migrating branches
without conflicts, see:

https://wiki.blender.org/wiki/Tools/ClangFormat
2019-04-17 06:21:24 +02:00
Sergey Sharybin
8044e5f2d7 Cycles: Make BVH wider prior to packing
This allows to do more non-trivial tree modifications to make
it more dense and more friendly for vectorization.
2019-01-09 12:14:20 +01:00
Stefan Werner
2c5531c0a5 Cycles: Added Embree as BVH option for CPU renders.
Note that this is turned off by default and must be enabled at build time with the CMake WITH_CYCLES_EMBREE flag.
Embree must be built as a static library with ray masking turned on, the `make deps` scripts have been updated accordingly.
There, Embree is off by default too and must be enabled with the WITH_EMBREE flag.

Using Embree allows for much faster rendering of deformation motion blur while reducing the memory footprint.

TODO: GPU implementation, deduplication of data, leveraging more of Embrees features (e.g. tessellation cache).

Differential Revision: https://developer.blender.org/D3682
2018-11-07 12:58:12 +01:00
Sergey Sharybin
73f2056052 Cycles: Add BVH8 and packeted triangle intersection
This is an initial implementation of BVH8 optimization structure
and packated triangle intersection. The aim is to get faster ray
to scene intersection checks.

    Scene                BVH4      BVH8
barbershop_interior    10:24.94   10:10.74
bmw27                  02:41.25   02:38.83
classroom              08:16.49   07:56.15
fishy_cat              04:24.56   04:17.29
koro                   06:03.06   06:01.45
pavillon_barcelona     09:21.26   09:02.98
victor                 23:39.65   22:53.71

As memory goes, peak usage raises by about 4.7% in a complex
scenes.

Note that BVH8 is disabled when using OSL, this is because OSL
kernel does not get per-microarchitecture optimizations and
hence always considers BVH3 is used.

Original BVH8 patch from Anton Gavrikov.
Batched triangles intersection from Victoria Zhislina.
Extra work and tests and fixes from Maxym Dmytrychenko.
2018-08-29 15:03:09 +02:00
Ray Molenkamp
bf7e406766 Cycles: Fix optimal BVH selection. 2018-01-22 14:52:09 -07:00
Sergey Sharybin
2f79d1c058 Cycles: Replace use_qbvh boolean flag with an enum-based property
This was we can introduce other types of BVH, for example, wider ones, without
causing too much mess around boolean flags.

Thoughs:

- Ideally device info should probably return bitflag of what BVH types it
  supports.

  It is possible to implement based on simple logic in device/ and mesh.cpp,
  rest of the changes will stay the same.

- Not happy with workarounds in util_debug and duplicated enum in kernel.
  Maybe enbum should be stores in kernel, but then it's kind of weird to include
  kernel types from utils. Soudns some cyclkic dependency.

Reviewers: brecht, maxim_d33

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D3011
2018-01-22 17:19:20 +01:00
Brecht Van Lommel
811dbf5525 Code cleanup: deduplicate primitive refit code. 2017-10-15 21:53:58 +02:00
Sergey Sharybin
5a618ab737 Cycles: De-duplicate trace-time object visibility calculation
We already have enough files to worry about in BVH builders. no need to add yet
another copy-paste code which is tempting to be running out of sync.
2017-08-10 09:21:02 +02:00
Brecht Van Lommel
fc38276d74 Fix Cycles shadow catcher objects influencing each other.
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.

Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.

Differential Revision: https://developer.blender.org/D2763
2017-08-07 17:54:26 +02:00
Sergey Sharybin
0097f9b298 Cycles: Split BVH implementations into separate files 2017-04-13 10:55:46 +02:00
Sergey Sharybin
c8548871ac Cycles: Use more explicit and commonly used names for BVH structures
This renames BinaryBVH to BVH2 and QBVH to BVH8. There is no user measurable
difference, but allows us to add more types of BVH trees such as BVH8.
2017-04-13 10:29:14 +02:00
Sergey Sharybin
9b1564a862 Cycles: Cleanup, rename RegularBVH to BinaryBVH
Makes it more explicit what the structure is from it's name.
2017-03-30 09:47:27 +02:00
Sergey Sharybin
270df9a60f Cycles: Cleanup, don't use m_ prefix for public properties 2017-03-29 14:45:49 +02:00
Sergey Sharybin
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
Sergey Sharybin
8b8c0d0049 Cycles: Don't calculate primitive time if BVH motion steps are not used
Solves memory regression by the default configuration.
2017-02-15 12:59:31 +01:00
Sergey Sharybin
dc7bbd731a Cycles: Fix wrong hair render results when using BVH motion steps
The issue here was mainly coming from minimal pixel width feature
which is quite commonly enabled in production shots.

This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
probability of that primitive to be considered a good intersection
for us. This is similar to increasing minimal width of curve.

What is worst here is that change in the intersection probability
fully depends on exact layout of BVH, meaning probability might
change differently depending on a view angle, the way how builder
binned the primitives and such. This makes it impossible to do
simple check like dividing probability by number of BVH steps.

Other solution might have been to split BVH into fully independent
trees, but that will increase memory usage of all the static
objects in the scenes, which is also not something desirable.

For now used most simple but robust approach: store BVH primitives
time and test it in curve intersection functions. This solves the
regression, but has two downsides:

- Uses more memory.

  which isn't surprising, and ANY solution to this problem will
  use more memory.

  What we still have to do is to avoid this memory increase for
  cases when we don't use BVH motion steps.

- Reduces number of maximum available textures on pre-kepler cards.

  There is not much we can do here, hardware gets old but we need
  to move forward on more modern hardware..
2017-02-15 12:45:04 +01:00
Sergey Sharybin
1ad04c7d65 Cycles: Store time in BVH nodes
This way we can stop traversing BVH node early on.

Gives about 2-2.5x times render time improvement with 3 BVH steps.
Hopefully this gives no measurable performance loss for scenes with
single BVH step.

Traversal is currently only implemented for QBVH, meaning old CPUs
and GPU do not benefit from this change.
2017-01-20 12:46:18 +01:00
Sergey Sharybin
48997d2e40 Cycles: Cleanup, style 2016-10-24 12:26:12 +02:00
Sergey Sharybin
1e1811357d Cycles: Cleanup, spaces 2016-10-24 11:47:32 +02:00
Sergey Sharybin
ec54a08d30 Revert "Cycles: Tweak empty boundbox children"
This reverts commit ecbfa31caa.

Original commit broke logic in nodes re-fitting. That area can
access non-existing children momentarely. Not sure what would
be best solution here, for now simply reverting the change/
2016-09-15 09:39:33 +02:00
Sergey Sharybin
ecbfa31caa Cycles: Tweak empty boundbox children
The idea here is to make assert failure to fail sooner on an incorrect
node address rather than later with stack overflow.
2016-09-13 11:05:11 +02:00
Sergey Sharybin
52038fd8c7 Fix T49290: Specific .blend with hair crashes in MacOS 2.78 RC1 on render
The issue was caused by some false-positive empty non-AABB intersection.
Tried to tweak it a bit so it does not record intersection anymore.

Hopefully will work for all platforms. Tested here on iMac and Debian.
2016-09-13 10:59:48 +02:00
Sergey Sharybin
70e7c0829e Cycles: Deduplicate QBVH node packing across BVH build and refit 2016-09-09 11:32:05 +02:00
Sergey Sharybin
6de08f6cd1 Cycles: Fix regular BVH nodes refit
For proper indexing to work we need to use unaligned node with
identity transform instead of aligned nodes when doing refit.

To be backported to 2.78 release.
2016-09-08 15:08:35 +02:00
Sergey Sharybin
27e2317513 Cycles: Add asserts to BVH node packing 2016-09-08 15:03:55 +02:00
Sergey Sharybin
3598a3d1d5 Cycles: Cleanup: line wrapping 2016-09-08 14:26:10 +02:00
Sergey Sharybin
ac061de20d Cycles: Fix refitting of regular BVH
Was causing CUDA issues on viewport edits.
2016-07-15 18:12:34 +02:00
Sergey Sharybin
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes to
strands direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

Implementation of BVH builder is based on Embree, and generally idea
there is to calculate axis-aligned SAH and oriented SAH and if SAH
of oriented node is smaller than axis-aligned SAH we create unaligned
node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any any extra calculations needed to set up hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH nodes type.

This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
1a2012145d Cycles: Switch node address to absolute values in BVH tree
This seems to be straightforward way to support heterogeneous nodes
in the same tree.

There is some penalty related on 4gig limit of the address space now,
but here's are the thing:

Traversal code was already using ints to store final offset, so
there can't be regressions really.

This is a required commit to make it possible to encode both aligned
and unaligned nodes in the same array. Also, in the future we can use
this to get rid of __leaf_nodes array (which is a bit tricky to do since
trickery in pack_instances().
2016-07-07 17:25:48 +02:00
Sergey Sharybin
17e7454263 Cycles: Reduce memory usage by de-duplicating triangle storage
There are several internal changes for this:

First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().

In order to make it efficient needed to store global triangle offset
somewhere. So no __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.

Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but those trickery
is fully required to prevent noticeable slowdown.

Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.

Fortunately, tricks here and there (i,e, some types changes to avoid
casts which are not really coming for free) reduces those losses to
an acceptable level. So now they are within couple of percent only,

On a positive site we've achieved:

- Few percent of memory save with triangle-only scenes. Actual save
  in this case is close to size of all vertices.

  On a more fine-subdivided scenes this benefit might become more
  obvious.

- Huge memory save of hairy scenes. For example, on koro.blend
  there is about 20% memory save. Similar figure for bunny.blend.

This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.

But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
2016-07-07 17:25:48 +02:00