Commit Graph

2860 Commits

Author SHA1 Message Date
Xavier Hallade
190ad73590 Cycles oneAPI: Remove direct dependency on Level-Zero
We used it only to access device id for explicitly allowing Arc GPUs.
It made the backend require ze_loader.dll which could be problematic if
we end up using direct linking.
I've replaced filtering based on PCI device id by using other HW properties
instead (EUs, threads per EU), that are now available through Level-Zero.
2022-07-06 18:55:38 +02:00
Xavier Hallade
debb233787 Cleanup: fix comments in oneAPI kernel.cpp 2022-07-06 18:55:38 +02:00
Nikita Sirgienko
0df574b55e Cycles: Improve an occupancy for Intel GPUs
Initially oneAPI implementation have waited after each memory
operation, even if there was no need for this. Now, the implementation
will wait only if it is really necessary - it have improved
performance noticeble for some scenes and a bit for the rest of them.
2022-07-06 17:26:23 +02:00
Xavier Hallade
41c10ac84a Cycles: fix support for multiple Intel GPUs
Identical Intel GPUs ended up with the same id.
Added PCI BDF to the id to make it unique.
2022-07-01 11:20:00 +02:00
Xavier Hallade
0554537c3c Cleanup: add missing license headers in Cycles oneAPI implementation 2022-07-01 10:13:07 +02:00
Campbell Barton
b6c28002ac Cleanup: spelling in comments 2022-06-30 12:14:22 +10:00
Xavier Hallade
a02992f131 Cycles: Add support for rendering on Intel GPUs using oneAPI
This patch adds a new Cycles device with similar functionality to the
existing GPU devices.  Kernel compilation and runtime interaction happen
via oneAPI DPC++ compiler and SYCL API.

This implementation is primarly focusing on Intel® Arc™ GPUs and other
future Intel GPUs.  The first supported drivers are 101.1660 on Windows
and 22.10.22597 on Linux.

The necessary tools for compilation are:
- A SYCL compiler such as oneAPI DPC++ compiler or
  https://github.com/intel/llvm
- Intel® oneAPI Level Zero which is used for low level device queries:
  https://github.com/oneapi-src/level-zero
- To optionally generate prebuilt graphics binaries: Intel® Graphics
  Compiler All are included in Linux precompiled libraries on svn:
  https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for
  Windows precompiled binaries but for the graphics compiler, available
  as "Intel® Graphics Offline Compiler for OpenCL™ Code" from
  https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html,
  for which path can be set as OCLOC_INSTALL_DIR.

Being based on the open SYCL standard, this implementation could also be
extended to run on other compatible non-Intel hardware in the future.

Reviewed By: sergey, brecht

Differential Revision: https://developer.blender.org/D15254

Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com>
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
2022-06-29 12:58:04 +02:00
Brecht Van Lommel
c257443192 Fix Cycles assert with mix weights outside of 0..1 range
This could result in wrong skipping of SVM nodes in the graph. Now make the
logic consistent with the clamping in the OSL implementation and constant
folding.

Thanks to Christophe Hery for finding the problem and providing the fix.
2022-06-28 19:13:57 +02:00
Sayak Biswas
abfa09752f Cycles: enable Vega GPU/APU support
Enables Vega and Vega II GPUs as well as Vega APU, using changes in HIP code
to support 64-bit waves and a new HIP SDK version.

Tested with Radeon WX9100, Radeon VII GPUs and Ryzen 7 PRO 5850U with Radeon
Graphics APU.

Ref T96740, T91571

Differential Revision: https://developer.blender.org/D15242
2022-06-28 18:35:43 +02:00
Xavier Hallade
633c2f07da Cyles: switch primitive.h inline hints to forceinline
This change helps decrease Intel GPU binaries compile time by 5-10
minutes without impacting other backends.

Reviewed By: sergey, brecht

Differential Revision: http://developer.blender.org/D15273
2022-06-23 18:36:48 +02:00
Andrii Symkin
c2a2f3553a Cycles: unify math functions names
This patch unifies the names of math functions for different data types and uses
overloading instead. The goal is to make it possible to swap out all the float3
variables containing RGB data with something else, with as few as possible
changes to the code. It's a requirement for future spectral rendering patches.

Differential Revision: https://developer.blender.org/D15276
2022-06-23 15:02:53 +02:00
Brecht Van Lommel
ff1883307f Cleanup: renaming and consistency for kernel data
* Rename "texture" to "data array". This has not used textures for a long time,
  there are just global memory arrays now. (On old CUDA GPUs there was a cache
  for textures but not global memory, so we used to put all data in textures.)
* For CUDA and HIP, put globals in KernelParams struct like other devices.
* Drop __ prefix for data array names, no possibility for naming conflict now that
  these are in a struct.
2022-06-20 12:30:48 +02:00
Brecht Van Lommel
2c1bffa286 Cleanup: add verbose logging category names instead of numbers
And use them more consistently than before.
2022-06-17 14:08:14 +02:00
Brecht Van Lommel
24246d9870 Cleanup: replace uint4 by AttributeMap struct 2022-06-17 14:08:14 +02:00
Michael Jones
4412e14708 Cycles: Useful Metal backend debug & profiling functionality
This patch adds some useful debugging & profiling env vars to the Metal backend:

- `CYCLES_METAL_PROFILING`: output a per-kernel timing report at the end of the render
- `CYCLES_METAL_DEBUG`: enable per-dispatch tracing (very verbose)
- `CYCLES_DEBUG_METAL_CAPTURE_KERNEL`: enable programatic .gputrace capture for a specified kernel index

Here's an example of the timing report with `CYCLES_METAL_PROFILING` enabled:

```
---------------------------------------------------------------------------------------------------
Kernel name                                 Total threads   Dispatches     Avg. T/D    Time   Time%
---------------------------------------------------------------------------------------------------
integrator_init_from_camera                   657,407,232          161    4,083,274   0.24s   0.51%
integrator_intersect_closest                1,629,288,440          681    2,392,494  15.18s  32.12%
integrator_intersect_shadow                   751,652,291          470    1,599,260   5.80s  12.28%
integrator_shade_background                   304,612,074          263    1,158,220   1.16s   2.45%
integrator_shade_surface                    1,159,764,041          676    1,715,627  20.57s  43.52%
integrator_shade_shadow                       598,885,847          418    1,432,741   1.27s   2.69%
integrator_queued_paths_array               2,969,650,130          805    3,689,006   0.35s   0.74%
integrator_queued_shadow_paths_array          593,936,619          379    1,567,115   0.14s   0.29%
integrator_terminated_paths_array              22,205,417          155      143,260   0.05s   0.10%
integrator_sorted_paths_array               2,517,140,043          676    3,723,579   1.65s   3.50%
integrator_compact_paths_array                648,912,748          155    4,186,533   0.03s   0.07%
integrator_compact_states                      20,872,687          155      134,662   0.14s   0.29%
integrator_terminated_shadow_paths_array      374,100,675          438      854,111   0.16s   0.33%
integrator_compact_shadow_paths_array         503,768,657          438    1,150,156   0.05s   0.10%
integrator_compact_shadow_states               37,664,941          202      186,460   0.23s   0.50%
integrator_reset                               25,165,824            6    4,194,304   0.06s   0.12%
film_convert_combined_half_rgba                 3,110,400            6      518,400   0.00s   0.01%
prefix_sum                                            676          676            1   0.19s   0.40%
---------------------------------------------------------------------------------------------------
                                                                 6,760               47.27s 100.00%
---------------------------------------------------------------------------------------------------
```

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D15044
2022-06-07 11:08:39 +01:00
Dalai Felinto
e7156be86e Merge remote-tracking branch 'origin/blender-v3.2-release' 2022-06-03 16:13:51 +02:00
Brecht Van Lommel
074010ad6d Fix T98570: Cycles AOVs not working in background shader
Must include the AOV writing feature in background shader evaluation.

Differential Revision: https://developer.blender.org/D15114
2022-06-03 15:34:31 +02:00
Brecht Van Lommel
610619c203 Merge branch 'blender-v3.2-release' 2022-05-31 17:35:16 +02:00
Brecht Van Lommel
f2cd7e08fe Fix Cycles MNEE not working for Metal
Move MNEE to own kernel, separate from shader ray-tracing. This does introduce
the limitation that a shader can't use both MNEE and AO/bevel, but that seems
like the better trade-off for now.

We can experiment with bigger kernel organization changes later.

Differential Revision: https://developer.blender.org/D15070
2022-05-31 17:24:43 +02:00
Jacques Lucke
25d216724b Cleanup: make format 2022-05-24 15:53:16 +02:00
Brecht Van Lommel
770510915c Merge branch 'blender-v3.2-release' 2022-05-23 22:26:27 +02:00
Brecht Van Lommel
a22ad7fbd3 Fix T98036: Cycles blackbody inaccurate for low temperature and wide gamut
Regenerate blackbody RGB curve fit to not clamp values, and extend down to
800K since it does now change below 965K.

Note that as before, blackbody is only defined in the range 800K to 12000K
and has a fixed value outside of that. But within that range there should
be no more unnecessary gamut clamping.
2022-05-23 22:07:59 +02:00
Patrick Mours
a8c81ffa83 Cycles: Add half precision float support for volumes with NanoVDB
This patch makes it possible to change the precision with which to
store volume data in the NanoVDB data structure (as float, half, or
using variable bit quantization) via the previously unused precision
field in the volume data block.
It makes it possible to further reduce memory usage during
rendering, at a slight cost to the visual detail of a volume.

Differential Revision: https://developer.blender.org/D10023
2022-05-23 19:08:01 +02:00
Sergey Sharybin
698e394e7e Merge branch 'blender-v3.2-release' 2022-05-23 16:02:52 +02:00
Sergey Sharybin
9bb4bf5748 Fix missing 64bit casts when calculating Cycles render buffer offset
Found those missing casts while looking into a crash report made in
the Blender Chat. Was unable to reproduce the crash, but the casts
should totally be there to avoid integer overflow.
2022-05-23 15:59:52 +02:00
Campbell Barton
ffbeb34f5f Cleanup: format 2022-05-18 12:17:42 +10:00
Richard Antalik
df26f4f63a Merge branch 'blender-v3.2-release' 2022-05-17 21:23:55 +02:00
Olivier Maury
b48adbc9d7 Fix T97921: Cycles MNEE issue with light path nodes
Ensure the correct total/diffuse/transmission depth is set when evaluating
shaders for MNEE, consistent with regular light shader evaluation.

Differential Revision: https://developer.blender.org/D14902
2022-05-17 21:07:49 +02:00
Bastien Montagne
b3b5d4cabb Merge branch 'blender-v3.2-release' 2022-05-17 17:34:51 +02:00
Brecht Van Lommel
8fdd3aad9b Fix T98163: Cycles OSL rendering normal maps differently
Match SVM and ensure valid reflection when setting up BSDFs.
2022-05-17 16:46:37 +02:00
Antonio Vazquez
c8b740cc00 Merge branch 'blender-v3.2-release' 2022-05-17 16:06:27 +02:00
Brecht Van Lommel
dbb6016e94 Cleanup: fix compiler warning 2022-05-17 15:36:22 +02:00
Campbell Barton
1660eff1a2 Cleanup: format 2022-05-17 10:06:13 +10:00
Brecht Van Lommel
f1c27b383b Merge branch 'blender-v3.2-release' 2022-05-16 15:21:27 +02:00
Olivier Maury
48754bc146 Fix T97867: Cycles MNEE blocky artefacts for rough refractive interfaces
Made tangent frame consistent across the surface regardless of the sample,
which was not the case with the previous algorithm. Previously, a tangent
frame would stay consistent for the same sample throughout the walk, but not
from sample to sample for the same triangle. This actually resulted in code
simplification.

Also includes additional fixes:

* Fixed an important bug that manifested itself with multiple lights in the
  scene, where caustics had abnormally low amplitude: The final light pdf did
  not include the light distribution pdf.
* Removed unnecessary orthonormal basis generation function, using cycles'
  native one instead.
* Increased solver max iteration back to 64: It turns out we sometimes need
  these extra iterations in cases where projection back to the surface takes
  many steps. The effective solver iteration count, the most expensive part,
  is actually much less than the raw iteration count.

Differential Revision: https://developer.blender.org/D14931
2022-05-16 15:11:54 +02:00
Campbell Barton
427a2c920a Cleanup: spelling in comments, capitalize tags
Also add missing task-ID reference & remove colon after \note as it
doesn't render properly in doxygen.
2022-05-13 09:29:25 +10:00
Jesse Yurkovich
578771ae4d UDIM: Add support for packing inside .blend files
This completes support for tiled texture packing on the Blender / Cycles
side of things.

Most of these changes fall into one of three categories:
- Updating Image handling code to pack/unpack tiled and multi-view images
- Updating Cycles to handle tiled textures through BlenderImageLoader
- Updating OSL to properly handle textures with multiple slots

Differential Revision: https://developer.blender.org/D14395
2022-05-11 20:11:44 -07:00
Michael Jones
007184bcf2 Enable inlining on Apple Silicon. Use new process-wide ShaderCache in order to safely re-enable binary archives
This patch is the same as D14763, but with a fix for unit test failures caused by ShaderCache fetch logic not working in the non-MetalRT case:

```
diff --git a/intern/cycles/device/metal/kernel.mm b/intern/cycles/device/metal/kernel.mm
index ad268ae7057..6aa1a56056e 100644
--- a/intern/cycles/device/metal/kernel.mm
+++ b/intern/cycles/device/metal/kernel.mm
@@ -203,9 +203,12 @@ bool kernel_has_intersection(DeviceKernel device_kernel)

   /* metalrt options */
   request.pipeline->use_metalrt = device->use_metalrt;
-  request.pipeline->metalrt_hair = device->kernel_features & KERNEL_FEATURE_HAIR;
-  request.pipeline->metalrt_hair_thick = device->kernel_features & KERNEL_FEATURE_HAIR_THICK;
-  request.pipeline->metalrt_pointcloud = device->kernel_features & KERNEL_FEATURE_POINTCLOUD;
+  request.pipeline->metalrt_hair = device->use_metalrt &&
+                                   (device->kernel_features & KERNEL_FEATURE_HAIR);
+  request.pipeline->metalrt_hair_thick = device->use_metalrt &&
+                                         (device->kernel_features & KERNEL_FEATURE_HAIR_THICK);
+  request.pipeline->metalrt_pointcloud = device->use_metalrt &&
+                                         (device->kernel_features & KERNEL_FEATURE_POINTCLOUD);

   {
     thread_scoped_lock lock(cache_mutex);
@@ -225,9 +228,9 @@ bool kernel_has_intersection(DeviceKernel device_kernel)

   /* metalrt options */
   bool use_metalrt = device->use_metalrt;
-  bool metalrt_hair = device->kernel_features & KERNEL_FEATURE_HAIR;
-  bool metalrt_hair_thick = device->kernel_features & KERNEL_FEATURE_HAIR_THICK;
-  bool metalrt_pointcloud = device->kernel_features & KERNEL_FEATURE_POINTCLOUD;
+  bool metalrt_hair = use_metalrt && (device->kernel_features & KERNEL_FEATURE_HAIR);
+  bool metalrt_hair_thick = use_metalrt && (device->kernel_features & KERNEL_FEATURE_HAIR_THICK);
+  bool metalrt_pointcloud = use_metalrt && (device->kernel_features & KERNEL_FEATURE_POINTCLOUD);

   MetalKernelPipeline *best_pipeline = nullptr;
   for (auto &pipeline : collection) {

```

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D14923
2022-05-11 16:20:59 +01:00
Campbell Barton
ae683a22c6 Merge branch 'blender-v3.2-release' 2022-05-11 16:34:25 +10:00
Campbell Barton
87d74d03bf Merge branch 'blender-v3.2-release' 2022-05-11 16:34:22 +10:00
Mikhail Matrosov
dcce4a59a0 Fix T97966: Cycles shadow terminator offset wrong for scaled object instances
Differential Revision: https://developer.blender.org/D14893
2022-05-10 18:53:14 +02:00
Olivier Maury
dc55e095e6 Fix T97056: Cycles MNEE not working with glass and pure refraction BSDFs
Differential Revision: https://developer.blender.org/D14901
2022-05-10 18:51:02 +02:00
Campbell Barton
11f3a388ed Merge branch 'blender-v3.2-release' 2022-05-06 13:43:54 +10:00
Campbell Barton
11a7da675f Merge branch 'blender-v3.2-release' 2022-05-06 13:43:51 +10:00
Brecht Van Lommel
1b566b70c1 Fix T95308, T93913: Cycles mist pass wrong with SSS shader
It was wrongly writing passes twice, for both the surface entry and exit points.
We can skip code for filtering closures, emission and holdout also, as these do
nothing with only a subsurface diffuse closure present.
2022-05-05 22:01:45 +02:00
Brecht Van Lommel
e4931ab86d Cleanup: clang format 2022-05-05 21:57:08 +02:00
Brecht Van Lommel
108963d508 Merge branch 'blender-v3.2-release' 2022-05-05 21:01:54 +02:00
Brecht Van Lommel
75a051a6ab Fix T93246: Cycles wrong volume shading after transparent surface
The Russian roulette probability was not taken into account for volumes in all
cases. It should not be left out from the SD_HAS_ONLY_VOLUME case.
2022-05-05 20:51:01 +02:00
Campbell Barton
eb837ba17e Cleanup: format 2022-05-05 17:33:43 +10:00
Germano Cavalcante
73fa571598 Merge branch 'blender-v3.2-release' 2022-05-04 21:43:56 -03:00