/* SPDX-FileCopyrightText: 2011-2022 Blender Foundation
 *
 * SPDX-License-Identifier: Apache-2.0 */

#include <cstdlib>
#include <cstring>

#include "bvh/bvh2.h"

Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.
For example, it was not obvious whether bvh.h was referring to builder
or traversal, whether node.h is a generic graph node or a shader node
and cases like that.
Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.
This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.
Please note that this patch is lacking changes related to GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
2017-03-28 20:39:14 +02:00
#include "device/device.h"
#include "device/queue.h"

Cycles: merge of cycles-x branch, a major update to the renderer
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
2021-09-20 17:59:20 +02:00
#include "device/cpu/device.h"
#include "device/cpu/kernel.h"
#include "device/cuda/device.h"
#include "device/dummy/device.h"
#include "device/hip/device.h"
#include "device/metal/device.h"
#include "device/multi/device.h"
#include "device/oneapi/device.h"
#include "device/optix/device.h"

#ifdef WITH_HIPRT
#  include <hiprtew.h>
#endif

#include "util/log.h"
#include "util/math.h"
#include "util/string.h"
#include "util/system.h"
#include "util/task.h"
#include "util/types.h"
#include "util/vector.h"

CCL_NAMESPACE_BEGIN

bool Device::need_types_update = true;
bool Device::need_devices_update = true;
thread_mutex Device::device_mutex;
vector<DeviceInfo> Device::cuda_devices;
vector<DeviceInfo> Device::optix_devices;
vector<DeviceInfo> Device::cpu_devices;
vector<DeviceInfo> Device::hip_devices;
vector<DeviceInfo> Device::metal_devices;
vector<DeviceInfo> Device::oneapi_devices;
uint Device::devices_initialized_mask = 0;

/* Device */

Device::~Device() noexcept(false) = default;

void Device::set_error(const string &error)
{
  if (!have_error()) {
    error_msg = error;
  }
  LOG_ERROR << error;
  fflush(stderr);
}

void Device::build_bvh(BVH *bvh, Progress &progress, bool refit)
{
  assert(bvh->params.bvh_layout == BVH_LAYOUT_BVH2);

  BVH2 *const bvh2 = static_cast<BVH2 *>(bvh);
  if (refit) {
    bvh2->refit(progress);
  }
  else {
    bvh2->build(progress, &stats);
  }
}

unique_ptr<Device> Device::create(const DeviceInfo &info,
                                  Stats &stats,
                                  Profiler &profiler,
                                  bool headless)
{
  if (!info.multi_devices.empty()) {
    /* Always create a multi device when info contains multiple devices.
     * This is done so that the type can still be e.g. DEVICE_CPU to indicate
     * that it is a homogeneous collection of devices, which simplifies checks. */
    return device_multi_create(info, stats, profiler, headless);
  }

  unique_ptr<Device> device;

  switch (info.type) {
    case DEVICE_CPU:
      device = device_cpu_create(info, stats, profiler, headless);
      break;
#ifdef WITH_CUDA
    case DEVICE_CUDA:
      if (device_cuda_init()) {
        device = device_cuda_create(info, stats, profiler, headless);
      }
      break;
#endif
#ifdef WITH_OPTIX
    case DEVICE_OPTIX:
      if (device_optix_init()) {
        device = device_optix_create(info, stats, profiler, headless);
      }
      break;
#endif

#ifdef WITH_HIP
    case DEVICE_HIP:
      if (device_hip_init()) {
        device = device_hip_create(info, stats, profiler, headless);
      }
      break;
#endif

#ifdef WITH_METAL
    case DEVICE_METAL:
      if (device_metal_init()) {
        device = device_metal_create(info, stats, profiler, headless);
      }
      break;
#endif

#ifdef WITH_ONEAPI
    case DEVICE_ONEAPI:
      device = device_oneapi_create(info, stats, profiler, headless);
      break;
#endif

    default:
      break;
  }

  if (device == nullptr) {
    device = device_dummy_create(info, stats, profiler, headless);
  }

  return device;
}

DeviceType Device::type_from_string(const char *name)
{
  if (strcmp(name, "CPU") == 0) {
    return DEVICE_CPU;
  }
  if (strcmp(name, "CUDA") == 0) {
    return DEVICE_CUDA;
  }
  if (strcmp(name, "OPTIX") == 0) {
    return DEVICE_OPTIX;
  }
  if (strcmp(name, "MULTI") == 0) {
    return DEVICE_MULTI;
  }
  if (strcmp(name, "HIP") == 0) {
    return DEVICE_HIP;
  }
  if (strcmp(name, "METAL") == 0) {
    return DEVICE_METAL;
  }
  if (strcmp(name, "ONEAPI") == 0) {
    return DEVICE_ONEAPI;
  }
  if (strcmp(name, "HIPRT") == 0) {
    return DEVICE_HIPRT;
  }

  return DEVICE_NONE;
}

string Device::string_from_type(DeviceType type)
{
  if (type == DEVICE_CPU) {
Cycles: Refactor Device selection to allow individual GPU compute device selection
Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL).
Now, a toggle button is displayed for every device.
These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards).
From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences.
This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items.
Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht, juicyfruit, mib2berlin, Blendify
Differential Revision: https://developer.blender.org/D2338
2016-11-07 02:33:53 +01:00
    return "CPU";
  }
  if (type == DEVICE_CUDA) {
    return "CUDA";
  }
  if (type == DEVICE_OPTIX) {
    return "OPTIX";
  }
  if (type == DEVICE_MULTI) {
    return "MULTI";
  }
  if (type == DEVICE_HIP) {
    return "HIP";
  }
  if (type == DEVICE_METAL) {
    return "METAL";
  }
  if (type == DEVICE_ONEAPI) {
    return "ONEAPI";
  }
  if (type == DEVICE_HIPRT) {
    return "HIPRT";
  }

  return "";
}

vector<DeviceType> Device::available_types()
{
  vector<DeviceType> types;
  types.push_back(DEVICE_CPU);
#ifdef WITH_CUDA
  types.push_back(DEVICE_CUDA);
#endif
#ifdef WITH_OPTIX
  types.push_back(DEVICE_OPTIX);
#endif
#ifdef WITH_HIP
  types.push_back(DEVICE_HIP);
#endif
#ifdef WITH_METAL
  types.push_back(DEVICE_METAL);
#endif
#ifdef WITH_ONEAPI
  types.push_back(DEVICE_ONEAPI);
#endif
#ifdef WITH_HIPRT
  if (hiprtewInit()) {
    types.push_back(DEVICE_HIPRT);
  }
#endif
  return types;
}

vector<DeviceInfo> Device::available_devices(const uint mask)
{
  /* Lazy initialize devices. On some platforms OpenCL or CUDA drivers can
   * be broken and cause crashes when only trying to get device info, so
   * we don't want to do any initialization until the user chooses to. */
  const thread_scoped_lock lock(device_mutex);
  vector<DeviceInfo> devices;

#if defined(WITH_CUDA) || defined(WITH_OPTIX)
  if (mask & (DEVICE_MASK_CUDA | DEVICE_MASK_OPTIX)) {
    if (!(devices_initialized_mask & DEVICE_MASK_CUDA)) {
      if (device_cuda_init()) {
        device_cuda_info(cuda_devices);
      }
      devices_initialized_mask |= DEVICE_MASK_CUDA;
    }
    if (mask & DEVICE_MASK_CUDA) {
      for (DeviceInfo &info : cuda_devices) {
        devices.push_back(info);
      }
    }
  }
#endif

#ifdef WITH_OPTIX
  if (mask & DEVICE_MASK_OPTIX) {
    if (!(devices_initialized_mask & DEVICE_MASK_OPTIX)) {
      if (device_optix_init()) {
        device_optix_info(cuda_devices, optix_devices);
      }
      devices_initialized_mask |= DEVICE_MASK_OPTIX;
    }
    for (DeviceInfo &info : optix_devices) {
      devices.push_back(info);
    }
  }
#endif

#ifdef WITH_HIP
  if (mask & DEVICE_MASK_HIP) {
    if (!(devices_initialized_mask & DEVICE_MASK_HIP)) {
      if (device_hip_init()) {
        device_hip_info(hip_devices);
      }
      devices_initialized_mask |= DEVICE_MASK_HIP;
    }
    for (DeviceInfo &info : hip_devices) {
      devices.push_back(info);
    }
  }
#endif

#ifdef WITH_ONEAPI
  if (mask & DEVICE_MASK_ONEAPI) {
    if (!(devices_initialized_mask & DEVICE_MASK_ONEAPI)) {
      if (device_oneapi_init()) {
        device_oneapi_info(oneapi_devices);
      }
      devices_initialized_mask |= DEVICE_MASK_ONEAPI;
    }
    for (DeviceInfo &info : oneapi_devices) {
      devices.push_back(info);
    }
  }
#endif

  if (mask & DEVICE_MASK_CPU) {
    if (!(devices_initialized_mask & DEVICE_MASK_CPU)) {
      device_cpu_info(cpu_devices);
      devices_initialized_mask |= DEVICE_MASK_CPU;
    }
    for (const DeviceInfo &info : cpu_devices) {
      devices.push_back(info);
    }
  }

#ifdef WITH_METAL
  if (mask & DEVICE_MASK_METAL) {
    if (!(devices_initialized_mask & DEVICE_MASK_METAL)) {
      if (device_metal_init()) {
        device_metal_info(metal_devices);
      }
      devices_initialized_mask |= DEVICE_MASK_METAL;
    }
    for (const DeviceInfo &info : metal_devices) {
      devices.push_back(info);
    }
  }
#endif

  return devices;
}

DeviceInfo Device::dummy_device(const string &error_msg)
{
  DeviceInfo info;
  info.type = DEVICE_DUMMY;
  info.error_msg = error_msg;
  return info;
}

string Device::device_capabilities(const uint mask)
{
  const thread_scoped_lock lock(device_mutex);
  string capabilities;

  if (mask & DEVICE_MASK_CPU) {
    capabilities += "\nCPU device capabilities: ";
    capabilities += device_cpu_capabilities() + "\n";
  }

#ifdef WITH_CUDA
  if (mask & DEVICE_MASK_CUDA) {
    if (device_cuda_init()) {
      const string device_capabilities = device_cuda_capabilities();
      if (!device_capabilities.empty()) {
        capabilities += "\nCUDA device capabilities:\n";
        capabilities += device_capabilities;
      }
    }
  }
#endif

#ifdef WITH_HIP
  if (mask & DEVICE_MASK_HIP) {
    if (device_hip_init()) {
      const string device_capabilities = device_hip_capabilities();
      if (!device_capabilities.empty()) {
        capabilities += "\nHIP device capabilities:\n";
        capabilities += device_capabilities;
      }
    }
  }
#endif

#ifdef WITH_ONEAPI
  if (mask & DEVICE_MASK_ONEAPI) {
    if (device_oneapi_init()) {
      const string device_capabilities = device_oneapi_capabilities();
      if (!device_capabilities.empty()) {
        capabilities += "\noneAPI device capabilities:\n";
        capabilities += device_capabilities;
      }
    }
  }
#endif

#ifdef WITH_METAL
  if (mask & DEVICE_MASK_METAL) {
    if (device_metal_init()) {
      const string device_capabilities = device_metal_capabilities();
      if (!device_capabilities.empty()) {
        capabilities += "\nMetal device capabilities:\n";
        capabilities += device_capabilities;
      }
    }
  }
#endif

  return capabilities;
}

DeviceInfo Device::get_multi_device(const vector<DeviceInfo> &subdevices,
                                    const int threads,
                                    bool background)
{
  assert(!subdevices.empty());

  if (subdevices.size() == 1) {
    /* No multi device needed. */
    return subdevices.front();
  }

  DeviceInfo info;
  info.type = DEVICE_NONE;
  info.id = "MULTI";
  info.description = "Multi Device";
  info.num = 0;

  info.has_nanovdb = true;
  info.has_mnee = true;
  info.has_osl = true;
  info.has_guiding = true;
  info.has_profiling = true;
  info.has_peer_memory = false;
  info.use_hardware_raytracing = false;
  info.denoisers = DENOISER_ALL;

  for (const DeviceInfo &device : subdevices) {
    /* Ensure CPU device does not slow down GPU. */
    if (device.type == DEVICE_CPU && subdevices.size() > 1) {
      if (background) {
        const int orig_cpu_threads = (threads) ? threads : TaskScheduler::max_concurrency();
        /* Subtract in signed arithmetic: mixing int with size_t would wrap around instead of
         * going negative when there are more GPUs than CPU threads. */
        const int cpu_threads = max(orig_cpu_threads - int(subdevices.size() - 1), 0);

        LOG_INFO << "CPU render threads reduced from " << orig_cpu_threads << " to " << cpu_threads
                 << ", to dedicate to GPU.";

        if (cpu_threads >= 1) {
          DeviceInfo cpu_device = device;
          cpu_device.cpu_threads = cpu_threads;
          info.multi_devices.push_back(cpu_device);
        }
        else {
          continue;
        }
      }
      else {
        LOG_INFO << "CPU render threads disabled for interactive render.";
        continue;
      }
    }
    else {
      info.multi_devices.push_back(device);
    }

/* Create unique ID for this combination of devices. */
|
|
|
|
|
info.id += device.id;
|
|
|
|
|
|
|
|
|
|
/* Set device type to MULTI if subdevices are not of a common type. */
|
Cycles: merge of cycles-x branch, a major update to the renderer
This includes much improved GPU rendering performance, viewport interactivity,
new shadow catcher, revamped sampling settings, subsurface scattering anisotropy,
new GPU volume sampling, improved PMJ sampling pattern, and more.
Some features have also been removed or changed, breaking backwards compatibility.
Including the removal of the OpenCL backend, for which alternatives are under
development.
Release notes and code docs:
https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles
https://wiki.blender.org/wiki/Source/Render/Cycles
Credits:
* Sergey Sharybin
* Brecht Van Lommel
* Patrick Mours (OptiX backend)
* Christophe Hery (subsurface scattering anisotropy)
* William Leeson (PMJ sampling pattern)
* Alaska (various fixes and tweaks)
* Thomas Dinges (various fixes)
For the full commit history, see the cycles-x branch. This squashes together
all the changes since intermediate changes would often fail building or tests.
Ref T87839, T87837, T87836
Fixes T90734, T89353, T80267, T80267, T77185, T69800
2021-09-20 17:59:20 +02:00
|
|
|
if (info.type == DEVICE_NONE) {
|
|
|
|
|
info.type = device.type;
|
|
|
|
|
}
|
|
|
|
|
else if (device.type != info.type) {
|
2020-02-11 16:30:01 +01:00
|
|
|
info.type = DEVICE_MULTI;
|
|
|
|
|
}
|
|
|
|
|
|
2017-11-03 20:21:19 +01:00
|
|
|
/* Accumulate device info. */
|
2021-03-29 22:58:19 +02:00
|
|
|
info.has_nanovdb &= device.has_nanovdb;
|
2023-06-15 16:15:50 +02:00
|
|
|
info.has_mnee &= device.has_mnee;
|
2017-11-03 20:21:19 +01:00
|
|
|
info.has_osl &= device.has_osl;
|
2022-09-21 17:58:34 +02:00
|
|
|
info.has_guiding &= device.has_guiding;
|
2018-11-29 02:06:30 +01:00
|
|
|
info.has_profiling &= device.has_profiling;
|
2020-06-08 17:16:10 +02:00
|
|
|
info.has_peer_memory |= device.has_peer_memory;
|
2023-03-16 11:56:55 +01:00
|
|
|
info.use_hardware_raytracing |= device.use_hardware_raytracing;
|
2020-05-31 23:49:10 +02:00
|
|
|
info.denoisers &= device.denoisers;
|
Cycles: Refactor Device selection to allow individual GPU compute device selection
Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL).
Now, a toggle button is displayed for every device.
These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards).
From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences.
This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items.
Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht, juicyfruit, mib2berlin, Blendify
Differential Revision: https://developer.blender.org/D2338
2016-11-07 02:33:53 +01:00
|
|
|
}
|
2019-04-17 06:17:24 +02:00
|
|
|
|
Cycles: Refactor Device selection to allow individual GPU compute device selection
Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL).
Now, a toggle button is displayed for every device.
These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards).
From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences.
This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items.
Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht, juicyfruit, mib2berlin, Blendify
Differential Revision: https://developer.blender.org/D2338
2016-11-07 02:33:53 +01:00
|
|
|
return info;
|
|
|
|
|
}
|
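
/* Illustration only, not part of the Cycles sources: a minimal standalone sketch of the
 * two ideas used when building the multi-device info. Features that every sub-device must
 * support start as true and are AND-ed, features that a single sub-device can provide start
 * as false and are OR-ed, and one CPU thread is given up per non-CPU sub-device.
 * `Caps`, `combine_caps` and `reserved_cpu_threads` are hypothetical names, and the struct
 * is a reduced stand-in for DeviceInfo. */

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced stand-in for DeviceInfo.
struct Caps {
  bool has_osl;
  bool has_peer_memory;
  unsigned denoisers;  // Bit mask of supported denoisers.
};

// AND the features all sub-devices must share, OR the ones any sub-device may provide.
inline Caps combine_caps(const std::vector<Caps> &subdevices)
{
  Caps combined{true, false, ~0u};
  for (const Caps &device : subdevices) {
    combined.has_osl &= device.has_osl;
    combined.has_peer_memory |= device.has_peer_memory;
    combined.denoisers &= device.denoisers;
  }
  return combined;
}

// CPU threads left after reserving one thread per non-CPU sub-device, clamped at zero.
// The subtraction is done on signed integers so that the clamp is meaningful.
inline int reserved_cpu_threads(const int available_threads, const int num_subdevices)
{
  assert(num_subdevices >= 1);
  const int remaining = available_threads - (num_subdevices - 1);
  return remaining > 0 ? remaining : 0;
}
```

/* With 8 threads and 3 sub-devices (one CPU, two GPUs), 6 threads would remain for CPU
 * rendering under these assumptions. */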

void Device::tag_update()
{
  free_memory();
}

void Device::free_memory()
{
  devices_initialized_mask = 0;
  cuda_devices.free_memory();
  optix_devices.free_memory();
  hip_devices.free_memory();
  oneapi_devices.free_memory();
  cpu_devices.free_memory();
  metal_devices.free_memory();
}

unique_ptr<DeviceQueue> Device::gpu_queue_create()
{
  LOG_FATAL << "Device does not support queues.";
  return nullptr;
}

const CPUKernels &Device::get_cpu_kernels()
{
  /* Initialize CPU kernels once and reuse. */
  static const CPUKernels kernels;
  return kernels;
}

void Device::get_cpu_kernel_thread_globals(
    vector<ThreadKernelGlobalsCPU> & /*kernel_thread_globals*/)
{
  LOG_FATAL << "Device does not support CPU kernels.";
}

OSLGlobals *Device::get_cpu_osl_memory()
{
  return nullptr;
}

void *Device::get_guiding_device() const
{
  LOG_ERROR << "Request guiding field from a device which does not support it.";
  return nullptr;
}

void *Device::host_alloc(const MemoryType /*type*/, const size_t size)
{
  return util_aligned_malloc(size, MIN_ALIGNMENT_DEVICE_MEMORY);
}

void Device::host_free(const MemoryType /*type*/, void *host_pointer, const size_t size)
{
  util_aligned_free(host_pointer, size);
}

GPUDevice::~GPUDevice() noexcept(false) = default;

bool GPUDevice::load_texture_info()
{
  /* Note texture_info is never host mapped, and load_texture_info() should only
   * be called right before kernel enqueue when all memory operations have completed. */
  if (need_texture_info) {
    texture_info.copy_to_device();
    need_texture_info = false;
    return true;
  }
  return false;
}

void GPUDevice::init_host_memory(const size_t preferred_texture_headroom,
                                 const size_t preferred_working_headroom)
{
  /* Limit amount of host mapped memory, because allocating too much can
   * cause system instability. Leave at least half or 4 GB of system
   * memory free, whichever is smaller. */
  const size_t default_limit = 4 * 1024 * 1024 * 1024LL;
  const size_t system_ram = system_physical_ram();

  if (system_ram > 0) {
    if (system_ram / 2 > default_limit) {
      map_host_limit = system_ram - default_limit;
    }
    else {
      map_host_limit = system_ram / 2;
    }
  }
  else {
    LOG_WARNING << "Mapped host memory disabled, failed to get system RAM";
    map_host_limit = 0;
  }

  /* Amount of device memory to keep free after texture memory
   * and working memory allocations respectively. We set the working
   * memory limit headroom lower than the texture one so there
   * is space left for it. */
  device_working_headroom = preferred_working_headroom > 0 ? preferred_working_headroom :
                                                             32 * 1024 * 1024LL;  // 32MB
  device_texture_headroom = preferred_texture_headroom > 0 ? preferred_texture_headroom :
                                                             128 * 1024 * 1024LL;  // 128MB

  LOG_INFO << "Mapped host memory limit set to " << string_human_readable_number(map_host_limit)
           << " bytes. (" << string_human_readable_size(map_host_limit) << ")";
}
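
/* Illustration only, not part of the Cycles sources: a minimal standalone sketch of the
 * host mapping limit computed above, so the "half or 4 GB, whichever is smaller" rule can
 * be checked with concrete numbers. `host_map_limit` is a hypothetical name, and a RAM
 * size of 0 is assumed to mean that the query failed. */

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on host mapped memory: keep min(half of RAM, 4 GiB) free for the system.
inline std::uint64_t host_map_limit(const std::uint64_t system_ram)
{
  constexpr std::uint64_t reserve = 4ull * 1024 * 1024 * 1024;
  if (system_ram == 0) {
    /* Unknown RAM size, host mapping is disabled. */
    return 0;
  }
  const std::uint64_t limit = (system_ram / 2 > reserve) ? system_ram - reserve :
                                                           system_ram / 2;
  assert(limit <= system_ram);
  return limit;
}
```

/* Under these assumptions a 16 GiB machine may map up to 12 GiB, while a 6 GiB machine
 * may map up to 3 GiB. */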

void GPUDevice::move_textures_to_host(size_t size, const size_t headroom, const bool for_texture)
{
  static thread_mutex move_mutex;
  const thread_scoped_lock lock(move_mutex);

  /* Check if there is enough space. This is done while holding the mutex so that
   * concurrent callers take into account memory freed by another thread. */
  size_t total = 0;
  size_t free = 0;
  get_device_memory_info(total, free);
  if (size + headroom < free) {
    return;
  }

  while (size > 0) {
    /* Find suitable memory allocation to move. */
    device_memory *max_mem = nullptr;
    size_t max_size = 0;
    bool max_is_image = false;

    thread_scoped_lock lock(device_mem_map_mutex);
    for (MemMap::value_type &pair : device_mem_map) {
      device_memory &mem = *pair.first;
      Mem *cmem = &pair.second;

      /* Can only move textures allocated on this device (and not those from peer devices).
       * And need to ignore memory that is already on the host. */
      if (!mem.is_resident(this) || mem.is_shared(this)) {
        continue;
      }

      const bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
                              (&mem != &texture_info);
      const bool is_image = is_texture && (mem.data_height > 1);

      /* Can't move this type of memory. */
      if (!is_texture || cmem->array) {
        continue;
      }

      /* For other textures, only move image textures. */
      if (for_texture && !is_image) {
        continue;
      }

      /* Try to move largest allocation, prefer moving images. */
      if (is_image > max_is_image || (is_image == max_is_image && mem.device_size > max_size)) {
        max_is_image = is_image;
        max_size = mem.device_size;
        max_mem = &mem;
      }
    }
    lock.unlock();

    /* Move to host memory. This part is mutex protected since
     * multiple backend devices could be moving the memory. The
     * first one will do it, and the rest will adopt the pointer. */
    if (max_mem) {
      LOG_DEBUG << "Move memory from device to host: " << max_mem->name;

      /* Potentially need to call back into multi device, so pointer mapping
       * and peer devices are updated. This is also necessary since the device
       * pointer may just be a key here, so cannot be accessed and freed directly.
       * Unfortunately it does mean that memory is reallocated on all other
       * devices as well, which is potentially dangerous when still in use (since
       * a thread rendering on another device would only be caught in this mutex
       * if it so happens to do an allocation at the same time as well). */
      max_mem->move_to_host = true;
      max_mem->device_move_to_host();
      max_mem->move_to_host = false;
      size = (max_size >= size) ? 0 : size - max_size;

      /* Tag texture info update for new pointers. */
      need_texture_info = true;
    }
    else {
      break;
    }
  }
}
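
/* Illustration only, not part of the Cycles sources: a minimal standalone sketch of the
 * selection rule used in the loop above, where image textures are preferred and, within
 * the same kind, the largest allocation is moved first. `Allocation` and
 * `pick_allocation_to_move` are hypothetical names, and the `movable` flag stands in for
 * the residency, sharing and memory type checks. */

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced view of one device allocation.
struct Allocation {
  std::size_t size;
  bool is_image;
  bool movable;
};

// Index of the allocation to move next, or -1 if nothing can be moved.
inline int pick_allocation_to_move(const std::vector<Allocation> &allocations)
{
  int best = -1;
  std::size_t best_size = 0;
  bool best_is_image = false;
  for (std::size_t i = 0; i < allocations.size(); i++) {
    const Allocation &candidate = allocations[i];
    if (!candidate.movable) {
      continue;
    }
    /* An image beats any non-image, otherwise the larger allocation wins. */
    if (candidate.is_image > best_is_image ||
        (candidate.is_image == best_is_image && candidate.size > best_size))
    {
      best = static_cast<int>(i);
      best_size = candidate.size;
      best_is_image = candidate.is_image;
    }
  }
  return best;
}
```

/* As in the loop above, an allocation of size zero is never selected, since it cannot
 * exceed the initial maximum of zero unless it is the first image seen. */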

GPUDevice::Mem *GPUDevice::generic_alloc(device_memory &mem, const size_t pitch_padding)
{
  void *device_pointer = nullptr;
  const size_t size = mem.memory_size() + pitch_padding;

  bool mem_alloc_result = false;
  const char *status = "";

  /* First try allocating in device memory, respecting headroom. We make
   * an exception for texture info. It is small and frequently accessed,
   * so treat it as working memory.
   *
   * If there is not enough room for working memory, we will try to move
   * textures to host memory, assuming the performance impact would have
   * been worse for working memory. */
  const bool is_texture = (mem.type == MEM_TEXTURE || mem.type == MEM_GLOBAL) &&
                          (&mem != &texture_info);
  const bool is_image = is_texture && (mem.data_height > 1);

  const size_t headroom = (is_texture) ? device_texture_headroom : device_working_headroom;

  /* Move textures to host memory if needed. */
  if (!mem.move_to_host && !is_image && can_map_host) {
    move_textures_to_host(size, headroom, is_texture);
  }

  size_t total = 0;
  size_t free = 0;
  get_device_memory_info(total, free);

  /* Allocate in device memory. */
  if ((!mem.move_to_host && (size + headroom) < free) || (mem.type == MEM_DEVICE_ONLY)) {
    mem_alloc_result = alloc_device(device_pointer, size);
    if (mem_alloc_result) {
      device_mem_in_use += size;
      status = " in device memory";
    }
  }

  /* Fall back to mapped host memory if needed and possible. */

  void *shared_pointer = nullptr;

  if (!mem_alloc_result && can_map_host && mem.type != MEM_DEVICE_ONLY) {
    if (mem.shared_pointer) {
      /* Another device already allocated host memory. */
      mem_alloc_result = true;
      shared_pointer = mem.shared_pointer;
    }
    else if (map_host_used + size < map_host_limit) {
      /* Allocate host memory ourselves. */
      mem_alloc_result = shared_alloc(shared_pointer, size);

      assert((mem_alloc_result && shared_pointer != nullptr) ||
             (!mem_alloc_result && shared_pointer == nullptr));
    }

    if (mem_alloc_result) {
      device_pointer = shared_to_device_pointer(shared_pointer);
      map_host_used += size;
      status = " in host memory";
    }
  }

  if (!mem_alloc_result) {
    if (mem.type == MEM_DEVICE_ONLY) {
      status = " failed, out of device memory";
      set_error("System is out of GPU memory");
    }
    else {
      status = " failed, out of device and host memory";
      set_error("System is out of GPU and shared host memory");
    }
  }

  if (mem.name) {
    LOG_DEBUG << "Buffer allocate: " << mem.name << ", "
              << string_human_readable_number(mem.memory_size()) << " bytes. ("
              << string_human_readable_size(mem.memory_size()) << ")" << status;
  }

  mem.device_pointer = (device_ptr)device_pointer;
  mem.device_size = size;
  stats.mem_alloc(size);

  if (!mem.device_pointer) {
    return nullptr;
  }

  /* Insert into map of allocations. */
  const thread_scoped_lock lock(device_mem_map_mutex);
  Mem *cmem = &device_mem_map[&mem];
  if (shared_pointer != nullptr) {
    /* Replace host pointer with our host allocation. Only works if
     * memory layout is the same and has no pitch padding. Also
     * does not work if we move textures to host during a render,
     * since other devices might be using the memory. */

    if (!mem.move_to_host && pitch_padding == 0 && mem.host_pointer &&
        mem.host_pointer != shared_pointer)
    {
      memcpy(shared_pointer, mem.host_pointer, size);
      host_free(mem.type, mem.host_pointer, mem.memory_size());
      mem.host_pointer = shared_pointer;
    }
    mem.shared_pointer = shared_pointer;
    mem.shared_counter++;
  }

  return cmem;
}
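
/* Illustration only, not part of the Cycles sources: a minimal standalone sketch of the
 * placement decision made by the allocation above. Device memory is tried first and must
 * leave the headroom free, mapped host memory is the fallback, and device-only buffers
 * never fall back. `Budget`, `Placement` and `choose_placement` are hypothetical names.
 * The sketch only models the budget checks and assumes that the backend allocation
 * succeeds whenever it is attempted. */

```cpp
#include <cassert>
#include <cstddef>

enum class Placement { Device, Host, Failed };

// Hypothetical snapshot of the memory budget at the time of the request.
struct Budget {
  std::size_t device_free;
  std::size_t headroom;
  std::size_t host_used;
  std::size_t host_limit;
  bool can_map_host;
};

inline Placement choose_placement(const Budget &budget,
                                  const std::size_t size,
                                  const bool device_only)
{
  /* Device-only buffers are always attempted on the device. */
  if (device_only || size + budget.headroom < budget.device_free) {
    return Placement::Device;
  }
  /* Otherwise fall back to host memory while staying below the mapping limit. */
  if (budget.can_map_host && budget.host_used + size < budget.host_limit) {
    return Placement::Host;
  }
  return Placement::Failed;
}
```

/* The strict comparisons mirror the ones above: a request that would exactly consume the
 * free memory minus headroom is not placed on the device. */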

void GPUDevice::generic_free(device_memory &mem)
{
  if (!(mem.device_pointer && mem.is_resident(this))) {
    return;
  }

  /* Host pointer should already have been freed at this point. If not we might
   * end up freeing shared memory and can't recover original host memory. */
  assert(mem.host_pointer == nullptr || mem.move_to_host);

  const thread_scoped_lock lock(device_mem_map_mutex);
  DCHECK(device_mem_map.find(&mem) != device_mem_map.end());

  /* For host mapped memory, reference counting is used to safely free it. */
  if (mem.is_shared(this)) {
    assert(mem.shared_counter > 0);
    if (--mem.shared_counter == 0) {
      if (mem.host_pointer == mem.shared_pointer) {
        /* Safely move the device-side data back to the host before it is freed.
         * We should actually never reach this code as it is inefficient, but
         * better than to crash if there is a bug. */
        assert(!"GPU device should not copy memory back to host");
        const size_t size = mem.memory_size();
        mem.host_pointer = mem.host_alloc(size);
        memcpy(mem.host_pointer, mem.shared_pointer, size);
      }
      shared_free(mem.shared_pointer);
      mem.shared_pointer = nullptr;
    }
    map_host_used -= mem.device_size;
  }
  else {
    /* Free device memory. */
    free_device((void *)mem.device_pointer);
    device_mem_in_use -= mem.device_size;
  }

  stats.mem_free(mem.device_size);
  mem.device_pointer = 0;
  mem.device_size = 0;

  device_mem_map.erase(device_mem_map.find(&mem));
}
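
/* Illustration only, not part of the Cycles sources: a minimal standalone sketch of the
 * reference counting used for host mapped memory. The first device that needs the buffer
 * allocates it, later devices adopt the pointer, and the memory is released when the last
 * user is done. `SharedBuffer`, `acquire` and `release` are hypothetical names, and plain
 * malloc/free stand in for the backend specific shared allocation. */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, reduced stand-in for a buffer mapped by several devices.
struct SharedBuffer {
  void *shared_pointer = nullptr;
  int shared_counter = 0;
};

// Called once per device that starts using the host allocation.
// Returns false if the allocation failed, in which case nothing is counted.
inline bool acquire(SharedBuffer &buffer, const std::size_t size)
{
  if (buffer.shared_pointer == nullptr) {
    buffer.shared_pointer = std::malloc(size);
    if (buffer.shared_pointer == nullptr) {
      return false;
    }
  }
  buffer.shared_counter++;
  return true;
}

// Called once per device that stops using it. Returns true if the memory was freed.
inline bool release(SharedBuffer &buffer)
{
  assert(buffer.shared_counter > 0);
  if (--buffer.shared_counter > 0) {
    return false;
  }
  std::free(buffer.shared_pointer);
  buffer.shared_pointer = nullptr;
  return true;
}
```

/* The sketch is not thread safe on its own. In the function above the counter is only
 * modified while device_mem_map_mutex is held. */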

void GPUDevice::generic_copy_to(device_memory &mem)
{
  if (!mem.host_pointer || !mem.device_pointer) {
    return;
  }

  /* If not host mapped, the current device only uses device memory allocated by backend
   * device allocation regardless of mem.host_pointer and mem.shared_pointer, and should
   * copy data from mem.host_pointer. */
  if (!(mem.is_shared(this) && mem.host_pointer == mem.shared_pointer)) {
    copy_host_to_device((void *)mem.device_pointer, mem.host_pointer, mem.memory_size());
  }
}

bool GPUDevice::is_shared(const void *shared_pointer,
                          const device_ptr device_pointer,
                          Device * /*sub_device*/)
{
  return (shared_pointer && device_pointer &&
          (device_ptr)shared_to_device_pointer(shared_pointer) == device_pointer);
}

/* DeviceInfo */

CCL_NAMESPACE_END