# test2/source/blender/gpu/CMakeLists.txt
# SPDX-FileCopyrightText: 2006 Blender Authors
#
# SPDX-License-Identifier: GPL-2.0-or-later
set(INC
PUBLIC .
dummy
intern
metal
opengl
vulkan
../makesrna
# For theme color access.
../editors/include
# For `*_info.hh` includes.
../compositor/shaders/infos
../draw/engines/eevee
../draw/engines/eevee/shaders/infos
../draw/engines/gpencil
../draw/engines/gpencil/shaders/infos
../draw/engines/image/shaders/infos
../draw/engines/overlay/shaders/infos
../draw/engines/select
../draw/engines/select/shaders/infos
../draw/engines/workbench
../draw/engines/workbench/shaders/infos
../draw/intern
../draw/intern/shaders
metal/kernels
vulkan/shaders
shaders/infos
# For shader includes.
shaders/common
shaders
../../../intern/ghost
../../../intern/mantaflow/extern
../../../intern/opensubdiv
)
if(WITH_BUILDINFO)
add_definitions(-DWITH_BUILDINFO)
endif()
if(WITH_RENDERDOC)
list(APPEND INC
../../../extern/renderdoc/include
../../../intern/renderdoc_dynload/include
)
add_definitions(-DWITH_RENDERDOC)
endif()
if(WITH_GPU_SHADER_ASSERT)
add_definitions(-DWITH_GPU_SHADER_ASSERT)
endif()
set(INC_SYS
)
set(SRC
intern/gpu_batch.cc
intern/gpu_batch_presets.cc
intern/gpu_batch_utils.cc
intern/gpu_capabilities.cc
intern/gpu_codegen.cc
intern/gpu_compute.cc
intern/gpu_context.cc
intern/gpu_debug.cc
intern/gpu_framebuffer.cc
intern/gpu_immediate.cc
intern/gpu_immediate_util.cc
intern/gpu_index_buffer.cc
intern/gpu_init_exit.cc
intern/gpu_material.cc
intern/gpu_matrix.cc
intern/gpu_node_graph.cc
intern/gpu_pass.cc
intern/gpu_platform.cc
intern/gpu_query.cc
intern/gpu_select.cc
intern/gpu_select_next.cc
intern/gpu_select_pick.cc
intern/gpu_select_sample_query.cc
intern/gpu_shader.cc
intern/gpu_shader_builtin.cc
intern/gpu_shader_create_info.cc
intern/gpu_shader_dependency.cc
intern/gpu_shader_interface.cc
intern/gpu_shader_log.cc
intern/gpu_state.cc
intern/gpu_storage_buffer.cc
intern/gpu_texture.cc
intern/gpu_texture_pool.cc
intern/gpu_uniform_buffer.cc
intern/gpu_vertex_buffer.cc
intern/gpu_vertex_format.cc
intern/gpu_vertex_format_normals.cc
intern/gpu_viewport.cc
intern/gpu_worker.cc
GPU_attribute_convert.hh
GPU_batch.hh
GPU_batch_presets.hh
GPU_batch_utils.hh
GPU_capabilities.hh
GPU_common.hh
GPU_common_types.hh
GPU_compilation_subprocess.hh
GPU_compute.hh
GPU_context.hh
GPU_debug.hh
GPU_format.hh
GPU_framebuffer.hh
GPU_immediate.hh
GPU_immediate_util.hh
GPU_index_buffer.hh
GPU_init_exit.hh
GPU_material.hh
GPU_matrix.hh
GPU_pass.hh
GPU_platform.hh
GPU_platform_backend_enum.h
GPU_primitive.hh
GPU_select.hh
GPU_shader.hh
GPU_shader_builtin.hh
GPU_shader_shared.hh
GPU_state.hh
GPU_storage_buffer.hh
GPU_texture.hh
GPU_texture_pool.hh
GPU_uniform_buffer.hh
GPU_vertex_buffer.hh
GPU_vertex_format.hh
GPU_viewport.hh
GPU_worker.hh
intern/gpu_backend.hh
intern/gpu_capabilities_private.hh
intern/gpu_codegen.hh
intern/gpu_context_private.hh
intern/gpu_debug_private.hh
intern/gpu_framebuffer_private.hh
intern/gpu_immediate_private.hh
intern/gpu_material_library.hh
intern/gpu_matrix_private.hh
intern/gpu_node_graph.hh
intern/gpu_platform_private.hh
intern/gpu_private.hh
intern/gpu_profile_report.hh
intern/gpu_query.hh
intern/gpu_select_private.hh
intern/gpu_shader_create_info.hh
intern/gpu_shader_create_info_private.hh
intern/gpu_shader_dependency_private.hh
intern/gpu_shader_interface.hh
intern/gpu_shader_private.hh
intern/gpu_state_private.hh
intern/gpu_storage_buffer_private.hh
intern/gpu_texture_private.hh
intern/gpu_uniform_buffer_private.hh
intern/gpu_vertex_format_private.hh
dummy/dummy_backend.hh
dummy/dummy_batch.hh
dummy/dummy_context.hh
dummy/dummy_framebuffer.hh
dummy/dummy_vertex_buffer.hh
)
set(OPENGL_SRC
opengl/gl_backend.cc
opengl/gl_batch.cc
opengl/gl_compilation_subprocess.cc
opengl/gl_compute.cc
opengl/gl_context.cc
opengl/gl_debug.cc
opengl/gl_framebuffer.cc
opengl/gl_immediate.cc
opengl/gl_index_buffer.cc
opengl/gl_query.cc
opengl/gl_shader.cc
opengl/gl_shader_interface.cc
opengl/gl_shader_log.cc
opengl/gl_state.cc
opengl/gl_storage_buffer.cc
opengl/gl_texture.cc
opengl/gl_uniform_buffer.cc
opengl/gl_vertex_array.cc
opengl/gl_vertex_buffer.cc
opengl/gl_backend.hh
opengl/gl_batch.hh
opengl/gl_compilation_subprocess.hh
opengl/gl_compute.hh
opengl/gl_context.hh
opengl/gl_debug.hh
opengl/gl_framebuffer.hh
opengl/gl_immediate.hh
opengl/gl_index_buffer.hh
opengl/gl_primitive.hh
opengl/gl_query.hh
opengl/gl_shader.hh
opengl/gl_shader_interface.hh
opengl/gl_state.hh
opengl/gl_storage_buffer.hh
opengl/gl_texture.hh
opengl/gl_uniform_buffer.hh
opengl/gl_vertex_array.hh
opengl/gl_vertex_buffer.hh
)
set(VULKAN_SRC
vulkan/vk_backend.cc
vulkan/vk_batch.cc
vulkan/vk_buffer.cc
vulkan/vk_common.cc
vulkan/vk_context.cc
vulkan/vk_data_conversion.cc
vulkan/vk_debug.cc
vulkan/vk_descriptor_pools.cc
vulkan/vk_descriptor_set.cc
vulkan/vk_descriptor_set_layouts.cc
vulkan/vk_device.cc
vulkan/vk_device_submission.cc
vulkan/vk_fence.cc
vulkan/vk_framebuffer.cc
vulkan/vk_image_view.cc
vulkan/vk_immediate.cc
vulkan/vk_index_buffer.cc
vulkan/vk_memory_layout.cc
vulkan/vk_memory_pool.cc
vulkan/vk_pipeline_pool.cc
vulkan/vk_pixel_buffer.cc
vulkan/vk_push_constants.cc
vulkan/vk_query.cc
vulkan/render_graph/nodes/vk_pipeline_data.cc
vulkan/render_graph/vk_command_buffer_wrapper.cc
vulkan/render_graph/vk_command_builder.cc
vulkan/render_graph/vk_render_graph.cc
vulkan/render_graph/vk_render_graph_links.cc
vulkan/render_graph/vk_resource_access_info.cc
vulkan/render_graph/vk_resource_state_tracker.cc
vulkan/render_graph/vk_scheduler.cc
vulkan/vk_resource_pool.cc
vulkan/vk_sampler.cc
vulkan/vk_samplers.cc
vulkan/vk_shader.cc
vulkan/vk_shader_compiler.cc
vulkan/vk_shader_interface.cc
vulkan/vk_shader_log.cc
vulkan/vk_shader_module.cc
vulkan/vk_staging_buffer.cc
vulkan/vk_state_manager.cc
vulkan/vk_storage_buffer.cc
vulkan/vk_texture.cc
vulkan/vk_to_string.cc
vulkan/vk_uniform_buffer.cc
vulkan/vk_vertex_attribute_object.cc
vulkan/vk_vertex_buffer.cc
vulkan/vk_backend.hh
vulkan/vk_batch.hh
vulkan/vk_buffer.hh
vulkan/vk_common.hh
vulkan/vk_context.hh
vulkan/vk_data_conversion.hh
vulkan/vk_debug.hh
vulkan/vk_descriptor_pools.hh
vulkan/vk_descriptor_set.hh
vulkan/vk_descriptor_set_layouts.hh
vulkan/vk_device.hh
vulkan/vk_fence.hh
vulkan/vk_framebuffer.hh
vulkan/vk_ghost_api.hh
vulkan/vk_image_view.hh
vulkan/vk_immediate.hh
vulkan/vk_index_buffer.hh
vulkan/vk_memory.hh
vulkan/vk_memory_layout.hh
vulkan/vk_memory_pool.hh
vulkan/vk_pipeline_pool.hh
vulkan/vk_pixel_buffer.hh
vulkan/vk_push_constants.hh
vulkan/vk_query.hh
vulkan/render_graph/nodes/vk_begin_query_node.hh
vulkan/render_graph/nodes/vk_begin_rendering_node.hh
vulkan/render_graph/nodes/vk_blit_image_node.hh
vulkan/render_graph/nodes/vk_clear_attachments_node.hh
vulkan/render_graph/nodes/vk_clear_color_image_node.hh
vulkan/render_graph/nodes/vk_clear_depth_stencil_image_node.hh
vulkan/render_graph/nodes/vk_copy_buffer_node.hh
vulkan/render_graph/nodes/vk_copy_buffer_to_image_node.hh
vulkan/render_graph/nodes/vk_copy_image_node.hh
vulkan/render_graph/nodes/vk_copy_image_to_buffer_node.hh
vulkan/render_graph/nodes/vk_dispatch_indirect_node.hh
vulkan/render_graph/nodes/vk_dispatch_node.hh
vulkan/render_graph/nodes/vk_draw_indexed_indirect_node.hh
vulkan/render_graph/nodes/vk_draw_indexed_node.hh
vulkan/render_graph/nodes/vk_draw_indirect_node.hh
vulkan/render_graph/nodes/vk_draw_node.hh
vulkan/render_graph/nodes/vk_end_query_node.hh
vulkan/render_graph/nodes/vk_end_rendering_node.hh
vulkan/render_graph/nodes/vk_fill_buffer_node.hh
vulkan/render_graph/nodes/vk_node_info.hh
vulkan/render_graph/nodes/vk_pipeline_data.hh
vulkan/render_graph/nodes/vk_reset_query_pool_node.hh
vulkan/render_graph/nodes/vk_synchronization_node.hh
vulkan/render_graph/nodes/vk_update_buffer_node.hh
vulkan/render_graph/nodes/vk_update_mipmaps_node.hh
vulkan/render_graph/vk_command_buffer_wrapper.hh
vulkan/render_graph/vk_command_builder.hh
vulkan/render_graph/vk_render_graph.hh
vulkan/render_graph/vk_render_graph_links.hh
vulkan/render_graph/vk_render_graph_node.hh
vulkan/render_graph/vk_resource_access_info.hh
vulkan/render_graph/vk_resource_state_tracker.hh
vulkan/render_graph/vk_scheduler.hh
vulkan/vk_resource_pool.hh
vulkan/vk_sampler.hh
vulkan/vk_samplers.hh
vulkan/vk_shader.hh
vulkan/vk_shader_compiler.hh
vulkan/vk_shader_interface.hh
vulkan/vk_shader_log.hh
vulkan/vk_shader_module.hh
vulkan/vk_staging_buffer.hh
vulkan/vk_state_manager.hh
vulkan/vk_storage_buffer.hh
vulkan/vk_texture.hh
vulkan/vk_to_string.hh
vulkan/vk_uniform_buffer.hh
vulkan/vk_vertex_attribute_object.hh
vulkan/vk_vertex_buffer.hh
)
set(METAL_SRC
metal/mtl_backend.mm
metal/mtl_batch.mm
metal/mtl_command_buffer.mm
metal/mtl_context.mm
metal/mtl_debug.mm
metal/mtl_framebuffer.mm
metal/mtl_immediate.mm
metal/mtl_index_buffer.mm
metal/mtl_memory.mm
metal/mtl_query.mm
metal/mtl_shader.mm
metal/mtl_shader_generator.mm
metal/mtl_shader_interface.mm
metal/mtl_shader_log.mm
metal/mtl_state.mm
metal/mtl_storage_buffer.mm
metal/mtl_texture.mm
metal/mtl_texture_util.mm
metal/mtl_uniform_buffer.mm
metal/mtl_vertex_buffer.mm
metal/mtl_backend.hh
metal/mtl_batch.hh
metal/mtl_capabilities.hh
metal/mtl_common.hh
metal/mtl_context.hh
metal/mtl_debug.hh
metal/mtl_framebuffer.hh
metal/mtl_immediate.hh
metal/mtl_index_buffer.hh
metal/mtl_memory.hh
metal/mtl_primitive.hh
metal/mtl_pso_descriptor_state.hh
metal/mtl_query.hh
metal/mtl_shader.hh
metal/mtl_shader_generator.hh
metal/mtl_shader_interface.hh
metal/mtl_shader_interface_type.hh
metal/mtl_shader_log.hh
metal/mtl_shader_shared.hh
metal/mtl_state.hh
metal/mtl_storage_buffer.hh
metal/mtl_texture.hh
metal/mtl_uniform_buffer.hh
metal/mtl_vertex_buffer.hh
)
set(LIB
PRIVATE bf::blenkernel
PRIVATE bf::blenlib
PRIVATE bf::bmesh
PRIVATE bf::dna
PRIVATE bf::draw
PRIVATE bf::imbuf
PRIVATE bf::intern::atomic
PRIVATE bf::intern::clog
PRIVATE bf::intern::guardedalloc
PRIVATE bf::extern::fmtlib
PRIVATE bf::nodes
PRIVATE bf::dependencies::optional::opencolorio
)
# Select backend sources based on availability.
if(WITH_OPENGL_BACKEND)
list(APPEND INC_SYS
${Epoxy_INCLUDE_DIRS}
)
list(APPEND SRC
${OPENGL_SRC}
)
list(APPEND LIB
${Epoxy_LIBRARIES}
)
add_definitions(-DWITH_OPENGL_BACKEND)
endif()
if(WITH_METAL_BACKEND)
list(APPEND SRC ${METAL_SRC})
endif()
if(WITH_VULKAN_BACKEND)
list(APPEND INC
../../../extern/vulkan_memory_allocator
)
  list(APPEND INC_SYS
    ${VULKAN_INCLUDE_DIRS}
    ${SHADERC_INCLUDE_DIRS}
  )
list(APPEND SRC
${VULKAN_SRC}
)
list(APPEND LIB
${VULKAN_LIBRARIES}
${SHADERC_LIBRARIES}
extern_vulkan_memory_allocator
PRIVATE bf::extern::xxhash
)
add_definitions(-DWITH_VULKAN_BACKEND)
endif()
set(GLSL_SRC
GPU_shader_shared.hh
shaders/infos/gpu_clip_planes_infos.hh
shaders/infos/gpu_index_load_infos.hh
shaders/infos/gpu_interface_infos.hh
shaders/infos/gpu_shader_2D_area_borders_infos.hh
shaders/infos/gpu_shader_2D_checker_infos.hh
shaders/infos/gpu_shader_2D_diag_stripes_infos.hh
shaders/infos/gpu_shader_2D_image_desaturate_color_infos.hh
shaders/infos/gpu_shader_2D_image_infos.hh
shaders/infos/gpu_shader_2D_image_overlays_merge_infos.hh
shaders/infos/gpu_shader_2D_image_overlays_stereo_merge_infos.hh
shaders/infos/gpu_shader_2D_image_rect_color_infos.hh
shaders/infos/gpu_shader_2D_image_shuffle_color_infos.hh
shaders/infos/gpu_shader_2D_node_socket_infos.hh
shaders/infos/gpu_shader_2D_nodelink_infos.hh
shaders/infos/gpu_shader_2D_point_uniform_size_uniform_color_aa_infos.hh
shaders/infos/gpu_shader_2D_point_uniform_size_uniform_color_outline_aa_infos.hh
shaders/infos/gpu_shader_2D_point_varying_size_varying_color_infos.hh
shaders/infos/gpu_shader_2D_widget_infos.hh
shaders/infos/gpu_shader_3D_depth_only_infos.hh
shaders/infos/gpu_shader_3D_flat_color_infos.hh
shaders/infos/gpu_shader_3D_image_infos.hh
shaders/infos/gpu_shader_3D_point_infos.hh
shaders/infos/gpu_shader_3D_polyline_infos.hh
shaders/infos/gpu_shader_3D_smooth_color_infos.hh
shaders/infos/gpu_shader_3D_uniform_color_infos.hh
shaders/infos/gpu_shader_fullscreen_infos.hh
shaders/infos/gpu_shader_gpencil_stroke_infos.hh
shaders/infos/gpu_shader_icon_infos.hh
shaders/infos/gpu_shader_index_infos.hh
shaders/infos/gpu_shader_keyframe_shape_infos.hh
shaders/infos/gpu_shader_line_dashed_uniform_color_infos.hh
shaders/infos/gpu_shader_print_infos.hh
shaders/infos/gpu_shader_sequencer_infos.hh
shaders/infos/gpu_shader_simple_lighting_infos.hh
shaders/infos/gpu_shader_text_infos.hh
shaders/infos/gpu_srgb_to_framebuffer_space_infos.hh
shaders/gpu_shader_depth_only_frag.glsl
shaders/gpu_shader_uniform_color_frag.glsl
shaders/gpu_shader_checker_frag.glsl
shaders/gpu_shader_diag_stripes_frag.glsl
shaders/gpu_shader_simple_lighting_frag.glsl
shaders/gpu_shader_flat_color_frag.glsl
shaders/gpu_shader_2D_vert.glsl
shaders/gpu_shader_2D_area_borders_vert.glsl
shaders/gpu_shader_2D_area_borders_frag.glsl
shaders/gpu_shader_2D_widget_base_vert.glsl
shaders/gpu_shader_2D_widget_base_frag.glsl
shaders/gpu_shader_2D_widget_shadow_vert.glsl
shaders/gpu_shader_2D_widget_shadow_frag.glsl
shaders/gpu_shader_2D_node_socket_frag.glsl
shaders/gpu_shader_2D_node_socket_vert.glsl
shaders/gpu_shader_2D_nodelink_frag.glsl
shaders/gpu_shader_2D_nodelink_vert.glsl
shaders/gpu_shader_2D_line_dashed_frag.glsl
shaders/gpu_shader_2D_image_vert.glsl
shaders/gpu_shader_2D_image_rect_vert.glsl
shaders/gpu_shader_icon_multi_vert.glsl
shaders/gpu_shader_icon_frag.glsl
shaders/gpu_shader_icon_vert.glsl
shaders/gpu_shader_image_frag.glsl
shaders/gpu_shader_image_desaturate_frag.glsl
shaders/gpu_shader_image_overlays_merge_frag.glsl
shaders/gpu_shader_image_overlays_stereo_merge_frag.glsl
shaders/gpu_shader_image_shuffle_color_frag.glsl
shaders/gpu_shader_image_color_frag.glsl
shaders/gpu_shader_3D_image_vert.glsl
shaders/gpu_shader_3D_vert.glsl
shaders/gpu_shader_3D_normal_vert.glsl
shaders/gpu_shader_3D_flat_color_vert.glsl
shaders/gpu_shader_3D_line_dashed_uniform_color_vert.glsl
shaders/gpu_shader_3D_polyline_frag.glsl
shaders/gpu_shader_3D_polyline_vert.glsl
shaders/gpu_shader_3D_smooth_color_vert.glsl
shaders/gpu_shader_3D_smooth_color_frag.glsl
shaders/gpu_shader_3D_clipped_uniform_color_vert.glsl
shaders/gpu_shader_point_uniform_color_aa_frag.glsl
shaders/gpu_shader_point_uniform_color_outline_aa_frag.glsl
shaders/gpu_shader_point_varying_color_frag.glsl
shaders/gpu_shader_3D_point_varying_size_varying_color_vert.glsl
shaders/gpu_shader_3D_point_uniform_size_aa_vert.glsl
shaders/gpu_shader_3D_point_flat_color_vert.glsl
shaders/gpu_shader_2D_point_varying_size_varying_color_vert.glsl
shaders/gpu_shader_2D_point_uniform_size_aa_vert.glsl
shaders/gpu_shader_2D_point_uniform_size_outline_aa_vert.glsl
shaders/gpu_shader_text_vert.glsl
shaders/gpu_shader_text_frag.glsl
shaders/gpu_shader_keyframe_shape_vert.glsl
shaders/gpu_shader_keyframe_shape_frag.glsl
shaders/gpu_shader_sequencer_scope_comp.glsl
shaders/gpu_shader_sequencer_scope_frag.glsl
shaders/gpu_shader_sequencer_strips_vert.glsl
shaders/gpu_shader_sequencer_strips_frag.glsl
shaders/gpu_shader_sequencer_thumbs_vert.glsl
shaders/gpu_shader_sequencer_thumbs_frag.glsl
shaders/gpu_shader_sequencer_zebra_frag.glsl
shaders/gpu_shader_codegen_lib.glsl
shaders/common/gpu_shader_attribute_load_lib.glsl
shaders/common/gpu_shader_bicubic_sampler_lib.glsl
shaders/common/gpu_shader_common_color_ramp.glsl
shaders/common/gpu_shader_common_color_utils.glsl
shaders/common/gpu_shader_common_curves.glsl
shaders/common/gpu_shader_common_hash.glsl
shaders/common/gpu_shader_common_math_utils.glsl
shaders/common/gpu_shader_common_math.glsl
shaders/common/gpu_shader_common_mix_rgb.glsl
shaders/common/gpu_shader_debug_gradients_lib.glsl
shaders/common/gpu_shader_fullscreen_vert.glsl
shaders/common/gpu_shader_index_load_lib.glsl
shaders/common/gpu_shader_index_range_lib.glsl
shaders/common/gpu_shader_math_angle_lib.glsl
shaders/common/gpu_shader_math_axis_angle_lib.glsl
shaders/common/gpu_shader_math_base_lib.glsl
shaders/common/gpu_shader_math_constants_lib.glsl
shaders/common/gpu_shader_math_euler_lib.glsl
shaders/common/gpu_shader_math_fast_lib.glsl
shaders/common/gpu_shader_math_matrix_adjoint_lib.glsl
shaders/common/gpu_shader_math_matrix_compare_lib.glsl
shaders/common/gpu_shader_math_matrix_construct_lib.glsl
shaders/common/gpu_shader_math_matrix_conversion_lib.glsl
shaders/common/gpu_shader_math_matrix_interpolate_lib.glsl
shaders/common/gpu_shader_math_matrix_lib.glsl
shaders/common/gpu_shader_math_matrix_normalize_lib.glsl
shaders/common/gpu_shader_math_matrix_projection_lib.glsl
shaders/common/gpu_shader_math_matrix_transform_lib.glsl
shaders/common/gpu_shader_math_quaternion_lib.glsl
shaders/common/gpu_shader_math_rotation_conversion_lib.glsl
shaders/common/gpu_shader_math_rotation_lib.glsl
shaders/common/gpu_shader_math_safe_lib.glsl
shaders/common/gpu_shader_math_vector_compare_lib.glsl
shaders/common/gpu_shader_math_vector_lib.glsl
shaders/common/gpu_shader_math_vector_reduce_lib.glsl
shaders/common/gpu_shader_math_vector_safe_lib.glsl
shaders/common/gpu_shader_offset_indices_lib.glsl
shaders/common/gpu_shader_print_lib.glsl
shaders/common/gpu_shader_ray_utils_lib.glsl
shaders/common/gpu_shader_ray_lib.glsl
shaders/common/gpu_shader_sequencer_lib.glsl
shaders/common/gpu_shader_shared_exponent_lib.glsl
shaders/common/gpu_shader_smaa_lib.glsl
shaders/common/gpu_shader_test_lib.glsl
shaders/common/gpu_shader_utildefines_lib.glsl
shaders/material/gpu_shader_material_add_shader.glsl
shaders/material/gpu_shader_material_ambient_occlusion.glsl
shaders/material/gpu_shader_material_attribute.glsl
shaders/material/gpu_shader_material_background.glsl
shaders/material/gpu_shader_material_bevel.glsl
shaders/material/gpu_shader_material_wavelength.glsl
shaders/material/gpu_shader_material_blackbody.glsl
shaders/material/gpu_shader_material_bright_contrast.glsl
shaders/material/gpu_shader_material_bump.glsl
shaders/material/gpu_shader_material_camera.glsl
shaders/material/gpu_shader_material_clamp.glsl
shaders/material/gpu_shader_material_combine_color.glsl
shaders/material/gpu_shader_material_combine_xyz.glsl
shaders/material/gpu_shader_material_diffuse.glsl
shaders/material/gpu_shader_material_displacement.glsl
shaders/material/gpu_shader_material_eevee_specular.glsl
shaders/material/gpu_shader_material_emission.glsl
shaders/material/gpu_shader_material_fractal_noise.glsl
shaders/material/gpu_shader_material_fractal_voronoi.glsl
shaders/material/gpu_shader_material_fresnel.glsl
shaders/material/gpu_shader_material_gamma.glsl
shaders/material/gpu_shader_material_geometry.glsl
shaders/material/gpu_shader_material_glass.glsl
shaders/material/gpu_shader_material_glossy.glsl
shaders/material/gpu_shader_material_hair_info.glsl
shaders/material/gpu_shader_material_hair.glsl
shaders/material/gpu_shader_material_holdout.glsl
shaders/material/gpu_shader_material_hue_sat_val.glsl
shaders/material/gpu_shader_material_invert.glsl
shaders/material/gpu_shader_material_layer_weight.glsl
shaders/material/gpu_shader_material_light_falloff.glsl
shaders/material/gpu_shader_material_light_path.glsl
shaders/material/gpu_shader_material_mapping.glsl
shaders/material/gpu_shader_material_map_range.glsl
shaders/material/gpu_shader_material_metallic.glsl
shaders/material/gpu_shader_material_mix_color.glsl
shaders/material/gpu_shader_material_mix_shader.glsl
shaders/material/gpu_shader_material_noise.glsl
shaders/material/gpu_shader_material_normal.glsl
shaders/material/gpu_shader_material_normal_map.glsl
shaders/material/gpu_shader_material_object_info.glsl
shaders/material/gpu_shader_material_output_aov.glsl
shaders/material/gpu_shader_material_output_material.glsl
shaders/material/gpu_shader_material_output_world.glsl
shaders/material/gpu_shader_material_particle_info.glsl
shaders/material/gpu_shader_material_point_info.glsl
shaders/material/gpu_shader_material_principled.glsl
shaders/material/gpu_shader_material_ray_portal.glsl
shaders/material/gpu_shader_material_refraction.glsl
shaders/material/gpu_shader_material_rgb_to_bw.glsl
shaders/material/gpu_shader_material_radial_tiling.glsl
shaders/material/gpu_shader_material_radial_tiling_shared.glsl
shaders/material/gpu_shader_material_separate_color.glsl
shaders/material/gpu_shader_material_separate_xyz.glsl
shaders/material/gpu_shader_material_set.glsl
shaders/material/gpu_shader_material_shader_to_rgba.glsl
shaders/material/gpu_shader_material_sheen.glsl
shaders/material/gpu_shader_material_squeeze.glsl
shaders/material/gpu_shader_material_subsurface_scattering.glsl
shaders/material/gpu_shader_material_tangent.glsl
shaders/material/gpu_shader_material_tex_brick.glsl
shaders/material/gpu_shader_material_tex_checker.glsl
shaders/material/gpu_shader_material_tex_environment.glsl
shaders/material/gpu_shader_material_tex_gabor.glsl
shaders/material/gpu_shader_material_tex_gradient.glsl
shaders/material/gpu_shader_material_tex_image.glsl
shaders/material/gpu_shader_material_tex_magic.glsl
shaders/material/gpu_shader_material_tex_noise.glsl
shaders/material/gpu_shader_material_tex_sky.glsl
shaders/material/gpu_shader_material_texture_coordinates.glsl
shaders/material/gpu_shader_material_tex_voronoi.glsl
shaders/material/gpu_shader_material_tex_wave.glsl
shaders/material/gpu_shader_material_tex_white_noise.glsl
shaders/material/gpu_shader_material_toon.glsl
shaders/material/gpu_shader_material_transform_utils.glsl
shaders/material/gpu_shader_material_translucent.glsl
shaders/material/gpu_shader_material_transparent.glsl
shaders/material/gpu_shader_material_uv_map.glsl
shaders/material/gpu_shader_material_vector_displacement.glsl
shaders/material/gpu_shader_material_vector_math.glsl
shaders/material/gpu_shader_material_vector_rotate.glsl
shaders/material/gpu_shader_material_vertex_color.glsl
shaders/material/gpu_shader_material_volume_absorption.glsl
shaders/material/gpu_shader_material_volume_principled.glsl
shaders/material/gpu_shader_material_volume_scatter.glsl
shaders/material/gpu_shader_material_volume_coefficients.glsl
shaders/material/gpu_shader_material_voronoi.glsl
shaders/material/gpu_shader_material_wireframe.glsl
shaders/material/gpu_shader_material_world_normals.glsl
shaders/gpu_shader_gpencil_stroke_vert.glsl
shaders/gpu_shader_gpencil_stroke_frag.glsl
shaders/gpu_shader_display_fallback_vert.glsl
shaders/gpu_shader_display_fallback_frag.glsl
shaders/gpu_shader_cfg_world_clip_lib.glsl
shaders/gpu_shader_colorspace_lib.glsl
shaders/gpu_shader_index_2d_array_points.glsl
shaders/gpu_shader_index_2d_array_lines.glsl
shaders/gpu_shader_index_2d_array_tris.glsl
shaders/gpu_shader_compat_glsl.glsl
shaders/gpu_shader_glsl_extension.glsl
GPU_shader_shared_utils.hh
)
set(GLSL_SRC_TEST
shaders/infos/gpu_shader_test_infos.hh
tests/shaders/gpu_math_test.glsl
tests/shaders/gpu_buffer_texture_test.glsl
tests/shaders/gpu_compute_1d_test.glsl
tests/shaders/gpu_compute_2d_test.glsl
tests/shaders/gpu_compute_ibo_test.glsl
tests/shaders/gpu_compute_ssbo_test.glsl
tests/shaders/gpu_compute_vbo_test.glsl
tests/shaders/gpu_compute_dummy_test.glsl
tests/shaders/gpu_specialization_test.glsl
tests/shaders/gpu_framebuffer_layer_viewport_test.glsl
tests/shaders/gpu_framebuffer_subpass_input_test.glsl
tests/shaders/gpu_push_constants_test.glsl
)
set(MTL_BACKEND_GLSL_SRC
metal/kernels/depth_2d_update_infos.hh
metal/kernels/gpu_shader_fullscreen_blit_infos.hh
metal/kernels/depth_2d_update_float_frag.glsl
metal/kernels/depth_2d_update_int24_frag.glsl
metal/kernels/depth_2d_update_int32_frag.glsl
metal/kernels/depth_2d_update_vert.glsl
metal/kernels/gpu_shader_fullscreen_blit_vert.glsl
metal/kernels/gpu_shader_fullscreen_blit_frag.glsl
shaders/gpu_shader_msl_atomic.msl
shaders/gpu_shader_msl_attribute.msl
shaders/gpu_shader_msl_builtin.msl
shaders/gpu_shader_msl_image.msl
shaders/gpu_shader_msl_matrix_legacy.msl
shaders/gpu_shader_msl_matrix.msl
shaders/gpu_shader_msl_sampler.msl
shaders/gpu_shader_msl_types_legacy.msl
shaders/gpu_shader_msl_defines.msl
shaders/gpu_shader_compat_msl.msl
)
set(MSL_SRC
metal/mtl_shader_shared.hh
metal/kernels/compute_texture_update.msl
metal/kernels/compute_texture_read.msl
)
set(VULKAN_BACKEND_GLSL_SRC
vulkan/shaders/vk_backbuffer_blit_infos.hh
vulkan/shaders/vk_backbuffer_blit_comp.glsl
)
if(WITH_GTESTS)
if(WITH_GPU_BACKEND_TESTS)
list(APPEND GLSL_SRC ${GLSL_SRC_TEST})
endif()
endif()
if(WITH_METAL_BACKEND)
list(APPEND GLSL_SRC ${MTL_BACKEND_GLSL_SRC})
set(MSL_C)
foreach(MSL_FILE ${MSL_SRC})
data_to_c_simple(${MSL_FILE} MSL_C)
endforeach()
endif()
if(WITH_VULKAN_BACKEND)
list(APPEND GLSL_SRC ${VULKAN_BACKEND_GLSL_SRC})
endif()
set(GLSL_C)
foreach(GLSL_FILE ${GLSL_SRC})
glsl_to_c(${GLSL_FILE} GLSL_C)
DRW: New Curve Drawing Implementation of the design task #142969. This adds the following: - Exact GPU interpolation of curves of all types. - Radius attribute support. - Cyclic curve support. - Resolution attribute support. - New Cylinder hair shape type. ![image.png](/attachments/a8e7aea0-b0e5-4694-b660-89fb3df1ddcd) What changed: - EEVEE doesn't compute random normals for strand hairs anymore. These are considered legacy now. - EEVEE now have an internal shadow bias to avoid self shadowing on hair. - Workbench Curves Strip display option is no longer flat and has better shading. - Legacy Hair particle system evaluates radius at control points before applying additional subdivision. This now matches Cycles. - Color Attribute Node without a name do not fetch the active color attribute anymore. This now matches Cycles. Notes: - This is not 100% matching the CPU implementation for interpolation (see the epsilons in the tests). - Legacy Hair Particle points is now stored in local space after interpolation. The new cylinder shape allows for more correct hair shading in workbench and better intersection in EEVEE. 
| | Strand | Strip | Cylinder | | ---- | --- | --- | --- | | Main | ![main_strand.png](/attachments/67d3b792-962c-4272-a92c-1c0c7c6cf8de) | ![main_strip.png](/attachments/f2aa3575-368e-4fbb-b888-74df845918f1) | N/A | | PR | ![pr_strand.png](/attachments/cc012483-25f0-491f-a06e-ad3029981d47) | ![pr_strip.png](/attachments/73fa2f5c-5252-4b30-a334-e935ed0fb938) | ![pr_cylinder.png](/attachments/3133b2d4-a6f2-41ee-8e2d-f6fd00db0c8d) | | | Strand | Strip | Cylinder | | ---- | --- | --- | --- | | Main | ![main_strand_closeup.png](/attachments/730bd79c-6762-446d-819b-3ea47961ff9f) |![main_strip_closeup.png](/attachments/d9ace578-cfeb-4895-9896-3625b6ad7a02) | N/A | | PR | ![pr_strand_closeup.png](/attachments/ac8f3b0c-6ef6-4d54-b714-6322f9865036)|![pr_strip_closeup.png](/attachments/8504711a-955b-4ab2-aa3d-c2d114baf9d4)| ![pr_cylinder_closeup.png](/attachments/1e2899a8-0a5c-431f-ac6c-5184d87e9598) | Cyclic Curve, Mixed curve type, and proper radius support: ![image.png](/attachments/7f0bf05e-62ee-4ae9-aef9-a5599249b8d7) Test file for attribute lookup: [test_attribute_lookup.blend](/attachments/1d54dd06-379b-4480-a1c5-96adc1953f77) Follow Up Tasks: - Correct full tube segments orientation based on tangent and normal attributes - Correct V resolution property per object - More attribute type support (currently only color) TODO: - [x] Attribute Loading Changes - [x] Generic Attributes - [x] Length Attribute - [x] Intercept Attribute - [x] Original Coordinate Attribute - [x] Cyclic Curves - [x] Legacy Hair Particle conversion - [x] Attribute Loading - [x] Additional Subdivision - [x] Move some function to generic headers (VertBuf, OffsetIndices) - [x] Fix default UV/Color attribute assignment Pull Request: https://projects.blender.org/blender/blender/pulls/143180
2025-08-27 09:49:43 +02:00
endforeach()
set(SHADER_C)
list(APPEND SHADER_C ${GLSL_C})
if(WITH_METAL_BACKEND)
list(APPEND SHADER_C ${MSL_C})
endif()
blender_add_lib(bf_gpu_shaders "${SHADER_C}" "" "" "")
blender_set_target_unity_build(bf_gpu_shaders 10)
list(APPEND LIB
bf_gpu_shaders
)
set(GLSL_SOURCE_CONTENT "")
set(GLSL_METADATA_CONTENT "")
set(GLSL_INFOS_CONTENT "")
foreach(GLSL_FILE ${GLSL_SRC})
get_filename_component(GLSL_FILE_NAME ${GLSL_FILE} NAME)
string(REPLACE "." "_" GLSL_FILE_NAME_UNDERSCORES ${GLSL_FILE_NAME})
string(APPEND GLSL_SOURCE_CONTENT "SHADER_SOURCE\(${GLSL_FILE_NAME_UNDERSCORES}, \"${GLSL_FILE_NAME}\", \"${GLSL_FILE}\"\)\n")
string(APPEND GLSL_METADATA_CONTENT "#include \"${GLSL_FILE}.hh\"\n")
string(APPEND GLSL_INFOS_CONTENT "#include \"${GLSL_FILE}.info\"\n")
endforeach()
set(glsl_source_list_file "${CMAKE_CURRENT_BINARY_DIR}/glsl_gpu_source_list.h")
file(GENERATE OUTPUT ${glsl_source_list_file} CONTENT "${GLSL_SOURCE_CONTENT}")
list(APPEND SRC ${glsl_source_list_file})
set(glsl_metadata_list_file "${CMAKE_CURRENT_BINARY_DIR}/glsl_gpu_metadata_list.hh")
file(GENERATE OUTPUT ${glsl_metadata_list_file} CONTENT "${GLSL_METADATA_CONTENT}")
list(APPEND SRC ${glsl_metadata_list_file})
set(glsl_infos_list_file "${CMAKE_CURRENT_BINARY_DIR}/glsl_gpu_infos_list.hh")
file(GENERATE OUTPUT ${glsl_infos_list_file} CONTENT "${GLSL_INFOS_CONTENT}")
list(APPEND SRC ${glsl_infos_list_file})
list(APPEND INC ${CMAKE_CURRENT_BINARY_DIR})
if(WITH_MOD_FLUID)
add_definitions(-DWITH_FLUID)
endif()
if(WITH_OPENSUBDIV)
add_definitions(-DWITH_OPENSUBDIV)
endif()
if(WITH_GPU_BACKEND_TESTS)
add_definitions(-DWITH_GPU_BACKEND_TESTS)
endif()
if(WITH_GTESTS)
add_definitions(-DWITH_GTESTS)
endif()
blender_add_lib(bf_gpu "${SRC}" "${INC}" "${INC_SYS}" "${LIB}")
add_library(bf::gpu ALIAS bf_gpu)
target_link_libraries(bf_gpu PUBLIC
bf_compositor_shaders
bf_draw_shaders
bf_gpu_shaders
Refactor: OpenColorIO integration Briefly about this change: - OpenColorIO C-API is removed. - The information about color spaces in ImBuf module is removed. It was stored in global ListBase in colormanagement.cc. - Both OpenColorIO and fallback implementation supports GPU drawing. - Fallback implementation supports white point, RGB curves, etc. - Removed check for support of GPU drawing in IMB. Historically it was implemented in a separate library with C-API, this is because way back C++ code needed to stay in intern. This causes all sort of overheads, and even calls that are strictly considered bad level. This change moves OpenColorIO integration into a module within imbuf, next to movie, and next to IMB_colormanagement which is the main user of it. This allows to avoid copy of color spaces, displays, views etc in the ImBuf: they were used to help quickly querying information to be shown on the interface. With this change it can be stored in the same data structures as what is used by the OpenColorIO integration. While it might not be fully avoiding duplication it is now less, and there is no need in the user code to maintain the copies. In a lot of cases this change also avoids allocations done per access to the OpenColorIO. For example, it is not needed anymore to allocate image descriptor in a heap. The bigger user-visible change is that the fallback implementation now supports GLSL drawing, with the whole list of supported features, such as curve mapping and white point. This should help simplifying code which relies on color space conversion on GPU: there is no need to figure out fallback solution in such cases. The only case when drawing will not work is when there is some actual bug, or driver issue, and shader has failed to compile. The change avoids having an opaque type for color space, and instead uses forward declaration. It is a bit verbose on declaration, but helps avoiding unsafe type-casts. 
There are ways to solve this in the future, like having a header for forward declaration, or to flatten the name space a bit. There should be no user-level changes under normal operation. When building without OpenColorIO or the configuration has a typo or is missing a fuller set of color management tools is applies (such as the white point correction). Pull Request: https://projects.blender.org/blender/blender/pulls/138433
2025-05-09 14:01:43 +02:00
bf_imbuf_opencolorio_shaders
)
if(WITH_OPENGL_BACKEND AND CMAKE_SYSTEM_NAME STREQUAL "Linux")
target_link_libraries(bf_gpu PUBLIC rt)
endif()
# If `execinfo.h` exists on a *BSD system then also link in `libexecinfo`.
# Needed for `backtrace` / `backtrace_symbols` (GNU extensions)
# brought in by `blenlib/intern/system.cc`.
if(HAVE_EXECINFO_H AND CMAKE_SYSTEM_NAME MATCHES "FreeBSD|NetBSD|OpenBSD|DragonFly")
target_link_libraries(bf_gpu PUBLIC execinfo)
endif()
if(WITH_OPENSUBDIV)
target_link_libraries(bf_gpu PUBLIC bf_osd_shaders)
endif()
if(WITH_RENDERDOC)
target_link_libraries(bf_gpu PUBLIC bf_intern_renderdoc_dynload)
endif()
if(CXX_WARN_NO_SUGGEST_OVERRIDE)
target_compile_options(bf_gpu PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-Wsuggest-override>)
endif()
if(WITH_GTESTS)
set(TEST_SRC)
set(TEST_INC)
set(TEST_LIB
bf_intern_ghost
bf_imbuf
bf_windowmanager
)
if(WITH_GPU_BACKEND_TESTS)
list(APPEND TEST_SRC
tests/buffer_texture_test.cc
tests/compute_test.cc
tests/framebuffer_test.cc
tests/immediate_test.cc
tests/index_buffer_test.cc
tests/push_constants_test.cc
tests/shader_create_info_test.cc
tests/shader_preprocess_test.cc
tests/shader_test.cc
tests/specialization_constants_test.cc
tests/state_blend_test.cc
tests/storage_buffer_test.cc
tests/texture_test.cc
tests/vertex_buffer_test.cc
)
endif()
if(WITH_VULKAN_BACKEND)
list(APPEND TEST_SRC
vulkan/tests/vk_data_conversion_test.cc
vulkan/tests/vk_memory_layout_test.cc
vulkan/render_graph/tests/vk_render_graph_test_compute.cc
vulkan/render_graph/tests/vk_render_graph_test_present.cc
vulkan/render_graph/tests/vk_render_graph_test_render.cc
vulkan/render_graph/tests/vk_render_graph_test_scheduler.cc
vulkan/render_graph/tests/vk_render_graph_test_transfer.cc
vulkan/render_graph/tests/vk_render_graph_test_types.hh
)
endif()
# Enable shader validation on build-bot for Metal
if(WITH_METAL_BACKEND AND NOT WITH_GPU_DRAW_TESTS AND
NOT (WITH_GTESTS AND WITH_GPU_BACKEND_TESTS)) # Avoid duplicate source file
list(APPEND TEST_SRC
tests/shader_create_info_test.cc
)
endif()
set(TEST_COMMON_SRC
tests/gpu_testing.cc
tests/gpu_testing.hh
)
blender_add_test_suite_lib(gpu
"${TEST_SRC}" "${INC};${TEST_INC}" "${INC_SYS}" "${LIB};${TEST_LIB}" "${TEST_COMMON_SRC}"
)
endif()