test/source/blender/gpu/GPU_index_buffer.hh

/* SPDX-FileCopyrightText: 2016 by Mike Erwin. All rights reserved.
 *
 * SPDX-License-Identifier: GPL-2.0-or-later */

/** \file
 * \ingroup gpu
 *
 * GPU index buffer
 */

#pragma once
#include <memory>

#include "BLI_span.hh"
#include "GPU_primitive.hh"
#define GPU_TRACK_INDEX_RANGE 1
namespace blender::gpu {
/** Value for invisible elements in a #GPU_PRIM_POINTS index buffer. */
constexpr uint32_t RESTART_INDEX = 0xFFFFFFFF;
enum GPUIndexBufType {
  GPU_INDEX_U16,
  GPU_INDEX_U32,
};

inline size_t to_bytesize(GPUIndexBufType type)
{
  return (type == GPU_INDEX_U32) ? sizeof(uint32_t) : sizeof(uint16_t);
}

/**
* Base class which is then specialized for each implementation (GL, VK, ...).
*
* \note #IndexBuf does not hold any #GPUPrimType.
* This is because it can be interpreted differently by multiple batches.
*/
class IndexBuf {
 protected:
  /** Type of indices used inside this buffer. */
  GPUIndexBufType index_type_ = GPU_INDEX_U32;
  /** Offset in this buffer to the first index to render. Is 0 if not a subrange. */
  uint32_t index_start_ = 0;
  /** Number of indices to render. */
  uint32_t index_len_ = 0;
  /** Base index: Added to all indices after fetching. Allows index compression. */
  uint32_t index_base_ = 0;
  /** Bookkeeping. */
  bool is_init_ = false;
  /** Is this object only a reference to a subrange of another IndexBuf. */
  bool is_subrange_ = false;
  /** True if buffer only contains restart indices. */
  bool is_empty_ = false;

  union {
    /** Mapped buffer data. non-NULL indicates not yet sent to VRAM. */
    void *data_ = nullptr;
    /** If is_subrange_ is true, this is the source index buffer. */
    IndexBuf *src_;
  };

 public:
  IndexBuf() {}
  virtual ~IndexBuf();

  void init(uint indices_len,
            uint32_t *indices,
            uint min_index,
            uint max_index,
            GPUPrimType prim_type,
            bool uses_restart_indices);
  void init_subrange(IndexBuf *elem_src, uint start, uint length);
  void init_build_on_device(uint index_len);

  /* Returns render index count (not precise). */
  uint32_t index_len_get() const
  {
    /* Return 0 to bypass drawing for index buffers full of restart indices.
     * They can lead to graphical glitches on some systems. (See #96892) */
    return is_empty_ ? 0 : index_len_;
  }
  uint32_t index_start_get() const
  {
    return index_start_;
  }
  uint32_t index_base_get() const
  {
    return index_base_;
  }
  bool is_32bit() const
  {
    return index_type_ == GPU_INDEX_U32;
  }
  /* Return size in bytes of the drawable data buffer range. Actual buffer size might be bigger. */
  size_t size_get() const
  {
    return index_len_ * to_bytesize(index_type_);
  }
  bool is_init() const
  {
    return is_init_;
  }

  virtual void upload_data() = 0;
  virtual void bind_as_ssbo(uint binding) = 0;
  virtual void read(uint32_t *data) const = 0;
  virtual void update_sub(uint start, uint len, const void *data) = 0;

 private:
  inline void squeeze_indices_short(uint min_idx,
                                    uint max_idx,
                                    GPUPrimType prim_type,
                                    bool clamp_indices_in_range);
  virtual void strip_restart_indices() = 0;
};

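/*
Illustration of the "index compression" idea behind `index_base_`: indices are rebased on their
minimum so that a narrow range fits in 16 bits, and the base is added back after fetching. This is
a minimal sketch under stated assumptions, not the code used by this module. `CompressedIndices`
and `compress_indices` are hypothetical names, restart indices are ignored, and reserving 0xFFFF
as a 16-bit restart value is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical result type; not part of the GPU module.
struct CompressedIndices {
  std::vector<uint16_t> indices;  // Indices stored relative to `base`.
  uint32_t base = 0;              // Value to add back after fetching.
};

// Rebase indices on their minimum and narrow them to 16 bits when the range fits.
// Returns false (leaving `r_out` untouched) when the input is empty or the range is too wide.
inline bool compress_indices(const std::vector<uint32_t> &src, CompressedIndices &r_out)
{
  if (src.empty()) {
    return false;
  }
  const uint32_t min_index = *std::min_element(src.begin(), src.end());
  const uint32_t max_index = *std::max_element(src.begin(), src.end());
  // Assumption: keep 0xFFFF unused so it could act as a 16-bit restart value.
  if (max_index - min_index >= 0xFFFFu) {
    return false;
  }
  CompressedIndices result;
  result.base = min_index;
  result.indices.reserve(src.size());
  for (const uint32_t index : src) {
    result.indices.push_back(static_cast<uint16_t>(index - min_index));
  }
  r_out = result;
  return true;
}
```

With such a scheme, the indices {70000, 70002, 70001} would be stored as {0, 2, 1} with a base of
70000, halving the storage compared to 32-bit indices.
*/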
inline int indices_per_primitive(GPUPrimType prim_type)
{
  switch (prim_type) {
    case GPU_PRIM_POINTS:
      return 1;
    case GPU_PRIM_LINES:
      return 2;
    case GPU_PRIM_TRIS:
      return 3;
    case GPU_PRIM_LINES_ADJ:
      return 4;
    case GPU_PRIM_TRIS_ADJ:
      return 6;
    /** IMPORTANT: These last two expect no restart primitive.
     * Asserting for this would be too slow, so the caller must guarantee it.
     * This is needed for polylines but should be deprecated.
     * See GPU_batch_draw_expanded_parameter_get */
    case GPU_PRIM_LINE_STRIP:
      return 1; /* Minus one for the whole length. */
    case GPU_PRIM_LINE_LOOP:
      return 1;
    default:
      return -1;
  }
}

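/*
The per-primitive counts above can be turned into a primitive count for a given number of
indices. The sketch below is self-contained and only illustrative: `PrimKind`, `indices_per_prim`
and `primitive_count` are hypothetical stand-ins, and it assumes the buffer contains no restart
index. The "minus one" for line strips follows from N indices forming N - 1 segments.

```cpp
#include <cstdint>

// Hypothetical stand-in for the primitive types handled by the switch above.
enum class PrimKind { Points, Lines, Tris, LinesAdj, TrisAdj, LineStrip, LineLoop };

// Indices consumed per primitive, mirroring the values returned above.
inline int indices_per_prim(PrimKind kind)
{
  switch (kind) {
    case PrimKind::Points:
      return 1;
    case PrimKind::Lines:
      return 2;
    case PrimKind::Tris:
      return 3;
    case PrimKind::LinesAdj:
      return 4;
    case PrimKind::TrisAdj:
      return 6;
    case PrimKind::LineStrip:
    case PrimKind::LineLoop:
      return 1;
  }
  return -1;
}

// Number of primitives drawn from `index_len` indices, assuming no restart index.
inline uint32_t primitive_count(PrimKind kind, uint32_t index_len)
{
  if (kind == PrimKind::LineStrip) {
    // N indices in a strip form N - 1 segments.
    return index_len > 0 ? index_len - 1 : 0;
  }
  return index_len / static_cast<uint32_t>(indices_per_prim(kind));
}
```
*/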
} // namespace blender::gpu
blender::gpu::IndexBuf *GPU_indexbuf_calloc();
struct GPUIndexBufBuilder {
  uint max_allowed_index;
  uint max_index_len;
  uint index_len;
  uint index_min;
  uint index_max;
  uint restart_index_value;
  bool uses_restart_indices;
  GPUPrimType prim_type;
  uint32_t *data;
};

/** Supports all primitive types. */
void GPU_indexbuf_init_ex(GPUIndexBufBuilder *, GPUPrimType, uint index_len, uint vertex_len);
/** Supports only #GPU_PRIM_POINTS, #GPU_PRIM_LINES and #GPU_PRIM_TRIS. */
void GPU_indexbuf_init(GPUIndexBufBuilder *, GPUPrimType, uint prim_len, uint vertex_len);
blender::gpu::IndexBuf *GPU_indexbuf_build_on_device(uint index_len);
void GPU_indexbuf_init_build_on_device(blender::gpu::IndexBuf *elem, uint index_len);
blender::MutableSpan<uint32_t> GPU_indexbuf_get_data(GPUIndexBufBuilder *);
/*
 * Thread safe.
 *
 * Function inspired by the reduction directives of multi-thread work APIs.
 */
void GPU_indexbuf_join(GPUIndexBufBuilder *builder, const GPUIndexBufBuilder *builder_from);
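/*
A reduction-style join can be pictured as follows: each task tracks its own counters while writing
into a shared index array, and the counters are folded into the parent once the task is done. This
is a sketch assuming a min/max reduction over the counters; the actual implementation is not shown
in this header. `BuilderCounters`, `note_index` and `join_counters` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical subset of a builder: only the per-task counters.
struct BuilderCounters {
  uint32_t index_len = 0;
  uint32_t index_min = UINT32_MAX;
  uint32_t index_max = 0;
};

// Record that index value `v` was written at slot `elem` of the shared array.
inline void note_index(BuilderCounters &counters, uint32_t elem, uint32_t v)
{
  counters.index_len = std::max(counters.index_len, elem + 1);
  counters.index_min = std::min(counters.index_min, v);
  counters.index_max = std::max(counters.index_max, v);
}

// Reduction step: fold the counters of a finished task into the parent.
inline void join_counters(BuilderCounters &to, const BuilderCounters &from)
{
  to.index_len = std::max(to.index_len, from.index_len);
  to.index_min = std::min(to.index_min, from.index_min);
  to.index_max = std::max(to.index_max, from.index_max);
}
```

Because each task only touches its own counters, no locking is needed while building. Only the
join step has to run after the tasks have finished.
*/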
void GPU_indexbuf_add_generic_vert(GPUIndexBufBuilder *, uint v);
void GPU_indexbuf_add_primitive_restart(GPUIndexBufBuilder *);
void GPU_indexbuf_add_point_vert(GPUIndexBufBuilder *, uint v);
void GPU_indexbuf_add_line_verts(GPUIndexBufBuilder *, uint v1, uint v2);
void GPU_indexbuf_add_tri_verts(GPUIndexBufBuilder *, uint v1, uint v2, uint v3);
void GPU_indexbuf_add_line_adj_verts(GPUIndexBufBuilder *, uint v1, uint v2, uint v3, uint v4);
void GPU_indexbuf_set_point_vert(GPUIndexBufBuilder *builder, uint elem, uint v1);
void GPU_indexbuf_set_line_verts(GPUIndexBufBuilder *builder, uint elem, uint v1, uint v2);
void GPU_indexbuf_set_tri_verts(GPUIndexBufBuilder *builder, uint elem, uint v1, uint v2, uint v3);
/* Skip primitive rendering at the given index. */
void GPU_indexbuf_set_point_restart(GPUIndexBufBuilder *builder, uint elem);
void GPU_indexbuf_set_line_restart(GPUIndexBufBuilder *builder, uint elem);
void GPU_indexbuf_set_tri_restart(GPUIndexBufBuilder *builder, uint elem);
blender::gpu::IndexBuf *GPU_indexbuf_build(GPUIndexBufBuilder *);
blender::gpu::IndexBuf *GPU_indexbuf_build_ex(GPUIndexBufBuilder *builder,
                                              uint index_min,
                                              uint index_max,
                                              bool uses_restart_indices);
void GPU_indexbuf_build_in_place(GPUIndexBufBuilder *, blender::gpu::IndexBuf *);
void GPU_indexbuf_build_in_place_ex(GPUIndexBufBuilder *builder,
                                    uint index_min,
                                    uint index_max,
                                    bool uses_restart_indices,
                                    blender::gpu::IndexBuf *elem);

/**
 * Fill an IBO by uploading the referenced data directly to the GPU, bypassing the separate storage
 * in the IBO. This should be used whenever the equivalent indices already exist in a contiguous
 * array on the CPU.
 *
 * \todo The optimization to avoid the local copy currently isn't implemented.
 */
blender::gpu::IndexBuf *GPU_indexbuf_build_from_memory(GPUPrimType prim_type,
                                                       const uint32_t *data,
                                                       int32_t data_len,
                                                       int32_t index_min,
                                                       int32_t index_max,
                                                       bool uses_restart_indices);
/**
 * \note Sub-ranges are not taken into account, the whole buffer will be bound without any offset.
 */
void GPU_indexbuf_bind_as_ssbo(blender::gpu::IndexBuf *elem, int binding);

blender::gpu::IndexBuf *GPU_indexbuf_build_curves_on_device(GPUPrimType prim_type,
                                                            uint curves_num,
                                                            uint verts_per_curve);
/* Upload data to the GPU (if not built on the device) and bind the buffer to its default target.
*/
void GPU_indexbuf_use(blender::gpu::IndexBuf *elem);
/* Partially update the blender::gpu::IndexBuf which was already sent to the device, or built
 * directly on the device. The data needs to be compatible with potential compression applied to
 * the original indices when the index buffer was built, i.e., if the data was compressed to use
 * shorts instead of ints, shorts should be passed here. */
void GPU_indexbuf_update_sub(blender::gpu::IndexBuf *elem, uint start, uint len, const void *data);
/* Create a sub-range of an existing index-buffer. */
blender::gpu::IndexBuf *GPU_indexbuf_create_subrange(blender::gpu::IndexBuf *elem_src,
                                                     uint start,
                                                     uint length);
void GPU_indexbuf_create_subrange_in_place(blender::gpu::IndexBuf *elem,
                                           blender::gpu::IndexBuf *elem_src,
                                           uint start,
                                           uint length);

/**
 * (Download and) fill data with the contents of the index buffer.
 *
 * NOTE: The caller is responsible for reserving enough memory.
 */
void GPU_indexbuf_read(blender::gpu::IndexBuf *elem, uint32_t *data);
void GPU_indexbuf_discard(blender::gpu::IndexBuf *elem);
bool GPU_indexbuf_is_init(blender::gpu::IndexBuf *elem);
int GPU_indexbuf_primitive_len(GPUPrimType prim_type);
/* Macros */
#define GPU_INDEXBUF_DISCARD_SAFE(elem) \
  do { \
    if (elem != nullptr) { \
      GPU_indexbuf_discard(elem); \
      elem = nullptr; \
    } \
  } while (0)

namespace blender::gpu {
class IndexBufDeleter {
 public:
  void operator()(IndexBuf *ibo)
  {
    GPU_indexbuf_discard(ibo);
  }
};

using IndexBufPtr = std::unique_ptr<IndexBuf, IndexBufDeleter>;
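/*
The deleter pattern above lets a `std::unique_ptr` release a buffer through a discard function
instead of `delete`. The sketch below shows the same pattern with stand-in types so that it
compiles without the GPU module. `FakeBuf`, `fake_discard`, `FakeBufDeleter`, `FakeBufPtr` and
`scoped_discard_demo` are all hypothetical and only illustrate the ownership behavior.

```cpp
#include <memory>

// Hypothetical stand-ins so the sketch compiles on its own.
struct FakeBuf {
  int id;
};

static int g_discard_count = 0;

// Plays the role of the discard function: counts calls, then frees the buffer.
inline void fake_discard(FakeBuf *buf)
{
  g_discard_count++;
  delete buf;
}

// Function object invoked by std::unique_ptr when the owned pointer is released.
struct FakeBufDeleter {
  void operator()(FakeBuf *buf) const
  {
    fake_discard(buf);
  }
};

using FakeBufPtr = std::unique_ptr<FakeBuf, FakeBufDeleter>;

// Returns how many discards happened while a scoped pointer went out of scope.
inline int scoped_discard_demo()
{
  const int before = g_discard_count;
  {
    FakeBufPtr buf(new FakeBuf{1});
    // `buf` is discarded automatically at the end of this scope.
  }
  return g_discard_count - before;
}
```

The deleter is only invoked for non-null pointers, so an empty or moved-from pointer of this kind
does not call the discard function.
*/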
} // namespace blender::gpu