griefith/test

Author	SHA1	Message	Date
Brecht Van Lommel	fb2ba20b67	Refactor: Use more typed MEM_calloc<> and MEM_malloc<> Pull Request: https://projects.blender.org/blender/blender/pulls/137822	2025-04-22 11:22:18 +02:00
Brecht Van Lommel	637c6497e9	Refactor: Use more typed MEM_calloc<>, avoid unnecessary size_t cast Handle some cases that were missed in previous refactor. And eliminate unnecessary size_t casts as these could hide issues. Pull Request: https://projects.blender.org/blender/blender/pulls/137404	2025-04-21 17:59:41 +02:00
Campbell Barton	12e3a046a5	Cleanup: remove unused defines	2025-03-29 11:49:08 +11:00
Hans Goudey	9b1a5a1c43	Refactor: Draw: Further changes to mesh buffer extraction Followup to `9b70851d91`. Return buffers by value rather than creating an empty/uninitialized buffer first, then initializing it in an extraction function. This generally makes the code easier to follow. And avoiding these half-created buffers is an essential step to adding some sort of more global cache. Pull Request: https://projects.blender.org/blender/blender/pulls/136570	2025-03-27 16:52:55 +01:00
Hans Goudey	e5d9b7a43f	Cleanup: GPU: Remove unused index buffer function This information is set when creating the indices, there should be no need to compute it later. Pull Request: https://projects.blender.org/blender/blender/pulls/135986	2025-03-14 16:55:55 +01:00
Brecht Van Lommel	c7502b092d	Cleanup: Various clang-tidy warnings in gpu Pull Request: https://projects.blender.org/blender/blender/pulls/133734	2025-01-31 17:03:18 +01:00
Clément Foucault	324517fd78	Cleanup: GPU: Fix clang tidy warnings Removes some other things like: - `TRUST_NO_ONE` which was the same as `#ifndef NDEBUG`. - Replace `reinterpret_cast` by `unwrap` Pull Request: https://projects.blender.org/blender/blender/pulls/129631	2024-10-31 15:18:29 +01:00
Hans Goudey	da1ea4cdd1	Revert "Draw: Avoid temporary copy for mesh triangulation index buffer" This reverts commit `108ab1df2d`. This causes issues when duplicating objects that I don't have time to investigate right now.	2024-05-23 23:43:34 -04:00
Hans Goudey	108ab1df2d	Draw: Avoid temporary copy for mesh triangulation index buffer The mesh triangulation data is stored in CPU memory with the same format as the triangles GPU index buffer. Because of that we can skip creating a temporary copy owned by the GPU API. One way to do that is to just upload the data directly and avoid keeping a reference to it. However, we can only upload GPU data from the main thread with OpenGL, so instead reference the data and keep track of whether to free it. When drawing a mesh with a single material and 1.8 million faces, this change gives a 12-15% improvement in framerate, from about 32 to 37 FPS. Part of #116901. Pull Request: https://projects.blender.org/blender/blender/pulls/122175	2024-05-23 19:59:36 +02:00
Hans Goudey	07eac30070	Mesh: Points index buffer improvement A continuation of #116901. This one doesn't have a performance impact in my testing. It also adds a bit more code compared to main so it isn't really obviously "better" like the previous refactors. However it does get us closer to removing the "extractors" callback iteration loop (`edit_data` is the only other enabled by default), and I'd argue that the final code is easier to iterate on in the future since it's more self-contained. I made an effort to avoid storing restart indices in the index buffers. Though this requires a bit more calculation on the CPU (particularly because the hidden gaps in the IBO need to be compressed out), it reduces overall CPU->GPU traffic and removes the need to strip the restart indices on Metal. Pull Request: https://projects.blender.org/blender/blender/pulls/122084	2024-05-22 16:08:55 +02:00
Hans Goudey	0f46e02310	Mesh: Draw triangle index buffer creation improvements This PR is another step in the refactor described by #116901. This change applies to the triangle index buffer. The main improvement is the ability to recognize when the mesh corner triangles index array can be uploaded directly (when there is a single material and no hidden faces). In that case the index data should be copied directly to the GPU rather than to a temporary array owned by the IBO first. Though that isn't implemented yet since it will be handled by the GPU module later, the code is now structured to make that change simple from the data extraction perspective. Other than that, the main change is to not use the extractor iterator framework anymore, and to set index data directly instead of using GPU API functions. Though we're mainly bottlenecked by memory-bandwidth anyway, it's nice to avoid function call overhead. We also now avoid creating the array of sorted triangle indices when there is a single material and no hidden faces. And we don't use restart indices for the single-material case anymore. For Metal that's nice because we can avoid `strip_restart_indices`. I didn't notice significant performance improvements in my test files beyond a few percent here and there. With a hacked implementation of the copy-directly-to-the-gpu optimization, I did see more consistent improvements though. Pull Request: https://projects.blender.org/blender/blender/pulls/119130	2024-04-11 04:49:27 +02:00
Hans Goudey	fe76d8c946	Refactor: Remove unnecessary C wrappers for vertex and index buffers Now that all relevant code is C++, the indirection from the C struct `GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds complexity and necessitates a wrapper API, making more cleanups like use of RAII or other C++ types more difficult. This commit replaces the C wrapper structs with direct use of the vertex and index buffer base classes. In C++ we can choose which parts of a class are private, so we don't risk exposing too many implementation details here. Pull Request: https://projects.blender.org/blender/blender/pulls/119825	2024-03-24 16:38:30 +01:00
Hans Goudey	8b514bccd1	Cleanup: Move remaining GPU headers to C++ Pull Request: https://projects.blender.org/blender/blender/pulls/119807	2024-03-23 01:24:18 +01:00
laurynas	aa3ffca8dc	Fix #119247 : Curves: Extra point in evaluated spline of Curves geometry In `bf17fc8d79` after extending buffer to multiple of 4 there appeared trailing space in buffer not covered by shader's `for` loop. Pull Request: https://projects.blender.org/blender/blender/pulls/119346	2024-03-12 15:01:10 +01:00
Campbell Barton	ed5fb3eaba	Cleanup: various non-functional C++ changes	2024-03-05 11:32:42 +11:00
laurynas	bf17fc8d79	Fix: GPU: Ensures length of curves GPUIndexBuf to be multiple of 4 Exception is thrown in gpu_storage_buffer.cc To reproduce create legacy Bezier curve and convert it to new Curves. Code is from #116617 Pull Request: https://projects.blender.org/blender/blender/pulls/118951	2024-03-03 16:39:11 +01:00
Eugene Kuznetsov	7f43699ebf	DRW: Curves: Indexbuf optimization for large numbers of curves This optimizes a few loops that become significant bottlenecks during viewport rendering of scenes with large numbers of curves. To render a curves object, Blender needs to generate a potentially very large (but trivial) index buffer. As previously implemented, this index buffer is generated in an extremely inefficient manner, with a single-threaded loop and an explicit function call per entry. The buffer then needs to be pushed onto the GPU, which is also a fairly slow task. The PR generates the index buffer directly on the GPU with compute shader. Pull Request: https://projects.blender.org/blender/blender/pulls/116617	2024-02-25 17:22:58 +01:00
Hans Goudey	0618de49ad	Cleanup: Replace MIN/MAX macros with C++ functions Use `std::min` and `std::max` instead. Though keep MIN2 and MAX2 just for C code that hasn't been moved to C++ yet. Pull Request: https://projects.blender.org/blender/blender/pulls/117384	2024-01-22 15:58:18 +01:00
Campbell Barton	611930e5a8	Cleanup: use std::min/max instead of MIN2/MAX2 macros	2023-11-07 16:33:19 +11:00
Sergey Sharybin	c1bc70b711	Cleanup: Add a copyright notice to files and use SPDX format A lot of files were missing copyright field in the header and the Blender Foundation contributed to them in a sense of bug fixing and general maintenance. This change makes it explicit that those files are at least partially copyrighted by the Blender Foundation. Note that this does not make it so the Blender Foundation is the only holder of the copyright in those files, and developers who do not have a signed contract with the foundation still hold the copyright as well. Another aspect of this change is using SPDX format for the header. We already used it for the license specification, and now we state it for the copyright as well, following the FAQ: https://reuse.software/faq/	2023-05-31 16:19:06 +02:00
Campbell Barton	13c815085b	Cleanup: spelling in comments	2023-05-24 11:21:18 +10:00
Jeroen Bakker	f828ecf4ba	GPU: Use same read back API as SSBOs The GPU module has 2 different styles when reading back data from GPU buffers. The SSBOs used a memcpy to copy the data to a pre-allocated buffer. IndexBuf/VertBuf gave back a driver/platform controlled pointer to the memory. Readback is done for test cases. Returning mapped pointers is not safe. For this reason we settled on using the same approach as the SSBO: copy the data to a caller pre-allocated buffer. The reason this API is changed now is that the Vulkan API is stricter about mapping/unmapping buffers, which can lead to potential issues down the road. Pull Request #104571	2023-02-13 08:34:19 +01:00
Campbell Barton	8bdd4b4685	Cleanup: use function style casts for C++	2022-09-30 14:51:49 +10:00
Campbell Barton	f68cfd6bb0	Cleanup: replace C-style casts with functional casts for numeric types	2022-09-25 20:17:08 +10:00
Campbell Barton	6c6a53fad3	Cleanup: spelling in comments, formatting, move comments into headers	2022-09-06 16:25:20 +10:00
Jason Fielder	5f4409b02e	Metal: MTLIndexBuf class implementation. Implementation also contains a number of optimisations and feature enablements specific to the Metal API and Apple Silicon GPUs. Ref T96261 Reviewed By: fclem Maniphest Tasks: T96261 Differential Revision: https://developer.blender.org/D15369	2022-09-01 21:45:12 +02:00
Clément Foucault	b47c5505aa	Fix T96892 Overlay: Hiding all of a mesh in edit mode causes visual glitch This is caused by the geometry shader used by the edit mode line drawing. If the drawcall uses indexed drawing and if the index buffer only contains restart indices, it seems the result is 1 glitchy invocation of the geometry shader. Workaround by tagging these special case index buffers and bypassing their drawcall.	2022-05-10 23:36:16 +02:00
Campbell Barton	c434782e3a	File headers: SPDX License migration Use a shorter/simpler license convention, stops the header taking so much space. Follow the SPDX license specification: https://spdx.org/licenses - C/C++/objc/objc++ - Python - Shell Scripts - CMake, GNUmakefile While most of the source tree has been included - `./extern/` was left out. - `./intern/cycles` & `./intern/atomic` are also excluded because they use different header conventions. doc/license/SPDX-license-identifiers.txt has been added to list all used SPDX identifiers. See P2788 for the script that automated these edits. Reviewed By: brecht, mont29, sergey Ref D14069	2022-02-11 09:14:36 +11:00
Kévin Dietrich	eed45d2a23	OpenSubDiv: add support for an OpenGL evaluator This evaluator is used in order to evaluate subdivision at render time, allowing for faster renders of meshes with a subdivision surface modifier placed at the last position in the modifier list. When evaluating the subsurf modifier, we detect whether we can delegate evaluation to the draw code. If so, the subdivision is first evaluated on the GPU using our own custom evaluator (only the coarse data needs to be initially sent to the GPU), then, buffers for the final `MeshBufferCache` are filled on the GPU using a set of compute shaders. However, some buffers are still filled on the CPU side, if doing so on the GPU is impractical (e.g. the line adjacency buffer used for x-ray, whose logic is hardly GPU compatible). This is done at the mesh buffer extraction level so that the result can be readily used in the various OpenGL engines, without having to write custom geometry or tessellation shaders. We use our own subdivision evaluation shaders, instead of OpenSubDiv's vanilla one, in order to control the data layout, and interpolation. For example, we store vertex colors as compressed 16-bit integers, while OpenSubDiv's default evaluator only works for float types. In order to still access the modified geometry on the CPU side, for use in modifiers or transform operators, a dedicated wrapper type is added `MESH_WRAPPER_TYPE_SUBD`. Subdivision will be lazily evaluated via `BKE_object_get_evaluated_mesh` which will create such a wrapper if possible. If the final subdivision surface is not needed on the CPU side, `BKE_object_get_evaluated_mesh_no_subsurf` should be used. Enabling or disabling GPU subdivision can be done through the user preferences (under Viewport -> Subdivision). See patch description for benchmarks. Reviewed By: campbellbarton, jbakker, fclem, brecht, #eevee_viewport Differential Revision: https://developer.blender.org/D12406	2021-12-27 16:35:54 +01:00
Aaron Carlisle	c1279768a7	Cleanup: Clang-Tidy modernize-redundant-void-arg	2021-12-08 00:31:20 -05:00
Germano Cavalcante	0eb9351296	Refactor: use 'BLI_task_parallel_range' in Draw Cache One drawback to trying to predict the number of threads that will be used in the `task_graph` is that we are only sure of the number when the threads are running. Using `BLI_task_parallel_range` allows the driver to choose the best thread distribution through `parallel_reduce`. The benefit is most evident on hardware with fewer cores. This is the result on a 4-core laptop for large_mesh_editing: before, average 5.203638 FPS, rdata 15ms iter 43ms (frame 193ms); after, average 5.398925 FPS, rdata 14ms iter 36ms (frame 187ms). Differential Revision: https://developer.blender.org/D11558	2021-06-11 10:49:50 -03:00
Germano Cavalcante	2330cec2c6	Refactor: Draw Cache: use 'BLI_task_parallel_range' This is an adaptation of {D11488}. A disadvantage of manually setting the iter ranges per thread is that we don't know how many threads are running in the background and so we don't know how to best distribute the ranges. To solve this limitation we can use `parallel_reduce` and thus let the driver choose the best distribution of ranges among the threads. This proved to be especially beneficial for computers with few cores. Benchmarking: Result on a 4-core laptop: large_mesh_editing: master, average 5.203638 FPS, rdata 15ms iter 43ms (frame 193ms); patch, average 5.398925 FPS, rdata 14ms iter 36ms (frame 187ms). Results on an 8-core PC: large_mesh_editing: master, average 15.267482 FPS, rdata 9ms iter 28ms (frame 65ms); patch, average 15.906881 FPS, rdata 9ms iter 25ms (frame 63ms). large_mesh_editing_ledge: master, average 15.145966 FPS, rdata 9ms iter 29ms (frame 65ms); patch, average 15.520474 FPS, rdata 9ms iter 25ms (frame 64ms). looptris_test: master, average 4.001917 FPS, rdata 12ms iter 90ms (frame 236ms); patch, average 4.061105 FPS, rdata 12ms iter 87ms (frame 230ms). subdiv_mesh_cage_and_final: master, average 1.917769 FPS, rdata 7ms iter 37ms (frame 261ms) and rdata 7ms iter 38ms (frame 252ms); patch, average 1.971790 FPS, rdata 7ms iter 31ms (frame 258ms) and rdata 7ms iter 33ms (frame 249ms). subdiv_mesh_final_only: master, average 6.387240 FPS, rdata 3ms iter 25ms (frame 151ms); patch, average 6.591251 FPS, rdata 3ms iter 16ms (frame 145ms). subdiv_mesh_final_only_ledge: master, average 6.247393 FPS, rdata 3ms iter 26ms (frame 158ms); patch, average 6.596024 FPS, rdata 3ms iter 16ms (frame 148ms). Notes: - The improvement can only be noticed if all extracts are multithreaded. - This patch touches different areas of the code, so it can be split into another patch if the idea is accepted. These screenshots show how threads behave in a quadcore: Master: {F10164664} Patch: {F10164666} Differential Revision: https://developer.blender.org/D11558	2021-06-11 10:45:12 -03:00
Jeroen Bakker	259b9c73d0	GPU: Thread safe index buffer builders. Current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers. To reduce locking the solution would provide a task/thread local index buffer builder (called sub builder). When a thread is finished this thread local index buffer builder can be joined with the initial index buffer builder. `GPU_indexbuf_subbuilder_init`: Initializes a sub builder. The index list is shared between the parent and sub buffer, but the counters are localized, ensuring that updating counters does not need any locking. `GPU_indexbuf_subbuilder_finish`: merges the information of the sub builder back to the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have been finished. Internally the function is not thread safe. For testing purposes the extract_points extractor has been migrated to the new API. For this, changes to the mesh extractor were needed. * When creating tasks, the task number of current task is stored in ExtractTaskData including the total number of tasks. * Adding two functions in `MeshExtract`. `task_init` will initialize the task specific userdata. `task_finish` should merge the task specific userdata back. * adding task_id parameter to the iteration functions so they can access the correct task data without any need for locking. There is no noticeable change in end user performance. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11499	2021-06-08 16:36:06 +02:00
Germano Cavalcante	223016a408	GPUIndexBuf: Find the minimum and maximum index through the builder Moving the bounds code to the builder can be useful for future optimizations like building multithreaded. Reviewed By: fclem, jbakker Differential Revision: https://developer.blender.org/D11455	2021-06-07 08:41:38 -03:00
Jeroen Bakker	87055dc71b	GPU: Compute Pipeline. With the compute pipeline calculation can be offloaded to the GPU. This patch only adds the framework for compute. So no changes for users at this moment. NOTE: As this is an OpenGL4.3 feature it must always have a fallback. Use `GPU_compute_shader_support` to check if compute pipeline can be used. Check `gpu_shader_compute*` test cases for usage. This patch also adds support for shader storage buffer objects and device only vertex/index buffers. An alternative that had been discussed was adding this to the `GPUBatch`, this was eventually not chosen as it would lead to more code when used as part of a shading group. The idea is that we add an `eDRWCommandType` in the near future. Reviewed By: fclem Differential Revision: https://developer.blender.org/D10913	2021-05-26 16:49:30 +02:00
Sybren A. Stüvel	16732def37	Cleanup: Clang-Tidy modernize-use-nullptr Replace `NULL` with `nullptr` in C++ code. No functional changes.	2020-11-06 18:08:25 +01:00
Clément Foucault	cd849076d2	Fix T80681 Wireframe is not visible on square planes with 16384 quads Fix wrong logic.	2020-09-15 14:56:02 +02:00
Clément Foucault	4ea93029c6	GPUIndexBuf: GL backend Isolation This is part of the Vulkan backend task T68990. There is no real change, only some code re-organisation. This also makes the IndexBuf completely abstract from outside the GPU module.	2020-09-06 22:13:06 +02:00
Clément Foucault	84d67bd0a9	Cleanup: GPU: Rename GPU_element to GPU_index_buffer Makes it follow the functions names.	2020-09-06 22:13:06 +02:00

39 Commits