test2

Author	SHA1	Message	Date
Brecht Van Lommel	841ae6e8ab	Fix part of #131933 : Crash with playback of deforming subdivision surface The `ForeachContext` in `deform_coarse_vertices` does not use TLS but still has a `func_free` callback set. Change the task API to allow this. Pull Request: https://projects.blender.org/blender/blender/pulls/132498	2025-01-02 12:21:56 +01:00
Campbell Barton	b93ddf30e9	Unbreak build WITH_TBB=OFF	2024-04-30 12:12:02 +10:00
Jacques Lucke	8d13a9608b	BLI: generalize task size hints for parallel_for This integrates the functionality for `parallel_for_weighted` from `9a3ceb79de` into `parallel_for`. This reduces the number of entry points to the threading API and also makes it easier to build higher level threading primitives. For example, `IndexMask.foreach_*` may use `parallel_for` if a `GrainSize` is provided, but can't use `parallel_for_weighted` easily without duplicating a fair amount of code. The default behavior of `parallel_for` does not change. However, now one can optionally pass in `TaskSizeHints` as the last parameter. This can be used to specify the size of individual tasks relative to each other and relative to the grain size. This helps scheduling more equally sized tasks which generally improves performance because threads are used more effectively. One generally does not construct `TaskSizeHints` manually, but calls either `threading::individual_task_sizes` or `threading::accumulated_task_sizes`. Both allow specifying individual task sizes, but the latter should be used when the combined size of consecutive tasks can be computed in O(1) time. This allows splitting up the work more efficiently. It can often be used in conjunction with `OffsetIndices`. Pull Request: https://projects.blender.org/blender/blender/pulls/121127	2024-04-29 23:55:22 +02:00
Campbell Barton	57dd9c21d3	Cleanup: spelling in comments	2024-03-21 10:02:53 +11:00
Jacques Lucke	b99c1abc3a	BLI: speedup memory bandwidth bound tasks by reducing threading This improves performance by reducing the amounts of threads used for tasks which require a high memory bandwidth. This works because the underlying hardware has a certain maximum memory bandwidth. If that is used up by a few threads already, any additional threads wanting to use a lot of memory will just cause more contention which actually slows things down. By reducing the number of threads that can perform certain tasks, the remaining threads are also not locked up doing work that they can't do efficiently. It's best if there is enough scheduled work so that these tasks can do more compute intensive tasks instead. To use this new functionality, one has to put the parallel code in question into a `threading::memory_bandwidth_bound_task(...)` block. Additionally, one also has to provide a (very) rough approximation for how many bytes are accessed. If the number is low, the number of threads shouldn't be reduced because it's likely that all touched memory can be in L3 cache which generally has a much higher bandwidth than main memory. The exact number of threads that are allowed to do bandwidth bound tasks at the same time is generally highly context and hardware dependent. It's also not really possible to measure reliably because it depends on so many static and dynamic factors. The thread count is now hardcoded to 8. It seems that this many threads are easily capable of maxing out the bandwidth capacity. With this technique I can measure surprisingly good performance improvements: * Generating a 3000x3000 grid: 133ms -> 103ms. * Generating a mesh line with 100'000'000 vertices: 212ms -> 189ms. * Realize mesh instances resulting in ~27'000'000 vertices: 460ms -> 305ms. In all of these cases, only 8 instead of 24 threads are used. The remaining threads are idle in these cases, but they could do other work if available. Pull Request: https://projects.blender.org/blender/blender/pulls/118939	2024-03-19 18:23:56 +01:00
Jacques Lucke	9a3ceb79de	BLI: add weighted parallel for function The standard `threading::parallel_for` function tries to split the range into uniformly sized subranges. This is great if each element takes approximately the same amount of time to compute. However, there are also situations where the time required to do the work for a single index differs significantly between different indices. In such a case, it's better to split the tasks into segments while taking the size of each task into account. This patch implements `threading::parallel_for_weighted` which allows passing in an additional callback that returns the size of each task. Pull Request: https://projects.blender.org/blender/blender/pulls/118348	2024-02-25 15:01:05 +01:00
Campbell Barton	de18b629f0	Cleanup: unused includes in source/blender/blenlib Remove 30 includes.	2024-02-13 11:07:14 +11:00
Campbell Barton	611930e5a8	Cleanup: use std::min/max instead of MIN2/MAX2 macros	2023-11-07 16:33:19 +11:00
Campbell Barton	5fbcb4c27e	Cleanup: remove spaces from commented arguments Also use local enums for `MA_BM_*` in versioning code.	2023-09-22 12:21:18 +10:00
Campbell Barton	e955c94ed3	License Headers: Set copyright to "Blender Authors", add AUTHORS Listing the "Blender Foundation" as copyright holder implied the Blender Foundation holds copyright to files which may include work from many developers. While keeping copyright on headers makes sense for isolated libraries, Blender's own code may be refactored or moved between files in a way that makes the per file copyright holders less meaningful. Copyright references to the "Blender Foundation" have been replaced with "Blender Authors", with the exception of `./extern/` since this contains libraries which are more isolated, any changes to license headers there can be handled on a case-by-case basis. Some directories in `./intern/` have also been excluded: - `./intern/cycles/` its own `AUTHORS` file is planned. - `./intern/opensubdiv/`. An "AUTHORS" file has been added, using the chromium projects authors file as a template. Design task: #110784 Ref !110783.	2023-08-16 00:20:26 +10:00
Sergey Sharybin	c1bc70b711	Cleanup: Add a copyright notice to files and use SPDX format A lot of files were missing copyright field in the header and the Blender Foundation contributed to them in a sense of bug fixing and general maintenance. This change makes it explicit that those files are at least partially copyrighted by the Blender Foundation. Note that this does not make it so the Blender Foundation is the only holder of the copyright in those files, and developers who do not have a signed contract with the foundation still hold the copyright as well. Another aspect of this change is using SPDX format for the header. We already used it for the license specification, and now we state it for the copyright as well, following the FAQ: https://reuse.software/faq/	2023-05-31 16:19:06 +02:00
Jacques Lucke	f6d824bca6	BLI: move tbb part of parallel_for to implementation file Previously, `tbb::parallel_for` was instantiated every time `threading::parallel_for` is used. However, when actual parallelism is used, the overhead of a function call is negligible. Therefore it is possible to move that part out of the header without causing noticeable performance regressions. This reduces the size of the Blender binary from 308.2 to 303.5 MB, which is a reduction of about 1.5%.	2023-05-21 13:31:32 +02:00
Hans Goudey	97746129d5	Cleanup: replace UNUSED macro with commented args in C++ code This is the conventional way of dealing with unused arguments in C++, since it works on all compilers. Regex find and replace: `UNUSED\((\w+)\)` -> `/*$1*/`	2022-10-03 17:38:16 -05:00
Jacques Lucke	5c81d3bd46	Geometry Nodes: improve evaluator with lazy threading In large node setups the threading overhead was sometimes very significant. That's especially true when most nodes do very little work. This commit improves the scheduling by not using multi-threading in many cases unless it's likely that it will be worth it. For more details see the comments in `BLI_lazy_threading.hh`. Differential Revision: https://developer.blender.org/D15976	2022-09-20 11:08:05 +02:00
Campbell Barton	c434782e3a	File headers: SPDX License migration Use a shorter/simpler license convention, stops the header taking so much space. Follow the SPDX license specification: https://spdx.org/licenses - C/C++/objc/objc++ - Python - Shell Scripts - CMake, GNUmakefile While most of the source tree has been included - `./extern/` was left out. - `./intern/cycles` & `./intern/atomic` are also excluded because they use different header conventions. doc/license/SPDX-license-identifiers.txt has been added to list all used SPDX identifiers. See P2788 for the script that automated these edits. Reviewed By: brecht, mont29, sergey Ref D14069	2022-02-11 09:14:36 +11:00
Campbell Barton	8e8a6b80cf	Cleanup: replace BLI_assert(!"text") with BLI_assert_msg(0, "text") This shows the text as part of the assertion message.	2021-07-15 18:29:01 +10:00
Brecht Van Lommel	fcc844f8fb	BLI: use explicit task isolation, no longer part of parallel operations After looking into task isolation issues with Sergey, we couldn't find the reason behind the deadlocks that we are getting in T87938 and a Sprite Fright file involving motion blur renders. There is no apparent place where we are adding or waiting on tasks in a task group from different isolation regions, which is what is known to cause problems. Yet it still hangs. Either we do not understand some limitation of TBB isolation, or there is a bug in TBB, but we could not figure it out. Instead the idea is to use isolation only where we know we need it: when holding a mutex lock and then doing some multithreaded operation within that locked region. Three places where we do this now: * Generated images * Cached BVH tree building * OpenVDB lazy grid loading Compared to the more automatic approach previously used, there is the downside that it is easy to miss places where we need isolation. Yet doing it more automatically is also causing unexpected issues and bugs that we found no solution for, so this seems better. Patch implemented by Sergey and me. Differential Revision: https://developer.blender.org/D11603	2021-06-15 17:28:44 +02:00
Brecht Van Lommel	677e63d518	TBB: fix deprecation warnings with newer TBB versions * USD and OpenVDB headers use deprecated TBB headers, suppress all deprecation warnings there since we have no control over them. * For our own TBB includes, use the individual headers rather than the tbb.h that includes everything to avoid warnings, rather than suppressing all. This is in anticipation of the TBB 2020 upgrade in D10359. Ref D10361.	2021-02-10 19:32:24 +01:00
Sybren A. Stüvel	958df2ed1b	Cleanup: Clang-Tidy, modernize-deprecated-headers No functional changes.	2020-12-04 11:28:09 +01:00
Sybren A. Stüvel	16732def37	Cleanup: Clang-Tidy modernize-use-nullptr Replace `NULL` with `nullptr` in C++ code. No functional changes.	2020-11-06 18:08:25 +01:00
Jacques Lucke	4a5389816b	Clang-Tidy: enable readability-named-parameter	2020-07-03 17:07:13 +02:00
Brecht Van Lommel	183ba284f2	Cleanup: make guarded memory allocation always thread safe Previously this would be enabled when threads were used, but threads are now basically always in use so there is no point. Further, this is only needed for guarded allocation with --debug-memory which is not performance critical.	2020-05-20 01:03:05 +02:00
Brecht Van Lommel	33fc42bd65	Merge branch 'blender-v2.83-release'	2020-05-20 00:46:15 +02:00
Jeroen Bakker	08ac4d3d71	Fix T76553: Blender Freezes When Playing Back Animation In some cases Blender could freeze. When threads are blocked (waiting for other tasks to complete) the scheduler can let the thread perform a different task. If this task wants a write-lock for something that was read-locked in the stack, a deadlock will happen. For task pools every task is isolated. For range tasks the inner loop will be isolated. The implementation is limited as isolation in TBB uses functors which are tricky to add to a C API. We decided to start with a simple implementation and adapt where we need to. During testing we came to this setup as it was reliable (we weren't able to let it freeze or crash) and didn't have noticeable performance impact. Reviewed By: Brecht van Lommel Differential Revision: https://developer.blender.org/D7688	2020-05-14 13:54:16 +02:00
Brecht Van Lommel	d8a3f3595a	Task: Use TBB as Task Scheduler This patch enables TBB as the default task scheduler. TBB stands for Threading Building Blocks and is developed by Intel. The library contains several threading patterns. This patch maps Blender's BLI_task_* functions to their TBB counterparts. After this patch we can add more patterns. A promising one is TBB:graph that can be used for depsgraph, draw manager and compositor. Performance changes depend on the actual hardware. It was tested on different hardware from laptops to workstations and we didn't detect any performance regression. * Linux Xeon E5-2699 v4 got FPS boost from 12 to 17 using Spring's 04_010_A.anim.blend. * AMD Ryzen Threadripper 2990WX 32-Core Animation playback goes from 9.5-10.5 FPS to 13.0-14.0 FPS on Agent 327, 10_03_B.anim.blend. Reviewed By: brecht, sergey Differential Revision: https://developer.blender.org/D7475	2020-04-30 08:09:21 +02:00

25 Commits
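
The `9a3ceb79de` and `8d13a9608b` entries describe splitting a range by task size rather than by element count, and note that accumulated sizes help when the combined size of consecutive tasks can be computed in O(1) time. The following is a minimal standalone sketch of that idea, not Blender's implementation: `Range` and `split_by_accumulated_size` are hypothetical names, and the prefix-sum `offsets` array only stands in for what the log refers to as `OffsetIndices`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open index range [first, second). Hypothetical type for this sketch.
using Range = std::pair<std::int64_t, std::int64_t>;

// Hypothetical helper, NOT part of Blender's threading API.
// `offsets` is a prefix sum with n + 1 entries, so the size of task i is
// offsets[i + 1] - offsets[i], and the combined size of tasks [a, b) is
// offsets[b] - offsets[a], which is an O(1) lookup.
// Returns consecutive segments of [0, n) whose combined size reaches
// `grain_size`; only the final segment may be smaller.
std::vector<Range> split_by_accumulated_size(const std::vector<std::int64_t> &offsets,
                                             const std::int64_t grain_size)
{
  assert(!offsets.empty());
  assert(grain_size > 0);
  const std::int64_t n = static_cast<std::int64_t>(offsets.size()) - 1;
  std::vector<Range> segments;
  std::int64_t begin = 0;
  for (std::int64_t i = 0; i < n; i++) {
    // Combined size of tasks [begin, i + 1).
    const std::int64_t accumulated = offsets[i + 1] - offsets[begin];
    if (accumulated >= grain_size) {
      segments.emplace_back(begin, i + 1);
      begin = i + 1;
    }
  }
  if (begin < n) {
    // Remaining tasks that did not reach the grain size on their own.
    segments.emplace_back(begin, n);
  }
  return segments;
}
```

With task sizes 1, 1, 1, 100, 1, 1 (offsets `{0, 1, 2, 3, 103, 104, 105}`) and a grain size of 3, the expensive task ends up alone in its segment instead of being grouped with cheap neighbors, which is the scheduling benefit the commit messages describe. Each returned segment could then be handed to a worker thread.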