griefith/test

Author	SHA1	Message	Date
Jacques Lucke	b99c1abc3a	BLI: speedup memory bandwidth bound tasks by reducing threading This improves performance by reducing the number of threads used for tasks which require a high memory bandwidth. This works because the underlying hardware has a certain maximum memory bandwidth. If that is used up by a few threads already, any additional threads wanting to use a lot of memory will just cause more contention which actually slows things down. By reducing the number of threads that can perform certain tasks, the remaining threads are also not locked up doing work that they can't do efficiently. It's best if there is enough scheduled work so that these threads can do more compute intensive tasks instead. To use this new functionality, one has to put the parallel code in question into a `threading::memory_bandwidth_bound_task(...)` block. Additionally, one also has to provide a (very) rough approximation for how many bytes are accessed. If the number is low, the number of threads shouldn't be reduced because it's likely that all touched memory can be in L3 cache which generally has a much higher bandwidth than main memory. The exact number of threads that are allowed to do bandwidth bound tasks at the same time is generally highly context and hardware dependent. It's also not really possible to measure reliably because it depends on so many static and dynamic factors. The thread count is now hardcoded to 8. It seems that this many threads are easily capable of maxing out the bandwidth capacity. With this technique I can measure surprisingly good performance improvements: * Generating a 3000x3000 grid: 133ms -> 103ms. * Generating a mesh line with 100'000'000 vertices: 212ms -> 189ms. * Realize mesh instances resulting in ~27'000'000 vertices: 460ms -> 305ms. In all of these cases, only 8 instead of 24 threads are used. The remaining threads are idle in these cases, but they could do other work if available.
Pull Request: https://projects.blender.org/blender/blender/pulls/118939	2024-03-19 18:23:56 +01:00
Campbell Barton	38dc888d7f	Cleanup: use ELEM macro, remove redundant "struct"	2024-03-19 14:17:47 +11:00
Jacques Lucke	ee1fa8e1ca	BLI: support set operations on index masks The `IndexMask` data structure was designed to allow us to implement set operations like `union`, `intersection` and `difference` efficiently (`2cfcb8b0b8`). This patch adds an evaluator for arbitrary expressions involving the mentioned operations. The evaluator makes use of the design of the `IndexMask` data structure to be quite efficient. In some common cases, the evaluator runs in constant time. So it's very fast even if the mask contains many millions of indices. If possible the evaluator works on entire segments at once instead of looking at the individual indices. This results in a very low constant factor even if the evaluation time is linear. If the evaluator has to look at the individual indices to be able to perform the operation, it can make use of multi-threading. The evaluation consists of the following steps: 1. A coarse evaluation that looks at entire segments at once. 2. All segments that couldn't be fully evaluated by the coarse evaluation are evaluated exactly by looking at the actual indices. There are two evaluators for this case. One that is based on `std::set_union` etc. The other one first converts the index masks to bit spans, then does bit operations to evaluate the expression, and then converts the bits back into indices. Depending on the expression, one or the other can be more efficient. 3. Construct an index mask from the evaluated segments. Showing the performance of the evaluator is kind of difficult because it highly depends on the input data. Comparing the performance to something that does not short-circuit when there are full ranges is meaningless, because one can construct an example where the new evaluator is arbitrarily faster. I'm still working on a case where performance can be compared to e.g. using `std::set_union`. This comparison is only fair when constructing a case where the new evaluator can't short-circuit.
One of the main remaining bottlenecks is the calls to `slice_content` on large index masks. I think the impact of those can still be reduced. We are not using this evaluator much yet, except through `IndexMask::complement` calls. I intend to use it when I get to refactoring the field evaluator for geometry nodes to optimize the evaluation of selections. Pull Request: https://projects.blender.org/blender/blender/pulls/117805	2024-03-17 09:52:32 +01:00
Hans Goudey	b5082f6640	Refactor: Simplify BLI_serialize.hh for asset indexer - Remove the unnecessary `ContainerValue` from the class hierarchy - Construct `StringValue` with a `std::string` by value to avoid copies - Remove some indirection by using type names directly instead of aliases - Use utility methods to lookup/append specific data types for arrays/dicts - Simplify conversion from unique_ptr to shared_ptr - Avoid use of `new` and `delete` - Avoid creating maps of all elements in vector for a single lookup	2024-03-13 14:52:57 -04:00
Campbell Barton	e33f5e36ac	Cleanup: spacing around C-style comment blocks	2024-03-09 23:40:57 +11:00
Omar Emara	a444a5eeba	Fix: Byte interpolation with clamped boundary returns zero The byte BLI image interpolation function with clamped boundary returns zero for out of bound pixels. This is the same as #119164, but for byte interpolation. Pull Request: https://projects.blender.org/blender/blender/pulls/119173	2024-03-08 07:50:01 +01:00
Campbell Barton	f3e0e39df5	Cleanup: use const pointers where camera data isn't modified	2024-03-08 17:15:08 +11:00
Hans Goudey	744f3b2823	Cleanup: Grammar in comments: Fix uses of "own" "Own" (the adjective) cannot be used on its own. It should be combined with something like "its own", "our own", "her own", or "the object's own". It also isn't used separately to mean something like "separate". Also, "its own" is correct instead of "it's own" which is a misuse of the verb.	2024-03-07 16:23:35 -05:00
Omar Emara	5ab0cc8e74	Fix: Interpolation with clamped boundary returns zero The BLI image interpolation function with clamped boundary returns zero for out of bound pixels. That's because the neighbour pixel wrapping condition disregarded the border template argument. To fix this, only handle that condition if in border mode. Pull Request: https://projects.blender.org/blender/blender/pulls/119164	2024-03-07 15:34:42 +01:00
Anthony Roberts	445fd42c61	Windows: Add ARM64 support * Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above. Older generation devices are not and will not be supported due to some driver issues * Requires VS2022 for building. * Uses new MSVC preprocessor for sse2neon compatibility. * SIMD is not enabled, waiting on conversion of blenlib to C++. Ref #119126 Pull Request: https://projects.blender.org/blender/blender/pulls/117036	2024-03-06 16:14:34 +01:00
Campbell Barton	d686699316	Cleanup: various non-functional C++ changes	2024-03-06 14:47:29 +11:00
Hans Goudey	5993c517bd	Cleanup: Use C++ Array, Span, int2 for lasso coords	2024-03-05 11:29:04 -05:00
Hans Goudey	139607dd26	Cleanup: Move BLI_bitmap_draw_2d.h to C++	2024-03-05 10:28:17 -05:00
Hans Goudey	164eb3c25b	Cleanup: Move lasso utility files to C++	2024-03-05 10:23:11 -05:00
Campbell Barton	c789a938d9	Cleanup: remove temporary directory creation	2024-03-05 09:54:49 +11:00
Campbell Barton	5af4987456	Merge branch 'blender-v4.1-release'	2024-03-04 12:21:50 +11:00
Campbell Barton	51126fab33	BLI_tempfile: ensure the temporary directory is absolute While unreported, there is nothing preventing CWD relative temporary directories being used. Resolve asserts & errors if the CWD changes at run-time.	2024-03-04 12:20:44 +11:00
Campbell Barton	1b514659ca	Cleanup: minor changes to temp directory API - Pass null instead of an empty string to BKE_tempdir_init because the string isn't meant to be used. - Never pass null to BLI_temp_directory_path_copy_if_valid (the caller must check). - Additional comments for which checks are performed & why from discussion about #95411.	2024-03-04 11:42:02 +11:00
casey bianco-davis	3d136d0d00	BLI: Add support for non-square matrix multiplication. Adds support for multiplying non-square non-equal matrices. Co-authored-by: Clément Foucault <foucault.clem@gmail.com> Pull Request: https://projects.blender.org/blender/blender/pulls/115783	2024-03-03 16:26:04 +01:00
Campbell Barton	da2ac8ee92	Merge branch 'blender-v4.1-release'	2024-02-29 22:04:23 +11:00
Campbell Barton	c19cdc343f	Fix assert with temporary directories beginning with "//" - Skip leading forward slashes when setting the temp directory. - Add a utility function to set the temporary directory which is used for the user preferences & environment variables. This issue was raised by #95411 where "//" resolves to "/", then asserts when passed to Blender's file-system functions. However the crash referenced in this report looks to be caused by Collada failing to write to the temporary directory which can be handled separately. Ref !118872	2024-02-29 22:01:44 +11:00
Hans Goudey	d338261c55	Cleanup: Pass Span by value Also pass Span instead of `const Array &` and use parentheses for BLI includes.	2024-02-27 23:09:54 -05:00
Iliya Katueshenock	849279b8f1	Cleanup: Collapsible brackets in macros Fix of collapsible brackets in Notepad++.	2024-02-27 21:51:41 +01:00
Campbell Barton	5db2a842c0	Unbreak build with GLIBC pre 2.28 Also de-duplicate rename logic for Linux & other UNIX systems.	2024-02-26 10:15:54 +11:00
Jacques Lucke	9a3ceb79de	BLI: add weighted parallel for function The standard `threading::parallel_for` function tries to split the range into uniformly sized subranges. This is great if each element takes approximately the same amount of time to compute. However, there are also situations where the time required to do the work for a single index differs significantly between different indices. In such a case, it's better to split the tasks into segments while taking the size of each task into account. This patch implements `threading::parallel_for_weighted` which allows passing in an additional callback that returns the size of each task. Pull Request: https://projects.blender.org/blender/blender/pulls/118348	2024-02-25 15:01:05 +01:00
Campbell Barton	91895bf806	Unbreak build with GLIBC pre 2.28 Also de-duplicate rename logic for Linux & other UNIX systems.	2024-02-25 22:56:22 +11:00
Sebastian Parborg	8aed44471e	Merge branch 'blender-v4.1-release'	2024-02-22 14:28:04 +01:00
Sebastian Parborg	b4610f8fc0	Fix #116049, #117754: Renaming fails on Linux with certain filesystems Not all filesystems on Linux support the RENAME_NOREPLACE flag. If we get an EINVAL return value, retry with a non-atomic operation. RENAME_NOREPLACE was introduced in `050d48edfc`, so this is a regression fix as well. Pull Request: https://projects.blender.org/blender/blender/pulls/118571	2024-02-22 14:23:54 +01:00
Jacques Lucke	50709ca253	BLI: add named constructors for IndexRange Unless you're very familiar with `IndexRange`, it's often hard to know what e.g. `IndexRange(10, 15)` means. Without more context, one could think that it means `10-14`, `10-15` or `10-24`. This patch adds named constructors to `IndexRange` to make the behavior more obvious when writing and when reading the code. With those one can use `IndexRange::from_begin_end(10, 15)`, `IndexRange::from_begin_end_inclusive(10, 15)` or `IndexRange::from_begin_size(10, 15)` respectively. While being a bit more verbose, the explicitness makes code easier to understand and also allows abstracting away some common index computations. The old unnamed constructor that takes a begin and size is not removed by this patch, as that would make the patch significantly bigger. I think it's reasonable to generally use the named constructors going forward and to change the existing usages of the old constructor over time. Pull Request: https://projects.blender.org/blender/blender/pulls/118606	2024-02-22 12:57:10 +01:00
Campbell Barton	d4aedd89d0	Cleanup: spelling in comments	2024-02-22 22:40:46 +11:00
Julian Eisel	99673edd85	Cleanup: Add method to get UUID as std::string Avoids having to use the C-style `BLI_uuid_format()` function with manual buffer management, and makes it easy to get a `std::string` from a UUID.	2024-02-20 15:20:11 +01:00
Jacques Lucke	148cad93e3	BLI: simplify creating index masks from group ids Pull Request: https://projects.blender.org/blender/blender/pulls/118498	2024-02-20 13:18:16 +01:00
Sybren A. Stüvel	1ee414feb0	Cleanup: avoid compiler warning when USE_BRUTE_FORCE_ASSERT is undefined Avoid 'unused variable' compiler warning when `USE_BRUTE_FORCE_ASSERT` is not defined, in release mode builds. No functional changes.	2024-02-19 17:19:12 +01:00
Brecht Van Lommel	0f2064bc3b	Revert changes from main commits that were merged into blender-v4.1-release The last good commit was `4bf6a2e564`.	2024-02-19 15:59:59 +01:00
Hans Goudey	81a63153d0	Depsgraph: Rename "copy-on-write" to "copy-on-evaluation" The depsgraph CoW mechanism is a bit of a misnomer. It creates an evaluated copy for data-blocks regardless of whether the copy will actually be written to. The point is to have physical separation between original and evaluated data. This is in contrast to the commonly used performance improvement of keeping a user count and copying data implicitly when it needs to be changed. In Blender code we call this "implicit sharing" instead. Importantly, the dependency graph has no idea about the _actual_ CoW behavior in Blender. Renaming this functionality in the depsgraph removes some of the confusion that comes up when talking about this, and will hopefully make the depsgraph less confusing to understand initially too. Wording like "the evaluated copy" (as opposed to the original data-block) has also become common anyway. Pull Request: https://projects.blender.org/blender/blender/pulls/118338	2024-02-19 15:54:08 +01:00
Campbell Barton	14b5912eee	Cleanup: quiet C4551 warning for MSVC	2024-02-19 09:34:41 +11:00
Campbell Barton	5ae0b0c7f4	Cleanup: use the term "sincos" in convexhull_2d for clarity The 2D vector calculated from edge vectors represents sin & cos which wasn't obvious.	2024-02-16 14:26:51 +11:00
Campbell Barton	503d56e2c8	Cleanup: use const variables in convexhull_2d_sorted for clarity	2024-02-16 14:26:49 +11:00
Campbell Barton	5c87dfd269	Cleanup: use BLI_time_ prefix for time functions Also use the term "now" instead of "check" for clarity.	2024-02-15 13:15:56 +11:00
Hans Goudey	61e61ce0e1	Cleanup: Use Span instead of Vector const reference Span is preferable since it's agnostic of the source container, makes it clearer that there is no ownership, is 8 bytes smaller, and can be passed by value.	2024-02-14 17:23:01 -05:00
Hans Goudey	1c0f374ec3	Object: Move transform matrices to runtime struct The `object_to_world` and `world_to_object` matrices are set during depsgraph evaluation, calculated from the object's animated location, rotation, scale, parenting, and constraints. It's confusing and unnecessary to store them with the original data in DNA. This commit moves them to `ObjectRuntime` and moves the matrices to use the C++ `float4x4` type, giving the potential for simplified code using the C++ abstractions. The matrices are accessible with functions on `Object` directly since they are used so commonly. Though for write access, directly using the runtime struct is necessary. The inverse `world_to_object` matrix is often calculated before it's used, even though it's calculated as part of depsgraph evaluation. Long term we might not want to store this in `ObjectRuntime` at all, and just calculate it on demand. Or at least we should remove the redundant calculations. That should be done separately though. Pull Request: https://projects.blender.org/blender/blender/pulls/118210	2024-02-14 16:14:49 +01:00
Campbell Barton	aa6ab9caf9	Cleanup: various non-functional changes for C++	2024-02-14 13:56:58 +11:00
Campbell Barton	1111dff0a6	Tests: improvements to BLI_convexhull_2d_test The convex hull tests included a reference AABB-fitting function for comparison which was used to validate the optimized implementation. This wasn't great as it depended on matching exact return values and didn't test that the logic of AABB-fitting worked usefully. Replace this with a more general test that creates random polygons with known bounds, applies a random rotation & translation, then uses AABB-fitting to un-rotate the points, passing when the bounds are no larger than the size of the generated input. Details: - Make BLI_convexhull_aabb_fit_hull_2d a static function again as it was only exposed for tests. Use BLI_convexhull_aabb_fit_points_2d instead. - Remove brute force reference implementation from tests, moving this to an assertion within convexhull_2d (disabled by default since it's quite slow).	2024-02-14 13:42:14 +11:00
Campbell Barton	3f8cd44485	Cleanup: move BLI_strict_flags.h last, note that it should be kept last Also add a note in the header why it should be kept last.	2024-02-14 13:40:31 +11:00
Germano Cavalcante	c9bd326255	Merge branch 'blender-v4.1-release'	2024-02-13 20:35:52 -03:00
Germano Cavalcante	c6e229d3e4	Fix #118221: Snap to Edge with Constraint Plane shifts out of plane The intersection needs to be calculated with the plane passing through the snap pivot.	2024-02-13 20:35:08 -03:00
Hans Goudey	9cf304160b	BLI: Add missing overrides to some generic virtual array implementations The lack of these functions in the "single trivial value" and "sliced GVArray" implementations caused some code to fall back to the base class functions. Those are much slower since they involve a virtual function call per element. For example, this changed the runtime of creating a new boolean attribute set to "true" on one million faces from 3.4 ms to 0.35 ms. Pull Request: https://projects.blender.org/blender/blender/pulls/118161	2024-02-13 19:59:58 +01:00
Germano Cavalcante	1dd163c2f7	Fix: build error with 'WITH_CXX_GUARDEDALLOC'	2024-02-13 10:59:56 -03:00
Jacques Lucke	cd0e41c73e	BLI: improve printing of IndexMask The new printed format is like this: `(Size: 503 \| 0-499, 555, 699, 900)`.	2024-02-13 12:33:48 +01:00
Jacques Lucke	bce1edc2bd	BLI: add IndexMask.shift method	2024-02-13 12:33:48 +01:00
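
Note on commit 50709ca253 (named constructors for `IndexRange`): the three readings of `(10, 15)` listed in that message can be illustrated with a small standalone sketch. The `Range` struct below is a hypothetical stand-in written for this note, not Blender's actual `IndexRange` class; only the constructor names and the `10-14`, `10-15`, `10-24` results are taken from the commit message, and everything else (field names, `first()`, `last()`, the helper function) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for IndexRange. It stores a start index and a size,
// matching the "begin and size" meaning of the old unnamed constructor.
struct Range {
  int64_t start;
  int64_t size;

  // (begin, size): covers begin .. begin + size - 1.
  static constexpr Range from_begin_size(int64_t begin, int64_t size)
  {
    return Range{begin, size};
  }

  // (begin, end): half-open, the end index itself is excluded.
  static constexpr Range from_begin_end(int64_t begin, int64_t end)
  {
    return Range{begin, end - begin};
  }

  // (begin, last): closed, the last index itself is included.
  static constexpr Range from_begin_end_inclusive(int64_t begin, int64_t last)
  {
    return Range{begin, last - begin + 1};
  }

  constexpr int64_t first() const { return start; }
  constexpr int64_t last() const { return start + size - 1; }
};

// Run-time check of the three interpretations named in the commit message.
inline void check_named_constructors()
{
  assert(Range::from_begin_end(10, 15).last() == 14);            // 10-14
  assert(Range::from_begin_end_inclusive(10, 15).last() == 15);  // 10-15
  assert(Range::from_begin_size(10, 15).last() == 24);           // 10-24
}
```

Calling `check_named_constructors()` exercises all three cases. Spelling out the interpretation at the call site is what the commit message argues for: the arguments `(10, 15)` are identical in each call, and only the constructor name tells the reader which range is meant.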

5169 Commits