griefith/test

Author	SHA1	Message	Date
Brecht Van Lommel	98b3b36411	Refactor: Build: Add bf::dependencies::eigen target To make adding a dependency on TBB easier. Additional changes: * Using LIB for libmv tests, as it now brings in includes * Removing Eigen header listing in iTaSC Pull Request: https://projects.blender.org/blender/blender/pulls/136865	2025-04-02 16:50:46 +02:00
Jorn Visser	b1bb1d9815	Build: Make the FFTW threads library required to use FFTW This is done because the library is necessary to make certain FFTW functions thread safe, see #136557 as well. Also pass each library variable separately to `find_package_handle_standard_args` instead of as a list, as otherwise it won't correctly detect if `libfftw3f` or `libfftw3f_threads` is missing. This is because CMake considers a value false if it contains `-NOTFOUND` at the end, but not if it's in the middle. For example, CMake considers `.../libfftw3f.a;.../libfftw3f_threads.a;FFTW3_LIBRARY_D-NOTFOUND` to be false, but `.../libfftw3f.a;FFTW3_LIBRARY_THREADS_F-NOTFOUND;.../libfftw3.a` to be true. --- I noticed that some other find modules also have the same list issue. I guess it was done this way to make CMake print all the found libraries instead of only the first. Pull Request: https://projects.blender.org/blender/blender/pulls/136692	2025-03-31 14:42:35 +02:00
Jacques Lucke	202db40afb	Refactor: move uvproject code from blenlib to blenkernel I'm moving this for two (related) reasons: * It depends a lot on the specifics of `Camera` and `Object` data-blocks. * It links `Object::object_to_world()` which is not an inline function and thus easily leads to linker errors. It mostly seems like luck that this is not breaking our build due to early dead code elimination when linking binaries which use the blenlib static library such as `msgfmt`. I found this while working on a compilation tool which would not be as lucky and has a linker error because of the dependence on `Object::object_to_world`. Pull Request: https://projects.blender.org/blender/blender/pulls/136547	2025-03-26 20:51:57 +01:00
Jacques Lucke	ac2cd6c1ef	Geometry Nodes: make CSV parser more reliable and faster This reimplements the CSV parser used by the (still experimental) Import CSV node. Reliability is improved by: * Properly handling quoted fields. * Unit tests. * Generalizing the parser to be able to handle customized delimiter, quote and escape characters (those are not exposed in the node yet though). * More accurate detection of column types by actually taking all values of a column into account instead of only the first row. Performance is improved by designing the parser in a way that supports multi-threaded parsing. I'm measuring about 5x performance improvement which mainly comes from multi-threading. Some files I wanted to use for benchmarking didn't load in the version that's in `main` but do load fine with this new version. The implementation is now split up into two parts: 1. A general CSV parser in `blenlib` that manages splitting a buffer into records and their fields. 2. Application specific parsing of fields into e.g. floats and integers which remains in `io/csv/importer`. This separation simplifies unit testing and makes the core code more reusable. Pull Request: https://projects.blender.org/blender/blender/pulls/134715	2025-02-19 11:10:59 +01:00
Brecht Van Lommel	e2e1984e60	Refactor: Convert remainder of blenlib to C++ A few headers like BLI_math_constants.h and BLI_utildefines.h keep working for C code, for remaining makesdna and userdef defaults code in C. Pull Request: https://projects.blender.org/blender/blender/pulls/134406	2025-02-12 23:01:08 +01:00
Brecht Van Lommel	ec0fc49fb8	Cleanup: Remove unused BLI_blenlib.h	2025-01-31 17:03:18 +01:00
Jacques Lucke	619d9e4e01	Fix #98559 : support applying Geometry Nodes through multires modifier Before, it was only possible to apply modifiers through a multires modifier if they were deform-only. The Geometry Nodes modifier is of course not deform-only. However, often one can build a node setup, that only deforms and does nothing else. To make it possible to apply the Geometry Nodes modifier in such cases the following things had to be done: * Update `BKE_modifier_deform_verts` to work with modifiers that implement `modify_geometry_set` instead of `deform_verts`. * Add error handling for the case when `modify_geometry_set` does more than just deformation. * Allow the Geometry Nodes modifier to be applied through a multi-res modifier. Two new utility types (`ArrayState` and `MeshTopologyState`) have been introduced to allow for efficient and accurate checking whether the topology has been modified. In common cases, they can detect that the topology has not been changed in constant time, but they fall back to linear time checking if it's not immediately obvious that the topology has not been changed. This works with the example files from #98559 and #97603. Pull Request: https://projects.blender.org/blender/blender/pulls/131904	2025-01-23 17:34:30 +01:00
Brecht Van Lommel	920e709069	Refactor: Make header files more clangd and clang-tidy friendly When using clangd or running clang-tidy on headers there are currently many errors. These are noisy in IDEs, make auto fixes impossible, and break features like code completion, refactoring and navigation. This makes source/blender headers work by themselves, which is generally the goal anyway. But #includes and forward declarations were often incomplete. * Add #includes and forward declarations * Add IWYU pragma: export in a few places * Remove some unused #includes (but there are many more) * Tweak ShaderCreateInfo macros to work better with clangd Some types of headers still have errors, these could be fixed or worked around with more investigation. Mostly preprocessor template headers like NOD_static_types.h. Note that that disabling WITH_UNITY_BUILD is required for clangd to work properly, otherwise compile_commands.json does not contain the information for the relevant source files. For more details see the developer docs: https://developer.blender.org/docs/handbook/tooling/clangd/ Pull Request: https://projects.blender.org/blender/blender/pulls/132608	2025-01-07 12:39:13 +01:00
Hans Goudey	70a86b14c0	Cleanup: Replace BLI_range.h with Bounds<float> Pull Request: https://projects.blender.org/blender/blender/pulls/132181	2024-12-20 21:05:40 +01:00
Hans Goudey	0e15649bf8	Cleanup: Remove unused dynlib files This has been unused since the game engine was removed. Pull Request: https://projects.blender.org/blender/blender/pulls/132186	2024-12-20 20:38:40 +01:00
Hans Goudey	31964ef5ca	Cleanup: Move BLI_kdopbvh to C++ Pull Request: https://projects.blender.org/blender/blender/pulls/132031	2024-12-17 21:04:55 +01:00
Campbell Barton	5dd67c6e1c	Cleanup: sort CMake path lists	2024-12-09 09:18:50 +11:00
Hans Goudey	77af97d5b9	Cleanup: Move some blenlib files to C++ Pull Request: https://projects.blender.org/blender/blender/pulls/131319	2024-12-05 14:36:01 +01:00
Germano Cavalcante	1633766c2e	Refactor: move blenlib system_win32 to C++	2024-11-07 17:33:27 -03:00
Hans Goudey	5ed65e6608	Cleanup: Remove unused grease pencil V2 update cache, DLRB tree See #123468. Pull Request: https://projects.blender.org/blender/blender/pulls/129729	2024-11-02 22:18:10 +01:00
Campbell Barton	381898b6dc	Refactor: move BLI_path_util header to C++, rename to BLI_path_utils Move to a C++ header to allow C++ features to be used there, use the "utils" suffix as it's preferred for new files. Ref !128147	2024-09-26 21:13:39 +10:00
Campbell Barton	36edfe04e0	Refactor: move blenlib tempfile to C++	2024-09-26 09:39:35 +10:00
Jesse Yurkovich	447cde140d	Cleanup: Remove unused BLI_array macro implementation Remove unused files. Pull Request: https://projects.blender.org/blender/blender/pulls/127962	2024-09-22 00:53:14 +02:00
Campbell Barton	427be373f7	Cleanup: sort cmake file lists	2024-09-21 16:26:43 +10:00
Aras Pranckevicius	92544d6d76	BLI: add float<->half conversion functions with correct math, use in Vulkan Blender codebase had two ways to convert half (FP16) to float (FP32): - BLI_math_bits.h half_to_float. Out of 64k possible half values, it converts 4096 of them incorrectly. Mostly denormals and NaNs, which is perhaps not too relevant. But more importantly, it converts half zero to float 0.000030517578 which does not sound ideal. - Functions in Vulkan vk_data_conversion.hh. This one converts 2046 possible half values incorrectly. Function to convert float (FP32) to half (FP16) was in Vulkan vk_data_conversion.hh, and it got a bunch of possible inputs wrong. I guess it did not do proper "round to nearest even" that CPU/GPU hardware does. This PR: - Adds BLI_math_half.hh with float_to_half and half_to_float functions. - Documentation and test coverage. - When compiling on ARM NEON, use hardware VCVT instructions. - Removes the incorrect half_to_float from BLI_math_bits.h and replaces single usage of it in View3D color picking to use the new function. - Changes Vulkan FP32<->FP16 conversion code to use the new functions, to fix correctness issues (makes eevee_next_bsdf_vulkan test pass). This makes it faster too. Pull Request: https://projects.blender.org/blender/blender/pulls/127708	2024-09-18 13:15:00 +02:00
Aras Pranckevicius	806b0e8379	BLI: improve 2/3/4d vector codegen for debug or asserts-enabled builds Majority of math operations on VecBase<> were implemented by calling into an indexing operator, sometimes coupled with unroll<Size> template. When compiler optimizations are off (e.g. Debug build), or when asserts are on (e.g. usual "developer" setup), this resulted in codegen that is very sub-optimal. Especially if these vector types are used a lot, e.g. when scaling down a screenshot for saving as a thumbnail into the blend file. Address that by explicit code paths for 4,3,2 dimensional vectors, that avoids both the unroll<> template and indexing operator. To avoid repeated long typo-prone code, do that with C preprocessor :( -- however all of the preprocessor innards are in a separate file BLI_math_vector_unroll.hh so they do not get into the way much. Scaling down a screenshot to the blend file thumbnail, while saving the blend file, on my machine: (4K screen resolution, Ryzen 5950X, VS2022 build), which involves two calls to IMB_scale which uses float4 for pixel operations: - Release with asserts off (what ships to users): no change at 9.4ms - Release with asserts on ("developer" setup): 38.1ms -> 9.4ms - Debug: 226ms -> 64ms - Debug w/ ASAN: 314ms -> 78ms Pull Request: https://projects.blender.org/blender/blender/pulls/127577	2024-09-16 13:06:16 +02:00
Hans Goudey	13f179a9c0	Cleanup: Add utility function to sum offset indices group sizes I've done this a few times and would have benefited from a utility function for it, apparently it's done in a few more places too. The utilities aren't multithreaded for now, it doesn't seem important and often multithreading happens at a different level of the call stack anyway. Pull Request: https://projects.blender.org/blender/blender/pulls/127517	2024-09-12 20:28:35 +02:00
Jonas Holzman	fe93de1a91	Obj-C Refactor: Port `BLI_delete_soft` from `objc_` runtime calls to proper Obj-C Port the macOS version of the `BLI_delete_soft` function from raw runtime `objc_` calls function to proper Objective-C for increased readability and long-term maintainability. This new function is placed in a new `intern/fileops_apple.mm` file, analogous to the existing `intern/storage_apple.mm` file. Pull Request: https://projects.blender.org/blender/blender/pulls/126766	2024-09-03 12:08:20 +02:00
Jacques Lucke	66adedbd78	BLI: optimize constructing IndexMask from bits and bools This patch optimizes `IndexMask::from_bits` by making use of the fact that many bits can be processed at once and one does not have to look at every bit individual in many cases. Bits are stored as array of `BitInt` (aka `uint64_t`). So we can process at least 64 bits at a time. On some platforms we can also make use of SIMD and process up to 128 bits at once. This can significantly improve performance if all bits are set/unset. As a byproduct, this patch also optimizes `IndexMask::from_bools` which is now implemented in terms of `IndexMask::from_bits`. The conversion from bools to bits has been optimized significantly too by using SIMD intrinsics. Pull Request: https://projects.blender.org/blender/blender/pulls/126888	2024-08-29 12:15:33 +02:00
Campbell Barton	b76fcc3a1f	Cleanup: add missing headers to CMake's file listing	2024-08-23 10:19:53 +10:00
Campbell Barton	07b11206eb	Cleanup: sort cmake file lists	2024-08-23 10:19:53 +10:00
Campbell Barton	08d5eb8f9c	Cleanup: cmake formatting	2024-08-21 23:20:34 +10:00
Jacques Lucke	354a097ce0	Volumes: improve file cache and unloading This changes how the lazy-loading and unloading of volume grids works. With that it should also fix #124164. The cache is now moved to a deeper and more global level. This allows reloadable volume grids to be unloaded automatically when a memory limit is reached. The previous system for automatically unloading grids only worked in fairly specific cases and also did not work all that well with caching (parts of) volume sequences. At its core, this patch adds a general cache system in `BLI_memory_cache.hh`. It has a simple interface of the form `get(key, compute_if_not_cached_fn) -> value`. To avoid growing the cache indefinitely, it uses the new `BLI_memory_counter.hh` API to detect when the cache size limit is reached. In this case it can automatically free some cached values. Currently, this uses an LRU system, where the items that have not been used in a while are removed first. Other heuristics can be implemented too, but especially for caches for loading files from disk this works well already. The new memory cache is internally used by `volume_grid_file_cache.cc` for loading individual volume grids and their simplified variants. It could potentially also be used to cache which grids are stored in a file. Additionally, it can potentially also be used as caching layer in more places like loading bakes or in import geometry nodes. It's not clear yet whether this will need an extension to the API which currently is fairly minimal. To allow different systems to use the same memory cache, it has to support arbitrary identifiers for the cached data. Therefore, this patch also introduces `GenericKey`, which is an abstract base class for any kind of key that is comparable, hashable and copyable. The implementation of the cache currently relies on a new `ConcurrentMap` data-structure which is a thin wrapper around `tbb::concurrent_hash_map` with a fallback implementation for when `tbb` is not available. 
This data structure allows concurrent reads and writes to the cache. Note that adding data to the cache is still serialized because of the memory counting. The size of the cache depends on the `memory_cache_limit` property that's already shown in the user preferences. While it has a generic name, it's currently only used by the VSE which is currently using the `MEM_CacheLimiter` API which has a similar purpose but seems to be less automatic, thread-safe and also has no idea of implicit-sharing. It also seems to be designed in a way where one is expected to create multiple "cache limiters" each of which has its own limit. Longer term, we should probably strive towards unifying these systems, which seems feasible but a bit out of scope right now. While it's not ideal that these cache systems don't use a shared memory limit, it's essentially what we already have for all cache systems in Blender, so it's nothing new. Some tests for lazy-loading had to be removed because this behavior is more implicit now and is not as easily observable from the outside. Pull Request: https://projects.blender.org/blender/blender/pulls/126411	2024-08-19 20:39:32 +02:00
Jacques Lucke	a8667aa03f	Core: introduce MemoryCounter API We often have the situation where it would be good if we could easily estimate the memory usage of some value (e.g. a mesh, or volume). Examples of where we ran into this in the past: * Undo step size. * Caching of volume grids. * Caching of loaded geometries for import geometry nodes. Generally, most caching systems would benefit from the ability to know how much memory they currently use to make better decisions about which data to free and when. The goal of this patch is to introduce a simple general API to count the memory usage that is independent of any specific caching system. I'm doing this to "fix" the chicken and egg problem that caches need to know the memory usage, but we don't really need to count the memory usage without using it for caches. Implementing caching and memory counting at the same time make both harder than implementing them one after another. The main difficulty with counting memory usage is that some memory may be shared using implicit sharing. We want to avoid double counting such memory. How exactly shared memory is treated depends a bit on the use case, so no specific assumptions are made about that in the API. The gathered memory usage is not expected to be exact. It's expected to be a decent approximation. It's neither a lower nor an upper bound unless specified by some specific type. Cache systems generally build on top of heuristics to decide when to free what anyway. There are two sides to this API: 1. Get the amount of memory used by one or more values. This side is used by caching systems and/or systems that want to present the used memory to the user. 2. Tell the caller how much memory is used. This side is used by all kinds of types that can report their memory usage such as meshes. ```cpp /* Get how much memory is used by two meshes together. 
*/ MemoryCounter memory; mesh_a->count_memory(memory); mesh_b->count_memory(memory); int64_t bytes_used = memory.counted_bytes(); /* Tell the caller how much memory is used. */ void Mesh::count_memory(blender::MemoryCounter &memory) const { memory.add_shared(this->runtime->face_offsets_sharing_info, this->face_offsets().size_in_bytes()); /* Forward memory counting to lower level types. This should be fairly common. */ CustomData_count_memory(this->vert_data, this->verts_num, memory); } void CustomData_count_memory(const CustomData &data, const int totelem, blender::MemoryCounter &memory) { for (const CustomDataLayer &layer : Span{data.layers, data.totlayer}) { memory.add_shared(layer.sharing_info, [&](blender::MemoryCounter &shared_memory) { /* Not quite correct for all types, but this is only a rough approximation anyway. */ const int64_t elem_size = CustomData_get_elem_size(&layer); shared_memory.add(totelem * elem_size); }); } } ``` Pull Request: https://projects.blender.org/blender/blender/pulls/126295	2024-08-15 10:54:21 +02:00
Iliya Katueshenock	4839a86984	Cleanup: BLI: Merge files Deduplicate IndexRange implementation files. Pull Request: https://projects.blender.org/blender/blender/pulls/126169	2024-08-13 20:26:55 +02:00
Sergey Sharybin	baf9691959	BLI: Add easy and portable way of platform-specific checks Covers OS detection, CPU architecture, bitness, and compiler family. The goal of this change is to provide easier to use and remember checks for these things. For example, with this change code like ``` #ifdef _WIN32 ... #elif defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || \ defined(__OpenBSD__) .. #endif ``` becomes ``` #if OS_WIN ... #elif OS_MAC || OS_BSD ... #endif ``` The code is originally based on build_config.h from Chromium, which was first modified for Libmv, then to some other projects, and now is adopted for Blender itself. The checks are relying on the -Wundef to provide hint of cases when an include is missing prior to the platform-specific checks. This change only introduces possibility of cleaner checks and does not start actual refactor. Pull Request: https://projects.blender.org/blender/blender/pulls/118908	2024-08-06 14:33:53 +02:00
Jesse Yurkovich	ec4fc2d34a	CMake: Modernize the optional TBB dependency This continues the cmake modernization effort and introduces support for allowing our optional dependencies to integrate properly. TBB is added here as it's proven troublesome to maintain correctly. Currently the only Blender project which uses the TBB headers directly is `blenlib`. However, all downstream projects which require blenlib as their dependency, and wish to properly make use of its threading facilities, needed to define various TBB items in their CMake files. Not only is this unnecessary and arcane, but several projects didn't do this and ended up not using threading as well as producing ODR violations along the way[1]. This PR makes TBB a modern dependency and exposes it PUBLIC'ly from `blenlib`. All downstream projects which depend on blenlib will now receive everything they require from TBB automatically. This includes the `WITH_TBB` define, the headers, and the library itself. [1] blender/blender@05241f47f5 Pull Request: https://projects.blender.org/blender/blender/pulls/124916	2024-07-19 23:30:56 +02:00
Miguel Pozo	74224b25a5	GPU: Add GPU_shader_batch_create_from_infos This is the first commit of the several required to support subprocess-based parallel compilation on OpenGL. This provides the base API and implementation, and exposes the max subprocesses setting on the UI, but it's not used by any code yet. More information and the rest of the code can be found in #121925. This one includes: - A new `GPU_shader_batch` API that allows requesting the compilation of multiple shaders at once, allowing GPU backed to compile them in parallel and asynchronously without blocking the Blender UI. - A virtual `ShaderCompiler` class that backends can use to add their own implementation. - A `ShaderCompilerGeneric` class that implements synchronous/blocking compilation of batches for backends that don't have their own implementation yet. - A `GLShaderCompiler` that supports parallel compilation using subprocesses. - A new `BLI_subprocess` API, including IPC (required for the `GLShaderCompiler` implementation). - The implementation of the subprocess program in `GPU_compilation_subprocess`. - A new `Max Shader Compilation Subprocesses` option in `Preferences > System > Memory & Limits` to enable parallel shader compilation and the max number of subprocesses to allocate (each subprocess has a relatively high memory footprint). Implementation Overview: There's a single `GLShaderCompiler` shared by all OpenGL contexts. This class stores a pool of up to `GCaps.max_parallel_compilations` subprocesses that can be used for compilation. Each subprocess has a shared memory pool used for sending the shader source code from the main Blender process and for receiving the already compiled shader binary from the subprocess. This is synchronized using a series of shared semaphores. The subprocesses maintain a shader cache on disk inside a `BLENDER_SHADER_CACHE` folder at the OS temporary folder. Shaders that fail to compile are tried to be compiled again locally for proper error reports. 
Hung subprocesses are currently detected using a timeout of 30s. Pull Request: https://projects.blender.org/blender/blender/pulls/122232	2024-06-05 18:45:57 +02:00
Omar Emara	d4bf23771d	Compositor: Optimize Fog Glow Glare node This patches optimizes the Fog Glow Glare node to be about 25x faster for 4K images. This is mainly achieved by utilizing the FFTW library and multi-threading support code. Further improvements are still possible by caching kernels, but the CPU compositor does not support caching yet. The old Hartley transform was removed, so the node no longer works when FFTW is disabled as a build time option, much like the OIDN node. A new BLI library was introduced for FFTW, it includes some helper routines relevant for FFTW as well as an initialization routine that sets up multithreading using TBB as well as thread safety. Build system support for threaded FFTW was also added, which defines the relevant variables to detect threading support as well as add the relevant libraries. We do not currently have the threaded FFTW libs in our precompiled libs, so the threading code is disabled until the libs lands in the coming weeks. So currently, the code is only about 9x faster. The only functional change is that the kernel is now odd sized, which should produce more accurate results, but the final result is almost identical and mostly undetectable. The plan is to port this to the GPU as well similar to how we implement OIDN until we have a GPU FFT implementation. GPU compositor can also do caching, so it should be faster, being able to compute a 4K image in under half a second. Pull Request: https://projects.blender.org/blender/blender/pulls/121653	2024-05-17 12:45:21 +02:00
Campbell Barton	ab6e00bd7d	Cleanup: sort cmake file lists	2024-05-06 09:20:57 +10:00
Sergey Sharybin	1b0012d51c	Refactor: Require C++ for users of BLI_simd.h This is because sse2neon.h might be used to emulate SSE intrinsics on ARM64 architecture, and it uses some preprocessor which is not available for C language when using MSVC. The old-style math file math_matrix.c uses this header, so needed to become C++. Simple rename did not work since there is a new math utility math_matrix.cc exists. Following some existing convention the math_matrix.c is renamed to math_matrix_c.cc. Eventually all the code should switch to use C++ style math, and the C style removed, so it seems reasonable to not mix old and new style of API in the same file. There should be no functional changes. Pull Request: https://projects.blender.org/blender/blender/pulls/121335	2024-05-02 16:22:19 +02:00
Jacques Lucke	8d13a9608b	BLI: generalize task size hints for parallel_for This integrates the functionality for `parallel_for_weighted` from `9a3ceb79de` into `parallel_for`. This reduces the number of entry points to the threading API and also makes it easier to build higher level threading primitives. For example, `IndexMask.foreach_*` may use `parallel_for` if a `GrainSize` is provided, but can't use `parallel_for_weighted` easily without duplicating a fair amount of code. The default behavior of `parallel_for` does not change. However, now one can optionally pass in `TaskSizeHints` as the last parameter. This can be used to specify the size of individual tasks relative to each other and relative to the grain size. This helps scheduling more equally sized tasks which generally improves performance because threads are used more effectively. One generally does not construct `TaskSizeHints` manually, but calls either `threading::individual_task_sizes` or `threading::accumulated_task_sizes`. Both allow specifying individual task sizes, but the latter should be used when the combined size of consecutive tasks can be computed in O(1) time. This allows splitting up the work more efficiently. It can often be used in conjunction with `OffsetIndices`. Pull Request: https://projects.blender.org/blender/blender/pulls/121127	2024-04-29 23:55:22 +02:00
Jacques Lucke	51f8bf53b2	Geometry Nodes: use xxhash for compute context hash Previously, md5 was used which is significantly slower. In almost all cases this does not have a significant performance impact in practice. However, it's possible to build geometry nodes setups that become a few percent faster ( by combining lots of cheap node groups). Using xxhash instead of md5 should never be slower. Pull Request: https://projects.blender.org/blender/blender/pulls/120225	2024-04-03 20:11:09 +02:00
Campbell Barton	937776b555	Cleanup: sort CMake file lists	2024-04-01 16:48:44 +11:00
Jacques Lucke	7314c86869	BLI: add fixed width integer type This is intended to be used in the new exact mesh boolean algorithm by @howardt. The new `BLI_fixed_width_int.hh` header provides types like `Int256` and `UInt256` which are like e.g. `uint64_t` but with higher precision. The code supports many different integer sizes. The following operations are supported: * Addition * Subtraction * Multiplication * Comparisons * Negation * Conversion to and from other number types * Conversion to and from string (based on `GMP`) Division is not implemented. It could be implemented, but it's more complex and is not required for the new mesh boolean algorithm. Some alternatives to having a custom implementation have been discussed in https://devtalk.blender.org/t/fixed-length-multiprecision-arithmetic/29189/. Generally, the implementation is fairly straight forward. The main complexity is the addition/multiplication algorithm which isn't too complicated. It's nice to have control over this part as it allows us to optimize the code more if necessary. Also, from what I understand, we might be able to benefit from some special cases like multiplying a large integer with a smaller one. I tried some different ways to optimize this already, but so far the normal compiler optimization turned out to work best. Not sure if the same is true on windows though, as it doesn't have native support for an `int128` which helps the compiler understand what I'm doing. Alternatives I tried so far are using intrinsics directly (mainly `_addcarry_u64` and similar), writing inline assembly manually and copying the assembly output from the compiler. I assume the assembly implementation didn't help for me because it prohibited other compiler optimizations. Pull Request: https://projects.blender.org/blender/blender/pulls/119528	2024-03-25 23:39:42 +01:00
Jacques Lucke	ee1fa8e1ca	BLI: support set operations on index masks The `IndexMask` data structure was designed to allow us to implement set operations like `union`, `intersection` and `difference` efficiently (`2cfcb8b0b8`). This patch adds an evaluator for arbitrary expressions involving the mentioned operations. The evaluator makes use of the design of the `IndexMask` data structure to be quite efficient. In some common cases, the evaluator runs in constant time. So it's very fast even if the mask contains many millions of indices. If possible the evaluator works on entire segments at once instead of looking at the individual indices. This results in a very low constant factor even if the evaluation time is linear. If the evaluator has to look at the individual indices to be able to perform the operation, it can make use of multi-threading. The evaluation consists of the following steps: 1. A coarse evaluation that looks at entire segments at once. 2. All segments that couldn't be fully evaluated by the coarse evaluation are evaluated exactly by looking at the actual indices. There are two evaluators for this case. One that is based on `std::set_union` etc. The other one first converts the index masks to bit spans, then does bit operations to evaluate the expression, and then converts the bits back into indices. Depending on the expression, one or the other can be more efficient. 3. Construct an index mask from the evaluated segments. Showing the performance of the evaluator is kind of difficult because it highly depends on the input data. Comparing the performance to something that does not short-circuit when there are full ranges is meaningless, because one can construct an example where the new evaluator is arbitrarily faster. I'm still working on a case where performance can be compared to e.g. using `std::set_union`. This comparison is only fair when the input data when constructing a case where the new evaluator can't short-circuit. 
One of the main remaining bottlenecks are the calls to `slice_content` on large index masks. I think the impact of those can still be reduced. We are not using this evaluator much yet, except through `IndexMask::complement` calls. I intend to use it when I get to refactoring the field evaluator for geometry nodes to optimize the evaluation of selections. Pull Request: https://projects.blender.org/blender/blender/pulls/117805	2024-03-17 09:52:32 +01:00
Campbell Barton	9796805bb8	Cleanup: sort CMake source files	2024-03-07 13:29:09 +11:00
Hans Goudey	139607dd26	Cleanup: Move BLI_bitmap_draw_2d.h to C++	2024-03-05 10:28:17 -05:00
Hans Goudey	164eb3c25b	Cleanup: Move lasso utility files to C++	2024-03-05 10:23:11 -05:00
Jacques Lucke	fe2a47b5a7	BLI: add chunked list data structure that uses linear allocator This adds a new special purpose container data structure that can be used to gather many elements into many (potentially small) lists efficiently. I originally worked on this data structure because I might want to use it in #118772. However, also it's useful in the geometry nodes logger already. I'm measuring a 10-20% speed improvement in my many-math-nodes file when I enable logging for all sockets (not just the ones that are currently visible). Pull Request: https://projects.blender.org/blender/blender/pulls/118774	2024-02-28 22:22:21 +01:00
Jacques Lucke	1e20f06c21	BLI: add utility to simplify creating proper random access iterator The difficulty of implementing this iterator is that it requires lots of operator overloads which are usually very simple to implement, but result in a lot of code. The goal of this patch is to abstract the common parts so that it becomes easier to implement random accessor iterators. Many algorithms can work more efficiently with random access iterators than with other iterator types. Also see https://en.cppreference.com/w/cpp/iterator/random_access_iterator Pull Request: https://projects.blender.org/blender/blender/pulls/118113	2024-02-17 20:59:45 +01:00
Campbell Barton	7747b8c944	Fix convexhull_2d_test for macOS & re-enable the test Use EXPECT_NEAR instead of EXPECT_EQ to account for a differences in atan2 implementation on macOS, more generally relying on exact float comparison for tests is error prone.	2024-02-13 14:07:26 +11:00
Hans Goudey	1394907474	Cleanup: Move uvproject.c to C++	2024-02-12 20:43:24 -05:00
Campbell Barton	fb81bbaa60	Tests: disable BLI_convexhull_2d_test which fails on macOS	2024-02-13 00:34:44 +11:00
Campbell Barton	b91918564d	Tests: add tests for convexhull_2d Move BLI_convexhull_aabb_fit_points_2d to a public function to be able to compare fitting one convex hull with a simple reference method. One test is disabled as it exposes an error in convex hull calculation which needs further investigation.	2024-02-12 20:17:19 +11:00
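
Commit 92544d6d76 above reports that the old `half_to_float` mapped half zero to 0.000030517578 and got denormals and NaNs wrong. The sketch below shows one way to handle those cases with plain bit manipulation. It is not the code from `BLI_math_half.hh`; the name `half_to_float_sketch` is hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only; not Blender's implementation.
// Converts an IEEE 754 binary16 bit pattern to a binary32 float.
inline float half_to_float_sketch(uint16_t h)
{
  const uint32_t sign = uint32_t(h & 0x8000u) << 16;
  const uint32_t exponent = (h >> 10) & 0x1Fu;
  uint32_t mantissa = h & 0x3FFu;
  uint32_t bits;
  if (exponent == 0) {
    if (mantissa == 0) {
      // Signed zero stays zero.
      bits = sign;
    }
    else {
      // Denormal half: shift until the implicit leading bit appears,
      // then lower the float exponent by the number of shifts.
      uint32_t shift = 0;
      while ((mantissa & 0x400u) == 0) {
        mantissa <<= 1;
        shift++;
      }
      mantissa &= 0x3FFu;
      bits = sign | ((127 - 14 - shift) << 23) | (mantissa << 13);
    }
  }
  else if (exponent == 0x1Fu) {
    // Infinity (mantissa == 0) or NaN (mantissa != 0).
    bits = sign | 0x7F800000u | (mantissa << 13);
  }
  else {
    // Normal number: rebias the exponent from 15 to 127.
    bits = sign | ((exponent + 112) << 23) | (mantissa << 13);
  }
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Every half value is exactly representable as a float, so this direction needs no rounding. The float to half direction is where "round to nearest even" matters, and it is not covered here.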

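
Commit 354a097ce0 above describes a cache with an interface of the form `get(key, compute_if_not_cached_fn) -> value` that frees least recently used values once a memory limit is reached. The following is a minimal single-threaded sketch of that idea. It is not the `BLI_memory_cache.hh` API: the class `LruByteCache`, its `Value` struct, and the string keys are all assumptions made for this example, and the real system uses `GenericKey`, `MemoryCounter` and a concurrent map instead.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical sketch for illustration only; not Blender's memory cache.
class LruByteCache {
 public:
  struct Value {
    std::string data;
    int64_t size_in_bytes;
  };

  explicit LruByteCache(int64_t limit_in_bytes) : limit_(limit_in_bytes) {}

  // Returns the cached value for `key`, computing and storing it on a miss.
  std::string get(const std::string &key, const std::function<Value()> &compute_fn)
  {
    auto it = map_.find(key);
    if (it != map_.end()) {
      // Cache hit: move the item to the front (most recently used).
      items_.splice(items_.begin(), items_, it->second);
      return it->second->value.data;
    }
    items_.push_front(Item{key, compute_fn()});
    map_[key] = items_.begin();
    used_ += items_.front().value.size_in_bytes;
    // Evict from the back (least recently used), but keep the new item.
    while (used_ > limit_ && items_.size() > 1) {
      const Item &oldest = items_.back();
      used_ -= oldest.value.size_in_bytes;
      map_.erase(oldest.key);
      items_.pop_back();
    }
    return items_.front().value.data;
  }

  bool contains(const std::string &key) const { return map_.count(key) != 0; }
  int64_t used_bytes() const { return used_; }

 private:
  struct Item {
    std::string key;
    Value value;
  };
  int64_t limit_;
  int64_t used_ = 0;
  std::list<Item> items_;
  std::unordered_map<std::string, std::list<Item>::iterator> map_;
};
```

The list keeps recency order and the map gives constant-time lookup, so both a hit and an eviction avoid scanning the cache. Returning the value by copy keeps callers safe from later evictions; a real cache would more likely hand out shared ownership.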

426 Commits
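
Commit 66adedbd78 above explains that `IndexMask::from_bits` became faster by handling 64 bits at a time instead of inspecting each bit. The sketch below illustrates only that word-at-a-time idea on a plain vector of `uint64_t`. The function `indices_from_bits_sketch` is hypothetical, produces a flat list of indices rather than an `IndexMask`, and leaves out the SIMD path mentioned in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not IndexMask::from_bits.
// Collects the indices of all set bits, taking shortcuts per 64-bit word.
inline std::vector<int64_t> indices_from_bits_sketch(const std::vector<uint64_t> &words)
{
  std::vector<int64_t> indices;
  for (std::size_t w = 0; w < words.size(); w++) {
    uint64_t word = words[w];
    const int64_t base = int64_t(w) * 64;
    if (word == 0) {
      // All 64 bits are unset: skip the whole word at once.
      continue;
    }
    if (word == ~uint64_t(0)) {
      // All 64 bits are set: emit the full range without testing bits.
      for (int64_t i = 0; i < 64; i++) {
        indices.push_back(base + i);
      }
      continue;
    }
    // Mixed word: visit only the set bits, lowest first.
    while (word != 0) {
      int bit = 0;
      while (((word >> bit) & 1u) == 0) {
        bit++;
      }
      indices.push_back(base + bit);
      // Clear the lowest set bit.
      word &= word - 1;
    }
  }
  return indices;
}
```

The two early-out branches are what make the all-set and all-unset cases cheap, which matches the commit's note that performance improves most when all bits are set or unset.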