Commit Graph

6782 Commits

Author SHA1 Message Date
Campbell Barton
49bf7ebbdd Cleanup: use const args & variables, remove redundant checks
- Declare const variables & arguments.
- Remove redundant null checks.
- Remove break after return.
- Replace suspicious "&" with "&&".
2024-04-15 09:50:47 +10:00
Jacques Lucke
c819d9fdc9 Fix #120579: incorrect compute context hashes
The problem was that `XXH3_128bits` was called on `len` bytes
and not `HashSizeInBytes + len` as before 51f8bf53b2.
This lead to more compute context duplicates that one would expect.

I changed the code a little bit to make this mistake less likely in case
the hash function is ever changed to something else.
2024-04-14 13:20:32 +02:00
Jacques Lucke
769a9069de BLI: use float instead of int for weights in string search
Floats are a bit more convenient to deal with. Also, I find myself
expecting this to be a float on the call site.
2024-04-12 14:34:17 +02:00
Campbell Barton
3a8cceee7d Cleanup: remove redundant checks & assignments 2024-04-11 20:47:07 +10:00
Campbell Barton
962d2ca6a6 Cleanup: use a const ListBase argument to BLI_uniquename
The list-base isn't manipulated, only the link argument.
2024-04-11 17:44:27 +10:00
Campbell Barton
a70f667f8b Cleanup: pass args by reference instead of value in mesh_boolean.cc 2024-04-11 17:44:27 +10:00
Campbell Barton
09ee8d97e6 Cleanup: use C-style comments for descriptive text 2024-04-11 17:44:27 +10:00
Iliya Katueshenock
6bafe65d28 Mesh: Calculate edges with VectorSet instead of Map
Due to legacy reasons (`MEdge`), edge calculation was being done with
idea that edges cannot be temporarily copied. But today, edges are just
`int2`, so using `edge *` instead of `edge` actually made things worse.
And since `OrderedEdge` itself is the same thing as `int2`, it does not
make sense to use `Map` for edges. So, now edges are in a hash set.
To be able to take index of edges, `VectorSet` is used.

The only functional change now is that original edges will be reordered
as well. This should be okay just like an unintentional but stable
indices change.

For 2'000 x 2'000 x 2'000 cube edges calculation, change is around
`3703.47` -> `2911.18` ms.

In order to reduce memory usage, a template parameter is added to
`VectorSet` slots, so they can use a 32 instead of 64 bit index type.
Without that, the performance change is not consistent and might not be
better on a computer with more memory bandwidth.

Co-authored-by: Hans Goudey <hans@blender.org>

Pull Request: https://projects.blender.org/blender/blender/pulls/120224
2024-04-11 04:33:25 +02:00
Clément Foucault
99ebc1f7d3 BLI: Fix inverted 0 determinant for infinite orthographic projection
This was creating issues with triangle winding order since
the resulting matrix was degenerate (0 determinant). Which
caused the Draw manager to wrongly invert the winding.

This fixes a bug in EEVEE-Next mesh voxelization for volume
rendering (with accurate method).
2024-04-10 15:53:51 +02:00
Falk David
7ce0b625cb BLI: IndexMask: Add binary set operations
The `IndexMask` class already had a static function `from_union`.
This adds two new functions `from_difference` and `from_intersection`
as well as tests for each of them.

It also uses `from_intersection` in two grease pencil utility functions.

Pull Request: https://projects.blender.org/blender/blender/pulls/120419
2024-04-09 12:08:14 +02:00
Campbell Barton
e01525cf2c Cleanup: remove redundant variables & assignments
Co-authored-by: Sean Kim <SeanCTKim@protonmail.com>
2024-04-09 13:52:41 +10:00
Clément Foucault
4a7e98be40 EEVEE-Next: Volume: Fragment shader voxelization
This replaces the compute shader pass for volume material properties
voxelization by a fragment shader that is run only once per pixel.
The fragment shader then execute the nodetree in a loop for each
individual froxel.

The motivations are:
- faster evaluation of homogenous materials: can evaluate nodetree
  once and fast write the properties for all froxel in a loop.
  This matches cycles homogenous material optimization (except that
  it only considers the first hit).
- no invocations for empty froxels: not restricted to box dispach.
- support for more than one material: invocations are per pixel.
- cleaner implementation (no compute shader specific paths).

Implementation wise, this is done by adding a stencil texture when
rendering volumetric objects. It is populated during the occupancy
phase but it is not directly used (the stencil test is enabled but
since we use `imageAtomic` to set the occupancy bits, the fragment
shader is forced to be run). The early depth-test is then turned
on for the material properties pass, allowing only one fragment to
be invoked.
This fragment runs the nodetree at the desired frequency: once per
direction (homogenous), or once per froxel (heterogenous).

Note that I tried to use the frontmost fragment using a depth equal
test but it was failing for some reason on Apple silicon producing
flickering artifacts. We might reconsider this frontmost fragment
approach later since the result is now face order dependant when
an object has multiple materials.

Pull Request: https://projects.blender.org/blender/blender/pulls/119439
2024-04-05 16:33:58 +02:00
Aras Pranckevicius
4cb1d7242a Tests: extend ghash performance tests to cover blender::Map
Clean up the GHash performance testing suite (more C++ constructs), and
extend to cover blender::Map equivalents of the same functionality.

Pull Request: https://projects.blender.org/blender/blender/pulls/120252
2024-04-05 08:04:52 +02:00
Hans Goudey
274d7c6d12 Cleanup: Remove unused BVH tree function 2024-04-04 14:49:01 -04:00
Sebastian Parborg
a9fff1f22f Fix: Missing include in BLI_implicit_sharing.hh 2024-04-04 17:13:31 +02:00
Campbell Barton
eb04e1a753 Cleanup: quiet set-but-unused warnings 2024-04-04 10:55:18 +11:00
Campbell Barton
cc4b5facb8 Cleanup: order index range checks before using them to index arrays
While these particular cases didn't cause out-of-bounds array access,
it's reads like it could be an oversight.
2024-04-04 10:55:16 +11:00
Campbell Barton
fdaaebce54 Cleanup: remove unnecessary checks & unused assignments 2024-04-04 10:55:13 +11:00
Campbell Barton
52ce8d408f Cleanup: use const arguments & variables 2024-04-04 10:55:10 +11:00
Jacques Lucke
51f8bf53b2 Geometry Nodes: use xxhash for compute context hash
Previously, md5 was used which is significantly slower. In almost all cases
this does not have a significant performance impact in practice. However,
it's possible to build geometry nodes setups that become a few percent
faster ( by combining lots of cheap node groups). Using xxhash instead of
md5 should never be slower.

Pull Request: https://projects.blender.org/blender/blender/pulls/120225
2024-04-03 20:11:09 +02:00
Hans Goudey
e7339bdd5f Geometry: Use implicit sharing for deformed positions
Avoid copying the positions array into the evaluated edit hints array
that's used to support editing with deformed positions when there is
a topology-changing procedural operation. In a simple test in sculpt
mode with 706k curve points, memory usage went from 78 to 70 MB.

This adds more duplication would be ideal, mainly because retrieving
the data with write access and making implicit sharing info for arbitrary
arrays aren't abstracted by implicit sharing utilities. It may be possible
to improve both of those aspects, either now or in the future.

Pull Request: https://projects.blender.org/blender/blender/pulls/120146
2024-04-03 14:14:34 +02:00
Campbell Barton
8252208955 Cleanup: pass CoplanarClusterInfo::add_cluster arg by const reference 2024-04-03 15:04:31 +11:00
Campbell Barton
861536b24c Unbreak lite-build WITH_GMP enabled 2024-04-03 15:04:31 +11:00
Campbell Barton
c4c1aedd00 Cleanup: correct comments in scanfill.c, don't use bool for a flag
Logically PolyFill::f is a flag so use uchar instead of bool.
2024-04-03 14:07:37 +11:00
Campbell Barton
d5d1025e94 Cleanup: use const pointer arguments 2024-04-03 10:22:05 +11:00
Harley Acheson
d1b6621903 UI: Complete Event Icon Coverage
This PR completes coverage of the event icons used to represent keymap
entries on the status bar. This adds 0-9, non-alpha keys, tablet, F13-
F24, NDOF buttons, etc.

Pull Request: https://projects.blender.org/blender/blender/pulls/120117
2024-04-02 21:52:57 +02:00
Jacques Lucke
6f6ab91dbf Fix: support parallel_for_each with IndexRange without TBB 2024-04-02 15:12:37 +02:00
Sebastian Parborg
658eba4b2e Fix #119966: File rename fails on Mac with certain filesystems
As in the Linux case, it seems like the atomic rename doesn't work on all file systems on Mac either.
We did test on Windows and it seems like there is a built in fallback, so we don't need to do this there.

Pull Request: https://projects.blender.org/blender/blender/pulls/120037
2024-04-02 12:27:36 +02:00
Campbell Barton
99a60dd6c1 BLI_convexhull_2d: correct ifdef check
the check for USE_ANGLE_ITER_ORDER_ASSERT was flipped.
2024-04-01 23:58:52 +11:00
Campbell Barton
937776b555 Cleanup: sort CMake file lists 2024-04-01 16:48:44 +11:00
Campbell Barton
4855f8cd9c BLI_convexhull_2d: optimize rotating calipers
Previously the hulls edges were simply iterated over causing the
rotating calipers to step over points 4x as many times as is needed.

Avoid this by adding angle stepping logic that maps all angles to a
single quadrant, reducing the checks needed to advance the calipers
to each new angle. This gives ~1.4x speedup to AABB fitting logic.

Also add a test for octagon shapes to ensure axis aligned edges work
as expected.
2024-03-31 22:47:23 +11:00
Campbell Barton
7c4b2ec722 BLI_convexhull_2d: adjust order of edge iteration
Begin testing the edge edge between indices [0, 1] indices,
instead of [last, 0]. This only ever makes a difference as a tie breaker,
where [0, 1] is now prioritized.

This minor change simplifies further optimizations.
2024-03-31 22:39:14 +11:00
Falk David
614a23e9f6 Fix: BLI: Bounds is_empty function
This was meant to be the same as `BLI_rct*_is_empty`
but wasn't because the `less_or_equal_than` was
effectively doing a logical "and", when it should have
been doing a logical "or".
2024-03-29 17:12:51 +01:00
Hoshinova
c78c6b0bdf Fix #119797: Noise Texture Precision Issues
The Perlin noise algorithms suffer from precision issues when a coordinate
is greater than about 250000.

To fix this the Perlin noise texture is repeated every 100000 on each axis.
This causes discontinuities every 100000, however at such scales this
usually shouldn't be noticeable.

Pull Request: https://projects.blender.org/blender/blender/pulls/119884
2024-03-29 16:12:23 +01:00
Jesse Yurkovich
b37e825d89 Fix: MSVC ICE in MatBase stream output operator
Recently an internal compiler error has been popping up for folks
stemming from our MatBase matrix `operator<<`.

My guess is that the nested fold-expression (coming from `unroll`) and
the lambda is causing MSVC to become very upset in some instances.

Regardless of the actual cause, using simple for loops results in less
generated code and the use of `unroll` isn't required since these output
operators are mainly for debugging.

Unfortunately I've been unable to reproduce it in simpler contexts to
report it upstream.

Pull Request: https://projects.blender.org/blender/blender/pulls/119982
2024-03-28 19:31:18 +01:00
Campbell Barton
686605a6dd Cleanup: declare arrays as const where possible 2024-03-28 22:57:57 +11:00
Campbell Barton
b2e00d1285 Cleanup: use const pointer arguments 2024-03-28 20:57:50 +11:00
Campbell Barton
939e076fdc Cleanup: remove redundant assignment & null check 2024-03-28 13:01:36 +11:00
Campbell Barton
3416fe6e1e License headers: add SPDX headers 2024-03-27 10:31:24 +11:00
Campbell Barton
40ab214c0a Cleanup: spelling in comments 2024-03-27 10:25:31 +11:00
Jacques Lucke
4cdc62044e BLI: fix fixed-width-int to string conversion 2024-03-26 14:25:33 +01:00
Jacques Lucke
7314c86869 BLI: add fixed width integer type
This is intended to be used in the new exact mesh boolean algorithm by @howardt.
The new `BLI_fixed_width_int.hh` header provides types like `Int256` and
`UInt256` which are like e.g. `uint64_t` but with higher precision. The code
supports many different integer sizes.

The following operations are supported:
* Addition
* Subtraction
* Multiplication
* Comparisons
* Negation
* Conversion to and from other number types
* Conversion to and from string (based on `GMP`)

Division is not implemented. It could be implemented, but it's more complex and
is not required for the new mesh boolean algorithm.

Some alternatives to having a custom implementation have been discussed in
https://devtalk.blender.org/t/fixed-length-multiprecision-arithmetic/29189/.

Generally, the implementation is fairly straight forward. The main complexity is
the addition/multiplication algorithm which isn't too complicated. It's nice to
have control over this part as it allows us to optimize the code more if
necessary. Also, from what I understand, we might be able to benefit from some
special cases like multiplying a large integer with a smaller one.

I tried some different ways to optimize this already, but so far the normal
compiler optimization turned out to work best. Not sure if the same is true on
windows though, as it doesn't have native support for an `int128` which helps
the compiler understand what I'm doing. Alternatives I tried so far are using
intrinsics directly (mainly `_addcarry_u64` and similar), writing inline
assembly manually and copying the assembly output from the compiler. I assume
the assembly implementation didn't help for me because it prohibited other
compiler optimizations.

Pull Request: https://projects.blender.org/blender/blender/pulls/119528
2024-03-25 23:39:42 +01:00
Iliya Katueshenock
9123451427 Cleanup: BLI: Redundant dereference
Redundant dereference of array element.
This code is not used currently, but i noticed issue
while using this in my branch.

Pull Request: https://projects.blender.org/blender/blender/pulls/119842
2024-03-24 11:07:00 +01:00
Omar Emara
2906ea9785 BLI: Add nearest interpolation with clamped boundary
This patch adds clamped boundaries variants of the nearest interpolation
functions in the BLI module. The naming convention used by the bilinear
functions were followed.

Needed by #119414.

Pull Request: https://projects.blender.org/blender/blender/pulls/119732
2024-03-21 13:22:10 +01:00
Omar Emara
12d34fed91 BLI: Add step function to math library
This patch adds a step function that is equivalent to the GLSL step
function to the BLI math library.

Needed by #119414.

Pull Request: https://projects.blender.org/blender/blender/pulls/119731
2024-03-21 10:54:17 +01:00
Campbell Barton
57dd9c21d3 Cleanup: spelling in comments 2024-03-21 10:02:53 +11:00
Campbell Barton
116264c310 Cleanup: use full scentences for code-comments & minor corrections 2024-03-21 09:49:19 +11:00
Campbell Barton
fbe16bc1eb BLI_delete: assert that dir is true when recursive is true
While this isn't an error avoid ambiguity for recursive deletion
as it's not meaningful to delete a file.
2024-03-21 09:43:40 +11:00
Campbell Barton
4e3771124d Docs: BLI_delete parameters & behavior with symbolic-links 2024-03-21 09:35:40 +11:00
Jacques Lucke
b99c1abc3a BLI: speedup memory bandwidth bound tasks by reducing threading
This improves performance by **reducing** the amounts of threads used for tasks
which require a high memory bandwidth.

This works because the underlying hardware has a certain maximum memory
bandwidth. If that is used up by a few threads already, any additional threads
wanting to use a lot of memory will just cause more contention which actually
slows things down. By reducing the number of threads that can perform certain
tasks, the remaining threads are also not locked up doing work that they can't
do efficiently. It's best if there is enough scheduled work so that these tasks
can do more compute intensive tasks instead.

To use this new functionality, one has to put the parallel code in question into
a `threading::memory_bandwidth_bound_task(...)` block. Additionally, one also
has to provide a (very) rough approximation for how many bytes are accessed. If
the number is low, the number of threads shouldn't be reduced because it's
likely that all touched memory can be in L3 cache which generally has a much
higher bandwidth than main memory.

The exact number of threads that are allowed to do bandwidth bound tasks at the
same time is generally highly context and hardware dependent. It's also not
really possible to measure reliably because it depends on so many static and
dynamic factors. The thread count is now hardcoded to 8. It seems that this many
threads are easily capable of maxing out the bandwidth capacity.

With this technique I can measure surprisingly good performance improvements:
* Generating a 3000x3000 grid: 133ms -> 103ms.
* Generating a mesh line with 100'000'000 vertices: 212ms -> 189ms.
* Realize mesh instances resulting in ~27'000'000 vertices: 460ms -> 305ms.

In all of these cases, only 8 instead of 24 threads are used. The remaining
threads are idle in these cases, but they could do other work if available.

Pull Request: https://projects.blender.org/blender/blender/pulls/118939
2024-03-19 18:23:56 +01:00