A float array use different alignment and reservation rules on
GPU versus CPU. Before this change the _pad1 was aligned to 16 bytes as
it was defined as an array, but on CPU this alignment doesn't
happen.
This is fixed by using 3 floats in stead of a float array for padding.
Pull Request: https://projects.blender.org/blender/blender/pulls/112565
* Fix#112284 and other non-reported sculpt-related regressions in the
new Workbench.
* Cleanup ObjectState setup.
* Update `sculpt_batches_get` to support getting per material batches
while passing SculptBatchFeatures.
* Make material indices 0 based in Workbench.
Pull Request: https://projects.blender.org/blender/blender/pulls/112344
The hash tables and vector blenlib headers were pulling many more
headers than they actually need, including the C base math header,
our C string API header, and the StringRef header. All of this
potentially slows down compilation and polutes autocomplete
with unrelated information.
Also remove the `ListBase` constructor for `Vector`. It wasn't used
much, and making it easy to use `ListBase` isn't worth it for the
same reasons mentioned above.
It turns out a lot of files depended on indirect includes of
`BLI_string.h` and `BLI_listbase.h`, so those are fixed here.
Pull Request: https://projects.blender.org/blender/blender/pulls/111801
Add three cached topology maps to `Mesh`, to avoid computations when
mesh data isn't changed. Choosing the right maps to cache is a bit
arbitrary, but generally we have to start somewhere. The limiting
factor is memory usage (all the new caches combined have a
comparable footprint to a UV map).
For now, the caches added are:
- Vertex to face corner
- Vertex to face
- Face corner to face
These caches are used in quite a few places already;
- Face corner normal calculation
- UV value merging
- Setting sharp edges from face angles
- Data transfer modifier
- Voxel remesh attribute remapping
- Sculpt mode painting
- Sculpt mode normal calculation
- Vertex paint mode
- Split edges geometry node
- Mesh topology geometry nodes
Caching topology maps means they don't have to be rebuilt every time
they're used. Meshes copied but without topology changes can share
the cache, further reducing re-computations. For example, FPS with a
large mesh using the "Corners of Vertex" node went from 1.8 to 2.3.
Entering sculpt mode is slightly faster too.
There is some obvious work for future commits:
- Use caches in attribute domain interpolation
- More multithreading of second phase of map building
- Update/build caches eagerly in some geometry nodes
Pull Request: https://projects.blender.org/blender/blender/pulls/107816
The `EdgeHash` and `EdgeSet` data structures are designed specifically
as a hash of an order agnostic pair of integers. This specialization can
be achieved much more easily with the templated C++ data structures,
which gives improved performance, readability, and type safety.
This PR removes the older data structures and replaces their use with
`Map`, `Set`, or `VectorSet` depending on the situation. The changes
are mostly straightforward, but there are a few places where the old
API made the goals of the code confusing.
The last time these removed data structures were significantly changed,
they were already moving closer to the implementation of the newer
C++ data structures (aa63a87d37).
Pull Request: https://projects.blender.org/blender/blender/pulls/111391
Adds a userpref toggle for the edit mode overlays fresnel.
The edit mode fresnel is only a bit useful in edge cases, like
very dense photogrametry, and the problem is that it causes
more eye strain when modeling for many hours. And it's
benefit on shape readability is small compared to it's negative
impact on selection visibility. It makes the selection color to a
darker less saturated color instead of the theme color, which
leads to worse contrast between the selection and the mesh
or with the background, and also makes the unselected (black)
brighter, also reducing contrast. So it's off by default.
This was split up from https://projects.blender.org/blender/blender/pulls/110097
Pull Request: https://projects.blender.org/blender/blender/pulls/111494
Enables three options of wireframe color for all shading modes: theme color, object color
and random color. Previously this was exclusive to the wireframe shading mode.
Pull Request: https://projects.blender.org/blender/blender/pulls/111502
The commit f10965dcb8 introduced a regression in the way how the VBOs
are filled by assuming the requested VBO attribute type has the same
alignment as the CPU and is the same on different platforms.
Unfortunately, this turned out to not be the case.
Switch the mask attribute to be float on the GPU, which has a downside
of increased bandwidth to be transferred, but a benefit of less compute
power needed to update the VBO.
The fix is suggested by Clement.
Pull Request: https://projects.blender.org/blender/blender/pulls/111521
When GLSL sources were first included in Blender they were treated as
data (like blend files) and had no license header.
Since then GLSL has been used for more sophisticated features
(EEVEE & real-time compositing)
where it makes sense to include licensing information.
Add SPDX copyright headers to *.glsl files, matching headers used for
C/C++, also include GLSL files in the license checking script.
As leading C-comments are now stripped,
added binary size of comments is no longer a concern.
Ref !111247
This allows adding debug calls to library
sources and not trigger an error for
all shaders that reference the lib.
The particular shader that need to use
it can set `drw_debug_print_enable` and
`drw_debug_draw_enable` in their main file
and it will trigger the injection of the
debug functions.
The issue was that there was a change in the shader that expected
a specific flag to be set. The current grease pencil code did not
set these flags.
This fix separates the old and new shader code so that
the old way of rendering the strokes remains untouched.
Pull Request: https://projects.blender.org/blender/blender/pulls/111227
Split shadow rendering per LOD per tilemap and improve
fragment shader invocation rate by using multi-viewport.
Also changes the layout of the atlas to be 4 x 4 x Layers.
This allow to grow the atlas while keeping the content
and page indirection correct, but this isn't implemented
in this patch.
# First attempt
Shadow rendering using atomic proved to be less than ideal
and performance were not quite to an acceptable level.
The previous method had issue with atomic contention when
a lot of triangle would overlap and too many fragment shader
invocations with quite complex indirection rules and biases
which made the technique costly.
The new implementation leverage multi viewport and
layered rendeing to effectively replace the need for atomic
and render directly to the shadow atlas. Using the well
supported extension these are free on modern hardware and
do not need a geometry shader.
One view per tile is needed since we use the viewport index
and the layer index as a way to index a specific tile in the
array.
# Geometric Complexity Problem
The counterpart of this is that we need to draw one geometry
instance per tile which is 32x32 time more instances (at most)
than with the previous method.
This means that we will have to find a way to mitigate this
geometry cost by either reducing the number of tiles per
tilemaps (in other words, making the system less memory efficient)
or splitting complex objects' geometry into smaller, more
cull friendly chunks (for example, like the sculpt PBVH nodes).
The later seems to be a longer term solution as it requires
way too much engineering time we have right now.
# Update Lag Problem
This also mean we can only update up to 64 tile per redraw
which is not enough even in the most basic cases. This leads
to missing or over shadowing when a light updates until there
is no updates and the shadow rendering can catch up.
One possible solution is to update a lower LODs first waiting
until there is no update to render. This would allow no artifact
during the transforms (unless there is too many light updates
even for lowest LOD, but that was an issue also for the
previous implementation). This could also help with the
geometric complexity.
# Solution
In the end, we decided to have one view per lod. This limits
the complexity of the fragment shader (improve speed),
reduces the number of views per tilemap (fix update lag),
and reduces the number of instances.
This also mean we cannot render directly to the atlas anymore
and reverted to the atomic solution. Using the smallest
possible viewport, we assure that there isn't that much fragment
shader invocations which was one of the bottleneck. And also
reduces the amount of geometry instances that pass the
clipping test.
Pull Request: https://projects.blender.org/blender/blender/pulls/110979
Include counts of some headers while making full blender build:
- BLI_color.hh 1771 -> 1718
- BLI_math_color.h 1828 -> 1783
- BLI_math_vector.hh 496 -> 405
- BLI_index_mask.hh 1341 -> 1267
- BLI_task.hh 958 -> 903
- BLI_generic_virtual_array.hh 509 -> 435
- IMB_colormanagement.h 437 -> 130
- GPU_texture.h 806 -> 780
- FN_multi_function.hh 331 -> 257
Note: DNA_node_tree_interface_types.h needs color include only
for the currently unused (but soon to be used) socket_color function.
Future step is to figure out how to include
DNA_node_tree_interface_types.h less.
Pull Request: #111113
Including <iostream> or similar headers is quite expensive, since it
also pulls in things like <locale> and so on. In many BLI headers,
iostreams are only used to implement some sort of "debug print",
or an operator<< for ostream.
Change some of the commonly used places to instead include <iosfwd>,
which is the standard way of forward-declaring iostreams related
classes, and move the actual debug-print / operator<< implementations
into .cc files.
This is not done for templated classes though (it would be possible
to provide explicit operator<< instantiations somewhere in the
source file, but that would lead to hard-to-figure-out linker error
whenever someone would add a different template type). There, where
possible, I changed from full <iostream> include to only the needed
<ostream> part.
For Span<T>, I just removed print_as_lines since it's not used by
anything. It could be moved into a .cc file using a similar approach
as above if needed.
Doing full blender build changes include counts this way:
- <iostream> 1986 -> 978
- <sstream> 2880 -> 925
It does not affect the total build time much though, mostly because
towards the end of it there's just several CPU cores finishing
compiling OpenVDB related source files.
Pull Request: https://projects.blender.org/blender/blender/pulls/111046
This makes it so the `GreasePencil` geometry gets updated on a time
change.
The frame at which the object gets evaluated is stored in runtime as
`eval_frame`. This is for example used to calculate the bounding box
of the geometry as well as invalidating the batch cache for different
frames.
Pull Request: https://projects.blender.org/blender/blender/pulls/111137
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
Using ClangBuildAnalyzer on the whole Blender build, it was pointing
out that BLI_math.h is the heaviest "header hub" (i.e. non tiny file
that is included a lot).
However, there's very little (actually zero) source files in Blender
that need "all the math" (base, colors, vectors, matrices,
quaternions, intersection, interpolation, statistics, solvers and
time). A common use case is source files needing just vectors, or
just vectors & matrices, or just colors etc. Actually, 181 files
were including the whole math thing without needing it at all.
This change removes BLI_math.h completely, and instead in all the
places that need it, includes BLI_math_vector.h or BLI_math_color.h
and so on.
Change from that:
- BLI_math_color.h was included 1399 times -> now 408 (took 114.0sec
to parse -> now 36.3sec)
- BLI_simd.h 1403 -> 418 (109.7sec -> 34.9sec).
Full rebuild of Blender (Apple M1, Xcode, RelWithDebInfo) is not
affected much (342sec -> 334sec). Most of benefit would be when
someone's changing BLI_simd.h or BLI_math_color.h or similar files,
that now there's 3x fewer files result in a recompile.
Pull Request #110944
Add sculpt support to EEVEE Next.
It creates a new resource handle for sculpt object, since
BKE_object_boundbox_get returns a wrong (zeroed) bounding box.
This also adds a new `resource_handle` function to the `draw::Manager`
that works like the default one, but lets the caller optionally override the
object matrix and/or bounds.
Pull Request: https://projects.blender.org/blender/blender/pulls/110703
This is a full rewrite of the raytracing denoise pipeline. It uses the
same principle as before but now uses compute shaders for every stages
and a tile base approach. More aggressive filtering is needed since we
are moving towards having no prefiltered screen radiance buffer. Thus
we introduce a temporal denoise and a bilateral denoise stage to the
denoising. These are optionnal and can be disabled.
Note that this patch does not include any tracing part and only samples
the reflection probes. It is focused on denoising only. Tracing will
come in another PR.
The motivation for this is that having hardware raytracing support
means we can't prefilter the radiance in screen space so we have to
have better denoising. Also this means we can have better surface
appearance with support for other BxDF model than GGX. Also GGX support
is improved.
Technically, the new denoising fixes some implementation mistake the
old pipeline did. It separates all 3 stages (spatial, temporal,
bilateral) and use random sampling for all stages hoping to create
a noisy enough (but still stable) output so that the TAA soaks the
remaining noise. However that's not always the case. Depending on the
nature of the scene, the input can be very high frequency and might
create lots of flickering. That why another solution needs to be found
for the higher roughness material as denoising them becomes expensive
and low quality.
Pull Request: https://projects.blender.org/blender/blender/pulls/110117
Adds support for generating curve primtiives avoiding the
use of primtiive restarts. This maixmises geometry performance
when using Metal.
Also ensure that the existing index buffer optimization path is
skipped for indirect draw calls where counts are not known at
submission time.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/109972