Preparation for the the 3.6 library update landing.
The filenames for these libs will change a little bit and 3.6 will add
new library for the fp32 version of fftw.
LibOverride resync process would not properly 'make local' newly created
overrides (copied from their linked reference). The main consequence was
that some linked data used by these new liboverrides would not be
properly tagged as directly linked.
While this would not be an issue in current codebase, this was breaking
upcomming readfile undo refactor, since some linked data did not get
their local 'placeholder reference' data written in memfile blendfiles.
NOTE: this is more of a quick fix, whole handling of library data in
complex copying cases like that need some more general work, see #107848
and #107847.
Initial graphic pipeline targeting. The goal of this PR is to have an initial
graphics pipeline with missing features. It should help identifying
areas that requires engineering.
Current state is that developers of the GPU module can help with the many
smaller pieces that needs to be engineered in order to get it working. It is not
intended for users or developers from other modules, but your welcome to learn
and give feedback on the code and engineering part.
We do expect that large parts of the code still needs to be re-engineered into
a more future-proof implementation.
**Some highlights**:
- In Vulkan the state is kept in the pipeline. Therefore the state is tracked
per pipeline. In the near future this could be used as a cache. More research
is needed against the default pipeline cache that vulkan already provides.
- This PR is based on the work that Kazashi Yoshioka already did. And include
work from him in the next areas
- Vertex attributes
- Vertex data conversions
- Pipeline state
- Immediate support working.
- This PR modifies the VKCommandBuffer to keep track of the framebuffer and its
binding state(render pass). Some Vulkan commands require no render pass to be
active, other require a render pass. As the order of our commands on API level
can not be separated this PR introduces a state engine to keep track of the
current state and desired state. This is a temporary solution, the final
solution will be proposed when we have a pixel on the screen. At that time
I expect that we can design a command encoder that supports all the cases
we need.
**Notices**:
- This branch works on NVIDIA GPUs and has been validated on a Linux system. AMD
is known not to work (stalls) and Intel GPUs have not been tested at all. Windows might work
but hasn't been validated yet.
- The graphics pipeline is implemented with pixels in mind, not with performance. Currently
when a draw call is scheduled it is flushed and waited until it is finished drawing, before
other draw calls can be scheduled. We expected the performance to be worse that it actually
is, but we expect huge performance gains in the future.
- Any advanced drawing (that is used by the image editor, compositor or 3d viewport) isn't
implemented and might crash when used.
- Using multiple windows or resizing of window isn't supported and will stall the system.
Pull Request: https://projects.blender.org/blender/blender/pulls/106224
Due to cumulative floating point errors the quaternions produced by
`mat3_normalized_to_quat_fast()` could be non-unit length. This is now
detected, and quaternions are normalised when necessary.
This is a slight roll-back of 756538b4a1,
which removed the normalisation altogether, but did introduce this issue.
Vulkan doesn't have a conversion from uint32_t/int32_t to float. It does
have conversions from 16/8 bits. Main reason is that Vulkan expects that
there is no benefit when converting 32 bits from one type to the other
and should be solved by passing the right data type.
In Blender however this isn't the case as there are benefits on other
GPU backends (OpenGL for example).
This PR adds helper function to check if conversion is needed and
perform any conversions in place. It also implements the function to
upload vertex buffers to the GPU.
NOTE: Test cases have been added to validate this, but they are not
able to run on the Vulkan backend just yet, because they require the
graphics pipeline to be available.
Pull Request: https://projects.blender.org/blender/blender/pulls/107733
This PR adds several tests to GTest for testing blend states
and immediate mode. The immediate mode test are basic and will
be extended in the near future.
The tests have been developed in order to get the first pixel on
screen for the Vulkan backend. In the first phase of this goal we
had to validate that everything was working as expected without
being able to validate it by starting Blender.
Pull Request: https://projects.blender.org/blender/blender/pulls/107834
ShaderInput lookup used to be explicit as in we know for sure that the
shader input exists. This was guarded by a assert. During development
of the graphics pipeline we came to the conclusion that this isn't
always the case due to its late bindings of resources.
This PR Makes the shader input lookup return an optional result.
This result can then be checked depending on the area where it is used
to validate the existence.
Pull Request: https://projects.blender.org/blender/blender/pulls/107735
This patch implements the Plane Track Deform node for the realtime
compositor. The implementation is mostly similar to the Corner Pin node,
but it is generalized to multiple homography matrices to take motion
blur into account.
Pull Request: https://projects.blender.org/blender/blender/pulls/107811
Adds a generic `is_larger` and `get_aspect_scaled_extent` function to
simplify packing logic.
Migrate several packing functions so that they only improve a layout,
and early-exit packing if a better layout already exists.
Fix a `rotate_inside_square` logic error during xatlas packing.
BLI_strncpy_utf8 didn't check for null bytes within bytes stepped
over by the variable length UTF8 encoding.
While a valid UTF8 string wont include these, it's possible Latin1
encoding or a truncated string includes such characters.
In this case, the entire string is copied as it's not the purpose of
this function to correct or strip invalid/truncated encoding,
only to prevent it from happening in the first place.
Use the offset indices pattern to avoid keeping separate arrays for the
offset of a vertex's neighbors and the number of neighbors. This gave
a 9% speedup for the conversion, from 42.9 ms to 39.3 ms.
In vertex extrusion mode, when there are no edge attributes we can avoid
creating a topology map meant for mixing old values. This makes
simulation caching for a curl noise demo file from user Higgsas 10%
faster (from 44s to 40s to calculate 230 frames).
Adds a `remap_pairing` function for node group operators that ensures
the simulation input nodes' `output_node_id` matches the new node are
creating a group, ungrouping a node group, or separating from a group.
Also fixes a crash in the "Group Separate" operator when group
input/output nodes are included in the selection.
Pull Request: https://projects.blender.org/blender/blender/pulls/107807
Issue was that, when UI-related code _is_ requested in foreach_id
processing, `ID_SCR` screen ID type can actually use any kind of ID
(through e.g. the Outliner space).
So `BKE_library_id_can_use_filter_id` had to be updated with a new
parameter (whether UI-related data should be taken into account or not).
The root of the issue was that while reading a new blendfile, the
current `G_MAIN` is still the old one, not the one in which new data is
being read.
Since the remapping callback taking care of UI data during ID remapping
is using `G_MAIN`, it is processing the wrong data, so when deleting the
invalid shapekey (from `BLO_main_validate_shapekeys`), the UI data of
the new bmain would not be properly remapped, causing invalid memory
access later when recomputing user counts (calls to
`BKE_main_id_refcount_recompute`).
This is fixed by adding a new `BKE_id_delete_ex` function that takes
extra remapping options parameter, and calling it from
`BLO_main_validate_shapekeys` with extra option to enforce handling of
UI data by remapping code.
NOTE: At some point we have to check if that whole UI-callback thing is
still needed, would be good to get rid of it and systematically process
UI-related ID pointers like any others. Current situation is... fragile
to say the least.
For realtime use cases, storing the geometry's state in memory at every
frame can be prohibitively expensive. This commit adds an option to
disable the caching, stored per object and accessible in the baking
panel. The default is still to enable caching.
Pull Request: https://projects.blender.org/blender/blender/pulls/107767
Caused by 2fb31f34af.
Since above commit, we are now calling `BKE_mesh_orco_verts_transform`
not only from `psys_face_mat` and `psys_mat_hair_to_orco`, but now also
from `psys_particle_on_dm` > `psys_interpolate_face`, so we get double
transforms with mirror.
Remove the "extra" call in `psys_mat_hair_to_orco`.
Should be backported to 3.3 LTS as well.
Pull Request: https://projects.blender.org/blender/blender/pulls/107804
Avoid storing redundant and unnecessary data in temporary arrays during
face corner normal calculation with custom normals. Previously we used
96 bytes per normal space (`MLoopNorSpace` (64), `MLoopNorSpace *` (8),
`LinkData` (24)), now we use 36 (`CornerNormalSpace` (32), `int` (4)).
This is achieved with a few changes:
- Avoid sharing the data storage with the BMesh implementation
- Use indices to refer to normal fan spaces rather than pointers
- Only calculate indices of all corners in each fan when necessary
- Don't duplicate automatic normal in space storage
- Avoid storing redundant flags in space struct
Reducing memory usage gives a significant performance improvement in
my test files, which is consistent with findings from previous commits
(see 9fcfba4aae). In my test, the time used to calculate
normals for a character model with 196 thousand faces reduced from
20.2 ms to 14.0 ms, a 44% improvement.
Edit mode isn't affected by this change.
A note about the `reverse_index_array` function added here: this is the
same as the mesh mapping functions (for example for mapping from
vertices to face corners). I'm planning on working this area and hoping
to generalize/reuse the implementation in the near future.
Pull Request: https://projects.blender.org/blender/blender/pulls/107592
The new theme setting for the dope sheet (timeline) allows changing the
color of the bar that shows which frames are cached/baked. The
invalid/cached/baked status is differentiated by hardcoded transparency
values. In theory, those could be separate theme settings though.
Pull Request: https://projects.blender.org/blender/blender/pulls/107738
Operations such as assigning or removing weights were simply not
undoable (no previous weights were applied back to the EditLatt). Now
add MDeformVert data to UndoLattice and handle with
appropriate functions during Undo.
Pull Request: https://projects.blender.org/blender/blender/pulls/107776
Blender would fail to link to USD before MaterialX files were
copied to the install targets lib/ directory.
Resolve by including ${LIBDIR}/materialx/lib in link_directories.
looptris were referred to as both tris & faces, sometimes polygons
were referred to as faces too. Was especially error prone with
callbacks that took both a tri and a tri_i arguments.
Sometimes tri_i represented a looptri index, other times the corner of
the triangle from 0-2. Causing expressions such as:
`args->mlooptri[tri].tri[tri_i]`
- Rename tri & tri_index -> looptri_i.
- Rename faces -> looptris.
- Rename face_index/poly_index/poly -> poly_i.
- Declare looptri_i at the start of the loop and reuse it,
in some cases it was declared with args->prim_indices[i] being
used as well.
Change PBVH normal calculation to also update vertices connected to
vertices with an update tag. This is necessary because vertex normals
are the mixed face normals, so changing any face's normal will change
the normals of all connected faces.
This change requires that the PBVH always have a vertex to face
topology map available. In the future this will likely be cached on
meshes though, which will reduce the delay it adds when entering sculpt
mode.
Now, first all face normals are updated, then the normals for
connected vertices are mixed from the face normals. This is a
significant simplification to the whole process, which previously
worked with atomics and normals at the triangle level. Similar changes
changes for regular non-sculpt normal calculation are being worked on
in #105920.
Pull Request: https://projects.blender.org/blender/blender/pulls/107458
In some cases strips may end up with speed factor of 0 which causes
offsets and position to be invalid. The exact cause is unknown, but
most likely caused by `do_versions_sequencer_init_retiming_tool_data()`.
This could possibly happen if 3.6 file is saved with 3.5 version and
then opened again with 3.6 version.
To fix strips, retiming data is removed, start offset reset and speed
factor is set to 1. Previous versioning code is fixed, so speed factor
is never set to 0.
Pull Request: https://projects.blender.org/blender/blender/pulls/107798
Avoid instantiating the templates separately in every translation unit.
This saves 20 KB in my Blender binary. Also remove a timer mistakenly
committed.
This saves about 6 ms every update when in edit mode on a 1 million
face grid. For reference, the BMesh to Mesh conversion took 80 ms,
before and after the change.
This was added temporarily during development.
Loading files created in the geometry-nodes-simulation branch (3.5.4
and older) will remove links from simulation zones, which need to be
added back manually.
Pull Request: https://projects.blender.org/blender/blender/pulls/107781
Similare to the Metal backend, Vulkan keeps data of the full texture
around as they will be executed by the same submission. So there are
no benefits to splice a texture into smaller parts, but adds overhead
as more commands are required to be processed.
Pull Request: https://projects.blender.org/blender/blender/pulls/107728