There exist a bunch of "give me a (filtered) image pixel at this location"
functions, some with duplicated functionality, some with almost the same but
not quite, some that look similar but behave slightly differently, etc.
Some of them were in BLI, some were in ImBuf.
This commit tries to improve the situation by:
* Adding low level interpolation functions to `BLI_math_interp.hh`
- With documentation on their behavior,
- And with more unit tests.
* At `ImBuf` level, there are only convenience inline wrappers to the above BLI
functions (split off into a separate header `IMB_interp.hh`). However, since
these wrappers are inline, some things get a tiny bit faster as a side
effect. E.g. VSE image strip, scaling to 4K resolution (Windows/Ryzen5950X):
- Nearest filter: 2.33 -> 1.94ms
- Bilinear filter: 5.83 -> 5.69ms
- Subsampled3x3 filter: 28.6 -> 22.4ms
Details on the functions:
- All of them have `_byte` and `_fl` suffixes.
- They exist in 4-channel byte (uchar4) and float (float4), as well as
explicitly passed amount of channels for other float images.
- New functions in BLI `blender::math` namespace:
- `interpolate_nearest`
- `interpolate_bilinear`
- `interpolate_bilinear_wrap`. Note that unlike previous "wrap" function,
this one no longer requires the caller to do their own wrapping.
- `interpolate_cubic_bspline`. Previous similar function was called just
"bicubic" which could mean many different things.
- Same functions exist in `IMB_interp.hh`, they are just convenience that takes
ImBuf and uses data pointer, width, height from that.
Other bits:
- Renamed `mod_f_positive` to `floored_fmod` (better matches `safe_floored_modf`
and `floored_modulo` that exist elsewhere), made it branchless and added more
unit tests.
- `interpolate_bilinear_wrap_fl` no longer clamps result to 0..1 range. Instead,
moved the clamp to be outside of the call in `paint_image_proj.cc` and
`paint_utils.cc`. Though the need for clamping in there is also questionable.
Pull Request: https://projects.blender.org/blender/blender/pulls/117387
Fix#115983: Crash when saving empty Composite output
Blender crashes when the compositor tries to save en empty composite
output through the render pipeline. This happens when no Composite node
exist and no Render Layers node is used, essentially producing an empty
render result.
We fix this by only saving the composite output if a Composite node or a
Render Layers node exist.
Pull Request: https://projects.blender.org/blender/blender/pulls/117129
The term `PIL` stands for "platform independent library." It exists since the `Initial Revision`
commit from 2002. Nowadays, we generally just use the `BLI` (blenlib) prefix for such code
and the `PIL` prefix feels more confusing then useful. Therefore, this patch renames the
`PIL` to `BLI`.
Pull Request: https://projects.blender.org/blender/blender/pulls/117325
An assert that ensures the GPU compositor executes in a non main thread
wrongly fires in background mode, that's because in background mode,
rendering happens in the main thread. So add a condition for background
mode.
This allows code outside of the render pipeline to make proper
decisions about how the imbuf of render passes are to be handled.
For example, IMB_create_gpu_texture() will now properly select
single channel grayscale texture format for depth pass coming from
multilayer EXR, additionally solving assert in the GPU compositor
code which verifies expected and actual imbuf texture format.
Pull Request: https://projects.blender.org/blender/blender/pulls/117184
Set the number of planes based on the number of pass channels.
If the pass contains 2 passes or more than 4 passes set the number
of planes to the previously used value of 32.
This is needed because quite some areas check for the number of
planes for various optimizations. For example, this is one of the
factors which make IMB_create_gpu_texture() to choose the texture
format. If the number of planes for the depth pass is set to the
previously used this function will never consider using single
channel GPU texture.
Unfortunately, this change is not enough to make the GPU texture
to use single channel format as the color space of the image
buffer is also checked, and that is nullptr which means scene linear.
Part of overall "improve filtering situation" (#116980) task:
* Add Bicubic filtering option to strip Transform "Filter" setting.
Previously this option only existed in Transform Effect "Interpolation"
setting.
- With this addition, it feels like the transform effect could
possibly be marked as legacy/deprecated, since the regular Transform
that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
image paint, etc.) by slightly simplifying the code and using some SIMD.
Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
- Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
- Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.
Pull Request: https://projects.blender.org/blender/blender/pulls/117100
The previous commit introduced a new `RPT_()` macro to translate
strings which are not tooltips or regular interface elements, but
longer reports or statuses.
This commit uses the new macro to translate many strings all over the
UI.
Most of it is a simple replace from `TIP_()` or `IFACE_()` to
`RPT_()`, but there are some additional changes:
- A few translations inside `BKE_report()` are removed altogether
because they are already handled by the translation system.
- Messages inside `UI_but_disable()` are no longer translated
manually, but they are handled by a new regex in the translation
system.
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
Pull Request: https://projects.blender.org/blender/blender/pulls/116804
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
Some common headers were including this. Separating the includes
will ideally lead to better conceptual separation between CustomData
and the attribute API too. Mostly the change is adding the file to
places where it was included indirectly before. But some code is
shuffled around to hopefully better places as well.
Except for vertex groups and a few older color types, these
are generally replaced by newer generic attribute types.
Also remove some includes of DNA_mesh_types.h, since it's
included indirectly by BKE_mesh.hh currently.
Each value is now out of the global namespace, so they can be shorter
and easier to read. Most of this commit just adds the necessary casting
and namespace specification. `enum class` can be forward declared since
it has a specified size. We will make use of that in the next commit.
Use the standard "elements_num" naming, and use the "corner" name rather
than the old "loop" name: `verts_num`, `edges_num`, and `corners_num`.
This matches the existing `faces_num` field which was already renamed.
Pull Request: https://projects.blender.org/blender/blender/pulls/116350
Make the naming consistent with the recent change from "loop" to
"corner". Avoid the need for a special type for these triangles by
conveying the semantics in the naming instead.
- `looptris` -> `corner_tris`
- `lt` -> `tri` (or `corner_tri` when there is less context)
- `looptri_index` -> `tri_index` (or `corner_tri_index`)
- `lt->tri[0]` -> `tri[0]`
- `Span<MLoopTri>` -> `Span<int3>`
- `looptri_faces` -> `tri_faces` (or `corner_tri_faces`)
If we followed the naming pattern of "corner_verts" and "edge_verts"
exactly, we'd probably use "tri_corners" instead. But that sounds much
worse and less intuitive to me.
I've found that by using standard vector types for this sort of data,
the commonalities with other areas become much clearer, and code ends
up being naturally more data oriented. Besides that, the consistency
is nice, and we get to mostly remove use of `DNA_meshdata_types.h`.
Pull Request: https://projects.blender.org/blender/blender/pulls/116238
The term `looptri` was used ambiguously for both single & arrays.
The term `tri` was also used, causing `tri->tri`.
Use terms:
- `looptris` for an array or when dealing with multiple items.
- `looptri` is used when dealing with a single item.
- `lt` for a single MLoopTri variables & arguments.
This was already a convention but not followed closely.
This patches refactors the compositor File Output mechanism and
implements the file output node for the Realtime Compositor. The
refactor was done for the following reasons:
1. The existing file output mechanism relied on a global EXR image
resource where the result of each compositor execution for each
view was accumulated and stored in the global resource, until the
last view is executed, when the EXR is finally saved. Aside from
relying on global resources, this can cause effective memory leaks
since the compositor can be interrupted before the EXR is written and
closed.
2. We need common code to share between all compositors since we now
have multiple compositor implementations.
3. We needed to take the opportunity to fix some of the issues with the
existing implementation, like lossy compression of data passes,
and inability to save single values passes.
The refactor first introduced a new structure called the Compositor
Render Context. This context stores compositor information related to
the render pipeline and is persistent across all compositor executions
of all views. Its extended lifetime relative to a single compositor
execution lends itself well to store data that is accumulated across
views. The context currently has a map of File Output objects. Those
objects wrap a Render Result structure and can be used to construct
multi-view images which can then be saved after all views are executed
using the existing BKE_image_render_write function.
Minor adjustments were made to the BKE and RE modules to allow saving
using the BKE_image_render_write function. Namely, the function now
allows the use of a source image format for saving as well as the
ability to not save the render result as a render by introducing two new
default arguments. Further, for multi-layer EXR saving, the existent of
a single unnamed render layer will omit the layer name from the EXR
channel full name, and only the pass, view, and channel ID will remain.
Finally, the Render Result to Image Buffer conversion now take he number
of channels into account, instead of always assuming color channels.
The patch implements the File Output node in the Realtime Compositor
using the aforementioned mechanisms, replaces the implementation of the
CPU compositor using the same Realtime Compositor implementation, and
setup the necessary logic in the render pipeline code.
Pull Request: https://projects.blender.org/blender/blender/pulls/113982
The Realtime compositor currently relies on the GPU cache in image IDs.
That cache only supports single layer images, so multi-layer images will
be acquired without a cache, introducing significant IO bottlenecks for
the GPU compositor.
This patch ignores the image GPU cache and stores the images in the
static cache manager of the compositor. Draw data was introduced to the
image ID for proper cache invalidation, like other IDs such as masks.
The downside is that the cache will no longer be shared between EEVEE
and the compositor. But realistically, images are not typically shared
between materials and compositors.
This is just a temporary solution until we have proper GPU storage
support for image buffers.
Pull Request: https://projects.blender.org/blender/blender/pulls/115511
This gives better asserts in debug builds through use of Span, more
safety when name convention attributes happen to have different types
or domains, and simpler code in some cases. But the main reasoning is to
avoid relying on the specifics of CustomData more to allow us to replace
it in the future.
"mesh" reads much better than "me" since "me" is a different word.
There's no reason to avoid using two more characters here. Replacing
all of these at once is better than encountering it repeatedly and
doing the same change bit by bit.
Due to changes in the build environment shader_builder wasn't able to
compile on macOs. This patch reverts several recent changes to CMake files.
* dbb2844ed9
* 94817f64b9
* 1b6cd937ff
The idea is that in the near future shader_builder will run on the buildbot as
part of any regular build to ensure that changes to the CMake doesn't break
shader_builder and we only detect it after a few days.
Pull Request: https://projects.blender.org/blender/blender/pulls/115929
The experimental GPU compositor crashes when the render size is huge.
This is just due to GPU texture allocation failing. The patch fixes that
by downscaling the render result when reading, then upscaling it again
when writing. Additionally, the render size was adapted to the
downscaled size since it is used by other input nodes. This is not an
ideal solution, but it a good temporary solution to prevent crashes
until we have proper support for huge textures.
Pull Request: https://projects.blender.org/blender/blender/pulls/115299
Implement the next phases of bounds improvement design #96968.
Mainly the following changes:
Don't use `Object.runtime.bb` for performance caching volume bounds.
This is redundant with the cache in most geometry data-block types.
Instead, this becomes `Object.runtime.bounds_eval`, and is only used
where it's actually needed: syncing the bounds from the evaluated
geometry in the active depsgraph to the original object.
Remove all redundant functions to access geometry bounds with an
Object argument. These make the whole design confusing, since they
access geometry bounds at an object level.
Use `std::optional<Bounds<float3>>` to pass and store bounds instead
of an allocated `BoundBox` struct. This uses less space, avoids
small heap allocations, and generally simplifies code, since we
usually only want the min and max anyway.
After this, to avoid performance regressions, we should also cache
bounds in volumes, and maybe the legacy curve and GP data types
(though it might not be worth the effort for those legacy types).
Pull Request: https://projects.blender.org/blender/blender/pulls/114933
This patch adds support for full precision compositing for the Realtime
Compositor. A new precision option was added to the compositor to change
between half and full precision compositing, where the Auto option uses
half for the viewport compositor and the interactive render compositor,
while full is used for final renders.
The compositor context now need to implement the get_precision() method
to indicate its preferred precision. Intermediate results will be stored
using the context's precision, with a number of exceptions that can use
a different precision regardless of the context's precision. For
instance, summed area tables are always stored in full float results
even if the context specified half float. Conversely, jump flooding
tables are always stored in half integer results even if the context
specified full. The former requires full float while the latter has no
use for it.
Since shaders are created for a specific precision, we need two variants
of each compositor shader to account for the context's possible
precision. However, to avoid doubling the shader info count and reduce
boilerplate code and development time, an automated mechanism was
employed. A single shader info of whatever precision needs to be added,
then, at runtime, the shader info can be adjusted to change the
precision of the outputs. That shader variant is then cached in the
static cache manager for future processing-free shader retrieval.
Therefore, the shader manager was removed in favor of a cached shader
container in the static cache manager.
A number of utilities were added to make the creation of results as well as
the retrieval of shader with the target precision easier. Further, a
number of precision-specific shaders were removed in favor of more
generic ones that utilizes the aforementioned shader retrieval
mechanism.
Pull Request: https://projects.blender.org/blender/blender/pulls/113476