/* SPDX-FileCopyrightText: 2023 Blender Authors
 *
 * SPDX-License-Identifier: GPL-2.0-or-later */

#pragma once

#include "COM_bokeh_kernel.hh"
#include "COM_cached_image.hh"
#include "COM_cached_mask.hh"
Realtime Compositor: Support full precision compositing
This patch adds support for full precision compositing for the Realtime
Compositor. A new precision option was added to the compositor to change
between half and full precision compositing, where the Auto option uses
half for the viewport compositor and the interactive render compositor,
while full is used for final renders.
The compositor context now needs to implement the get_precision() method
to indicate its preferred precision. Intermediate results will be stored
using the context's precision, with a number of exceptions that can use
a different precision regardless of the context's precision. For
instance, summed area tables are always stored in full float results
even if the context specified half float. Conversely, jump flooding
tables are always stored in half integer results even if the context
specified full. The former requires full float while the latter has no
use for it.
Since shaders are created for a specific precision, we need two variants
of each compositor shader to account for the context's possible
precision. However, to avoid doubling the shader info count and reduce
boilerplate code and development time, an automated mechanism was
employed. A single shader info of whatever precision needs to be added,
then, at runtime, the shader info can be adjusted to change the
precision of the outputs. That shader variant is then cached in the
static cache manager for future processing-free shader retrieval.
Therefore, the shader manager was removed in favor of a cached shader
container in the static cache manager.
A number of utilities were added to make the creation of results as well as
the retrieval of shaders with the target precision easier. Further, a
number of precision-specific shaders were removed in favor of more
generic ones that utilize the aforementioned shader retrieval
mechanism.
Pull Request: https://projects.blender.org/blender/blender/pulls/113476
2023-11-08 08:32:00 +01:00
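The shader variant caching described in the commit message above can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `Precision` enum, `ShaderVariant`, and `CachedShaderSketch` are not the actual Blender types, and creating a variant is reduced to constructing a small struct rather than adjusting a real shader info.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

/* Hypothetical stand-in for the context's precision option. */
enum class Precision { Half, Full };

/* Hypothetical stand-in for a compiled shader variant. */
struct ShaderVariant {
  std::string info_name;
  Precision precision;
};

/* Caches one variant per (shader info name, precision) pair, so that the adjustment of the shader
 * info only happens the first time a given variant is requested. */
class CachedShaderSketch {
 private:
  std::map<std::pair<std::string, Precision>, std::unique_ptr<ShaderVariant>> variants_;
  int creation_count_ = 0;

 public:
  const ShaderVariant &get(const std::string &info_name, Precision precision)
  {
    const auto key = std::make_pair(info_name, precision);
    auto iterator = variants_.find(key);
    if (iterator == variants_.end()) {
      /* A real implementation would adjust the output precision of the shader info and compile
       * it here. This sketch only records that a new variant was created. */
      creation_count_++;
      iterator = variants_
                     .emplace(key,
                              std::make_unique<ShaderVariant>(ShaderVariant{info_name, precision}))
                     .first;
    }
    return *iterator->second;
  }

  int creation_count() const
  {
    return creation_count_;
  }
};
```

Requesting the same name and precision twice returns the same cached variant, while requesting the other precision creates a second one.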
#include "COM_cached_shader.hh"
#include "COM_cached_texture.hh"
Realtime Compositor: Implement Fast Gaussian blur
This patch implements the Fast Gaussian blur mode for the Realtime
Compositor. This is a faster but less accurate implementation of
Gaussian blur.
This is implemented as a recursive Gaussian blur algorithm based on the
general method outlined in the following paper:
Hale, Dave. "Recursive gaussian filters." CWP-546 (2006).
In particular, based on the table in Section 5 Conclusion, for very low
radius blur, we use a direct separable Gaussian convolution. For medium
blur radius, we use the fourth order IIR Deriche filter based on the
following paper:
Deriche, Rachid. Recursively implementating the Gaussian and its
derivatives. Diss. INRIA, 1993.
For high radius blur, we use the fourth order IIR Van Vliet filter based
on the following paper:
Van Vliet, Lucas J., Ian T. Young, and Piet W. Verbeek. "Recursive
Gaussian derivative filters." Proceedings. Fourteenth International
Conference on Pattern Recognition (Cat. No. 98EX170). Vol. 1. IEEE,
1998.
That's because direct convolution is faster and more accurate for very
low radius, the Deriche filter is more accurate for medium blur radius,
and Van Vliet is more accurate for high blur radius. The criteria
suggested by the paper are sigma value thresholds of 3 and 32 for the
Deriche and Van Vliet filters respectively, which we apply on the larger
of the two dimensions.
Both the Deriche and Van Vliet filters are numerically unstable for high
blur radius. So we decompose the Van Vliet filter into a parallel bank
of smaller second order filters based on the method of partial fractions
discussed in the book:
Oppenheim, Alan V. Discrete-time signal processing. Pearson Education
India, 1999.
We leave the Deriche filter as is since it is only used for low radii
anyway.
Compared to the CPU implementation, this implementation is more
accurate, but less numerically stable, since CPU uses doubles, which is
not feasible for the GPU.
The only change of behavior between CPU and this implementation is that
this implementation uses the same radius, so Fast Gaussian will match
normal Gaussian, while the CPU implementation has a radius that is 1.5x
the size of normal Gaussian. A patch to change the CPU behavior is in #121211.
Pull Request: https://projects.blender.org/blender/blender/pulls/120431
2024-05-01 09:57:30 +02:00
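The method selection described in the commit message above can be sketched as follows. The enum and function names are hypothetical and not taken from the Blender source. The thresholds of 3 and 32 come from the message, but treating them as inclusive lower bounds is an assumption.

```cpp
#include <algorithm>

/* Hypothetical names, for illustration only. */
enum class FastGaussianMethod { DirectConvolution, Deriche, VanVliet };

/* Sigma thresholds quoted in the commit message. Whether the boundaries are inclusive is an
 * assumption of this sketch. */
constexpr float deriche_sigma_threshold = 3.0f;
constexpr float van_vliet_sigma_threshold = 32.0f;

/* Chooses the blur method based on the larger sigma of the two dimensions. */
inline FastGaussianMethod choose_fast_gaussian_method(float sigma_x, float sigma_y)
{
  const float sigma = std::max(sigma_x, sigma_y);
  if (sigma < deriche_sigma_threshold) {
    return FastGaussianMethod::DirectConvolution;
  }
  if (sigma < van_vliet_sigma_threshold) {
    return FastGaussianMethod::Deriche;
  }
  return FastGaussianMethod::VanVliet;
}
```

Because the larger of the two sigmas decides, a blur that is wide in only one dimension still selects the filter suited to that wider dimension.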
#include "COM_deriche_gaussian_coefficients.hh"
#include "COM_distortion_grid.hh"
#include "COM_fog_glow_kernel.hh"
#include "COM_image_coordinates.hh"
#include "COM_keying_screen.hh"
#include "COM_morphological_distance_feather_weights.hh"
#include "COM_ocio_color_space_conversion_shader.hh"
#include "COM_smaa_precomputed_textures.hh"
#include "COM_symmetric_blur_weights.hh"
#include "COM_symmetric_separable_blur_weights.hh"
#include "COM_van_vliet_gaussian_coefficients.hh"

namespace blender::compositor {

/* -------------------------------------------------------------------------------------------------
 * Static Cache Manager
 *
 * A static cache manager is a collection of cached resources that can be retrieved when needed and
 * created if not already available. In particular, each cached resource type has its own instance
 * of a container derived from the CachedResourceContainer type in the class. All instances of that
 * cached resource type are stored and tracked in the container. See the CachedResource and
 * CachedResourceContainer classes for more information.
 *
 * The manager deletes the cached resources that are no longer needed. A cached resource is said to
 * be not needed when it was not used in the previous evaluation. This is done through the
 * following mechanism:
 *
 * - Before every evaluation, do the following:
 *     1. All resources whose CachedResource::needed flag is false are deleted.
 *     2. The CachedResource::needed flag of all remaining resources is set to false.
 * - During evaluation, when retrieving any cached resource, set its CachedResource::needed flag to
 *   true.
 *
 * In effect, any resource that was used in the previous evaluation but was not used in the current
 * evaluation will be deleted before the next evaluation. This mechanism is implemented in the
 * reset() method of the class, which should be called before every evaluation. The reset for the
 * next evaluation can be skipped by calling the skip_next_reset() method, see its description for
 * more information. */
class StaticCacheManager {
 public:
  SymmetricBlurWeightsContainer symmetric_blur_weights;
  SymmetricSeparableBlurWeightsContainer symmetric_separable_blur_weights;
  MorphologicalDistanceFeatherWeightsContainer morphological_distance_feather_weights;
  CachedTextureContainer cached_textures;
  CachedMaskContainer cached_masks;
  SMAAPrecomputedTexturesContainer smaa_precomputed_textures;
  OCIOColorSpaceConversionShaderContainer ocio_color_space_conversion_shaders;
  DistortionGridContainer distortion_grids;
  KeyingScreenContainer keying_screens;
  CachedShaderContainer cached_shaders;
  BokehKernelContainer bokeh_kernels;
  CachedImageContainer cached_images;
  DericheGaussianCoefficientsContainer deriche_gaussian_coefficients;
  VanVlietGaussianCoefficientsContainer van_vliet_gaussian_coefficients;
  FogGlowKernelContainer fog_glow_kernels;
  ImageCoordinatesContainer image_coordinates;

 private:
  /* The cache manager should skip the next reset. See the skip_next_reset() method for more
   * information. */
  bool should_skip_next_reset_ = false;

 public:
  /* Reset the cache manager by deleting the cached resources that are no longer needed because
   * they weren't used in the last evaluation and prepare the remaining cached resources to track
   * their needed status in the next evaluation. See the class description for more information.
   * This should be called before every evaluation. */
  void reset();

  /* Specifies that the cache manager should skip the next reset. This is useful, for instance,
   * when the evaluation gets canceled before it is fully done. In that case, we wouldn't want to
   * invalidate the cache because not all operations that use cached resources got the chance to
   * mark their used resources as still in use. So we wait until a full evaluation happens before
   * we decide that some resources are no longer needed. */
  void skip_next_reset();
};

} // namespace blender::compositor
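The needed-flag mechanism documented in the class description, together with skip_next_reset(), can be illustrated with a minimal single-container sketch. The `CachedResourceSketch` and `CacheSketch` names, the string keys, and the `contains()` helper are hypothetical, and the real manager delegates to one container per resource type rather than holding a single map. How the actual reset() handles the skip flag is not shown in this header, so clearing it after one skipped reset is an assumption.

```cpp
#include <map>
#include <memory>
#include <string>

/* Hypothetical stand-in for CachedResource. Only the needed flag is modeled. */
struct CachedResourceSketch {
  bool needed = true;
};

class CacheSketch {
 private:
  std::map<std::string, std::unique_ptr<CachedResourceSketch>> resources_;
  bool should_skip_next_reset_ = false;

 public:
  /* Should be called before every evaluation. */
  void reset()
  {
    /* Assumption: a skipped reset clears the skip flag so that only one reset is skipped. */
    if (should_skip_next_reset_) {
      should_skip_next_reset_ = false;
      return;
    }
    for (auto iterator = resources_.begin(); iterator != resources_.end();) {
      if (!iterator->second->needed) {
        /* Not used in the previous evaluation, so delete it. */
        iterator = resources_.erase(iterator);
      }
      else {
        /* Keep it, but require the next evaluation to mark it as needed again. */
        iterator->second->needed = false;
        ++iterator;
      }
    }
  }

  void skip_next_reset()
  {
    should_skip_next_reset_ = true;
  }

  /* Retrieves the resource, creating it if it does not exist, and marks it as needed. */
  CachedResourceSketch &get(const std::string &key)
  {
    std::unique_ptr<CachedResourceSketch> &resource = resources_[key];
    if (!resource) {
      resource = std::make_unique<CachedResourceSketch>();
    }
    resource->needed = true;
    return *resource;
  }

  bool contains(const std::string &key) const
  {
    return resources_.count(key) != 0;
  }
};
```

In this sketch, a resource survives exactly one evaluation in which it is unused, unless a canceled evaluation calls skip_next_reset(), in which case the deletion is deferred until the reset that precedes the evaluation after next.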