Commit Graph

1123 Commits

Author SHA1 Message Date
Clément Foucault
23dce15f67 EEVEE-Next: Horizon Scan: Use Spherical harmonics
This uses Spherical Harmonics to store the indirect lighting and
distant lighting visibility.

We can then reuse this information for each closure which divide
the cost of it by 2 or 3 in many cases, doing the scanning once.

The storage cost is higher than previous method, so we split the
resolution scaling to be independant of raytracing.

The spatial filtering has been split to its own pass for performance
reason. Upsampling now only uses 4 bilinearly interpolated samples
(instead of 9) using bilateral weights to avoid bleeding.

This also add a missing dot product (which soften the lighting
around corners) and fixes the blocky artifacts seen at lower
resolution.

Pull Request: https://projects.blender.org/blender/blender/pulls/118924
2024-03-19 19:16:21 +01:00
Aras Pranckevicius
a05adbef28 BLF: optimizations and fixes to font shader
Simplifies/optimizes the "font" shader. It runs faster now too, but primarily
this is so that it loads/initializes faster.

* Instead of doing blur via individual bilinear samples (where each sample is 4
  texel fetches), do raw texel fetches of the kernel footprint and compute final
  result by shifting the kernel weights according to bilinear fraction weight.
  For 5x5 blur, this reduces number of texel fetches from 64 down to 36.
* Instead of checking "is the texel inside the glyph box? if so, then fetch it",
  first fetch it, and then set result to zero if it was outside. Simplifies the
  branching code flow in the compiled GPU shader.
* Avoid costly integer modulo/division for "unwrapping" the font texture. The
  texture width is always power of two size, so division/modulo can be replaced
  by masking and a shift. Setup uniforms to contain the needed data.

### Fixes

* The 3x3 blur was not doing a 3x3 blur, due to a copy-pasta typo (one of the
  sample offsets was repeated twice, and thus another sample offset was
  missing).
* Blur towards left/top edges of the glyphs had artifacts, because float->int
  casting in GLSL rounds towards zero, but the code actually wanted to round
  towards floor.

Image of how the blur has changed in the PR.

### First time initialization

* Windows 10, NVIDIA RTX 3080Ti, OpenGL: 274.4ms -> 51.3ms
* macOS, Apple M1 Max, Metal: 456ms -> 289ms (this is including PSO creation
  time).

### Shader performance/complexity

Performance I only measured on macOS (M1 Max), by making a BLF text that is
scaled up to cover most of screen via Python. Using Xcode Metal profiler,
drawing that text with 5x5 shadow blur: 1.5ms -> 0.3ms.

More performance analysis details in PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/119653
2024-03-19 16:29:21 +01:00
Jason Fielder
6b56ed3cd3 Metal: Resolve artifact in EEVEE Next Film Cryptomatte
Cryptomatte passes would generate a feathered outline
in Metal due to missing texture fence in chained
read->modify->write->read->... patterns.

Added imageFence function to explicitly state that
imageStore's should be visible to future imageLoad's.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119163
2024-03-14 17:48:30 +01:00
Jeroen Bakker
f0f911590e EEVEE-Next: Viewport pixel size with up-sampling
EEVEE-Next performes less on integrated GPUs then discrete GPUs.
Most shaders have been analyzed, but there will always be bottlenecks
related to architectural differences.

In order to make EEVEE-Next run smooth on integrated GPUs this change
will implement viewport pixel size option similar to Cycles. The main difference
is that the samples will still be weighted and up-sampled to the final film
resolution. This makes the pixels not look squared in the viewport but will
resolve to something close to the results without up-scaling.

This improves the performance especially on integrated GPUs. The improvement
for discrete GPUs are less noticeable. See here the stats when playing
`rain_restaurant.blend` back on a RAPHAEL_MENDOCINO iGPU.

| Pixel size | Frames per second |
|------------|-------------------|
| 1x         | 0.25 FPS          |
| 2x         | 4.14 FPS          |
| 4x         | 6.90 FPS          |
| 8x         | 9.95 FPS          |

Related to: #114597
See PR for some example images.

Pull Request: https://projects.blender.org/blender/blender/pulls/118903
2024-03-13 12:00:24 +01:00
Campbell Barton
32151abfc3 Cleanup: spelling in comments 2024-03-09 16:47:38 +11:00
Campbell Barton
b1c59a793c Cleanup: correct spelling for alignment 2024-03-09 16:43:34 +11:00
Clément Foucault
b8e726a158 GPU: Add support for small types
This implement the design of #118961.

- Add aliases in GLSL since theses types are
  not supported.
- Add detection mechanism that prevents usage
  inside shader shared code.

Check is only done in debug build to avoid slowing down
application startup.

Pull Request: https://projects.blender.org/blender/blender/pulls/119226
2024-03-08 23:28:15 +01:00
Clément Foucault
4205718dce GPU: Cleanup type aliases
This define all aliases for supported types,
document which one to use in C++ shared code,
move relevant defines to their backend file.

Rename `bool1` to `bool32_t` and cleanup
its usage as mentioned in #118961.

Rel. #118961

Pull Request: https://projects.blender.org/blender/blender/pulls/119098
2024-03-08 19:09:10 +01:00
Jeroen Bakker
3109564825 GPU: Fix shader compilation Metal
Metal uses an union to store the `gl_WorkGroupSize` the union needs to
be unpacked. We first unpack to uvec3 before in order to work around an
NVIDIA driver bug.

Issue introduced by: e3ac2ac93e

Pull Request: https://projects.blender.org/blender/blender/pulls/118749
2024-02-26 14:51:21 +01:00
Jeroen Bakker
e3ac2ac93e GPU: Shaders fail to compile on NVIDIA
NVIDIA fails with segmentation fault when compiling shaders due to recent changes.
This PR tweaks the shader code to work around the segmentation fault.

Issue introduced by: 7f43699ebf

Pull Request: https://projects.blender.org/blender/blender/pulls/118744
2024-02-26 13:01:10 +01:00
Eugene Kuznetsov
7f43699ebf DRW: Curves: Indexbuf optimization for large numbers of curves
This optimizes a few loops that become significant bottlenecks during
viewport rendering of scenes with large numbers of curves.

To render a curves object, Blender needs to generate a potentially
very large (but trivial) index buffer. As previously implemented,
this index buffer is generated in an extremely inefficient manner,
with a single-threaded loop and an explicit function call per entry.
The buffer then needs to be pushed onto the GPU, which is also a fairly
slow task.

The PR generates the index buffer directly on the GPU with compute
shader.

Pull Request: https://projects.blender.org/blender/blender/pulls/116617
2024-02-25 17:22:58 +01:00
Jeroen Bakker
e6eecdf614 EEVEE-Next: Voronoi colors are pure emissive
The voronoi texture node only sets the first 3 components of the
color. The alpha value is never set. Normally this is covered
when using it in a shader node, but when directly connected to
the AOV output, the color was stored as a pure emissive color.

This resulted in incorrect colors in the viewport and image renders.

This is a partial fix for #118494

Pull Request: https://projects.blender.org/blender/blender/pulls/118497
2024-02-21 11:32:29 +01:00
Clément Foucault
749a3880de GL: Remove cube map array workaround 2024-01-31 18:12:59 +01:00
Clément Foucault
8c02b6eb24 EEVEE-Next: Add normal layer reuse
This allows to save some space in the gbuffer
and improve write performance.

Pull Request: https://projects.blender.org/blender/blender/pulls/117131
2024-01-31 15:36:42 +01:00
Omar Emara
51aac62006 Fix: Output of Color Ramp node is slightly off
The output of the Color Ramp node in the GPU compositor and EEVEE is
slightly off. That's because the factor is evaluated directly at the
sampler without proper half pixel offsets to account for the sampler's
linear interpolation, which this patch adds.

Pull Request: https://projects.blender.org/blender/blender/pulls/117677
2024-01-31 10:48:46 +01:00
Jason Fielder
190567f941 EEVEE Next: Optimize HiZ with fast image load store routines
Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/116953
2024-01-24 09:36:25 +01:00
Miguel Pozo
ef6b6031de Fix #117123: EEVEE: Broken Specular BSDF
Add missing disable sss flag
2024-01-15 16:57:18 +01:00
Hans Goudey
6438d0ad1f Cleanup: Grammar in comments 2024-01-11 11:01:50 -05:00
Miguel Pozo
31d8a6514f Fix: EEVEE(Legacy): Broken dielectric material shading
sss_radius r and g are already used for occlussion workarounds.
Use only sss_radius.b for flagging sss as disabled.
Regression from 2942147079.
2024-01-10 17:59:20 +01:00
Clément Foucault
ea989ebf94 EEVEE/EEVEE-Next: Split Diffuse and Subsurface closure
Even if related, they don't have the same performance
impact.

To avoid any performance hit, we replace the Diffuse
by a Subsurface Closure for legacy EEVEE and
use the subsurface closure only where needed for
EEVEE-Next leveraging the random sampling.

This increases the compatibility with cycles that
doesn't modulate the radius of the subsurface anymore.
This change is only present in EEVEE-Next.

This commit changes the principled BSDF code so that
it is easier to follow the flow of data.

For legacy EEVEE, the SSS switch is moved to a
`radius == -1` check.
2024-01-09 16:39:17 +13:00
Campbell Barton
0ba83fde1f Cleanup: spelling in comments 2024-01-08 11:24:37 +11:00
Brecht Van Lommel
d377ef2543 Clang Format: bump to version 17
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.

If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
2024-01-03 13:38:14 +01:00
Jason Fielder
d721dcd767 Metal: Resolve texture atomic compilation issue
Resolves small issue with native texture
atomic support after addition of fallback path.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/116657
2023-12-31 01:07:47 +01:00
Omar Emara
dc082f432a Fix #116522: Compositor incorrectly extrapolates values
The GPU compositor incorrectly extrapolates values of RGBA curves node.
That's because the code introduces a half-pixel offset to the color
values since they will be used to sample the curve maps. Those same
values are then used for extrapolation, which shouldn't take the
half-pixel value into account.

This patch fixes that by computing sampler coordinate in a separate
step.

Pull Request: https://projects.blender.org/blender/blender/pulls/116586
2023-12-28 09:25:11 +01:00
Jason Fielder
335d3a1b75 GPU: Add Shader specialization constant API
Adds API to allow usage of specialization constants in shaders.
Specialization constants are dynamic runtime constants which can
be compiled into a shader pipeline state object (PSO) to improve
runtime performance by reducing shader complexity through
shader compiler constant-folding.

This API allows specialization constant values to be specified
along with a default value if no constant value has been declared.
Each GPU backend is then responsible for caching PSO permutations
against the current specialization configuration.

This patch adds support for specialization constants in the
Metal backend and provides a generalised high-level solution
which can be adopted by other graphics APIs supporting
this feature.

Authored by Apple: Michael Parkin-White
Authored by Blender: Clément Foucault (files in gpu/test folder)

Pull Request: https://projects.blender.org/blender/blender/pulls/115193
2023-12-28 05:34:38 +01:00
Clément Foucault
c0fe51678e Fix #116489: Improper use of enum in GLSL file
Use defines instead.
2023-12-24 20:18:00 +13:00
Clément Foucault
f4275cc4df EEVEE-Next: Gbuffer Optimization
This modify the GBuffer layout to store less bits per closures.
This allows packing all closures into 64 bits or 96 bits.
In turn, this reduces the amount of data stored for most
usual materials.

Moreover, this contain some groundwork for the getting rid of the
hard-coded closure type. But evaluation shaders still use
the hard-coded types.

This adds tests for checking packing and unpacking of the gbuffer
doesn't loose any data.

Related to #115966

Pull Request: https://projects.blender.org/blender/blender/pulls/116476
2023-12-23 05:58:52 +01:00
Miguel Pozo
6c40adcc36 Fix: GPU: from_up_axis
glsl sign can return 0.
Fixes surfels display when normal.z == 0.
2023-12-13 17:47:48 +01:00
Clément Foucault
ac11ccd2bd EEVEE-Next: Add Translucent BSDF support
This adds support for Translucent BSDF.

This also fixes a bug to allow correct
shadowing.

The input normal had to be set back to
non-inverted in the node function to allow
for correct interpretation of the Normal
by Screen Space Reflections.

This add the necessary optimization
and code deduplication to hybrid deferred
and forward pipeline.

Pull Request: https://projects.blender.org/blender/blender/pulls/116070
2023-12-13 02:19:19 +01:00
Campbell Barton
77204bed17 Cleanup: spelling in comments 2023-12-12 12:58:56 +11:00
Jason Fielder
9313750f0a Metal: Add fallback path for texture atomics V2
This patch adds an alternative path for devices/OSs
which do not support native texture atomics in Metal.
Support is encapsulated within the backend, ensuring
any allocated texture with the USAGE_ATOMIC flag is
allocated with a backing buffer, upon which atomic
operations happen.

The shader generation is also changed for the atomic
case, which instructs the backend to insert additional
buffer bind-points for the buffer resource. As Metal
also only supports buffer-backed textures for
textureBuffers or 2D textures, TextureArrays and
3D textures are emulated within a 2D texture, with
sample locations being indirected.

All usage of atomic textures MUST now utilise the
correct atomic texture types in the high level shader
and GPUShaderCreateInfo declarations.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/115956
2023-12-11 23:00:20 +01:00
Campbell Barton
9898602e9d Cleanup: clarify #ifndef checks in trailing #endif comments 2023-12-07 10:38:54 +11:00
Clément Foucault
fe848ce3ef EEVEE-Next: Optimize GBuffer Layout and writting
This layout is more flexible and polymorphic.

While the worst case is worse (4 + 3 layers),
the common case is more optimized (2 + 2 layers).
The average written closure data is also lower
since we can compact the data for special cases
which are quite frequent.

Some adjustment had to be made in the denoise an
tile classify shaders.

Pull Request: https://projects.blender.org/blender/blender/pulls/115541
2023-12-01 14:41:13 +01:00
Harley Acheson
c8d1bb5902 BLF: Rename Text Shader Depth to Component Len
Rename text shader glyph "depth" to "glyph_comp_len", as requested
by Clément Foucault.

Pull Request: https://projects.blender.org/blender/blender/pulls/115640
2023-12-01 02:07:57 +01:00
Harley Acheson
2052d167c3 BLF: Fix Shader Type Conversion
Implicit conversion of vec2 is not permitted with Metal.

Pull Request: https://projects.blender.org/blender/blender/pulls/115638
2023-12-01 01:09:52 +01:00
Harley Acheson
5d330ddb1f BLF: Support All Render and Bitmap Formats
Add support for all of FreeType's various render formats and output
bitmap formats.

Pull Request: https://projects.blender.org/blender/blender/pulls/115452
2023-11-30 22:17:30 +01:00
Clément Foucault
92bec9af0c Fix EEVEE: Compilation on Metal 2023-11-30 01:10:27 +01:00
Jeroen Bakker
8ae80abe5f Fix: Various Small GLSL Material Tweaks
We should use explicit casting. Although it is not always needed it
is a best practise in order to support the shaders on Metal.

* `float max(float, int)` is not supported on Metal and fails with a compilation error

Pull Request: https://projects.blender.org/blender/blender/pulls/115464
2023-11-27 12:06:28 +01:00
Jason Fielder
9042773d93 GPU: Resolve compilation error in Metal caused by type ambiguity
Recent change in commit 3f778150a9
caused compilation errors in Metal due to type ambiguity. Updating call to
explicitly utilise floats where appropriate.

Authored by Apple: Michael Parkin-White

Co-authored-by: Michael Parkin-White <mparkinwhite@apple.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/115301
2023-11-27 11:50:25 +01:00
Jeroen Bakker
53ebe40ed2 Fix: EEVEE Compiler Issue Emission Shader
Due to recent changes the local variables where not explicitly cast, that failed
when compiling on Metal.

Pull Request: https://projects.blender.org/blender/blender/pulls/115463
2023-11-27 08:46:23 +01:00
Campbell Barton
1eff48a838 Cleanup: spelling in code 2023-11-27 10:55:39 +11:00
Campbell Barton
27c660707d Cleanup: spelling in comments, variables 2023-11-27 09:54:36 +11:00
Leon Schittek
0335b6a3b7 UI: Improve menu dropshadow
Improvements to the drawing of shadows, used with blocks, menus, nodes,
etc. Improvements to shape, especially at the top corner or at extremes
of widget roundness. Allows transparent objects to have shadows. This
is a nice refactor that removes a lot of code.

Pull Request: https://projects.blender.org/blender/blender/pulls/111794
2023-11-24 22:50:20 +01:00
Miguel Pozo
3f778150a9 GPU: Sanitize closure nodes inputs
Ensure all Closures are filled with correct values,
like the principled bsdf node already does.

The main reason is that the new AgX color transform doesn't play well
with negative values (see #113220), but it's probably best to ensure we
use sanitized values in the rendering code as a whole.

Pull Request: https://projects.blender.org/blender/blender/pulls/115059
2023-11-21 20:15:59 +01:00
Clément Foucault
a001cf9f2b EEVEE-Next: Displacement Option
This add the displacement option to EEVEE materials.
This unifies the option for Cycles and EEVEE.

The displacement only option is not matching cycles
and not particularly useful. So we decided to not
support it and revert to displacement + bump.

Pull Request: https://projects.blender.org/blender/blender/pulls/113979
2023-11-21 19:55:38 +01:00
Clément Foucault
3097d5d821 EEVEE-Next: Add horizon scan to raytracing module
This uses the principles outlined in
Screen Space Indirect Lighting with Visibility Bitmask
to compute local and distant diffuse lighting.

This implements it inside the ray-tracing module as a fallback when the
surface is too rough. The threshold for blending between technique is
available to the user.

The implementation first setup a radiance buffer and a view normal
buffer. These buffer are tracing resolution as the lighting quality is
less important for rough surfaces. These buffers are necessary to avoid
re-projection on a per sample basis, and finding and rotating the
surface normal.

The processing phase scans the whole screen in 2 directions and outputs
local incomming lighting from neighbor pixels and the remaining
occlusion for everything that is outside the view.

The final steps filters the result of the previous phase while applying
the occlusion on the probe radiance to have an energy conserving mix.

Related #112979

Pull Request: https://projects.blender.org/blender/blender/pulls/114259
2023-11-21 16:24:14 +01:00
Hoshinova
0b11c591ec Nodes: Merge Musgrave node into Noise node
This path merges the Musgrave and Noise Texture nodes into a single
combined Noise Texture node. The reasoning is that both nodes
intrinsically do the same thing, which is the layering of Perlin noise
derivatives to produce fractal noise. So the patch de-duplicates code
and unifies the use of fractal noise for the end use.

Since the Noise node had a Distortion input and a Color output, while
the Musgrave node did not, those are now available to the Musgrave types
as new functionalities.

The Dimension input of the Musgrave node is analogous to the Roughness
input of the Noise node, so both inputs were unified to follow the same
behavior of the Roughness input, which is arguable more intuitive to
control. Similarly, the Detail input was slightly different across both
nodes, since the Noise node evaluated one extra layer of noise. This was
also unified to follow the behavior of the Noise node.

The patch, coincidentally fixes an unreported bug causing repeated
output for certain noise types and another floating precision bug
#112180.

The versioning code implemented with this patch ensures backward
compatibility for both the Musgrave and Noise Texture nodes. When
opening older Blender files in Blender 4.1 the output of both nodes are
guaranteed to always be exactly identical to that of Blender files
created before the nodes were merged in all cases.

Forward compatibility with Blender 4.0 is implemented by #114236.
Forward compatibility with Blender 3.6 LTS is implemented by #115015.

Pull Request: #111187
2023-11-18 09:40:44 +01:00
Jeroen Bakker
0c9433bf44 Vulkan: Retarget Depth Range
OpenGL uses a depth range between -1 and 1, which is then normalized.
Metal & Vulkan uses a depth range between 0 and 1, which is already normalized.

The final plan would be to default to a depth range between 0 and 1, but
for now the depth ranges are retargetted so they won't be clipped away.

This solves the next issues for users:
- Navigate control will be rendered correctly
- Ortographic view clipping artifacts
- EEVEE light evaluation

Retargetting happens at the end of the vertex stage or when a geometry
stage is present at the end of the geometry stage. Derivatives using
depth would have a different value compared to OpenGL, but would match
Metal backend. OpenGL performs clipping and generates derivatives based
on the original depth value.

`gl_FragCoord` and clipping would have some precision differences as clipping
and normalizing are done in a different order but would match Metal.

Geometry shaders should use `gpu_EmitVertex` to ensure that the retargetting
is done per vertex.

Pull Request: https://projects.blender.org/blender/blender/pulls/114669
2023-11-10 12:32:06 +01:00
Miguel Pozo
7c68e9a94c Fix #114524: EEVEE-Next: Wrong normal map node result
The new draw manager stores the object scale as a regular flag.
Update `node_normal_map` to read it correctly.

Pull Request: https://projects.blender.org/blender/blender/pulls/114587
2023-11-07 18:58:31 +01:00
Brecht Van Lommel
adb41fe6b2 Merge branch 'blender-v4.0-release' into main 2023-11-06 19:13:18 +01:00