Commit Graph

5065 Commits

Author SHA1 Message Date
Omar Emara
0b1dc351e4 Fix: eGPUBarrier enum negate operator is broken
The ENUM_OPERATORS macro for the eGPUBarrier enum uses
GPU_BARRIER_UNIFORM as its maximum value, while it should be
GPU_BARRIER_BUFFER_UPDATE instead.
2024-04-01 16:14:23 +02:00
Hoshinova
c78c6b0bdf Fix #119797: Noise Texture Precision Issues
The Perlin noise algorithms suffer from precision issues when a coordinate
is greater than about 250000.

To fix this the Perlin noise texture is repeated every 100000 on each axis.
This causes discontinuities every 100000, however at such scales this
usually shouldn't be noticeable.

Pull Request: https://projects.blender.org/blender/blender/pulls/119884
2024-03-29 16:12:23 +01:00
Campbell Barton
686605a6dd Cleanup: declare arrays as const where possible 2024-03-28 22:57:57 +11:00
Campbell Barton
b2e00d1285 Cleanup: use const pointer arguments 2024-03-28 20:57:50 +11:00
Campbell Barton
362d381a5a Cleanup: pass GPUStateMutable as a const reference 2024-03-28 18:10:49 +11:00
Campbell Barton
b0328f67a9 Fix invalid sizes when clearing gpu::Batch
For Batch::verts some values weren't cleared,
for Batch::inst values after the array would be cleared,
although as these were already zeroed this probably didn't cause
problems in practice.
2024-03-28 13:45:22 +11:00
Campbell Barton
3416fe6e1e License headers: add SPDX headers 2024-03-27 10:31:24 +11:00
Campbell Barton
40ab214c0a Cleanup: spelling in comments 2024-03-27 10:25:31 +11:00
Hans Goudey
48e4576162 Cleanup: Remove unnecessary keywords from C++ headers 2024-03-26 15:58:39 -04:00
Clément Foucault
2a600b4a83 EEVEE-Next: Shadow: Limit view per shadow map projection
This limits the number of tilemaps per LOD that can be fed to avoid the
easy to hit "Too many shadow updates" (#119757).

This allows for a max 64 tilemaps to be updated at once at their lowest
requested LOD (so ~10.6667 point lights if every faces of the punctual
shadow map is needed, but likely more in practice).

Unfortunately this is still quite low and will surely be hit quite soon
with directional shadow added to it. One idea to workaround this would
be to time slice the update of some lights, but this opens a whole can
of worms that I'm not ready to open for now so I created #119890 for
future reference.

Some notes, most lights seems to request around 3 LODs. It might help
to allow requesting at least 2 LODs if we are rendering since volumes
might want lower LOD available for volumes.

I added a very simplistic heuristic that also lowers the max tilemaps
when transforming, animation playback or navigating the 3D view to
improve the responsiveness of the engine. Note that this doesn't
only lowers the resolution to the minimum requested one. So it should
be good enough in most cases.

Pull Request: https://projects.blender.org/blender/blender/pulls/119889
2024-03-26 20:33:31 +01:00
Aras Pranckevicius
3663c8147c Vulkan: implement support for compressed textures
Textures that are GPU-compressed already (in practice: from DDS files
that are DXT1/DXT3/DXT5 compressed) now can stay GPU compressed
in Vulkan, similar to how that works on OpenGL.

Additionally, fixed lack of mipmaps in Vulkan textures. The textures
were created with mipmaps (good), the sampler too (good), but
the vulkan image view was always saying "yo, this is mip 0 only"
because mip range variables were never set to anything than zero.

Pull Request: https://projects.blender.org/blender/blender/pulls/119866
2024-03-26 14:49:53 +01:00
Jeroen Bakker
e811785f37 Vulkan: to_string for used vulkan types
Every vulkan installation has a vk.xml file containing the vulkan specification
in a machine readable fasion.

This PR uses the vk.xml to generate to_string functions for data types blender uses.
When updating to a new specification or when changing features/extensions we
should re-generate the to_string functions.

The generator is implemented in `vk_to_string.py`.

Pull Request: https://projects.blender.org/blender/blender/pulls/119880
2024-03-26 11:35:16 +01:00
Campbell Barton
155dae94d7 Cleanup: code-comments, use doxygen formatting & spelling corrections
Also move some function doc-strings from the implementation
to their declarations.
2024-03-26 17:55:20 +11:00
Hans Goudey
fc0d8ba012 Cleanup: Remove C++ ifdef checks in C++ headers
Pull Request: https://projects.blender.org/blender/blender/pulls/119900
2024-03-26 04:56:03 +01:00
Hans Goudey
893130e6fe Refactor: Remove unnecessary C wrapper for GPUBatch class
Similar to fe76d8c946

Pull Request: https://projects.blender.org/blender/blender/pulls/119898
2024-03-26 03:06:25 +01:00
Aras Pranckevicius
26337b9fb4 Metal: implement support for compressed textures
Noticed lack of it via #119793. Now DDS images using BC1/BC2/BC3
(aka DXT1/DXT3/DXT5) formats can keep on being GPU compressed
on Metal too, just like e.g. on OpenGL.

Pull Request: https://projects.blender.org/blender/blender/pulls/119835
2024-03-25 11:40:20 +01:00
Hans Goudey
b54d9875ba Fix: Another Metal build error after recent refactor
Sorry for the noise, I misread the output from the PR build.
2024-03-24 13:24:03 -04:00
Hans Goudey
aa87b747c5 Fix: Additional macOS metal build error 2024-03-24 12:37:36 -04:00
Hans Goudey
e201b5e553 Fix: Debug build error after previous commit 2024-03-24 12:17:44 -04:00
Hans Goudey
fe76d8c946 Refactor: Remove unnecessary C wrappers for vertex and index buffers
Now that all relevant code is C++, the indirection from the C struct
`GPUVertBuf` to the C++ `blender::gpu::VertBuf` class just adds
complexity and necessitates a wrapper API, making more cleanups like
use of RAII or other C++ types more difficult.

This commit replaces the C wrapper structs with direct use of the
vertex and index buffer base classes. In C++ we can choose which parts
of a class are private, so we don't risk exposing too many
implementation details here.

Pull Request: https://projects.blender.org/blender/blender/pulls/119825
2024-03-24 16:38:30 +01:00
Hans Goudey
8b514bccd1 Cleanup: Move remaining GPU headers to C++
Pull Request: https://projects.blender.org/blender/blender/pulls/119807
2024-03-23 01:24:18 +01:00
Miguel Pozo
def5f86cae Fix: EEVEE-Next: Material compilation
Move pcg functions to eevee_sampling_lib.
Including gpu_shader_common libs in engine code results in double  includes.
2024-03-22 18:58:12 +01:00
Jeroen Bakker
463856e6c6 GPU: Remove print statement when frame capturing
When frame capturing cannot be start an error is printed to the console.
Most of the time the issue is that you're not running from within a frame
capturing environment. For example not from your IDE/GPU debugger.

The print statement is often just not that useful. Especially when
running the `WITH_GPU_DRAW_TESTS` where it floods the console.

Pull Request: https://projects.blender.org/blender/blender/pulls/119783
2024-03-22 16:27:52 +01:00
Campbell Barton
57dd9c21d3 Cleanup: spelling in comments 2024-03-21 10:02:53 +11:00
Miguel Pozo
3888bdf8b2 EEVEE-Next: Fix transparent shadows convergence
Replace the hashed alpha function in shadows for a fully random one.
Add pcg functions to `gpu_shader_common_hash.glsl`
(Split from #119480)

Pull Request: https://projects.blender.org/blender/blender/pulls/119526
2024-03-20 16:05:07 +01:00
Brecht Van Lommel
dc34e96dc4 Merge branch 'blender-v4.1-release' 2024-03-20 15:49:15 +01:00
Jason Fielder
c584597165 Fix #109363: Resolve GPencil fill in Metal
Resolves an issue with stroke rendering in
Metal using the geometry shader fallback
path. Stroke rendering now matches OpenGL
which should enable the GPencil fill tool to
function correctly at all zoom levels.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119660
2024-03-20 15:38:44 +01:00
Clément Foucault
23dce15f67 EEVEE-Next: Horizon Scan: Use Spherical harmonics
This uses Spherical Harmonics to store the indirect lighting and
distant lighting visibility.

We can then reuse this information for each closure which divide
the cost of it by 2 or 3 in many cases, doing the scanning once.

The storage cost is higher than previous method, so we split the
resolution scaling to be independant of raytracing.

The spatial filtering has been split to its own pass for performance
reason. Upsampling now only uses 4 bilinearly interpolated samples
(instead of 9) using bilateral weights to avoid bleeding.

This also add a missing dot product (which soften the lighting
around corners) and fixes the blocky artifacts seen at lower
resolution.

Pull Request: https://projects.blender.org/blender/blender/pulls/118924
2024-03-19 19:16:21 +01:00
Aras Pranckevicius
a05adbef28 BLF: optimizations and fixes to font shader
Simplifies/optimizes the "font" shader. It runs faster now too, but primarily
this is so that it loads/initializes faster.

* Instead of doing blur via individual bilinear samples (where each sample is 4
  texel fetches), do raw texel fetches of the kernel footprint and compute final
  result by shifting the kernel weights according to bilinear fraction weight.
  For 5x5 blur, this reduces number of texel fetches from 64 down to 36.
* Instead of checking "is the texel inside the glyph box? if so, then fetch it",
  first fetch it, and then set result to zero if it was outside. Simplifies the
  branching code flow in the compiled GPU shader.
* Avoid costly integer modulo/division for "unwrapping" the font texture. The
  texture width is always power of two size, so division/modulo can be replaced
  by masking and a shift. Setup uniforms to contain the needed data.

### Fixes

* The 3x3 blur was not doing a 3x3 blur, due to a copy-pasta typo (one of the
  sample offsets was repeated twice, and thus another sample offset was
  missing).
* Blur towards left/top edges of the glyphs had artifacts, because float->int
  casting in GLSL rounds towards zero, but the code actually wanted to round
  towards floor.

Image of how the blur has changed in the PR.

### First time initialization

* Windows 10, NVIDIA RTX 3080Ti, OpenGL: 274.4ms -> 51.3ms
* macOS, Apple M1 Max, Metal: 456ms -> 289ms (this is including PSO creation
  time).

### Shader performance/complexity

Performance I only measured on macOS (M1 Max), by making a BLF text that is
scaled up to cover most of screen via Python. Using Xcode Metal profiler,
drawing that text with 5x5 shadow blur: 1.5ms -> 0.3ms.

More performance analysis details in PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/119653
2024-03-19 16:29:21 +01:00
Brecht Van Lommel
7a395e2e7f Revert changes from main commits that were merged into blender-v4.1-release
The last good commit was f57e4c5b98.

After this one more fix was committed, this one is preserved as well:
67bd678887.
2024-03-18 15:04:12 +01:00
Jason Fielder
661d12aef7 Fix #119195: Ensure Metal uses correct attribute conversion mode
Resolves custom attribute types for ints and booleans by ensuring
conversion mode is correct. Previously, the attribute declarations
were assumed to be linear. However, patch ensures the correct
attribute index is now fetched, ensuring the conversion mode
is correctly specified for non-linear attribute ID's.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119569
2024-03-18 13:38:09 +01:00
Jason Fielder
6768ded895 Fix #118868: Metal render pass output for EEVEE Next
Resolves render pass export for EEVEE Next on Metal.
Reads from texture views was previously utilising the
root texture rather than the view variant, resulting
in views into texture arrays being incorrectly sampled.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119563
2024-03-16 20:16:37 +01:00
Jason Fielder
6b56ed3cd3 Metal: Resolve artifact in EEVEE Next Film Cryptomatte
Cryptomatte passes would generate a feathered outline
in Metal due to missing texture fence in chained
read->modify->write->read->... patterns.

Added imageFence function to explicitly state that
imageStore's should be visible to future imageLoad's.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119163
2024-03-14 17:48:30 +01:00
Jason Fielder
ecffea86b1 Metal: Fix Storage buffer read sync affecting surfels
Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119093
2024-03-14 09:40:59 +01:00
Jeroen Bakker
f0f911590e EEVEE-Next: Viewport pixel size with up-sampling
EEVEE-Next performes less on integrated GPUs then discrete GPUs.
Most shaders have been analyzed, but there will always be bottlenecks
related to architectural differences.

In order to make EEVEE-Next run smooth on integrated GPUs this change
will implement viewport pixel size option similar to Cycles. The main difference
is that the samples will still be weighted and up-sampled to the final film
resolution. This makes the pixels not look squared in the viewport but will
resolve to something close to the results without up-scaling.

This improves the performance especially on integrated GPUs. The improvement
for discrete GPUs are less noticeable. See here the stats when playing
`rain_restaurant.blend` back on a RAPHAEL_MENDOCINO iGPU.

| Pixel size | Frames per second |
|------------|-------------------|
| 1x         | 0.25 FPS          |
| 2x         | 4.14 FPS          |
| 4x         | 6.90 FPS          |
| 8x         | 9.95 FPS          |

Related to: #114597
See PR for some example images.

Pull Request: https://projects.blender.org/blender/blender/pulls/118903
2024-03-13 12:00:24 +01:00
laurynas
aa3ffca8dc Fix #119247: Curves: Extra point in evaluated spline of Curves geometry
In bf17fc8d79  after extending buffer to multiple of 4 there appeared trailing
space in buffer not covered by shader's `for` loop.

Pull Request: https://projects.blender.org/blender/blender/pulls/119346
2024-03-12 15:01:10 +01:00
Jason Fielder
06ac33bdd2 Metal: Fix SSBO from VBO size assertion
Resolves assertion firing when creating an SSBO
from a VBO which is not aligned to 16 bytes.
Required to ensure API validation is satisfied.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119298
2024-03-11 08:28:14 +01:00
Prakhar-Singh-Chouhan
5d076e0e7b Vulkan: Implementing VKBackend::samplers_update()
Implemented `VKBackend::samplers_update()`. When triggered,
if the VK Device is initialized, the `device.samplers` are
freed and reinitialized.

Implements: #117019
Pull Request: https://projects.blender.org/blender/blender/pulls/119109
2024-03-11 07:57:52 +01:00
Jason Fielder
703353b5da Metal: Fix uniform upload for small types
This patch adds special cases to Shader::uniform_int routine
to allow writing of small types (1 bytes, 2 bytes) to the push
constant buffer.

This previously interpreted all incoming push constant data as
integer components only, resulting in rendering artifacts such as
bad SRGB mode selection and shader editor not rendering due
to mis-aligned overlay parameter, as the uniform assignment
would overflow consecutive small types.

Authored by Apple: Michael Parkin-White

Pull Request: https://projects.blender.org/blender/blender/pulls/119285
2024-03-10 19:36:30 +01:00
Campbell Barton
e33f5e36ac Cleanup: spacing around C-style comment blocks 2024-03-09 23:40:57 +11:00
Campbell Barton
32151abfc3 Cleanup: spelling in comments 2024-03-09 16:47:38 +11:00
Campbell Barton
b1c59a793c Cleanup: correct spelling for alignment 2024-03-09 16:43:34 +11:00
Clément Foucault
b8e726a158 GPU: Add support for small types
This implement the design of #118961.

- Add aliases in GLSL since theses types are
  not supported.
- Add detection mechanism that prevents usage
  inside shader shared code.

Check is only done in debug build to avoid slowing down
application startup.

Pull Request: https://projects.blender.org/blender/blender/pulls/119226
2024-03-08 23:28:15 +01:00
Clément Foucault
4205718dce GPU: Cleanup type aliases
This define all aliases for supported types,
document which one to use in C++ shared code,
move relevant defines to their backend file.

Rename `bool1` to `bool32_t` and cleanup
its usage as mentioned in #118961.

Rel. #118961

Pull Request: https://projects.blender.org/blender/blender/pulls/119098
2024-03-08 19:09:10 +01:00
Hans Goudey
1e1d7034ec Cleanup: Move GPU_uniform_buffer.h to C++ 2024-03-06 21:54:28 -05:00
Anthony Roberts
445fd42c61 Windows: Add ARM64 support
* Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above.
  Older generation devices are not and will not be supported due to
  some driver issues
* Requires VS2022 for building.
* Uses new MSVC preprocessor for sse2neon compatibility.
* SIMD is not enabled, waiting on conversion of blenlib to C++.

Ref #119126

Pull Request: https://projects.blender.org/blender/blender/pulls/117036
2024-03-06 16:14:34 +01:00
Omar Emara
eb91828aab GPU: Add maximum image units to GPU capabilities
This patch adds the maximum number of supported image units to the GPU
capabilities module. Currently, the GPU module assume a maximum of 8
units, so the patch is not currently particularly useful, but we can
consider committing it for the future anyways.

Pull Request: https://projects.blender.org/blender/blender/pulls/119057
2024-03-05 07:25:20 +01:00
Campbell Barton
ed5fb3eaba Cleanup: various non-functional C++ changes 2024-03-05 11:32:42 +11:00
Campbell Barton
76867ad4c2 Cleanup: redundant "void" in function declarations for C++ 2024-03-05 11:25:35 +11:00
laurynas
bf17fc8d79 Fix: GPU: Ensures length of curves GPUIndexBuf to be multiple of 4
Exception is thrown in gpu_storage_buffer.cc

To reproduce create legacy Bezier curve and convert it to new Curves.
Code is from #116617

Pull Request: https://projects.blender.org/blender/blender/pulls/118951
2024-03-03 16:39:11 +01:00