This replaces the direct shader uniform layout declaration by a linear
search through a global buffer.
Each instance has an attribute offset inside the global buffer and an
attribute count.
This removes any padding and tighly pack all uniform attributes inside
a single buffer.
This would also remove the limit of 8 attribute but it is kept because of
compatibility with the old system that is still used by the old draw
manager.
This is a new implementation of the draw manager using modern
rendering practices and GPU driven culling.
This only ports features that are not considered deprecated or to be
removed.
The old DRW API is kept working along side this new one, and does not
interfeer with it. However this needed some more hacking inside the
draw_view_lib.glsl. At least the create info are well separated.
The reviewer might start by looking at `draw_pass_test.cc` to see the
API in usage.
Important files are `draw_pass.hh`, `draw_command.hh`,
`draw_command_shared.hh`.
In a nutshell (for a developper used to old DRW API):
- `DRWShadingGroups` are replaced by `Pass<T>::Sub`.
- Contrary to DRWShadingGroups, all commands recorded inside a pass or
sub-pass (even binds / push_constant / uniforms) will be executed in order.
- All memory is managed per object (except for Sub-Pass which are managed
by their parent pass) and not from draw manager pools. So passes "can"
potentially be recorded once and submitted multiple time (but this is
not really encouraged for now). The only implicit link is between resource
lifetime and `ResourceHandles`
- Sub passes can be any level deep.
- IMPORTANT: All state propagate from sub pass to subpass. There is no
state stack concept anymore. Ensure the correct render state is set before
drawing anything using `Pass::state_set()`.
- The drawcalls now needs a `ResourceHandle` instead of an `Object *`.
This is to remove any implicit dependency between `Pass` and `Manager`.
This was a huge problem in old implementation since the manager did not
know what to pull from the object. Now it is explicitly requested by the
engine.
- The pases need to be submitted to a `draw::Manager` instance which can
be retrieved using `DRW_manager_get()` (for now).
Internally:
- All object data are stored in contiguous storage buffers. Removing a lot
of complexity in the pass submission.
- Draw calls are sorted and visibility tested on GPU. Making more modern
culling and better instancing usage possible in the future.
- Unit Tests have been added for regression testing and avoid most API
breakage.
- `draw::View` now contains culling data for all objects in the scene
allowing caching for multiple views.
- Bounding box and sphere final setup is moved to GPU.
- Some global resources locations have been hardcoded to reduce complexity.
What is missing:
- ~~Workaround for lack of gl_BaseInstanceARB.~~ Done
- ~~Object Uniform Attributes.~~ Done (Not in this patch)
- Workaround for hardware supporting a maximum of 8 SSBO.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D15817
This was caused by un-wanted normalization. This is a requirement of
the MikkTspace. The issue is that g_data.N is expected to be normalized
by many other functions and overriden by bump displacement.
Adding a new global variable containing the interpolated normal fixes the
issue AND make it match cycles behavior better (mix between bump and
interpolated normal).
This patch implements the dilate/erode node for the realtime compositor.
Differential Revision: https://developer.blender.org/D15790
Reviewed By: Clement Foucault
Upcoming cryptomatte patch would need access to these defines. So moving
them from film_lib to shader shared. We cannot include the film_lib as
it requires images/textures to be bound that we don't need.
At the same time fixes incorrect casing (`lAYER` => `LAYER`).
- Adding in compatibility paths to support minimum per-vertex strides for vertex formats. OpenGL supports a minimum stride of 1 byte, in Metal, this minimum stride is 4 bytes. Meaing a vertex format must be atleast 4-bytes in size.
- Replacing transform feedback compile-time check to conditional look-up, given TF is supported on macOS with Metal.
- 3D texture size safety check added as a general capability, rather than being in the gl backend only. Also required for Metal.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D14510
Use lowercase rgba channel names which still by-passes lossy nature
of DWA compression and which also keeps external compositing tools
happy.
Thanks Steffen Dünner for testing this patch!
Differential Revision: https://developer.blender.org/D15834
The DWA compression code in OpenEXR has hardcoded rules which decides
which channels are lossy or lossless. There is no control over these
rules via API.
This change makes it so channel names of xyzw is used for cryptomatte
passes in Cycles. This works around the hardcoded rules in the DWA code
making it so lossless compression is used. It is important to use lower
case y channel name as the upper case Y uses lossy compression.
The change in the channel naming also makes it so the write code uses
32bit for the cryptomatte even when saving half-float EXR.
Fixes T96933: Cryptomatte layers saved incorrectly with EXR DWA compression
Fixes T88049: Cryptomatte EXR Output Bit Depth should always be 32bit
Differential Revision: https://developer.blender.org/D15823
The viewport compositor crashes when it is disabled then enabled after
the compositor node tree is edited.
This happens because the compositor engine uses the view_update callback
of the draw engine type to detect changes in the node tree and reset its
state for future evaluation. However, the draw manager only calls the
view_update callback for enabled engines, so the compositor never
receives the needed updates to properly reset its state and then crashes
at draw time.
This patch call the view_update callback for all registered engines
regardless if they are enabled or not, that way, they always receive
the potentially important updated needed to maintain a correct state.
Aside from the compositor engine, this change affects the EEVEE and
Workbench engines because they are the only engines that utilizes this
callback. However, both of them only reset a flag that is checked at
draw time. So the change should have no side effects. For the EEVEE
engine, we just add a null check in case it was not instanced, while
Workbench already have the appropriate null check.
Differential Revision: https://developer.blender.org/D15821
Reviewed By: Clement Foucault
This patch moves material indices from the mesh `MPoly` struct to a
generic integer attribute. The builtin material index was already
exposed in geometry nodes, but this makes it a "proper" attribute
accessible with Python and visible in the "Attributes" panel.
The goals of the refactor are code simplification and memory and
performance improvements, mainly because the attribute doesn't have
to be stored and processed if there are no materials. However, until
4.0, material indices will still be read and written in the old
format, meaning there may be a temporary increase in memory usage.
Further notes:
* Completely removing the `MPoly.mat_nr` after 4.0 may require
changes to DNA or introducing a new `MPoly` type.
* Geometry nodes regression tests didn't look at material indices,
so the change reveals a bug in the realize instances node that I fixed.
* Access to material indices from the RNA `MeshPolygon` type is slower
with this patch. The `material_index` attribute can be used instead.
* Cycles is changed to read from the attribute instead.
* BMesh isn't changed in this patch. Theoretically it could be though,
to save 2 bytes per face when less than two materials are used.
* Eventually we could use a 16 bit integer attribute type instead.
Ref T95967
Differential Revision: https://developer.blender.org/D15675
Issue arises when face areas are really large combined with small UV
areas (report has a mesh ~1.5 km), then precission of shorts is
insufficient.
Now use floats instead.
This also removes this negative signed version of the total area ratio
(since with floats it is no longer used).
Thx @brecht for a lot of hand-holding!
NOTE: this is an alternative to D15805 (and quick tests show this does
not introduce the tiny performance hit as D15805 did).
Maniphest Tasks: T93084
Differential Revision: https://developer.blender.org/D15810
EEVEE-Next passes were rendered to the render result, but didn't appear
in the compositor. Reasoning is that when a render engine has the update
render passes callback registered it would not register any default
render passes. This callback is used to update the Render Layer node.
This patch implements the callback for EEVEE-Next with the render passes
that are already available. In the future the callback should be
extended. Note that AO/SHADOW render passes have been disabled for now
as they need to be converted to color buffers.
This fixes a compilation error in eevee_light_culling_debug shader.
Some compilers complained when accessing the same data twice. Unclear
why. We should investigate that this change doesn't harm the performance
of the shader.
Although the light is a local variable it might clutter available
registers. If so it will harm developers during debugging.
Due to a copy-paste error there was an out of bound read. Some drivers
didn't complain about it, others did. This patch fixes the compilation
error by accessing the array within bounds.
Metaball, curve, text, and surface objects use the geometry component
system to add evaluated mesh object instances to the dependency graph
"for render engine" iterator. Therefore it is unnecessary to process
those object types in these loops-- it would either be redundant work
or a no-op.
With the ultimate goal of simplifying drawing and evaluation,
this patch makes the following changes and removes code:
- Use `Mesh` instead of `DispList` for evaluated basis metaballs.
- Remove all `DispList` drawing code, which is now unused.
- Simplify code that converts evaluated metaballs to meshes.
- Store the evaluated mesh in the evaluated geometry set.
This has the following indirect benefits:
- Evaluated meshes from metaball objects can be used in geometry nodes.
- Renderers can ignore evaluated metaball objects completely
- Cycles rendering no longer has to convert to mesh from `DispList`.
- We get closer to removing `DispList` completely.
- Optimizations to mesh rendering will also apply to metaball objects.
The vertex normals on the evaluated mesh are technically invalid;
the regular calculation wouldn't reproduce them. Metaball objects
don't support modifiers though, so it shouldn't be a problem.
Eventually we can support per-vertex custom normals (T93551).
Differential Revision: https://developer.blender.org/D14593
* Flip the logic to first detect if we are dealing with an unmodified mesh
in editmode. And then if not, detect if we need a mapping or not.
* runtime.is_original is only valid for the bmesh wrapper. Rename it to clarify
that and only check it when the mesh is a bmesh wrapper.
* Remove MR_EXTRACT_MAPPED and instead check only for the existence of the
origindex arrays. Previously it would sometimes access those arrays without
MR_EXTRACT_MAPPED set, which according to a comment means they are invalid.
Differential Revision: https://developer.blender.org/D15676
This fixes missing selection updates in UV editor, both with GPU subdivision
and with the Modified Edges display option for modifiers in general.
It also fixes the UV editor incorrectly showing the cage mesh with deformed
coordinates. These are not yet supported by the UV selection system.
Changes:
* Always read selection state from the editmesh when building batches. The
flags in the evaluated mesh can be outdated as selection bypasses depsgraph
evaluation for performance, and instead may just clear the batches.
* runtime.is_original is only valid for the bmesh wrapper. The check for
building the UV cage should only use that if the mesh is a bmesh wrapper.
* Don't create cage batches for objects whose mesh is in edit mode, but that
are not themselves in edit mode, there is no need.
Differential Revision: https://developer.blender.org/D15658
This change combines the diffuse/specular light passes into a single texture
array, freeing up an image binding for cryptomatte.
When diffuse/specular light pass and/or requested a
texture array will be allocated. Only when specular light is requested 2 images will always be allocated. This increases the
memory overhead when viewing the specular light renderpass in the viewport. For final rendering it is a common scenario that none
or both are requested.
Reviewed By: fclem
Differential Revision: https://developer.blender.org/D15701
* Fixed crash in debug draw code. Apparently this is
only used by PBVH draw?
* Debug draw code can now be forcibly enabled in release
mode (i.e. RelWithDebugInfo) by uncommenting a commented
out #define.
* Fixed colors in debug draw mode.
* PBVH node boxes in debug mode now flash a different color
when they are updated.
This new implementation does all downsampling in a single compute shader
dispatch, removing a lot of complexity from the previous recursive
downsampling.
This is heavilly inspired by the Single-Pass-Downsampler from GPUOpen:
https://github.com/GPUOpen-Effects/FidelityFX-SPD
However I do not implement all the optimization bits as they require
vulkan (GL_KHR_shader_subgroup) and is not as versatile (it is only
for HiZ).
Timers inside renderdoc report ~0.4ms of saving on a 2048*1024 render for
the whole downsampling. Note that the previous implementation only
processed 6 mips where the new one processes 8 mips.
```
EEVEE ~1.0ms
EEVEE-Next ~0.6ms
```
Padding has been bumped to be of 128px for processing 8 mips.
A new debug option has been added (debug value 2) to validate the HiZ.
With libepoxy we can choose between EGL and GLX at runtime, as well as
dynamically open EGL and GLX libraries without linking to them.
This will make it possible to build with Wayland, EGL, GLVND support while
still running on systems that only have X11, GLX and libGL. It also paves
the way for headless rendering through EGL.
libepoxy is a new library dependency, and is included in the precompiled
libraries. GLEW is no longer a dependency, and WITH_SYSTEM_GLEW was removed.
Includes contributions by Brecht Van Lommel, Ray Molenkamp, Campbell Barton
and Sergey Sharybin.
Ref T76428
Differential Revision: https://developer.blender.org/D15291