Implement invalid sample points filling. This improves invalid regions
but introduces light leaking.
Grid sample points are considered invalid if their ratio of
front-face ray hits is under the given threshold. This is a post-processing
pass on the baked lighting that fills dark regions produced by
invalid sample locations (e.g. inside walls) with data from valid
neighbor samples.
Two new parameters are added:
- Dilation Threshold: Validity threshold under which grid samples are
considered invalid. Invalid samples will gather valid lighting data
from valid neighbors inside the dilation radius.
- Dilation Radius: Radius of the dilation process. Expressed in grid
sample distance.
The validity of each point is progressively refined just like the
lighting data during the baking process.
The dilation process is implemented as a post-processing pass during
the loading of the grid data into the irradiance atlas. This allows
live tweaking of the dilation parameters.
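As a rough illustration of the dilation described above, here is a minimal sketch on a 1D grid. The `Sample` struct, the `dilate` function and the validity-weighted average are assumptions made for this example; the actual pass runs on the 3D irradiance grid and may weight neighbors differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

/* Hypothetical sample layout, for illustration only. */
struct Sample {
  float irradiance;
  float validity; /* Ratio of front-face ray hits, in [0, 1]. */
};

/* Fill samples whose validity is under `threshold` with the validity-weighted
 * average of the valid samples within `radius` grid cells. */
std::vector<Sample> dilate(const std::vector<Sample> &src, float threshold, int radius)
{
  assert(radius >= 0);
  /* Read from `src` and write to `dst` so filled values do not propagate. */
  std::vector<Sample> dst = src;
  const int size = static_cast<int>(src.size());
  for (int i = 0; i < size; i++) {
    if (src[i].validity >= threshold) {
      continue; /* Valid samples are left untouched. */
    }
    float sum = 0.0f;
    float weight = 0.0f;
    const int first = std::max(0, i - radius);
    const int last = std::min(size - 1, i + radius);
    for (int j = first; j <= last; j++) {
      if (src[j].validity >= threshold) {
        sum += src[j].irradiance * src[j].validity;
        weight += src[j].validity;
      }
    }
    if (weight > 0.0f) {
      dst[i].irradiance = sum / weight;
    }
  }
  return dst;
}
```

Because the source data is never modified, the pass can be run again with a different threshold or radius, which is what makes live tweaking possible.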
Pull Request: https://projects.blender.org/blender/blender/pulls/110386
This is a full rewrite of the raytracing denoise pipeline. It uses the
same principle as before but now uses compute shaders for every stage
and a tile-based approach. More aggressive filtering is needed since we
are moving towards having no prefiltered screen radiance buffer. Thus
we introduce a temporal denoise stage and a bilateral denoise stage.
These are optional and can be disabled.
Note that this patch does not include any tracing part and only samples
the reflection probes. It is focused on denoising only. Tracing will
come in another PR.
The motivation for this is that having hardware raytracing support
means we can't prefilter the radiance in screen space, so we need
better denoising. This also means we can have better surface
appearance, with support for BxDF models other than GGX. GGX support
is improved as well.
Technically, the new denoising fixes some implementation mistakes of the
old pipeline. It separates all 3 stages (spatial, temporal,
bilateral) and uses random sampling for all stages, hoping to create
a noisy enough (but still stable) output so that the TAA soaks up the
remaining noise. However, that's not always the case. Depending on the
nature of the scene, the input can be very high frequency and might
create lots of flickering. That's why another solution needs to be found
for higher roughness materials, as denoising them becomes expensive
and low quality.
Pull Request: https://projects.blender.org/blender/blender/pulls/110117
There were two separate issues occurring here:
First, with some other recent changes to curve handles, an early exit was
added for when the handles should not display. However, this early exit
was not discarding geometry in the Metal implementation but leaving
values undefined, resulting in random geometry flickering on screen.
This may not previously have happened in certain modes if the vertex
buffers were zero-initialised up-front (which only happens with certain
debug flags).
Second, curve handle geometry generation would render incorrectly when
outputting triangleStrips if the transparent border was disabled.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/110719
Adds support for generating curve primitives while avoiding the
use of primitive restarts. This maximises geometry performance
when using Metal.
Also ensure that the existing index buffer optimization path is
skipped for indirect draw calls where counts are not known at
submission time.
Authored by Apple: Michael Parkin-White
Pull Request: https://projects.blender.org/blender/blender/pulls/109972
Use `parallel_for` instead of the C threading API, extract some
constant checks from hot loops, and use `EnumerableThreadSpecific`
for thread-local storage.
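The pattern can be sketched with the standard library alone. `parallel_chunks` below is a hypothetical stand-in for a parallel loop, not Blender's `parallel_for`, and the per-thread slots stand in for `EnumerableThreadSpecific`; the `enabled` flag is an assumed example of a constant check hoisted out of the hot loop.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

/* Hypothetical helper: splits [0, size) into one chunk per thread and calls
 * `fn(thread_index, begin, end)` for each chunk. */
template<typename Fn> void parallel_chunks(int size, int thread_count, Fn fn)
{
  assert(thread_count > 0);
  const int chunk = (size + thread_count - 1) / thread_count;
  std::vector<std::thread> threads;
  for (int t = 0; t < thread_count; t++) {
    const int begin = std::min(size, t * chunk);
    const int end = std::min(size, begin + chunk);
    threads.emplace_back([=]() { fn(t, begin, end); });
  }
  for (std::thread &thread : threads) {
    thread.join();
  }
}

/* Count the values above `limit`, using one accumulator per thread. */
int count_above(const std::vector<float> &values, float limit, bool enabled)
{
  /* Constant check done once, instead of once per element. */
  if (!enabled) {
    return 0;
  }
  const int thread_count = 4;
  /* One slot per thread, so the loop body needs no locking. */
  std::vector<int> local_counts(thread_count, 0);
  parallel_chunks(int(values.size()), thread_count, [&](int t, int begin, int end) {
    for (int i = begin; i < end; i++) {
      if (values[i] > limit) {
        local_counts[t]++;
      }
    }
  });
  /* Combine the per-thread results after all threads have joined. */
  int total = 0;
  for (const int count : local_counts) {
    total += count;
  }
  return total;
}
```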
Pull Request: https://projects.blender.org/blender/blender/pulls/105940
Historically, the OCIO based color management implementation in Blender
had exceptions to treat specific configurations differently. This was for
compatibility with the legacy "No color management" option.
With time and more development in the area there are better ways of
achieving this goal, if needed.
This commit removes the name-based exception, which also resolves confusion
about why certain similar configurations (from the OCIO standpoint) give
different results. It also provides a clean slate for upcoming
additions to the OCIO configuration, such as AgX.
This is a quite simple and technical change which constant-folds the check for
whether scene color management is enabled to the "true" value.
Ref #110685
Pull Request: https://projects.blender.org/blender/blender/pulls/110580
Reduce overhead of copying attribute data into GPU buffers when the
PBVH is active. The existing lambda with a FunctionRef callback had
a significant overhead. While that was reduced by 25917f0165
already, even making the `foreach_faces` lambda into a template gave
significant overhead compared to simpler loops. Instead, separate
value conversion and iteration over visible triangles in a way that the
compiler is able to optimize more easily.
According to the GPU module, it's also better to use raw data access
than `GPU_vertbuf_raw_step`, since the data format strides aren't
meant to vary by platform, and the runtime stride can have a
noticeable performance impact.
Also avoid recalculating face normals, since they're already used to
calculate vertex normals anyway (since ac02f94caf).
I tested the runtime of the initial data upload after entering sculpt
mode with a 16 million vertex mesh. It took 1350 ms before and
680 ms after, which is almost a 2x improvement. In my tests, the
performance improvement was only observable for the initial data
upload, though in theory the change applies more generally.
It's possible that a similar optimization could be applied to multires
or dynamic topology sculpting, but that can be looked at later too.
Pull Request: https://projects.blender.org/blender/blender/pulls/110621
The processes of calculating the caches for loose edges and loose vertices
and extracting their indices are independent and both single threaded.
If the CPU isn't doing anything else, using two threads can halve the
total time for both. For example, this saves 40-50ms opening a file
with a 16 million face mesh.
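A minimal sketch of running two independent single-threaded tasks concurrently, here with `std::async` rather than Blender's own threading utilities. The `Edge` struct and the function names are hypothetical, and the loose-geometry definitions are simplified for the example.

```cpp
#include <cassert>
#include <future>
#include <vector>

/* Hypothetical edge type, for illustration only. */
struct Edge {
  int v1, v2;
};

struct LooseGeometry {
  std::vector<int> verts;
  std::vector<int> edges;
};

/* Indices in [0, items_num) that are not marked as used. */
static std::vector<int> unused_indices(const std::vector<bool> &used)
{
  std::vector<int> result;
  for (int i = 0; i < int(used.size()); i++) {
    if (!used[i]) {
      result.push_back(i);
    }
  }
  return result;
}

/* Vertices that are not referenced by any edge. */
std::vector<int> find_loose_verts(int verts_num, const std::vector<Edge> &edges)
{
  std::vector<bool> used(verts_num, false);
  for (const Edge &edge : edges) {
    assert(edge.v1 < verts_num && edge.v2 < verts_num);
    used[edge.v1] = true;
    used[edge.v2] = true;
  }
  return unused_indices(used);
}

/* Edges that are not referenced by any face. */
std::vector<int> find_loose_edges(int edges_num, const std::vector<int> &face_edges)
{
  std::vector<bool> used(edges_num, false);
  for (const int edge : face_edges) {
    used[edge] = true;
  }
  return unused_indices(used);
}

LooseGeometry extract_loose(int verts_num,
                            const std::vector<Edge> &edges,
                            const std::vector<int> &face_edges)
{
  /* The two tasks share only read-only inputs, so they can run concurrently. */
  std::future<std::vector<int>> loose_verts = std::async(
      std::launch::async, [&]() { return find_loose_verts(verts_num, edges); });
  LooseGeometry result;
  result.edges = find_loose_edges(int(edges.size()), face_edges);
  result.verts = loose_verts.get();
  return result;
}
```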
Use StringRef where possible to avoid copying strings, avoid
redundant string returns, and use std::string for attribute
request names now that all the relevant code is C++.
The issue was caused by the recent C++ transition: the header file is
shared between CPU and GPU. Metal defines `__cplusplus`, so checking for
it is not enough to decide whether to use C linkage, which is not valid
syntax for shaders.
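A guard of the following shape avoids the problem. This is only a sketch: it assumes a `GPU_SHADER` macro that is defined when the header is compiled as shader code, and the `SHARED_EXTERN_C_*` macros and `SharedData` struct are made up for the example.

```cpp
/* Assumption: `GPU_SHADER` is defined only when compiling shader code.
 * Checking `__cplusplus` alone would also match Metal shaders. */
#if defined(__cplusplus) && !defined(GPU_SHADER)
#  define SHARED_EXTERN_C_BEGIN extern "C" {
#  define SHARED_EXTERN_C_END }
#else
#  define SHARED_EXTERN_C_BEGIN
#  define SHARED_EXTERN_C_END
#endif

SHARED_EXTERN_C_BEGIN

/* Hypothetical struct shared between CPU and GPU code. */
struct SharedData {
  float value;
  int index;
};

SHARED_EXTERN_C_END
```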
Pull Request: https://projects.blender.org/blender/blender/pulls/110570
At least on GCC on Linux, it appears std::function has noticeable
overhead compared to blender::FunctionRef. That makes some
sense, as the latter generally handles less, and the performance
difference is mentioned in the function ref header as well.
To test performance, I measured the timing of the first data
upload (`BKE_pbvh_draw_cb`) after entering sculpt mode. For
meshes, I observed a 30% improvement, from 1.7s to 1.3s.
For multires, I observed a change from 290ms to 263ms.
The change should apply to regular draw updates while sculpting,
but that's harder to measure.
This is also cleaner semantically, since the callbacks aren't meant
to own any data, they are just lambdas that capture by reference.
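To illustrate why a non-owning reference can be cheaper than `std::function`, here is a minimal sketch of the idea. `FunctionRefSketch` is a simplified stand-in written for this example, not `blender::FunctionRef`; it only accepts lvalue functors and lambdas, and the caller must keep them alive.

```cpp
#include <cassert>
#include <utility>

template<typename Signature> class FunctionRefSketch;

/* Non-owning reference to a callable: one object pointer plus one function
 * pointer. No allocation and no copy of the callable. */
template<typename Ret, typename... Args> class FunctionRefSketch<Ret(Args...)> {
  void *callable_ = nullptr;
  Ret (*invoke_)(void *, Args...) = nullptr;

  /* Restores the erased type and calls the referenced callable. */
  template<typename Callable> static Ret invoke(void *callable, Args... args)
  {
    return (*static_cast<Callable *>(callable))(std::forward<Args>(args)...);
  }

 public:
  template<typename Callable>
  FunctionRefSketch(Callable &callable)
      : callable_(const_cast<void *>(static_cast<const void *>(&callable))),
        invoke_(&invoke<Callable>)
  {
  }

  Ret operator()(Args... args) const
  {
    assert(invoke_ != nullptr);
    return invoke_(callable_, std::forward<Args>(args)...);
  }
};

/* The callback is only borrowed for the duration of the call. */
void for_each_index(int count, FunctionRefSketch<void(int)> fn)
{
  for (int i = 0; i < count; i++) {
    fn(i);
  }
}

int sum_indices(int count)
{
  int sum = 0;
  auto add = [&](int i) { sum += i; }; /* Captures by reference, owns nothing. */
  for_each_index(count, add);
  return sum;
}
```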
One thing to point out is that `PBVH::nodes` is now stored in a `Vector`
which replaces the manual amortized growth. That requires explicitly
setting the defaults of PBVHNode fields for default initialization.
Similar to f0b53777c8
Add an API for armature layer access. Instead of accessing `arm->layer`
and friends directly, the code now uses this API. This will make things
easier to replace by bone collections in the future.
The functions are named "bonecoll" (short for "bone collection"), as
that's the soon-to-be-introduced replacement for armature layers. This
API is the first step towards that replacement, and should help to
reduce the changes necessary when functional changes are committed.
This also creates a new module `source/blender/animrig` for Animation &
Rigging code. This will, for example, house the bone collection system
in the near future.
There is a bunch of code currently spread across blenkernel and editors
in a rather ad-hoc way; it is intended that at some point that code gets
moved into `animrig` as well (or at least the subset of that code where
such a move makes sense; brain still required).
Ref: #108941
No functional changes.
In the Armature drawing code, split up `get_pchan_color()` into three
separate functions. It was basically one big `switch` with three
`case`s, and there were three calls of the function, each with its own
hard-coded parameter value, one for each `case`.
This now also makes it clear that two of those functions always write to
their return parameter, and thus copying a default color 'just in case'
is no longer necessary, reducing the parameter counts even more.
No functional changes.
Add a `UnifiedBonePtr::constflag()` function to grab the `constflag` from
the bone, so that it doesn't have to be passed as a separate parameter
to every drawing-related function.
No functional changes.
- Introduce `UnifiedBonePtr` to avoid having to pass `(EditBone *eBone,
bPoseChannel *pchan)` everywhere.
- Introduce `eArmatureDrawMode` and store that on the
`ArmatureDrawContext`, to avoid having to pass `bArmature *arm` and
then doing `arm->flag & ARM_POSEMODE` everywhere.
- Use the `eBone_Flag` type instead of `int`.
- Deprecate the `ARM_POSEMODE` armature flag. It is no longer necessary,
and also it was changing DNA data from the draw functions. The flag
was basically purely runtime-only, to pass some information to
lower-level drawing code, yet it was stored in DNA. It has been
replaced by the `eArmatureDrawMode` on the context.
Note that some comparisons `eBone != nullptr` (often using the implicit
conversion of pointer to boolean) have been replaced by a comparison to
`ctx->draw_mode`. This is used in cases where the pointer comparison was
actually indicative of the draw mode, and to help get the `else if
(draw_mode == ARM_DRAW_MODE_POSE)` symmetrical.
Disclaimer: this `UnifiedBonePtr` can probably be used in many other
places in Blender as well. We might move it somewhere else in the
future, but to keep things simple I just want to see how it behaves
locally first.
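The idea behind such a wrapper can be sketched as follows. The structs are hypothetical stand-ins for the DNA types, and `UnifiedBonePtrSketch` only shows the general shape; the member names and the value returned for edit bones are assumptions, not the actual class.

```cpp
#include <cassert>

/* Hypothetical stand-ins for the real DNA structs. */
struct EditBone {
  int flag;
};
struct Bone {
  int flag;
};
struct bPoseChannel {
  Bone *bone;
  int constflag;
};

/* Wraps either an edit bone or a pose channel, so drawing code takes one
 * parameter instead of an `(eBone, pchan)` pair. */
class UnifiedBonePtrSketch {
  EditBone *eBone_ = nullptr;
  bPoseChannel *pchan_ = nullptr;

 public:
  UnifiedBonePtrSketch(EditBone *eBone) : eBone_(eBone) {}
  UnifiedBonePtrSketch(bPoseChannel *pchan) : pchan_(pchan) {}

  bool is_editbone() const
  {
    return eBone_ != nullptr;
  }
  bool is_posebone() const
  {
    return pchan_ != nullptr;
  }

  /* Bone flags, read from whichever pointer is set. */
  int flag() const
  {
    assert(is_editbone() || is_posebone());
    return eBone_ ? eBone_->flag : pchan_->bone->flag;
  }

  /* Constraint flags only exist on pose channels (assumed 0 for edit bones). */
  int constflag() const
  {
    return pchan_ ? pchan_->constflag : 0;
  }
};
```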
Pull Request: https://projects.blender.org/blender/blender/pulls/110424
Currently, while calculating face corner normals, Blender retrieves
custom normal data with write access. When the custom normals in a
single smooth corner fan don't match, they are reset to the average
value.
This behavior is very old, but it comes from when Blender didn't have a
strong idea of const correctness. Indeed, modifying custom normal data
while calculating normals isn't threadsafe, which is important because
normals are calculated for viewport drawing, for example. And in the
future, properly caching face corner normals (see #93551) will require
the ability to calculate normals on a properly const mesh.
The fix is to still use the average of custom normals in a fan, but
not write that back to the custom data array. In my testing the results
are the same. Setting custom normals still fills the same value for all
corners in a fan.
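The const-correct version of the averaging can be sketched like this. The `float3` alias, the function name and the plain vector storage are assumptions for the example; the real custom normal data uses a different encoding.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using float3 = std::array<float, 3>;

/* Normalized average of the custom normals of all corners in one smooth fan.
 * The input is const: the average is returned, never written back. */
float3 average_fan_normal(const std::vector<float3> &custom_normals,
                          const std::vector<int> &fan_corners)
{
  assert(!fan_corners.empty());
  float3 sum = {0.0f, 0.0f, 0.0f};
  for (const int corner : fan_corners) {
    for (int k = 0; k < 3; k++) {
      sum[k] += custom_normals[corner][k];
    }
  }
  const float length = std::sqrt(sum[0] * sum[0] + sum[1] * sum[1] + sum[2] * sum[2]);
  if (length > 0.0f) {
    for (int k = 0; k < 3; k++) {
      sum[k] /= length;
    }
  }
  return sum;
}
```

Since nothing is written to shared data, several threads can evaluate fans of the same mesh at once.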
Pull Request: https://projects.blender.org/blender/blender/pulls/110478
Implements the rest of #101689, after 5e9ea9243b.
- `vdata` -> `vert_data`
- `edata` -> `edge_data`
- `pdata` -> `face_data`
- `ldata` -> `loop_data`
A deeper rename of `loop` to `corner` will be proposed as a next
step, and renaming `totvert` and `totedge` can be done separately.
Pull Request: https://projects.blender.org/blender/blender/pulls/110432
Utility functions make accessing the next and previous corner of a face
more obvious, and range based for loops make iterating over corners
or vertices in a face simpler too.
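A small sketch of what such utilities can look like. `CornerRange` is a hypothetical stand-in for a contiguous index range, and the function names are chosen for the example.

```cpp
#include <cassert>

/* Hypothetical contiguous range of corner indices belonging to one face. */
struct CornerRange {
  int start;
  int size;

  int first() const
  {
    return start;
  }
  int last() const
  {
    return start + size - 1;
  }

  /* Minimal iterator so the range works with range-based for loops. */
  struct Iterator {
    int value;
    int operator*() const
    {
      return value;
    }
    Iterator &operator++()
    {
      value++;
      return *this;
    }
    bool operator!=(const Iterator &other) const
    {
      return value != other.value;
    }
  };
  Iterator begin() const
  {
    return {start};
  }
  Iterator end() const
  {
    return {start + size};
  }
};

/* Next corner in the face, wrapping around at the end. */
inline int face_corner_next(const CornerRange face, const int corner)
{
  assert(corner >= face.first() && corner <= face.last());
  return corner == face.last() ? face.first() : corner + 1;
}

/* Previous corner in the face, wrapping around at the start. */
inline int face_corner_prev(const CornerRange face, const int corner)
{
  assert(corner >= face.first() && corner <= face.last());
  return corner == face.first() ? face.last() : corner - 1;
}

/* Example use: visit every corner together with its successor. */
inline int sum_next_corners(const CornerRange face)
{
  int sum = 0;
  for (const int corner : face) {
    sum += face_corner_next(face, corner);
  }
  return sum;
}
```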