* GHOST is an implementation detail, Blender environment variables should
have the BLENDER_ prefix.
* Don't put links in command line help output. We don't do this for other
arguments either, and it's trivial to search for this online.
* Make description more straightforward and line wrap.
* Use CLOG for logging.
Ref #143049
Pull Request: https://projects.blender.org/blender/blender/pulls/144349
The data constructed by this call remains in the 'C-alloc' realm, i.e.
it can be `MEM_dupallocN`'ed, and `MEM_freeN`'ed.
This is intended as a temporary API only, to facilitate transition to
full C++ handling of data in Blender. It's primary target is to allow
pseudo-POD types to use default values for their members. See e.g.
!134531.
Unlike !143827 and !138829, it does not change the current rule (`new`
must be paired with `delete`, and `alloc` must be paired with `free`).
Instead, it defines an explicit and temporary API to allow a very
limited form of construction to happen on C-allocated data, provided
that the type is default-constructible, and remains trivial after
construction.
### Notes
* The new API is purposely as restrictive as possible, trying to
only allow the current known needs (init with default member values).
This can easily be extended if needed.
* To try to stay as close as malloc/calloc behavior as possible, and
avoid the 'zero-initialization' gotcha, it does not use
value-initialization, but instead default-initialization on zero-
initialized memory.
_Ideally it would even not allow any user-defined default constructor,
but this does not seem simple to detect._
Pull Request: https://projects.blender.org/blender/blender/pulls/144141
This PR proposes to add an environment variable for forcing vsync to
be on or off. My primary use case was to disable vsync for forcing
viewport rendering performance tests not to be capped at the display
refresh rate when #142984 is used for removing animation frame rate
limits.
I initially added the environment variable "GHOST_VSYNC_OFF", but
found "GHOST_VSYNC=0/1" to be more easily understandable.
Usage:
- GHOST_VSYNC=0 => Vsync is forced off
- GHOST_VSYNC=1 => Vsync is forced on
Pull Request: https://projects.blender.org/blender/blender/pulls/143049
This fits better with the way normal and displacement maps are typically
combined. Previously there was a mixing of displaced normal and undisplaced
tangent, which was broken behavior.
Additionally, to undisplaced_N and undisplaced_tangent attributes must now
always be used to get undisplaced coordinates. The regular N and tangent
attributes now always include displacement.
Ref #142022
Pull Request: https://projects.blender.org/blender/blender/pulls/143109
Modify shader update so we simplify the graphs first to determine the
kernel features, then load the kernels, and only then update data on the
device. This avoids errors due to mismatched kernels and shaders.
Pull Request: https://projects.blender.org/blender/blender/pulls/144238
Resolve regression in [0] where setting the cursor wasn't checking
if a software cursor was used. This would then show the hardware &
software cursors.
[0]: 232b106e64
Swapchain handling of minimized windows wasn't correct. On some
platforms it still tried to create images with no surface.
This PR will discard swapchains of minimized windows, but still being
able to flush the render graph and free resources.
Pull Request: https://projects.blender.org/blender/blender/pulls/144189
Vulkan 1.0 render passes have been replaced by dynamic rendering in 1.2.
Blender Vulkan backend was implemented with dynamic rendering in mind.
All our supported platforms support dynamic rendering. Render pass support
was added to try to work around an issue with legacy drivers. However these
drivers also fail with render passes.
Using render passes had several limitations (blending and some workbench
features were not supported). As no GPU uses it and it is quite some code
to support it is better to remove it.
Pull Request: https://projects.blender.org/blender/blender/pulls/144149
This PR removes the deferred allocation of swapchain images. A lot of layers
don't support this and will crash when used.
Thanks to Jorn Visser to point out the potential issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/143924
The Principled BSDF has a ton of inputs, and the previous SVM code just always
allocated stack space for all of them. This results in a ton of additional
NODE_VALUE_x SVM nodes, which slow down execution.
However, this is not really needed for two reasons:
- First, many inputs are only used consitionally. For example, if the
subsurface weight is zero, none of the other subsurface inputs are used.
- Many of the inputs have a "usual" value that they will have in most
materials, so if they happen to have that value we can just indicate that
by not allocating space for them.
This is a bit similar to the standard "pack the fixed value and provide
a stack offset if there's a link" pattern, except that the fixed value
is a constant in the code and we allocate a NODE_VALUE_x if a different
fixed value is used.
Therefore, this PR re-implements the parameter packing in a more efficient way:
- If we can determine that a component is disabled, all conditional inputs are
disconnected (to avoid generating upstream nodes).
- If we can determine that a component is disabled, we skip allocating all
conditional inputs on the stack.
- The inputs for which a reasonable "usual" value exists are changed to
respect that, and to only be allocated if they differ.
- param1 and param2 (which are fixed-value-packed as on all BSDF nodes) are
used to store IOR and roughness, which have a decent chance to be fixed
values.
- The parameter packing is more aggressive about using uchar4, which allows
to get rid of two SVM nodes while still storing the same inputs.
The result is a considerable speedup in scenes that make heavy use of the
Principled BSDF:
| Scene | CPU speedup | OptiX speedup |
| --- | --- | --- |
| attic | 5% | 9% |
| bistro | 5% | 8% |
| junkshop | 5% | 10% |
| monster | 3% | 4% |
| spring | 1% | 6% |
Pull Request: https://projects.blender.org/blender/blender/pulls/143910
Ever since the OptiX 8 update in Blender 4.5, the minimum GPU driver
requirements to use OptiX has increased to 535 or newer.
This commit update the minimum GPU driver requirement listed in the UI
to reflect this.
Pull Request: https://projects.blender.org/blender/blender/pulls/143917
This PR adds a new `fresnel_conductor_polarized` function, which calculates reflectance and phase shift (if requested) for both parallel and perpendicular polarized light. This is needed for applying thin film iridescence to conductors (see !141131).
For consistency, this PR also makes `fresnel_conductor` call `fresnel_conductor_polarized` instead of using a fast approximation of the Fresnel equations that is inaccurate at lower n and k values. This will change the output of some Metallic BSDF renders using Physical Conductor and prevent discrepancies when enabling thin film iridescence.
I didn't do any rigorous performance testing, but from timing the functions outside of Blender, `fresnel_conductor_polarized` is significantly slower than the approximation, between 1.5-3x depending on the compiler. This makes sense because it has three square roots and the approximation has none. In some informal tests with metallic_multiggx_physical.blend modified to have more spheres, the new renders took around 1-2% longer on both CPU and GPU.
There are some avoidable inefficiencies in this approach of just calling `fresnel_conductor_polarized`:
- one of the three square roots could be saved since `fresnel_conductor` never needs the phase shift and there are simplifications possible when only calculating the reflectance
- there are several unnecessary multiplications by 1.0 since `fresnel_conductor` uses relative IOR and `fresnel_conductor_polarized` doesn't, though those could get optimized out if inlined
Pull Request: https://projects.blender.org/blender/blender/pulls/143903
This replaces `stack_assign` with `stack_assign_if_linked`, which should save a few SVM nodes for constant parameters.
Running benchmarks (all scenes in the benchmark repo, 3 runs, median value for each) shows 1.0% improvement on CPU and 1.5% on OptiX. Not huge, but fairly (all between -0.2% and 3.0%).
Pull Request: https://projects.blender.org/blender/blender/pulls/143404
This change puts all the block size macros in the same common header, so
they can be included in host side code without needing to also include
the kernels that are defined in the device headers that contained these
values.
This change also removes a magic number used to enqueue a kernel, which
happened to agree with the GPU_PARALLEL_SORT_BLOCK_SIZE macro.
Pull Request: https://projects.blender.org/blender/blender/pulls/143646
Generate mouse events from touch devices on Wayland.
The code tracks a single contact point and converts it to mouse moves,
LMB presses & releases. Multi-touch, pinch & swipe aren't yet supported.
Addresses #121449.
Ref !143416
Co-authored-by: Campbell Barton <campbell@blender.org>
`Eavg` can still be 1 for very small roughness, causing division by zero
when computing `Ems`.
A roughness of 1e-5 gives an `Evg` of 0.999998, seems reasonable.
Pull Request: https://projects.blender.org/blender/blender/pulls/143637
Add new "Linear 3D Curves" option in the Curves panel in the render
properties. This renders curves as linear segments rather than smooth
curves, for faster render time at the cost of accuracy.
On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth
curves, due to hardware acceleration. On NVIDIA Ada there is still
a 3x speedup, and CPU and other GPU backends will also render this
faster.
A difference with smooth curves is that these have end caps, as this
was simpler to implement and they are usually helpful anyway.
In the future this functionality will also be used to properly support
the CURVE_TYPE_POLY on the new curves object.
Pull Request: https://projects.blender.org/blender/blender/pulls/139735
This fixes issues when using Embree on mutliple GPUs.
A previous workaround used separate contexts, this one now
lets us keep a single context for all GPUs.
Pull Request: https://projects.blender.org/blender/blender/pulls/143089
Update use_shading, use_camera and the shading system pointers in the same
location, so that when the render is interrupted they are in a consistent state.
The added null pointer checks are not strictly needed, but just in case it
goes out of sync for another reason.
Pull Request: https://projects.blender.org/blender/blender/pulls/143467
`sd->dPdu`, `sd->dPdv`, `sd->du` and `sd->dv` are computed from
`sd->dP` by constructing a local frame, so both results are the same, subject to some numerical differences.
This avoids constructing the local frame again, so might be faster.
Use MTLPatchShaderSource to provide the patch basis shader source on
all Apple platforms. The immediate advantage of this change is ability
to use GPU subdivision on iOS. Another advantage is that it moves us
further away from frameworks which got deprecated by Apple and it might
save us some headache in the future.
Also tweak backend-specific defines to match definitions from OpenSubdiv.
The annoying difference is that OSD_PATCH_BASIS_METAL is defined by the
OpenSubdiv as 1 in the very beginning of the base code, which is not done
for the OSD_PATCH_BASIS_GLSL is not defined by the OpenSubdiv at all.
Ref #143445
---
TODO:
- [X] Check it works correctly on macOS
- [x] Check it works correctly on Linux
Pull Request: https://projects.blender.org/blender/blender/pulls/143462
Part of https://projects.blender.org/blender/blender/pulls/141278
Blend files compatibility:
If a World exists and "Use Nodes" is disabled, we add new nodes to the
existing node tree (or create one if it doesn't) that emulates the
behavior of a world without a node tree. This ensures backward and
forward compatibility.
Python API compatibility:
- `world.use_nodes` was removed from Python API => **Breaking change**
- `world.color` is still being used by Workbench, so it stays there,
although it has no effect anymore when using Cycles or EEVEE.
Python API changes:
Creating a World using `bpy.data.worlds.new()` now creates a World with
an empty (embedded) node tree. This was necessary to enable Python
scripts to add nodes without having to create a node tree (which is
currently not possible, because World node trees are embedded).
Pull Request: https://projects.blender.org/blender/blender/pulls/142342
* Replace G.quiet by CLG_quiet_set/get
* CLOG_INFO_NOCHECK prints are now suppressed when quiet, these were
typically inside a if (!G.quiet) conditional already.
* Change some prints for blend files, color management and rendering to
use CLOG, that were previously using if (!G.quiet) printf().
Pull Request: https://projects.blender.org/blender/blender/pulls/143138