This change puts all the block size macros in the same common header, so
they can be included in host side code without needing to also include
the kernels that are defined in the device headers that contained these
values.
This change also removes a magic number used to enqueue a kernel, which
happened to agree with the GPU_PARALLEL_SORT_BLOCK_SIZE macro.
Pull Request: https://projects.blender.org/blender/blender/pulls/143646
Generate mouse events from touch devices on Wayland.
The code tracks a single contact point and converts it to mouse moves,
LMB presses & releases. Multi-touch, pinch & swipe aren't yet supported.
Addresses #121449.
Ref !143416
Co-authored-by: Campbell Barton <campbell@blender.org>
`Eavg` can still be 1 for very small roughness, causing division by zero
when computing `Ems`.
A roughness of 1e-5 gives an `Evg` of 0.999998, seems reasonable.
Pull Request: https://projects.blender.org/blender/blender/pulls/143637
Add new "Linear 3D Curves" option in the Curves panel in the render
properties. This renders curves as linear segments rather than smooth
curves, for faster render time at the cost of accuracy.
On NVIDIA Blackwell GPUs, this can give a 6x speedup compared to smooth
curves, due to hardware acceleration. On NVIDIA Ada there is still
a 3x speedup, and CPU and other GPU backends will also render this
faster.
A difference with smooth curves is that these have end caps, as this
was simpler to implement and they are usually helpful anyway.
In the future this functionality will also be used to properly support
the CURVE_TYPE_POLY on the new curves object.
Pull Request: https://projects.blender.org/blender/blender/pulls/139735
This fixes issues when using Embree on mutliple GPUs.
A previous workaround used separate contexts, this one now
lets us keep a single context for all GPUs.
Pull Request: https://projects.blender.org/blender/blender/pulls/143089
Update use_shading, use_camera and the shading system pointers in the same
location, so that when the render is interrupted they are in a consistent state.
The added null pointer checks are not strictly needed, but just in case it
goes out of sync for another reason.
Pull Request: https://projects.blender.org/blender/blender/pulls/143467
`sd->dPdu`, `sd->dPdv`, `sd->du` and `sd->dv` are computed from
`sd->dP` by constructing a local frame, so both results are the same, subject to some numerical differences.
This avoids constructing the local frame again, so might be faster.
Use MTLPatchShaderSource to provide the patch basis shader source on
all Apple platforms. The immediate advantage of this change is ability
to use GPU subdivision on iOS. Another advantage is that it moves us
further away from frameworks which got deprecated by Apple and it might
save us some headache in the future.
Also tweak backend-specific defines to match definitions from OpenSubdiv.
The annoying difference is that OSD_PATCH_BASIS_METAL is defined by the
OpenSubdiv as 1 in the very beginning of the base code, which is not done
for the OSD_PATCH_BASIS_GLSL is not defined by the OpenSubdiv at all.
Ref #143445
---
TODO:
- [X] Check it works correctly on macOS
- [x] Check it works correctly on Linux
Pull Request: https://projects.blender.org/blender/blender/pulls/143462
Part of https://projects.blender.org/blender/blender/pulls/141278
Blend files compatibility:
If a World exists and "Use Nodes" is disabled, we add new nodes to the
existing node tree (or create one if it doesn't) that emulates the
behavior of a world without a node tree. This ensures backward and
forward compatibility.
Python API compatibility:
- `world.use_nodes` was removed from Python API => **Breaking change**
- `world.color` is still being used by Workbench, so it stays there,
although it has no effect anymore when using Cycles or EEVEE.
Python API changes:
Creating a World using `bpy.data.worlds.new()` now creates a World with
an empty (embedded) node tree. This was necessary to enable Python
scripts to add nodes without having to create a node tree (which is
currently not possible, because World node trees are embedded).
Pull Request: https://projects.blender.org/blender/blender/pulls/142342
* Replace G.quiet by CLG_quiet_set/get
* CLOG_INFO_NOCHECK prints are now suppressed when quiet, these were
typically inside a if (!G.quiet) conditional already.
* Change some prints for blend files, color management and rendering to
use CLOG, that were previously using if (!G.quiet) printf().
Pull Request: https://projects.blender.org/blender/blender/pulls/143138
It allows to implement tricks based on a knowledge whether the path
ever cam through a portal or not, and even something more advanced
based on the number of portals.
The main current objective is for strokes shading: stroke shader
uses Ray Portal BSDF to place ray to the center of the stroke and
point it in the direction of the surface it is generated for. This
gives stroke a single color which matches shading of the original
object. For this usecase to work the ray bounced from the original
surface should ignore the strokes, which is now possible by using
Portal Depth input and mixing with the Transparent BSDF. It also
helps to make shading look better when there are multiple stroke
layers.
A solution of using portal depth is chosen over a single flag due
to various factors:
- Last time we've looked into it it was a bit tricky to implement
as a flag due to us running out of bits.
- It feels to be more flexible solution, even though it is a bit
hard to come up with 100% compelling setup for it.
- It needs to be slightly different from the current "Is Foo"
flags, and be more "Is Portal Descendant" or something.
An extra uint16 is added to the state to count the portal depth,
but it is only allocated for scenes that use Ray Portal BSDF.
Portal BSDF still increments Transparent bounce, as it is required
to have some "limiting" factor so that ray does not get infinitely
move to different place of the scene.
Ref #125213
Pull Request: https://projects.blender.org/blender/blender/pulls/143107
When "generated" is required via Attribute Node, it is not available for
World and Point Cloud.
Make OSL match the SVM behavior to use the object coordinates
(see `svm/attribute.h`),
Pull Request: https://projects.blender.org/blender/blender/pulls/143198
Previously with adaptive subdivision this happened to work with the N
attribute, but that was not meant to be undisplaced. This adds a new
undisplaced_N attribute specifically for this purpose.
For backwards compatibility in Blender 4.5, this also keeps N undisplaced.
But that will be changed in 5.0.
Pull Request: https://projects.blender.org/blender/blender/pulls/142090
This commit renames the GHOST Metal graphic context class and related
files / references from `GHOST_ContextCGL` to `GHOST_ContextMTL`. When
the Metal backend was first introduced, this context file contained
both the old OpenGL context (CGL, for macOS Core OpenGL API), and the
newer Metal context. Since #110185 all CGL related code was removed
from the class, making it Metal only, and thus rendering the old class
name outdated and potentially misleading. In addition to the rename,
unused OpenGL related forward declarations and old TODOs were also
removed.
Pull Request: https://projects.blender.org/blender/blender/pulls/142519
Currently MetalRT renders always use extended limits, which is needed to correctly render scenes where the max primitive count can exceed 2^28 or the instance count can exceed 2^24. This patch adopts Metal best practices of only enabling this flag if it is needed.
This PR is similar to #133364, but there are some notable differences:
1) The old PR made an overly optimistic assumption that all the relevant visibility bits could be squeezed into 8 bits. This new PR adopts the same approach that Optix takes of using 8 bits as a primary HW filter, and checking the full 32 bit mask inside the SW intersection handler.
~~2) I moved the scene scanning check from Scene into MetalDevice. This avoids platform specific details leaking into platform agnostic areas.~~
~~3) In live viewport mode, we always use extended limits in case we tip over the threshold.~~
_EDIT:_
2) The limits are scanned in `Scene::update_kernel_features`, and given to the device by a new `set_bvh_limits` method which returns true if the BVH and kernels need to be reloaded.
Pull Request: https://projects.blender.org/blender/blender/pulls/142401
When `delta` is significantly larger than the difference between `tmin`
and `tmax`, the precision of `theta_a` and `theta_b` reduces, resulting
in banding artefacts.
For equiangular sampling, we use uniform sampling to fix this case.
For light tree, we use the equality
atan(a) + atan(b) = atan2(a + b, 1 - a*b).
Pull Request: https://projects.blender.org/blender/blender/pulls/142845
Use Principled BSDF instead of Diffuse BSDF. This is a breaking change but
likely does not affect many scenes significantly. It adds some specularity
when an object does not have a a material assigned.
Fix#142538
Pull Request: https://projects.blender.org/blender/blender/pulls/142703
This PR allows blender to be compiled with OpenXR, but without
OpenGL support.
For investigating a bug we needed to compile Blender without
OpenGL (`WITH_OPENGL_BACKEND=OFF`) but it would fail to
compile OpenXR (`WITH_XR_OPENXR=ON`). This PR tweakes
the preprocess directives to compile the openxr graphics
binding without OpenGL.
Co-authored-by: Jonas Holzman <jonas@holzman.fr>
Pull Request: https://projects.blender.org/blender/blender/pulls/141668
The code of the "oneapi_load_kernels" function before this modification
was loading kernels and compiling them, if needed, for all devices in
the associated GPU context. This makes sense for one GPU execution
scenario, as well as for execution scenario of multi identical GPU,
but in cases where Blender users have several different GPUs in
render, the previous implementation would compile all kernels
for all devices for each device, unnecessarily doing the same
work multiple times. Because of this, I am changing the
implementation so that now compilation happens only for the used
device per used device, ensuring that no unnecessary work is done.
No render performance changes are expected.
This improves the existing scene specialisation mechanism by replacing "kernel_data.kernel_features" with a function constant. It doesn't cause any additional compilation requests, but allows the backend compiler to eliminate more dead code. An additional compiler hint is provided for dead-stripping "volume_stack_enter_exit" which results in slightly faster rendering of non-volumetric scenes.
Pull Request: https://projects.blender.org/blender/blender/pulls/142235
- Cursors in wayland are expected to use pre-multiplied alpha.
Note that cursor generators create straight alpha,
pre multiplier in the wayland backend.
- Invert cursor colors on Wayland since dark cursors are often default,
this could use dark theme settings in the future.
Support showing vector cursors at the appropriate size based on each
monitors DPI.
This solves incorrectly sized cursors with Hi-DPI outputs as well as
supporting outputs with different DPI's.
Instead of passing pixel data, pass an object that can generate cursors,
GHOST/Wayland can then ensure the correct size is used based on the
display the cursor is shown on.
While may sound overly complicated it actually simplifies displaying
cursors on Wayland, especially for multiple monitors at different DPI's.
Showing cursors at higher or lower resolutions than the monitor's DPI
also behaves differently on GNOME & Plasma, resulting in situations
where the cursor would show larger/smaller than expected.
Details:
- Take advantage of the recently added SVG cursors & BLF time cursor
!140990 & !141367.
- Setting bitmap cursors via GHOST_SetCustomCursorShape is now a no-op
for Wayland since it's no longer used. This was supported in an
earlier version of this PR and could be restored if needed.
- While fractional scaling works it relies on the compositor downscaling
the cursors. Ideally they would be rendered at the target size but
this isn't a priority as the difference isn't noticeable.
Ref !141597