While building the AVX kernel, util_avxf.h/avxb.h were using some AVX2 intrinsics,
these were never called, so it wasn't a run-time issue, but the intrinsics headers
on centos excluded the AVX2 prototypes when building the AVX kernel causing build errors.
This commit cleans up the improper usage of the AVX2 intrinsics and provides AVX
fallback implementations for future use.
Differential Revision: https://developer.blender.org/D3696
This isn't really possible to do the shuffle which was attempted to do.
While it's possible to achieve expected behavior, the function needs to
be rewritten. Since it's not used anyway, it's simpler to remove it for
now.
The assembler template was backing up and restoring ebx, which is
fair enough. However, this did not prevent compiler for putting
result variables to ebx. This was causing data corruption.
In order to prevent this easiest solution is to list ebx in clobbers
for the assembly.
This is an initial implementation of BVH8 optimization structure
and packated triangle intersection. The aim is to get faster ray
to scene intersection checks.
Scene BVH4 BVH8
barbershop_interior 10:24.94 10:10.74
bmw27 02:41.25 02:38.83
classroom 08:16.49 07:56.15
fishy_cat 04:24.56 04:17.29
koro 06:03.06 06:01.45
pavillon_barcelona 09:21.26 09:02.98
victor 23:39.65 22:53.71
As memory goes, peak usage raises by about 4.7% in a complex
scenes.
Note that BVH8 is disabled when using OSL, this is because OSL
kernel does not get per-microarchitecture optimizations and
hence always considers BVH3 is used.
Original BVH8 patch from Anton Gavrikov.
Batched triangles intersection from Victoria Zhislina.
Extra work and tests and fixes from Maxym Dmytrychenko.
This is in preparation of upgrading our library dependencies, some of which
need C++11. We already use C++11 in blender2.8 and for Windows and macOS, so
this just affects Linux.
On many distributions this will not require any changes, on some
install_deps.sh will need to be run again to rebuild libraries.
Differential Revision: https://developer.blender.org/D3568
This is a physically-based, easy-to-use shader for rendering hair and fur,
with controls for melanin, roughness and randomization.
Based on the paper "A Practical and Controllable Hair and Fur Model for
Production Path Tracing".
Implemented by Leonardo E. Segovia and Lukas Stockner, part of Google
Summer of Code 2018.
Features to get the 2nd, 3rd, 4th closest point instead of the closest, and
various distance metrics. No viewport/Eevee support yet.
Patch by Michel Anders, Charlie Jolly and Brecht Van Lommel.
Differential Revision: https://developer.blender.org/D3503
Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed.
Reviewers: #cycles, brecht, sergey
Reviewed By: #cycles, brecht, sergey
Subscribers: sergey, dingto, #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D3523
The latest clang compiler (at least the one in Xcode 9.4.1) warns about the register keyword and macro expansions using defined().
Since these warnings come from third party code, we can't address them directly in Blender. Silencing them via #pramgas will
at least keep the warnings during a build down to the ones that are relevant to Blender code.
I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3478
There is one legit place in the code where memcpy was used as an
optimization trick. Was needed for older version of GCC, but now
it should be re-evaluated and checked if it still helps to have
that trick.
In other places it's somewhat lazy programming to zero out all
object members. That is absolutely unsafe, at the moment when
less trivial class is used as a member in that object things
will break.
Other cases were using memcpy into an object which comes from
an external library. We don't control that object, and we can
not guarantee it will always be safe for such memory tricks
and debugging bugs caused by such low level access is far fun.
Ideally we need to use more proper C++, but needs to be done with
big care, including benchmarks of each change, For now do
annoying but simple cast to void*.