griefith/test

Author	SHA1	Message	Date
Hallam Roberts	a501a2dbff	Images: add mirror extension type This adds a new mirror image extension type for shaders and geometry nodes (next to the existing repeat, extend and clip options). See D16432 for a more detailed explanation of `wrap_mirror`. This also adds a new sampler flag `GPU_SAMPLER_MIRROR_REPEAT`. It acts as a modifier to `GPU_SAMPLER_REPEAT`, so any `REPEAT` flag must be set for the `MIRROR` flag to have an effect. Differential Revision: https://developer.blender.org/D16432	2022-12-14 19:27:29 +01:00
Leszek Godlewski	07d3a3962a	Fix Cycles build in VS2022, use explicit two's complement in find_first_set() Compiling Cycles in Visual Studio 2022 yields the error: C4146: unary minus operator applied to unsigned type, result still unsigned Replacing it with explicit two's complement achieves the same result as signed negation but avoids the error. Differential Revision: https://developer.blender.org/D16616	2022-12-07 18:34:57 +01:00
Thomas Dinges	3124241256	Fix Cycles SSE4 define for fast math rint function. Differential Revision: https://developer.blender.org/D16708	2022-12-06 19:06:43 +01:00
Brecht Van Lommel	16b6116b9d	Fix Cycles light tree render errors on Windows Due to a mistake in the popcount implementation. Thanks to Weizhen for help figuring this out.	2022-12-06 16:52:15 +01:00
Sergey Sharybin	c5e71cebaa	Cycles: Remove OpenGL header It is not really used by any of the sources, including the standalone app. Since we are moving to more backend-independent drawing, it makes sense to remove a header which was specific to how Blender integrates Cycles into the viewport. Some cleanup in the CMake files is probably possible, but there is some inter-dependency with USD. Differential Revision: https://developer.blender.org/D16681	2022-12-02 17:19:00 +01:00
Weizhen Huang	e028662f78	Cycles: store axis and length of an area light instead of their product	2022-12-02 15:23:09 +01:00
Michael Jones	2c596319a4	Cycles: Cache only up to 5 kernels of each type on Metal This patch adapts D14754 for the Metal backend. Kernels of the same type are already organised into subdirectories which simplifies type matching. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16469	2022-11-11 18:10:29 +00:00
Patrick Mours	e6b38deb9d	Cycles: Add basic support for using OSL with OptiX This patch generalizes the OSL support in Cycles to include GPU device types and adds an implementation for that in the OptiX device. There are some caveats still, including simplified texturing due to lack of OIIO on the GPU and a few missing OSL intrinsics. Note that this is incomplete and missing an update to the OSL library before being enabled! The implementation is already committed now to simplify further development. Maniphest Tasks: T101222 Differential Revision: https://developer.blender.org/D15902	2022-11-09 15:30:21 +01:00
Brecht Van Lommel	e1b3d91127	Refactor: replace Cycles sse/avx types by vectorized float4/int4/float8/int8 The distinction existed for legacy reasons, to easily port Embree intersection code without affecting the main vector types. However we are now using SIMD for these types as well, so there is no good reason to keep the distinction. Also more consistently pass these vector types by value in inline functions. Previously it was partially changed for functions used by Metal to avoid having to add address space qualifiers; it is simpler to do it everywhere. Also removes function declarations for vector math headers, which serve no real purpose. Differential Revision: https://developer.blender.org/D16146	2022-11-08 12:28:40 +01:00
Sergey Sharybin	74c293863d	Cycles: Remove use of sprintf() in MD5 code The new Xcode declares the `sprintf()` function deprecated and suggests using `snprintf()` as a safer alternative. This change actually moves away from any formatted printing and uses inlined byte-to-hex-string conversion which is also safe and is (unmeasurably) faster. Differential Revision: https://developer.blender.org/D16378	2022-11-03 15:10:37 +01:00
Xavier Hallade	4b14b33ea8	Cycles: use packed float3 back for oneAPI This fixes a 15% performance regression silently introduced by `79ab76e156` that aligned the compact float3 on 16 bytes for oneAPI. Current change is minimalist, there are further cleanup opportunities such as removing packed_float3 definition for oneAPI but for some reason, it cuts the recovered speedup in half, so we're starting with this small fix for now. Reviewed by: brecht Differential Revision: https://developer.blender.org/D16340	2022-10-26 10:53:23 +02:00
Brecht Van Lommel	7da85ea35a	Fix Cycles build error on 32bit x86	2022-10-25 16:56:35 +02:00
Lukas Stockner	6ad04a031c	Cycles: Fix floor intrinsic for ARM Neon	2022-10-17 01:13:43 +02:00
Lukas Stockner	0c50f9c4aa	Fix T98672: Noise texture shows incorrect behaviour for large scales This was a floating point precision issue - or, to be more precise, an issue with how Cycles split floats into the integer and fractional parts for Perlin noise. For coordinates below -2^24, the integer could be wrong, leading to the fractional part being outside of 0-1 range, which breaks all sorts of other things. 2^24 sounds like a lot, but due to how the detail octaves work, it's not that hard to reach when combined with a large scale. Since this code is originally based on OSL, I checked if they changed it in the meantime, and sure enough, there's a fix for it: https://github.com/OpenImageIO/oiio/commit/5c9dc68391e9 So, this basically just ports over that change to Cycles. The original code mentions being faster, but as pointed out in the linked commit, the performance impact is actually irrelevant. I also checked in a simple scene with eight Noise textures at detail 15 (with >90% of render time being spent on the noise), and the render time went from 13.06sec to 13.05sec. So, yeah, no issue.	2022-10-16 02:34:10 +02:00
Brecht Van Lommel	d20be55c1a	Cleanup: in Cycles force inline transform_inverse_impl We expect this to always happen. Ref T100891	2022-10-03 17:58:34 +02:00
Campbell Barton	ea2c41c730	Cleanup: spelling in comments Also replace "dm" for evaluated mesh in some comments.	2022-10-03 11:03:46 +11:00
Sebastian Herhoz	75a6d3abf7	Cycles: add Path Guiding on CPU through Intel OpenPGL This adds path guiding features into Cycles by integrating Intel's Open Path Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the render properties. This feature helps reduce noise in scenes where finding a path to light is difficult for regular path tracing. The current implementation supports guiding directional sampling decisions on surfaces, when the material contains at least one diffuse component, and in volumes with isotropic and anisotropic Henyey-Greenstein phase functions. On surfaces, the guided sampling decision is proportional to the product of the incident radiance and the normal-oriented cosine lobe and in volumes it is proportional to the product of the incident radiance and the phase function. The incident radiance field of a scene is learned and updated during rendering after each per-frame rendering iteration/progression. At the moment, path guiding is only supported by the CPU backend. Support for GPU backends will be added in future versions of OpenPGL. Ref T92571 Differential Revision: https://developer.blender.org/D15286	2022-09-27 15:56:32 +02:00
Brecht Van Lommel	fd1bc90679	Cycles: sync changes from standalone repository * Windows build fixes * Workaround for Hydra + OpenColorIO link issue * Bump version	2022-09-18 17:34:23 +02:00
Lukas Stockner	6951e8890a	Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practice, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. 
- Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589	2022-09-07 00:35:44 +02:00
Nathan Vegdahl	50df9caef0	Cycles: improve Progressive Multi-Jittered sampling Fix two issues in the previous implementation: * Only power-of-two prefixes were progressively stratified, not suffixes. This resulted in unnecessarily increased noise when using non-power-of-two sample counts. * In order to try to get away with just a single sample pattern, the code used a combination of sample index shuffling and Cranley-Patterson rotation. Index shuffling is normally fine, but due to the sample patterns themselves not being quite right (as described above) this actually resulted in additional increased noise. Cranley-Patterson, on the other hand, always increases noise with randomized (t,s) nets like PMJ02, and should be avoided with these kinds of sequences. Addressed with the following changes: * Replace the sample pattern generation code with a much simpler algorithm recently published in the paper "Stochastic Generation of (t, s) Sample Sequences". This new implementation is easier to verify, produces fully progressively stratified PMJ02, and is far faster than the previous code, being O(N) in the number of samples generated. * It keeps the sample index shuffling, which works correctly now due to the improved sample patterns. But it now uses a newer high-quality hash instead of the original Laine-Karras hash. * The scrambling distance feature cannot (to my knowledge) be implemented with any decorrelation strategy other than Cranley-Patterson, so Cranley-Patterson is still used when that feature is enabled. But it is now disabled otherwise, since it increases noise. * In place of Cranley-Patterson, multiple independent patterns are generated and randomly chosen for different pixels and dimensions as described in the original PMJ paper. In this patch, the pattern selection is done via hash-based shuffling to ensure there are no repeats within a single pixel until all patterns have been used. 
The combination of these fixes brings the quality of Cycles' PMJ sampler in line with the previously submitted Sobol-Burley sampler in D15679. They are essentially indistinguishable in terms of quality/noise, which is expected since they are both randomized (0,2) sequences. Differential Revision: https://developer.blender.org/D15746	2022-09-01 14:57:39 +02:00
Campbell Barton	a3e1a9e2aa	Cleanup: spelling in comments, format	2022-08-26 12:47:21 +10:00
Brecht Van Lommel	6b9209ddfa	Merge branch 'blender-v3.3-release'	2022-08-19 21:02:02 +02:00
Brecht Van Lommel	4b62970dd3	Cleanup: replace CHECK_TYPE macro with static_assert To avoid conflicts with BLI headers and simplify code.	2022-08-19 20:36:02 +02:00
Nathan Vegdahl	a06c9b5ca8	Cycles: add Sobol-Burley sampling pattern Based on the paper "Practical Hash-based Owen Scrambling" by Brent Burley, 2020, Journal of Computer Graphics Techniques. It is distinct from the existing Sobol sampler in two important ways: * It is Owen scrambled, which gives it a much better convergence rate in many situations. * It uses padding for higher dimensions, rather than using higher Sobol dimensions directly. In practice this is advantageous because high-dimensional Sobol sequences have holes in their sampling patterns that don't resolve until an unreasonable number of samples are taken. (See Burley's paper for details.) The pattern reduces noise in some benchmark scenes, however it is also slower, particularly on the CPU. So for now Progressive Multi-Jittered sampling remains the default. Differential Revision: https://developer.blender.org/D15679	2022-08-19 16:27:22 +02:00
Sebastian Parborg	8ffc11dbcb	Cleanup OpenGL linking and related code after libepoxy merge This cleans up the OpenGL build flags and linking. It additionally also removes some dead code. One of these dead code paths is WITH_X11_ALPHA which actually never was active even with the build flag on. The call to use this was never called because the default initializer for GHOST was set to have it off per default. Nothing called this function with a boolean value to enable it. These cleanups are needed to support true headless OpenGL rendering. Without these cleanups libepoxy will fail to load the correct OpenGL Libraries as we have already linked them to the blender binary. Reviewed By: Brecht, Campbell, Jeroen Differential Revision: http://developer.blender.org/D15554	2022-08-15 16:47:20 +02:00
Christian Rauch	a296b8f694	GPU: replace GLEW with libepoxy With libepoxy we can choose between EGL and GLX at runtime, as well as dynamically open EGL and GLX libraries without linking to them. This will make it possible to build with Wayland, EGL, GLVND support while still running on systems that only have X11, GLX and libGL. It also paves the way for headless rendering through EGL. libepoxy is a new library dependency, and is included in the precompiled libraries. GLEW is no longer a dependency, and WITH_SYSTEM_GLEW was removed. Includes contributions by Brecht Van Lommel, Ray Molenkamp, Campbell Barton and Sergey Sharybin. Ref T76428 Differential Revision: https://developer.blender.org/D15291	2022-08-15 16:10:29 +02:00
Brecht Van Lommel	c9d821294f	Cycles: take into account time limit for progress bar This change allows the Cycles progress report system to take into consideration the time limit property. This allows for more accurate progress reports for high sample count renders with short time limits. Contributed by Alaska. Differential Revision: https://developer.blender.org/D15599	2022-08-11 19:37:18 +02:00
Brecht Van Lommel	5cbfdaccd0	Cleanup: minor changes to DebugFlags Use C++11, remove unused running_inside_blender and move viewport_static_bvh to BlenderSync.	2022-08-11 17:03:10 +02:00
Brecht Van Lommel	752fb5dd08	Merge branch 'blender-v3.3-release'	2022-08-09 19:19:54 +02:00
Brecht Van Lommel	79f1cc601c	Cycles: improve ray tracing precision near triangle edges Detect cases where a ray-intersection would miss the current triangle, which if the intersection is strictly watertight, implies that a neighboring triangle would incorrectly be hit instead. When that is detected, apply a ray-offset. The idea being that we only want to introduce potential error from ray offsets if we really need to. This works for BVH2 and Embree, as we are able to match the ray-intersection bit-for-bit, though doing so for Embree requires ugly hacks. Tiny differences like fused-multiply-add or dot product intrinsics in matrix inversion and ray intersection needed to be matched exactly, so this is fragile. Unfortunately we're not able to do the same for OptiX or MetalRT, since those implementations are unknown (and possibly impossible to match as hardware instructions). Still artifacts are much reduced, though not eliminated. Ref T97259 Differential Revision: https://developer.blender.org/D15559	2022-08-09 18:42:01 +02:00
Brecht Van Lommel	230f9ade64	Cycles: make transform inverse match Embree exactly Helps improve ray-tracing precision. This is a bit complicated as it requires different implementation depending on the CPU architecture.	2022-08-09 16:59:05 +02:00
Brecht Van Lommel	286e535071	Cleanup: simplify CPU instruction checking The performance of this will be slightly more important for upcoming changes. Also removed an unused function and changed includes so that system.h can be included in more places.	2022-08-09 16:59:05 +02:00
Andrii Symkin	d832d993c5	Cycles: add new Spectrum and PackedSpectrum types These replace float3 and packed_float3 in various places in the kernel where a spectral color representation will be used in the future. That representation will require more than 3 channels and conversion to/from RGB. The kernel code was refactored to remove the assumption that Spectrum and RGB colors are the same thing. There are no functional changes, Spectrum is still a float3 and the conversion functions are no-ops. Differential Revision: https://developer.blender.org/D15535	2022-08-09 16:49:34 +02:00
Brecht Van Lommel	1988665c3c	Cleanup: make vector types make/print functions consistent between CPU and GPU Now all the same ones are available on CPU and GPU, which was previously not possible due to lack of operator overloading in OpenCL. Print functions are no-ops on some GPUs. Ref D15535	2022-08-09 16:07:23 +02:00
Sergey Sharybin	cefd6140f3	Fix T100119: Cycles light object's parametric vector distorted Caused by `38af5b0501`. Adjust barycentric coordinates used for intersection result in the ray-to-rectangle intersection check. Differential Revision: https://developer.blender.org/D15592	2022-08-09 15:56:03 +02:00
Sergey Sharybin	25a0124bc8	Fix T100119: Light object's parametric vector distorted in blender 3.4 Caused by `38af5b0501`. Adjust barycentric coordinates used for intersection result in the ray-to-rectangle intersection check. Differential Revision: https://developer.blender.org/D15592	2022-08-02 14:17:10 +02:00
Bastien Montagne	d3879e9aaa	Merge branch 'blender-v3.3-release'	2022-07-29 15:17:40 +02:00
Tianhao Chai	b862cf0b9f	Fix Cycles build error with CUDA on arm64 Checking arm64 assembly support before CUDA/Metal would cause NVCC to generate inline arm64 assembly. Differential Revision: https://developer.blender.org/D15569	2022-07-29 14:57:09 +02:00
Brecht Van Lommel	79ab76e156	Cleanup: simplifications and consistency for vector types * OneAPI: remove separate float3 definition * OneAPI: disable operator[] to match other GPUs * OneAPI: make int3 compact to match other GPUs * Use #pragma once * Add __KERNEL_NATIVE_VECTOR_TYPES__ to simplify checks * Remove unused vector3	2022-07-28 21:27:13 +02:00
Brecht Van Lommel	38af5b0501	Cycles: switch Cycles triangle barycentric convention to match Embree/OptiX Simplifies intersection code a little and slightly improves precision regarding self intersection. The parametric texture coordinate in shader nodes is still the same as before for compatibility.	2022-07-27 21:03:33 +02:00
Brecht Van Lommel	4cf6524731	Fix Cycles Metal build errors after recent changes float8 is a reserved type in Metal, but is not implemented. So rename to float8_t for now. Also move back intersection handlers to kernel.metal, they can't be in the class that encapsulates the other Metal kernel functions.	2022-07-26 00:17:37 +02:00
Brecht Van Lommel	f26aa186b2	Cleanup: remove __KERNEL_CPU__ This was tested in some places to check if code was being compiled for the CPU, however this is only defined in the kernel. Checking __KERNEL_GPU__ always works.	2022-07-25 17:43:35 +02:00
Andrii Symkin	793d203139	Cycles: add math functions for float8 This patch adds the math functions required for float8, making it possible to use float8 instead of float3 for color data. Differential Revision: https://developer.blender.org/D15525	2022-07-25 17:36:58 +02:00
Brecht Van Lommel	023eb2ea7c	Cycles: more closely match some math and intersection operations in Embree This helps with debugging, and gives a slightly closer match between CPU and CUDA/HIP/Metal renders when it comes to ray tracing precision.	2022-07-25 13:27:40 +02:00
Brecht Van Lommel	5152c7c152	Cycles: refactor rays to have start and end distance, fix precision issues For transparency, volume and light intersection rays, adjust these distances rather than the ray start position. This way we increment the start distance by the smallest possible float increment to avoid self intersections, and can be sure it works because the distance compared against will be exactly the same as before, due to the ray start position and direction remaining the same. Fix T98764, T96537, hair ray tracing precision issues. Differential Revision: https://developer.blender.org/D15455	2022-07-15 18:46:24 +02:00
Michael Jones	da4ef05e4d	Cycles: Apple Silicon optimization to specialize intersection kernels The Metal backend now compiles and caches a second set of kernels which are optimized for scene contents, enabled for Apple Silicon. The implementation supports doing this both for intersection and shading kernels. However this is currently only enabled for intersection kernels that are quick to compile, and already give a good speedup. Enabling this for shading kernels would be faster still, however this also causes long wait times and would need a good user interface to control this. M1 Max samples per minute (macOS 13.0): PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE barbershop_interior 83.4 89.5 93.7 bmw27 1486.1 1671.0 1825.8 classroom 175.2 196.8 206.3 fishy_cat 674.2 704.3 719.3 junkshop 205.4 212.0 257.7 koro 310.1 336.1 342.8 monster 376.7 418.6 424.1 pabellon 273.5 325.4 339.8 sponza 830.6 929.6 1142.4 victor 86.7 96.4 96.3 wdas_cloud 111.8 112.7 183.1 Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones Differential Revision: https://developer.blender.org/D14645	2022-07-15 13:40:04 +02:00
Andrii Symkin	f00d9e80ae	Cycles: add more math functions for float4 Add more math functions for float4 to make them on par with float3 ones. It makes it possible to change the types of float3 variables to float4 without additional work. Differential Revision: https://developer.blender.org/D15318	2022-06-30 16:25:21 +02:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarily focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00
Sayak Biswas	abfa09752f	Cycles: enable Vega GPU/APU support Enables Vega and Vega II GPUs as well as Vega APU, using changes in HIP code to support 64-bit waves and a new HIP SDK version. Tested with Radeon WX9100, Radeon VII GPUs and Ryzen 7 PRO 5850U with Radeon Graphics APU. Ref T96740, T91571 Differential Revision: https://developer.blender.org/D15242	2022-06-28 18:35:43 +02:00
Andrii Symkin	c2a2f3553a	Cycles: unify math functions names This patch unifies the names of math functions for different data types and uses overloading instead. The goal is to make it possible to swap out all the float3 variables containing RGB data with something else, with as few as possible changes to the code. It's a requirement for future spectral rendering patches. Differential Revision: https://developer.blender.org/D15276	2022-06-23 15:02:53 +02:00
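
The "explicit two's complement" mentioned in commit 07d3a3962a (find_first_set()) can be sketched as follows. This is a minimal illustration in Python, not the Cycles source: the names `lowest_set_bit` and `find_first_set_sketch`, the 32-bit mask, and the 1-based return convention (0 when no bit is set) are all assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF  # assumption: emulate a 32-bit unsigned integer


def lowest_set_bit(x: int) -> int:
    """Isolate the lowest set bit of x using explicit two's complement."""
    x &= MASK32
    # (~x + 1) is the two's complement of x: the same bit pattern that
    # negation would give, written without unary minus on an unsigned value.
    negated = (~x + 1) & MASK32
    return x & negated


def find_first_set_sketch(x: int) -> int:
    """Return the 1-based index of the lowest set bit, or 0 if x is 0."""
    return lowest_set_bit(x).bit_length()
```

In C or C++ the corresponding expression would be `x & (~x + 1)` in place of `x & -x`; both yield the same bits, and only the first avoids applying unary minus to an unsigned type, which is what warning C4146 reports.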

1130 Commits