griefith/test

Author	SHA1	Message	Date
Clément Foucault	23dce15f67	EEVEE-Next: Horizon Scan: Use Spherical harmonics This uses Spherical Harmonics to store the indirect lighting and distant lighting visibility. We can then reuse this information for each closure, which divides the cost by 2 or 3 in many cases, since the scanning is done only once. The storage cost is higher than with the previous method, so we split the resolution scaling to be independent of raytracing. The spatial filtering has been split into its own pass for performance reasons. Upsampling now only uses 4 bilinearly interpolated samples (instead of 9) using bilateral weights to avoid bleeding. This also adds a missing dot product (which softens the lighting around corners) and fixes the blocky artifacts seen at lower resolution. Pull Request: https://projects.blender.org/blender/blender/pulls/118924	2024-03-19 19:16:21 +01:00
Aras Pranckevicius	a05adbef28	BLF: optimizations and fixes to font shader Simplifies/optimizes the "font" shader. It runs faster now too, but primarily this is so that it loads/initializes faster. * Instead of doing blur via individual bilinear samples (where each sample is 4 texel fetches), do raw texel fetches of the kernel footprint and compute final result by shifting the kernel weights according to bilinear fraction weight. For 5x5 blur, this reduces number of texel fetches from 64 down to 36. * Instead of checking "is the texel inside the glyph box? if so, then fetch it", first fetch it, and then set result to zero if it was outside. Simplifies the branching code flow in the compiled GPU shader. * Avoid costly integer modulo/division for "unwrapping" the font texture. The texture width is always power of two size, so division/modulo can be replaced by masking and a shift. Setup uniforms to contain the needed data. ### Fixes * The 3x3 blur was not doing a 3x3 blur, due to a copy-pasta typo (one of the sample offsets was repeated twice, and thus another sample offset was missing). * Blur towards left/top edges of the glyphs had artifacts, because float->int casting in GLSL rounds towards zero, but the code actually wanted to round towards floor. Image of how the blur has changed in the PR. ### First time initialization * Windows 10, NVIDIA RTX 3080Ti, OpenGL: 274.4ms -> 51.3ms * macOS, Apple M1 Max, Metal: 456ms -> 289ms (this is including PSO creation time). ### Shader performance/complexity Performance I only measured on macOS (M1 Max), by making a BLF text that is scaled up to cover most of screen via Python. Using Xcode Metal profiler, drawing that text with 5x5 shadow blur: 1.5ms -> 0.3ms. More performance analysis details in PR. Pull Request: https://projects.blender.org/blender/blender/pulls/119653	2024-03-19 16:29:21 +01:00
Jason Fielder	661d12aef7	Fix #119195 : Ensure Metal uses correct attribute conversion mode Resolves custom attribute types for ints and booleans by ensuring conversion mode is correct. Previously, the attribute declarations were assumed to be linear. However, patch ensures the correct attribute index is now fetched, ensuring the conversion mode is correctly specified for non-linear attribute ID's. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119569	2024-03-18 13:38:09 +01:00
Jason Fielder	6768ded895	Fix #118868 : Metal render pass output for EEVEE Next Resolves render pass export for EEVEE Next on Metal. Reads from texture views was previously utilising the root texture rather than the view variant, resulting in views into texture arrays being incorrectly sampled. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119563	2024-03-16 20:16:37 +01:00
Jason Fielder	6b56ed3cd3	Metal: Resolve artifact in EEVEE Next Film Cryptomatte Cryptomatte passes would generate a feathered outline in Metal due to missing texture fence in chained read->modify->write->read->... patterns. Added imageFence function to explicitly state that imageStore's should be visible to future imageLoad's. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119163	2024-03-14 17:48:30 +01:00
Jason Fielder	ecffea86b1	Metal: Fix Storage buffer read sync affecting surfels Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119093	2024-03-14 09:40:59 +01:00
Jeroen Bakker	f0f911590e	EEVEE-Next: Viewport pixel size with up-sampling EEVEE-Next performs worse on integrated GPUs than on discrete GPUs. Most shaders have been analyzed, but there will always be bottlenecks related to architectural differences. In order to make EEVEE-Next run smoothly on integrated GPUs, this change implements a viewport pixel size option similar to Cycles. The main difference is that the samples will still be weighted and up-sampled to the final film resolution. This makes the pixels not look squared in the viewport but will resolve to something close to the results without up-scaling. This improves the performance, especially on integrated GPUs. The improvement for discrete GPUs is less noticeable. See here the stats when playing `rain_restaurant.blend` back on a RAPHAEL_MENDOCINO iGPU. \| Pixel size \| Frames per second \| \|------------\|-------------------\| \| 1x \| 0.25 FPS \| \| 2x \| 4.14 FPS \| \| 4x \| 6.90 FPS \| \| 8x \| 9.95 FPS \| Related to: #114597 See PR for some example images. Pull Request: https://projects.blender.org/blender/blender/pulls/118903	2024-03-13 12:00:24 +01:00
laurynas	aa3ffca8dc	Fix #119247 : Curves: Extra point in evaluated spline of Curves geometry In `bf17fc8d79` after extending buffer to multiple of 4 there appeared trailing space in buffer not covered by shader's `for` loop. Pull Request: https://projects.blender.org/blender/blender/pulls/119346	2024-03-12 15:01:10 +01:00
Jason Fielder	06ac33bdd2	Metal: Fix SSBO from VBO size assertion Resolves assertion firing when creating an SSBO from a VBO which is not aligned to 16 bytes. Required to ensure API validation is satisfied. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119298	2024-03-11 08:28:14 +01:00
Prakhar-Singh-Chouhan	5d076e0e7b	Vulkan: Implementing `VKBackend::samplers_update()` Implemented `VKBackend::samplers_update()`. When triggered, if the VK Device is initialized, the `device.samplers` are freed and reinitialized. Implements: #117019 Pull Request: https://projects.blender.org/blender/blender/pulls/119109	2024-03-11 07:57:52 +01:00
Jason Fielder	703353b5da	Metal: Fix uniform upload for small types This patch adds special cases to Shader::uniform_int routine to allow writing of small types (1 bytes, 2 bytes) to the push constant buffer. This previously interpreted all incoming push constant data as integer components only, resulting in rendering artifacts such as bad SRGB mode selection and shader editor not rendering due to mis-aligned overlay parameter, as the uniform assignment would overflow consecutive small types. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/119285	2024-03-10 19:36:30 +01:00
Campbell Barton	e33f5e36ac	Cleanup: spacing around C-style comment blocks	2024-03-09 23:40:57 +11:00
Campbell Barton	32151abfc3	Cleanup: spelling in comments	2024-03-09 16:47:38 +11:00
Campbell Barton	b1c59a793c	Cleanup: correct spelling for alignment	2024-03-09 16:43:34 +11:00
Clément Foucault	b8e726a158	GPU: Add support for small types This implements the design of #118961. - Add aliases in GLSL since these types are not supported. - Add detection mechanism that prevents usage inside shader shared code. Check is only done in debug build to avoid slowing down application startup. Pull Request: https://projects.blender.org/blender/blender/pulls/119226	2024-03-08 23:28:15 +01:00
Clément Foucault	4205718dce	GPU: Cleanup type aliases This defines all aliases for supported types, documents which one to use in C++ shared code, and moves relevant defines to their backend file. Renames `bool1` to `bool32_t` and cleans up its usage as mentioned in #118961. Rel. #118961 Pull Request: https://projects.blender.org/blender/blender/pulls/119098	2024-03-08 19:09:10 +01:00
Hans Goudey	1e1d7034ec	Cleanup: Move GPU_uniform_buffer.h to C++	2024-03-06 21:54:28 -05:00
Anthony Roberts	445fd42c61	Windows: Add ARM64 support * Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above. Older generation devices are not and will not be supported due to some driver issues * Requires VS2022 for building. * Uses new MSVC preprocessor for sse2neon compatibility. * SIMD is not enabled, waiting on conversion of blenlib to C++. Ref #119126 Pull Request: https://projects.blender.org/blender/blender/pulls/117036	2024-03-06 16:14:34 +01:00
Omar Emara	eb91828aab	GPU: Add maximum image units to GPU capabilities This patch adds the maximum number of supported image units to the GPU capabilities module. Currently, the GPU module assumes a maximum of 8 units, so the patch is not currently particularly useful, but we can consider committing it for the future anyways. Pull Request: https://projects.blender.org/blender/blender/pulls/119057	2024-03-05 07:25:20 +01:00
Campbell Barton	ed5fb3eaba	Cleanup: various non-functional C++ changes	2024-03-05 11:32:42 +11:00
Campbell Barton	76867ad4c2	Cleanup: redundant "void" in function declarations for C++	2024-03-05 11:25:35 +11:00
laurynas	bf17fc8d79	Fix: GPU: Ensures length of curves GPUIndexBuf to be multiple of 4 Exception is thrown in gpu_storage_buffer.cc To reproduce create legacy Bezier curve and convert it to new Curves. Code is from #116617 Pull Request: https://projects.blender.org/blender/blender/pulls/118951	2024-03-03 16:39:11 +01:00
Sergey Sharybin	3fcd7ccbc0	Compositor: Enable lock-free GPU context activation on macOS This required to apply a small fix in the Metal texture uploader. Without synchronization pixels of a wrong pass can be uploaded to a wrong texture. This is because this code path is heavily reusing temporary allocations, and at some point the allocation is not considered as still in use, unless the command buffer used by the texture uploader is submitted. Ref #118919 Pull Request: https://projects.blender.org/blender/blender/pulls/118920	2024-03-01 14:38:09 +01:00
Jeroen Bakker	a012aeafd5	Revert "Fix: GPU: Reduce GPU_MAX_ATTR from 15 to 14" This reverts commit `d9caa19ec2`. This commit doesn't compile, and when fixing the issues, doesn't start blender.	2024-02-27 13:44:41 +01:00
Pratik Borhade	c926b65132	Merge branch 'blender-v4.1-release'	2024-02-27 17:43:50 +05:30
dupoxy	d9caa19ec2	Fix: GPU: Reduce GPU_MAX_ATTR from 15 to 14 This is to accommodate Position and Normal attributes. The normal used to be optional but isn't nowadays. So the limit is actually 14 attributes until we do some big refactoring of the attribute fetching. Pull Request: https://projects.blender.org/blender/blender/pulls/118441	2024-02-27 12:19:09 +01:00
Miguel Pozo	c713fbc2d3	GPU: Allow printing full shader source on compilation error Add a define (DEBUG_LOG_SHADER_SRC_ON_ERROR ) in gpu_shader_private.h to print the full source code of shaders that fail to compile. Pull Request: https://projects.blender.org/blender/blender/pulls/116470	2024-02-26 17:30:15 +01:00
Jeroen Bakker	8dce2a422b	EEVEE-Next: Specialization Constants for Film Accumulation On lower end hardware the film accumulation has bad performance. Sometimes up to 10ms. This PR improves the performance somewhat by adding a specialization constant around the renderpasses that are actually needed for rendering, the number of samples and if reprojection is enabled. `enabled_categories`: Based on the enabled render passes some outer loops are enabled/disabled that handle the specific render passes. This improves the performance as no memory will be reserved for branches that are never accessed. `samples_len` & `use_reprojection`: GPU compilers tend to optimize texture fetches by moving them to the outer loop. This is only possible when the inner loop can be unrolled. In the case of the film accumulation the inner loop couldn't be unrolled. Adding a specialization constant allows unrolling of the inner loop. On old or low-end devices the improvement is around 40%. On newer devices the improvement is 50+%. Performance of this shader is similar to Godot. \| GPU \| Before \| New \| \|----------------------\|--------\|-------\| \| NVIDIA GTX 760 \| 3.5ms \| 2.4ms \| \| GFX1036 (RDNA2 iGPU) \| 9.9ms \| 6.2ms \| \| AMD Radeon Pro W7500 \| 2.1ms \| 0.9ms \| Pull Request: https://projects.blender.org/blender/blender/pulls/118385	2024-02-26 16:19:26 +01:00
Jeroen Bakker	3109564825	GPU: Fix shader compilation Metal Metal uses an union to store the `gl_WorkGroupSize` the union needs to be unpacked. We first unpack to uvec3 before in order to work around an NVIDIA driver bug. Issue introduced by: `e3ac2ac93e` Pull Request: https://projects.blender.org/blender/blender/pulls/118749	2024-02-26 14:51:21 +01:00
Jeroen Bakker	e3ac2ac93e	GPU: Shaders fail to compile on NVIDIA NVIDIA fails with segmentation fault when compiling shaders due to recent changes. This PR tweaks the shader code to work around the segmentation fault. Issue introduced by: `7f43699ebf` Pull Request: https://projects.blender.org/blender/blender/pulls/118744	2024-02-26 13:01:10 +01:00
Eugene Kuznetsov	7f43699ebf	DRW: Curves: Indexbuf optimization for large numbers of curves This optimizes a few loops that become significant bottlenecks during viewport rendering of scenes with large numbers of curves. To render a curves object, Blender needs to generate a potentially very large (but trivial) index buffer. As previously implemented, this index buffer is generated in an extremely inefficient manner, with a single-threaded loop and an explicit function call per entry. The buffer then needs to be pushed onto the GPU, which is also a fairly slow task. The PR generates the index buffer directly on the GPU with compute shader. Pull Request: https://projects.blender.org/blender/blender/pulls/116617	2024-02-25 17:22:58 +01:00
Clément Foucault	06d3627c43	EEVEE-Next: Make closure evaluation fully type agnostic The goal of this task is to remove noise in the most common material layering configuration. Subsequently, this also split the evaluation of different closure to their own buffer to avoid discontinuity when denoising them. This commit does a few things: - [x] Removes use of global for closure random number. - [x] Refactor the forward evaluation to be closure type agnostic. - [x] Refactor the gbuffer lib to be closure type agnostic. - [x] Reduces the number of picked closure to 3 maximum or less. - [x] Use GPU_MATFLAG_COAT to tag the use of multiple usage of glossy BSDF. - [x] Use two closure bin for Glossy when more than one. - [x] Set closure bin per type for best noise level for most materials. - [x] Change the gbuffer header to put the closure at their bin index. - [x] Add a method to get a closure from the gbuffer from a specific bin. - [x] Split lighting passes per Closure. Pull Request: https://projects.blender.org/blender/blender/pulls/118079	2024-02-24 00:00:11 +01:00
Jeroen Bakker	5698fb2049	RenderDoc: Set Capture Title Adds an option to set the capture title when using renderdoc `GPU_debug_capture_begin` has an optional `title` parameter to set the title of the renderdoc capture. Pull Request: https://projects.blender.org/blender/blender/pulls/118649	2024-02-23 10:57:37 +01:00
Jeroen Bakker	e70e9e3cf9	GPU: Report on vertex attribute conversions Blender uses some vertex attributes that are not (and sometimes never) supported by a GPU. OpenGL silently converted these, but for Metal/Vulkan we need to convert them when uploading the data. This PR writes invalid usages to the console, which we should remove from the Blender code-base. Note it is still possible to create attributes that need conversions by using the PyGPU API.	2024-02-22 11:13:16 +01:00
Jason Fielder	eac8c381a0	Metal: Fix EEVEE sync issue on render When a sync primitive signal existed in its own command buffer, the command buffer execution was skipped as the empty flag was previously still set to true. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/118557	2024-02-21 12:53:42 +01:00
Jeroen Bakker	e6eecdf614	EEVEE-Next: Voronoi colors are pure emissive The voronoi texture node only sets the first 3 components of the color. The alpha value is never set. Normally this is covered when using it in a shader node, but when directly connected to the AOV output, the color was stored as a pure emissive color. This resulted in incorrect colors in the viewport and image renders. This is a partial fix for #118494 Pull Request: https://projects.blender.org/blender/blender/pulls/118497	2024-02-21 11:32:29 +01:00
Campbell Barton	b6b00b61cb	Cleanup: various non-functional changes for C++	2024-02-21 10:33:56 +11:00
Jason Fielder	d1a9c2f650	Fix: Metal: EEVEE Next viewport motion blur Resolves assertion for EEVEE Next motion blur wherein a swizzled texture used in an image binding loses write-access. We instead must bind the source texture for image write operations. This is now consistent with expected behaviour in other APIs. Authored by Apple: Michael Parkin-White Pull Request: https://projects.blender.org/blender/blender/pulls/117479	2024-02-20 11:17:12 +01:00
Jeroen Bakker	df2b5630d8	Vulkan: Update PCI ids This change cleans up the PCI ids in the Vulkan backend. - reuses already existing constants. - adds PCI-id for Apple devices. Pull Request: https://projects.blender.org/blender/blender/pulls/118485	2024-02-20 10:44:11 +01:00
Jeroen Bakker	98bc3369f8	Cleanup: Silence unused parameter warning in Vulkan backend	2024-02-20 10:02:11 +01:00
Jeroen Bakker	5294381dae	GPU: Fix compilation issues in shader builder	2024-02-20 08:07:02 +01:00
Brecht Van Lommel	0f2064bc3b	Revert changes from main commits that were merged into blender-v4.1-release The last good commit was `4bf6a2e564`.	2024-02-19 15:59:59 +01:00
Iliya Katueshenock	9e12a675b5	Cleanup: Merge BKE_node.h into BKE_node.hh Trivial change, just move all the code from `BKE_node.h` to `BKE_node.hh` header top. No mixing code from different headers or namespace changes. Part of #117773 Pull Request: https://projects.blender.org/blender/blender/pulls/118407	2024-02-19 15:26:10 +01:00
Jeroen Bakker	2cb2d3944b	RenderTest: Fix EEVEE Render Test Panorama dicing test fails for EEVEE on legacy platforms. EEVEE creates a shader interface that isn't compatible with the Vulkan backend. This PR hides the check. Check should be enabled again after EEVEE has been replaced by EEVEE-Next. This PR also changes the behavior when checks are executed. It used to be executed when Blender was built with asserts. Now it is behind the --debug-gpu flag. Pull Request: https://projects.blender.org/blender/blender/pulls/117992	2024-02-19 08:07:53 +01:00
Clément Foucault	8b6d145a6b	Fix: Metal: Shader compilation logging with compute shader It was missing the line directive treatment the fragment and vertex shader had.	2024-02-16 22:52:59 +01:00
Brecht Van Lommel	7453c5ed67	Merge branch 'blender-v4.1-release' into main	2024-02-16 19:31:31 +01:00
Raul Fernandez	324ff4ddef	macOS: Remove unnecessary checks now that minimum version is macOS 11.2 MacOS minimum version is now 11.2 we no longer need to check for lower API versions. Pull Request: https://projects.blender.org/blender/blender/pulls/118388	2024-02-16 19:03:23 +01:00
Jeroen Bakker	c790e6e49d	OpenGL: Reduce Shader Switches Specialization constants were always switching shaders even when the constants had not changed. An early exit path was never taken. The performance improvement should not be noticeable to end users, but it matches the intention of the design of specialization constants. Pull Request: https://projects.blender.org/blender/blender/pulls/118315	2024-02-16 12:26:10 +01:00
Campbell Barton	7582b15c4c	Cleanup: spelling in comments	2024-02-16 14:26:46 +11:00
Campbell Barton	55adfdc7af	Merge branch 'blender-v4.1-release'	2024-02-15 21:22:52 +11:00


5037 Commits
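
The BLF font shader commit (`a05adbef28`) notes that, because the font texture width is always a power of two, the integer modulo/division used to "unwrap" a linear texel index can be replaced by a mask and a shift. The following is a minimal Python sketch of that arithmetic only; the function names are hypothetical and this is not the shader code from the commit.

```python
def unwrap_divmod(index: int, width: int) -> tuple[int, int]:
    """Reference version: (column, row) of a linear texel index via modulo/division."""
    return index % width, index // width


def unwrap_mask_shift(index: int, width: int) -> tuple[int, int]:
    """Same result for power-of-two widths, using a bit mask and a right shift.

    Hypothetical helper for illustration; assumes a non-negative index.
    """
    assert width > 0 and width & (width - 1) == 0, "width must be a power of two"
    mask = width - 1                 # low bits select the column
    shift = width.bit_length() - 1   # log2(width); high bits select the row
    return index & mask, index >> shift


if __name__ == "__main__":
    # Both versions agree for every tested power-of-two width.
    for w in (1, 2, 256, 1024):
        for i in range(0, 5000, 7):
            assert unwrap_divmod(i, w) == unwrap_mask_shift(i, w)
```

In a shader, the mask and shift would typically be computed once on the CPU and passed in as uniforms, which is what the commit message describes as "Setup uniforms to contain the needed data".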