Simplifies/optimizes the "font" shader. It runs faster now too, but the primary goal is faster loading/initialization.

* Instead of doing blur via individual bilinear samples (where each sample is 4 texel fetches), do raw texel fetches of the kernel footprint and compute the final result by shifting the kernel weights according to the bilinear fraction weight. For 5x5 blur, this reduces the number of texel fetches from 64 down to 36.
* Instead of checking "is the texel inside the glyph box? if so, then fetch it", first fetch it, and then set the result to zero if it was outside. This simplifies the branching code flow in the compiled GPU shader.
* Avoid costly integer modulo/division for "unwrapping" the font texture. The texture width is always a power of two, so division/modulo can be replaced by masking and a shift. Set up uniforms to contain the needed data.

### Fixes

* The 3x3 blur was not doing a 3x3 blur, due to a copy-paste typo (one of the sample offsets was repeated twice, and thus another sample offset was missing).
* Blur towards the left/top edges of the glyphs had artifacts, because float->int casting in GLSL rounds towards zero, but the code actually wanted to round towards floor.

An image of how the blur has changed is in the PR.

### First time initialization

* Windows 10, NVIDIA RTX 3080Ti, OpenGL: 274.4ms -> 51.3ms
* macOS, Apple M1 Max, Metal: 456ms -> 289ms (this includes PSO creation time).

### Shader performance/complexity

I only measured performance on macOS (M1 Max), using Python to create a BLF text that is scaled up to cover most of the screen. Using the Xcode Metal profiler, drawing that text with 5x5 shadow blur: 1.5ms -> 0.3ms. More performance analysis details are in the PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/119653
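The weight-shifting idea from the first bullet can be sketched on the CPU. This is a minimal 1D illustration in Python, not the shader code from the commit: the function names, the `counter` argument and the binomial weights in the usage note are all assumptions made for the example. It also folds in two other points from the message: texels are fetched first and zeroed afterwards if they were out of range, and the integer base is computed with floor rather than a truncating cast.

```python
import math


def fetch(tex, i, counter):
    """Raw texel fetch: read first, then zero the result if i was out of range."""
    counter[0] += 1
    value = tex[min(max(i, 0), len(tex) - 1)]  # clamped, so the read is always valid
    return value if 0 <= i < len(tex) else 0.0


def bilinear(tex, x, counter):
    """Linear filtering in 1D: two texel fetches per sample."""
    base = math.floor(x)  # floor, not int(): int() rounds towards zero
    f = x - base
    return (1.0 - f) * fetch(tex, base, counter) + f * fetch(tex, base + 1, counter)


def blur_bilinear(tex, x, weights, counter):
    """One filtered sample per kernel tap: 2 * len(weights) fetches."""
    half = len(weights) // 2
    return sum(w * bilinear(tex, x + k - half, counter) for k, w in enumerate(weights))


def blur_shifted(tex, x, weights, counter):
    """Fetch the footprint once and shift the weights by the fraction f.

    Texel j of the footprint receives (1 - f) * w[j] + f * w[j - 1],
    so only len(weights) + 1 fetches are needed.
    """
    half = len(weights) // 2
    base = math.floor(x)
    f = x - base
    total = 0.0
    for j in range(len(weights) + 1):
        w_here = weights[j] if j < len(weights) else 0.0
        w_prev = weights[j - 1] if j > 0 else 0.0
        total += ((1.0 - f) * w_here + f * w_prev) * fetch(tex, base - half + j, counter)
    return total
```

With a 5-tap kernel such as `[1/16, 4/16, 6/16, 4/16, 1/16]` (an arbitrary choice for the example), both functions should agree up to floating-point rounding, while the shifted variant touches 6 texels instead of 10. Applying the same reasoning per axis in 2D gives a 6x6 footprint for a 5x5 kernel, which matches the 36 fetches quoted above.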
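The "unwrapping" bullet relies on a standard identity: for a non-negative index and a power-of-two width, modulo equals a bit mask and division equals a right shift. A small Python sketch of that identity follows. The function and parameter names are hypothetical and do not come from the commit; in the shader the mask and shift would presumably arrive as uniforms, as the message suggests.

```python
def make_unwrap_params(width):
    """Return (mask, shift) for a power-of-two texture width."""
    if width <= 0 or width & (width - 1) != 0:
        raise ValueError("width must be a power of two")
    return width - 1, width.bit_length() - 1


def unwrap_divmod(index, width):
    """Reference version: linear index to (x, y) with modulo and division."""
    return index % width, index // width


def unwrap_mask_shift(index, mask, shift):
    """Same result for non-negative indices, using only a mask and a shift."""
    return index & mask, index >> shift
```

For example, a width of 1024 gives a mask of 1023 and a shift of 10. The two unwrap functions only agree when the width really is a power of two, which is why the helper rejects anything else.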
28 lines
885 B
GLSL
/* SPDX-FileCopyrightText: 2016-2022 Blender Authors
 *
 * SPDX-License-Identifier: GPL-2.0-or-later */

void main()
{
  color_flat = col;
  glyph_offset = offset;
  glyph_dim = abs(glyph_size);
  glyph_mode = mode;
  glyph_comp_len = comp_len;
  interp_size = int(glyph_size.x < 0) + int(glyph_size.y < 0);

  /* Quad expansion using instanced rendering. */
  float x = float(gl_VertexID % 2);
  float y = float(gl_VertexID / 2);
  vec2 quad = vec2(x, y);

  vec2 interp_offset = float(interp_size) / abs(pos.zw - pos.xy);
  texCoord_interp = mix(-interp_offset, 1.0 + interp_offset, quad) * vec2(glyph_dim) + vec2(0.5);

  vec2 final_pos = mix(vec2(ivec2(pos.xy) + ivec2(-interp_size, interp_size)),
                       vec2(ivec2(pos.zw) + ivec2(interp_size, -interp_size)),
                       quad);

  gl_Position = ModelViewProjectionMatrix * vec4(final_pos, 0.0, 1.0);
}
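To make the quad expansion in `main()` easier to follow, here is a rough CPU-side emulation in Python of the arithmetic for one vertex. It is only a sketch: `mix` and `expand_quad` are names made up for this example, `pos` is assumed to be laid out as `(x0, y0, x1, y1)` with `y0` above `y1` (inferred from the signs in the shader), and the matrix multiply is left out.

```python
def mix(a, b, t):
    """GLSL-style linear interpolation: a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t


def expand_quad(vertex_id, pos, glyph_size):
    """Return (final_pos, tex_coord) for one of the four quad vertices."""
    x0, y0, x1, y1 = pos
    # Each negative component of glyph_size adds one pixel of padding (0, 1 or 2).
    interp_size = int(glyph_size[0] < 0) + int(glyph_size[1] < 0)
    glyph_dim = (abs(glyph_size[0]), abs(glyph_size[1]))

    # Vertex IDs 0..3 map to the corners (0,0), (1,0), (0,1), (1,1).
    qx = float(vertex_id % 2)
    qy = float(vertex_id // 2)

    # Padding expressed as a fraction of the quad size, per axis.
    off_x = interp_size / abs(x1 - x0)
    off_y = interp_size / abs(y1 - y0)
    tex_coord = (
        mix(-off_x, 1.0 + off_x, qx) * glyph_dim[0] + 0.5,
        mix(-off_y, 1.0 + off_y, qy) * glyph_dim[1] + 0.5,
    )

    # int() truncates towards zero, like the ivec2() conversion in the shader.
    final_pos = (
        mix(float(int(x0) - interp_size), float(int(x1) + interp_size), qx),
        mix(float(int(y0) + interp_size), float(int(y1) - interp_size), qy),
    )
    return final_pos, tex_coord
```

Calling `expand_quad` for vertex IDs 0 through 3 yields the four corners of the glyph rectangle, grown outwards by `interp_size` pixels on every side, with texture coordinates extended by the same amount so the padding has room for the blur.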