Files
test/source/blender/blenfont
Aras Pranckevicius a05adbef28 BLF: optimizations and fixes to font shader
Simplifies/optimizes the "font" shader. It runs faster now too, but primarily
this is so that it loads/initializes faster.

* Instead of doing blur via individual bilinear samples (where each sample is 4
  texel fetches), do raw texel fetches of the kernel footprint and compute final
  result by shifting the kernel weights according to bilinear fraction weight.
  For 5x5 blur, this reduces number of texel fetches from 64 down to 36.
* Instead of checking "is the texel inside the glyph box? if so, then fetch it",
  first fetch it, and then set result to zero if it was outside. Simplifies the
  branching code flow in the compiled GPU shader.
* Avoid costly integer modulo/division for "unwrapping" the font texture. The
  texture width is always power of two size, so division/modulo can be replaced
  by masking and a shift. Setup uniforms to contain the needed data.

### Fixes

* The 3x3 blur was not doing a 3x3 blur, due to a copy-pasta typo (one of the
  sample offsets was repeated twice, and thus another sample offset was
  missing).
* Blur towards left/top edges of the glyphs had artifacts, because float->int
  casting in GLSL rounds towards zero, but the code actually wanted to round
  towards floor.

Image of how the blur has changed in the PR.

### First time initialization

* Windows 10, NVIDIA RTX 3080Ti, OpenGL: 274.4ms -> 51.3ms
* macOS, Apple M1 Max, Metal: 456ms -> 289ms (this is including PSO creation
  time).

### Shader performance/complexity

Performance I only measured on macOS (M1 Max), by making a BLF text that is
scaled up to cover most of screen via Python. Using Xcode Metal profiler,
drawing that text with 5x5 shadow blur: 1.5ms -> 0.3ms.

More performance analysis details in PR.

Pull Request: https://projects.blender.org/blender/blender/pulls/119653
2024-03-19 16:29:21 +01:00
..