Previously the code was scattered across several places and the logic
was hard to follow. CRF values will need further remapping for future
codecs and/or 10/12-bit video support, so move all of that logic
into a set_quality_rate_options function that does three clear things:
1) set constant bit rate if that is used,
2) set lossless mode if that is used,
3) set CRF parameters if that is used.
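The three-way split could look roughly like this -- a minimal sketch with hypothetical names and stand-in structs (the real function works on Blender's codec settings and ffmpeg's AVCodecContext):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real Blender/ffmpeg types. */
typedef enum { RATE_CONSTANT_BITRATE, RATE_LOSSLESS, RATE_CRF } RateMode;

typedef struct {
  RateMode rate_mode;
  int bitrate; /* used for constant bitrate mode */
  int crf;     /* used for CRF mode */
} Settings;

typedef struct {
  int bit_rate;
  bool lossless;
  int crf; /* -1 when unused */
} EncoderOptions;

/* One function, three clear branches: constant bitrate, lossless, or CRF. */
static void set_quality_rate_options(const Settings *st, EncoderOptions *opts)
{
  opts->bit_rate = 0;
  opts->lossless = false;
  opts->crf = -1;
  if (st->rate_mode == RATE_CONSTANT_BITRATE) {
    opts->bit_rate = st->bitrate;
  }
  else if (st->rate_mode == RATE_LOSSLESS) {
    opts->lossless = true;
  }
  else {
    /* This is where per-codec / per-bit-depth CRF remapping would live. */
    opts->crf = st->crf;
  }
}
```

Keeping the three branches in one place means future CRF remapping only has to touch the last branch.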
Pull Request: https://projects.blender.org/blender/blender/pulls/129092
Two issues were present in the code that was resetting the CRF quality level
to the "do not use CRF, use bitrate settings" mode:
- It did not include the AV1 codec, so whenever you switched to AV1 the CRF
setting was changed to constant bitrate.
- It wrongly included the DNxHD codec. That seemed like an attempt to
fix #100079 in 06a01168f6, but it was doing exactly the opposite
of what the fix should have been. In any case, with DNxHD removed
the CRF setting now properly switches to constant bitrate when
changing the codec to DNxHD.
Factored the actual logic into a BKE_ffmpeg_codec_supports_crf function.
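The factored-out check boils down to a switch on the codec ID; a sketch using a local stand-in enum (the real function takes ffmpeg's codec ID type):

```c
#include <stdbool.h>

/* Stand-in codec IDs; the real code uses ffmpeg's AVCodecID values. */
typedef enum {
  CODEC_H264,
  CODEC_HEVC,
  CODEC_VP9,
  CODEC_AV1,
  CODEC_DNXHD,
  CODEC_MPEG4,
} CodecID;

/* CRF is only meaningful for codecs whose encoders expose a CRF-style
 * quality option; DNxHD notably is not one of them. */
static bool codec_supports_crf(CodecID codec_id)
{
  switch (codec_id) {
    case CODEC_H264:
    case CODEC_HEVC:
    case CODEC_VP9:
    case CODEC_AV1:
      return true;
    default:
      return false;
  }
}
```

With the check in one function, both the UI reset logic and the encoder setup can agree on which codecs use CRF.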
Pull Request: https://projects.blender.org/blender/blender/pulls/129050
Previous fix to make VP9 lossless work (98689f51c0) applied it to
all videos that happen to be in a WebM container. While it is typical
for WebM to be used with VP9, it is not necessarily so (a WebM
container can hold H.264 or really any other video codec). Do the
check based on the video codec being VP9 instead.
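The change amounts to keying the lossless workaround on the codec rather than the container; schematically (stand-in enums, not Blender's actual identifiers):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for container and codec identifiers. */
typedef enum { CONTAINER_WEBM, CONTAINER_MP4 } Container;
typedef enum { VCODEC_VP9, VCODEC_H264 } VideoCodec;

/* Old, wrong: any video inside a WebM container got the VP9 lossless path. */
static bool use_vp9_lossless_old(Container c, VideoCodec v)
{
  (void)v;
  return c == CONTAINER_WEBM;
}

/* New: decide based on the actual video codec. */
static bool use_vp9_lossless_new(Container c, VideoCodec v)
{
  (void)c;
  return v == VCODEC_VP9;
}
```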
Pull Request: https://projects.blender.org/blender/blender/pulls/129045
Previous fix (b17734598d) tried to address this, but it seems that,
depending on the CPU and video width, ffmpeg might need at least 64-byte
alignment, even if the CPU we're running on only has AVX2 (32-byte
alignment). That is because some other parts of ffmpeg statically pick
64- or 32-byte alignment internally, depending on whether AVX512 support
is compiled in (even if the CPU does not have it).
I have checked that this does not negatively affect a platform where the
SIMD alignment is always 16 bytes (Mac); everything works as expected.
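A safe approach is to always over-align frame buffers to the largest alignment ffmpeg might statically assume; a minimal C11 sketch (the 64-byte constant is the assumption here, not a value taken from ffmpeg headers):

```c
#include <stdint.h>
#include <stdlib.h>

/* Always use 64-byte alignment: ffmpeg may statically assume it when
 * AVX512 support is compiled in, even if the runtime CPU lacks AVX512. */
enum { FRAME_ALIGN = 64 };

static void *alloc_frame_buffer(size_t size)
{
  /* aligned_alloc requires the size to be a multiple of the alignment. */
  size_t padded = (size + FRAME_ALIGN - 1) / FRAME_ALIGN * FRAME_ALIGN;
  return aligned_alloc(FRAME_ALIGN, padded);
}
```

Over-aligning costs at most 63 bytes of padding per buffer, which is negligible next to frame data sizes.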
Pull Request: https://projects.blender.org/blender/blender/pulls/128107
When building proxies at lower than 100% resolution, the video frame
downscaling step was single threaded, as found via #127956.
Make it use the same threaded sws_scale machinery that the usual video
decoding/encoding uses. Video encoding/decoding was only using it for
RGB<->YUV conversions, where source and destination sizes always match;
here the source and destination sizes differ.
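The differing-size case can be pictured with a plain box-filter 2x downscale (an illustrative sketch of the operation, not the libswscale code):

```c
#include <stdint.h>

/* Illustrative 2x box downscale of an 8-bit grayscale plane: source and
 * destination sizes differ, unlike the 1:1 RGB<->YUV conversion case. */
static void downscale_half(const uint8_t *src, int src_w, int src_h,
                           int src_stride, uint8_t *dst, int dst_stride)
{
  int dst_w = src_w / 2;
  int dst_h = src_h / 2;
  for (int y = 0; y < dst_h; y++) {
    for (int x = 0; x < dst_w; x++) {
      /* Average the 2x2 source block, rounding to nearest. */
      const uint8_t *s = src + 2 * y * src_stride + 2 * x;
      int sum = s[0] + s[1] + s[src_stride] + s[src_stride + 1];
      dst[y * dst_stride + x] = (uint8_t)((sum + 2) / 4);
    }
  }
}
```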
Time taken to rebuild a 50% proxy for a 4K resolution, 1440 frame
(1 minute) video file, on a Ryzen 5950X (Win10/VS2022):
- Blender 4.2: 20.1 sec, CPU usage 30-40%.
- Blender 4.3 main: 13.1 sec (ffmpeg build has been fixed to use SIMD),
CPU usage still 30-40% though.
- This PR: 8.3 sec, CPU usage ~95%.
Pull Request: https://projects.blender.org/blender/blender/pulls/128054
Started happening with 422dd9404f, which introduced multi-threaded
conversion of the src->dst (usually RGBA->YUV) format before encoding
the frame with ffmpeg. But the issue itself is not related to
multi-threading; rather, it comes from the fact that AVFrame objects
started to be backed by an AVBuffer object (as that is needed for
threaded swscale to work).
Turns out, if a frame is backed by an AVBuffer object, said buffer
might become "non-writable" because it got shared (non-1 refcount).
And that happens with some ffmpeg video codecs, particularly the PNG one.
Make sure to make the AVFrame objects writable inside
generate_video_frame. This follows official ffmpeg example
(doc/examples/encode_video.c) that explains why that is needed:
"the codec may have kept a reference to the frame in its internal
structures, that makes the frame unwritable. av_frame_make_writable()
checks that and allocates a new buffer for the frame only if necessary"
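The make-writable pattern is a classic copy-on-write check on the buffer's refcount; a self-contained sketch of the idea with a stand-in buffer type (av_frame_make_writable does the real work in ffmpeg):

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for a refcounted pixel buffer. */
typedef struct {
  unsigned char *data;
  size_t size;
  int refcount;
} Buffer;

/* If the buffer is shared (refcount > 1), clone it so our writes cannot
 * clobber data the encoder still references -- the same idea as
 * av_frame_make_writable(). Returns 0 on success, -1 on allocation failure. */
static int make_writable(Buffer **buf)
{
  Buffer *b = *buf;
  if (b->refcount <= 1) {
    return 0; /* already exclusively owned, nothing to do */
  }
  Buffer *copy = malloc(sizeof(*copy));
  if (!copy) {
    return -1;
  }
  copy->data = malloc(b->size);
  if (!copy->data) {
    free(copy);
    return -1;
  }
  memcpy(copy->data, b->data, b->size);
  copy->size = b->size;
  copy->refcount = 1;
  b->refcount--; /* drop our reference to the shared buffer */
  *buf = copy;
  return 0;
}
```

The cheap refcount check means the copy only happens when the codec actually kept a reference, matching the "only if necessary" in the ffmpeg docs.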
Pull Request: https://projects.blender.org/blender/blender/pulls/126317
After alignment, the rgb_frame's linesize is no longer equal to the actual
row data length, so when copying the data generated by the render into the
rgb_frame, the copy should use two different linesizes: the aligned one
for the destination and the pre-alignment one for the source.
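The fix amounts to a row-by-row copy that uses each buffer's own stride; a sketch:

```c
#include <stdint.h>
#include <string.h>

/* Copy image rows between buffers whose strides differ: the render output
 * is tightly packed (linesize == width * bytes_per_pixel), while the
 * AVFrame destination rows are padded out to the required alignment. */
static void copy_rows(uint8_t *dst, int dst_linesize,
                      const uint8_t *src, int src_linesize,
                      int row_bytes, int height)
{
  for (int y = 0; y < height; y++) {
    memcpy(dst + (size_t)y * dst_linesize,
           src + (size_t)y * src_linesize,
           (size_t)row_bytes);
  }
}
```

A single memcpy of the whole buffer would smear rows across the padding, which is exactly the bug being fixed.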
Pull Request: https://projects.blender.org/blender/blender/pulls/120184
ffmpeg AVFrame objects should use correct alignment between image
rows, or otherwise memory corruption can happen. In this particular
case, multi-threaded libswscale RGB->YUV conversion was trampling
over 4 bytes of the V plane at each thread boundary.
Pull Request: https://projects.blender.org/blender/blender/pulls/120168
Towards #118493: make movie writing functionality take ImBuf instead
of int* to pixel data.
While at it, make other bMovieHandle functions use "bool" return type
when it is strictly a success/failure result.
Pull Request: https://projects.blender.org/blender/blender/pulls/118559
ffmpeg's libswscale is used to do RGB<->YUV conversions on movie reading
and writing. The "context" for the scale operation was being created
and destroyed for each movie clip / animation. Now, maintain a cache
of the scale contexts instead.
E.g. in Gold edit, it only ever needs two contexts (one for reading
source movie clips since they are all exactly the same resolution
and format; and one for rendering the resulting movie).
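Such a cache only needs to key on the parameters that determine the conversion; a sketch with a tiny linear-scan cache (hypothetical key fields mirroring sizes and formats; the real cache would hold SwsContext pointers):

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
  int src_w, src_h, src_fmt;
  int dst_w, dst_h, dst_fmt;
  void *ctx; /* would be a SwsContext* in the real code */
} CacheEntry;

enum { CACHE_SIZE = 8 };
static CacheEntry g_cache[CACHE_SIZE];
static int g_cache_len = 0;
static int g_contexts_created = 0;

/* Return a context for the given conversion, creating one only on a miss. */
static void *get_scale_context(int sw, int sh, int sf, int dw, int dh, int df)
{
  for (int i = 0; i < g_cache_len; i++) {
    CacheEntry *e = &g_cache[i];
    if (e->src_w == sw && e->src_h == sh && e->src_fmt == sf &&
        e->dst_w == dw && e->dst_h == dh && e->dst_fmt == df) {
      return e->ctx; /* cache hit: reuse the existing context */
    }
  }
  if (g_cache_len == CACHE_SIZE) {
    g_cache_len = 0; /* trivial eviction; real code would free contexts */
  }
  CacheEntry *e = &g_cache[g_cache_len++];
  *e = (CacheEntry){sw, sh, sf, dw, dh, df, NULL};
  e->ctx = malloc(1); /* stand-in for sws_getContext(...) */
  g_contexts_created++;
  return e->ctx;
}
```

With such a key, an edit that uses one clip resolution and one render resolution hits the cache on every frame after the first, which is where the 10-20ms savings come from.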
During playback, on some of the "slow" frames (camera cuts) this
saves 10-20ms (Windows, Ryzen 5950X). Rendering whole movie goes
from 390sec to 376sec.
Pull Request: https://projects.blender.org/blender/blender/pulls/118130
The BLI_endian_defines.h include was removed in [0] because
the resulting object files were unchanged on little endian systems.
Using -Werror=undef with GCC prevents this from happening in the future.
[0]: 9e4129feeb
- "can not" -> "cannot" in many places (ambiguous, also see
Writing Style guide).
- "Bezier" -> "Bézier": proper spelling of the eponym.
- Tool keymaps: make "Uv" all caps.
- "FFMPEG" -> "FFmpeg" (official spelling).
- Use MULTIPLICATION SIGN U+00D7 instead of MULTIPLICATION X U+2715.
- "LClick" -> "LMB", "RClick" -> "RMB": this convention is used
everywhere else.
- "Save rendered the image..." -> "Save the rendered image...": typo.
- "Preserve Current retiming": title case for property.
- Bend status message: punctuation.
- "... class used to define the panel" -> "header": copy-paste error.
- "... class used to define the menu" -> "asset": copy-paste error.
- "Lights user to display objects..." -> "Lights used...": typo.
- "-setaudio require one argument" -> "requires": typo.
Some issues reported by Joan Pujolar and Tamar Mebonia.
Pull Request: https://projects.blender.org/blender/blender/pulls/117856
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If not already the case, you may consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.
After doing regular movie frame decoding, there's a "postprocess" step for
each incoming frame, which does deinterlacing if needed, then YUV->RGB
conversion, then a vertical image flip and additional interlace filtering
if needed. While this postprocess step is not the "heavy" part of movie
playback, it still takes 2-3ms per 1080p input frame being played.
This PR does two things:
- Similar to #116008, uses multi-threaded `sws_scale` to do YUV->RGB
conversion.
- Reintroduces "do vertical flip while converting to RGB", where possible.
That was removed in 2ed73fc97e due to issues on the arm64 platform, and
the theory that negative strides passed to sws_scale are not an
officially supported usage.
My take on the last point: negative strides to sws_scale are fine and
supported usage; ffmpeg just had a bug specifically on arm64 where they
were accidentally not respected. They fixed that for ffmpeg 6.0, and
backported it to all versions back to 3.4.13 -- you would not backport
something to 10 releases unless it was an actual bug fix!
I have tested the glitch_480p.mp4 that was originally attached to the
bug report #94237 back then, and it works fine both on x64 (Windows)
and arm64 (Mac).
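The negative-stride trick works by pointing the destination at its last row and stepping backwards; the pointer arithmetic in a plain-C sketch:

```c
#include <stdint.h>
#include <string.h>

/* Copy src into dst vertically flipped, by handing the copy loop a pointer
 * to dst's LAST row and a negative stride -- the same trick used when
 * asking sws_scale to flip while converting. */
static void copy_flipped(const uint8_t *src, int stride, int height,
                         uint8_t *dst)
{
  uint8_t *dst_row = dst + (size_t)(height - 1) * stride;
  int dst_stride = -stride;
  for (int y = 0; y < height; y++) {
    memcpy(dst_row, src + (size_t)y * stride, (size_t)stride);
    dst_row += dst_stride;
  }
}
```

Fusing the flip into the conversion pass avoids a separate full-frame copy afterwards.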
Timings, ffmpeg_postprocess cost for a single 1920x1080 resolution movie
strip inside VSE:
- Windows/VS2022 Ryzen 5950X: 3.04ms -> 1.18ms
- Mac/clang15 M1 Max: 1.10ms -> 0.71ms
Pull Request: https://projects.blender.org/blender/blender/pulls/116309
Whenever movie frame encoding needs a non-RGBA format (pretty much
always, e.g. H.264 uses YUV etc.), the ffmpeg code has been using
sws_scale() since 2007. But that function is completely single-threaded.
It can be made multi-threaded by passing the "threads" option to
SwsContext (in a cumbersome way), combined with the sws_scale_frame API.
That, however, requires frame data buffers to be allocated via the
AVBuffer machinery.
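The gain comes from splitting the frame into horizontal slices converted in parallel; a pthread sketch of the idea, with a trivial stand-in per-pixel operation (libswscale does the real slicing internally once the "threads" option is set):

```c
#include <pthread.h>
#include <stdint.h>

enum { NUM_SLICES = 4 };

typedef struct {
  const uint8_t *src;
  uint8_t *dst;
  int start, end; /* row range [start, end) */
  int width;
} Slice;

/* Stand-in per-pixel conversion; the real code does RGB->YUV. */
static void *convert_slice(void *arg)
{
  Slice *s = arg;
  for (int y = s->start; y < s->end; y++) {
    for (int x = 0; x < s->width; x++) {
      s->dst[y * s->width + x] = (uint8_t)(255 - s->src[y * s->width + x]);
    }
  }
  return NULL;
}

/* Convert a frame by farming out horizontal slices to worker threads. */
static void convert_frame_threaded(const uint8_t *src, uint8_t *dst,
                                   int width, int height)
{
  pthread_t threads[NUM_SLICES];
  Slice slices[NUM_SLICES];
  int rows = (height + NUM_SLICES - 1) / NUM_SLICES;
  for (int i = 0; i < NUM_SLICES; i++) {
    int start = i * rows;
    int end = start + rows > height ? height : start + rows;
    slices[i] = (Slice){src, dst, start, end, width};
    pthread_create(&threads[i], NULL, convert_slice, &slices[i]);
  }
  for (int i = 0; i < NUM_SLICES; i++) {
    pthread_join(threads[i], NULL);
  }
}
```

Slicing by rows is what makes per-row alignment and stride correctness matter at thread boundaries, as the earlier V-plane fix showed.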
Rendering a 300-frame part of Sprite Fright Edit (target H.264 Medium):
- Windows Ryzen 5950X: 16.1 -> 12.0 seconds (generate_video_frame part
4.7 -> 0.7 sec).
- Mac M1 Max: 13.1 -> 12.5 sec. The speedup is smaller, but comparatively,
an entirely different part of movie rendering (audio resampling inside
audaspace) is much more expensive there than on the Windows machine.