When building proxies at lower than 100% resolution, the video frame
downscaling step was single-threaded, as found via #127956.
Make it use the same threaded sws_scale machinery that the usual video
decoding/encoding uses. Video encoding/decoding only used it for
RGB<->YUV conversions, where source and destination sizes always match;
proxy building needs different source and destination sizes.
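For illustration only, here is a minimal sketch of the general idea:
destination rows are split into bands, each band is scaled on its own thread,
and every destination row is mapped back to a source row so the two sizes may
differ. This is not the code from this PR and does not use libswscale; it is
a plain nearest-neighbor scaler on a single 8-bit plane, and the names
scale_rows and scale_threaded are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not a Blender or FFmpeg function): nearest-neighbor
// scale of destination rows [y_begin, y_end) of a single 8-bit plane.
static void scale_rows(const std::vector<uint8_t> &src, int src_w, int src_h,
                       std::vector<uint8_t> &dst, int dst_w, int dst_h,
                       int y_begin, int y_end)
{
  for (int y = y_begin; y < y_end; y++) {
    // Map the destination row to a source row; the sizes may differ.
    const int sy = int(int64_t(y) * src_h / dst_h);
    for (int x = 0; x < dst_w; x++) {
      const int sx = int(int64_t(x) * src_w / dst_w);
      dst[size_t(y) * dst_w + x] = src[size_t(sy) * src_w + sx];
    }
  }
}

// Hypothetical driver: split the destination into row bands, one per thread.
// Bands do not overlap, so threads never write to the same pixel.
static std::vector<uint8_t> scale_threaded(const std::vector<uint8_t> &src,
                                           int src_w, int src_h,
                                           int dst_w, int dst_h,
                                           int num_threads)
{
  assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0 && num_threads > 0);
  assert(src.size() == size_t(src_w) * size_t(src_h));

  std::vector<uint8_t> dst(size_t(dst_w) * size_t(dst_h));
  num_threads = std::min(num_threads, dst_h);

  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; i++) {
    const int y_begin = int(int64_t(dst_h) * i / num_threads);
    const int y_end = int(int64_t(dst_h) * (i + 1) / num_threads);
    workers.emplace_back([&, y_begin, y_end]() {
      scale_rows(src, src_w, src_h, dst, dst_w, dst_h, y_begin, y_end);
    });
  }
  for (std::thread &worker : workers) {
    worker.join();
  }
  return dst;
}
```

Calling scale_threaded(src, 3840, 2160, 1920, 1080, 16) would mimic a 50%
proxy of one 4K plane. The result does not depend on the thread count, since
each destination row is computed independently of the others.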
Time taken to rebuild a 50% proxy for a 4K video file that is 1440 frames
(1 minute) long, on a Ryzen 5950X (Win10/VS2022):
- Blender 4.2: 20.1 sec, CPU usage 30-40%.
- Blender 4.3 main: 13.1 sec (ffmpeg build has been fixed to use SIMD),
CPU usage still 30-40% though.
- This PR: 8.3 sec, CPU usage ~95%.
Pull Request: https://projects.blender.org/blender/blender/pulls/128054