Looks like a logic inversion mistake.
Not clear how bad this issue is, as other code related to image metadata
seems to expect 1024 char max size too (e.g. `MAX_METADATA_STR` define
in `ed_draw.cc`). But this is potentially a very bad issue, and the fix
seems safe enough for 4.4 still.
Should also be backported to active LTSs.
Pull Request: https://projects.blender.org/blender/blender/pulls/135983
The main issue with 'type-less' standard C allocations is that no check on the
allocated type is possible.
This is a serious source of annoyance (and crashes) when making some
low-level structs non-trivial, as tracking down all usages of these
structs in higher-level other structs and their allocation is... really
painful.
MEM_[cm]allocN<T> templates on the other hand do check that the
given type is trivial, at build time (static assert), which makes such issues...
trivial to catch.
NOTE: New code should strive to use MEM_new (i.e. allocation and
construction) as much as possible, even for trivial PoD types.
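The build-time check can be sketched as follows. This is a minimal illustration only: `calloc_typed` and `PlainHeader` are hypothetical names, and the real templates may use different traits (see the MSVC note in the related allocator change).

```cpp
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical stand-in for a type-safe, zero-initializing allocator.
template<typename T> T *calloc_typed()
{
  // Rejects non-trivial types at build time instead of crashing at run time.
  static_assert(std::is_trivial_v<T>,
                "Zeroed raw allocation is only valid for trivial types");
  void *mem = std::calloc(1, sizeof(T));
  if (mem == nullptr) {
    throw std::bad_alloc();
  }
  return static_cast<T *>(mem);
}

// Hypothetical trivial struct. Adding e.g. a std::string member would make
// `calloc_typed<PlainHeader>()` fail to compile.
struct PlainHeader {
  int width;
  int height;
  float gamma;
};
```

Memory obtained this way is released with `std::free` in this sketch.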
Pull Request: https://projects.blender.org/blender/blender/pulls/135994
There is a bug in OpenEXR where it requests at least 4096 bytes even
if the file is smaller than that. Work around it by padding with zeros.
This was fixed upstream in commit 97f857131e6b4c43ab after the 3.3.2
release, but not in any official release yet.
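A minimal sketch of the padding idea, assuming the file contents are already in memory. `pad_for_reader` and `kMinReadableSize` are hypothetical names, not the ones used in the actual change.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// The 4096-byte minimum is the figure quoted in this message.
constexpr std::size_t kMinReadableSize = 4096;

// Hypothetical helper: copy `size` bytes and zero-pad up to the minimum size.
std::vector<std::uint8_t> pad_for_reader(const std::uint8_t *data, std::size_t size)
{
  // The vector is value-initialized, so any bytes past `size` are zero.
  std::vector<std::uint8_t> padded(std::max(size, kMinReadableSize));
  std::copy(data, data + size, padded.begin());
  return padded;
}
```

Buffers that are already 4096 bytes or larger keep their original size.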
Pull Request: https://projects.blender.org/blender/blender/pulls/135796
Various intermediate calculations would overflow inside both
`imb_save_openexr_float` and `imb_save_openexr_half`.
Additionally, use a raw array for the half conversion since `vector`
will perform an unnecessary zero-initialize on a large amount of memory.
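As an illustration of the overflow class (not the actual code), the sketch below widens to 64-bit before multiplying. The helper names are hypothetical, and the 42000x42000, 5-channel size is only an example of a large image.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: element count of an image buffer. The first operand is
// widened to 64-bit, so the intermediate products cannot wrap.
constexpr std::int64_t element_count(int width, int height, int channels)
{
  return std::int64_t(width) * height * channels;
}

// True if the value could have been stored in a 32-bit signed integer.
constexpr bool fits_in_int32(std::int64_t value)
{
  return value <= std::numeric_limits<std::int32_t>::max();
}
```

With plain `int` arithmetic, `width * height * channels` for such an image would exceed the 32-bit range, which is undefined behavior for signed integers.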
Refer to: #135648
Pull Request: https://projects.blender.org/blender/blender/pulls/135678
Increase the default OIIO limit for uncompressed image buffers. Without
this Cycles could, among some other operations, encounter the following
type of error:
```
E0307 19:21:38.489921 2588 tile.cpp:634] Error opening tile file t:\temp\blender_a26776\cycles-tile-buffer-34740-2157603138768-0-0.exr
OpenImageIO exited with a pending error message that was never
retrieved via OIIO::geterror(). This was the error message:
Uncompressed image size 33645.6 MB exceeds the 32768 MB limit.
Image claimed to be 42000x42000, 5-channel float. Possible corrupt input?
If this is a valid file, raise the OIIO attribute "limits:imagesize_MB".
```
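The size in the log can be reproduced with simple arithmetic, assuming 4 bytes per float channel and 1 MB = 1024 * 1024 bytes. The helper name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: uncompressed size in MB (1 MB = 1024 * 1024 bytes).
constexpr double uncompressed_size_mb(std::int64_t width,
                                      std::int64_t height,
                                      int channels,
                                      int bytes_per_channel)
{
  return double(width * height * channels * bytes_per_channel) / (1024.0 * 1024.0);
}
```

For 42000x42000 with 5 float channels this gives roughly 33645.6 MB, just above the 32768 MB limit.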
Users are able to bypass this themselves in two different ways if
the limit does not meet their needs.
The primary downside of this change is that it increases the memory
consumed if the file is actually malicious. Perhaps well beyond what
most consumer devices have available with physical+swap.
An alternate design would be to expose a System-level preference
for this limit and keep the default at 32 GB.
Refer to: #135648
Pull Request: https://projects.blender.org/blender/blender/pulls/135676
This PR creates 2 namespaces for VSE code:
- `blender::seq` for sequencer core code
- `blender::ed::vse` for editor code
These names are chosen to not be in conflict with each other.
No namespace was used for RNA.
Finally, file `BKE_sequencer_offscreen.h` was moved from BKE to sequencer.
Pull Request: https://projects.blender.org/blender/blender/pulls/135500
The general idea is to keep the 'old', C-style MEM_callocN signature, and slowly
replace most of its usages with the new, C++-style type-safer template version.
* `MEM_cnew<T>` allocation version is renamed to `MEM_callocN<T>`.
* `MEM_cnew_array<T>` allocation version is renamed to `MEM_calloc_arrayN<T>`.
* `MEM_cnew<T>` duplicate version is renamed to `MEM_dupallocN<T>`.
Similar type-safe template versions of `MEM_mallocN` will be added soon
as well.
Following discussions in !134452.
NOTE: For now static type checking in `MEM_callocN` and related are slightly
different for Windows MSVC. This compiler seems to consider structs using the
`DNA_DEFINE_CXX_METHODS` macro as non-trivial (likely because their default
copy constructors are deleted). So checks on trivially
constructible/destructible are used instead on this compiler/system.
Pull Request: https://projects.blender.org/blender/blender/pulls/134771
Replace (only two remaining) usages of C-style IMB_processor_apply_threaded
with just threading::parallel_for which is much easier to use in C++ without
intermediate structs.
IMB_display_buffer_acquire got faster as a result -- parallel for has lower
overhead compared to the task pool approach that the previous
implementation was using. While at it, noticed that
IMB_display_buffer_acquire was clearing just-allocated memory, immediately
before overwriting it. So that is now gone too.
IMB_display_buffer_acquire time during playback of 4K resolution float
content in VSE (Ryzen 5950X, Windows): 10.7ms -> 7.7ms
Pull Request: https://projects.blender.org/blender/blender/pulls/135269
IMB_alpha_under_color_[byte/float] functions are used when preparing
the rendered image for image/movie output with RGB channels (i.e. no
transparency). They were single threaded before, multi-thread them.
Time taken by them on 4K resolution image (mix of various transparency
values in source), on Ryzen 5950X/Windows:
- IMB_alpha_under_color_byte: 10.1ms -> 1.9ms
- IMB_alpha_under_color_float: 14.6ms -> 8.8ms (smaller speedup since
it becomes memory bandwidth limited)
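For reference, a per-pixel sketch of the byte case, assuming the operation is the usual composite of a straight-alpha pixel onto an opaque background color. `alpha_under_byte` is a hypothetical helper and the exact rounding in the real functions may differ.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Hypothetical helper: composite one straight-alpha RGBA byte pixel over an
// opaque background color given as floats in [0, 1].
std::array<std::uint8_t, 4> alpha_under_byte(const std::array<std::uint8_t, 4> &pixel,
                                             const std::array<float, 3> &background)
{
  const float alpha = pixel[3] / 255.0f;
  std::array<std::uint8_t, 4> result{};
  for (int c = 0; c < 3; c++) {
    const float value = pixel[c] * alpha + background[c] * 255.0f * (1.0f - alpha);
    result[c] = static_cast<std::uint8_t>(std::lround(value));
  }
  result[3] = 255; /* The result is fully opaque. */
  return result;
}
```

Each pixel is independent of the others, which is what makes the loop over pixels straightforward to split across threads.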
Pull Request: https://projects.blender.org/blender/blender/pulls/135258
OpenEXR DWA compression in Blender is derived from a more user-friendly
quality slider which has an intuitive range 0 .. 100.
Initially the mapping was done so that the visually lossless JPEG
quality of 97 was mapped to the default DWA compression 45. A point was
made that we should make it so default quality is mapped to the default
compression, following the intent of DWA, with rendering and compositing
as the main target.
This change adjusts the mapping so that quality of 90 is mapped to DWA
compression 45.
This change relies on the library update to fully utilize the DWA
compression #135037.
This change leads to the difference in the way proxies of EXR images
are generated:
```
                    DWA compression   Size (bytes)
Before the change               750    175,208,243
After the change                225     77,838,827
```
It is worth noting that the DWA compression seemed to be ignored in
the 4.4 branch before this change (this is what the original report is
about, a bit indirectly).
This is measured on the Fabrik Eingang footage converted to EXR. The
absolute value is probably not that important, it just shows the
reduction in size. This also leads to a lower quality of the proxy
image, but it is not worse than an actual JPEG proxy: the quality is
set to rather low 50 for the strip proxies.
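The numbers quoted in this message are consistent with a simple linear mapping. The sketch below is inferred from those data points only (97 -> 45 and 50 -> 750 before, 90 -> 45 and 50 -> 225 after); the function names are hypothetical and the actual code may clamp or round differently.

```cpp
// Hypothetical linear mappings inferred from the figures in this message.
constexpr float dwa_from_quality_before(int quality)
{
  return 15.0f * (100 - quality);
}

constexpr float dwa_from_quality_after(int quality)
{
  return 4.5f * (100 - quality);
}
```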
Ref #134802
Pull Request: https://projects.blender.org/blender/blender/pulls/135103
Float->byte rendered image dithering uses a triangle noise algorithm. Keep
the algorithm the same, just make some improvements and fix some issues:
1) The hash function for noise was using "trig" hash from "On generating
random numbers" (Rey 1998), but that is not a great quality hash, plus it
can produce very different results between CPUs/GPUs. Replace it with
"iqint3" (recommended by "Hash Functions for GPU Rendering", JCGT 2020),
which is same performance on GPU, faster on CPU, and much better quality.
This is the same hash as Cycles already uses elsewhere. Also it is purely
integer based, so exactly the same results on all platforms.
2) For the above point, replace `dither_random_value` to take integer
pixel coordinates and adjust calling code accordingly. Some previous
callers were (accidentally?) passing integer coordinates already. Other
places actually get a tiny bit simpler, since they now no longer need an
extra multiplication.
3) The CPU dithering path was wrongly introducing bias, i.e. making the
image lighter. The CPU path also needs dither noise to be in [-1..+1]
range (not [-0.5..+1.5]!) just like GPU path does, since the later
float->byte conversion already does rounding.
4) The CPU dithering path was using thread-slice-local Y coordinate,
meaning the dithering pattern was repeating vertically. The more CPU cores
you use, the worse the repetition.
5) Change the way that uniform noise is converted to triangle noise.
Previous implementation was based on one shadertoy from 2015, change it
to another shadertoy from 2020. The new one fixes issues with the old way,
and it just works on the CPU too, so now both CPU and GPU code paths are
exactly the same.
6) Cleanup: remove DitherContext, just a single float is enough
Performance and image comparisons in the PR.
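A sketch of points 1, 2 and 5 combined. The hash is "iqint3" in its commonly published form (an assumption here, worth checking against the JCGT paper), and the remap is a square-root based uniform-to-triangle conversion that may differ in detail from the shadertoy the change is based on. Function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// "iqint3" integer hash as commonly published (assumed form).
inline std::uint32_t hash_iqint3(std::uint32_t x, std::uint32_t y)
{
  const std::uint32_t qx = 1103515245u * ((x >> 1u) ^ y);
  const std::uint32_t qy = 1103515245u * ((y >> 1u) ^ x);
  return 1103515245u * (qx ^ (qy >> 3u));
}

// Triangle-distributed noise in [-1, +1] from integer pixel coordinates.
inline float dither_noise(int x, int y)
{
  const std::uint32_t n = hash_iqint3(std::uint32_t(x), std::uint32_t(y));
  const float uniform = float(n) * (1.0f / 4294967296.0f); /* [0, 1] */
  const float centered = uniform * 2.0f - 1.0f;            /* [-1, +1] */
  /* Square root remap: uniform input becomes triangle-shaped around zero. */
  const float t = 1.0f - std::sqrt(std::fabs(centered));
  return centered < 0.0f ? t : -t;
}
```

Since the hash is integer-only and the coordinates are absolute image coordinates, the value for a given pixel does not depend on how rows are distributed over threads.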
Pull Request: https://projects.blender.org/blender/blender/pulls/135224
Looks like "divers" comes from ancient times, Dutch word meaning "misc".
But by now, everything in that file is about conversion between different
pixel data types.
Pull Request: https://projects.blender.org/blender/blender/pulls/135165
There's no point in having non-threaded image color space conversion functions.
So merge the threaded and non-threaded functions and clarify names while at it:
- IMB_colormanagement_transform & IMB_colormanagement_transform_threaded
-> IMB_colormanagement_transform_float
- IMB_colormanagement_transform_byte & IMB_colormanagement_transform_byte_threaded
-> IMB_colormanagement_transform_byte
- IMB_colormanagement_transform_from_byte & IMB_colormanagement_transform_from_byte_threaded
-> IMB_colormanagement_transform_byte_to_float
These places were doing single-threaded colorspace conversion previously, and
thus now are potentially faster:
- IMB_rect_from_float (used in many places)
- EXR image "save as render" saving (image_exr_from_scene_linear_to_output)
- Object baking (write_internal_bake_pixels, write_external_bake_pixels)
- General image saving, clipboard copy, movie preparation
(IMB_colormanagement_imbuf_for_write)
- Linear conversion when reading HDR images/movies
(colormanage_imbuf_make_linear)
- EXR multi-layer conversion (render_result_new_from_exr)
For one case I benchmarked, which is rendering a 2D stabilized 10 bit input
movie clip out of VSE, the total render time went from 49sec down to 44sec
(Ryzen 5950X); one of the single-threaded parts was the colorspace conversion
in the movieclip code.
Pull Request: https://projects.blender.org/blender/blender/pulls/135155
Speedup IMB_rotate_orthogonal (used for example in auto-rotating
videos that were shot sideways on a phone) by: 1) not copying previous
pixel values into new result, only for them to be immediately
overwritten by rotated pixels, and 2) using multi-threading.
Performing rotation of 1920x1080 resolution HDR (float) video frame
goes from 20ms down to 5ms (Ryzen 5950X, Windows)
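Point 1 can be illustrated with a single-channel sketch. `rotate_90_cw` is a hypothetical helper, and the orientation convention (row 0 at the top) is an assumption for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: rotate a single-channel image 90 degrees clockwise
// (row 0 is the top row). Every destination pixel is written exactly once,
// so the destination is allocated uninitialized, with no prior copy or clear.
std::unique_ptr<float[]> rotate_90_cw(const std::vector<float> &src, int width, int height)
{
  const std::size_t pixel_count = std::size_t(width) * height;
  assert(src.size() == pixel_count);
  std::unique_ptr<float[]> dst(new float[pixel_count]);
  for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
      const int dst_x = height - 1 - y;
      const int dst_y = x;
      /* The destination image is `height` pixels wide. */
      dst[std::size_t(dst_y) * height + dst_x] = src[std::size_t(y) * width + x];
    }
  }
  return dst;
}
```

Source rows write to disjoint destination pixels, so the outer loop can be split across threads without synchronization.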
Pull Request: https://projects.blender.org/blender/blender/pulls/135158
The File Output node crashes when saving a 16-bit vector image in an
RGBA image. That's because the OIIO writer assumes 4-channel buffer
while the buffer provided by the node is only 3-channel. To fix this,
the OIIO writer is extended to support all possible combinations of
source and target channels.
Pull Request: https://projects.blender.org/blender/blender/pulls/134789
All 2D vectors related to image transform code were changed to float2.
Previously, it was decided that a 4x4 matrix should be used for 2D
affine transforms, but this is changed to 3x3 now.
Texture painting code did rely on `IMB_transform` with 4x4 matrix.
To avoid large changes, I have added function
`BLI_rctf_transform_calc_m3_pivot_min`.
Main motivation is cleaner code - ease of use of the C++ API, and avoiding
returning values by arguments.
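To illustrate why 3x3 is enough for 2D affine transforms, and what returning by value looks like, here is a small sketch. `Float2`, `Float3x3` and both functions are hypothetical stand-ins (row-major here for readability), not the actual math types.

```cpp
#include <array>

struct Float2 {
  float x, y;
};

/* Row-major 3x3 matrix, a hypothetical stand-in for the real matrix type. */
using Float3x3 = std::array<std::array<float, 3>, 3>;

/* Scale about the origin, then translate. The result is returned by value. */
inline Float3x3 make_scale_translate(Float2 scale, Float2 translate)
{
  Float3x3 m = {{{scale.x, 0.0f, translate.x},
                 {0.0f, scale.y, translate.y},
                 {0.0f, 0.0f, 1.0f}}};
  return m;
}

/* Transform a 2D point, treating it as (x, y, 1). */
inline Float2 transform_point(const Float3x3 &m, Float2 p)
{
  return {m[0][0] * p.x + m[0][1] * p.y + m[0][2],
          m[1][0] * p.x + m[1][1] * p.y + m[1][2]};
}
```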
Pull Request: https://projects.blender.org/blender/blender/pulls/133692
Since one user-defined conversion operator is allowed during implicit conversion,
and after this conversion there is a constructor which can accept the result
of the conversion, there was a backdoor for vector types to up-cast their
dimensions via a cast to the pointer type of a vector component. Since it was
implicit and unintentional, it led to buffer overflows.
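The backdoor can be reproduced with simplified, hypothetical types: a 2D vector with a pointer conversion operator, and a 3D vector constructible from a pointer. The deleted constructor is only one possible guard, not necessarily what this change does.

```cpp
#include <type_traits>

struct Vec2 {
  float x, y;
  /* User-defined conversion to a pointer to the components. */
  operator const float *() const
  {
    return &x;
  }
};

struct Vec3Unsafe {
  float x, y, z;
  /* Accepts the pointer produced by `Vec2`, then reads 3 floats from it. */
  explicit Vec3Unsafe(const float *p) : x(p[0]), y(p[1]), z(p[2]) {}
};

struct Vec3Guarded {
  float x, y, z;
  explicit Vec3Guarded(const float *p) : x(p[0]), y(p[1]), z(p[2]) {}
  /* One possible guard: reject construction from a smaller vector. */
  Vec3Guarded(const Vec2 &) = delete;
};

/* `Vec3Unsafe(vec2)` compiles and would read past the end of `vec2`. */
static_assert(std::is_constructible_v<Vec3Unsafe, Vec2>);
static_assert(!std::is_constructible_v<Vec3Guarded, Vec2>);
```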
Pull Request: https://projects.blender.org/blender/blender/pulls/132927
* Ensure valid bit depth is set along with file type
* Guard against invalid inputs in stereo imbuf creation
* Remove some unused code
Thanks Yiming Wu for finding the cause.
Pull Request: https://projects.blender.org/blender/blender/pulls/133499
This works around ffmpeg bug https://trac.ffmpeg.org/ticket/10755
where for specific files that are:
- Ogg container format, with supported audio stream (e.g. Vorbis),
- But the video stream is not an Ogg-compatible one (such as Theora); rather,
it is an embedded "album art" (AV_DISPOSITION_ATTACHED_PIC) in
MJPEG, PNG or some other non-Ogg format.
Calling any sort of ffmpeg "seek" function on that video stream just
aborts from innards of ffmpeg.
So to work around this:
- Detect such files (ogg container, non-theora video, attached picture
disposition) and for those:
- Never seek within them, and only ever decode one frame. Return that
frame for any & all "give me a frame" requests.
- Additionally, calculating "how many frames this video has" for such
files also returns nonsense ("millions of frames") since their frame
rate is set to like 90000 or similar. So pretend they have a "sane"
frame rate. Do all this frame rate calculation just once when opening
the video, and use that result in all other places.
- Never build proxies for such video files, since e.g. "timecode"
for them does not make sense.
All of this could be removed once/if ffmpeg fixes their issue.
Pull Request: https://projects.blender.org/blender/blender/pulls/132920
Looks like this regressed in c1f5d8d023 (Blender 3.1), basically
since then if there was no video, then no audio was ever written
either.
From what I can tell, the original change tried to fix the problem
that "file size autosplit" logic was after video, but before audio
data writing. So it moved audio writing to be before the split (good),
but also (not sure whether by accident) moved audio writing to
only happen if video is written.
Pull Request: https://projects.blender.org/blender/blender/pulls/132874
A workaround for an ffmpeg issue, one that everyone (e.g. OBS) seems to be using.
By default ffmpeg uses built-in VP8/VP9 decoders, however those
do not detect alpha channel (https://trac.ffmpeg.org/ticket/8344 -
the bug filed in 2019, currently still open in ffmpeg 7.1. There's
an older report from 2016 too, https://trac.ffmpeg.org/ticket/5792).
The trick for VP8/VP9 is to explicitly force use of libvpx decoder.
Only do this where alpha_mode=1 metadata is set. Note that in order
to work, the previously initialized format context must be closed
and a fresh one with explicitly requested codec must be created.
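The selection logic can be sketched without any ffmpeg calls. `forced_decoder_name` is a hypothetical helper; "libvpx" and "libvpx-vp9" are the names ffmpeg uses for its libvpx-backed VP8 and VP9 decoders, and the returned name would then be looked up with something like `avcodec_find_decoder_by_name`.

```cpp
#include <string_view>

// Hypothetical helper: which decoder to force for a given codec name and
// `alpha_mode` metadata value. An empty result means "keep the default decoder".
inline std::string_view forced_decoder_name(std::string_view codec,
                                            std::string_view alpha_mode)
{
  if (alpha_mode != "1") {
    return {};
  }
  if (codec == "vp8") {
    return "libvpx";
  }
  if (codec == "vp9") {
    return "libvpx-vp9";
  }
  return {};
}
```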
Pull Request: https://projects.blender.org/blender/blender/pulls/132795