BLI_snprintf and related functions could truncate partial UTF8
code-points, which would then cause problems elsewhere -
Python raises an exception when accessing for example.
Existing uses of BLI_snprintf should use the UTF8 versions in most
cases, except for file paths which are not required to be UTF8.
A non UTF8 title on Wayland causes disconnection from the server.
Resolve by using the path as-is unless invalid UTF8 byte sequences
are found, in that case they're replaced by `?`.
This is skipped on WIN32 & macOS since the issue only exists on Wayland.
Similar to BLI_str_utf8_invalid_strip except that it substitutes
invalid characters and doesn't change the string length.
Useful for displaying strings that include invalid UTF8 code-points.
This commit enables Blender 4.5 to use (to some extent) blendfiles from
Blender 5.0 and later using 'long' ID names (i.e. ID names over 63 bytes).
On a more general perspective, it also introduces safer handling of
potentially corrupted ID names in a blendfile.
This is achieved by carefully checking for non-null terminated ID names
early on in readfile process, and then:
* Truncating and ensuring uniqueness of ID names.
* Doing similar process for action slot and slot users identifiers.
* In linking (and appending) context, such IDs are totally ignored. They are
not listed, and are considered as missing if some other (valid) linked ID
attempt to indirectly link them).
* Informing users through usual reporting ways.
Technically, this mainly changes two areas of the readfile code related to IDs
themselves:
* The utils `blo_bhead_id_name` that returns the ID name of an ID BHead,
without actually reading that ID, now check for a valid null-terminated
string of `MAX_ID_NAME` max size, and returns a `nullptr` on error.
_This essentially prevents listing and linking such IDs, in any way._
* The actual ID reading code (`read_id_struct`) does the same check, and
truncate the ID name to its maximum allowed length.
* Both of above checks also set a new FileData flag
(`FD_FLAGS_HAS_LONG_ID_NAME`), which is used to ensure that ID names (and
related actions slots identifiers) remain unique, and report to info to the
user.
Implements #137608.
Branched out from !137196.
Co-authored-by: michal.krupa <michal.krupa@cdprojektred.com>
Co-authored-by: Campbell Barton <ideasman42@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/139336
This PR adds new typographical line breaking opportunities, characters
that break depending on what follows. For example comma if not followed
by a space or a number. Period if not followed by a space or another
period. Along with related non-Latin equivalents, like Greek question
mark, full width punctuation, Hebrew, Arabic, etc.
Pull Request: https://projects.blender.org/blender/blender/pulls/137934
The PR allows specifying the method of line breaking. Current functions
gain an optional argument that defaults to "Minimal" which breaks as it
does now, only on space and line feed. "Typographical" breaks on
different types of spaces, backslashes, underscore, forward slash if
not following a number, dashes if not following a space, Chinese,
Japanese, and Korean Ideograms, and Tibetan intersyllabic marks. "Path"
breaks on space, underscore, and per-platform path separators. There is
also a "HardLimit" flag that forces breaking at character boundary at
the wrapping limit.
Pull Request: https://projects.blender.org/blender/blender/pulls/135203
Previously, there was a `StringRef.copy` method which would copy the string into
the given buffer. However, it was not defined for the case when the buffer was
too small. It moved the responsibility of making sure the buffer is large enough
to the caller.
Unfortunately, in practice that easily hides bugs in builds without asserts
which don't come up in testing much. Now, the method is replaced with
`StringRef.copy_utf8_truncated` which has much more well defined semantics and
also makes sure that the string remains valid utf-8.
This also renames `unsafe_copy` to `copy_unsafe` to make the naming more similar
to `copy_utf8_truncated`.
Pull Request: https://projects.blender.org/blender/blender/pulls/133677
This makes it clearer other "safe" functions should be used in
combination with the resulting offsets.
Also correct doc-string which wasn't updated from the "or_error()"
version of this function.
There were enough cases of callers ignoring a potential the error value,
using the column width for e.g. to calculate pixel sizes, or the size in
bytes to calculate buffer offsets.
Since text fields & labels can include characters that return an error
from BLI_str_utf8_as_unicode, add the suffix to make this explicit.
Strings that include Latin1 encoding or corrupt UTF8 byte sequences
could read past the buffer bounds (stepping over the null terminator).
Resolve by passing in the string length.
Other changes to support non-UTF8 byte sequences:
- BLI_str_utf8_offset_{to/from}_index were accumulating
the UTF8 offset without accounting for non-UTF8 characters
which could cause a buffer underflow or enter an eternal loop.
- BLI_str_utf8_offset_to_index would read past the buffer bounds if the
offset passed in if it was in the middle of a UTF8 byte sequence.
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.
A lot of files were missing copyright field in the header and
the Blender Foundation contributed to them in a sense of bug
fixing and general maintenance.
This change makes it explicit that those files are at least
partially copyrighted by the Blender Foundation.
Note that this does not make it so the Blender Foundation is
the only holder of the copyright in those files, and developers
who do not have a signed contract with the foundation still
hold the copyright as well.
Another aspect of this change is using SPDX format for the
header. We already used it for the license specification,
and now we state it for the copyright as well, following the
FAQ:
https://reuse.software/faq/
Reserve the term `len` for string length, some functions used this for
an string/array length, others a destination buffer size
(even within a single function declaration).
Also improve naming consistency across different functions.
Only the text editor supported the primary clipboard & only for modal
selection. Now selecting text in the console & 3D text editing also
sets the primary clipboard under X11 & Wayland.
Notes:
- Pasting from the primary clipboard isn't yet exposed in the key-map
so in practice it's only useful for pasting text outside of Blender.
- Use skip-save option when pasting from the primary selection
so this is never used by the regular paste shortcut.
- This commit adds a primary-clipboard flag to WM_capabilities_flag() so
creating the the copy-buffer is only performed when necessary.
Use a shorter/simpler license convention, stops the header taking so
much space.
Follow the SPDX license specification: https://spdx.org/licenses
- C/C++/objc/objc++
- Python
- Shell Scripts
- CMake, GNUmakefile
While most of the source tree has been included
- `./extern/` was left out.
- `./intern/cycles` & `./intern/atomic` are also excluded because they
use different header conventions.
doc/license/SPDX-license-identifiers.txt has been added to list SPDX all
used identifiers.
See P2788 for the script that automated these edits.
Reviewed By: brecht, mont29, sergey
Ref D14069
This patch implements the vector types (i.e:`float2`) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the `blender::math` namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
####Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others
we currently don't have (uintX, intX). All these variations were
asking for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector
functions should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the `BLI_(float|double|mpq)(2|3|4).hh` is a
bit of a let down. Most clases are incomplete, out of sync with each
others with different codestyles, and some functions that should be
static are not (i.e: `float3::reflect()`).
####Upsides:
- Still support `.x, .y, .z, .w` for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types
and can be restricted to certain types. Also template specialization
let us define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance
is the same.
####Downsides:
- Might impact debugability. Though I would arge that the bugs are
rarelly caused by the vector class itself (since the operations are
quite trivial) but by the type conversions.
- Might impact compile time. I did not saw a significant impact since
the usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length.
For instance, one can't call `len_squared_v3v3` in
`math::length_squared()` and call it a day.
- Type cast does not work with the template version of the `math::`
vector functions. Meaning you need to manually cast `float *` and
`(float *)[3]` to `float3` for the function calls.
i.e: `math::distance_squared(float3(nearest.co), positions[i]);`
- Some parts might loose in readability:
`float3::dot(v1.normalized(), v2.normalized())`
becoming
`math::dot(math::normalize(v1), math::normalize(v2))`
But I propose, when appropriate, to use
`using namespace blender::math;` on function local or file scope to
increase readability.
`dot(normalize(v1), normalize(v2))`
####Consideration:
- Include back `.length()` method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement. It felt
like too much for what we need and would be difficult to extend / modify
to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches `delaunay_2d.cc` and the intersection code. I would like
to know @howardt opinion on the matter.
- The `noexcept` on the copy constructor of `mpq(2|3)` is being removed.
But according to @JacquesLucke it is not a real problem for now.
I would like to give a huge thanks to @JacquesLucke who helped during this
and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: https://developer.blender.org/D13791
This patch implements the vector types (i.e:float2) by making heavy
usage of templating. All vector functions are now outside of the vector
classes (inside the blender::math namespace) and are not vector size
dependent for the most part.
In the ongoing effort to make shaders less GL centric, we are aiming
to share more code between GLSL and C++ to avoid code duplication.
Motivations:
- We are aiming to share UBO and SSBO structures between GLSL and C++.
This means we will use many of the existing vector types and others we
currently don't have (uintX, intX). All these variations were asking
for many more code duplication.
- Deduplicate existing code which is duplicated for each vector size.
- We also want to share small functions. Which means that vector functions
should be static and not in the class namespace.
- Reduce friction to use these types in new projects due to their
incompleteness.
- The current state of the BLI_(float|double|mpq)(2|3|4).hh is a bit of a
let down. Most clases are incomplete, out of sync with each others with
different codestyles, and some functions that should be static are not
(i.e: float3::reflect()).
Upsides:
- Still support .x, .y, .z, .w for readability.
- Compact, readable and easilly extendable.
- All of the vector functions are available for all the vectors types and
can be restricted to certain types. Also template specialization let us
define exception for special class (like mpq).
- With optimization ON, the compiler unroll the loops and performance is
the same.
Downsides:
- Might impact debugability. Though I would arge that the bugs are rarelly
caused by the vector class itself (since the operations are quite trivial)
but by the type conversions.
- Might impact compile time. I did not saw a significant impact since the
usage is not really widespread.
- Functions needs to be rewritten to support arbitrary vector length. For
instance, one can't call len_squared_v3v3 in math::length_squared() and
call it a day.
- Type cast does not work with the template version of the math:: vector
functions. Meaning you need to manually cast float * and (float *)[3] to
float3 for the function calls.
i.e: math::distance_squared(float3(nearest.co), positions[i]);
- Some parts might loose in readability:
float3::dot(v1.normalized(), v2.normalized())
becoming
math::dot(math::normalize(v1), math::normalize(v2))
But I propose, when appropriate, to use
using namespace blender::math; on function local or file scope to
increase readability. dot(normalize(v1), normalize(v2))
Consideration:
- Include back .length() method. It is quite handy and is more C++
oriented.
- I considered the GLM library as a candidate for replacement.
It felt like too much for what we need and would be difficult to
extend / modify to our needs.
- I used Macros to reduce code in operators declaration and potential
copy paste bugs. This could reduce debugability and could be reverted.
- This touches delaunay_2d.cc and the intersection code. I would like to
know @Howard Trickey (howardt) opinion on the matter.
- The noexcept on the copy constructor of mpq(2|3) is being removed.
But according to @Jacques Lucke (JacquesLucke) it is not a real problem
for now.
I would like to give a huge thanks to @Jacques Lucke (JacquesLucke) who
helped during this and pushed me to reduce the duplication further.
Reviewed By: brecht, sergey, JacquesLucke
Differential Revision: http://developer.blender.org/D13791
- Added space below non doc-string comments to make it clear
these aren't comments for the symbols directly below them.
- Use doxy sections for some headers.
- Minor improvements to doc-strings.
Ref T92709
Besides helping to avoid buffer overflow errors this reduces complexity
of BLI_str_utf32_as_utf8 which needed a special loop for the last 6
characters to avoid writing past the buffer bounds.
Also add BLI_str_utf8_from_unicode_len which only returns the length.
Various changes to reduce risk of out of bounds errors in utf8 seeking.
- Remove BLI_str_prev_char_utf8
This function could potentially scan past the beginning of a string.
Use BLI_str_find_prev_char_utf8 instead which takes a limiting
string start argument.
- Swap arguments for BLI_str_find_prev_char_utf8 so the stepping
argument is first and the limiting argument is last.
This matches BLI_str_find_next_char_utf8.
- Change behavior of these functions to return it the start or end
pointers instead of NULL, which complicated use of these functions
to calculate offsets.
Callers that need to check if the limits were reached can compare
the return value with the start/end pointers.
- Return 'const char *' from these functions
so they don't remove const from the input arguments.
Remove BLI_str_utf8_as_unicode_and_size and
BLI_str_utf8_as_unicode_and_size_safe.
Use BLI_str_utf8_as_unicode_step instead since it takes
a buffer bounds argument to prevent buffer over-reading.
There were multiple utf8 functions which treated
errors slightly differently.
Split BLI_str_utf8_as_unicode_step into two functions.
- BLI_str_utf8_as_unicode_step_or_error returns error value
when decoding fails and doesn't step.
- BLI_str_utf8_as_unicode_step always steps forward at least one
returning the byte value without decoding
(needed to display some latin1 file-paths).
Font drawing uses BLI_str_utf8_as_unicode_step and no longer
check for error values.
Add a string length argument to BLI_str_utf8_as_unicode_step to prevent
reading past the buffer bounds or the intended range since some callers
of this function take a string length to operate on part of the string.
Font drawing for example didn't respect the length argument,
potentially causing a buffer over-read with multi-byte characters
that could read past the end of the string.
The following command would read 5 bytes past the end of the input.
`BLF_draw(font_id, (char[]){252}, 1);`
In practice strings are typically null terminated so this didn't crash
reading past buffer bounds.
Nevertheless, this wasn't correct and could cause bugs in the future.
Clamping by the length now has the same behavior as a null byte.
Add test to ensure this is working as intended.