Commit Graph

24 Commits

Author SHA1 Message Date
Campbell Barton
0501555dcc BLI_string: add UTF8 aware BLI_snprintf functions & macros
BLI_snprintf and related functions could truncate partial UTF8
code-points, which would then cause problems elsewhere -
Python raises an exception when accessing for example.

Existing uses of BLI_snprintf should use the UTF8 versions in most
cases, except for file paths which are not required to be UTF8.
2025-07-22 16:25:51 +10:00
Campbell Barton
54ac76e425 BLI_string_utf8: add BLI_str_utf8_column_count
Add a utility function to count the number of columns in a string.
This is just a convenience wrapper around BLI_str_utf8_offset_to_column.
2025-07-15 13:44:10 +10:00
Campbell Barton
6d06fc3979 Fix immediate exit when displaying an invalid UTF8 title on Wayland
A non UTF8 title on Wayland causes disconnection from the server.
Resolve by using the path as-is unless invalid UTF8 byte sequences
are found, in that case they're replaced by `?`.

This is skipped on WIN32 & macOS since the issue only exists on Wayland.
2025-06-26 18:01:06 +10:00
Campbell Barton
c5bae85893 BLI_string: add BLI_str_utf8_invalid_substitute
Similar to BLI_str_utf8_invalid_strip except that it substitutes
invalid characters and doesn't change the string length.

Useful for displaying strings that include invalid UTF8 code-points.
2025-06-26 18:01:06 +10:00
Bastien Montagne
a883f555ae Blender 4.5 Forward Compatibility for Long ID Names.
This commit enables Blender 4.5 to use (to some extent) blendfiles from
Blender 5.0 and later using 'long' ID names (i.e. ID names over 63 bytes).

On a more general perspective, it also introduces safer handling of
potentially corrupted ID names in a blendfile.

This is achieved by carefully checking for non-null terminated ID names
early on in readfile process, and then:
* Truncating and ensuring uniqueness of ID names.
* Doing similar process for action slot and slot users identifiers.
* In linking (and appending) context, such IDs are totally ignored. They are
  not listed, and are considered as missing if some other (valid) linked ID
  attempt to indirectly link them).
* Informing users through usual reporting ways.

Technically, this mainly changes two areas of the readfile code related to IDs
themselves:
* The utils `blo_bhead_id_name` that returns the ID name of an ID BHead,
  without actually reading that ID, now check for a valid null-terminated
  string of `MAX_ID_NAME` max size, and returns a `nullptr` on error.
  _This essentially prevents listing and linking such IDs, in any way._
* The actual ID reading code (`read_id_struct`) does the same check, and
  truncate the ID name to its maximum allowed length.
* Both of above checks also set a new FileData flag
  (`FD_FLAGS_HAS_LONG_ID_NAME`), which is used to ensure that ID names (and
  related actions slots identifiers) remain unique, and report to info to the
  user.

Implements #137608.

Branched out from !137196.

Co-authored-by: michal.krupa <michal.krupa@cdprojektred.com>
Co-authored-by: Campbell Barton <ideasman42@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/139336
2025-06-06 14:19:00 +02:00
Campbell Barton
b3dfde88f3 Cleanup: spelling in comments (check_spelling_* target)
Also uppercase acronyms: API, UTF & ASCII.
2025-05-17 10:17:37 +10:00
Campbell Barton
2d4290f285 Cleanup: spelling in comments (make check_spelling_*)
Also inconsistent capitalization.
2025-04-24 22:45:22 +00:00
Harley Acheson
9dd4e90136 UI: Line Break on Conditional Punctuation
This PR adds new typographical line breaking opportunities, characters
that break depending on what follows. For example comma if not followed
by a space or a number. Period if not followed by a space or another
period. Along with related non-Latin equivalents, like Greek question
mark, full width punctuation, Hebrew, Arabic, etc.

Pull Request: https://projects.blender.org/blender/blender/pulls/137934
2025-04-24 20:42:32 +02:00
Campbell Barton
3bf92a6e83 Cleanup: rename nullptr to null for null-characters 2025-04-11 23:55:00 +00:00
Campbell Barton
f2e4b35589 Cleanup: rename UTF8 stripping arguments to avoid confusion 2025-04-12 09:50:14 +10:00
Harley Acheson
4894c888ee BLF: Configurable Line Breaking Behavior
The PR allows specifying the method of line breaking. Current functions
gain an optional argument that defaults to "Minimal" which breaks as it
does now, only on space and line feed. "Typographical" breaks on
different types of spaces, backslashes, underscore, forward slash if
not following a number, dashes if not following a space, Chinese,
Japanese, and Korean Ideograms, and Tibetan intersyllabic marks. "Path"
breaks on space, underscore, and per-platform path separators. There is
also a "HardLimit" flag that forces breaking at character boundary at
the wrapping limit.

Pull Request: https://projects.blender.org/blender/blender/pulls/135203
2025-03-18 18:02:36 +01:00
Brecht Van Lommel
e2e1984e60 Refactor: Convert remainder of blenlib to C++
A few headers like BLI_math_constants.h and BLI_utildefines.h keep working
for C code, for remaining makesdna and userdef defaults code in C.

Pull Request: https://projects.blender.org/blender/blender/pulls/134406
2025-02-12 23:01:08 +01:00
Brecht Van Lommel
6b6cd3307b Cleanup: Various clang-tidy warnings in blenlib
Pull Request: https://projects.blender.org/blender/blender/pulls/133734
2025-01-31 17:03:17 +01:00
Jacques Lucke
e1753900b7 BLI: improve UTF-8 safety when copying StringRef to char buffers
Previously, there was a `StringRef.copy` method which would copy the string into
the given buffer. However, it was not defined for the case when the buffer was
too small. It moved the responsibility of making sure the buffer is large enough
to the caller.

Unfortunately, in practice that easily hides bugs in builds without asserts
which don't come up in testing much. Now, the method is replaced with
`StringRef.copy_utf8_truncated` which has much more well defined semantics and
also makes sure that the string remains valid utf-8.

This also renames `unsafe_copy` to `copy_unsafe` to make the naming more similar
to `copy_utf8_truncated`.

Pull Request: https://projects.blender.org/blender/blender/pulls/133677
2025-01-29 12:12:27 +01:00
Campbell Barton
98cae94f6b Fix potential out of bounds read in UTF8 string length calculation
The length checking wasn't accounting for null bytes within multi-byte
sequences and could step over the null bytes.

For BLI_strlen_utf8 this could result in an out of bounds read.

In practice most UTF8 data is validated so the extra checks
are mainly to prevent errors on invalid or corrupt UTF8 text.
2024-10-25 16:50:10 +11:00
Campbell Barton
3f8cd44485 Cleanup: move BLI_strict_flags.h last, not that it should be kept last
Also add a note in the header why it should be kept last.
2024-02-14 13:40:31 +11:00
Campbell Barton
e72c9397f5 Cleanup: various non-functional changes for C++ 2024-02-02 10:43:18 +11:00
Hans Goudey
0618de49ad Cleanup: Replace MIN/MAX macros with C++ functions
Use `std::min` and `std::max` instead. Though keep MIN2 and MAX2
just for C code that hasn't been moved to C++ yet.

Pull Request: https://projects.blender.org/blender/blender/pulls/117384
2024-01-22 15:58:18 +01:00
Campbell Barton
375c217ea0 Cleanup: apply clang-tidy modernize-deprecated-headers 2024-01-08 11:32:49 +11:00
Campbell Barton
aaf05c2497 Cleanup: various C++ changes (use nullptr, function style casts) 2023-11-07 11:35:16 +11:00
Campbell Barton
6983c14955 Cleanup: spelling in comments, use doxygen doc-strings 2023-11-02 16:43:04 +11:00
Campbell Barton
c8c2343b4b Cleanup: spelling in comments 2023-10-23 10:09:05 +11:00
Campbell Barton
2864c20302 Cleanup: various C++ changes (use nullptr, function style casts) 2023-10-21 21:17:57 +11:00
Sergey Sharybin
21c8af467d Cleanup: Convert winfunc and utfconv to C++
Basically, the intern/utfconv directory, as well as users of
these headers.

Pull Request: https://projects.blender.org/blender/blender/pulls/113901
2023-10-20 10:27:31 +02:00