test2/source/blender/io/common/IO_string_utils.hh

/* SPDX-FileCopyrightText: 2024 Blender Authors
*
* SPDX-License-Identifier: GPL-2.0-or-later */
#pragma once
#include "BLI_string_ref.hh"
/*
* Various text parsing utilities used by importers.
*
* Many of these functions take two pointers (p, end) indicating
* which part of a string to operate on, and return a possibly
* changed new start of the string. They could be taking a StringRef
* as input and returning a new StringRef, but this is a hot path
* in CSV and OBJ parsing, and the StringRef approach does lose performance
* (mostly due to return of StringRef being two register-size values
* instead of just one pointer).
*/
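/*
 * Illustration only, not part of this header: a minimal sketch of the two
 * calling styles contrasted in the comment above. `std::string_view` stands
 * in for `StringRef`, and the names `skip_spaces_ptr` and `skip_spaces_view`
 * are hypothetical. Treating only ' ' and '\t' as white-space is an
 * assumption made for the example.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {

/* Pointer-pair style: only the start moves, so a single pointer is returned. */
inline const char *skip_spaces_ptr(const char *p, const char *end)
{
  assert(p <= end);
  while (p < end && (*p == ' ' || *p == '\t')) {
    ++p;
  }
  return p;
}

/* View style: the result carries both a pointer and a length. */
inline std::string_view skip_spaces_view(std::string_view s)
{
  while (!s.empty() && (s.front() == ' ' || s.front() == '\t')) {
    s.remove_prefix(1);
  }
  return s;
}

}  // namespace sketch
```

/*
 * Both functions find the same position; they differ only in how much data
 * the caller gets back on each call.
 */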
namespace blender::io {
/**
* Fetches next line from an input string buffer.
*
* The returned line will not have '\n' characters at the end;
* the `buffer` is modified to contain remaining text without
* the input line.
*/
StringRef read_next_line(StringRef &buffer);
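/*
 * Illustration only: a minimal sketch of the contract described above,
 * written against `std::string_view` instead of `StringRef`. The name
 * `next_line_sketch` is hypothetical. The sketch assumes '\n' is the only
 * line terminator and does not treat a backslash before the newline
 * specially, so it may differ from the real function.
 */

```cpp
#include <cstddef>
#include <string_view>

/* Returns the first line without its '\n' and advances `buffer` past it. */
inline std::string_view next_line_sketch(std::string_view &buffer)
{
  const std::size_t pos = buffer.find('\n');
  if (pos == std::string_view::npos) {
    /* No newline left: the whole remainder is the last line. */
    const std::string_view line = buffer;
    buffer.remove_prefix(buffer.size());
    return line;
  }
  const std::string_view line = buffer.substr(0, pos);
  buffer.remove_prefix(pos + 1);
  return line;
}
```

/*
 * A caller would typically loop while the buffer is non-empty, handing each
 * returned line to the per-line parser.
 */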
/**
* Fix up OBJ line continuations by replacing backslash (\) and the
* following newline with spaces.
*/
void fixup_line_continuations(char *p, char *end);
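/*
 * Illustration only: a minimal sketch of the in-place fix-up described
 * above. The name `join_continued_lines_sketch` is hypothetical. The sketch
 * assumes the newline follows the backslash immediately; how the real
 * function treats white-space or '\r' between the two is not shown here.
 */

```cpp
#include <cassert>

/* Overwrites each "backslash + newline" pair in [p, end) with two spaces. */
inline void join_continued_lines_sketch(char *p, char *end)
{
  assert(p <= end);
  for (; p < end; ++p) {
    if (*p == '\\' && p + 1 < end && p[1] == '\n') {
      p[0] = ' ';
      p[1] = ' ';
    }
  }
}
```

/*
 * Because both characters become spaces, the buffer length is unchanged and
 * a later line splitter sees the continued statement as one line.
 */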
/**
* Drop leading white-space from a string part.
*/
const char *drop_whitespace(const char *p, const char *end);
/**
* Drop leading non-white-space from a string part.
*/
const char *drop_non_whitespace(const char *p, const char *end);
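/*
 * Illustration only: a sketch of how a "drop white-space" and a "drop
 * non-white-space" step can be alternated to walk the tokens of one line.
 * All names in `token_sketch` are hypothetical stand-ins for the functions
 * declared above, and treating only ' ' and '\t' as white-space is an
 * assumption.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

namespace token_sketch {

inline bool is_space(char c)
{
  return c == ' ' || c == '\t';
}

inline const char *skip_space(const char *p, const char *end)
{
  while (p < end && is_space(*p)) {
    ++p;
  }
  return p;
}

inline const char *skip_non_space(const char *p, const char *end)
{
  while (p < end && !is_space(*p)) {
    ++p;
  }
  return p;
}

/* Collects the white-space separated tokens found in [p, end). */
inline std::vector<std::string_view> split_tokens(const char *p, const char *end)
{
  assert(p <= end);
  std::vector<std::string_view> tokens;
  while (true) {
    p = skip_space(p, end);
    if (p == end) {
      break;
    }
    const char *token_end = skip_non_space(p, end);
    tokens.emplace_back(p, static_cast<std::size_t>(token_end - p));
    p = token_end;
  }
  return tokens;
}

}  // namespace token_sketch
```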
/**
* Parse an integer from an input string.
* The parsed result is stored in `dst`. The function skips
* leading white-space unless `skip_space=false`. If the
* number can't be parsed (invalid syntax, out of range),
* `success` is set to false.
*
* Returns the start of the remainder of the input string after parsing.
*/
const char *try_parse_int(
const char *p, const char *end, int fallback, bool &success, int &dst, bool skip_space = true);
/**
* Parse a float from an input string.
* The parsed result is stored in `dst`. The function skips
* leading white-space unless `skip_space=false`. If the
* number can't be parsed (invalid syntax, out of range),
* `success` is set to false.
*
* Returns the start of the remainder of the input string after parsing.
*/
const char *try_parse_float(const char *p,
const char *end,
float fallback,
bool &success,
float &dst,
bool skip_space = true);
/**
* Parse an integer from an input string.
* The parsed result is stored in `dst`. The function skips
* leading white-space unless `skip_space=false`. If the
* number can't be parsed (invalid syntax, out of range),
* `fallback` value is stored instead.
*
* Returns the start of the remainder of the input string after parsing.
*/
const char *parse_int(
const char *p, const char *end, int fallback, int &dst, bool skip_space = true);
/**
* Parse a float from an input string.
* The parsed result is stored in `dst`. The function skips
* leading white-space unless `skip_space=false`. If the
* number can't be parsed (invalid syntax, out of range),
* `fallback` value is stored instead. If `require_trailing_space`
* is true, the character after the number has to be whitespace.
*
* Returns the start of the remainder of the input string after parsing.
*/
const char *parse_float(const char *p,
const char *end,
float fallback,
float &dst,
bool skip_space = true,
bool require_trailing_space = false);
/**
* Parse a number of white-space separated floats from an input string.
* The parsed `count` numbers are stored in `dst`. If a
* number can't be parsed (invalid syntax, out of range),
* `fallback` value is stored instead.
*
* Returns the start of the remainder of the input string after parsing.
*/
const char *parse_floats(const char *p,
const char *end,
float fallback,
float *dst,
int count,
bool require_trailing_space = false);
} // namespace blender::io