Files
test/intern/cycles/blender/mesh.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

1298 lines
46 KiB
C++
Raw Normal View History

/* SPDX-FileCopyrightText: 2011-2022 Blender Foundation
*
* SPDX-License-Identifier: Apache-2.0 */
#include <optional>
#include "blender/attribute_convert.h"
#include "blender/session.h"
#include "blender/sync.h"
#include "blender/util.h"
#include "scene/bake.h"
#include "scene/camera.h"
#include "scene/mesh.h"
#include "scene/object.h"
#include "scene/scene.h"
#include "subd/split.h"
#include "util/algorithm.h"
#include "util/disjoint_set.h"
#include "util/hash.h"
#include "util/log.h"
#include "util/math.h"
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
#include "mikktspace.hh"
#include "BKE_attribute.hh"
#include "BKE_attribute_math.hh"
#include "BKE_customdata.hh"
#include "BKE_mesh.hh"
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
CCL_NAMESPACE_BEGIN
/* Tangent Space */
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
template<bool is_subd> struct MikkMeshWrapper {
MikkMeshWrapper(const ::Mesh &b_mesh,
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const char *layer_name,
const Mesh *mesh,
float3 *tangent,
float *tangent_sign)
: mesh(mesh), tangent(tangent), tangent_sign(tangent_sign)
{
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const AttributeSet &attributes = is_subd ? mesh->subd_attributes : mesh->attributes;
Attribute *attr_vN = attributes.find(ATTR_STD_VERTEX_NORMAL);
vertex_normal = attr_vN->data_float3();
if (layer_name == nullptr) {
Attribute *attr_orco = attributes.find(ATTR_STD_GENERATED);
if (attr_orco) {
orco = attr_orco->data_float3();
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
float3 orco_size;
mesh_texture_space(b_mesh, orco_loc, orco_size);
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
inv_orco_size = 1.0f / orco_size;
}
}
else {
Attribute *attr_uv = attributes.find(ustring(layer_name));
if (attr_uv != nullptr) {
uv = attr_uv->data_float2();
}
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
int GetNumFaces()
{
if constexpr (is_subd) {
return mesh->get_num_subd_faces();
}
else {
return mesh->num_triangles();
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
int GetNumVerticesOfFace(const int face_num)
{
if constexpr (is_subd) {
return mesh->get_subd_num_corners()[face_num];
}
else {
return 3;
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
int CornerIndex(const int face_num, const int vert_num)
{
if constexpr (is_subd) {
const Mesh::SubdFace &face = mesh->get_subd_face(face_num);
return face.start_corner + vert_num;
}
else {
return face_num * 3 + vert_num;
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
int VertexIndex(const int face_num, const int vert_num)
{
int corner = CornerIndex(face_num, vert_num);
if constexpr (is_subd) {
return mesh->get_subd_face_corners()[corner];
}
else {
return mesh->get_triangles()[corner];
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
mikk::float3 GetPosition(const int face_num, const int vert_num)
{
const float3 vP = mesh->get_verts()[VertexIndex(face_num, vert_num)];
return mikk::float3(vP.x, vP.y, vP.z);
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
mikk::float3 GetTexCoord(const int face_num, const int vert_num)
{
/* TODO: Check whether introducing a template boolean in order to
* turn this into a constexpr is worth it. */
if (uv != nullptr) {
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const int corner_index = CornerIndex(face_num, vert_num);
float2 tfuv = uv[corner_index];
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
return mikk::float3(tfuv.x, tfuv.y, 1.0f);
}
if (orco != nullptr) {
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const int vertex_index = VertexIndex(face_num, vert_num);
const float2 uv = map_to_sphere((orco[vertex_index] + orco_loc) * inv_orco_size);
return mikk::float3(uv.x, uv.y, 1.0f);
}
return mikk::float3(0.0f, 0.0f, 1.0f);
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
mikk::float3 GetNormal(const int face_num, const int vert_num)
{
float3 vN;
if (is_subd) {
const Mesh::SubdFace &face = mesh->get_subd_face(face_num);
if (face.smooth) {
const int vertex_index = VertexIndex(face_num, vert_num);
vN = vertex_normal[vertex_index];
}
else {
vN = face.normal(mesh);
}
}
else {
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
if (mesh->get_smooth()[face_num]) {
const int vertex_index = VertexIndex(face_num, vert_num);
vN = vertex_normal[vertex_index];
}
else {
const Mesh::Triangle tri = mesh->get_triangle(face_num);
vN = tri.compute_normal(mesh->get_verts().data());
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
return mikk::float3(vN.x, vN.y, vN.z);
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
void SetTangentSpace(const int face_num, const int vert_num, mikk::float3 T, bool orientation)
{
const int corner_index = CornerIndex(face_num, vert_num);
tangent[corner_index] = make_float3(T.x, T.y, T.z);
if (tangent_sign != nullptr) {
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
tangent_sign[corner_index] = orientation ? 1.0f : -1.0f;
}
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const Mesh *mesh;
int num_faces;
float3 *vertex_normal;
float2 *texface;
float2 *uv = nullptr;
float3 *orco = nullptr;
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
float3 orco_loc, inv_orco_size;
float3 *tangent;
float *tangent_sign;
};
static void mikk_compute_tangents(
const ::Mesh &b_mesh, const char *layer_name, Mesh *mesh, bool need_sign, bool active_render)
{
/* Create tangent attributes. */
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
const bool is_subd = mesh->get_num_subd_faces();
AttributeSet &attributes = is_subd ? mesh->subd_attributes : mesh->attributes;
Attribute *attr;
ustring name;
if (layer_name != nullptr) {
name = ustring((string(layer_name) + ".tangent").c_str());
}
else {
name = ustring("orco.tangent");
}
if (active_render) {
attr = attributes.add(ATTR_STD_UV_TANGENT, name);
}
else {
attr = attributes.add(name, TypeVector, ATTR_ELEMENT_CORNER);
}
float3 *tangent = attr->data_float3();
/* Create bitangent sign attribute. */
float *tangent_sign = nullptr;
if (need_sign) {
Attribute *attr_sign;
ustring name_sign;
if (layer_name != nullptr) {
name_sign = ustring((string(layer_name) + ".tangent_sign").c_str());
}
else {
name_sign = ustring("orco.tangent_sign");
}
if (active_render) {
attr_sign = attributes.add(ATTR_STD_UV_TANGENT_SIGN, name_sign);
}
else {
attr_sign = attributes.add(name_sign, TypeFloat, ATTR_ELEMENT_CORNER);
}
tangent_sign = attr_sign->data_float();
}
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
/* Setup userdata. */
Mikktspace: Optimized port to C++ This commit is a big overhaul to the Mikktspace module, which is used to compute tangents. I'm not calling it a rewrite since it's the result of a lot of iterations on the original code, but pretty much everything is reworked somehow. Overall goal was to a) make it faster and b) make it maintainable. Notable changes: - Since the callbacks for requesting geometry data were a big bottleneck before, I've ported it to C++ and made it header-only, templating on the data source. That way, the compiler generates code specific to the caller, which allows it to inline the data source and specialize for some cases (e.g. subd vs. non-subd in Cycles). - The one input parameter, an optional angle threshold, was not used anywhere. Turns out that removing it allows for considerable algorithmic simplification, removing a lot of the complexity in the later stages. Therefore, I've just removed the option in the new code. - The code computes several outputs, but only one (the tangent itself) is ever used in Blender. Therefore, I've removed the others to simplify the code. They could easily be brought back if needed, none of the algorithmic simplifications are conflicting with them. - The original code had fallback paths for many steps in case temporary memory allocation fails, but that never actually gets used anyways since malloc() doesn't really ever return NULL in practise, so I removed them. - In general, I've restructured A LOT of the code to make the algorithms clearer and make use of some C++ features (vectors, std::array, booleans, classes), though there's still some of cleanup that could be done. - Parallelized duplicate detection, neighbor detection, triangle tangent computation, degenerate triangle handling and tangent space accumulation. - Replaced several algorithms with faster equivalents: Duplicate detection uses a (concurrent) hash set now, neighbor detection uses Radixsort and splits vertices by index pairs etc. As for results, the exact speedup depends on the scene of course, but let's consider the file from T97378: - Blender 3.1 (before D14675): 6.07sec - Blender 3.2 (with D14675): 4.62sec - rBf0a36599007d (last nightly build): 4.42sec - With this commit: 0.90sec This speedup will mostly be noticed at the start of Cycles renders and, even more importantly, in Eevee when doing something that changes the geometry (e.g. animating) on a model using normal maps. Differential Revision: https://developer.blender.org/D15589
2022-09-03 17:21:44 +02:00
if (is_subd) {
MikkMeshWrapper<true> userdata(b_mesh, layer_name, mesh, tangent, tangent_sign);
/* Compute tangents. */
mikk::Mikktspace(userdata).genTangSpace();
}
else {
MikkMeshWrapper<false> userdata(b_mesh, layer_name, mesh, tangent, tangent_sign);
/* Compute tangents. */
mikk::Mikktspace(userdata).genTangSpace();
}
}
static void attr_create_motion_from_velocity(Mesh *mesh,
const blender::Span<blender::float3> b_attr,
const float motion_scale)
{
const int numverts = mesh->get_verts().size();
/* Override motion steps to fixed number. */
mesh->set_motion_steps(3);
/* Find or add attribute */
float3 *P = mesh->get_verts().data();
Attribute *attr_mP = mesh->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
if (!attr_mP) {
attr_mP = mesh->attributes.add(ATTR_STD_MOTION_VERTEX_POSITION);
}
/* Only export previous and next frame, we don't have any in between data. */
float motion_times[2] = {-1.0f, 1.0f};
for (int step = 0; step < 2; step++) {
const float relative_time = motion_times[step] * 0.5f * motion_scale;
float3 *mP = attr_mP->data_float3() + step * numverts;
for (int i = 0; i < numverts; i++) {
mP[i] = P[i] + make_float3(b_attr[i][0], b_attr[i][1], b_attr[i][2]) * relative_time;
}
}
}
static void attr_create_generic(Scene *scene,
Mesh *mesh,
const ::Mesh &b_mesh,
const bool subdivision,
const bool need_motion,
const float motion_scale)
{
blender::Span<blender::int3> corner_tris;
blender::Span<int> tri_faces;
if (!subdivision) {
corner_tris = b_mesh.corner_tris();
tri_faces = b_mesh.corner_tri_faces();
}
const blender::bke::AttributeAccessor b_attributes = b_mesh.attributes();
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
static const ustring u_velocity("velocity");
const ustring default_color_name{BKE_id_attributes_default_color_name(&b_mesh.id)};
b_attributes.foreach_attribute([&](const blender::bke::AttributeIter &iter) {
const ustring name{std::string_view(iter.name)};
const bool is_render_color = name == default_color_name;
if (need_motion && name == u_velocity) {
const blender::VArraySpan b_attribute = *iter.get<blender::float3>(
blender::bke::AttrDomain::Point);
attr_create_motion_from_velocity(mesh, b_attribute, motion_scale);
}
if (!(mesh->need_attribute(scene, name) ||
(is_render_color && mesh->need_attribute(scene, ATTR_STD_VERTEX_COLOR))))
{
return;
}
if (attributes.find(name)) {
return;
}
blender::bke::AttrDomain b_domain = iter.domain;
if (b_domain == blender::bke::AttrDomain::Edge) {
/* Blender's attribute API handles edge to vertex attribute domain interpolation. */
b_domain = blender::bke::AttrDomain::Point;
}
const blender::bke::GAttributeReader b_attr = iter.get(b_domain);
if (b_attr.varray.is_empty()) {
return;
}
if (b_attr.domain == blender::bke::AttrDomain::Corner && iter.data_type == CD_PROP_BYTE_COLOR)
{
Attribute *attr = attributes.add(name, TypeRGBA, ATTR_ELEMENT_CORNER_BYTE);
if (is_render_color) {
attr->std = ATTR_STD_VERTEX_COLOR;
}
uchar4 *data = attr->data_uchar4();
const blender::VArraySpan src = b_attr.varray.typed<blender::ColorGeometry4b>();
if (subdivision) {
for (const int i : src.index_range()) {
data[i] = make_uchar4(src[i][0], src[i][1], src[i][2], src[i][3]);
}
}
else {
for (const int i : corner_tris.index_range()) {
const blender::int3 &tri = corner_tris[i];
data[i * 3 + 0] = make_uchar4(
src[tri[0]][0], src[tri[0]][1], src[tri[0]][2], src[tri[0]][3]);
data[i * 3 + 1] = make_uchar4(
src[tri[1]][0], src[tri[1]][1], src[tri[1]][2], src[tri[1]][3]);
data[i * 3 + 2] = make_uchar4(
src[tri[2]][0], src[tri[2]][1], src[tri[2]][2], src[tri[2]][3]);
}
}
return;
}
AttributeElement element = ATTR_ELEMENT_NONE;
switch (b_domain) {
case blender::bke::AttrDomain::Corner:
element = ATTR_ELEMENT_CORNER;
break;
case blender::bke::AttrDomain::Point:
element = ATTR_ELEMENT_VERTEX;
break;
case blender::bke::AttrDomain::Face:
element = ATTR_ELEMENT_FACE;
break;
default:
assert(false);
return;
}
blender::bke::attribute_math::convert_to_static_type(b_attr.varray.type(), [&](auto dummy) {
using BlenderT = decltype(dummy);
using Converter = typename ccl::AttributeConverter<BlenderT>;
using CyclesT = typename Converter::CyclesT;
if constexpr (!std::is_void_v<CyclesT>) {
Attribute *attr = attributes.add(name, Converter::type_desc, element);
if (is_render_color) {
attr->std = ATTR_STD_VERTEX_COLOR;
}
CyclesT *data = reinterpret_cast<CyclesT *>(attr->data());
const blender::VArraySpan src = b_attr.varray.typed<BlenderT>();
switch (b_attr.domain) {
case blender::bke::AttrDomain::Corner: {
if (subdivision) {
for (const int i : src.index_range()) {
data[i] = Converter::convert(src[i]);
}
}
else {
for (const int i : corner_tris.index_range()) {
const blender::int3 &tri = corner_tris[i];
data[i * 3 + 0] = Converter::convert(src[tri[0]]);
data[i * 3 + 1] = Converter::convert(src[tri[1]]);
data[i * 3 + 2] = Converter::convert(src[tri[2]]);
}
}
break;
}
case blender::bke::AttrDomain::Point: {
for (const int i : src.index_range()) {
data[i] = Converter::convert(src[i]);
}
break;
}
case blender::bke::AttrDomain::Face: {
if (subdivision) {
for (const int i : src.index_range()) {
data[i] = Converter::convert(src[i]);
}
}
else {
for (const int i : corner_tris.index_range()) {
data[i] = Converter::convert(src[tri_faces[i]]);
}
}
break;
}
default: {
assert(false);
break;
}
}
}
});
});
}
static set<ustring> get_blender_uv_names(const ::Mesh &b_mesh)
{
set<ustring> uv_names;
b_mesh.attributes().foreach_attribute([&](const blender::bke::AttributeIter &iter) {
if (iter.domain == blender::bke::AttrDomain::Corner && iter.data_type == CD_PROP_FLOAT2) {
if (!blender::bke::attribute_name_is_anonymous(iter.name)) {
uv_names.emplace(std::string_view(iter.name));
}
}
});
return uv_names;
}
/* Create uv map attributes. */
static void attr_create_uv_map(Scene *scene,
Mesh *mesh,
const ::Mesh &b_mesh,
const set<ustring> &blender_uv_names)
{
const blender::Span<blender::int3> corner_tris = b_mesh.corner_tris();
const blender::bke::AttributeAccessor b_attributes = b_mesh.attributes();
const ustring render_name(CustomData_get_render_layer_name(&b_mesh.corner_data, CD_PROP_FLOAT2));
if (!blender_uv_names.empty()) {
for (const ustring &uv_name : blender_uv_names) {
const bool active_render = uv_name == render_name;
AttributeStandard uv_std = (active_render) ? ATTR_STD_UV : ATTR_STD_NONE;
AttributeStandard tangent_std = (active_render) ? ATTR_STD_UV_TANGENT : ATTR_STD_NONE;
ustring tangent_name = ustring((string(uv_name) + ".tangent").c_str());
/* Denotes whether UV map was requested directly. */
const bool need_uv = mesh->need_attribute(scene, uv_name) ||
mesh->need_attribute(scene, uv_std);
/* Denotes whether tangent was requested directly. */
const bool need_tangent = mesh->need_attribute(scene, tangent_name) ||
(active_render && mesh->need_attribute(scene, tangent_std));
/* UV map */
/* NOTE: We create temporary UV layer if its needed for tangent but
* wasn't requested by other nodes in shaders.
*/
Attribute *uv_attr = nullptr;
if (need_uv || need_tangent) {
if (active_render) {
uv_attr = mesh->attributes.add(uv_std, uv_name);
}
else {
uv_attr = mesh->attributes.add(uv_name, TypeFloat2, ATTR_ELEMENT_CORNER);
}
const blender::VArraySpan b_uv_map = *b_attributes.lookup<blender::float2>(
uv_name.c_str(), blender::bke::AttrDomain::Corner);
float2 *fdata = uv_attr->data_float2();
for (const int i : corner_tris.index_range()) {
const blender::int3 &tri = corner_tris[i];
fdata[i * 3 + 0] = make_float2(b_uv_map[tri[0]][0], b_uv_map[tri[0]][1]);
fdata[i * 3 + 1] = make_float2(b_uv_map[tri[1]][0], b_uv_map[tri[1]][1]);
fdata[i * 3 + 2] = make_float2(b_uv_map[tri[2]][0], b_uv_map[tri[2]][1]);
}
}
/* UV tangent */
if (need_tangent) {
AttributeStandard sign_std = (active_render) ? ATTR_STD_UV_TANGENT_SIGN : ATTR_STD_NONE;
ustring sign_name = ustring((string(uv_name) + ".tangent_sign").c_str());
bool need_sign = (mesh->need_attribute(scene, sign_name) ||
mesh->need_attribute(scene, sign_std));
mikk_compute_tangents(b_mesh, uv_name.c_str(), mesh, need_sign, active_render);
}
/* Remove temporarily created UV attribute. */
if (!need_uv && uv_attr != nullptr) {
mesh->attributes.remove(uv_attr);
}
}
}
else if (mesh->need_attribute(scene, ATTR_STD_UV_TANGENT)) {
bool need_sign = mesh->need_attribute(scene, ATTR_STD_UV_TANGENT_SIGN);
mikk_compute_tangents(b_mesh, nullptr, mesh, need_sign, true);
if (!mesh->need_attribute(scene, ATTR_STD_GENERATED)) {
mesh->attributes.remove(ATTR_STD_GENERATED);
}
}
}
static void attr_create_subd_uv_map(Scene *scene,
Mesh *mesh,
const ::Mesh &b_mesh,
bool subdivide_uvs,
const set<ustring> &blender_uv_names)
{
const blender::OffsetIndices faces = b_mesh.faces();
if (faces.is_empty()) {
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
return;
}
if (!blender_uv_names.empty()) {
const blender::bke::AttributeAccessor b_attributes = b_mesh.attributes();
const ustring render_name(
CustomData_get_render_layer_name(&b_mesh.corner_data, CD_PROP_FLOAT2));
for (const ustring &uv_name : blender_uv_names) {
const bool active_render = uv_name == render_name;
AttributeStandard uv_std = (active_render) ? ATTR_STD_UV : ATTR_STD_NONE;
AttributeStandard tangent_std = (active_render) ? ATTR_STD_UV_TANGENT : ATTR_STD_NONE;
ustring tangent_name = ustring((string(uv_name) + ".tangent").c_str());
/* Denotes whether UV map was requested directly. */
const bool need_uv = mesh->need_attribute(scene, uv_name) ||
mesh->need_attribute(scene, uv_std);
/* Denotes whether tangent was requested directly. */
const bool need_tangent = mesh->need_attribute(scene, tangent_name) ||
(active_render && mesh->need_attribute(scene, tangent_std));
Attribute *uv_attr = nullptr;
/* UV map */
if (need_uv || need_tangent) {
if (active_render) {
uv_attr = mesh->subd_attributes.add(uv_std, uv_name);
}
else {
uv_attr = mesh->subd_attributes.add(uv_name, TypeFloat2, ATTR_ELEMENT_CORNER);
}
if (subdivide_uvs) {
uv_attr->flags |= ATTR_SUBDIVIDED;
}
const blender::VArraySpan b_uv_map = *b_attributes.lookup<blender::float2>(
uv_name.c_str(), blender::bke::AttrDomain::Corner);
float2 *fdata = uv_attr->data_float2();
for (const int i : faces.index_range()) {
const blender::IndexRange face = faces[i];
for (const int corner : face) {
*(fdata++) = make_float2(b_uv_map[corner][0], b_uv_map[corner][1]);
}
}
}
/* UV tangent */
if (need_tangent) {
AttributeStandard sign_std = (active_render) ? ATTR_STD_UV_TANGENT_SIGN : ATTR_STD_NONE;
ustring sign_name = ustring((string(uv_name) + ".tangent_sign").c_str());
bool need_sign = (mesh->need_attribute(scene, sign_name) ||
mesh->need_attribute(scene, sign_std));
mikk_compute_tangents(b_mesh, uv_name.c_str(), mesh, need_sign, active_render);
}
/* Remove temporarily created UV attribute. */
if (!need_uv && uv_attr != nullptr) {
mesh->subd_attributes.remove(uv_attr);
}
}
}
else if (mesh->need_attribute(scene, ATTR_STD_UV_TANGENT)) {
bool need_sign = mesh->need_attribute(scene, ATTR_STD_UV_TANGENT_SIGN);
mikk_compute_tangents(b_mesh, nullptr, mesh, need_sign, true);
if (!mesh->need_attribute(scene, ATTR_STD_GENERATED)) {
mesh->subd_attributes.remove(ATTR_STD_GENERATED);
}
}
}
/* Create vertex pointiness attributes. */
/* Compare vertices by sum of their coordinates. */
class VertexAverageComparator {
public:
VertexAverageComparator(const array<float3> &verts) : verts_(verts) {}
bool operator()(const int &vert_idx_a, const int &vert_idx_b)
{
const float3 &vert_a = verts_[vert_idx_a];
const float3 &vert_b = verts_[vert_idx_b];
if (vert_a == vert_b) {
/* Special case for doubles, so we ensure ordering. */
return vert_idx_a > vert_idx_b;
}
const float x1 = vert_a.x + vert_a.y + vert_a.z;
const float x2 = vert_b.x + vert_b.y + vert_b.z;
return x1 < x2;
}
protected:
const array<float3> &verts_;
};
static void attr_create_pointiness(Mesh *mesh,
const blender::Span<blender::float3> positions,
const blender::Span<blender::float3> b_vert_normals,
const blender::Span<blender::int2> edges,
bool subdivision)
{
const int num_verts = positions.size();
if (positions.is_empty()) {
return;
}
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
/* STEP 1: Find out duplicated vertices and point duplicates to a single
* original vertex.
*/
vector<int> sorted_vert_indeices(num_verts);
2017-02-15 20:33:49 +01:00
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
sorted_vert_indeices[vert_index] = vert_index;
}
VertexAverageComparator compare(mesh->get_verts());
sort(sorted_vert_indeices.begin(), sorted_vert_indeices.end(), compare);
/* This array stores index of the original vertex for the given vertex
* index.
*/
vector<int> vert_orig_index(num_verts);
2017-02-15 20:33:49 +01:00
for (int sorted_vert_index = 0; sorted_vert_index < num_verts; ++sorted_vert_index) {
const int vert_index = sorted_vert_indeices[sorted_vert_index];
const float3 &vert_co = mesh->get_verts()[vert_index];
bool found = false;
for (int other_sorted_vert_index = sorted_vert_index + 1; other_sorted_vert_index < num_verts;
++other_sorted_vert_index)
{
const int other_vert_index = sorted_vert_indeices[other_sorted_vert_index];
const float3 &other_vert_co = mesh->get_verts()[other_vert_index];
/* We are too far away now, we wouldn't have duplicate. */
if ((other_vert_co.x + other_vert_co.y + other_vert_co.z) -
(vert_co.x + vert_co.y + vert_co.z) >
3 * FLT_EPSILON)
{
break;
}
/* Found duplicate. */
if (len_squared(other_vert_co - vert_co) < FLT_EPSILON) {
found = true;
vert_orig_index[vert_index] = other_vert_index;
break;
}
}
if (!found) {
vert_orig_index[vert_index] = vert_index;
}
}
/* Make sure we always points to the very first orig vertex. */
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
int orig_index = vert_orig_index[vert_index];
while (orig_index != vert_orig_index[orig_index]) {
orig_index = vert_orig_index[orig_index];
}
vert_orig_index[vert_index] = orig_index;
}
sorted_vert_indeices.free_memory();
/* STEP 2: Calculate vertex normals taking into account their possible
* duplicates which gets "welded" together.
*/
vector<float3> vert_normal(num_verts, zero_float3());
/* First we accumulate all vertex normals in the original index. */
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
const float *b_vert_normal = b_vert_normals[vert_index];
const int orig_index = vert_orig_index[vert_index];
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
vert_normal[orig_index] += make_float3(b_vert_normal[0], b_vert_normal[1], b_vert_normal[2]);
}
/* Then we normalize the accumulated result and flush it to all duplicates
* as well.
*/
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
vert_normal[vert_index] = normalize(vert_normal[orig_index]);
}
/* STEP 3: Calculate pointiness using single ring neighborhood. */
vector<int> counter(num_verts, 0);
vector<float> raw_data(num_verts, 0.0f);
vector<float3> edge_accum(num_verts, zero_float3());
EdgeMap visited_edges;
memset(counter.data(), 0, sizeof(int) * counter.size());
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
for (const int i : edges.index_range()) {
const blender::int2 b_edge = edges[i];
Mesh: Move edges to a generic attribute Implements #95966, as the final step of #95965. This commit changes the storage of mesh edge vertex indices from the `MEdge` type to the generic `int2` attribute type. This follows the general design for geometry and the attribute system, where the data storage type and the usage semantics are separated. The main benefit of the change is reduced memory usage-- the requirements of storing mesh edges is reduced by 1/3. For example, this saves 8MB on a 1 million vertex grid. This also gives performance benefits to any memory-bound mesh processing algorithm that uses edges. Another benefit is that all of the edge's vertex indices are contiguous. In a few cases, it's helpful to process all of them as `Span<int>` rather than `Span<int2>`. Similarly, the type is more likely to match a generic format used by a library, or code that shouldn't know about specific Blender `Mesh` types. Various Notes: - The `.edge_verts` name is used to reflect a mapping between domains, similar to `.corner_verts`, etc. The period means that it the data shouldn't change arbitrarily by the user or procedural operations. - `edge[0]` is now used instead of `edge.v1` - Signed integers are used instead of unsigned to reduce the mixing of signed-ness, which can be error prone. - All of the previously used core mesh data types (`MVert`, `MEdge`, `MLoop`, `MPoly` are now deprecated. Only generic types are used). - The `vec2i` DNA type is used in the few C files where necessary. Pull Request: https://projects.blender.org/blender/blender/pulls/106638
2023-04-17 13:47:41 +02:00
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
visited_edges.insert(v0, v1);
Mesh: Move positions to a generic attribute **Changes** As described in T93602, this patch removes all use of the `MVert` struct, replacing it with a generic named attribute with the name `"position"`, consistent with other geometry types. Variable names have been changed from `verts` to `positions`, to align with the attribute name and the more generic design (positions are not vertices, they are just an attribute stored on the point domain). This change is made possible by previous commits that moved all other data out of `MVert` to runtime data or other generic attributes. What remains is mostly a simple type change. Though, the type still shows up 859 times, so the patch is quite large. One compromise is that now `CD_MASK_BAREMESH` now contains `CD_PROP_FLOAT3`. With the general move towards generic attributes over custom data types, we are removing use of these type masks anyway. **Benefits** The most obvious benefit is reduced memory usage and the benefits that brings in memory-bound situations. `float3` is only 3 bytes, in comparison to `MVert` which was 4. When there are millions of vertices this starts to matter more. The other benefits come from using a more generic type. Instead of writing algorithms specifically for `MVert`, code can just use arrays of vectors. This will allow eliminating many temporary arrays or wrappers used to extract positions. Many possible improvements aren't implemented in this patch, though I did switch simplify or remove the process of creating temporary position arrays in a few places. The design clarity that "positions are just another attribute" brings allows removing explicit copying of vertices in some procedural operations-- they are just processed like most other attributes. **Performance** This touches so many areas that it's hard to benchmark exhaustively, but I observed some areas as examples. * The mesh line node with 4 million count was 1.5x (8ms to 12ms) faster. * The Spring splash screen went from ~4.3 to ~4.5 fps. * The subdivision surface modifier/node was slightly faster RNA access through Python may be slightly slower, since now we need a name lookup instead of just a custom data type lookup for each index. **Future Improvements** * Remove uses of "vert_coords" functions: * `BKE_mesh_vert_coords_alloc` * `BKE_mesh_vert_coords_get` * `BKE_mesh_vert_coords_apply{_with_mat4}` * Remove more hidden copying of positions * General simplification now possible in many areas * Convert more code to C++ to use `float3` instead of `float[3]` * Currently `reinterpret_cast` is used for those C-API functions Differential Revision: https://developer.blender.org/D15982
2023-01-10 00:10:43 -05:00
float3 co0 = make_float3(positions[v0][0], positions[v0][1], positions[v0][2]);
float3 co1 = make_float3(positions[v1][0], positions[v1][1], positions[v1][2]);
float3 edge = normalize(co1 - co0);
edge_accum[v0] += edge;
edge_accum[v1] += -edge;
++counter[v0];
++counter[v1];
}
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
if (orig_index != vert_index) {
/* Skip duplicates, they'll be overwritten later on. */
continue;
}
if (counter[vert_index] > 0) {
const float3 normal = vert_normal[vert_index];
const float angle = safe_acosf(dot(normal, edge_accum[vert_index] / counter[vert_index]));
raw_data[vert_index] = angle * M_1_PI_F;
}
else {
raw_data[vert_index] = 0.0f;
}
}
/* STEP 3: Blur vertices to approximate 2 ring neighborhood. */
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
Attribute *attr = attributes.add(ATTR_STD_POINTINESS);
float *data = attr->data_float();
memcpy(data, raw_data.data(), sizeof(float) * raw_data.size());
memset(counter.data(), 0, sizeof(int) * counter.size());
visited_edges.clear();
for (const int i : edges.index_range()) {
const blender::int2 b_edge = edges[i];
Mesh: Move edges to a generic attribute Implements #95966, as the final step of #95965. This commit changes the storage of mesh edge vertex indices from the `MEdge` type to the generic `int2` attribute type. This follows the general design for geometry and the attribute system, where the data storage type and the usage semantics are separated. The main benefit of the change is reduced memory usage-- the requirements of storing mesh edges is reduced by 1/3. For example, this saves 8MB on a 1 million vertex grid. This also gives performance benefits to any memory-bound mesh processing algorithm that uses edges. Another benefit is that all of the edge's vertex indices are contiguous. In a few cases, it's helpful to process all of them as `Span<int>` rather than `Span<int2>`. Similarly, the type is more likely to match a generic format used by a library, or code that shouldn't know about specific Blender `Mesh` types. Various Notes: - The `.edge_verts` name is used to reflect a mapping between domains, similar to `.corner_verts`, etc. The period means that it the data shouldn't change arbitrarily by the user or procedural operations. - `edge[0]` is now used instead of `edge.v1` - Signed integers are used instead of unsigned to reduce the mixing of signed-ness, which can be error prone. - All of the previously used core mesh data types (`MVert`, `MEdge`, `MLoop`, `MPoly` are now deprecated. Only generic types are used). - The `vec2i` DNA type is used in the few C files where necessary. Pull Request: https://projects.blender.org/blender/blender/pulls/106638
2023-04-17 13:47:41 +02:00
const int v0 = vert_orig_index[b_edge[0]];
const int v1 = vert_orig_index[b_edge[1]];
if (visited_edges.exists(v0, v1)) {
continue;
}
visited_edges.insert(v0, v1);
data[v0] += raw_data[v1];
data[v1] += raw_data[v0];
++counter[v0];
++counter[v1];
}
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
data[vert_index] /= counter[vert_index] + 1;
}
/* STEP 4: Copy attribute to the duplicated vertices. */
for (int vert_index = 0; vert_index < num_verts; ++vert_index) {
const int orig_index = vert_orig_index[vert_index];
data[vert_index] = data[orig_index];
}
}
/* The Random Per Island attribute is a random float associated with each
* connected component (island) of the mesh. The attribute is computed by
* first classifying the vertices into different sets using a Disjoint Set
* data structure. Then the index of the root of each vertex (Which is the
* representative of the set the vertex belongs to) is hashed and stored.
*
* We are using a face attribute to avoid interpolation during rendering,
* allowing the user to safely hash the output further. Had we used vertex
* attribute, the interpolation will introduce very slight variations,
* making the output unsafe to hash. */
static void attr_create_random_per_island(Scene *scene,
Mesh *mesh,
const ::Mesh &b_mesh,
bool subdivision)
{
if (!mesh->need_attribute(scene, ATTR_STD_RANDOM_PER_ISLAND)) {
return;
}
if (b_mesh.verts_num == 0) {
return;
}
DisjointSet vertices_sets(b_mesh.verts_num);
const blender::Span<blender::int2> edges = b_mesh.edges();
const blender::Span<int> corner_verts = b_mesh.corner_verts();
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
for (const int i : edges.index_range()) {
Mesh: Move edges to a generic attribute Implements #95966, as the final step of #95965. This commit changes the storage of mesh edge vertex indices from the `MEdge` type to the generic `int2` attribute type. This follows the general design for geometry and the attribute system, where the data storage type and the usage semantics are separated. The main benefit of the change is reduced memory usage-- the requirements of storing mesh edges is reduced by 1/3. For example, this saves 8MB on a 1 million vertex grid. This also gives performance benefits to any memory-bound mesh processing algorithm that uses edges. Another benefit is that all of the edge's vertex indices are contiguous. In a few cases, it's helpful to process all of them as `Span<int>` rather than `Span<int2>`. Similarly, the type is more likely to match a generic format used by a library, or code that shouldn't know about specific Blender `Mesh` types. Various Notes: - The `.edge_verts` name is used to reflect a mapping between domains, similar to `.corner_verts`, etc. The period means that it the data shouldn't change arbitrarily by the user or procedural operations. - `edge[0]` is now used instead of `edge.v1` - Signed integers are used instead of unsigned to reduce the mixing of signed-ness, which can be error prone. - All of the previously used core mesh data types (`MVert`, `MEdge`, `MLoop`, `MPoly` are now deprecated. Only generic types are used). - The `vec2i` DNA type is used in the few C files where necessary. Pull Request: https://projects.blender.org/blender/blender/pulls/106638
2023-04-17 13:47:41 +02:00
vertices_sets.join(edges[i][0], edges[i][1]);
}
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
Attribute *attribute = attributes.add(ATTR_STD_RANDOM_PER_ISLAND);
float *data = attribute->data_float();
if (!subdivision) {
const blender::Span<blender::int3> corner_tris = b_mesh.corner_tris();
if (!corner_tris.is_empty()) {
for (const int i : corner_tris.index_range()) {
const int vert = corner_verts[corner_tris[i][0]];
data[i] = hash_uint_to_float(vertices_sets.find(vert));
}
}
}
else {
const blender::OffsetIndices<int> faces = b_mesh.faces();
if (!faces.is_empty()) {
for (const int i : faces.index_range()) {
const int vert = corner_verts[faces[i].start()];
Mesh: Replace MLoop struct with generic attributes Implements #102359. Split the `MLoop` struct into two separate integer arrays called `corner_verts` and `corner_edges`, referring to the vertex each corner is attached to and the next edge around the face at each corner. These arrays can be sliced to give access to the edges or vertices in a face. Then they are often referred to as "poly_verts" or "poly_edges". The main benefits are halving the necessary memory bandwidth when only one array is used and simplifications from using regular integer indices instead of a special-purpose struct. The commit also starts a renaming from "loop" to "corner" in mesh code. Like the other mesh struct of array refactors, forward compatibility is kept by writing files with the older format. This will be done until 4.0 to ease the transition process. Looking at a small portion of the patch should give a good impression for the rest of the changes. I tried to make the changes as small as possible so it's easy to tell the correctness from the diff. Though I found Blender developers have been very inventive over the last decade when finding different ways to loop over the corners in a face. For performance, nearly every piece of code that deals with `Mesh` is slightly impacted. Any algorithm that is memory bottle-necked should see an improvement. For example, here is a comparison of interpolating a vertex float attribute to face corners (Ryzen 3700x): **Before** (Average: 3.7 ms, Min: 3.4 ms) ``` threading::parallel_for(loops.index_range(), 4096, [&](IndexRange range) { for (const int64_t i : range) { dst[i] = src[loops[i].v]; } }); ``` **After** (Average: 2.9 ms, Min: 2.6 ms) ``` array_utils::gather(src, corner_verts, dst); ``` That's an improvement of 28% to the average timings, and it's also a simplification, since an index-based routine can be used instead. For more examples using the new arrays, see the design task. Pull Request: https://projects.blender.org/blender/blender/pulls/104424
2023-03-20 15:55:13 +01:00
data[i] = hash_uint_to_float(vertices_sets.find(vert));
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
}
}
}
/* Create Mesh */
static void create_mesh(Scene *scene,
Mesh *mesh,
const ::Mesh &b_mesh,
const array<Node *> &used_shaders,
const bool need_motion,
const float motion_scale,
const bool subdivision = false,
const bool subdivide_uvs = true)
{
const blender::Span<blender::float3> positions = b_mesh.vert_positions();
const blender::OffsetIndices faces = b_mesh.faces();
const blender::Span<int> corner_verts = b_mesh.corner_verts();
const blender::bke::AttributeAccessor b_attributes = b_mesh.attributes();
const blender::bke::MeshNormalDomain normals_domain = b_mesh.normals_domain(true);
int numfaces = (!subdivision) ? b_mesh.corner_tris().size() : faces.size();
bool use_corner_normals = normals_domain == blender::bke::MeshNormalDomain::Corner &&
(mesh->get_subdivision_type() != Mesh::SUBDIVISION_CATMULL_CLARK);
/* If no faces, create empty mesh. */
if (faces.is_empty()) {
return;
}
const blender::VArraySpan material_indices = *b_attributes.lookup<int>(
"material_index", blender::bke::AttrDomain::Face);
const blender::VArraySpan sharp_faces = *b_attributes.lookup<bool>(
"sharp_face", blender::bke::AttrDomain::Face);
blender::Span<blender::float3> corner_normals;
if (use_corner_normals) {
Mesh: Replace auto smooth with node group Design task: #93551 This PR replaces the auto smooth option with a geometry nodes modifier that sets the sharp edge attribute. This solves a fair number of long- standing problems related to auto smooth, simplifies the process of normal computation, and allows Blender to automatically choose between face, vertex, and face corner normals based on the sharp edge and face attributes. Versioning adds a geometry node group to objects with meshes that had auto-smooth enabled. The modifier can be applied, which also improves performance. Auto smooth is now unnecessary to get a combination of sharp and smooth edges. In general workflows are changed a bit. Separate procedural and destructive workflows are available. Custom normals can be used immediately without turning on the removed auto smooth option. **Procedural** The node group asset "Smooth by Angle" is the main way to set sharp normals based on the edge angle. It can be accessed directly in the add modifier menu. Of course the modifier can be reordered, muted, or applied like any other, or changed internally like any geometry nodes modifier. **Destructive** Often the sharp edges don't need to be dynamic. This can give better performance since edge angles don't need to be recalculated. In edit mode the two operators "Select Sharp Edges" and "Mark Sharp" can be used. In other modes, the "Shade Smooth by Angle" controls the edge sharpness directly. ### Breaking API Changes - `use_auto_smooth` is removed. Face corner normals are now used automatically if there are mixed smooth vs. not smooth tags. Meshes now always use custom normals if they exist. - In Cycles, the lack of the separate auto smooth state makes normals look triangulated when all faces are shaded smooth. - `auto_smooth_angle` is removed. Replaced by a modifier (or operator) controlling the sharp edge attribute. This means the mesh itself (without an object) doesn't know anything about automatically smoothing by angle anymore. - `create_normals_split`, `calc_normals_split`, and `free_normals_split` are removed, and are replaced by the simpler `Mesh.corner_normals` collection property. Since it gives access to the normals cache, it is automatically updated when relevant data changes. Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609 ### Tests - `geo_node_curves_test_deform_curves_on_surface` has slightly different results because face corner normals are used instead of interpolated vertex normals. - `bf_wavefront_obj_tests` has different export results for one file which mixed sharp and smooth faces without turning on auto smooth. - `cycles_mesh_cpu` has one object which is completely flat shaded. Previously every edge was split before rendering, now it looks triangulated. Pull Request: https://projects.blender.org/blender/blender/pulls/108014
2023-10-20 16:54:08 +02:00
corner_normals = b_mesh.corner_normals();
}
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
int numngons = 0;
int numtris = 0;
if (!subdivision) {
numtris = numfaces;
}
else {
const blender::OffsetIndices faces = b_mesh.faces();
for (const int i : faces.index_range()) {
numngons += (faces[i].size() == 4) ? 0 : 1;
}
}
/* allocate memory */
if (subdivision) {
mesh->resize_subd_faces(numfaces, numngons, corner_verts.size());
}
mesh->resize_mesh(positions.size(), numtris);
float3 *verts = mesh->get_verts().data();
for (const int i : positions.index_range()) {
verts[i] = make_float3(positions[i][0], positions[i][1], positions[i][2]);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
AttributeSet &attributes = (subdivision) ? mesh->subd_attributes : mesh->attributes;
Attribute *attr_N = attributes.add(ATTR_STD_VERTEX_NORMAL);
float3 *N = attr_N->data_float3();
if (subdivision || !(use_corner_normals && !corner_normals.is_empty())) {
const blender::Span<blender::float3> vert_normals = b_mesh.vert_normals();
for (const int i : vert_normals.index_range()) {
N[i] = make_float3(vert_normals[i][0], vert_normals[i][1], vert_normals[i][2]);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
}
const set<ustring> blender_uv_names = get_blender_uv_names(b_mesh);
/* create generated coordinates from undeformed coordinates */
const bool need_default_tangent = (subdivision == false) && (blender_uv_names.empty()) &&
(mesh->need_attribute(scene, ATTR_STD_UV_TANGENT));
if (mesh->need_attribute(scene, ATTR_STD_GENERATED) || need_default_tangent) {
const float(*orco)[3] = static_cast<const float(*)[3]>(
CustomData_get_layer(&b_mesh.vert_data, CD_ORCO));
Attribute *attr = attributes.add(ATTR_STD_GENERATED);
attr->flags |= ATTR_SUBDIVIDED;
float3 loc, size;
mesh_texture_space(b_mesh, loc, size);
float texspace_location[3], texspace_size[3];
BKE_mesh_texspace_get(const_cast<::Mesh *>(b_mesh.texcomesh ? b_mesh.texcomesh : &b_mesh),
texspace_location,
texspace_size);
float3 *generated = attr->data_float3();
for (const int i : positions.index_range()) {
blender::float3 value;
if (orco) {
madd_v3_v3v3v3(value, texspace_location, orco[i], texspace_size);
}
else {
value = positions[i];
}
generated[i] = make_float3(value[0], value[1], value[2]) * size - loc;
}
}
auto clamp_material_index = [&](const int material_index) -> int {
return clamp(material_index, 0, used_shaders.size() - 1);
Mesh: Move face shade smooth flag to a generic attribute Currently the shade smooth status for mesh faces is stored as part of `MPoly::flag`. As described in #95967, this moves that information to a separate boolean attribute. It also flips its status, so the attribute is now called `sharp_face`, which mirrors the existing `sharp_edge` attribute. The attribute doesn't need to be allocated when all faces are smooth. Forward compatibility is kept until 4.0 like the other mesh refactors. This will reduce memory bandwidth requirements for some operations, since the array of booleans uses 12 times less memory than `MPoly`. It also allows faces to be stored more efficiently in the future, since the flag is now unused. It's also possible to use generic functions to process the values. For example, finding whether there is a sharp face is just `sharp_faces.contains(true)`. The `shade_smooth` attribute is no longer accessible with geometry nodes. Since there were dedicated accessor nodes for that data, that shouldn't be a problem. That's difficult to version automatically since the named attribute nodes could be used in arbitrary combinations. **Implementation notes:** - The attribute and array variables in the code use the `sharp_faces` term, to be consistent with the user-facing "sharp faces" wording, and to avoid requiring many renames when #101689 is implemented. - Cycles now accesses smooth face status with the generic attribute, to avoid overhead. - Changing the zero-value from "smooth" to "flat" takes some care to make sure defaults are the same. - Versioning for the edge mode extrude node is particularly complex. New nodes are added by versioning to propagate the attribute in its old inverted state. - A lot of access is still done through the `CustomData` API rather than the attribute API because of a few functions. That can be cleaned up easily in the future. - In the future we would benefit from a way to store attributes as a single value for when all faces are sharp. Pull Request: https://projects.blender.org/blender/blender/pulls/104422
2023-03-08 15:36:18 +01:00
};
/* create faces */
if (!subdivision) {
int *triangles = mesh->get_triangles().data();
bool *smooth = mesh->get_smooth().data();
int *shader = mesh->get_shader().data();
const blender::Span<blender::int3> corner_tris = b_mesh.corner_tris();
for (const int i : corner_tris.index_range()) {
const blender::int3 &tri = corner_tris[i];
triangles[i * 3 + 0] = corner_verts[tri[0]];
triangles[i * 3 + 1] = corner_verts[tri[1]];
triangles[i * 3 + 2] = corner_verts[tri[2]];
}
if (!material_indices.is_empty()) {
const blender::Span<int> tri_faces = b_mesh.corner_tri_faces();
for (const int i : corner_tris.index_range()) {
shader[i] = clamp_material_index(material_indices[tri_faces[i]]);
}
}
else {
std::fill(shader, shader + numtris, 0);
}
if (!sharp_faces.is_empty() && !(use_corner_normals && !corner_normals.is_empty())) {
const blender::Span<int> tri_faces = b_mesh.corner_tri_faces();
for (const int i : corner_tris.index_range()) {
smooth[i] = !sharp_faces[tri_faces[i]];
}
}
else {
Mesh: Replace auto smooth with node group Design task: #93551 This PR replaces the auto smooth option with a geometry nodes modifier that sets the sharp edge attribute. This solves a fair number of long- standing problems related to auto smooth, simplifies the process of normal computation, and allows Blender to automatically choose between face, vertex, and face corner normals based on the sharp edge and face attributes. Versioning adds a geometry node group to objects with meshes that had auto-smooth enabled. The modifier can be applied, which also improves performance. Auto smooth is now unnecessary to get a combination of sharp and smooth edges. In general workflows are changed a bit. Separate procedural and destructive workflows are available. Custom normals can be used immediately without turning on the removed auto smooth option. **Procedural** The node group asset "Smooth by Angle" is the main way to set sharp normals based on the edge angle. It can be accessed directly in the add modifier menu. Of course the modifier can be reordered, muted, or applied like any other, or changed internally like any geometry nodes modifier. **Destructive** Often the sharp edges don't need to be dynamic. This can give better performance since edge angles don't need to be recalculated. In edit mode the two operators "Select Sharp Edges" and "Mark Sharp" can be used. In other modes, the "Shade Smooth by Angle" controls the edge sharpness directly. ### Breaking API Changes - `use_auto_smooth` is removed. Face corner normals are now used automatically if there are mixed smooth vs. not smooth tags. Meshes now always use custom normals if they exist. - In Cycles, the lack of the separate auto smooth state makes normals look triangulated when all faces are shaded smooth. - `auto_smooth_angle` is removed. Replaced by a modifier (or operator) controlling the sharp edge attribute. This means the mesh itself (without an object) doesn't know anything about automatically smoothing by angle anymore. - `create_normals_split`, `calc_normals_split`, and `free_normals_split` are removed, and are replaced by the simpler `Mesh.corner_normals` collection property. Since it gives access to the normals cache, it is automatically updated when relevant data changes. Addons are updated here: https://projects.blender.org/blender/blender-addons/pulls/104609 ### Tests - `geo_node_curves_test_deform_curves_on_surface` has slightly different results because face corner normals are used instead of interpolated vertex normals. - `bf_wavefront_obj_tests` has different export results for one file which mixed sharp and smooth faces without turning on auto smooth. - `cycles_mesh_cpu` has one object which is completely flat shaded. Previously every edge was split before rendering, now it looks triangulated. Pull Request: https://projects.blender.org/blender/blender/pulls/108014
2023-10-20 16:54:08 +02:00
/* If only face normals are needed, all faces are sharp. */
std::fill(smooth, smooth + numtris, normals_domain != blender::bke::MeshNormalDomain::Face);
}
if (use_corner_normals && !corner_normals.is_empty()) {
for (const int i : corner_tris.index_range()) {
const blender::int3 &tri = corner_tris[i];
for (int i = 0; i < 3; i++) {
const int corner = tri[i];
const int vert = corner_verts[corner];
const float *normal = corner_normals[corner];
N[vert] = make_float3(normal[0], normal[1], normal[2]);
}
}
}
mesh->tag_triangles_modified();
mesh->tag_shader_modified();
mesh->tag_smooth_modified();
}
else {
int *subd_start_corner = mesh->get_subd_start_corner().data();
int *subd_num_corners = mesh->get_subd_num_corners().data();
int *subd_shader = mesh->get_subd_shader().data();
bool *subd_smooth = mesh->get_subd_smooth().data();
int *subd_ptex_offset = mesh->get_subd_ptex_offset().data();
int *subd_face_corners = mesh->get_subd_face_corners().data();
if (!sharp_faces.is_empty() && !use_corner_normals) {
for (int i = 0; i < numfaces; i++) {
subd_smooth[i] = !sharp_faces[i];
}
}
else {
std::fill(subd_smooth, subd_smooth + numfaces, true);
}
if (!material_indices.is_empty()) {
for (int i = 0; i < numfaces; i++) {
subd_shader[i] = clamp_material_index(material_indices[i]);
}
}
else {
std::fill(subd_shader, subd_shader + numfaces, 0);
}
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
std::copy(corner_verts.data(), corner_verts.data() + corner_verts.size(), subd_face_corners);
const blender::OffsetIndices faces = b_mesh.faces();
int ptex_offset = 0;
for (const int i : faces.index_range()) {
const blender::IndexRange face = faces[i];
Mesh: Replace MPoly struct with offset indices Implements #95967. Currently the `MPoly` struct is 12 bytes, and stores the index of a face's first corner and the number of corners/verts/edges. Polygons and corners are always created in order by Blender, meaning each face's corners will be after the previous face's corners. We can take advantage of this fact and eliminate the redundancy in mesh face storage by only storing a single integer corner offset for each face. The size of the face is then encoded by the offset of the next face. The size of a single integer is 4 bytes, so this reduces memory usage by 3 times. The same method is used for `CurvesGeometry`, so Blender already has an abstraction to simplify using these offsets called `OffsetIndices`. This class is used to easily retrieve a range of corner indices for each face. This also gives the opportunity for sharing some logic with curves. Another benefit of the change is that the offsets and sizes stored in `MPoly` can no longer disagree with each other. Storing faces in the order of their corners can simplify some code too. Face/polygon variables now use the `IndexRange` type, which comes with quite a few utilities that can simplify code. Some: - The offset integer array has to be one longer than the face count to avoid a branch for every face, which means the data is no longer part of the mesh's `CustomData`. - We lose the ability to "reference" an original mesh's offset array until more reusable CoW from #104478 is committed. That will be added in a separate commit. - Since they aren't part of `CustomData`, poly offsets often have to be copied manually. - To simplify using `OffsetIndices` in many places, some functions and structs in headers were moved to only compile in C++. - All meshes created by Blender use the same order for faces and face corners, but just in case, meshes with mismatched order are fixed by versioning code. - `MeshPolygon.totloop` is no longer editable in RNA. This API break is necessary here unfortunately. It should be worth it in 3.6, since that's the best way to allow loading meshes from 4.0, which is important for an LTS version. Pull Request: https://projects.blender.org/blender/blender/pulls/105938
2023-04-04 20:39:28 +02:00
subd_start_corner[i] = face.start();
subd_num_corners[i] = face.size();
subd_ptex_offset[i] = ptex_offset;
const int num_ptex = (face.size() == 4) ? 1 : face.size();
ptex_offset += num_ptex;
}
mesh->tag_subd_face_corners_modified();
mesh->tag_subd_start_corner_modified();
mesh->tag_subd_num_corners_modified();
mesh->tag_subd_shader_modified();
mesh->tag_subd_smooth_modified();
mesh->tag_subd_ptex_offset_modified();
}
/* Create all needed attributes.
* The calculate functions will check whether they're needed or not.
*/
if (mesh->need_attribute(scene, ATTR_STD_POINTINESS)) {
attr_create_pointiness(mesh, positions, b_mesh.vert_normals(), b_mesh.edges(), subdivision);
}
attr_create_random_per_island(scene, mesh, b_mesh, subdivision);
attr_create_generic(scene, mesh, b_mesh, subdivision, need_motion, motion_scale);
if (subdivision) {
attr_create_subd_uv_map(scene, mesh, b_mesh, subdivide_uvs, blender_uv_names);
}
else {
attr_create_uv_map(scene, mesh, b_mesh, blender_uv_names);
}
/* For volume objects, create a matrix to transform from object space to
* mesh texture space. this does not work with deformations but that can
* probably only be done well with a volume grid mapping of coordinates. */
if (mesh->need_attribute(scene, ATTR_STD_GENERATED_TRANSFORM)) {
Attribute *attr = mesh->attributes.add(ATTR_STD_GENERATED_TRANSFORM);
Transform *tfm = attr->data_transform();
float3 loc, size;
mesh_texture_space(b_mesh, loc, size);
*tfm = transform_translate(-loc) * transform_scale(size);
}
}
static void create_subd_mesh(Scene *scene,
Mesh *mesh,
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
BObjectInfo &b_ob_info,
const ::Mesh &b_mesh,
const array<Node *> &used_shaders,
const bool need_motion,
const float motion_scale,
float dicing_rate,
int max_subdivisions)
{
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
BL::Object b_ob = b_ob_info.real_object;
BL::SubsurfModifier subsurf_mod(b_ob.modifiers[b_ob.modifiers.length() - 1]);
bool subdivide_uvs = subsurf_mod.uv_smooth() != BL::SubsurfModifier::uv_smooth_NONE;
create_mesh(scene, mesh, b_mesh, used_shaders, need_motion, motion_scale, true, subdivide_uvs);
const blender::VArraySpan creases = *b_mesh.attributes().lookup<float>(
"crease_edge", blender::bke::AttrDomain::Edge);
if (!creases.is_empty()) {
size_t num_creases = 0;
for (const int i : creases.index_range()) {
if (creases[i] != 0.0f) {
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
num_creases++;
}
}
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
mesh->reserve_subd_creases(num_creases);
const blender::Span<blender::int2> edges = b_mesh.edges();
for (const int i : edges.index_range()) {
const float crease = creases[i];
if (crease != 0.0f) {
const blender::int2 &b_edge = edges[i];
Mesh: Move edges to a generic attribute Implements #95966, as the final step of #95965. This commit changes the storage of mesh edge vertex indices from the `MEdge` type to the generic `int2` attribute type. This follows the general design for geometry and the attribute system, where the data storage type and the usage semantics are separated. The main benefit of the change is reduced memory usage-- the requirements of storing mesh edges is reduced by 1/3. For example, this saves 8MB on a 1 million vertex grid. This also gives performance benefits to any memory-bound mesh processing algorithm that uses edges. Another benefit is that all of the edge's vertex indices are contiguous. In a few cases, it's helpful to process all of them as `Span<int>` rather than `Span<int2>`. Similarly, the type is more likely to match a generic format used by a library, or code that shouldn't know about specific Blender `Mesh` types. Various Notes: - The `.edge_verts` name is used to reflect a mapping between domains, similar to `.corner_verts`, etc. The period means that it the data shouldn't change arbitrarily by the user or procedural operations. - `edge[0]` is now used instead of `edge.v1` - Signed integers are used instead of unsigned to reduce the mixing of signed-ness, which can be error prone. - All of the previously used core mesh data types (`MVert`, `MEdge`, `MLoop`, `MPoly` are now deprecated. Only generic types are used). - The `vec2i` DNA type is used in the few C files where necessary. Pull Request: https://projects.blender.org/blender/blender/pulls/106638
2023-04-17 13:47:41 +02:00
mesh->add_edge_crease(b_edge[0], b_edge[1], crease);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
}
}
const blender::VArraySpan vert_creases = *b_mesh.attributes().lookup<float>(
"crease_vert", blender::bke::AttrDomain::Point);
if (!vert_creases.is_empty()) {
for (const int i : vert_creases.index_range()) {
if (vert_creases[i] != 0.0f) {
mesh->add_vertex_crease(i, vert_creases[i]);
}
}
}
/* set subd params */
PointerRNA cobj = RNA_pointer_get(&b_ob.ptr, "cycles");
float subd_dicing_rate = max(0.1f, RNA_float_get(&cobj, "dicing_rate") * dicing_rate);
mesh->set_subd_dicing_rate(subd_dicing_rate);
mesh->set_subd_max_level(max_subdivisions);
mesh->set_subd_objecttoworld(get_transform(b_ob.matrix_world()));
}
/* Sync */
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
void BlenderSync::sync_mesh(BL::Depsgraph b_depsgraph, BObjectInfo &b_ob_info, Mesh *mesh)
{
/* make a copy of the shaders as the caller in the main thread still need them for syncing the
* attributes */
array<Node *> used_shaders = mesh->get_used_shaders();
Mesh new_mesh;
new_mesh.set_used_shaders(used_shaders);
if (view_layer.use_surfaces) {
/* Adaptive subdivision setup. Not for baking since that requires
* exact mapping to the Blender mesh. */
Mesh::SubdivisionType subdivision_type = (b_ob_info.real_object != b_bake_target) ?
object_subdivision_type(b_ob_info.real_object,
preview,
experimental) :
Mesh::SUBDIVISION_NONE;
new_mesh.set_subdivision_type(subdivision_type);
/* For some reason, meshes do not need this... */
bool need_undeformed = new_mesh.need_attribute(scene, ATTR_STD_GENERATED);
BL::Mesh b_mesh = object_to_mesh(
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
b_data, b_ob_info, b_depsgraph, need_undeformed, new_mesh.get_subdivision_type());
if (b_mesh) {
/* Motion blur attribute is relative to seconds, we need it relative to frames. */
const bool need_motion = object_need_motion_attribute(b_ob_info, scene);
const float motion_scale = (need_motion) ?
scene->motion_shutter_time() /
(b_scene.render().fps() / b_scene.render().fps_base()) :
0.0f;
/* Sync mesh itself. */
if (new_mesh.get_subdivision_type() != Mesh::SUBDIVISION_NONE) {
create_subd_mesh(scene,
&new_mesh,
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
b_ob_info,
*static_cast<const ::Mesh *>(b_mesh.ptr.data),
new_mesh.get_used_shaders(),
need_motion,
motion_scale,
dicing_rate,
max_subdivisions);
}
else {
create_mesh(scene,
&new_mesh,
*static_cast<const ::Mesh *>(b_mesh.ptr.data),
new_mesh.get_used_shaders(),
need_motion,
motion_scale,
false);
}
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
free_object_to_mesh(b_data, b_ob_info, b_mesh);
}
}
/* update original sockets */
mesh->clear_non_sockets();
for (const SocketType &socket : new_mesh.type->inputs) {
/* Those sockets are updated in sync_object, so do not modify them. */
2020-11-06 10:29:00 +11:00
if (socket.name == "use_motion_blur" || socket.name == "motion_steps" ||
socket.name == "used_shaders")
{
continue;
}
mesh->set_value(socket, new_mesh, socket);
}
mesh->attributes.update(std::move(new_mesh.attributes));
mesh->subd_attributes.update(std::move(new_mesh.subd_attributes));
mesh->set_num_subd_faces(new_mesh.get_num_subd_faces());
/* tag update */
bool rebuild = (mesh->triangles_is_modified()) || (mesh->subd_num_corners_is_modified()) ||
(mesh->subd_shader_is_modified()) || (mesh->subd_smooth_is_modified()) ||
(mesh->subd_ptex_offset_is_modified()) ||
(mesh->subd_start_corner_is_modified()) ||
(mesh->subd_face_corners_is_modified());
mesh->tag_update(scene, rebuild);
}
void BlenderSync::sync_mesh_motion(BL::Depsgraph b_depsgraph,
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
BObjectInfo &b_ob_info,
Mesh *mesh,
int motion_step)
{
/* Skip if no vertices were exported. */
size_t numverts = mesh->get_verts().size();
if (numverts == 0) {
return;
}
/* Skip objects without deforming modifiers. this is not totally reliable,
* would need a more extensive check to see which objects are animated. */
BL::Mesh b_mesh_rna(PointerRNA_NULL);
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
if (ccl::BKE_object_is_deform_modified(b_ob_info, b_scene, preview)) {
/* get derived mesh */
b_mesh_rna = object_to_mesh(b_data, b_ob_info, b_depsgraph, false, Mesh::SUBDIVISION_NONE);
}
Geometry Nodes: support for geometry instancing Previously, the Point Instance node in geometry nodes could only instance existing objects or collections. The reason was that large parts of Blender worked under the assumption that objects are the main unit of instancing. Now we also want to instance geometry within an object, so a slightly larger refactor was necessary. This should not affect files that do not use the new kind of instances. The main change is a redefinition of what "instanced data" is. Now, an instances is a cow-object + object-data (the geometry). This can be nicely seen in `struct DupliObject`. This allows the same object to generate multiple geometries of different types which can be instanced individually. A nice side effect of this refactor is that having multiple geometry components is not a special case in the depsgraph object iterator anymore, because those components are integrated with the `DupliObject` system. Unfortunately, different systems that work with instances in Blender (e.g. render engines and exporters) often work under the assumption that objects are the main unit of instancing. So those have to be updated as well to be able to handle the new instances. This patch updates Cycles, EEVEE and other viewport engines. Exporters have not been updated yet. Some minimal (not master-ready) changes to update the obj and alembic exporters can be found in P2336 and P2335. Different file formats may want to handle these new instances in different ways. For users, the only thing that changed is that the Point Instance node now has a geometry mode. This also fixes T88454. Differential Revision: https://developer.blender.org/D11841
2021-09-06 18:22:24 +02:00
const std::string ob_name = b_ob_info.real_object.name();
2019-06-17 12:51:53 +10:00
/* TODO(sergey): Perform preliminary check for number of vertices. */
if (b_mesh_rna) {
const ::Mesh &b_mesh = *static_cast<const ::Mesh *>(b_mesh_rna.ptr.data);
const int b_verts_num = b_mesh.verts_num;
const blender::Span<blender::float3> positions = b_mesh.vert_positions();
if (positions.is_empty()) {
free_object_to_mesh(b_data, b_ob_info, b_mesh_rna);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
return;
}
/* Export deformed coordinates. */
/* Find attributes. */
Attribute *attr_mP = mesh->attributes.find(ATTR_STD_MOTION_VERTEX_POSITION);
Attribute *attr_mN = mesh->attributes.find(ATTR_STD_MOTION_VERTEX_NORMAL);
Attribute *attr_N = mesh->attributes.find(ATTR_STD_VERTEX_NORMAL);
bool new_attribute = false;
/* Add new attributes if they don't exist already. */
if (!attr_mP) {
attr_mP = mesh->attributes.add(ATTR_STD_MOTION_VERTEX_POSITION);
if (attr_N) {
attr_mN = mesh->attributes.add(ATTR_STD_MOTION_VERTEX_NORMAL);
}
new_attribute = true;
}
/* Load vertex data from mesh. */
float3 *mP = attr_mP->data_float3() + motion_step * numverts;
float3 *mN = (attr_mN) ? attr_mN->data_float3() + motion_step * numverts : nullptr;
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
/* NOTE: We don't copy more that existing amount of vertices to prevent
* possible memory corruption.
*/
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
for (int i = 0; i < std::min<size_t>(b_verts_num, numverts); i++) {
Mesh: Move positions to a generic attribute **Changes** As described in T93602, this patch removes all use of the `MVert` struct, replacing it with a generic named attribute with the name `"position"`, consistent with other geometry types. Variable names have been changed from `verts` to `positions`, to align with the attribute name and the more generic design (positions are not vertices, they are just an attribute stored on the point domain). This change is made possible by previous commits that moved all other data out of `MVert` to runtime data or other generic attributes. What remains is mostly a simple type change. Though, the type still shows up 859 times, so the patch is quite large. One compromise is that now `CD_MASK_BAREMESH` now contains `CD_PROP_FLOAT3`. With the general move towards generic attributes over custom data types, we are removing use of these type masks anyway. **Benefits** The most obvious benefit is reduced memory usage and the benefits that brings in memory-bound situations. `float3` is only 3 bytes, in comparison to `MVert` which was 4. When there are millions of vertices this starts to matter more. The other benefits come from using a more generic type. Instead of writing algorithms specifically for `MVert`, code can just use arrays of vectors. This will allow eliminating many temporary arrays or wrappers used to extract positions. Many possible improvements aren't implemented in this patch, though I did switch simplify or remove the process of creating temporary position arrays in a few places. The design clarity that "positions are just another attribute" brings allows removing explicit copying of vertices in some procedural operations-- they are just processed like most other attributes. **Performance** This touches so many areas that it's hard to benchmark exhaustively, but I observed some areas as examples. * The mesh line node with 4 million count was 1.5x (8ms to 12ms) faster. * The Spring splash screen went from ~4.3 to ~4.5 fps. * The subdivision surface modifier/node was slightly faster RNA access through Python may be slightly slower, since now we need a name lookup instead of just a custom data type lookup for each index. **Future Improvements** * Remove uses of "vert_coords" functions: * `BKE_mesh_vert_coords_alloc` * `BKE_mesh_vert_coords_get` * `BKE_mesh_vert_coords_apply{_with_mat4}` * Remove more hidden copying of positions * General simplification now possible in many areas * Convert more code to C++ to use `float3` instead of `float[3]` * Currently `reinterpret_cast` is used for those C-API functions Differential Revision: https://developer.blender.org/D15982
2023-01-10 00:10:43 -05:00
mP[i] = make_float3(positions[i][0], positions[i][1], positions[i][2]);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
if (mN) {
const blender::Span<blender::float3> b_vert_normals = b_mesh.vert_normals();
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
for (int i = 0; i < std::min<size_t>(b_verts_num, numverts); i++) {
mN[i] = make_float3(b_vert_normals[i][0], b_vert_normals[i][1], b_vert_normals[i][2]);
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
}
}
if (new_attribute) {
/* In case of new attribute, we verify if there really was any motion. */
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
if (b_verts_num != numverts ||
memcmp(mP, mesh->get_verts().data(), sizeof(float3) * numverts) == 0)
{
/* no motion, remove attributes again */
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
if (b_verts_num != numverts) {
VLOG_WARNING << "Topology differs, disabling motion blur for object " << ob_name;
}
else {
VLOG_DEBUG << "No actual deformation motion for object " << ob_name;
}
mesh->attributes.remove(ATTR_STD_MOTION_VERTEX_POSITION);
if (attr_mN) {
mesh->attributes.remove(ATTR_STD_MOTION_VERTEX_NORMAL);
}
}
else if (motion_step > 0) {
VLOG_DEBUG << "Filling deformation motion for object " << ob_name;
/* motion, fill up previous steps that we might have skipped because
* they had no motion, but we need them anyway now */
float3 *P = mesh->get_verts().data();
float3 *N = (attr_N) ? attr_N->data_float3() : nullptr;
for (int step = 0; step < motion_step; step++) {
memcpy(attr_mP->data_float3() + step * numverts, P, sizeof(float3) * numverts);
if (attr_mN) {
memcpy(attr_mN->data_float3() + step * numverts, N, sizeof(float3) * numverts);
}
}
}
}
else {
Mesh: Remove redundant custom data pointers For copy-on-write, we want to share attribute arrays between meshes where possible. Mutable pointers like `Mesh.mvert` make that difficult by making ownership vague. They also make code more complex by adding redundancy. The simplest solution is just removing them and retrieving layers from `CustomData` as needed. Similar changes have already been applied to curves and point clouds (e9f82d3dc7ee, 410a6efb747f). Removing use of the pointers generally makes code more obvious and more reusable. Mesh data is now accessed with a C++ API (`Mesh::edges()` or `Mesh::edges_for_write()`), and a C API (`BKE_mesh_edges(mesh)`). The CoW changes this commit makes possible are described in T95845 and T95842, and started in D14139 and D14140. The change also simplifies the ongoing mesh struct-of-array refactors from T95965. **RNA/Python Access Performance** Theoretically, accessing mesh elements with the RNA API may become slower, since the layer needs to be found on every random access. However, overhead is already high enough that this doesn't make a noticible differenc, and performance is actually improved in some cases. Random access can be up to 10% faster, but other situations might be a bit slower. Generally using `foreach_get/set` are the best way to improve performance. See the differential revision for more discussion about Python performance. Cycles has been updated to use raw pointers and the internal Blender mesh types, mostly because there is no sense in having this overhead when it's already compiled with Blender. In my tests this roughly halves the Cycles mesh creation time (0.19s to 0.10s for a 1 million face grid). Differential Revision: https://developer.blender.org/D15488
2022-09-05 11:56:34 -05:00
if (b_verts_num != numverts) {
VLOG_WARNING << "Topology differs, discarding motion blur for object " << ob_name
<< " at time " << motion_step;
memcpy(mP, mesh->get_verts().data(), sizeof(float3) * numverts);
if (mN != nullptr) {
memcpy(mN, attr_N->data_float3(), sizeof(float3) * numverts);
}
}
}
free_object_to_mesh(b_data, b_ob_info, b_mesh_rna);
return;
}
/* No deformation on this frame, copy coordinates if other frames did have it. */
mesh->copy_center_to_motion_step(motion_step);
}
CCL_NAMESPACE_END