This patch optimizes `IndexMask::from_bits` by processing many bits at once instead of inspecting every bit individually. Bits are stored as an array of `BitInt` (aka `uint64_t`), so at least 64 bits can be processed at a time. On some platforms, SIMD can also be used to process up to 128 bits at once. This can significantly improve performance if all bits are set or all bits are unset.

As a byproduct, this patch also optimizes `IndexMask::from_bools`, which is now implemented in terms of `IndexMask::from_bits`. The conversion from bools to bits has also been optimized significantly by using SIMD intrinsics.

Pull Request: https://projects.blender.org/blender/blender/pulls/126888
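The word-at-a-time idea described in the commit message can be sketched roughly as follows. This is only an illustration, not the code from the patch: `indices_from_bits` is a hypothetical helper, it uses `std::vector` instead of Blender's span and `IndexMask` types, and it leaves out the SIMD path entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

/* Hypothetical helper (not Blender API): collects the indices of all set bits among the first
 * `bits_num` bits, looking at whole 64-bit words where possible. */
inline std::vector<int64_t> indices_from_bits(const std::vector<uint64_t> &words,
                                              const int64_t bits_num)
{
  assert(bits_num >= 0 && bits_num <= int64_t(words.size()) * 64);
  std::vector<int64_t> indices;
  const int64_t full_words_num = bits_num / 64;
  for (int64_t w = 0; w < full_words_num; w++) {
    uint64_t word = words[w];
    const int64_t base = w * 64;
    if (word == 0) {
      /* 64 unset bits are skipped with a single comparison. */
      continue;
    }
    if (word == ~uint64_t(0)) {
      /* 64 set bits: the whole range is appended without testing individual bits. */
      for (int64_t i = 0; i < 64; i++) {
        indices.push_back(base + i);
      }
      continue;
    }
    /* Mixed word: visit only the set bits. */
    while (word != 0) {
      int bit = 0;
      while (((word >> bit) & 1) == 0) {
        bit++;
      }
      indices.push_back(base + bit);
      word &= word - 1; /* Clear the lowest set bit. */
    }
  }
  /* Remaining bits of a trailing partial word are checked one by one. */
  for (int64_t i = full_words_num * 64; i < bits_num; i++) {
    if ((words[i / 64] >> (i % 64)) & 1) {
      indices.push_back(i);
    }
  }
  return indices;
}
```

The two early-out branches are where the benefit for fully set or fully unset inputs would come from: each handles 64 bits with one comparison.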
26 lines
825 B
C++
/* SPDX-FileCopyrightText: 2024 Blender Authors
 *
 * SPDX-License-Identifier: GPL-2.0-or-later */

#pragma once

#include "BLI_bit_span.hh"
#include "BLI_span.hh"

namespace blender::bits {

/**
 * Converts the bools to bits and `or`s them into the given bits. For pure conversion, the bits
 * should therefore be zero initialized before they are passed into this function.
 *
 * \param allowed_overshoot: How many bools/bits can be read/written after the end of the given
 * spans. This can help with performance because the internal algorithm can process many elements
 * at once.
 *
 * \return True if any of the checked bools were true (this also includes the bools in the
 * overshoot).
 */
bool or_bools_into_bits(Span<bool> bools, MutableBitSpan r_bits, int64_t allowed_overshoot = 0);

} // namespace blender::bits
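To illustrate the contract documented in the header above, here is a minimal standalone sketch. It is not the implementation from the patch: `or_bools_into_bits_sketch` is a hypothetical name, it takes raw pointers instead of `Span<bool>` and `MutableBitSpan`, and it assumes the bit range starts at bit 0 of `r_words[0]`. In place of SIMD intrinsics it packs 8 bools at a time with a portable multiply trick, which assumes a little-endian platform and that every `bool` byte (including overshoot bytes) holds exactly 0 or 1.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

/* Hypothetical stand-in for the documented function. Existing bits in `r_words` are kept
 * because new bits are OR-ed in, so callers wanting a pure conversion zero the words first. */
inline bool or_bools_into_bits_sketch(const bool *bools,
                                      const int64_t bools_num,
                                      uint64_t *r_words,
                                      const int64_t allowed_overshoot = 0)
{
  assert(bools_num >= 0 && allowed_overshoot >= 0);
  bool any_true = false;
  int64_t i = 0;
  /* Chunked path. With enough allowed overshoot, a trailing partial chunk is processed as a
   * full chunk, which reads bools and writes bits past the end of the requested range. */
  const int64_t chunked_end = bools_num + allowed_overshoot;
  for (; i < bools_num && i + 8 <= chunked_end; i += 8) {
    uint64_t chunk;
    std::memcpy(&chunk, bools + i, 8);
    /* Gathers the low bit of each of the 8 bytes into one byte (assumption: little-endian,
     * bytes are 0 or 1). Byte k of `chunk` ends up in bit k of `packed`. */
    const uint64_t packed = (chunk * 0x0102040810204080ULL) >> 56;
    r_words[i / 64] |= packed << (i % 64);
    any_true |= packed != 0;
  }
  /* Scalar tail for bools that could not be covered by a full chunk. */
  for (; i < bools_num; i++) {
    if (bools[i]) {
      r_words[i / 64] |= uint64_t(1) << (i % 64);
      any_true = true;
    }
  }
  return any_true;
}
```

With `allowed_overshoot = 0` the last few bools fall back to the scalar loop. A caller that pads its buffers can pass a larger overshoot so that everything goes through the chunked path, at the cost of the return value also reflecting the padded bools, as the doc comment notes.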