Bit vector
0.9
Fast and space efficient dynamic bit vector library.
|
Container class for dynamic b-tree bit vector stuctures. More...
#include <bit_vector.hpp>
Public Member Functions | |
bit_vector (allocator *alloc) | |
Bit vector constructor with existing allocator. More... | |
bit_vector () | |
Default constructor that creates an owned allocator. More... | |
~bit_vector () | |
Deconstructor that deallocates the entire data structure. More... | |
template<dtype block_size = 2048> | |
void | generate_query_structure (query_support< dtype, leaf, block_size > *qs) const |
Populate a given query support stucture using this More... | |
template<dtype block_size = leaf_size / 3> | |
query_support< dtype, leaf, block_size > * | generate_query_structure () const |
Create and Populate a query support structure using this More... | |
void | insert (uint64_t index, bool value) |
Insert "value" into position "index". More... | |
bool | remove (uint64_t index) |
Remove element at "index". More... | |
uint64_t | sum () const |
Number of 1-bits in the data structure. More... | |
uint64_t | size () const |
Number of elements stored in the data structure. More... | |
bool | at (uint64_t index) const |
Retrieves the value of the indexth element in the data structure. More... | |
uint64_t | rank (uint64_t index) const |
Number of 1-bits up to position index. More... | |
uint64_t | select (uint64_t count) const |
Index of the countth 1-bit in the data structure. More... | |
void | set (uint64_t index, bool value) |
Sets the bit at "index" to "value". More... | |
void | flush () |
Recursively flushes all buffers in the data structure. More... | |
uint64_t | bit_size () const |
Total size of data structure allocations in bits. More... | |
void | validate () const |
Asserts that the data structure is internally consistent. More... | |
void | print (bool internal_only) const |
Output data structure to standard out as json. More... | |
Protected Member Functions | |
void | split_root () |
Increases the height of the tree by one level. More... | |
Protected Attributes | |
bool | root_is_leaf_ = true |
Value indicating whether the root is in n_root_ or l_root_ . | |
bool | owned_allocator_ = false |
True if deallocating the allocator is the responsibilty of the data structure. | |
node * | n_root_ |
Root of the b-tree if a multi-level structure is required. | |
leaf * | l_root_ |
Root if a single leaf is sufficient. | |
allocator * | allocator_ |
Pointer to allocator used for allocating internal nodes and leaves. | |
Container class for dynamic b-tree bit vector stuctures.
This class manages a dynamic bit vector structure and provides the public API for querying the structure.
The structure supports efficient insertion and removal, along with common bit vector operations (set
, access (at
), rank
and select
). Additionally the convenience and debug functions sum
, size
, bit_size
, validate
, print
and bit_size
are provided at no extra cost.
Practical performance depends on node and leaf implementations.
leaf | Type for leaves. Some kind of bv::leaf. |
node | Type for internal nodes. Some kind of bv::node. |
allocator | Allocator type. For example bv::malloc_alloc. |
leaf_size | Maximum number of elements in leaf. |
branches | Maximum branching factor of internal node. |
|
inline |
Bit vector constructor with existing allocator.
One allocator can be used to provide services for multiple bit vector instances. There is no practical benefit from sharing bv::malloc_alloc allocators, but a performance benefit is expected by sharing more advanced allocators between multiple bit vector instances.
alloc | The allocator instance. |
|
inline |
Default constructor that creates an owned allocator.
The default constructor will create an owned allocator that gets deallocated along with the rest of the data structure.
|
inline |
Deconstructor that deallocates the entire data structure.
All of the internal nodes and leaves allocated for this datastucture get recursively deallocated before deallocation of the bit vector container. If the allocator is owned by the bit vector container, that will be dallocated as well.
|
inline |
Retrieves the value of the indexth element in the data structure.
Performs an at-query on either n_root_
or l_root_
based on tree status.
When using bv::node and bv::leaf element for the internal data structure, access operations take \(\mathcal{O}\left(\log_2(b)\log_b(n / l)\right)\) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.
index | Index to access |
|
inline |
Total size of data structure allocations in bits.
Calculates the total size of "this", allocated nodes and allocated leaves in bits.
Size of the allocator is not included, and no consideration is made on the effects of memory fragmentation.
8 * sizeof(bit_vector) + tree_size
|
inline |
Recursively flushes all buffers in the data structure.
If execution transitions to a phase where no insertions or removals are expected, flushing all buffers should improve query time at the cost of a fairly expensive flushing operations.
|
inline |
Create and Populate a query support structure using this
Creates a new support structure and adds leaves in order.
block_size | Size of blocks used in the query support structure. |
|
inline |
Populate a given query support stucture using this
Traverses the tree and ads encountered leaves to the query support strucutre.
block_size | Size of blocks used in the query support structure. |
qs | Query support structure. |
|
inline |
Insert "value" into position "index".
The bit vector container ensures that there is sufficient space in the b-tree, splitting nodes and increasing the tree height as necessary.
Insert operations take \(\mathcal{O}\left(b\log_b(n / l) + l\right)\) amortized time when using bv::node and bv::leaf elements for data storage, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. The branching overhead compared to the access operation is due to updating cumulative sums and sizes in the internal nodes.
index | Where should value be inserted. |
value | What should be inserted at index . |
|
inline |
Output data structure to standard out as json.
internal_only | If true, actual bit vector data will not be output to save space. |
|
inline |
Number of 1-bits up to position index.
Counts the number of bits set in the first index bits.
When using bv::node and bv::leaf elements for the internal data structure, rank operations take \(\mathcal{O}\left(\log_2(b)\log_b(n / l) + l\right)\) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.
index | Number of elements to include in the "summation". |
|
inline |
Remove element at "index".
Removes a bit from the underlying data structure, decreasing the tree height as appropriate. Tree height is decreased if n_roor_
is the active root and has exactly one child.
Remove operations take \(\mathcal{O}\left(b\log_b(n / l) + l\right)\) amortized time when using bv::node and bv::leaf elements for data storage, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. The branching overhead compared to the access operation is due to updating cumulative sums and sizes in the internal nodes.
The value of the removed bit needs to be bubbled up the data structure to update cumulative sums. This makes returning the removed value here a 0-cost "side effect".
index | Position of element to remove. |
|
inline |
Index of the countth 1-bit in the data structure.
When using bv::node and bv::leaf for internal strucutres, finds the position of the countth 1-bit is found in \(\mathcal{O}\left(\log_2(b)\log_b(n / l) + l\right) \) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.
count | Selection target. |
|
inline |
Sets the bit at "index" to "value".
When using bv::node and bv::leaf for internal strucutres, sets the value of the indexth bit in \(\mathcal{O}\left(b\log_b(n / l)\right) \) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. As with insertion and removal there is overhead in updating internal nodes.
index | Index to set. |
value | value to set the indexth bit to. |
|
inline |
Number of elements stored in the data structure.
Cumulative sizes are maintained as the data structure changes. As such this is a constant time lookup operation.
|
inlineprotected |
Increases the height of the tree by one level.
A full root node will be split into two and nodes are set as children of a new root node.
The root node will be kept as the only node with less than branches/2
children. The child nodes will each have exactly branches/2
children.
|
inline |
Number of 1-bits in the data structure.
Cumulative partial sums are maintained as the data structure changes. As such this is a constant time lookup operation.
|
inline |
Asserts that the data structure is internally consistent.
Walks through the entire data structure and ensures that invariants and cumulative sums and sizes are valid for a data structure of the specified type.
This does not guarantee that the data structure is correct given the preceding query sequence, simply checks that a valid data structure seems to be correctly defined.
If the -DNDEBUG
compiler flag is given, this function will do nothing since assertions will be (void(0))
ed out.