Bit vector  0.9
Fast and space efficient dynamic bit vector library.
Public Member Functions | Protected Member Functions | Protected Attributes | List of all members
bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype > Class Template Reference

Container class for dynamic b-tree bit vector stuctures. More...

#include <bit_vector.hpp>

Collaboration diagram for bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >:
Collaboration graph
[legend]

Public Member Functions

 bit_vector (allocator *alloc)
 Bit vector constructor with existing allocator. More...
 
 bit_vector ()
 Default constructor that creates an owned allocator. More...
 
 ~bit_vector ()
 Deconstructor that deallocates the entire data structure. More...
 
template<dtype block_size = 2048>
void generate_query_structure (query_support< dtype, leaf, block_size > *qs) const
 Populate a given query support stucture using this More...
 
template<dtype block_size = leaf_size / 3>
query_support< dtype, leaf, block_size > * generate_query_structure () const
 Create and Populate a query support structure using this More...
 
void insert (uint64_t index, bool value)
 Insert "value" into position "index". More...
 
bool remove (uint64_t index)
 Remove element at "index". More...
 
uint64_t sum () const
 Number of 1-bits in the data structure. More...
 
uint64_t size () const
 Number of elements stored in the data structure. More...
 
bool at (uint64_t index) const
 Retrieves the value of the indexth element in the data structure. More...
 
uint64_t rank (uint64_t index) const
 Number of 1-bits up to position index. More...
 
uint64_t select (uint64_t count) const
 Index of the countth 1-bit in the data structure. More...
 
void set (uint64_t index, bool value)
 Sets the bit at "index" to "value". More...
 
void flush ()
 Recursively flushes all buffers in the data structure. More...
 
uint64_t bit_size () const
 Total size of data structure allocations in bits. More...
 
void validate () const
 Asserts that the data structure is internally consistent. More...
 
void print (bool internal_only) const
 Output data structure to standard out as json. More...
 

Protected Member Functions

void split_root ()
 Increases the height of the tree by one level. More...
 

Protected Attributes

bool root_is_leaf_ = true
 Value indicating whether the root is in n_root_ or l_root_.
 
bool owned_allocator_ = false
 True if deallocating the allocator is the responsibilty of the data structure.
 
noden_root_
 Root of the b-tree if a multi-level structure is required.
 
leafl_root_
 Root if a single leaf is sufficient.
 
allocator * allocator_
 Pointer to allocator used for allocating internal nodes and leaves.
 

Detailed Description

template<class leaf, class node, class allocator, uint64_t leaf_size, uint8_t branches, class dtype>
class bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >

Container class for dynamic b-tree bit vector stuctures.

This class manages a dynamic bit vector structure and provides the public API for querying the structure.

The structure supports efficient insertion and removal, along with common bit vector operations (set, access (at), rank and select). Additionally the convenience and debug functions sum, size, bit_size, validate, print and bit_size are provided at no extra cost.

Practical performance depends on node and leaf implementations.

Template Parameters
leafType for leaves. Some kind of bv::leaf.
nodeType for internal nodes. Some kind of bv::node.
allocatorAllocator type. For example bv::malloc_alloc.
leaf_sizeMaximum number of elements in leaf.
branchesMaximum branching factor of internal node.

Constructor & Destructor Documentation

◆ bit_vector() [1/2]

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::bit_vector ( allocator *  alloc)
inline

Bit vector constructor with existing allocator.

One allocator can be used to provide services for multiple bit vector instances. There is no practical benefit from sharing bv::malloc_alloc allocators, but a performance benefit is expected by sharing more advanced allocators between multiple bit vector instances.

Parameters
allocThe allocator instance.

◆ bit_vector() [2/2]

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::bit_vector ( )
inline

Default constructor that creates an owned allocator.

The default constructor will create an owned allocator that gets deallocated along with the rest of the data structure.

◆ ~bit_vector()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::~bit_vector ( )
inline

Deconstructor that deallocates the entire data structure.

All of the internal nodes and leaves allocated for this datastucture get recursively deallocated before deallocation of the bit vector container. If the allocator is owned by the bit vector container, that will be dallocated as well.

Member Function Documentation

◆ at()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
bool bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::at ( uint64_t  index) const
inline

Retrieves the value of the indexth element in the data structure.

Performs an at-query on either n_root_ or l_root_ based on tree status.

When using bv::node and bv::leaf element for the internal data structure, access operations take \(\mathcal{O}\left(\log_2(b)\log_b(n / l)\right)\) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.

Parameters
indexIndex to access
Returns
Boolean value indicating if the indexth element is set.

◆ bit_size()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
uint64_t bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::bit_size ( ) const
inline

Total size of data structure allocations in bits.

Calculates the total size of "this", allocated nodes and allocated leaves in bits.

Size of the allocator is not included, and no consideration is made on the effects of memory fragmentation.

Returns
8 * sizeof(bit_vector) + tree_size

◆ flush()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::flush ( )
inline

Recursively flushes all buffers in the data structure.

If execution transitions to a phase where no insertions or removals are expected, flushing all buffers should improve query time at the cost of a fairly expensive flushing operations.

◆ generate_query_structure() [1/2]

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
template<dtype block_size = leaf_size / 3>
query_support<dtype, leaf, block_size>* bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::generate_query_structure ( ) const
inline

Create and Populate a query support structure using this

Creates a new support structure and adds leaves in order.

Template Parameters
block_sizeSize of blocks used in the query support structure.
Returns
A new query support strucutre.

◆ generate_query_structure() [2/2]

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
template<dtype block_size = 2048>
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::generate_query_structure ( query_support< dtype, leaf, block_size > *  qs) const
inline

Populate a given query support stucture using this

Traverses the tree and ads encountered leaves to the query support strucutre.

Template Parameters
block_sizeSize of blocks used in the query support structure.
Parameters
qsQuery support structure.

◆ insert()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::insert ( uint64_t  index,
bool  value 
)
inline

Insert "value" into position "index".

The bit vector container ensures that there is sufficient space in the b-tree, splitting nodes and increasing the tree height as necessary.

Insert operations take \(\mathcal{O}\left(b\log_b(n / l) + l\right)\) amortized time when using bv::node and bv::leaf elements for data storage, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. The branching overhead compared to the access operation is due to updating cumulative sums and sizes in the internal nodes.

Parameters
indexWhere should value be inserted.
valueWhat should be inserted at index.

◆ print()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::print ( bool  internal_only) const
inline

Output data structure to standard out as json.

Parameters
internal_onlyIf true, actual bit vector data will not be output to save space.

◆ rank()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
uint64_t bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::rank ( uint64_t  index) const
inline

Number of 1-bits up to position index.

Counts the number of bits set in the first index bits.

When using bv::node and bv::leaf elements for the internal data structure, rank operations take \(\mathcal{O}\left(\log_2(b)\log_b(n / l) + l\right)\) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.

Parameters
indexNumber of elements to include in the "summation".
Returns
\(\sum_{i = 0}^{\mathrm{index - 1}} \mathrm{bv}[i]\).

◆ remove()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
bool bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::remove ( uint64_t  index)
inline

Remove element at "index".

Removes a bit from the underlying data structure, decreasing the tree height as appropriate. Tree height is decreased if n_roor_ is the active root and has exactly one child.

Remove operations take \(\mathcal{O}\left(b\log_b(n / l) + l\right)\) amortized time when using bv::node and bv::leaf elements for data storage, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. The branching overhead compared to the access operation is due to updating cumulative sums and sizes in the internal nodes.

The value of the removed bit needs to be bubbled up the data structure to update cumulative sums. This makes returning the removed value here a 0-cost "side effect".

Parameters
indexPosition of element to remove.
Returns
Value of the removed bit.

◆ select()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
uint64_t bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::select ( uint64_t  count) const
inline

Index of the countth 1-bit in the data structure.

When using bv::node and bv::leaf for internal strucutres, finds the position of the countth 1-bit is found in \(\mathcal{O}\left(\log_2(b)\log_b(n / l) + l\right) \) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size.

Parameters
countSelection target.
Returns
\(\underset{i \in [0..n)}{\mathrm{arg min}}\left(\sum_{j = 0}^i \mathrm{bv}[j]\right) = \) count.

◆ set()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::set ( uint64_t  index,
bool  value 
)
inline

Sets the bit at "index" to "value".

When using bv::node and bv::leaf for internal strucutres, sets the value of the indexth bit in \(\mathcal{O}\left(b\log_b(n / l)\right) \) time, where \(b\) is the branching factor, \(l\) is the leaf size and \(n\) is the data structure size. As with insertion and removal there is overhead in updating internal nodes.

Parameters
indexIndex to set.
valuevalue to set the indexth bit to.

◆ size()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
uint64_t bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::size ( ) const
inline

Number of elements stored in the data structure.

Cumulative sizes are maintained as the data structure changes. As such this is a constant time lookup operation.

Returns
\(n\).

◆ split_root()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::split_root ( )
inlineprotected

Increases the height of the tree by one level.

A full root node will be split into two and nodes are set as children of a new root node.

The root node will be kept as the only node with less than branches/2 children. The child nodes will each have exactly branches/2 children.

◆ sum()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
uint64_t bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::sum ( ) const
inline

Number of 1-bits in the data structure.

Cumulative partial sums are maintained as the data structure changes. As such this is a constant time lookup operation.

Returns
\(\sum_{i = 0}^{n - 1} \mathrm{bv}[i]\) for \(n\) element bit vector.

◆ validate()

template<class leaf , class node , class allocator , uint64_t leaf_size, uint8_t branches, class dtype >
void bv::bit_vector< leaf, node, allocator, leaf_size, branches, dtype >::validate ( ) const
inline

Asserts that the data structure is internally consistent.

Walks through the entire data structure and ensures that invariants and cumulative sums and sizes are valid for a data structure of the specified type.

This does not guarantee that the data structure is correct given the preceding query sequence, simply checks that a valid data structure seems to be correctly defined.

If the -DNDEBUG compiler flag is given, this function will do nothing since assertions will be (void(0))ed out.


The documentation for this class was generated from the following file: