sps.RamPrefixSum_3D_1O
- class sps.RamPrefixSum_3D_1O
Prefix sum index for intervals in 3-dimensional space.
- add_point(self: sps.RamPrefixSum_3D_1O, start: List[int[3]], end: List[int[3]], val: int = 1) None
Append an interval to the data structure.
- Parameters:
start (list[int[3]]) – The bottom left position of the interval.
end (list[int[3]]) – The top right position of the interval.
val (int) – The value of the interval.
The interval will not be queryable until generate is called.
Dimension 1 of start and end may have different values, where the value of end must be greater than or equal to the value of start. Dimensions 2 and 3 of start and end must have equal values.
Note that this function will add one point for each outside corner of the given interval.
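The constraint on start and end can be checked before calling add_point. The following is a minimal sketch and not part of sps: check_interval is a hypothetical helper, and treating the first dimension as the only one that may extend is an assumption taken from the rule above.

```python
from typing import Sequence


def check_interval(start: Sequence[int], end: Sequence[int],
                   num_orthotope_dims: int = 1) -> None:
    """Raise ValueError if (start, end) violates the add_point constraints.

    Hypothetical helper, not part of sps. It assumes that the first
    num_orthotope_dims dimensions are the ones allowed to extend.
    """
    if len(start) != 3 or len(end) != 3:
        raise ValueError("start and end must each have 3 coordinates")
    for dim, (s, e) in enumerate(zip(start, end)):
        if dim < num_orthotope_dims:
            # Extending dimension: end may exceed start, but not the reverse.
            if e < s:
                raise ValueError(f"dimension {dim + 1}: end must be >= start")
        elif s != e:
            # Remaining dimensions: the interval has no extent here.
            raise ValueError(f"dimension {dim + 1}: start and end must be equal")


check_interval([2, 5, 7], [9, 5, 7])  # valid: only dimension 1 differs
```

A call such as check_interval([2, 5, 7], [9, 6, 7]) raises ValueError, because dimension 2 differs between start and end.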
- generate(self: sps.RamPrefixSum_3D_1O, factor: float = -1, verbosity: int = 1) int
Generate a new dataset from the previously added points.
- Parameters:
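factor (float) – The factor f, proportional to the number of boxes used in the data structure (see estimate_num_elements), defaults to -1.
verbosity (int) – Degree of verbosity while creating the dataset, defaults to 1.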
- Returns:
The id of the generated dataset.
- Return type:
int
This may take a long time to compute.
Use len(index) to determine the index of the first and last point, as add_point may add multiple points per call.
This function is multithreaded.
- count(*args, **kwargs)
Overloaded function.
count(self: sps.RamPrefixSum_3D_1O, dataset_id: int, from_pos: List[int[3]], to_pos: List[int[3]], intersection_type: sps.IntersectionType = <IntersectionType.enclosed: 0>, no_points: bool = False, verbosity: int = 0) -> int
Count the number of points between from_pos and to_pos in the given dataset.
- Parameters:
dataset_id (int) – The id of the dataset to query.
from_pos (list[int[3]]) – The bottom left position of the query region.
to_pos (list[int[3]]) – The top right position of the query region.
intersection_type (IntersectionType) – Which data elements to count (enclosed, overlapping, enclosing, etc.), defaults to IntersectionType.enclosed.
no_points (bool) – Whether to count points or not, defaults to False.
verbosity (int) – Degree of verbosity while counting, defaults to 0.
- Returns:
The number of points in dataset_id between from_pos and to_pos.
- Return type:
int
to_pos must be greater than or equal to from_pos in each dimension.
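Conceptually, a prefix sum index answers such a count with a fixed number of lookups via inclusion-exclusion. The sketch below illustrates the idea on a small dense 3-dimensional grid in pure Python. It is not how sps stores its data, and the helper names as well as the half-open query region [from_pos, to_pos) are assumptions made for this illustration.

```python
from itertools import product


def build_prefix_sums(points, size):
    """Return P where P[x][y][z] counts points strictly below (x, y, z) in every dimension."""
    sx, sy, sz = size
    P = [[[0] * (sz + 1) for _ in range(sy + 1)] for _ in range(sx + 1)]
    for x, y, z in points:
        P[x + 1][y + 1][z + 1] += 1
    # Accumulate along one axis at a time; product() yields indices in
    # lexicographic order, so the predecessor cell is always finished first.
    for axis in range(3):
        for idx in product(range(sx + 1), range(sy + 1), range(sz + 1)):
            if idx[axis] > 0:
                prev = list(idx)
                prev[axis] -= 1
                P[idx[0]][idx[1]][idx[2]] += P[prev[0]][prev[1]][prev[2]]
    return P


def count_region(P, from_pos, to_pos):
    """Count points p with from_pos <= p < to_pos using 2**3 = 8 lookups."""
    total = 0
    for corner in product((0, 1), repeat=3):
        x, y, z = (to_pos[d] if corner[d] else from_pos[d] for d in range(3))
        # Subtract when an odd number of coordinates come from from_pos.
        sign = -1 if (3 - sum(corner)) % 2 else 1
        total += sign * P[x][y][z]
    return total


points = [(0, 0, 0), (1, 2, 3), (1, 2, 3), (3, 3, 3)]
P = build_prefix_sums(points, size=(4, 4, 4))
print(count_region(P, (1, 2, 3), (2, 3, 4)))  # 2
```

The table is built once, which mirrors why a dataset has to be generated before it can be queried.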
count(self: sps.RamPrefixSum_3D_1O, dataset_id: int, from_pos: List[int[3]], to_pos: List[int[3]], intersection_types: List[sps.IntersectionType[1]], no_points: List[bool[1]], verbosity: int = 0) -> int
Count the number of points between from_pos and to_pos in the given dataset. This overload allows you to specify, for each dimension, how to deal with data-hyperrectangles. E.g. you can specify to count data-hyperrectangles that overlap the query-hyperrectangle in dimension 1 but are fully enclosed by the query-hyperrectangle in dimension 2, by providing intersection_types = [overlaps, enclosed].
- Parameters:
dataset_id (int) – The id of the dataset to query.
from_pos (list[int[3]]) – The bottom left position of the query region.
to_pos (list[int[3]]) – The top right position of the query region.
intersection_types (list[IntersectionType[1]]) – Which data elements to count (enclosed, overlapping, enclosing, etc.), separated for each dimension.
no_points (list[bool[1]]) – Whether to count points or not.
verbosity (int) – Degree of verbosity while counting, defaults to 0.
- Returns:
The number of points in dataset_id between from_pos and to_pos.
- Return type:
int
to_pos must be greater than or equal to from_pos in each dimension.
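The per-dimension choice can be pictured as one interval test per dimension, all of which must hold. The following is a rough sketch under stated assumptions, not the library's implementation: the Intersection enum is a local stand-in (only the name enclosed and its value 0 appear in the signature above; the other values are arbitrary), and treating both interval ends as inclusive is an assumption that may differ from how sps handles boundaries.

```python
from enum import Enum


class Intersection(Enum):
    # Local stand-in for illustration only; values other than
    # enclosed = 0 are arbitrary and not taken from sps.
    enclosed = 0
    overlaps = 1
    encloses = 2


def matches(kind, data, query):
    """Test one dimension; data and query are (low, high) pairs with inclusive ends."""
    d_lo, d_hi = data
    q_lo, q_hi = query
    if kind is Intersection.enclosed:
        # The data interval lies completely inside the query interval.
        return q_lo <= d_lo and d_hi <= q_hi
    if kind is Intersection.encloses:
        # The data interval completely covers the query interval.
        return d_lo <= q_lo and q_hi <= d_hi
    # overlaps: the two intervals share at least one position.
    return d_lo <= q_hi and q_lo <= d_hi


def matches_all(kinds, data_box, query_box):
    """Apply one intersection kind per dimension; every dimension must match."""
    return all(matches(k, d, q) for k, d, q in zip(kinds, data_box, query_box))


data_box = [(2, 9), (5, 5)]    # extends in dimension 1, a single position in dimension 2
query_box = [(4, 6), (0, 10)]
print(matches_all([Intersection.overlaps, Intersection.enclosed], data_box, query_box))  # True
print(matches_all([Intersection.enclosed, Intersection.enclosed], data_box, query_box))  # False
```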
- count_size_limited(self: sps.RamPrefixSum_3D_1O, dataset_id: int, from_pos: List[int[4]], to_pos: List[int[4]], intersection_type: List[sps.IntersectionType[1]] = <IntersectionType.enclosed: 0>, verbosity: int = 0) int
Count the number of points between from_pos and to_pos in the given dataset.
As opposed to count, this function allows specifying the start and end positions for all dimensions in the data structure. This is only relevant for indices with orthotope dimensions.
- Parameters:
dataset_id (int) – The id of the dataset to query
from_pos (list[int[4]]) – The bottom left position of the query region.
to_pos (list[int[4]]) – The top right position of the query region.
verbosity (int) – Degree of verbosity while counting, defaults to 0.
- Returns:
The number of points in dataset_id between from_pos and to_pos.
- Return type:
int
to_pos must be greater than or equal to from_pos in each dimension.
- count_multiple(self: sps.RamPrefixSum_3D_1O, regions: List[Tuple[int, List[int[3]], List[int[3]]]], intersection_type: sps.IntersectionType = <IntersectionType.enclosed: 0>, verbosity: int = 0) List[int]
Count the number of points between the bottom left and top right position of each given region.
Counts for multiple regions.
- Parameters:
dataset_id (int) – The id of the dataset to query
regions (list[tuple[list[int[3]], list[int[3]]]]) – The bottom left and top right positions of the queried regions. Given as a list (individual regions) of tuples (bottom-left, top-right) of lists (individual coordinates). The top right of each region must be greater than or equal to the bottom left.
verbosity (int) – Degree of verbosity while counting, defaults to 0.
- Returns:
The number of points in each of the queried regions.
- Return type:
list[int]
The top right position of each region must be greater than or equal to its bottom left position in each dimension.
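The result holds one count per region. A minimal sketch of that contract follows. It assumes that the leading int of each tuple in the signature is the dataset id, which the text above does not state explicitly. Both count_regions and brute_force_count are hypothetical stand-ins written for this illustration, and the half-open region is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

# Tuple shape taken from the signature: (int, position, position).
Region = Tuple[int, Sequence[int], Sequence[int]]


def count_regions(count_one: Callable[[int, Sequence[int], Sequence[int]], int],
                  regions: List[Region]) -> List[int]:
    """Return one count per region, in the order the regions are given."""
    return [count_one(dataset_id, from_pos, to_pos)
            for dataset_id, from_pos, to_pos in regions]


# Brute-force stand-in for a single-region count over in-memory points.
datasets = {0: [(1, 1, 1), (2, 3, 4)], 1: [(5, 5, 5)]}


def brute_force_count(dataset_id, from_pos, to_pos):
    return sum(
        all(f <= c < t for c, f, t in zip(point, from_pos, to_pos))
        for point in datasets[dataset_id]
    )


counts = count_regions(brute_force_count, [
    (0, [0, 0, 0], [3, 4, 5]),
    (0, [2, 0, 0], [3, 4, 5]),
    (1, [0, 0, 0], [5, 5, 5]),
])
print(counts)  # [2, 1, 0]
```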
- __str__(self: sps.RamPrefixSum_3D_1O) str
Return a string describing the index. Very slow for large datasets.
- clear(self: sps.RamPrefixSum_3D_1O) None
Clear the complete index.
- clear_keep_points(self: sps.RamPrefixSum_3D_1O) None
Clear the index datasets, but keep the points.
- get_overlay_grid(self: sps.RamPrefixSum_3D_1O, dataset_id: int) List[List[List[int[4]][3]]]
Returns the bottom-left-front-… and top-right-back-… position of all overlays.
- estimate_num_elements(self: sps.RamPrefixSum_3D_1O, f: List[float]) List[Tuple[int, int, int, int, int, int]]
Predict the number of data structure elements stored in a dataset generated from the currently added points.
Here f is proportional to the number of boxes used in the data structure. For any dataset, there is an optimal value for f that leads to the minimal data structure size. Since the data structure size is proportional to the time required to build the data structure, this also minimizes construction time.
The purpose of this function is to find the optimal value for f.
Uses a statistical approach to predict the number of elements. For details see the corresponding GitHub page or our manuscript.
There are five values predicted:
- The number of internal prefix sums
- The number of overlay prefix sums
- The number of internal sparse coordinates
- The number of overlay sparse coordinates
- Total size of the data structure
- Parameters:
f (list[float]) – list of factors that are proportional to the number of boxes within the data structure
- Returns:
The predicted number of dataset structure elements for each factor
- Return type:
list[tuple[int]]
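Given predictions for several candidate factors, the factor with the smallest predicted total size can be selected as follows. This is a sketch only: best_factor is a hypothetical helper, the numbers are made up for illustration, and the position of the total size within each tuple (assumed here to be the last entry) should be checked against the actual return value, since the signature shows six integers per tuple.

```python
from typing import Sequence, Tuple


def best_factor(factors: Sequence[float],
                predictions: Sequence[Tuple[int, ...]],
                size_index: int = -1) -> float:
    """Return the factor whose predicted total size is smallest.

    size_index selects the tuple entry holding the total size; using the
    last entry is an assumption, not something confirmed by the library.
    """
    if len(factors) != len(predictions):
        raise ValueError("need exactly one prediction per factor")
    if not factors:
        raise ValueError("need at least one factor")
    best, _ = min(zip(factors, predictions), key=lambda fp: fp[1][size_index])
    return best


# Made-up predictions in the order listed above, total size last.
factors = [0.1, 1.0, 10.0]
predictions = [(50, 5, 40, 4, 99), (30, 10, 20, 8, 68), (25, 40, 15, 30, 110)]
print(best_factor(factors, predictions))  # 1.0
```

With a real index, predictions would come from estimate_num_elements(factors).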
- pick_num_overlays(self: sps.RamPrefixSum_3D_1O, verbosity: int = 0) int
Predict the best factor f for the currently added points.
Here f is a factor proportional to the number of boxes used in the data structure. See estimate_num_elements for a detailed description.
- Returns:
The predicted best value for f.
- Return type:
float
- to_factor(self: sps.RamPrefixSum_3D_1O, num_overlays: int) float
Convert a given number of overlay blocks to the factor f for the currently added points.
Here f is a factor proportional to the number of boxes used in the data structure. See estimate_num_elements for a detailed description.
- Parameters:
num_overlays (int) – The number of overlay blocks.
- Returns:
The factor f corresponding to num_overlays.
- Return type:
float
- get_num_internal_prefix_sums(self: sps.RamPrefixSum_3D_1O, dataset_id: int) int
Count the number of internal prefix sums stored in the dataset with id dataset_id.
- Parameters:
dataset_id (int) – The id of the dataset to query
- Returns:
number of internal prefix sums.
- Return type:
int
- get_num_overlay_prefix_sums(self: sps.RamPrefixSum_3D_1O, dataset_id: int) int
Count the number of overlay prefix sums stored in the dataset with id dataset_id.
- Parameters:
dataset_id (int) – The id of the dataset to query
- Returns:
number of overlay prefix sums.
- Return type:
int
- get_num_prefix_sums(self: sps.RamPrefixSum_3D_1O, dataset_id: int) int
Count the number of prefix sums stored in the dataset with id dataset_id.
- Parameters:
dataset_id (int) – The id of the dataset to query
- Returns:
number of prefix sums.
- Return type:
int
- get_num_changing_prefix_sums(self: sps.RamPrefixSum_3D_1O) int
Count the number of prefix sums that are different than their immediate predecessor entries in all dimensions.
- Returns:
number of changing prefix sums.
- Return type:
int
- get_num_internal_sparse_coords(self: sps.RamPrefixSum_3D_1O, dataset_id: int) int
Count the number of internal sparse coordinates stored in the dataset with id dataset_id.
- Parameters:
dataset_id (int) – The id of the dataset to query
- Returns:
number of internal sparse coordinates.
- Return type:
int
- get_num_overlay_sparse_coords(self: sps.RamPrefixSum_3D_1O, dataset_id: int) int
Count the number of overlay sparse coordinates stored in the dataset with id dataset_id.
- Parameters:
dataset_id (int) – The id of the dataset to query
- Returns:
number of overlay sparse coordinates.
- Return type:
int
- __init__(self: sps.RamPrefixSum_3D_1O, path: str = '', write_mode: bool = False) None
Create a new Index.
- Parameters:
path (str) – Prefix path of the index on the filesystem (multiple files with different endings will be created), defaults to “”.
write_mode (bool) – Open the index in write mode (if this is set to False no changes can be made to the index), defaults to False.
Methods
__init__(self[, path, write_mode]) – Create a new Index.
add_point(self, start, end[, val]) – Append an interval to the data structure.
clear(self) – Clear the complete index.
clear_keep_points(self) – Clear the index datasets, but keep the points.
count(*args, **kwargs) – Overloaded function.
count_multiple(self, regions[, ...]) – Count the number of points between from and to in the given dataset.
count_size_limited(self, dataset_id, ...) – Count the number of points between from and to in the given dataset.
estimate_num_elements(self, f) – Predict the number of data structure elements stored in a dataset generated from the currently added points.
generate(self[, factor, verbosity]) – Generate a new dataset from the previously added points.
get_max_prefix_sum(self) – Get the largest stored prefix sum.
get_num_changing_prefix_sums(self) – Count the number of prefix sums that are different than their immediate predecessor entries in all dimensions.
get_num_internal_prefix_sums(self, dataset_id) – Count the number of internal prefix sums stored in the dataset with id dataset_id.
get_num_internal_sparse_coords(self, dataset_id) – Count the number of internal sparse coordinates stored in the dataset with id dataset_id.
get_num_overlay_prefix_sums(self, dataset_id) – Count the number of overlay prefix sums stored in the dataset with id dataset_id.
get_num_overlay_sparse_coords(self, dataset_id) – Count the number of overlay sparse coordinates stored in the dataset with id dataset_id.
get_num_prefix_sums(self, dataset_id) – Count the number of prefix sums stored in the dataset with id dataset_id.
get_overlay_grid(self, dataset_id) – Returns the bottom-left-front-… and top-right-back-… position of all overlays.
get_size(self, dataset_id) – Returns the size of the dataset in bytes.
grid_count(*args, **kwargs) – Overloaded function.
grid_size(self) – Get the axis sizes of the area spanned by the currently added points.
max_prefix_value(self) – Return the maximal stored prefix sum.
pick_num_overlays(self[, verbosity]) – Predict the best factor f for the currently added points.
pop_dataset(self) – Remove the last dataset.
reserve(self, arg0, arg1, arg2, arg3, arg4) – Reserve memory for the index corners, sparse coords and prefix sums.
to_factor(self, num_overlays) – Convert a given number of overlay blocks to the factor f for the currently added points.
total_num_prefix_sums(self) – Count the number of prefix sums stored in the entire index.