public final class ReservoirLongsSketch extends Object
This sketch provides a reservoir sample over an input stream of longs. The sketch
contains a uniform random sample of items from the stream.

Modifier and Type | Method and Description |
---|---|
SampleSubsetSummary |
estimateSubsetSum(Predicate<Long> predicate)
Computes an estimated subset sum from the entire stream for objects matching a given
predicate.
|
int |
getK()
Returns the sketch's value of k, the maximum number of samples stored in the reservoir.
|
long |
getN()
Returns the number of items processed from the input stream.
|
int |
getNumSamples()
Returns the current number of items in the reservoir, which may be smaller than the reservoir
capacity.
|
long[] |
getSamples()
Returns a copy of the items in the reservoir.
|
static ReservoirLongsSketch |
heapify(org.apache.datasketches.memory.Memory srcMem)
Returns a sketch instance of this class from the given srcMem, which must be a Memory
representation of this sketch class.
|
static ReservoirLongsSketch |
newInstance(int k)
Construct a mergeable reservoir sampling sketch with up to k samples using the default resize
factor (8).
|
static ReservoirLongsSketch |
newInstance(int k,
ResizeFactor rf)
Construct a mergeable reservoir sampling sketch with up to k samples using the specified
resize factor.
|
void |
reset()
Resets this sketch to the empty state, but retains the original value of k.
|
byte[] |
toByteArray()
Returns a byte array representation of this sketch.
|
String |
toString()
Returns a human-readable summary of the sketch, without items.
|
static String |
toString(byte[] byteArr)
Returns a human-readable string of the preamble of a byte array image of a ReservoirLongsSketch.
|
static String |
toString(org.apache.datasketches.memory.Memory mem)
Returns a human-readable string of the preamble of a Memory image of a ReservoirLongsSketch.
|
void |
update(long item)
Randomly decide whether or not to include an item in the sample set.
|
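
The behavior summarized in the table above can be illustrated with a small, self-contained version of classic reservoir sampling (Algorithm R). The `ReservoirDemo` class below is a hypothetical stand-in written for this example and is not the library's implementation: it omits serialization, merging, and resize factors, and its `estimateSubsetSum` is assumed to return only a simple proportional point estimate rather than a `SampleSubsetSummary`.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.LongPredicate;

// Hypothetical stand-in for illustration; NOT the DataSketches implementation.
final class ReservoirDemo {
    private final long[] reservoir; // holds at most k samples
    private final Random rng;
    private long n;                 // number of items offered so far

    ReservoirDemo(int k, long seed) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        this.reservoir = new long[k];
        this.rng = new Random(seed);
    }

    int getK() { return reservoir.length; }

    long getN() { return n; }

    // Smaller than k until the reservoir has filled.
    int getNumSamples() { return (int) Math.min(n, reservoir.length); }

    // Returns a copy so callers cannot modify the reservoir.
    long[] getSamples() { return Arrays.copyOf(reservoir, getNumSamples()); }

    // Offers one unit-weight item to the reservoir.
    void update(long item) {
        if (n < reservoir.length) {
            reservoir[(int) n] = item; // still filling: always keep the item
        } else {
            // Draw a slot in [0, n]; the item is kept only if the slot falls
            // inside the reservoir, which happens with probability k / (n + 1).
            long slot = (long) (rng.nextDouble() * (n + 1));
            if (slot < reservoir.length) {
                reservoir[(int) slot] = item;
            }
        }
        n++;
    }

    // Simple proportional estimate: n * (matching samples / samples held).
    double estimateSubsetSum(LongPredicate predicate) {
        int numSamples = getNumSamples();
        if (numSamples == 0) {
            return 0.0;
        }
        long matches = Arrays.stream(reservoir, 0, numSamples).filter(predicate).count();
        return (double) n * matches / numSamples;
    }

    public static void main(String[] args) {
        ReservoirDemo demo = new ReservoirDemo(10, 42L);
        for (long i = 0; i < 1000; i++) {
            demo.update(i);
        }
        System.out.println("n = " + demo.getN() + ", samples held = " + demo.getNumSamples());
        System.out.println("samples = " + Arrays.toString(demo.getSamples()));
        System.out.println("estimated count of even items = "
            + demo.estimateSubsetSum(x -> x % 2 == 0));
    }
}
```

While fewer than k items have been offered, every item is retained, so the estimate is exact; once the reservoir is full, the estimate is only as precise as a sample of size k allows.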
public static ReservoirLongsSketch newInstance(int k)
k - Maximum size of sampling. Allocated size may be smaller until sampling fills. Unlike
many sketches in this package, this value does not need to be a power of 2.

public static ReservoirLongsSketch newInstance(int k, ResizeFactor rf)
k - Maximum size of sampling. Allocated size may be smaller until sampling fills. Unlike
many sketches in this package, this value does not need to be a power of 2.
rf - See Resize Factor

public static ReservoirLongsSketch heapify(org.apache.datasketches.memory.Memory srcMem)
srcMem - a Memory representation of a sketch of this class. See Memory

public int getK()

public long getN()

public int getNumSamples()

public long[] getSamples()

public void update(long item)
item - a unit-weight (equivalently, unweighted) item of the set being sampled from

public void reset()

public String toString()

public static String toString(byte[] byteArr)
byteArr - the given byte array

public static String toString(org.apache.datasketches.memory.Memory mem)
mem - the given Memory

public byte[] toByteArray()

public SampleSubsetSummary estimateSubsetSum(Predicate<Long> predicate)
This is technically a heuristic method, and tries to err on the conservative side.
predicate - A predicate to use when identifying items.

Copyright © 2015–2024 The Apache Software Foundation. All rights reserved.