public abstract class KllSketch extends Object implements QuantilesAPI
KLL is an implementation of a very compact quantiles sketch with a lazy compaction scheme and nearly optimal accuracy per retained quantile.
Reference: Optimal Quantile Approximation in Streams.
The default k of 200 yields a "single-sided" epsilon of about 1.33% and a "double-sided" (PMF) epsilon of about 1.65%, with a confidence of 99%.
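To make the two epsilon figures above concrete, the following is a minimal sketch of how a normalized rank error could be computed from k and turned into an absolute rank error for a stream of n items. It is not the library's implementation: the class name, method names, and the power-law coefficients are assumptions introduced for illustration (this page does not state them), chosen because they reproduce the 1.33% and 1.65% values quoted for k = 200.

```java
// Illustrative sketch only. The coefficients are ASSUMED values of an empirical
// power-law fit (epsilon ~ COEF / k^EXP); they are not taken from this page.
final class KllRankErrorSketch {
    static final double CDF_COEF = 2.296, CDF_EXP = 0.9723; // "single-sided" queries
    static final double PMF_COEF = 2.446, PMF_EXP = 0.9433; // "double-sided" (PMF) queries

    /** Normalized rank error (a fraction between zero and one) for a given k. */
    static double normalizedRankError(int k, boolean pmf) {
        return pmf
            ? PMF_COEF / Math.pow(k, PMF_EXP)
            : CDF_COEF / Math.pow(k, CDF_EXP);
    }

    /** Converts the normalized error into an absolute rank error for n items. */
    static double absoluteRankError(int k, boolean pmf, long n) {
        return normalizedRankError(k, pmf) * n;
    }

    public static void main(String[] args) {
        // With k = 200 these print roughly 0.0133 and 0.0165.
        System.out.printf("single-sided: %.4f%n", normalizedRankError(200, false));
        System.out.printf("double-sided: %.4f%n", normalizedRankError(200, true));
        // For one million items, the single-sided error is on the order of 13,000 ranks.
        System.out.printf("ranks: %.0f%n", absoluteRankError(200, false, 1_000_000L));
    }
}
```

Under these assumptions, a returned rank for a stream of one million items is expected to lie within roughly 13,000 positions of the true rank at the stated 99% confidence.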
See Also: QuantilesAPI
Modifier and Type | Class and Description |
---|---|
static class | KllSketch.SketchStructure: Used primarily to define the structure of the serialized sketch. |
static class | KllSketch.SketchType: Used to define the variable type of the current instance of this class. |
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_K: The default K |
static int | MAX_K: The maximum K |
Fields inherited from interface QuantilesAPI: EMPTY_MSG, MEM_REQ_SVR_NULL_MSG, NOT_SINGLE_ITEM_MSG, SELF_MERGE_MSG, TGT_IS_READ_ONLY_MSG, UNSUPPORTED_MSG
Modifier and Type | Method and Description |
---|---|
static int | getKFromEpsilon(double epsilon, boolean pmf): Gets the approximate k to use given epsilon, the normalized rank error. |
static int | getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat): Returns upper bound on the serialized size of a KllSketch given the following parameters. |
double | getNormalizedRankError(boolean pmf): Gets the approximate rank error of this sketch normalized as a fraction between zero and one. |
static double | getNormalizedRankError(int k, boolean pmf): Gets the normalized rank error given k and pmf. |
int | getNumRetained(): Gets the number of quantiles retained by the sketch. |
int | getSerializedSizeBytes(): Returns the current number of bytes this Sketch would require if serialized in compact form. |
boolean | hasMemory(): Returns true if this sketch's data structure is backed by Memory or WritableMemory. |
boolean | isCompactMemoryFormat(): Returns true if this sketch is in a Compact Memory Format. |
boolean | isDirect(): Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory). |
boolean | isEmpty(): Returns true if this sketch is empty. |
boolean | isEstimationMode(): Returns true if this sketch is in estimation mode. |
boolean | isMemoryUpdatableFormat(): Returns true if the backing WritableMemory is in updatable format. |
boolean | isReadOnly(): Returns true if this sketch is read only. |
boolean | isSameResource(org.apache.datasketches.memory.Memory that): Returns true if the backing resource of this is identical with the backing resource of that. |
abstract void | merge(KllSketch other): Merges another sketch into this one. |
String | toString(): Returns a summary of the key parameters of the sketch. |
abstract String | toString(boolean withLevels, boolean withLevelsAndItems): Returns human readable summary information about this sketch. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface QuantilesAPI: getK, getN, getRankLowerBound, getRankUpperBound, reset
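The static helpers getKFromEpsilon and getNormalizedRankError(int k, boolean pmf) go in opposite directions: one maps a target error to k, the other maps k to an error. The following is a minimal sketch of that relationship, obtained by algebraically inverting an assumed power-law fit. The class name, method names, and coefficients are hypothetical and not stated on this page, and the sketch does not reproduce any rounding or clamping (for example to MAX_K) that the library may apply.

```java
// Illustrative sketch only: inverts an ASSUMED fit epsilon ~ COEF / k^EXP.
// This is not the library's getKFromEpsilon implementation.
final class KllKFromEpsilonSketch {
    static final double CDF_COEF = 2.296, CDF_EXP = 0.9723; // "single-sided" queries
    static final double PMF_COEF = 2.446, PMF_EXP = 0.9433; // "double-sided" (PMF) queries

    /** Forward direction: normalized rank error for a given k. */
    static double rankError(int k, boolean pmf) {
        return pmf
            ? PMF_COEF / Math.pow(k, PMF_EXP)
            : CDF_COEF / Math.pow(k, CDF_EXP);
    }

    /** Inverse direction: smallest k whose assumed error does not exceed epsilon. */
    static int kFromEpsilon(double epsilon, boolean pmf) {
        if (!(epsilon > 0.0 && epsilon < 1.0)) {
            throw new IllegalArgumentException("epsilon must be between zero and one");
        }
        double coef = pmf ? PMF_COEF : CDF_COEF;
        double exp = pmf ? PMF_EXP : CDF_EXP;
        // Solve epsilon = coef / k^exp for k, then round up so the target is met.
        double k = Math.pow(coef / epsilon, 1.0 / exp);
        return (int) Math.max(1.0, Math.ceil(k));
    }

    public static void main(String[] args) {
        int kSingle = kFromEpsilon(0.01, false);
        int kDouble = kFromEpsilon(0.01, true);
        System.out.println("k for 1% single-sided error: " + kSingle);
        System.out.println("k for 1% double-sided error: " + kDouble);
    }
}
```

Under this fit, the same epsilon requires a larger k when pmf is true, which is consistent with the double-sided error being the larger of the two figures quoted for the default k.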
public static final int DEFAULT_K
public static final int MAX_K
public static int getKFromEpsilon(double epsilon, boolean pmf)
epsilon - the normalized rank error between zero and one.
pmf - if true, this function returns the k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.
public static int getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat)
k - parameter that controls size of the sketch and accuracy of estimates
n - stream length
sketchType - Only DOUBLES_SKETCH and FLOATS_SKETCH are supported for this operation.
updatableMemFormat - true if updatable Memory format, otherwise the standard compact format.
public static double getNormalizedRankError(int k, boolean pmf)
k - the configuration parameter
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
public final double getNormalizedRankError(boolean pmf)
Specified by: getNormalizedRankError in interface QuantilesAPI
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
public final int getNumRetained()
Specified by: getNumRetained in interface QuantilesAPI
public int getSerializedSizeBytes()
public boolean hasMemory()
Specified by: hasMemory in interface QuantilesAPI
public boolean isCompactMemoryFormat()
public boolean isDirect()
Specified by: isDirect in interface QuantilesAPI
public final boolean isEmpty()
Specified by: isEmpty in interface QuantilesAPI
public final boolean isEstimationMode()
Specified by: isEstimationMode in interface QuantilesAPI
public final boolean isMemoryUpdatableFormat()
public final boolean isReadOnly()
Specified by: isReadOnly in interface QuantilesAPI
public final boolean isSameResource(org.apache.datasketches.memory.Memory that)
that - A different non-null object
public abstract void merge(KllSketch other)
other - sketch to merge into this one
public final String toString()
Specified by: toString in interface QuantilesAPI
Overrides: toString in class Object
public abstract String toString(boolean withLevels, boolean withLevelsAndItems)
withLevels - if true, includes summary information about the sketch levels array
withLevelsAndItems - if true, includes details of the levels array and items array together
Copyright © 2015–2024 The Apache Software Foundation. All rights reserved.