public abstract class CompactSketch extends Sketch
A CompactSketch is the simplest form of a Theta Sketch. It consists of a compact list (i.e., no intervening spaces) of hash values, which may be ordered or not, a value for theta and a seed hash. A CompactSketch is immutable (read-only), and the space required when stored is only the space required for the hash values and 8 to 24 bytes of preamble. An empty CompactSketch consumes only 8 bytes.
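The storage claim above can be turned into a rough size estimate. The following is a minimal sketch, not library code: the class and method names are hypothetical, and it assumes each retained hash value is stored as one 64-bit long (8 bytes), with the preamble size passed in by the caller.

```java
public class CompactSketchSize {
    // Assumption: each retained hash value occupies one 64-bit long (8 bytes).
    static final int BYTES_PER_HASH = Long.BYTES;

    /**
     * Estimates the stored size of a compact sketch image.
     * preambleBytes must lie in the 8..24 range stated in the class description.
     */
    static long storedBytes(int preambleBytes, int retainedEntries) {
        if (preambleBytes < 8 || preambleBytes > 24) {
            throw new IllegalArgumentException("preamble must be 8 to 24 bytes");
        }
        if (retainedEntries < 0) {
            throw new IllegalArgumentException("retainedEntries must be non-negative");
        }
        return preambleBytes + (long) BYTES_PER_HASH * retainedEntries;
    }

    public static void main(String[] args) {
        System.out.println(storedBytes(8, 0));     // empty sketch: preamble only, prints 8
        System.out.println(storedBytes(24, 4096)); // 24 + 8 * 4096, prints 32792
    }
}
```

For an exact figure on a real sketch, getCompactBytes() is the authoritative source; this helper only illustrates the arithmetic.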
Constructor and Description |
---|
CompactSketch() |

Modifier and Type | Method and Description |
---|---|
abstract CompactSketch | compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem): Convert this sketch to a CompactSketch. |
int | getCompactBytes(): Returns the number of storage bytes required for this Sketch if its current state were compacted. |
Family | getFamily(): Returns the Family that this sketch belongs to. |
static CompactSketch | heapify(org.apache.datasketches.memory.Memory srcMem): Heapify takes a CompactSketch image in Memory and instantiates an on-heap CompactSketch. |
static CompactSketch | heapify(org.apache.datasketches.memory.Memory srcMem, long expectedSeed): Heapify takes a CompactSketch image in Memory and instantiates an on-heap CompactSketch. |
boolean | isCompact(): Returns true if this sketch is in compact form. |
byte[] | toByteArrayCompressed(): Gets the sketch as a compressed byte array. |
static CompactSketch | wrap(org.apache.datasketches.memory.Memory srcMem): Wrap takes the CompactSketch image in the given Memory and refers to it directly. |
static CompactSketch | wrap(org.apache.datasketches.memory.Memory srcMem, long expectedSeed): Wrap takes the sketch image in the given Memory and refers to it directly. |

Methods inherited from class Sketch: compact, getCompactSketchMaxBytes, getCountLessThanThetaLong, getCurrentBytes, getEstimate, getLowerBound, getMaxCompactSketchBytes, getMaxUpdateSketchBytes, getRetainedEntries, getRetainedEntries, getSerializationVersion, getTheta, getThetaLong, getUpperBound, isEmpty, isEstimationMode, isOrdered, iterator, toByteArray, toString, toString, toString, toString
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Other inherited methods: hasMemory, isDirect, isSameResource
public static CompactSketch heapify(org.apache.datasketches.memory.Memory srcMem)
Heapify takes a CompactSketch image in Memory and instantiates an on-heap CompactSketch.
The resulting sketch retains no link to the source Memory; all of its data is copied to the heap CompactSketch.
This method assumes that the sketch image was created with the correct hash seed, so it is not checked. The resulting on-heap CompactSketch will be given the seedHash derived from the given sketch image. However, Serial Version 1 sketch images do not have a seedHash field, so the resulting heapified CompactSketch will be given the hash of the DEFAULT_UPDATE_SEED.
srcMem - an image of a CompactSketch. See Memory.
public static CompactSketch heapify(org.apache.datasketches.memory.Memory srcMem, long expectedSeed)
Heapify takes a CompactSketch image in Memory and instantiates an on-heap CompactSketch.
The resulting sketch retains no link to the source Memory; all of its data is copied to the heap CompactSketch.
This method checks whether the given expectedSeed was used to create the source Memory image. However, Serial Version 1 sketch images cannot be checked because they have no seedHash field, so the resulting heapified CompactSketch is given the hash of the expectedSeed.
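The seed check described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the library's implementation: `toySeedHash` is a hypothetical stand-in for the real seed hash function, the 16-bit width of the hash is assumed, and `resolveSeedHash` is an invented helper name.

```java
public class SeedCheckSketch {
    // Hypothetical stand-in: NOT the hash function the library uses for seeds.
    static short toySeedHash(long seed) {
        long mixed = seed * 0x9E3779B97F4A7C15L; // simple multiplicative mix
        return (short) (mixed >>> 48);           // keep the top 16 bits
    }

    /**
     * Returns the seedHash the resulting sketch would carry.
     * Serial Version 1 images have no seedHash field, so nothing can be checked
     * and the hash of expectedSeed is used. Later versions are validated.
     */
    static short resolveSeedHash(int serialVersion, short storedSeedHash, long expectedSeed) {
        short expectedHash = toySeedHash(expectedSeed);
        if (serialVersion == 1) {
            return expectedHash; // storedSeedHash is ignored: the field does not exist
        }
        if (storedSeedHash != expectedHash) {
            throw new IllegalArgumentException("Image was not created with the expected seed");
        }
        return storedSeedHash;
    }

    public static void main(String[] args) {
        short stored = toySeedHash(9001L);
        System.out.println(resolveSeedHash(3, stored, 9001L) == stored);   // true
        System.out.println(resolveSeedHash(1, (short) 0, 9001L) == stored); // true
    }
}
```

The point of the sketch is the control flow: a version 1 image passes through unchecked, while a later image with a mismatched seedHash is rejected.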
srcMem - an image of a CompactSketch that was created using the given expectedSeed. See Memory.
expectedSeed - the seed used to validate the given Memory image. See Update Hash Seed.
public static CompactSketch wrap(org.apache.datasketches.memory.Memory srcMem)
Wrap takes the CompactSketch image in the given Memory and refers to it directly.
Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as direct sketches can be wrapped. Wrapping a sketch from an earlier serial version results in a heapify operation, because those early versions were never designed to "wrap".
Wrapping any subclass of this class that is empty or contains only a single item returns the heapified form of the empty or single-item sketch, respectively. This is faster than wrapping and consumes less overall memory.
This method assumes that the sketch image was created with the correct hash seed, so it is not checked. However, Serial Version 1 sketch images do not have a seedHash field, so the resulting on-heap CompactSketch will be given the hash of the DEFAULT_UPDATE_SEED.
srcMem - an image of a Sketch. See Memory.
public static CompactSketch wrap(org.apache.datasketches.memory.Memory srcMem, long expectedSeed)
Wrap takes the sketch image in the given Memory and refers to it directly.
Only "Direct" Serialization Version 3 (i.e., OpenSource) sketches that have been explicitly stored as direct sketches can be wrapped. Wrapping a sketch from an earlier serial version results in a heapify operation, because those early versions were never designed to "wrap".
Wrapping any subclass of this class that is empty or contains only a single item returns the heapified form of the empty or single-item sketch, respectively. This is faster than wrapping and consumes less overall memory.
This method checks whether the given expectedSeed was used to create the source Memory image. However, Serial Version 1 sketches cannot be checked because they have no seedHash field, so the resulting heapified CompactSketch is given the hash of the expectedSeed.
srcMem - an image of a Sketch that was created using the given expectedSeed. See Memory.
expectedSeed - the seed used to validate the given Memory image. See Update Hash Seed.
public abstract CompactSketch compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
Description copied from class: Sketch
Convert this sketch to a CompactSketch.
If this sketch is a type of UpdateSketch, the compacting process converts the hash table of the UpdateSketch to a simple list of the valid hash values. Any hash value that is zero, or that is greater than or equal to theta, is discarded. The number of valid values remaining in the CompactSketch depends on several factors and may be larger or smaller than Nominal Entries (or k), but it never exceeds 2k. If it is critical to limit the size to no more than k, call rebuild() on the UpdateSketch before calling this method.
A CompactSketch is always immutable.
A new CompactSketch object is created:
Otherwise, this operation returns this.
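The filtering rule described above (discard zeros and anything at or above theta) can be sketched in a few lines. This is a toy illustration, not the library's code: the class and method names are hypothetical, hashes are assumed to be non-negative longs, and zero is assumed to mark an empty hash table slot.

```java
import java.util.Arrays;

public class CompactionSketch {
    /**
     * Toy compaction: keeps only hash values that are non-zero and strictly
     * below thetaLong, then optionally sorts the surviving values.
     */
    static long[] compact(long[] hashTable, long thetaLong, boolean dstOrdered) {
        long[] valid = Arrays.stream(hashTable)
                .filter(h -> h != 0L && h < thetaLong) // drop empty slots and values >= theta
                .toArray();
        if (dstOrdered) {
            Arrays.sort(valid); // ordered form: ascending hash values
        }
        return valid;
    }

    public static void main(String[] args) {
        long[] table = {0L, 42L, 7L, 0L, 100L, 19L};
        // With theta = 50, the zeros and 100 are discarded.
        System.out.println(Arrays.toString(compact(table, 50L, true)));  // [7, 19, 42]
        System.out.println(Arrays.toString(compact(table, 50L, false))); // [42, 7, 19]
    }
}
```

The result is a dense array with no gaps, which is what makes the compact form cheap to store and to scan.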
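compact in class Sketch
dstOrdered - assumed true if this sketch is empty or has only one value. See Destination Ordered.
dstMem - See Destination Memory.
public int getCompactBytes()
Description copied from class: Sketch
Returns the number of storage bytes required for this Sketch if its current state were compacted. See also Sketch.getCurrentBytes().
getCompactBytes in class Sketch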
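public Family getFamily()
Description copied from class: Sketch
Returns the Family that this sketch belongs to.
public boolean isCompact()
Description copied from class: Sketch
Returns true if this sketch is in compact form.
public byte[] toByteArrayCompressed()
Gets the sketch as a compressed byte array.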
Copyright © 2015–2024 The Apache Software Foundation. All rights reserved.