Package org.apache.spark.sql
Class Observation
Object
org.apache.spark.sql.ObservationBase
org.apache.spark.sql.Observation
Helper class to simplify usage of
Dataset.observe(String, Column, Column*)
:
// Observe row count (rows) and highest id (maxid) in the Dataset while writing it
val observation = Observation("my metrics")
val observed_ds = ds.observe(observation, count(lit(1)).as("rows"), max($"id").as("maxid"))
observed_ds.write.parquet("ds.parquet")
val metrics = observation.get
This collects the metrics while the first action is executed on the observed dataset. Subsequent
actions do not modify the metrics returned by ObservationBase.get()
. Retrieval of the metric via ObservationBase.get()
blocks until the first action has finished and metrics become available.
This class does not support streaming datasets.
param: name name of the metric
- Since:
- 3.3.0
-
Constructor Summary
ConstructorDescriptionCreate an Observation instance without providing a name.Observation
(String name) -
Method Summary
Modifier and TypeMethodDescriptionstatic Observation
apply()
Observation constructor for creating an anonymous observation.static Observation
Observation constructor for creating a named observation.Methods inherited from class org.apache.spark.sql.ObservationBase
get, getAsJava, name
-
Constructor Details
-
Observation
-
Observation
public Observation()Create an Observation instance without providing a name. This generates a random name.
-
-
Method Details
-
apply
Observation constructor for creating an anonymous observation.- Returns:
- (undocumented)
-
apply
Observation constructor for creating a named observation.- Parameters:
name
- (undocumented)- Returns:
- (undocumented)
-