Dataset

Selects a column by name. Alias for col.

Returns a new Dataset where each record is mapped to type U via its Encoder. This is a purely client-side reinterpretation (no server-side closure), so it works over Spark Connect.
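
A minimal sketch of the typed view, assuming this Dataset mirrors the Apache Spark Dataset API (`as[U]` taking an `Encoder[U]`). The `Person` record and its column names are hypothetical, and `Encoders.product` is the Apache Spark way of obtaining the encoder; this client may derive encoders differently.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Encoders}

object AsExample {
  // Hypothetical record type; field names must match the column names.
  final case class Person(name: String, age: Long)

  // Reinterprets untyped rows as Person values through the encoder.
  // No user closure is shipped to the server.
  def typed(df: DataFrame): Dataset[Person] =
    df.as[Person](Encoders.product[Person])
}
```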

Persists this Dataset with the default storage level (MEMORY_AND_DISK).

Eagerly checkpoints this Dataset to reliable storage and returns the checkpointed copy.

Checkpoints this Dataset to reliable storage.

Selects a column by name, qualified by this Dataset's plan id so that it resolves unambiguously even in self-joins.

Selects columns based on a column name regular expression.

Computes basic statistics (count, mean, stddev, min, max) for numeric and string columns.

Drops duplicates within the event-time watermark, keeping state bounded for streaming.
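
As a hedged illustration, the sketch below pairs a watermark with watermark-bounded deduplication. It assumes the Apache Spark signatures `withWatermark(eventTime, delayThreshold)` and `dropDuplicatesWithinWatermark(col1, cols*)`; the column names `eventTime` and `eventId` and the 10-minute delay are hypothetical.

```scala
import org.apache.spark.sql.DataFrame

object DedupExample {
  // Assumes `events` is a streaming DataFrame with a timestamp column
  // `eventTime` and a deduplication key `eventId` (both hypothetical).
  def dedup(events: DataFrame): DataFrame =
    events
      .withWatermark("eventTime", "10 minutes")  // bound how late data may arrive
      .dropDuplicatesWithinWatermark("eventId")  // state older than the watermark is dropped
}
```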

Performs a lateral join with a right relation that is correlated with (may reference columns of) this Dataset.

Eagerly locally checkpoints this Dataset.

Locally checkpoints this Dataset.

Alias for unpivot.

Alias for unpivot.

Merges this Dataset (the source) into the table (the target) using condition to match rows. Returns a MergeIntoWriter to configure the WHEN clauses; call merge() to run it.
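
A possible upsert is sketched here, assuming the builder follows Apache Spark's `MergeIntoWriter` (`whenMatched()`, `whenNotMatched()`, `updateAll()`, `insertAll()`). The table name `target`, the alias `source`, and the `id` key are hypothetical, and the target catalog is assumed to support MERGE.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object MergeExample {
  // Updates rows that match on `id` and inserts the rest.
  // Nothing runs until merge() is called.
  def upsert(updates: DataFrame): Unit =
    updates.as("source")
      .mergeInto("target", col("source.id") === col("target.id"))
      .whenMatched().updateAll()
      .whenNotMatched().insertAll()
      .merge()
}
```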

Returns a DataFrameNaFunctions for working with missing data.
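
For instance, null handling might look like the following sketch. It assumes the returned object offers the Apache Spark `fill(value, cols)` and `drop(how, cols)` methods; the `name` and `age` columns are hypothetical.

```scala
import org.apache.spark.sql.DataFrame

object NaExample {
  def clean(df: DataFrame): DataFrame =
    df.na.fill(0L, Seq("age"))      // replace null ages with 0
      .na.drop("any", Seq("name"))  // drop rows whose name is null
}
```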

Defines named observed metrics computed while this Dataset is processed.
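
The sketch below attaches two metrics to a Dataset, assuming the Apache Spark signature `observe(name, expr, exprs*)`. The metric group name `stats` and the `id` column are hypothetical; how the collected values are read back (listener or observation handle) depends on the client and is not shown.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count, lit, max}

object ObserveExample {
  // The returned DataFrame yields the same rows; the aggregates are
  // computed as a side effect when an action runs on it.
  def withMetrics(df: DataFrame): DataFrame =
    df.observe(
      "stats",
      count(lit(1)).as("rows"),
      max(col("id")).as("max_id"))
}
```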

Persists this Dataset with the default storage level (MEMORY_AND_DISK).

Persists this Dataset with the given storage level.
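
One way to scope a cache to a single computation, assuming the Apache Spark `persist(StorageLevel)` and `unpersist()` signatures. The choice of `MEMORY_ONLY` is only for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.storage.StorageLevel

object PersistExample {
  // Persists with an explicit level, runs an action, then releases the cache.
  def cachedCount(df: DataFrame): Long = {
    val cached = df.persist(StorageLevel.MEMORY_ONLY)
    try cached.count()
    finally cached.unpersist()
  }
}
```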

The full protobuf proto.Plan (a root plan wrapping relation) that would be sent to the server to execute this Dataset. Exposed for plugin authors and tooling; ordinary code never needs it.

Randomly splits this Dataset with the given weights and a fixed seed.
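
A train/test split could be written as below, assuming the Apache Spark signature `randomSplit(weights: Array[Double], seed: Long)`. The 80/20 weights and the seed value are arbitrary.

```scala
import org.apache.spark.sql.DataFrame

object SplitExample {
  // Returns (train, test). Every input row lands in exactly one part,
  // and the fixed seed makes the split repeatable.
  def trainTest(df: DataFrame): (DataFrame, DataFrame) = {
    val parts = df.randomSplit(Array(0.8, 0.2), seed = 42L)
    (parts(0), parts(1))
  }
}
```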

Randomly splits this Dataset with the given weights.

Range-partitions this Dataset by the given expressions into numPartitions partitions.

Range-partitions by the given expressions.

The schema of this Dataset.

Returns a DataFrameStatFunctions for statistic functions.

Returns the current storage level of this Dataset.

Computes the requested summary statistics; defaults match Spark's summary().
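
To request a specific subset of statistics, a call might look like this sketch. It assumes the Apache Spark signature `summary(statistics: String*)`; the chosen statistics are arbitrary.

```scala
import org.apache.spark.sql.DataFrame

object SummaryExample {
  // One output row per requested statistic. The first output column,
  // named "summary", holds the statistic's name.
  def minMax(df: DataFrame): DataFrame =
    df.summary("count", "min", "max")
}
```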

Returns a new Dataset where each row is reconciled to match the specified schema (by column name, reordering and casting as needed).
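
The reconciliation can be sketched as follows, assuming the Apache Spark signature `to(schema: StructType)`. The target schema with `name` and `id` columns is hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object ToSchemaExample {
  // Hypothetical target layout: name first, id widened to a long.
  val target: StructType = StructType(Seq(
    StructField("name", StringType),
    StructField("id", LongType)))

  // Columns are matched by name, reordered, and cast to the target types.
  def reconcile(df: DataFrame): DataFrame = df.to(target)
}
```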

Returns the content as a DataFrame of JSON strings in a single value column.

Applies the given function to this Dataset and returns its result, giving a concise syntax for chaining custom transformations.
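
A short sketch of chaining, assuming the Apache Spark signature `transform[U](t: Dataset[T] => Dataset[U])`. The helper functions and the `age` column are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}

object TransformExample {
  def onlyAdults(df: DataFrame): DataFrame =
    df.filter(col("age") >= 18)

  // Curried so the configured step can be passed as a DataFrame => DataFrame.
  def withBirthYear(year: Int)(df: DataFrame): DataFrame =
    df.withColumn("birth_year", lit(year) - col("age"))

  // Reads top to bottom instead of nesting the calls inside out.
  def pipeline(df: DataFrame): DataFrame =
    df.transform(onlyAdults)
      .transform(withBirthYear(2024))
}
```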

Transposes this Dataset, turning the values of the first column into the new column names.

Transposes this Dataset, using the values of indexColumn as the new column names.

Marks this Dataset as non-persistent.

Marks this Dataset as non-persistent.

Unpivots (melts) a DataFrame from wide to long format.
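
The wide-to-long reshaping might be used like this, assuming the Apache Spark signature `unpivot(ids, values, variableColumnName, valueColumnName)`. The columns `id`, `q1`, `q2` and the output names `quarter` and `sales` are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object UnpivotExample {
  // Each input row (id, q1, q2) becomes two rows:
  // (id, "q1", <q1 value>) and (id, "q2", <q2 value>).
  def toLong(wide: DataFrame): DataFrame =
    wide.unpivot(
      Array(col("id")),            // identifier columns, kept as-is
      Array(col("q1"), col("q2")), // value columns to stack
      "quarter",                   // receives the source column name
      "sales")                     // receives the source column value
}
```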

Unpivots this Dataset, using every column not listed in ids as a value column.

Defines an event-time watermark for this streaming Dataset.

Interface for saving the content of this Dataset to external storage.

Interface for saving the content of a streaming Dataset to external storage.

Creates a v2 (catalog) write configuration builder.

Members list

Value members

Concrete methods

Concrete fields