SparkSession

Attributes

Definition Classes: AutoCloseable

Creates a DataFrame from a local sequence of Rows using the given schema.

Attributes

Creates a DataFrame from a Java list of Rows using the given schema.

Attributes

Creates a typed Dataset from a local sequence of T using its Encoder. The data is serialized via the encoder and shipped as a local relation; no server-side closure is involved.

Attributes

Creates a typed Dataset from a Java list of T using its Encoder.

Attributes

Returns a DataFrame with no rows or columns.

Attributes

Creates a DataFrame from an arbitrary RelType, tagging it with a unique plan id.

This is the primary extension point for Spark Connect plugins: build any relation the server understands -- including a custom extension relation -- and turn it into a DataFrame. For the common case of a packed plugin message, prefer the extension overload.

Attributes

Creates a DataFrame backed by an extension relation carrying an already packed protobuf message. This is the Scala counterpart of the PySpark client's

plan.extension.Pack(message); DataFrame(plan, session)

and targets server-side org.apache.spark.sql.connect.plugin.RelationPlugin implementations.

 import com.google.protobuf.any.Any as ProtoAny

 // `myPluginMessage` is a ScalaPB message generated from the plugin's own .proto.
 // Existing DataFrames can be embedded via their `relation` (see Dataset.relation):
 //   val payload = MyPlugin(input = Some(existingDf.relation))
 val payload: ProtoAny = ProtoAny.pack(myPluginMessage)
 val df = spark.newDataFrame(payload)

Attributes

Wraps a RelType into a fresh proto.Relation tagged with a unique plan id.

This is a low-level building block intended for Spark Connect plugin authors who need to construct custom relations (for example, a relation handled by a server-side org.apache.spark.sql.connect.plugin.RelationPlugin). It is a stable extension point but not part of the day-to-day DataFrame API; ordinary code should use range, sql, createDataFrame, and friends instead.
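To make "tagged with a unique plan id" concrete, here is a simplified, self-contained model in plain Scala. Everything in it is a hypothetical stand-in: `RelType`, `RangeRel`, `ExtensionRel`, `Relation`, `wrap`, and the counter-based id scheme are assumptions for illustration, not the client's generated proto classes or its actual id generator.

```scala
import java.util.concurrent.atomic.AtomicLong

object PlanIdModel {
  // Hypothetical stand-ins for the relation kinds; the real ones are
  // generated from the Spark Connect protobuf definitions.
  sealed trait RelType
  final case class RangeRel(start: Long, end: Long) extends RelType
  final case class ExtensionRel(payload: String) extends RelType

  // Hypothetical wrapper: a relation body plus the id assigned to it.
  final case class Relation(planId: Long, relType: RelType)

  // Assumption: ids come from a thread-safe, monotonically increasing counter.
  private val nextPlanId = new AtomicLong(0L)

  // Every call produces a fresh Relation, even for an identical RelType.
  def wrap(relType: RelType): Relation =
    Relation(nextPlanId.getAndIncrement(), relType)
}
```

In this model, wrapping the same `RelType` twice yields two relations with equal bodies but different plan ids, which is what lets each one be identified separately.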

Attributes

Starts a new, independent session against the same endpoint, backed by a fresh server-side session.

Attributes

Creates a new Spark Declarative Pipeline (a dataflow graph) in this session. Available on Spark 4.1 and later servers.

Attributes

Creates a DataFrame with a single id column containing the values in the half-open range [0, end).

Attributes

Creates a DataFrame with a single id column containing the values in the half-open range [start, end).
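The interval [start, end) includes start and excludes end. The sketch below shows which ids to expect using only the Scala standard library. It assumes a step of 1, which this page does not state, and `expectedIds` is a hypothetical helper that does not call the client.

```scala
object RangeSemantics {
  // Hypothetical helper: the ids a [start, end) range would contain,
  // assuming a step of 1. `until` builds an end-exclusive range.
  def expectedIds(start: Long, end: Long): List[Long] =
    (start until end).toList

  def main(args: Array[String]): Unit = {
    println(expectedIds(0L, 3L)) // List(0, 1, 2): end is excluded
    println(expectedIds(2L, 5L)) // List(2, 3, 4)
    println(expectedIds(3L, 3L)) // List(): empty when start == end
  }
}
```

Under that assumption, the row count is `end - start` whenever `end >= start`.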

Attributes

Returns a DataFrameReader that can be used to read non-streaming data as a DataFrame.

Attributes

Returns an org.apache.spark.sql.streaming.DataStreamReader for reading streaming data.

Attributes

The client session id (a UUID).

Attributes

Makes this the active session for the current thread.

Attributes

Executes a SQL query and returns a lazy DataFrame over its result.

Attributes

Executes a SQL query with positional parameters bound into the query.

Attributes

Executes a SQL query with named parameters bound into the query.

Attributes

Releases the server-side session resources and closes the channel.

Attributes

Returns an org.apache.spark.sql.streaming.StreamingQueryManager for this session.

Attributes

Returns the named table or view as a DataFrame.

Attributes

The version of Spark on which the connected server is running.
