org.apache.spark.sql.streaming
Members list
Type members
Classlikes
Loads a streaming DataFrame from a streaming source. Use SparkSession.readStream to access this.
Mirrors the public surface of org.apache.spark.sql.streaming.DataStreamReader over the Spark Connect protocol. A streaming read is expressed as a Read relation whose is_streaming flag is set, so the resulting DataFrame is unbounded and is meant to be consumed through a DataStreamWriter.
Attributes
- Example
-
val df = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
- Supertypes
-
class Object
trait Matchable
class Any
Writes a streaming Dataset to an external sink and starts the streaming query. Use Dataset.writeStream to access this.
Mirrors the public surface of org.apache.spark.sql.streaming.DataStreamWriter over the Spark Connect protocol. Calling start or toTable builds a WriteStreamOperationStart command, sends it to the server, and returns a StreamingQuery handle built from the returned query id, run id and name.
foreach and foreachBatch are intentionally unsupported: they require user-defined functions, whose Spark Connect protobuf definitions are not handled here.
Attributes
- Example
-
val query = df.writeStream
  .format("console")
  .outputMode("append")
  .trigger(Trigger.ProcessingTime("1 second"))
  .start()
query.stop()
- Supertypes
-
class Object
trait Matchable
class Any
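The description above names toTable as the second way to start a query. A minimal sketch of that path follows, assuming this client accepts the same builder calls as upstream Spark (queryName, option and toTable are taken from the upstream API and are not confirmed by this page). The query name, checkpoint path and table name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{StreamingQuery, Trigger}

// Sketch only: assumes `spark` is already connected to a Spark Connect server
// and that queryName, option and toTable behave as they do upstream.
def startRateToTable(spark: SparkSession): StreamingQuery = {
  val df = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
  df.writeStream
    .queryName("rate_to_table")                                      // hypothetical name
    .option("checkpointLocation", "/tmp/checkpoints/rate_to_table")  // hypothetical path
    .outputMode("append")
    .trigger(Trigger.ProcessingTime("1 second"))
    .toTable("rate_events")                                          // hypothetical table
}
```

The returned handle should be stopped with stop() once it is no longer needed, as in the example above.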
A handle to a query that is executing continuously in the background as new data arrives. Returned by DataStreamWriter#start and StreamingQueryManager.
Mirrors the public surface of org.apache.spark.sql.streaming.StreamingQuery over the Spark Connect protocol. Every method issues a StreamingQueryCommand to the server, identified by the query's id and runId, and reads the typed result back from the response stream.
Value parameters
- id
-
the stable query id that persists across restarts from a checkpoint.
- name
-
the user-specified query name, or null if none was set.
- runId
-
the run id that is unique for each start or restart of the query.
- spark
-
the session that owns this query.
Attributes
- Supertypes
-
class Object
trait Matchable
class Any
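The value parameters above can be read back from a handle to tell queries and their restarts apart. A small sketch, assuming id, runId and name are exposed as public members, as they are on the upstream StreamingQuery; the describe helper is hypothetical.

```scala
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper: renders the identifying fields of a query handle.
// `name` is documented as null when no name was set, so it is wrapped in Option.
def describe(query: StreamingQuery): String = {
  val label = Option(query.name).getOrElse("<unnamed>")
  s"$label id=${query.id} runId=${query.runId}"
}
```

Because id survives a restart from a checkpoint while runId does not, logging both makes it possible to correlate successive runs of the same query.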
Describes the exception that terminated a StreamingQuery.
Value parameters
- errorClass
-
the error class of the exception, if any.
- message
-
the exception message, matching the server-side StreamingQueryException.toString.
- stackTrace
-
the stack trace of the exception, if any.
Attributes
- Supertypes
-
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
Interface for listening to streaming query lifecycle events. Register an implementation with spark.streams.addListener(...).
Over Spark Connect the callbacks run on the client: the server forwards serialized events on a dedicated stream and this client dispatches them to the registered listeners. No user code runs on the server, so listeners (unlike foreach/foreachBatch) are fully supported.
spark.streams.addListener(new StreamingQueryListener {
  def onQueryStarted(e: StreamingQueryListener.QueryStartedEvent): Unit = println(s"started ${e.id}")
  def onQueryProgress(e: StreamingQueryListener.QueryProgressEvent): Unit = println(e.json)
  def onQueryTerminated(e: StreamingQueryListener.QueryTerminatedEvent): Unit = println(s"done ${e.id}")
})
Attributes
- Companion
- object
- Supertypes
-
trait Serializable
class Object
trait Matchable
class Any
Attributes
- Companion
- class
- Supertypes
-
class Object
trait Matchable
class Any
- Self type
Manages all the StreamingQuery instances active in a single SparkSession. Use SparkSession.streams to access this.
Mirrors the public surface of org.apache.spark.sql.streaming.StreamingQueryManager over the Spark Connect protocol. Every method issues a StreamingQueryManagerCommand to the server and reads the typed result back from the response stream.
Value parameters
- spark
-
the session whose streaming queries are managed.
Attributes
- Supertypes
-
class Object
trait Matchable
class Any
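A typical use of the manager is shutting down every query a session still owns. The sketch below assumes that spark.streams exposes an active collection of StreamingQuery handles, as the upstream StreamingQueryManager does; this page does not list the manager's methods, and the stopAll helper is hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: stops every active query in the session and
// returns how many handles were stopped.
// Assumes `active` returns an Array or Seq of StreamingQuery, as upstream.
def stopAll(spark: SparkSession): Int = {
  val running = spark.streams.active
  running.foreach(_.stop())
  running.length
}
```

Each call goes through the server, so the result reflects the server's view of the session at the time active was evaluated.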
A snapshot of a StreamingQuery's current status.
Value parameters
- isActive
-
whether the query is still running.
- isDataAvailable
-
whether there is any data available to be processed.
- isTriggerActive
-
whether a trigger is currently active (a micro-batch is being processed).
- message
-
a human-readable description of what the query is currently doing.
Attributes
- Supertypes
-
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
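The four fields above can be folded into a single line for logs or dashboards. A minimal sketch, assuming the three flags are Boolean and message is a String; the summarize helper and its phase labels are hypothetical.

```scala
import org.apache.spark.sql.streaming.StreamingQueryStatus

// Hypothetical helper: turns a status snapshot into a short log line.
// Checks run from most to least specific: a stopped query wins over everything,
// then an in-flight micro-batch, then pending data.
def summarize(status: StreamingQueryStatus): String = {
  val phase =
    if (!status.isActive) "stopped"
    else if (status.isTriggerActive) "processing a micro-batch"
    else if (status.isDataAvailable) "data waiting"
    else "idle"
  s"$phase: ${status.message}"
}
```

A status is a snapshot, so a query that is polled in a loop needs a fresh status on every iteration rather than a cached one.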
Policy used to indicate how often results should be produced by a StreamingQuery. Mirrors org.apache.spark.sql.streaming.Trigger. Pass an instance to DataStreamWriter#trigger.
Attributes
- Companion
- object
- Supertypes
-
class Object
trait Matchable
class Any