DataStreamWriter

org.apache.spark.sql.streaming.DataStreamWriter

Writes a streaming Dataset to an external sink and starts the streaming query. Use Dataset.writeStream to access this.

Mirrors the public surface of org.apache.spark.sql.streaming.DataStreamWriter over the Spark Connect protocol. Calling start or toTable builds a WriteStreamOperationStart command, sends it to the server, and returns a StreamingQuery handle built from the returned query id, run id and name.

foreach and foreachBatch are intentionally unsupported: they require user-defined functions, whose Spark Connect protobuf definitions are not handled here.

Attributes

Example
 val query = df.writeStream
   .format("console")
   .outputMode("append")
   .trigger(Trigger.ProcessingTime("1 second"))
   .start()
 query.stop()
Graph
Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def format(source: String): DataStreamWriter

Specifies the sink format (e.g. "console", "memory", "kafka").

Specifies the sink format (e.g. "console", "memory", "kafka").

Attributes

Returns

this writer, for chaining.

def option(key: String, value: String): DataStreamWriter

Adds an output option for the underlying sink.

Adds an output option for the underlying sink.

Attributes

Returns

this writer, for chaining.

def option(key: String, value: Boolean): DataStreamWriter

Adds a boolean output option. @return this writer, for chaining.

Adds a boolean output option. @return this writer, for chaining.

Attributes

def option(key: String, value: Long): DataStreamWriter

Adds a long output option. @return this writer, for chaining.

Adds a long output option. @return this writer, for chaining.

Attributes

def option(key: String, value: Double): DataStreamWriter

Adds a double output option. @return this writer, for chaining.

Adds a double output option. @return this writer, for chaining.

Attributes

def options(options: Map[String, String]): DataStreamWriter

Adds multiple output options.

Adds multiple output options.

Attributes

Returns

this writer, for chaining.

def outputMode(outputMode: String): DataStreamWriter

Specifies the output mode ("append", "complete", or "update").

Specifies the output mode ("append", "complete", or "update").

Attributes

Returns

this writer, for chaining.

def partitionBy(colNames: String*): DataStreamWriter

Partitions the output by the given columns on the file system.

Partitions the output by the given columns on the file system.

Attributes

Returns

this writer, for chaining.

def queryName(queryName: String): DataStreamWriter

Specifies the name of the StreamingQuery that can be used with StreamingQueryManager#get to look it up. The name is required by the memory sink.

Specifies the name of the StreamingQuery that can be used with StreamingQueryManager#get to look it up. The name is required by the memory sink.

Attributes

Returns

this writer, for chaining.

def start(path: String): StreamingQuery

Starts the execution of the streaming query, writing the result to the given path. This is shorthand for setting the path option and calling start.

Starts the execution of the streaming query, writing the result to the given path. This is shorthand for setting the path option and calling start.

Attributes

Returns

the StreamingQuery handle for the started query.

Starts the execution of the streaming query, which will continually output results to the configured sink as new data arrives.

Starts the execution of the streaming query, which will continually output results to the configured sink as new data arrives.

Attributes

Returns

the StreamingQuery handle for the started query.

def toTable(tableName: String): StreamingQuery

Starts the execution of the streaming query, writing the result into the given table.

Starts the execution of the streaming query, writing the result into the given table.

Attributes

Returns

the StreamingQuery handle for the started query.

Sets the trigger for the streaming query. Exactly one trigger is in effect; calling this method again replaces any previously set trigger.

Sets the trigger for the streaming query. Exactly one trigger is in effect; calling this method again replaces any previously set trigger.

Attributes

Returns

this writer, for chaining.