Troubleshooting
Common problems connecting to and querying Apache Spark Connect, with what each symptom usually means and how to fix it.
Connection refused on port 15002
The Spark Connect server is not running, or it is listening somewhere else.
- Confirm the server is up. Spark Connect listens on `sc://localhost:15002` by default. On Spark 4.x the Connect server is bundled, so start it with `./sbin/start-connect-server.sh` (on Spark 3.5.x add `--packages org.apache.spark:spark-connect_2.13:3.5.8`).
- Check the host and port in your URI. A managed endpoint usually uses port `443`, not `15002`.
- Verify nothing else is occupying the port and that no firewall blocks it.
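Before digging into driver options, it can help to confirm that the port accepts TCP connections at all. The following is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical and not part of the driver.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 15002 is the default Spark Connect port; adjust for a managed endpoint.
    state = "reachable" if can_connect("localhost", 15002) else "not reachable"
    print(f"localhost:15002 is {state}")
```

A `False` result only shows that nothing answered on that address. It does not distinguish a stopped server from a firewall rule, so check both.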
TLS and token errors
- A bearer token requires TLS. The driver enables TLS automatically when a token is set, but if you set `use_ssl=false` explicitly with a token, the credential will not be sent. Remove the override or set `use_ssl=true`.
- For a plaintext local server, do not set `use_ssl=true`; the handshake will fail because the server has no TLS.
- Check the token has not expired and is the right kind for the endpoint (personal access token vs OAuth). See Connecting and Authentication.
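The two option conflicts above can be caught before connecting. This is a sketch only: it assumes the options are held in a plain dict, that the token lives under a key named `token` (a hypothetical name, check Connecting and Authentication for the real one), and that `use_ssl` is stored as a Python bool. The localhost test is a heuristic, since a local server can be configured with TLS.

```python
def check_tls_options(options: dict) -> list[str]:
    """Return warnings for token/TLS combinations that are likely to fail."""
    warnings = []
    token = options.get("token")      # hypothetical key name for the bearer token
    use_ssl = options.get("use_ssl")  # None means the driver decides
    uri = options.get("uri", "")

    # Case 1: a token with TLS explicitly disabled is never sent.
    if token and use_ssl is False:
        warnings.append("token set with use_ssl=false: the credential will not be sent")

    # Case 2 (heuristic): forcing TLS against a local, usually plaintext, server.
    is_local = uri.startswith(("sc://localhost", "sc://127.0.0.1"))
    if use_ssl is True and is_local:
        warnings.append("use_ssl=true against a local server: handshake fails if it has no TLS")

    return warnings


if __name__ == "__main__":
    for message in check_tls_options(
        {"uri": "sc://example.com:443", "token": "secret", "use_ssl": False}
    ):
        print("warning:", message)
```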
"server session changed"
The server restarted, or the session you pinned with `session_id` no longer exists.
- Drop the `session_id` to let the driver create a fresh session.
- If you rely on session reuse, make sure the same server instance is still running and that the UUID is current.
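If a stale `session_id` is an expected condition (for example after a scheduled server restart), the "drop it and retry" step can be automated. The sketch below is deliberately driver-agnostic: `run` is a hypothetical callable you supply that connects and executes with the given options, and the check assumes the error text contains the phrase quoted in the heading.

```python
from typing import Any, Callable


def run_with_session_fallback(run: Callable[[dict], Any], options: dict) -> Any:
    """Call run(options); on a stale pinned session, retry once without session_id."""
    try:
        return run(options)
    except Exception as exc:
        stale = "server session changed" in str(exc)
        if not stale or "session_id" not in options:
            raise  # unrelated failure, or nothing to drop
        # Retry with a copy of the options so the caller's dict is untouched.
        fresh = {k: v for k, v in options.items() if k != "session_id"}
        return run(fresh)
```

Any server-side state tied to the old session (temporary views, session configuration) is gone after the retry, so this only suits workloads that can start from a clean session.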
Large results stall or use too much memory
For very large result sets:
- Stream batches instead of materializing everything. Use `fetch_record_batch()` in Python or iterate the `RecordReader` in Go rather than `fetch_arrow_table()`. The driver delivers one batch at a time, so a streaming consumer keeps memory flat regardless of result size. See Querying Data.
- Reduce the result size at the source with `LIMIT`, projection, or aggregation so Spark prunes the work server-side before sending Arrow.
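A streaming consumer can be as small as a loop over the reader. The sketch below assumes that `fetch_record_batch()` returns an iterable reader whose items expose `num_rows`, as `pyarrow.RecordBatchReader` does in the ADBC Python DB-API; the helper name `count_rows_streaming` is hypothetical.

```python
def count_rows_streaming(reader) -> int:
    """Consume a batch reader one batch at a time and return the total row count."""
    total = 0
    for batch in reader:
        # Handle the batch here (write it out, aggregate it, ...) and then let
        # it go out of scope, so only one batch is held in memory at a time.
        total += batch.num_rows
    return total


# Usage with an open DB-API connection `conn` (not created here):
#
#     with conn.cursor() as cur:
#         cur.execute("SELECT * FROM big_table")
#         print(count_rows_streaming(cur.fetch_record_batch()))
```

Collecting the batches into a list inside the loop would defeat the purpose, since that rebuilds the full result in memory.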
CGO or shared library not found
This affects the C/C++/R/Python loading paths (the native Go driver does not use the shared library).
- Make sure `libadbc_driver_spark.{so,dylib,dll}` is on the loader path (`LD_LIBRARY_PATH`, `DYLD_LIBRARY_PATH`, or `PATH`), or pass its absolute path as the `driver` option.
- If you built from source, the C-ABI package must be built with `-tags driverlib -buildmode=c-shared` and a working cgo toolchain. See Installation.
- For Python, `pip install adbc-driver-spark` bundles the library; a "not found" error usually means a partial or source install without the compiled artifact.
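To see whether the library is visible through the environment variables listed above, a short standard-library scan is enough. This sketch uses the file names and variables from this section; the helper `find_driver_library` is hypothetical, and it does not cover directories the system loader searches by default (such as `/usr/lib`).

```python
import os
from pathlib import Path

LIB_NAMES = (
    "libadbc_driver_spark.so",
    "libadbc_driver_spark.dylib",
    "libadbc_driver_spark.dll",
)
PATH_VARS = ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH", "PATH")


def find_driver_library(env=None) -> list[Path]:
    """Return every copy of the driver library found on the loader path variables."""
    env = os.environ if env is None else env
    found = []
    for var in PATH_VARS:
        for entry in env.get(var, "").split(os.pathsep):
            if not entry:
                continue  # skip empty segments such as a trailing separator
            for name in LIB_NAMES:
                candidate = Path(entry) / name
                if candidate.is_file():
                    found.append(candidate.resolve())
    return found


if __name__ == "__main__":
    hits = find_driver_library()
    print("\n".join(map(str, hits)) if hits else "driver library not found on loader path")
```

If the scan finds the file, passing that absolute path as the `driver` option sidesteps loader-path issues entirely.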
Version mismatches
- Check the Spark Connect server version. The driver supports Spark 4.0.x and 4.1.x (protos pinned to v4.1.2). See Compatibility.
- For Go, the module requires Go 1.25 or newer and matched `arrow-go` and `arrow-adbc` versions; run `go mod tidy` if you see resolution errors.
- For Python, ensure `pyarrow` and `adbc-driver-manager` were installed by the same `pip install` so their ABIs line up.
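For the Python case, listing what is actually installed in the active environment is a quick first check. The sketch below uses the standard `importlib.metadata` module; the helper `installed_versions` is hypothetical, and the package names are the ones mentioned in this page.

```python
from importlib import metadata


def installed_versions(
    packages=("pyarrow", "adbc-driver-manager", "adbc-driver-spark"),
) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that raises the error. A `None` entry, or versions that differ from what you expect, often points to a second environment shadowing the one you installed into.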
Tip
Set the gRPC log level on the server, or capture the full ADBC error message (the driver propagates the server's status and detail), to see the underlying Spark Connect error behind a generic failure.
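When capturing the error on the client side, it helps to print more than the message string. The sketch below is defensive by design: it assumes the exception may carry `status_code`, `sqlstate`, `vendor_code`, and `details` attributes (as the `adbc_driver_manager` exception classes do) but reads them with `getattr`, so it works with any exception. The helper name `describe_error` is hypothetical.

```python
def describe_error(exc: BaseException) -> str:
    """Format an exception together with any ADBC-style status attributes it carries."""
    parts = [f"{type(exc).__name__}: {exc}"]
    for attr in ("status_code", "sqlstate", "vendor_code", "details"):
        value = getattr(exc, attr, None)
        # Skip attributes that are missing or empty to keep the output short.
        if value is None or value == [] or value == "":
            continue
        parts.append(f"{attr}={value!r}")
    return "\n".join(parts)


# Usage around any driver call:
#
#     try:
#         cur.execute("SELECT 1")
#     except Exception as exc:
#         print(describe_error(exc))
```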