Metadata and Catalogs
The driver implements the ADBC metadata API so you can introspect catalogs,
schemas, tables, and columns without parsing SQL output. Behind the scenes it
uses Apache Spark Connect catalog relations and AnalyzePlan.
The metadata methods
| Method | Returns |
|---|---|
| GetObjects | A hierarchical catalog: catalogs, schemas, tables, columns. |
| GetTableSchema | The Arrow schema of a single table. |
| GetTableTypes | The table types the server reports (for example TABLE, VIEW). |
| GetInfo | Driver and server info (names, versions, ADBC version). |
GetObjects takes a depth (catalogs, db_schemas, tables, or all) plus optional
filters for catalog, schema, table name pattern, table types, and column name
pattern.
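The GetObjects result is nested rather than flat: each catalog row carries a list of schemas, each schema a list of tables, and so on, using the field names from the ADBC specification (catalog_name, catalog_db_schemas, db_schema_name, db_schema_tables, table_name, table_type). The following is a minimal Python sketch of flattening that structure into tuples. The helper name flatten_objects and the sample rows are illustrative assumptions, not part of the driver.

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[str, Optional[str], Optional[str], Optional[str]]


def flatten_objects(rows: List[dict]) -> Iterator[Row]:
    """Yield (catalog, schema, table, table_type) from GetObjects rows.

    `rows` is the output of `reader.read_all().to_pylist()`. Levels below the
    requested depth may come back as None, so missing lists count as empty.
    """
    for catalog in rows:
        cat_name = catalog["catalog_name"]
        schemas = catalog.get("catalog_db_schemas") or []
        if not schemas:
            yield (cat_name, None, None, None)
        for schema in schemas:
            schema_name = schema["db_schema_name"]
            tables = schema.get("db_schema_tables") or []
            if not tables:
                yield (cat_name, schema_name, None, None)
            for table in tables:
                yield (cat_name, schema_name, table["table_name"], table["table_type"])


# Hand-written sample in the GetObjects shape (illustrative data only).
sample = [
    {
        "catalog_name": "spark_catalog",
        "catalog_db_schemas": [
            {
                "db_schema_name": "default",
                "db_schema_tables": [
                    {"table_name": "events", "table_type": "TABLE"},
                    {"table_name": "recent_events", "table_type": "VIEW"},
                ],
            }
        ],
    }
]

for entry in flatten_objects(sample):
    print(entry)
```

Against a live connection, the same helper would take conn.adbc_get_objects(depth="tables").read_all().to_pylist() as its input.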
Listing catalogs, schemas, and tables
import adbc_driver_spark.dbapi as dbapi

with dbapi.connect("sc://localhost:15002") as conn:
    # The DBAPI connection exposes ADBC metadata helpers that return
    # pyarrow objects.
    catalogs = conn.adbc_get_objects(depth="catalogs").read_all()
    print(catalogs.to_pylist())

    # Tables in a given catalog and schema.
    tables = conn.adbc_get_objects(
        depth="tables",
        catalog_filter="spark_catalog",
        db_schema_filter="default",
    ).read_all()
    print(tables.to_pylist())
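import (
	"fmt"

	"github.com/apache/arrow-adbc/go/adbc"
)

// Walk the full object hierarchy for one schema.
// cnxn is an open adbc.Connection and ctx is a context.Context.
// strPtr is a small helper that returns a pointer to its string argument,
// for example: func strPtr(s string) *string { return &s }
reader, err := cnxn.GetObjects(
	ctx,
	adbc.ObjectDepthAll,
	nil,               // catalog
	strPtr("default"), // dbSchema
	nil,               // tableName
	nil,               // columnName
	nil,               // tableTypes
)
if err != nil {
	panic(err)
}
defer reader.Release()
for reader.Next() {
	fmt.Println(reader.Record())
}
// Next returns false on both end-of-stream and failure, so check the error.
if err := reader.Err(); err != nil {
	panic(err)
}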
/* AdbcConnectionGetObjects walks the catalog/schema/table/column hierarchy
 * and returns the result as an Arrow C stream. Pass NULL for filters you
 * do not want to apply. */
struct ArrowArrayStream stream = {0};
AdbcConnectionGetObjects(&connection, ADBC_OBJECT_DEPTH_ALL,
                         NULL,      /* catalog */
                         "default", /* db_schema */
                         NULL,      /* table_name */
                         NULL,      /* table_types */
                         NULL,      /* column_name */
                         &stream, &error);
/* Read `stream` with nanoarrow. */
Note
The full setup/teardown (creating the database and connection, error checking, releases) and the compile command live in Using from C and C++.
use adbc_core::options::ObjectDepth;

// get_objects walks the catalog/schema/table/column hierarchy and returns a
// RecordBatchReader. Pass None for filters you do not want to apply.
let reader = connection.get_objects(
    ObjectDepth::All,
    None,            // catalog
    Some("default"), // db_schema
    None,            // table_name
    None,            // table_type
    None,            // column_name
)?;
for batch in reader {
    println!("{:?}", batch?);
}
Note
For the full Rust and Ruby setup see Using from Rust and Using from Ruby.
Alternatively, you can run SQL discovery statements such as SHOW CATALOGS,
SHOW DATABASES, and SHOW TABLES, which return ordinary flat Arrow result sets
rather than the nested GetObjects structure.
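Because these statements are plain queries, they go through the ordinary DBAPI cursor path. The sketch below uses only standard DBAPI (PEP 249) cursor methods; the helper name show is hypothetical, and the commented usage assumes a Spark Connect server at the address used earlier on this page.

```python
def show(conn, statement):
    """Run a discovery statement and return its rows as a list of tuples.

    Works with any DBAPI 2.0 connection; nothing here is Spark-specific.
    """
    cur = conn.cursor()
    try:
        cur.execute(statement)
        return cur.fetchall()
    finally:
        # Always release the cursor, even if execute() raises.
        cur.close()


# Usage against a running server (not executed here):
# import adbc_driver_spark.dbapi as dbapi
# with dbapi.connect("sc://localhost:15002") as conn:
#     for row in show(conn, "SHOW TABLES IN default"):
#         print(row)
```

The column layout of each SHOW statement is defined by Spark, not by ADBC, so code that depends on specific columns should check the cursor description first.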
Inspecting a table schema
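In Python, the ADBC DBAPI layer exposes GetTableSchema as adbc_get_table_schema, which returns a pyarrow.Schema. The following is a small sketch, assuming a connection opened with dbapi.connect; the helper name describe_table and the table name events are hypothetical.

```python
def describe_table(conn, table_name, db_schema=None, catalog=None):
    """Return [(column_name, arrow_type_as_text), ...] for one table."""
    schema = conn.adbc_get_table_schema(
        table_name,
        catalog_filter=catalog,
        db_schema_filter=db_schema,
    )
    # A pyarrow.Schema iterates over its fields in column order.
    return [(field.name, str(field.type)) for field in schema]


# Usage against a running server (not executed here):
# import adbc_driver_spark.dbapi as dbapi
# with dbapi.connect("sc://localhost:15002") as conn:
#     for name, arrow_type in describe_table(conn, "events", db_schema="default"):
#         print(name, arrow_type)
```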
Table types
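In Python, GetTableTypes is available as adbc_get_table_types, which returns a list of strings, and the same values can be passed to adbc_get_objects through table_types_filter. The sketch below combines the two; the helper name tables_of_type is hypothetical, and the exact type names depend on what the server reports.

```python
def tables_of_type(conn, table_type, db_schema=None):
    """Return the names of tables of one type, or [] if the type is not reported."""
    if table_type not in conn.adbc_get_table_types():
        return []
    rows = conn.adbc_get_objects(
        depth="tables",
        db_schema_filter=db_schema,
        table_types_filter=[table_type],
    ).read_all().to_pylist()
    names = []
    # Walk the nested catalog -> schema -> table structure of GetObjects.
    for catalog in rows:
        for schema in catalog.get("catalog_db_schemas") or []:
            for table in schema.get("db_schema_tables") or []:
                names.append(table["table_name"])
    return names


# Usage against a running server (not executed here):
# import adbc_driver_spark.dbapi as dbapi
# with dbapi.connect("sc://localhost:15002") as conn:
#     print(tables_of_type(conn, "VIEW", db_schema="default"))
```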
Driver and server info
GetInfo reports identifying details such as the driver name and version and
the Spark vendor name and version.
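In Python, the ADBC DBAPI layer exposes GetInfo as adbc_get_info, which returns a dictionary keyed by names such as vendor_name, vendor_version, driver_name, and driver_version. The following is a minimal sketch; the helper name summarize_info is hypothetical, and which keys are actually populated depends on the driver, so missing entries fall back to a placeholder.

```python
def summarize_info(conn):
    """Build a one-line description of the server and driver from GetInfo."""
    info = conn.adbc_get_info()
    return "{} {} (driver: {} {})".format(
        info.get("vendor_name", "unknown"),
        info.get("vendor_version", "unknown"),
        info.get("driver_name", "unknown"),
        info.get("driver_version", "unknown"),
    )


# Usage against a running server (not executed here):
# import adbc_driver_spark.dbapi as dbapi
# with dbapi.connect("sc://localhost:15002") as conn:
#     print(summarize_info(conn))
```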
Note
The column types reported by GetObjects and GetTableSchema follow the
Spark to Arrow mapping in Type Mapping.