What it does
[Databricks](https://www.databricks.com/) is the unified lakehouse platform — Delta Lake storage plus a SQL warehouse on top. This sink writes one row per peepshow run to a Delta table via the [SQL Statement Execution API](https://docs.databricks.com/api/workspace/statementexecution) (`POST /api/2.0/sql/statements/`). Auth is a [personal access token (PAT)](https://docs.databricks.com/en/dev-tools/auth/pat.html) sent as a Bearer header — no JDBC driver, no SQL Connector for Python to install, no key-pair signing. The first write auto-creates the table with the standard peepshow schema (`run_id · title · frames · duration · transcript · thumbnail_url · strategy · tags · created_at`) using `CREATE TABLE IF NOT EXISTS ... USING DELTA`; subsequent runs append. The warehouse must already exist — `DATABRICKS_WAREHOUSE_ID` points the sink at it.
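As a rough illustration of the request described above, the sketch below builds a Statement Execution API body and posts it with curl. The helper names `build_statement_body` and `run_statement` are hypothetical, the `wait_timeout` value is an assumption, and none of this is taken from peepshow's own source.

```shell
# Hypothetical helper: build the JSON body for POST /api/2.0/sql/statements/.
# It only escapes backslashes and double quotes, so keep statements on one line.
build_statement_body() {
  stmt=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"warehouse_id":"%s","statement":"%s","wait_timeout":"30s"}' \
    "$DATABRICKS_WAREHOUSE_ID" "$stmt"
}

# Hypothetical helper: send one statement to the warehouse,
# with the PAT passed as a Bearer token.
run_statement() {
  curl -sS -X POST "${DATABRICKS_URL%/}/api/2.0/sql/statements/" \
    -H "Authorization: Bearer $DATABRICKS_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$(build_statement_body "$1")"
}

# Example (needs network access and valid credentials):
#   run_statement 'SELECT 1'
```

Running a trivial statement such as `SELECT 1` this way is a quick check that the URL, token, and warehouse ID work together before the sink is registered.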
When to reach for it
- Pipe peepshow runs into the same Databricks workspace your product analytics already lives in
- Build a Databricks SQL dashboard or AI/BI Genie space over run history without an ETL layer
- Hand a service-principal PAT to a CI job that records every QA video into a shared lakehouse
Install
npm i -g peepshow
Use it
DATABRICKS_URL="https://abc-123.cloud.databricks.com" \
DATABRICKS_TOKEN="$(< ~/.databricks-pat)" \
DATABRICKS_WAREHOUSE_ID="abcd1234efgh5678" \
peepshow ./demo.mp4 --sink databricks
Make it automatic
Register the sink once, and every later run fires it. Scope it with `--when` so it only runs for matching videos.
peepshow sinks add databricks
peepshow sinks add databricks --when extension=mp4,mov
peepshow sinks add databricks --when path=/Volumes/Work/Configuration
| Variable | Description | Required |
| --- | --- | --- |
| `DATABRICKS_URL` | Workspace URL, e.g. `https://abc-123.cloud.databricks.com` (AWS), `https://adb-…azuredatabricks.net` (Azure), or your GCP workspace host. | required |
| `DATABRICKS_TOKEN` | Personal access token (Bearer). Generate under User Settings → Developer → Access tokens. Use a service-principal PAT for CI. | required |
| `DATABRICKS_WAREHOUSE_ID` | SQL warehouse ID: the compute that runs the statement. Copy from the SQL warehouse details page. | required |
| `DATABRICKS_CATALOG` | Unity Catalog catalog. Default `main`. | |
| `DATABRICKS_SCHEMA` | Schema (a.k.a. database) within the catalog. Default `default`. | |
| `DATABRICKS_TABLE` | Table name. Default `peepshow_runs`. Auto-created on first write. | |
| `PEEPSHOW_FRAME_BASE_URL` | When set, the first frame URL is written to the `thumbnail_url` column. | |
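The defaults listed above can be mirrored in a wrapper script with plain parameter expansion. The sketch below is illustrative only: `resolve_target_table` and `missing_required` are hypothetical helpers, and joining the three names into a `catalog.schema.table` identifier is an assumption about how the sink addresses the table.

```shell
# Hypothetical helper: print the three-level table name, falling back to
# the documented defaults when the optional variables are unset or empty.
resolve_target_table() {
  printf '%s.%s.%s\n' \
    "${DATABRICKS_CATALOG:-main}" \
    "${DATABRICKS_SCHEMA:-default}" \
    "${DATABRICKS_TABLE:-peepshow_runs}"
}

# Hypothetical helper: print the name of each required variable
# that is unset or empty, one per line.
missing_required() {
  for name in DATABRICKS_URL DATABRICKS_TOKEN DATABRICKS_WAREHOUSE_ID; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

A wrapper could call `missing_required` before invoking peepshow and stop early when it prints anything.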
Use with an LLM agent
Every peepshow sink reads its config from env vars and receives a single JSON payload on stdin. An LLM agent (Claude Code, Cursor, Windsurf, Gemini, Codex) can drive the Databricks sink automatically when three things are true:
- the env vars below are exported in the agent's shell (or a project `.env` it can load),
- the `peepshow` CLI is on `PATH` (install with `npm i -g peepshow`),
- a peepshow auto-sink is registered for the run (optional but recommended, since it makes invocation zero-argument).
1. Set the environment
# Add to ~/.zshrc, ~/.bashrc, or a project .env the agent can load
export DATABRICKS_URL="..."
export DATABRICKS_TOKEN="..."
export DATABRICKS_WAREHOUSE_ID="..."
2. Register as an auto-sink
peepshow sinks add databricks
peepshow sinks add databricks --when extension=mp4,mov
3. Example LLM session
You → drop a `.mov` into Claude Code.
Claude → auto-invokes `/peepshow:slides ./clip.mov`. peepshow extracts frames + audio, and the Databricks sink writes the run to the configured Delta table. Claude replies with a summary and a link to the created record.
The transcript rides along in the payload whenever the audio pass transcribes successfully.
Write your own
A sink is any executable that reads the `--emit json` payload on stdin. Shell, Node, Python, and Go all work; the spec is in `docs/PLUGINS.md`. Register persistent ones with `peepshow sinks add-cmd 'your-command'`.
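A minimal sketch of such a sink, assuming only that the payload arrives as JSON on stdin: it appends each run to a local JSON Lines file. The function name `log_run` and the `PEEPSHOW_LOG` variable are hypothetical and not part of peepshow; the real payload fields are defined in `docs/PLUGINS.md`.

```shell
#!/bin/sh
# Hypothetical sink: append each run's JSON payload to a local log file.
# PEEPSHOW_LOG is an illustrative variable, not something peepshow defines.
log_run() {
  log_file="${PEEPSHOW_LOG:-./peepshow-runs.jsonl}"
  # Read the whole payload from stdin and drop raw newlines so each run
  # lands on one line. Newlines inside JSON strings are escaped, so this
  # only removes whitespace between tokens.
  tr -d '\n' >> "$log_file"
  printf '\n' >> "$log_file"
}

# In a standalone script, call the function as the last line:
#   log_run
```

Saved as an executable script with the final `log_run` call uncommented, it could be registered with `peepshow sinks add-cmd` and the script's path.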