Reel #76 Embedded analytics

peepshow sink / duckdb

DuckDB

Embedded analytics: a local .duckdb file you can query with SQL.

Append each peepshow run as one row in a local DuckDB file via the `duckdb` CLI. Sibling to the SQLite sink — same single-file simplicity, columnar query speed.

drop · process · duckdb

What it does

[DuckDB](https://duckdb.org) is an embedded analytical database for fast, local OLAP: think SQLite, but optimised for analytical queries. This sink shells out to the `duckdb` CLI to append each peepshow run to a `.duckdb` file, auto-creating the table on first write. Query the file with `duckdb peepshow.duckdb`, Python, R, or any BI tool that has a DuckDB driver. Zero servers, full SQL, columnar speed.
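To make the mechanism concrete, here is a minimal Python sketch of the shell-out-and-append pattern. It is not the sink's actual implementation: the two-column schema (`payload`, `thumbnail_url`), the helper names, and the choice to store the whole payload as JSON text are assumptions made for illustration. Only the `duckdb FILE "SQL"` invocation style and the default table name come from DuckDB and this page.

```python
import json
import subprocess


def sql_literal(value):
    """Render a Python value as a SQL string literal (None becomes NULL)."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def quote_ident(name):
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_append_sql(payload, table="peepshow_runs", thumbnail_url=None):
    """Build the create-if-missing and insert statements for one run.

    The columns are hypothetical; the real sink's schema may differ.
    """
    tbl = quote_ident(table)
    payload_json = json.dumps(payload, sort_keys=True)
    row = f"{sql_literal(payload_json)}, {sql_literal(thumbnail_url)}"
    return (
        f"CREATE TABLE IF NOT EXISTS {tbl} (payload VARCHAR, thumbnail_url VARCHAR); "
        f"INSERT INTO {tbl} VALUES ({row});"
    )


def append_run(db_path, payload, duckdb_bin="duckdb", **kwargs):
    """Run the statements through the duckdb CLI; raises if the CLI fails."""
    sql = build_append_sql(payload, **kwargs)
    subprocess.run([duckdb_bin, db_path, sql], check=True)
```

Because the schema above is invented, try it against a scratch file rather than the sink's own database, for example `append_run("/tmp/scratch.duckdb", {"source": "./demo.mp4"})`, where `source` is a made-up key.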

When to reach for it

  • Keep a local archive of every video an agent has watched, queryable at analytical speed
  • Ship a `.duckdb` snapshot with a video dataset so analysts can slice run history without a server
  • Join peepshow runs against Parquet, CSV, or remote data sources, which DuckDB can read directly from SQL
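Each of these use cases boils down to running SQL against the file. As a minimal sketch, the Python below shells out to the `duckdb` CLI in read-only mode and parses its JSON output. The helper names are illustrative, and the default path and table name are the ones listed under Configuration. The query deliberately touches no column names, since the table's schema is not listed on this page.

```python
import json
import os
import subprocess

# Default location documented under Configuration.
DEFAULT_DB = os.path.expanduser("~/.peepshow/sinks/duckdb/peepshow.duckdb")


def build_query_command(sql, db_path=DEFAULT_DB, duckdb_bin="duckdb"):
    """Build the argv for a read-only query with JSON output."""
    return [duckdb_bin, "-readonly", "-json", db_path, sql]


def count_runs(db_path=DEFAULT_DB, table="peepshow_runs"):
    """Return the number of rows in the runs table (requires the duckdb CLI)."""
    cmd = build_query_command(f'SELECT count(*) AS n FROM "{table}"', db_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # With -json the CLI prints a JSON array holding one object per row.
    return json.loads(result.stdout)[0]["n"]
```

Opening the file with `-readonly` keeps an analyst's query from modifying the archive the sink writes to.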

Install

# runtime + DuckDB CLI (no Node binding required)
npm i -g peepshow
brew install duckdb  # or: https://duckdb.org/docs/installation/

Use it

peepshow ./demo.mp4 --sink duckdb

Make it automatic

Register the sink once — every run fires it afterward. Scope by --when so it only runs for matching videos.

peepshow sinks add duckdb
peepshow sinks add duckdb --when extension=mp4,mov
peepshow sinks add duckdb --when path=/Volumes/Work/
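The exact matching rules behind --when are not spelled out here. One plausible reading of the examples above, sketched in Python, treats `extension=` as a comma-separated any-of list and `path=` as a prefix match. Both rules, the case-insensitive comparison, and the function names are assumptions for illustration, not peepshow's documented behaviour.

```python
import os


def parse_when(expr):
    """Split 'key=value' into (key, value); raises on malformed input."""
    key, sep, value = expr.partition("=")
    if not sep or not key or not value:
        raise ValueError(f"expected key=value, got {expr!r}")
    return key.strip(), value.strip()


def matches(video_path, when):
    """Return True if video_path satisfies the filter (assumed semantics)."""
    key, value = parse_when(when)
    if key == "extension":
        # Assumption: comma-separated list, any-of, case-insensitive.
        ext = os.path.splitext(video_path)[1].lstrip(".").lower()
        return ext in {item.strip().lower() for item in value.split(",")}
    if key == "path":
        # Assumption: plain prefix match on the path as given.
        return video_path.startswith(value)
    raise ValueError(f"unsupported filter key: {key!r}")
```

Under these assumptions, `matches("./demo.mp4", "extension=mp4,mov")` is true, while a file outside `/Volumes/Work/` fails the `path=/Volumes/Work/` filter.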

Configuration

  • DUCKDB_PATH Database file path. Default `~/.peepshow/sinks/duckdb/peepshow.duckdb`. Parent dir auto-created.
  • DUCKDB_TABLE Table name. Default `peepshow_runs`. Auto-created on first write.
  • DUCKDB_BIN Override the `duckdb` executable path. Default `duckdb` (must be on PATH).
  • PEEPSHOW_FRAME_BASE_URL When set, the first frame URL is written to the `thumbnail_url` column.
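The variables and defaults above can be summarised in a few lines of Python. This is a sketch of how a script might resolve them, not the sink's own code: the function names and the returned dictionary shape are illustrative, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import Path

# Default documented above; "~" is expanded at resolve time.
DEFAULT_DB_PATH = "~/.peepshow/sinks/duckdb/peepshow.duckdb"


def resolve_config(env=None):
    """Resolve sink settings from environment variables, with documented defaults."""
    env = os.environ if env is None else env
    return {
        "path": Path(env.get("DUCKDB_PATH") or DEFAULT_DB_PATH).expanduser(),
        "table": env.get("DUCKDB_TABLE") or "peepshow_runs",
        "bin": env.get("DUCKDB_BIN") or "duckdb",
        # Optional: thumbnail_url is only populated when this is set.
        "frame_base_url": env.get("PEEPSHOW_FRAME_BASE_URL") or None,
    }


def ensure_parent_dir(db_path):
    """Create the database's parent directory if it does not exist yet."""
    Path(db_path).parent.mkdir(parents=True, exist_ok=True)
```

Passing a plain dictionary as `env` makes the resolution easy to check without touching the real environment, for example `resolve_config({"DUCKDB_TABLE": "runs"})`.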

Use with an LLM agent

Every peepshow sink reads its config from env vars and receives a single JSON payload on stdin. An LLM agent (Claude Code, Cursor, Windsurf, Gemini, Codex) can drive the DuckDB sink automatically when three things are true:

  • the env vars below are exported in the agent's shell (or a project .env it can load),
  • the peepshow CLI is on PATH — install with npm i -g peepshow,
  • a peepshow auto-sink is registered for the run (optional but recommended — makes invocation zero-argument).
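An agent, or a human setting one up, can verify the PATH-related conditions before the first run. The Python sketch below is one hypothetical way to do that with the standard library. The function name and messages are illustrative. It does not check whether an auto-sink is registered, since this page does not show a command for listing registrations.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if which("peepshow") is None:
        problems.append("peepshow CLI not on PATH (npm i -g peepshow)")
    # DUCKDB_BIN overrides the executable; the default must be on PATH.
    duckdb_bin = env.get("DUCKDB_BIN") or "duckdb"
    if which(duckdb_bin) is None:
        problems.append(f"DuckDB CLI not found: {duckdb_bin}")
    return problems
```

The `which` parameter exists only so the check can be exercised with a stub; in normal use, call `preflight()` with no arguments.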

1. Set the environment

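This sink has no required env vars: it writes to a local path and falls back to the defaults listed under Configuration. To change the destination, pass it via --sink-arg or set DUCKDB_PATH. With the defaults, a plain invocation is enough: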

peepshow ./demo.mp4 --sink duckdb

2. Register as an auto-sink

peepshow sinks add duckdb
peepshow sinks add duckdb --when extension=mp4,mov

3. Example LLM session

You → drop a .mov into Claude Code.

Claude → auto-invokes /peepshow:slides ./clip.mov. peepshow extracts frames + audio, and the DuckDB sink appends the run as a row in the configured database file. Claude replies with a summary of the run.

The transcript rides along in the payload whenever the audio pass transcribes successfully.

Write your own

A sink is any executable that reads the --emit json payload on stdin. Shell, Node, Python, Go — the spec's in docs/PLUGINS.md. Register persistent ones with peepshow sinks add-cmd 'your-command'.
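As a starting point, here is a minimal Python sketch of a custom sink. It assumes the payload is a single JSON object and makes no assumptions about its field names; the authoritative payload shape is in docs/PLUGINS.md. The function names and the file name `my_sink.py` used below are hypothetical.

```python
import json
import sys


def parse_payload(raw):
    """Decode the JSON document peepshow pipes to the sink on stdin."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object on stdin")
    return payload


def describe(payload):
    """One-line summary that avoids assuming any particular field names."""
    keys = ", ".join(sorted(payload))
    return f"peepshow payload with {len(payload)} top-level keys: {keys}"


def main(stream=sys.stdin):
    """Entry point for a real sink: read stdin once, report on stderr."""
    print(describe(parse_payload(stream.read())), file=sys.stderr)
    return 0
```

To use it as a real sink, save it as `my_sink.py`, add `sys.exit(main())` as the last line, and register it with `peepshow sinks add-cmd 'python3 my_sink.py'`.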