Where peepshow fits

peepshow is a CLI first — anything that can spawn a child process can use it. The intended use cases break down into six shapes, from a drag-and-drop Claude Code plugin to a server-side multi-LLM pre-processor.
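
As a minimal sketch of the child-process route, the Python below builds the command line and hands it to `subprocess.run`. It uses only the `--strategy` and `--max` flags shown in the developer example on this page. The helper names (`build_peepshow_command`, `run_peepshow`) are hypothetical, and nothing is assumed about the shape of peepshow's stdout, so it is returned as raw text.

```python
import shutil
import subprocess
from typing import List, Optional


def build_peepshow_command(video: str, strategy: str = "scene", max_frames: int = 12) -> List[str]:
    """Build the argv list for a peepshow run (hypothetical helper).

    Only flags that appear elsewhere on this page are used.
    """
    return ["peepshow", video, "--strategy", strategy, "--max", str(max_frames)]


def run_peepshow(video: str, **kwargs) -> Optional[str]:
    """Run peepshow as a child process and return its raw stdout.

    Returns None when the peepshow binary is not on PATH.
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    if shutil.which("peepshow") is None:
        return None
    result = subprocess.run(
        build_peepshow_command(video, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(build_peepshow_command("./bug-repro.mov"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.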

Who it's for

LLMs can read images. Your footage is a sequence. Whatever's in the video — a bug, a break-in, a lecture, an exploit — peepshow turns it into still frames an LLM already knows how to reason about. Five audiences shape the defaults:

Developers

QA + dev video repros

Screen-recording a flicker, a designer sending a 12-second Loom, a user uploading the .mov with the frame that breaks everything. peepshow turns the clip into scene-aware stills so the model sees the bug frame-by-frame.

peepshow ./bug-repro.mov --strategy scene --max 12

CCTV & surveillance

Hours of footage → minutes of signal

A night of camera footage is hours of nothing followed by twelve seconds that matter. Scene detection flags motion, perceptual-hash dedup drops the static near-duplicates, and the SQLite sink archives the timeline.
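
To illustrate the dedup idea, here is a small sketch of one common perceptual hash (an average hash) compared by Hamming distance. This is not peepshow's implementation: the hash variant, the `max_distance` threshold, and all function names are assumptions chosen for illustration. Frames are tiny grayscale grids here; real pipelines usually downscale each frame to something like 8×8 first.

```python
from typing import List, Optional, Sequence

Frame = Sequence[Sequence[int]]  # small grayscale grid, values 0-255


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def dedup(frames: List[Frame], max_distance: int = 1) -> List[int]:
    """Return the indices of frames to keep.

    A frame is dropped when its hash is within max_distance bits of the
    most recently kept frame (assumed threshold, tune per camera).
    """
    kept: List[int] = []
    last_hash: Optional[int] = None
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if last_hash is None or hamming(h, last_hash) > max_distance:
            kept.append(i)
            last_hash = h
    return kept


if __name__ == "__main__":
    static = [[10, 10], [10, 12]]     # empty scene
    noisy = [[11, 10], [10, 12]]      # same scene with sensor noise
    moved = [[200, 200], [10, 10]]    # something bright entered the frame
    print(dedup([static, noisy, moved]))  # prints [0, 2]
```

The noisy frame differs from the static one by a single hash bit, so it is dropped; the frame with real change differs by three bits and is kept.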

Motion-only keyframes · searchable archive · LLM Q&A.

Researchers & students

Lectures, fieldwork, microscopy

Lecture captures, fieldwork timelapses, documentary clips, microscopy. The LLM can read a slide, a phase change, a titration colour shift — peepshow picks frames that change so notes are reasoned about, not transcribed.

Slide-by-slide · Obsidian sink · markdown emit.

Security research

CVE repros + exploit PoCs

Evidence is only useful if reviewable. peepshow extracts the frames where state changes so an analyst, a report, or an LLM can cite them directly — frame-accurate, no re-watching at 1×.

XML emit · GitHub Issues sink · severity tags.

Accessibility

Screen-reader friendly video

Video content is a wall to users on screen readers. peepshow converts the visual track into per-scene stills so an LLM can describe each moment — alt text that reflects the whole story, not a single thumbnail.

Scene alt text · webhook fan-out · deterministic.

Common patterns

peepshow + Gemini (and other native-video models)

Gemini 2.x reads video natively via the File API. So does GPT-4o on short clips, and so will most frontier models. peepshow doesn't compete with that — it sits in front of it. Native video is great for short clips where token budget is irrelevant; peepshow is the control plane for everything else.

The native multimodal API and peepshow are complementary: use Gemini direct for a 30-second clip in a single turn; use peepshow when the video is long, the same artifact needs to reach more than one model, or the frames need to live somewhere durable.
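
The rule of thumb above can be written down as a small routing function. This is only a sketch: the `Job` fields and `choose_route` are hypothetical names, and the 30-second cutoff is the example clip length from the text, not a documented limit.

```python
from dataclasses import dataclass


@dataclass
class Job:
    duration_s: float             # clip length in seconds
    target_models: int = 1        # how many models need the same artifact
    durable_frames: bool = False  # must the frames be stored somewhere?


# Assumption: 30 s is the example clip length from the text, not a hard limit.
SHORT_CLIP_S = 30.0


def choose_route(job: Job) -> str:
    """Return "native" for the model's own video API, else "peepshow"."""
    if job.duration_s > SHORT_CLIP_S:
        return "peepshow"  # the video is long
    if job.target_models > 1:
        return "peepshow"  # the same artifact must reach several models
    if job.durable_frames:
        return "peepshow"  # the frames need to live somewhere durable
    return "native"
```

For example, `choose_route(Job(duration_s=30))` picks the native API, while an hour-long recording or a short clip bound for two models goes through peepshow.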

Use it from Gemini CLI (3 steps)

peepshow already ships a Gemini CLI extension — gemini-extension.json + GEMINI.md at the repo root. No skill file needed; Gemini picks it up as a custom tool.

  1. Install peepshow.
    npm install -g peepshow
  2. Register the extension with Gemini CLI.
    git clone --depth 1 https://github.com/t0mtaylor/peepshow.git
    cd peepshow
    gemini --extension .   # or copy gemini-extension.json + GEMINI.md into your global Gemini extensions dir
  3. Ask Gemini about a video.
    user:   summarise demo.mp4
    gemini: (invokes peepshow tool → reads scene frames + transcript → answers)

The same flow works for long lectures, surveillance clips, and animated GIFs: Gemini gets a frame timeline instead of paying per-second video tokens. Every sink you have configured (71 are available) fires automatically, so the run also lands wherever you've pointed it (Notion, Slack, SQLite, S3, …). Full reference: peepshow for Gemini CLI.

Privacy & telemetry

The CLI sends an anonymous run beacon by default (version + OS family + outcome — no paths, no payload). Opt out with peepshow config set telemetry off, PEEPSHOW_TELEMETRY=0, or DO_NOT_TRACK=1. Full details in the privacy policy and docs/PRIVACY.md.
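
When peepshow is spawned from another program, the environment-variable opt-out can be applied per child process instead of globally. A minimal sketch, assuming a hypothetical helper named `telemetry_off_env`; only the `PEEPSHOW_TELEMETRY` variable named above is set.

```python
import os
from typing import Dict, Mapping, Optional


def telemetry_off_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the telemetry opt-out set.

    Uses os.environ when no base mapping is given; the input is not mutated.
    """
    env = dict(os.environ if base is None else base)
    env["PEEPSHOW_TELEMETRY"] = "0"
    return env
```

Pass the result as `env=` to `subprocess.run` when launching peepshow. Setting `DO_NOT_TRACK` to `1` in the same way is the alternative listed above.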