Claude has no native video. peepshow bridges.
Claude's vision is image-only. peepshow turns video + animated formats into the frame timeline Claude already accepts.
- Only path to video on Claude. Claude's API accepts images, not video. peepshow is the bridge — frame timeline + transcript → Claude reads it as a sequence of images.
- Drag-and-drop in Claude Code. The Claude Code plugin auto-invokes `/peepshow:slides` on `UserPromptSubmit` — drop a video into the prompt and Claude reads it without typing anything.
- Predictable image cost. Claude bills per image (~1.6K tokens per 1024×1024 frame), so N frames cost N × the per-image price. peepshow lets you choose N (the default is about 20).
- Animated GIF / APNG / WebP. Claude won't accept these as motion. peepshow flattens them into a frame sequence.
- Transcript stays text. The whisper.cpp transcript is sent as plain text input, which costs far fewer tokens per minute of footage than image frames do.
- Same artifact for every Claude model. One peepshow run feeds Opus 4.7 / Sonnet 4.6 / Haiku 4.5 / older releases — no re-extract per model.
Token-cost math (worked examples)
| Clip | Native upload | peepshow + Claude |
|---|---|---|
| 30s product demo (peepshow) | — | ~10K (6 frames at ~1.6K each + transcript) |
| 10-minute lecture (peepshow) | — | ~36K (20 scene frames + transcript) |
| 1-hour CCTV reel (peepshow) | — | ~54K (30 motion frames + sparse transcript) |
| 3-hour conference (peepshow + chunked report) | — | ~110K (60 scene frames + chaptered transcript) — fits 200K context |
Claude has no native-video baseline to compare against. All numbers are peepshow + Claude. Per-image cost estimate uses Anthropic's published vision token formula for 1024×1024 frames.
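The rows above all follow the same arithmetic: frame count times per-image cost, plus transcript tokens. A minimal sketch of that estimate, assuming the ~1.6K-tokens-per-frame figure quoted on this page. The function names, the ~4K transcript figure, and the 50K budget are illustrative assumptions, not part of peepshow:

```javascript
// Rough token estimate for a peepshow run fed to Claude.
// ASSUMPTION: ~1,600 tokens per 1024x1024 frame, the estimate quoted on this page.
const TOKENS_PER_FRAME = 1600;

// frames: number of extracted frames; transcriptTokens: caller-supplied estimate.
function estimateTokens(frames, transcriptTokens = 0) {
  return frames * TOKENS_PER_FRAME + transcriptTokens;
}

// Largest frame count that fits a token budget once the transcript is accounted for.
function maxFramesForBudget(budgetTokens, transcriptTokens = 0) {
  return Math.max(0, Math.floor((budgetTokens - transcriptTokens) / TOKENS_PER_FRAME));
}

// 10-minute lecture row: 20 scene frames plus a transcript assumed to be ~4K tokens.
console.log(estimateTokens(20, 4000)); // 36000
// Frames that fit in a hypothetical 50K budget alongside the same transcript.
console.log(maxFramesForBudget(50000, 4000)); // 28
```

Working backwards from a budget is often the more useful direction: pick the token ceiling first, then pass the resulting frame count as the frame limit.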
Install (Claude CLI / agent)
npm install -g peepshow
# Claude Code (plugin):
claude plugin marketplace add t0mtaylor/peepshow
claude plugin install peepshow@peepshow-marketplace
# Then drop a video into the prompt; the UserPromptSubmit hook fires automatically.

Full agent reference: peepshow for Claude →. The CLI itself works in any shell; the agent integration is one of many entry points.
Install (Claude API directly, no CLI)
Calling the Claude API from your own code? Run peepshow first, then feed the JSON manifest in as multimodal parts:
# 1. Extract
peepshow ./demo.mp4 --emit json > run.json
# 2. Hand the frames + transcript to Claude
# --input-type=module lets the inline script use import and top-level await
node --input-type=module -e '
import Anthropic from "@anthropic-ai/sdk";
import { readFileSync } from "node:fs";

// Manifest written by step 1: frame paths plus an optional transcript.
const run = JSON.parse(readFileSync("run.json", "utf8"));

// One text prompt, one base64 image part per frame, then the transcript as text.
const content = [
  { type: "text", text: "Summarise this clip." },
  ...run.frames.map(f => ({
    type: "image",
    source: { type: "base64", media_type: "image/jpeg", data: readFileSync(f.path).toString("base64") }
  })),
  { type: "text", text: "Transcript:\n" + (run.transcript?.text ?? "") }
];

const c = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const r = await c.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [{ role: "user", content }]
});
console.log(r.content[0].text);
'

Animated GIF / APNG / WebP: peepshow's killer move on Claude
Animated GIFs, APNGs, and animated WebPs arrive at Claude as the first frame only. peepshow normalises every animated format to a flat JPEG sequence so the motion isn't lost.
peepshow ./meme.gif # animated GIF → frame timeline
peepshow ./tutorial.apng # animated PNG → frames
peepshow ./loop.webp # animated WebP → frames

Frame strategy presets
peepshow picks scene-change frames by default. For Claude specifically, these presets are worth knowing:
- `--strategy scene --max 20`: Default. Good balance for narrative video. ~32K tokens at 1024×1024 frame size.
- `--strategy scene --max 12 --dedup perceptual`: Long static footage (CCTV / timelapse). Drops near-duplicates so Claude doesn't pay for noise.
- `--strategy fps --fps 0.5 --max 40`: Steady-motion content (sport, gameplay). Predictable cadence.
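One way to wire these presets into a script is a small lookup from content type to CLI arguments. The sketch below uses only the flags listed above; the content-type keys (`narrative`, `static`, `motion`) and the function names are hypothetical labels chosen for illustration.

```javascript
// Map a content type to the preset flags listed above.
// ASSUMPTION: the keys "narrative", "static" and "motion" are illustrative labels,
// not names peepshow itself recognises.
const PRESETS = {
  narrative: ["--strategy", "scene", "--max", "20"],
  static: ["--strategy", "scene", "--max", "12", "--dedup", "perceptual"],
  motion: ["--strategy", "fps", "--fps", "0.5", "--max", "40"],
};

// Look up a preset, rejecting anything that is not one of the keys above.
function flagsFor(kind) {
  if (!Object.hasOwn(PRESETS, kind)) throw new Error(`unknown content type: ${kind}`);
  return PRESETS[kind];
}

// Build the argument list for a peepshow invocation on the given file.
function presetArgs(kind, file) {
  return [file, ...flagsFor(kind)];
}

// Upper bound on frames (and so on image cost) implied by a preset.
function maxFrames(kind) {
  const flags = flagsFor(kind);
  return Number(flags[flags.indexOf("--max") + 1]);
}

console.log(["peepshow", ...presetArgs("static", "./cctv.mp4")].join(" "));
// peepshow ./cctv.mp4 --strategy scene --max 12 --dedup perceptual
```

Keeping the arguments as an array means they can be handed to `child_process.spawn("peepshow", args)` without any shell quoting of file names.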
All 95 sinks still fire
Same CLI = same sinks. Push frames to SQLite, embed captions into Chroma, mirror to S3, drop a thumbnail in Slack, file a GitHub issue with the offending frame attached — all from one Claude run. Browse the full sink catalogue →.
Report + LLM analysis loop
Every run also writes a self-contained report.html + manifest.json next to the frames (see the Report page). Once Claude has read the frames, pipe its analysis back into the report with `peepshow report annotate`, so whoever opens the report next sees the model's understanding without re-running the prompt.
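When the summary comes from code rather than being typed by hand, building the payload with `JSON.stringify` sidesteps shell quoting problems with apostrophes, newlines, and double quotes. A minimal sketch, assuming `peepshow report annotate` reads JSON from stdin with the `summary` and `provider` fields used on this page; the helper name is hypothetical:

```javascript
// Build the JSON payload for `peepshow report annotate`.
// ASSUMPTION: the command accepts the two fields used on this page,
// "summary" and "provider"; buildAnnotation is an illustrative helper.
function buildAnnotation(summary, provider) {
  if (typeof summary !== "string" || summary.trim() === "") {
    throw new Error("summary must be a non-empty string");
  }
  // JSON.stringify escapes quotes and newlines, so the text survives intact.
  return JSON.stringify({ summary, provider });
}

const payload = buildAnnotation("Claude's summary:\nA 30-second \"demo\".", "claude-opus-4-7");
console.log(payload);
```

The payload can then be written to the command's stdin, for example through the `input` option of `child_process.spawnSync`, instead of being interpolated into an `echo` string. The hand-written shell form: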
echo '{"summary":"<summary from Claude>","provider":"claude-opus-4-7"}' \
  | peepshow report annotate "<outputDir>"

When to skip peepshow and use Claude directly
- Source is already a sequence of stills (don't double-extract).
- Only need to know the first frame.
- Caller is a Bash agent that can run ffmpeg itself.
For everything beyond those edge cases, peepshow is the bridge: video + animated formats + transcript → Claude reads them as images + text.