Reel #V-01 ◆ Frame extraction vs File API video upload ◆ 2026-04-24

peepshow vs uploading video directly to Gemini

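Gemini 2.5 reads video natively via the File API. peepshow is a frame-extraction CLI that sits in front of the model: it turns a clip into still frames, plus an optional transcript, before anything is sent. This is an honest comparison: when is native video the right call, and when does peepshow earn its place?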

Side-by-side

	peepshow	Gemini File API native video
Setup	`npm i -g peepshow`	Built into Gemini SDK
Clip length	Unlimited (frame count doesn't grow with every second of footage)	Soft-capped: long clips fail or get truncated
Token cost per hour of video	~22K	~930K
Animated GIF / APNG / WebP	✅ extracted as motion	❌ flattened to single frame
Audio transcript	Optional — whisper.cpp local	Bundled with video upload
Audit trail	✅ frames on disk, replayable	❌ opaque internal sampling
PII residency	✅ local-only by default	Goes to Google File API
Cross-model reuse	✅ same bundle to Claude / GPT / local	❌ Gemini-only artifact
Sink fan-out	✅ 95 destinations	❌ none
Short clip ergonomics	Frames + transcript = 2 inputs	Single upload — simpler
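
The token-cost row can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes roughly 258 tokens per sampled frame and a 1 frame-per-second default for native video (verify both against the current Gemini documentation), and it ignores audio tokens. The function names and the 85-frame count are illustrative only: the table does not state how many frames peepshow extracts, so 85 is simply a hypothetical count that lands near the ~22K figure.

```python
# Rough token estimates. All constants are assumptions, not measured values.
TOKENS_PER_FRAME = 258  # assumed per-frame cost at default resolution
NATIVE_FPS = 1          # assumed default sampling rate for native video


def native_video_tokens(duration_s: float,
                        fps: float = NATIVE_FPS,
                        tokens_per_frame: int = TOKENS_PER_FRAME) -> int:
    """Estimate tokens when every sampled second of video becomes a frame."""
    return round(duration_s * fps * tokens_per_frame)


def frame_bundle_tokens(frame_count: int,
                        tokens_per_frame: int = TOKENS_PER_FRAME) -> int:
    """Estimate tokens for a fixed set of extracted frames, whatever the clip length."""
    return frame_count * tokens_per_frame


if __name__ == "__main__":
    one_hour = 3600
    print(native_video_tokens(one_hour))  # 928800, close to the ~930K in the table
    print(frame_bundle_tokens(85))        # 21930, close to ~22K (85 frames is hypothetical)
```

The point of the comparison is the shape of the two functions: the native estimate scales with duration, while the frame-bundle estimate scales only with how many frames are kept.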

Pick peepshow when…

Clip is over 1 minute.
Source is animated GIF / APNG / animated WebP.
Same artifact will reach more than one model.
PII / regulated content shouldn't leave the box.
Output needs to land in Notion / Slack / SQL / S3 alongside the LLM call.
You want to inspect which frames the model saw.

Pick Gemini File API native video when…

Clip is under 60s and audio fidelity matters (lip-sync, sport).
One-shot prototype — token cost irrelevant.
No need to persist frames anywhere.
Gemini is the only model that'll ever see this clip.
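
The two checklists above can be folded into a small routing helper. This is a minimal sketch rather than anything shipped with peepshow or the Gemini SDK: the `Clip` fields and `choose_path` are hypothetical names, and it assumes that any single peepshow criterion is enough to route a clip there, which is one reading of the lists rather than a stated rule.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical description of a clip, mirroring the checklist items."""
    duration_s: float
    animated_image: bool = False     # animated GIF / APNG / animated WebP source
    multi_model: bool = False        # same artifact will reach more than one model
    sensitive: bool = False          # PII / regulated content that shouldn't leave the box
    needs_sinks: bool = False        # output must land in Notion / Slack / SQL / S3
    needs_frame_audit: bool = False  # you want to inspect which frames the model saw


def choose_path(clip: Clip) -> str:
    """Return "peepshow" if any peepshow criterion holds, else "gemini-native"."""
    peepshow_reasons = [
        clip.duration_s > 60,
        clip.animated_image,
        clip.multi_model,
        clip.sensitive,
        clip.needs_sinks,
        clip.needs_frame_audit,
    ]
    return "peepshow" if any(peepshow_reasons) else "gemini-native"


if __name__ == "__main__":
    print(choose_path(Clip(duration_s=30)))                        # gemini-native
    print(choose_path(Clip(duration_s=600)))                       # peepshow
    print(choose_path(Clip(duration_s=20, animated_image=True)))   # peepshow
```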

Verdict

Use Gemini native video for short, one-off clips. Use peepshow for everything else — long footage, animated formats, multi-model pipelines, anything that needs to fan out to your stack.
