Side-by-side
| | peepshow | OpenAI native video (does not exist) |
|---|---|---|
| Native video input | ✅ via frames + transcript | ❌ not supported |
| Native audio input | ✅ via whisper.cpp / Whisper API | Whisper API handles audio standalone |
| Setup | `npm i -g peepshow` | n/a (would have to wait for OpenAI) |
| Animated GIF / APNG / WebP | ✅ extracted as motion | Read as static first frame |
| File reuse across calls | ✅ via OpenAI Files sink | n/a |
| Cost | N frames × per-image vision price | Hypothetical; TBD if/when shipped |
| Latency | Local extraction + 1 API call | n/a |
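
The cost row above is simple arithmetic, and a rough estimate can be sketched as follows. This is a minimal illustration, not part of peepshow: the function names are hypothetical, the fixed-rate sampling model is an assumption about how frames are chosen, and the per-image price is a parameter you would fill in from OpenAI's current pricing.

```typescript
// Hypothetical helpers for illustration; none of these names come from peepshow.

// Number of frames produced when a clip is sampled at a fixed rate.
// Assumption: uniform sampling. The real extractor may pick frames differently.
function framesForClip(durationSeconds: number, framesPerSecond: number): number {
  if (durationSeconds < 0 || framesPerSecond <= 0) {
    throw new RangeError("durationSeconds must be >= 0 and framesPerSecond must be > 0");
  }
  return Math.ceil(durationSeconds * framesPerSecond);
}

// Cost = N frames × per-image vision price, as in the table.
// pricePerImageUsd is an input, not a known constant: look up the current figure.
function estimateFrameCostUsd(frameCount: number, pricePerImageUsd: number): number {
  if (!Number.isInteger(frameCount) || frameCount < 0) {
    throw new RangeError("frameCount must be a non-negative integer");
  }
  if (pricePerImageUsd < 0) {
    throw new RangeError("pricePerImageUsd must be non-negative");
  }
  return frameCount * pricePerImageUsd;
}

// Example: a 60-second clip sampled at one frame every two seconds.
const exampleFrames = framesForClip(60, 0.5);
// The price below is a placeholder value, not a real quote.
const exampleCost = estimateFrameCostUsd(exampleFrames, 0.25);
console.log(`${exampleFrames} frames, estimated cost ${exampleCost} USD`);
```

Transcription cost, if you use the Whisper API rather than whisper.cpp, would be a separate term and is not modelled here.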
Pick peepshow when…
- You're building a GPT-powered app and need video understanding today.
- You want to reuse extracted frames across multiple GPT calls (peepshow → OpenAI Files sink → reference by file-id).
- Whisper API or whisper.cpp is your transcription path.
Pick OpenAI native video (does not exist) when…
- OpenAI ships native video (date unknown) and your clip is short enough to fit within its input limits.
Verdict
Until OpenAI ships native video, peepshow is the bridge. The OpenAI Files sink uploads frames once and then references them by file-id across many Responses calls, which gives you a RAG-style workflow for video.
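
To make the upload-once, reference-many idea concrete, here is a minimal sketch of building Responses request bodies that point at already-uploaded frames. It makes no network calls. The `buildFrameRequest` name, the model string, and the file-ids are all hypothetical, and the content item shapes (`input_text`, `input_image` with `file_id`) reflect my understanding of the Responses API, so check them against the current OpenAI reference before relying on them.

```typescript
// Content items for a Responses API user message (shape assumed; verify against the API reference).
type InputContent =
  | { type: "input_text"; text: string }
  | { type: "input_image"; file_id: string };

interface ResponsesRequest {
  model: string;
  input: Array<{ role: "user"; content: InputContent[] }>;
}

// Hypothetical helper: builds a request body that references frames by file-id
// instead of re-sending the image bytes on every call.
function buildFrameRequest(
  model: string,
  question: string,
  frameFileIds: string[],
): ResponsesRequest {
  if (frameFileIds.length === 0) {
    throw new Error("at least one frame file-id is required");
  }
  const frames = frameFileIds.map(
    (id): InputContent => ({ type: "input_image", file_id: id }),
  );
  return {
    model,
    input: [
      { role: "user", content: [{ type: "input_text", text: question }, ...frames] },
    ],
  };
}

// Placeholder file-ids, standing in for whatever the upload step returned.
const uploadedFrameIds = ["file-frame-001", "file-frame-002"];

// Two different questions reuse the same uploaded frames.
const summaryRequest = buildFrameRequest("your-vision-model", "Summarize this clip.", uploadedFrameIds);
const detailRequest = buildFrameRequest("your-vision-model", "What text appears on screen?", uploadedFrameIds);
console.log(JSON.stringify(summaryRequest, null, 2));
console.log(detailRequest.input[0].content.length);
```

Each resulting object would be passed as the body of a Responses call. Only the question changes between calls, so the frames are uploaded once no matter how many questions you ask.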