Optimised extraction
Perceptual dedup · duration-aware GPU pick · measured on the hero reels
Two things keep peepshow fast and its output lean: an 8×8 perceptual-hash post-pass that drops near-identical frames before they hit your model, and a duration-aware --gpu auto heuristic that skips hardware decode when init overhead would dominate. Both are on by default. The numbers below are real peepshow runs against the videos this site already uses for demos.
Frame dedup, measured
Less filler. Same story.
--dedup on (default) hashes every emitted frame as an 8×8 dHash and drops any frame within hamming distance 5 of the previous keeper. Tight enough to catch lighting flicker and slideshow repeats, loose enough that high-motion content keeps every distinct frame. Numbers are seven real runs — same input, dedup off vs dedup on.
- UI demo28s · 720p · talking head −44%
- Earth-at-night12s · timelapse · h264 −67%
- Earth-at-night12s · webm −67%
- Earth-at-night12s · mov −50%
- Big Buck Bunny12s · animation −17%
- Sintel12s · animation −14%
- Jellyfish10s · high motion 0%
Threshold tuned so high-motion content (jellyfish) keeps every distinct frame — zero false drops — while near-static timelapses collapse 67% of the redundancy. Tweak with --dedup-distance N or disable with --no-dedup. The manifest carries extraction.framesDeduped + extraction.dedupDistance so reports and sinks see exactly what was dropped.
Motion signal (free byproduct)
While dedup is hashing every frame anyway, peepshow computes the average pairwise hamming distance across the kept frames and ships it on the manifest as extraction.motionSignalAvg + extraction.motionSignalLevel (low <6 · medium <15 · high 15+). LLM agents reading the manifest can adapt their narration; reports show "high motion · zero dups" instead of just "0 dropped". Below is what the signal actually reads on every video this site uses.
| video | duration | dedup dropped | motionSignalAvg | level |
|---|---|---|---|---|
| UI demo (homepage) | 36 s | 8 | 15.7 | high |
| claude-code demo | 34 s | 6 | 17.1 | high |
| cursor demo | 34 s | 6 | 17.1 | high |
| cline demo | 34 s | 6 | 16.8 | high |
| codex demo | 34 s | 6 | 16.7 | high |
| gemini demo | 34 s | 6 | 17.1 | high |
| windsurf demo | 34 s | 6 | 16.9 | high |
| hub demo | 22 s | 2 | 14.0 | medium |
| report-outro | 8 s | 1 | 13.5 | medium |
| big-buck-bunny | 12 s | 1 | 23.0 | high |
| sintel | 12 s | 1 | 21.2 | high |
| jellyfish | 10 s | 0 | 10.6 | medium |
| earth-at-night (h264) | 12 s | 4 | 6.0 | medium |
| earth-at-night (mov) | 12 s | 3 | 6.0 | medium |
| earth-at-night (webm) | 12 s | 4 | 6.0 | medium |
Two callouts. (1) The agent demos are screen recordings of terminal + editor windows — the cuts between panes register as high motion, which is why the signal lands at ~17. (2) Jellyfish has continuous gentle wave motion but each frame is gradual — the signal reads medium, not high, which is the right call: gradual change is the territory where dedup correctly keeps every frame without claiming "rapid action."
Adaptive density (high-motion answer)
When dedup keeps everything (motion = high, framesDeduped = 0) and there's still budget under --max, peepshow re-extracts at higher density. Same input, second pass at ~80% of --max, guarded so it can never recurse. The result: high-motion clips get proportionally more frames inside the same budget.
# big-buck-bunny.mp4 · --max 20 · --dedup-distance 0 (forces dedup to keep all → adaptive fires)
$ peepshow big-buck-bunny.mp4 --no-adaptive
framesEmitted = 6 framesDeduped = 0 motionSignalLevel = high
$ peepshow big-buck-bunny.mp4
framesEmitted = 14 framesDeduped = 2 motionSignalLevel = high
^^ ↑ second pass hit dedup again, kept 14 distinct
└── 2.3× density on the same clip without raising --maxAdaptive only fires for fps strategy (scene mode already optimised the spacing) and only when the conditions in shouldRunAdaptiveDensity match — fully covered by 8 unit tests in tests/dedup.test.ts. Disable with --no-adaptive when you need a deterministic fps in every run.
GPU smart-pick, measured
Skip the hwaccel tax.
Forcing hardware decode on a short H.264 clip costs more than it saves: videotoolbox initialisation plus GPU↔CPU memory copies dominate the wall clock when there aren't enough frames to amortise them. --gpu auto notices and picks CPU. The chart shows time saved by the smart-pick versus a naïve always-GPU default — same clips, same output, ~3× faster.
- claude-code demo34 s · 720p · h264 3.0×
- cursor demo34 s · 720p · h264 3.6×
- codex demo34 s · 720p · h264 2.7×
- subs10 s · 720p · h264 3.1×
- report-outro8 s · 720p · h264 2.7×
--gpu videotoolbox) smart-pick (default · --gpu auto) measured: short H.264 clips · ffmpeg 7.xNumbers above are short H.264 screen recordings. The smart-pick falls back to software decode whenever its duration + resolution heuristic thinks GPU init won't amortise. Roles reverse on hardware where the GPU has a real advantage — see the table below.
When GPU actually wins
Hardware decoders are not codec-agnostic, and not every platform's ffmpeg path has the same overhead. The smart-pick covers the safe middle ground; pin --gpu <backend> when you know your machine + codec live in a column where GPU pays back.
| Platform · backend | H.264 / AVC | HEVC / H.265 | AV1 |
|---|---|---|---|
macOS · videotoolbox (Apple Silicon) | Often slower than CPU through ffmpeg — frame-by-frame extract path adds copy overhead. Use --no-gpu. | Wins on long clips. | Wins on M3+. |
Linux · vaapi (Intel / AMD) | Wins on 1080p+ clips >15 s. | Wins big — main use case. | Wins on Intel Arc / AMD RDNA 3+. |
Linux / Windows · cuda (NVIDIA) | Wins on most clips >5 s — NVDEC is fast. | Wins big. | Wins on RTX 30+. |
Windows · d3d11va | Wins on 1080p+ >15 s. | Wins. | Wins on RTX 30+ / Arc. |
On Linux + NVIDIA (the standard "throw 4 hours of CCTV at it" workload) --gpu cuda typically halves wall-clock time on H.264 and does dramatically better on HEVC. We don't have a Linux + NVIDIA bench in the repo, so we don't claim a number on this page — pin the backend and trust your own extraction.elapsedMs. The smart-pick is a safe cross-platform default, not the optimum for every machine.
Configuration cheat-sheet
# Dedup
--dedup on|auto|off (default on; auto = fps-fallback only)
--dedup-distance 0..64 (default 5; lower = stricter)
--no-dedup (alias for --dedup off)
# Adaptive density (high-motion answer)
--adaptive on|off (default on; second pass when motion=high
+ dedup dropped 0 + framesEmitted < max*0.6)
--no-adaptive (alias for --adaptive off)
# GPU
--gpu auto|off|videotoolbox|cuda|qsv|vaapi|amf|d3d11va (default auto)
--no-gpu (alias for --gpu off)
# Env overrides
PEEPSHOW_GPU_MIN_SECONDS=30 (clips shorter than this skip hwaccel)
PEEPSHOW_GPU_MIN_HEIGHT=1080 (clips below this height skip hwaccel
when in the medium-duration band)
PEEPSHOW_NO_HINTS=1 (silence motion + perf hints on stderr)