Optimised extraction

Perceptual dedup · duration-aware GPU pick · measured on the hero reels

Two things keep peepshow fast and its output lean: an 8×8 perceptual-hash post-pass that drops near-identical frames before they hit your model, and a duration-aware --gpu auto heuristic that skips hardware decode when init overhead would dominate. Both are on by default. The numbers below are real peepshow runs against the videos this site already uses for demos.

Frame dedup, measured

Less filler. Same story.

--dedup on (default) hashes every emitted frame as an 8×8 dHash and drops any frame within hamming distance 5 of the previous keeper. Tight enough to catch lighting flicker and slideshow repeats, loose enough that high-motion content keeps every distinct frame. Numbers are seven real runs — same input, dedup off vs dedup on.

UI demo28s · 720p · talking head 18 10 −44%
Earth-at-night12s · timelapse · h264 6 2 −67%
Earth-at-night12s · webm 6 2 −67%
Earth-at-night12s · mov 6 3 −50%
Big Buck Bunny12s · animation 6 5 −17%
Sintel12s · animation 7 6 −14%
Jellyfish10s · high motion 6 6 0%

raw frames (dedup off) kept after dedup (default) cost: ~20–150 ms · single ffmpeg pass

Threshold tuned so high-motion content (jellyfish) keeps every distinct frame — zero false drops — while near-static timelapses collapse 67% of the redundancy. Tweak with --dedup-distance N or disable with --no-dedup. The manifest carries extraction.framesDeduped + extraction.dedupDistance so reports and sinks see exactly what was dropped.

Motion signal (free byproduct)

While dedup is hashing every frame anyway, peepshow computes the average pairwise hamming distance across the kept frames and ships it on the manifest as extraction.motionSignalAvg + extraction.motionSignalLevel (low <6 · medium <15 · high 15+). LLM agents reading the manifest can adapt their narration; reports show "high motion · zero dups" instead of just "0 dropped". Below is what the signal actually reads on every video this site uses.

video	duration	dedup dropped	motionSignalAvg	level
UI demo (homepage)	36 s	8	`15.7`	high
claude-code demo	34 s	6	`17.1`	high
cursor demo	34 s	6	`17.1`	high
cline demo	34 s	6	`16.8`	high
codex demo	34 s	6	`16.7`	high
gemini demo	34 s	6	`17.1`	high
windsurf demo	34 s	6	`16.9`	high
hub demo	22 s	2	`14.0`	medium
report-outro	8 s	1	`13.5`	medium
big-buck-bunny	12 s	1	`23.0`	high
sintel	12 s	1	`21.2`	high
jellyfish	10 s	0	`10.6`	medium
earth-at-night (h264)	12 s	4	`6.0`	medium
earth-at-night (mov)	12 s	3	`6.0`	medium
earth-at-night (webm)	12 s	4	`6.0`	medium

Two callouts. (1) The agent demos are screen recordings of terminal + editor windows — the cuts between panes register as high motion, which is why the signal lands at ~17. (2) Jellyfish has continuous gentle wave motion but each frame is gradual — the signal reads medium, not high, which is the right call: gradual change is the territory where dedup correctly keeps every frame without claiming "rapid action."

Adaptive density (high-motion answer)

When dedup keeps everything (motion = high, framesDeduped = 0) and there's still budget under --max, peepshow re-extracts at higher density. Same input, second pass at ~80% of --max, guarded so it can never recurse. The result: high-motion clips get proportionally more frames inside the same budget.

# big-buck-bunny.mp4 · --max 20 · --dedup-distance 0 (forces dedup to keep all → adaptive fires)
$ peepshow big-buck-bunny.mp4 --no-adaptive
  framesEmitted = 6   framesDeduped = 0   motionSignalLevel = high

$ peepshow big-buck-bunny.mp4
  framesEmitted = 14  framesDeduped = 2   motionSignalLevel = high
                  ^^                          ↑ second pass hit dedup again, kept 14 distinct
                  └── 2.3× density on the same clip without raising --max

Adaptive only fires for fps strategy (scene mode already optimised the spacing) and only when the conditions in shouldRunAdaptiveDensity match — fully covered by 8 unit tests in tests/dedup.test.ts. Disable with --no-adaptive when you need a deterministic fps in every run.

GPU smart-pick, measured

Skip the hwaccel tax.

Forcing hardware decode on a short H.264 clip costs more than it saves: videotoolbox initialisation plus GPU↔CPU memory copies dominate the wall clock when there aren't enough frames to amortise them. --gpu auto notices and picks CPU. The chart shows time saved by the smart-pick versus a naïve always-GPU default — same clips, same output, ~3× faster.

claude-code demo34 s · 720p · h264 2234 ms forced GPU 739 ms auto 3.0×
cursor demo34 s · 720p · h264 2397 ms forced GPU 655 ms auto 3.6×
codex demo34 s · 720p · h264 2213 ms forced GPU 823 ms auto 2.7×
subs10 s · 720p · h264 1054 ms forced GPU 340 ms auto 3.1×
report-outro8 s · 720p · h264 749 ms forced GPU 282 ms auto 2.7×

always GPU (naïve default · --gpu videotoolbox) smart-pick (default · --gpu auto) measured: short H.264 clips · ffmpeg 7.x

Numbers above are short H.264 screen recordings. The smart-pick falls back to software decode whenever its duration + resolution heuristic thinks GPU init won't amortise. Roles reverse on hardware where the GPU has a real advantage — see the table below.

When GPU actually wins

Hardware decoders are not codec-agnostic, and not every platform's ffmpeg path has the same overhead. The smart-pick covers the safe middle ground; pin --gpu <backend> when you know your machine + codec live in a column where GPU pays back.

Platform · backend	H.264 / AVC	HEVC / H.265	AV1
macOS · `videotoolbox` (Apple Silicon)	Often slower than CPU through ffmpeg — frame-by-frame extract path adds copy overhead. Use `--no-gpu`.	Wins on long clips.	Wins on M3+.
Linux · `vaapi` (Intel / AMD)	Wins on 1080p+ clips >15 s.	Wins big — main use case.	Wins on Intel Arc / AMD RDNA 3+.
Linux / Windows · `cuda` (NVIDIA)	Wins on most clips >5 s — NVDEC is fast.	Wins big.	Wins on RTX 30+.
Windows · `d3d11va`	Wins on 1080p+ >15 s.	Wins.	Wins on RTX 30+ / Arc.

On Linux + NVIDIA (the standard "throw 4 hours of CCTV at it" workload) --gpu cuda typically halves wall-clock time on H.264 and does dramatically better on HEVC. We don't have a Linux + NVIDIA bench in the repo, so we don't claim a number on this page — pin the backend and trust your own extraction.elapsedMs. The smart-pick is a safe cross-platform default, not the optimum for every machine.

Configuration cheat-sheet

# Dedup
--dedup on|auto|off          (default on; auto = fps-fallback only)
--dedup-distance 0..64       (default 5; lower = stricter)
--no-dedup                   (alias for --dedup off)

# Adaptive density (high-motion answer)
--adaptive on|off            (default on; second pass when motion=high
                              + dedup dropped 0 + framesEmitted < max*0.6)
--no-adaptive                (alias for --adaptive off)

# GPU
--gpu auto|off|videotoolbox|cuda|qsv|vaapi|amf|d3d11va  (default auto)
--no-gpu                     (alias for --gpu off)

# Env overrides
PEEPSHOW_GPU_MIN_SECONDS=30  (clips shorter than this skip hwaccel)
PEEPSHOW_GPU_MIN_HEIGHT=1080 (clips below this height skip hwaccel
                              when in the medium-duration band)
PEEPSHOW_NO_HINTS=1          (silence motion + perf hints on stderr)