Steps
- Install peepshow + an embedding CLI
Recommend `pip install open-clip-torch` + a thin CLI wrapper, OR `clip-cli` / `siglip-cli` from PyPI.
    npm install -g peepshow
    pip install open-clip-torch
    # plus a clip-cli script of your choice
- Run with --embed-frames
Auto-detects `embed-cli`, then `clip-cli`, then `siglip-cli` on PATH.
    peepshow ./demo.mp4 --embed-frames
- Pick a model
Any `open_clip` model name or a HuggingFace SigLIP id.
    peepshow ./demo.mp4 --embed-frames --embed-model siglip-base-patch16-naflex
- Push to a vector sink
Vectors land in the sink directly.
    peepshow ./demo.mp4 --embed-frames --sink chroma
    # query later:
    chroma-client query --collection peepshow --text 'person walking dog'
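The detection order described above can be reproduced to check which embedding CLI would be picked up on a given machine. The sketch below is an illustration in Python, not peepshow's own code: `find_embed_cli` and its `which` parameter are hypothetical names, and only the three candidate names come from the text.

```python
import shutil
from typing import Callable, Optional, Sequence

# Candidate names, in the order the text says they are tried.
CANDIDATES = ("embed-cli", "clip-cli", "siglip-cli")


def find_embed_cli(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    found = find_embed_cli()
    print(found or "no embedding CLI found on PATH")
```

Passing `which` as a parameter keeps the lookup easy to exercise without touching the real PATH.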
Why it works
Vector sinks need one embedding vector per frame. Without `--embed-frames`, sinks like Chroma have to compute embeddings themselves (per-frame, at query time, or via a second batch pass). With `--embed-frames`, peepshow computes them once at extraction time, and the vectors flow through the same JSON manifest to the sink. The top-level `EmbeddingInfo` reports the model, the vector dimension, the number of frames embedded, and the total vector bytes.
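To make the `EmbeddingInfo` summary concrete, here is a rough sketch of how those four numbers relate to the per-frame vectors. The field names, the `summarize` helper, and the float32 storage assumption (4 bytes per component) are illustrative guesses, not peepshow's documented schema.

```python
from dataclasses import dataclass
from typing import Sequence

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


@dataclass
class EmbeddingInfo:
    # Hypothetical field names mirroring "model, dim, frames embedded,
    # total vector bytes" from the text above.
    model: str
    dim: int
    frames_embedded: int
    total_vector_bytes: int


def summarize(model: str, vectors: Sequence[Sequence[float]]) -> EmbeddingInfo:
    """Build the summary, rejecting vectors of inconsistent length."""
    if not vectors:
        return EmbeddingInfo(model, 0, 0, 0)
    dim = len(vectors[0])
    for i, vec in enumerate(vectors):
        if len(vec) != dim:
            raise ValueError(f"frame {i} has dim {len(vec)}, expected {dim}")
    total = len(vectors) * dim * BYTES_PER_FLOAT32
    return EmbeddingInfo(model, dim, len(vectors), total)
```

Under this assumption, 1,000 frames embedded with a 512-dimensional model come to 1,000 × 512 × 4 = 2,048,000 bytes.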
When it helps
- Long-term video archives — semantic search over years of footage.
- Multimodal RAG — pair frame embeddings with transcript text in the same vector DB.
- Surveillance / security forensics — 'find every frame that looks like this query image' across thousands of clips.
- Creative / asset workflows — search a footage library by description ('sunset over water', 'busy street').
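All four use cases come down to nearest-neighbour search over the stored frame vectors. A vector sink normally does this for you; the pure-Python sketch below only illustrates the idea using cosine similarity. `top_k_frames` and the frame ids are made-up names for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_frames(
    query: Sequence[float],
    frames: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (frame_id, score) pairs most similar to the query vector."""
    scored = [(frame_id, cosine(query, vec)) for frame_id, vec in frames.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The query vector would come from the same model as the frames: a text embedding for description search, or an image embedding for "looks like this" search.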
Pitfalls
- Embedding CLIs aren't standardised: `embed-cli`, `clip-cli`, and `siglip-cli` are placeholder names. Wire up your own thin wrapper around `open_clip` (the `open-clip-torch` package) or HuggingFace `transformers`.
- Vector dim varies by model (CLIP ViT-B/32 = 512, ViT-L/14 = 768, SigLIP-Large = 1024). Sinks may need schema updates per model.
- Every frame needs its own model forward pass. That is best run on a GPU; on CPU, long videos add minutes.
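One way to guard against the per-model vector dimensions listed above is to check vectors against the expected size before writing to a sink. This is a minimal sketch: the `EXPECTED_DIMS` keys are informal labels taken from that bullet rather than exact model identifiers, and `check_dims` is a hypothetical helper, not part of peepshow or any sink's API.

```python
from typing import Dict, Sequence

# Dimensions quoted in the pitfalls list; keys are informal labels, not real model ids.
EXPECTED_DIMS: Dict[str, int] = {
    "clip-vit-b-32": 512,
    "clip-vit-l-14": 768,
    "siglip-large": 1024,
}


def check_dims(model: str, vectors: Sequence[Sequence[float]]) -> int:
    """Return the expected dim; raise if the model is unknown or any vector differs."""
    try:
        expected = EXPECTED_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; add its dimension first") from None
    for i, vec in enumerate(vectors):
        if len(vec) != expected:
            raise ValueError(
                f"vector {i}: got dim {len(vec)}, expected {expected} for {model}"
            )
    return expected
```

Failing early like this is cheaper than discovering the mismatch when the sink rejects a write, or after a collection has ended up with mixed dimensions.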