Mistral has no native video. peepshow bridges the gap.
Mistral's vision is image-only. peepshow turns video + animated formats into the frame timeline Mistral already accepts.
- Only path to video on Pixtral / Mistral Large 3. Mistral's vision API accepts image_url parts. peepshow turns video into that shape.
- EU data residency. Mistral hosts in Europe. Pair with peepshow's local-first extraction to keep PII off non-EU clouds.
- Open-weights variants. Pixtral 12B has open weights — run it on-prem with vLLM or llama.cpp. peepshow's pipeline doesn't change.
- Animated GIF / APNG / WebP. Mistral vision treats these as a static image. peepshow extracts the full motion.
- Predictable token cost. Cost is N × the per-image vision price, and peepshow lets you pick N (defaults to ~20).
- Same bundle on Le Chat / API / open weights. Extract once, feed any Mistral endpoint.
Token-cost math (worked examples)
| Clip | Native upload | peepshow + Mistral (approx. tokens) |
|---|---|---|
| 30s product demo (peepshow) | — | ~3K (6 frames + transcript) |
| 10-minute lecture (peepshow) | — | ~7K (20 scene frames + transcript) |
| 1-hour CCTV reel (peepshow) | — | ~12K (30 motion frames + sparse transcript) |
| 3-hour conference (peepshow + chunked) | — | ~32K (60 scene frames + chaptered transcript) |
Pixtral's per-image cost varies by size class. Pixtral Large uses ~1100 tokens for a 1024×1024 frame; Pixtral 12B is cheaper.
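The N × per-image arithmetic can be sketched as a small estimator. This is a rough illustration, not peepshow's or Mistral's own accounting: the function names are hypothetical, the per-image figure is whatever you pass in (the example uses the ~1100 tokens quoted above for a 1024×1024 frame on Pixtral Large), and the 4-characters-per-token transcript heuristic is an assumption.

```javascript
// Rough token estimate for one peepshow + Mistral request.
// Assumption: cost = frames × a flat per-image token figure + transcript text.
// charsPerToken = 4 is a common rule of thumb, not a Mistral tokenizer fact.
function estimateTokens({ frames, tokensPerImage, transcriptChars = 0, charsPerToken = 4 }) {
  if (!Number.isInteger(frames) || frames < 0) {
    throw new RangeError("frames must be a non-negative integer");
  }
  const imageTokens = frames * tokensPerImage;
  const transcriptTokens = Math.ceil(transcriptChars / charsPerToken);
  return { imageTokens, transcriptTokens, total: imageTokens + transcriptTokens };
}

// Inverse question: how many frames fit in a token budget
// once the transcript has been accounted for?
function maxFramesForBudget({ budgetTokens, tokensPerImage, transcriptChars = 0, charsPerToken = 4 }) {
  const transcriptTokens = Math.ceil(transcriptChars / charsPerToken);
  return Math.max(0, Math.floor((budgetTokens - transcriptTokens) / tokensPerImage));
}

const est = estimateTokens({ frames: 20, tokensPerImage: 1100, transcriptChars: 8000 });
console.log(est);
// → { imageTokens: 22000, transcriptTokens: 2000, total: 24000 }
```

Smaller frames cost fewer tokens per image, so pass a per-image figure that matches the resize you actually use.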
Install (CLI)
npm install -g peepshow
# Set Mistral credentials:
export MISTRAL_API_KEY=...
# Run extraction:
peepshow ./demo.mp4 --emit json > run.json
Install (Mistral API directly, no CLI)
Calling the Mistral API from your own code? Run peepshow first, then feed the JSON manifest in as multimodal parts:
# Hand frames + transcript to Pixtral
# Requires the SDK in the current project: npm install @mistralai/mistralai
# --input-type=module lets node -e accept import statements and top-level await
node --input-type=module -e '
import { Mistral } from "@mistralai/mistralai";
import { readFileSync } from "node:fs";
// Load the manifest written by: peepshow ./demo.mp4 --emit json > run.json
const run = JSON.parse(readFileSync("run.json", "utf8"));
// One text prompt, then each frame as a base64 data URL, then the transcript
const content = [
  { type: "text", text: "Summarise this clip." },
  ...run.frames.map(f => ({
    type: "image_url",
    imageUrl: "data:image/jpeg;base64," + readFileSync(f.path).toString("base64")
  })),
  { type: "text", text: "Transcript:\n" + (run.transcript?.text ?? "") }
];
const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });
const r = await client.chat.complete({
  model: "pixtral-large-latest",
  messages: [{ role: "user", content }]
});
console.log(r.choices[0].message.content);
'
Animated GIF / APNG / WebP: peepshow's killer move on Mistral
Pixtral and Mistral Large 3 read animated images as a single frame. peepshow extracts the full motion sequence so the model sees what's actually happening.
peepshow ./meme.gif # animated GIF → frame timeline
peepshow ./tutorial.apng # animated PNG → frames
peepshow ./loop.webp # animated WebP → frames
Frame strategy presets
peepshow picks scene-change frames by default. For Mistral specifically, these presets are worth knowing:
- --strategy scene --max 16: Default for Pixtral Large; keeps cost lean.
- --strategy scene --max 30 --resize 1024: Pixtral 12B (cheaper per image); 30 frames at 1024px fits budget.
- --strategy fps --fps 0.5 --max 24: Steady-motion content; predictable cadence.
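If you drive peepshow from a script, the three presets above could be kept in a small lookup. This is only a sketch: the preset keys and the `peepshowArgs` helper are hypothetical names, the flags are copied from this page, and combining them with `--emit json` is assumed to work the same way as in the install example.

```javascript
// Preset flags copied from the list above; the keys are illustrative labels only.
const PRESETS = {
  "pixtral-large": ["--strategy", "scene", "--max", "16"],
  "pixtral-12b": ["--strategy", "scene", "--max", "30", "--resize", "1024"],
  "steady-motion": ["--strategy", "fps", "--fps", "0.5", "--max", "24"],
};

// Build the argument vector for a peepshow invocation.
// Assumption: presets can be combined with --emit json.
function peepshowArgs(input, preset) {
  if (!Object.hasOwn(PRESETS, preset)) {
    throw new Error(`Unknown preset: ${preset}`);
  }
  return [input, ...PRESETS[preset], "--emit", "json"];
}

console.log(peepshowArgs("./demo.mp4", "pixtral-12b").join(" "));
// → ./demo.mp4 --strategy scene --max 30 --resize 1024 --emit json
```

Passing the array to `execFileSync("peepshow", args)` from `node:child_process` avoids shell quoting issues with file names that contain spaces.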
All 95 sinks still fire
Same CLI = same sinks. Push frames to SQLite, embed captions into Chroma, mirror to S3, drop a thumbnail in Slack, file a GitHub issue with the offending frame attached, all from one Mistral run. Browse the full sink catalogue.
Report + LLM analysis loop
Every run also writes a self-contained report.html + manifest.json next to the frames (see the Report page). When Mistral consumes the frames, the analysis flows back into the report — whoever opens it next sees the model's understanding without re-running the prompt.
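A model-written summary often contains quotes or apostrophes that break a hand-typed shell string. A minimal sketch of building the annotation payload in Node instead, assuming the annotate command reads a JSON object with `summary` and `provider` fields from stdin (the shape used in the command on this page); `buildAnnotation` is a hypothetical helper name.

```javascript
// Serialise the annotation so quotes and newlines in the summary are escaped.
// Assumption: the annotate step expects { summary, provider } as JSON on stdin.
function buildAnnotation(summary, provider) {
  if (typeof summary !== "string" || summary.length === 0) {
    throw new TypeError("summary must be a non-empty string");
  }
  if (typeof provider !== "string" || provider.length === 0) {
    throw new TypeError("provider must be a non-empty string");
  }
  return JSON.stringify({ summary, provider });
}

const payload = buildAnnotation(
  "Mistral's take: a \"demo\" clip\nwith two scenes",
  "pixtral-large-latest"
);
console.log(payload);
```

Saved as a script, its standard output can be piped into `peepshow report annotate "<outputDir>"` in place of the echo.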
echo '{"summary":"<Mistral summary>","provider":"pixtral-large-latest"}' \
| peepshow report annotate "<outputDir>"
When to skip peepshow + use Mistral direct
- Source is already a single image.
- Running Mistral Small or 7B text-only — no vision capability.
- Need EU-only inference with no extraction step (rare).
For everything beyond those edge cases, peepshow is the bridge: video + animated formats + transcript → Mistral reads them as images + text.