Steps
- Install peepshow + WhisperX (local) OR set a cloud API key
For local diarisation, run `pip install whisperx`. Cloud paths use your existing API keys.
npm install -g peepshow
pip install whisperx
export HF_TOKEN=hf_... # required for pyannote model download
- Run with --diarise
The WhisperX path is used if `whisperx` is on PATH; cloud providers honour the `--transcribe` selection.
peepshow ./meeting.mp4 --diarise
- Pick a cloud provider explicitly
This passes through to Deepgram's `?diarize=true` or AssemblyAI's `speaker_labels: true`.
peepshow ./meeting.mp4 --diarise --transcribe deepgram
peepshow ./meeting.mp4 --diarise --transcribe assemblyai
- Cap speaker count
Helps the diariser converge faster.
peepshow ./call.mp4 --diarise --max-speakers 3
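Before the first run, it can help to check which path the current environment is set up for. The following is a minimal sketch in POSIX shell, not part of peepshow: the function name `diarise_path` and its output strings are hypothetical. It only tests the two conditions mentioned in the steps (whether `whisperx` is on PATH and whether `HF_TOKEN` is set) and does not claim to reproduce peepshow's own selection logic.

```shell
# Hypothetical pre-flight helper; not part of peepshow.
# Prints which diarisation path the current environment is set up for.
diarise_path() {
  if ! command -v whisperx >/dev/null 2>&1; then
    # whisperx is not on PATH, so the local path is unavailable
    echo "cloud"
  elif [ -z "${HF_TOKEN:-}" ]; then
    # whisperx is installed, but HF_TOKEN is needed to download
    # the pyannote models on first run
    echo "local-missing-token"
  else
    echo "local"
  fi
}

# Turn the result into a short hint for the person running the command.
case "$(diarise_path)" in
  local)               echo "WhisperX found and HF_TOKEN set" ;;
  local-missing-token) echo "WhisperX found; export HF_TOKEN before the first run" ;;
  cloud)               echo "WhisperX not found; pass --transcribe with a cloud provider" ;;
esac
```

If the models are already cached from an earlier run, the `local-missing-token` hint may be harmless, since the token is only needed for the download.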
Why it works
Diarisation is the missing piece of meeting and interview transcription. WhisperX (local) uses pyannote.audio under the hood: it needs a Hugging Face token to download the pyannote models, then runs entirely offline. Deepgram and AssemblyAI both provide server-side diarisation behind a single flag. peepshow's `--diarise` routes through the existing transcription provider chain, so you opt in once and the right path fires based on what's available.
When it helps
- Meetings / standups / interviews — know who said what without re-listening.
- Podcast post-production — auto-tag segments by host.
- Court / legal interviews where speaker attribution matters.
- Customer-support call review — separate agent vs caller.
Pitfalls
- WhisperX needs a Hugging Face token (`HF_TOKEN`) to download pyannote models on first run.
- Providers don't agree on speaker ID formats: WhisperX uses `SPEAKER_00`, Deepgram uses numeric IDs, and AssemblyAI uses `A`/`B`/`C`. peepshow keeps the provider's native scheme, so labels are not interchangeable across providers.
- Diarisation accuracy drops on overlapping speech and very short utterances.
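If downstream tooling needs one label format regardless of provider, the three schemes listed above can be mapped to a common index. The sketch below is a hypothetical POSIX shell helper, not a peepshow feature. The name `speaker_index` is made up for illustration, and it assumes Deepgram's numeric IDs start at 0 and AssemblyAI's labels are single uppercase letters. It unifies the format only: index 0 from one provider is not guaranteed to be the same person as index 0 from another.

```shell
# Hypothetical helper; not part of peepshow.
# Maps a provider-native speaker label to a zero-based index.
speaker_index() {
  label="$1"
  case "$label" in
    SPEAKER_[0-9]*)
      # WhisperX style: SPEAKER_00, SPEAKER_01, ...
      num="${label#SPEAKER_}"
      # Strip leading zeros, keeping at least one digit
      while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do
        num="${num#0}"
      done
      echo "$num"
      ;;
    [0-9]*)
      # Deepgram style: already numeric (assumed zero-based)
      echo "$label"
      ;;
    [A-Z])
      # AssemblyAI style: A, B, C, ... mapped to 0, 1, 2, ...
      code="$(printf '%d' "'$label")"
      echo "$((code - 65))"
      ;;
    *)
      echo "unknown speaker label: $label" >&2
      return 1
      ;;
  esac
}
```

For example, `speaker_index SPEAKER_01` and `speaker_index B` both print `1`, while an unrecognised label returns a non-zero status so a calling script can stop instead of guessing.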