Three Claude Code skills that chain together: transcribe a meeting, analyze it with full project context, then generate an audio briefing. Everything runs locally — no cloud APIs, no API keys, no recurring costs.
Transcribe any audio or video file with automatic speaker identification. Handles meetings, interviews, podcasts, voice memos: any recording with one or more speakers.
The bridge between transcription and audio. Reads your transcript, gathers full project context (CLAUDE.md, docs, git history), and produces a strategic analysis with a TTS-ready voice version.
Convert any text or markdown file to an MP3. The skill handles markdown cleanup, acronym expansion, voice selection, and MP3 encoding automatically.
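To make the markdown cleanup and acronym expansion steps concrete, here is a minimal Python sketch of that kind of preprocessing. It is not the skill's actual code: the function names, the regex rules, and the `ACRONYMS` table are all hypothetical and only illustrate the idea.

```python
import re

# Hypothetical acronym table; the skill's real list is not shown in the text.
ACRONYMS = {"API": "A P I", "TTS": "text to speech", "MP3": "M P 3"}

FENCE = "`" * 3  # three backticks, built here to keep this example readable


def clean_markdown(text: str) -> str:
    """Strip common markdown syntax so a TTS engine does not read it aloud."""
    text = re.sub(FENCE + r".*?" + FENCE, "", text, flags=re.DOTALL)  # drop fenced code
    text = re.sub(r"`([^`]*)`", r"\1", text)                # unwrap inline code
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", "", text)        # drop images
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)    # keep link text only
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]*", "", text, flags=re.MULTILINE)  # heading marks
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.MULTILINE)       # bullet marks
    # Asterisk emphasis only; underscores are left alone to protect snake_case.
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)
    return re.sub(r"\n{3,}", "\n\n", text).strip()


def expand_acronyms(text: str, table: dict = ACRONYMS) -> str:
    """Replace whole-word acronyms with a spoken form."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


def prepare_for_tts(text: str) -> str:
    """Clean markdown first, then expand acronyms in the remaining prose."""
    return expand_acronyms(clean_markdown(text))
```

Cleaning before expanding means acronyms inside link text are still expanded, while anything inside a fenced code block is dropped entirely rather than read out.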
Clone the repo and copy the skill folders into your Claude Code config directory. That's all the "installation" there is for the skills themselves.
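The copy step can be sketched in Python as below. This rests on two assumptions not stated here: that each skill is a top-level folder in the cloned repo containing a `SKILL.md` file, and that the destination is a personal skills directory such as `~/.claude/skills`. The function name `install_skills` is hypothetical.

```python
import shutil
from pathlib import Path


def install_skills(repo_dir: Path, skills_dir: Path) -> list[str]:
    """Copy every top-level folder containing a SKILL.md into skills_dir.

    Assumes one skill per folder; other folders (docs, assets) are skipped.
    """
    skills_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    for skill_md in sorted(repo_dir.glob("*/SKILL.md")):
        src = skill_md.parent
        # dirs_exist_ok lets a re-run overwrite an earlier copy (Python 3.8+).
        shutil.copytree(src, skills_dir / src.name, dirs_exist_ok=True)
        installed.append(src.name)
    return installed


# Example call (paths are assumptions, adjust to your setup):
# install_skills(Path("path/to/cloned-repo"), Path.home() / ".claude" / "skills")
```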
Kokoro is the default engine. Install the CLI with pipx, then download the model files (~350 MB). This is a one-time setup.
An MP3 encoder is required to convert the WAV output to MP3. On a Mac, installing one takes a single command.
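As an illustration of the WAV-to-MP3 step, the sketch below assumes the converter is `ffmpeg` built with the `libmp3lame` encoder; the tool is not named here, so treat that as a guess. The helper names and the `128k` default bitrate are hypothetical.

```python
import subprocess
from pathlib import Path


def mp3_command(wav: Path, bitrate: str = "128k") -> list[str]:
    """Build an ffmpeg command that encodes wav to an MP3 next to it."""
    mp3 = wav.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-y",                      # overwrite an existing output file
        "-i", str(wav),            # input WAV
        "-codec:a", "libmp3lame",  # MP3 encoder (assumed to be available)
        "-b:a", bitrate,           # constant audio bitrate
        str(mp3),
    ]


def wav_to_mp3(wav: Path, bitrate: str = "128k") -> Path:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(mp3_command(wav, bitrate), check=True)
    return wav.with_suffix(".mp3")
```

Keeping the command builder separate from the call that runs it makes the exact arguments easy to inspect or log before anything is executed.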
Open Claude Code and type the slash command. That's it. The skill handles engine selection, voice picking, format conversion, and output reporting.
Run kokoro-tts --help-voices for the full list of 50 voices.
Record a client call on your phone. Transcribe it with speaker labels. Run a strategic analysis that pulls in your full project context. Then generate an audio briefing you can listen to on the drive home.
The strategic analysis skill reads your CLAUDE.md, docs, competitor files, and git history — so the output is grounded in everything you know, not just the transcript.
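A rough Python sketch of the context-gathering idea follows. It is an illustration only, not how the skill is implemented: the function name, the `docs/` folder layout, and the 20-commit limit are assumptions, and competitor files are left out for brevity.

```python
import subprocess
from pathlib import Path


def gather_context(project: Path, max_commits: int = 20) -> dict:
    """Collect CLAUDE.md, markdown docs, and recent commit subjects."""
    claude_md = project / "CLAUDE.md"
    # Assumes docs live under docs/ as markdown files (hypothetical layout).
    docs = sorted(
        p.relative_to(project).as_posix() for p in project.glob("docs/**/*.md")
    )
    try:
        result = subprocess.run(
            ["git", "-C", str(project), "log", "--oneline", "-n", str(max_commits)],
            capture_output=True, text=True, check=False,
        )
        commits = result.stdout.splitlines()  # empty if this is not a git repo
    except FileNotFoundError:
        commits = []  # git is not installed
    return {
        "claude_md": claude_md.read_text(encoding="utf-8") if claude_md.exists() else "",
        "docs": docs,
        "recent_commits": commits,
    }
```

The returned dictionary could then be placed alongside the transcript when the analysis is written, so that the output draws on the project's own notes and history as well as the meeting itself.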
Kokoro is the default and handles 95% of use cases. But the skill also supports Orpheus (for emotional, natural-sounding speech with tags like <laugh>) and Coqui XTTS v2 (for cloning any voice from a 6-second WAV sample).
Just tell the skill which engine you want when it asks. Setup instructions for the optional engines are in the GitHub README.
| Engine | Speed | Best For |
|---|---|---|
| Kokoro | Seconds | Daily use, bulk |
| Orpheus | ~3.5x real time | Emotion, prosody |
| Coqui XTTS | ~1x real time | Voice cloning |