Sunbelt Computer Software

get-transcripts

Transcribe video files, audio files, and YouTube URLs locally using OpenAI Whisper on Apple Silicon — no cloud, no API keys, no cost per minute.

Optimised for Mac Mini M4 / MacBook Pro M-series using the MPS (Metal Performance Shaders) backend for GPU-accelerated transcription. A 30-minute video transcribes in ~8 minutes on M4 with the medium model.

Features

Local & private — audio never leaves your machine
YouTube support — paste a URL and get a transcript directly (via yt-dlp)
All Whisper models — tiny through large-v3, selectable with --model
5 output formats — .txt, .vtt, .srt, .tsv, .json generated in one pass
Reusable Python module — import src.transcribe in your own scripts
Shell script wrapper — one command, no Python required
30 tests — unit + integration, CI-friendly

Prerequisites

macOS with Apple Silicon (M1/M2/M3/M4)
micromamba for Python environment management
Homebrew for system packages
ffmpeg (installed with Homebrew in step 1 of the setup below)

Setup

# 1. Install ffmpeg
brew install ffmpeg

# 2. Clone the repo
git clone https://github.com/troyscott/get-transcripts.git
cd get-transcripts

# 3. Create the Python environment
micromamba env create -f environment.yml
micromamba activate transcribe

# 4. Pre-download the default model (~1.5 GB, one-time — cached to ~/.cache/whisper/)
python -c "import whisper; whisper.load_model('medium')"
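To check whether the one-time download has already happened, a small helper can look for the checkpoint in the cache directory. This is a minimal sketch: the function name is hypothetical, and the "<model>.pt" file name is an assumption about how cached checkpoints are named, so compare it against your own ~/.cache/whisper/ folder.

```python
from pathlib import Path

# Cache location named in step 4 of the setup.
DEFAULT_CACHE = Path.home() / ".cache" / "whisper"


def is_model_cached(model_name: str, cache_dir: Path = DEFAULT_CACHE) -> bool:
    """Return True if a checkpoint file for model_name exists in cache_dir.

    Assumes checkpoints are stored as "<model_name>.pt" (an assumption,
    not something this README states).
    """
    return (cache_dir / f"{model_name}.pt").is_file()


if __name__ == "__main__":
    for name in ("tiny", "medium", "large-v3"):
        state = "cached" if is_model_cached(name) else "not downloaded yet"
        print(f"{name}: {state}")
```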

Verify your GPU is available:

python -c "import torch; print('MPS available:', torch.backends.mps.is_available())"
# Expected: MPS available: True

Usage

Shell script (quickest)

# Local file (video or audio)
./scripts/run.sh demo.mp4

# YouTube URL
./scripts/run.sh https://youtu.be/dQw4w9WgXcQ

# Different model
./scripts/run.sh demo.mp4 --model large-v3
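The wrapper accepts either a local file or a YouTube URL as its first argument. One way to tell the two apart is sketched here; the function names are hypothetical and the repo's own detection logic may differ.

```python
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True when source looks like an http(s) URL rather than a file path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def classify_source(source: str) -> str:
    """Return "url" for web links and "file" for everything else."""
    return "url" if is_url(source) else "file"


if __name__ == "__main__":
    for item in ("demo.mp4", "https://youtu.be/dQw4w9WgXcQ"):
        print(item, "->", classify_source(item))
```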

Python CLI (more options)

micromamba activate transcribe

# Local file
python -m src.cli demo.mp4

# YouTube URL
python -m src.cli https://youtu.be/dQw4w9WgXcQ --model small

# All options
python -m src.cli demo.mp4 --model medium --device mps --output-dir ./out --language en

As a Python module

from pathlib import Path
from src.transcribe import run

result = run(
    "demo.mp4",                  # or a YouTube URL
    model_name="medium",
    device="mps",
    output_dir=Path("./output"),
)
print(result["text"])
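Because run returns a result dictionary, it is straightforward to loop over a folder of recordings. The sketch below is a hypothetical helper, not part of the repo: the suffix list is an assumption, and the transcription function is passed in as a parameter so the helper can be exercised without loading a model.

```python
from pathlib import Path
from typing import Callable, Iterable

# Assumed list of media extensions; adjust to whatever ffmpeg can read for you.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def find_media(folder: Path, suffixes: Iterable[str] = MEDIA_SUFFIXES) -> list[Path]:
    """Return the media files directly inside folder, sorted by name."""
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix.lower() in wanted
    )


def transcribe_folder(
    folder: Path,
    run_fn: Callable[..., dict],
    output_dir: Path,
    model_name: str = "medium",
    device: str = "mps",
) -> dict[str, str]:
    """Call run_fn on every media file and map file name -> transcript text."""
    texts = {}
    for media in find_media(folder):
        result = run_fn(
            str(media), model_name=model_name, device=device, output_dir=output_dir
        )
        texts[media.name] = result["text"]
    return texts


# Intended use with this project (requires the environment and a model):
#   from src.transcribe import run
#   transcribe_folder(Path("recordings"), run, Path("./output"))
```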

Available models

Model	Download size	Processing time per hour of audio (M4, MPS)	Best for
`tiny`	75 MB	~1 min/hr	Testing
`base`	145 MB	~2 min/hr	Fast drafts
`small`	460 MB	~4 min/hr	Good accuracy
`medium`	1.5 GB	~8 min/hr	Default
`large-v3`	3 GB	~16 min/hr	Best accuracy
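The speed column can be turned into a rough time estimate for a given recording. This sketch reads the figures as minutes of processing per hour of audio, which is an interpretation of the table, and the numbers themselves are approximate.

```python
# Approximate processing minutes per hour of audio, copied from the table above.
MINUTES_PER_AUDIO_HOUR = {
    "tiny": 1,
    "base": 2,
    "small": 4,
    "medium": 8,
    "large-v3": 16,
}


def estimate_minutes(audio_minutes: float, model: str = "medium") -> float:
    """Estimate processing time in minutes for a recording of audio_minutes."""
    return audio_minutes / 60 * MINUTES_PER_AUDIO_HOUR[model]


if __name__ == "__main__":
    for model in MINUTES_PER_AUDIO_HOUR:
        print(f"{model}: ~{estimate_minutes(90, model):.1f} min for a 90-minute file")
```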

Outputs are written to ./output/, one file per output format (.txt, .vtt, .srt, .tsv, .json).

Running tests

# Unit tests only (fast, no model download required)
micromamba run -n transcribe pytest tests/unit/ -v

# Full integration test (downloads tiny model ~75 MB on first run)
micromamba run -n transcribe pytest tests/ -v --run-integration

Common issues

MPS available: False — Requires Apple Silicon + macOS 12.3+. Fall back to --device cpu in scripts/run.sh (slower).
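When calling the module from your own scripts, the same fallback can be chosen at runtime. A minimal sketch, assuming a hypothetical helper name; it returns "cpu" if PyTorch is missing or MPS is unavailable.

```python
def pick_device(prefer: str = "mps") -> str:
    """Return "mps" when requested and available, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    if prefer == "mps" and torch.backends.mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Using device:", pick_device())
```

The returned value can be passed as the device argument of run, or as --device on the Python CLI.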

Hallucinated captions during silence — Trim silent sections with ffmpeg before transcribing:

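# Example for a 30:00 file: drop the first and last 5 seconds (-to is an absolute timestamp, so adjust it to your file's length)
ffmpeg -i in.mp4 -ss 00:00:05 -to 00:29:55 -c copy trimmed.mp4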
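ffmpeg's -to option takes an absolute position in the file, so trimming a fixed amount from the end requires knowing the duration first. The sketch below builds the trim command from a duration in seconds; the helper names are hypothetical, and probe_duration assumes ffprobe (shipped with ffmpeg) is on your PATH.

```python
import subprocess


def to_timestamp(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_trim_command(
    src: str, dst: str, duration_s: float, head_s: float = 5.0, tail_s: float = 5.0
) -> list[str]:
    """Return an ffmpeg command that drops head_s from the start and tail_s from the end."""
    end = duration_s - tail_s
    if end <= head_s:
        raise ValueError("file is too short for the requested trim")
    return [
        "ffmpeg", "-i", src,
        "-ss", to_timestamp(head_s),
        "-to", to_timestamp(end),
        "-c", "copy", dst,
    ]


def probe_duration(path: str) -> float:
    """Ask ffprobe for the container duration in seconds."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())


# Typical use:
#   cmd = build_trim_command("in.mp4", "trimmed.mp4", probe_duration("in.mp4"))
#   subprocess.run(cmd, check=True)
```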

Domain terms mis-transcribed (e.g. ASTM → "ASTAM") — Do a find-and-replace pass on the .vtt output before use.
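The find-and-replace pass can be scripted so the same corrections are applied to every transcript. A minimal sketch, assuming a hypothetical corrections table that you maintain yourself; matches are case-sensitive and limited to whole words.

```python
import re
from pathlib import Path

# Mis-heard term -> correct term. Extend with your own domain vocabulary.
CORRECTIONS = {"ASTAM": "ASTM"}


def apply_corrections(text: str, corrections: dict[str, str] = CORRECTIONS) -> str:
    """Replace each mis-transcribed whole word with its correction."""
    for wrong, right in corrections.items():
        pattern = rf"\b{re.escape(wrong)}\b"
        # A function replacement avoids backslash handling in the replacement text.
        text = re.sub(pattern, lambda _m, r=right: r, text)
    return text


def fix_file(path: Path, corrections: dict[str, str] = CORRECTIONS) -> None:
    """Rewrite a caption or text file in place with corrections applied."""
    original = path.read_text(encoding="utf-8")
    path.write_text(apply_corrections(original, corrections), encoding="utf-8")


# Typical use (the file name is an example):
#   fix_file(Path("output/demo.vtt"))
```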

Project structure

get-transcripts/
├── src/
│   ├── transcribe.py       # Core module (extract, download, transcribe, run)
│   └── cli.py              # Argparse CLI entry point
├── scripts/
│   └── run.sh              # One-shot shell wrapper
├── tests/
│   ├── conftest.py         # Shared fixtures
│   ├── unit/
│   │   ├── test_audio.py              # ffmpeg extraction tests
│   │   └── test_transcribe_module.py  # Module unit + error path tests
│   └── integration/
│       └── test_transcription.py      # Full pipeline tests (tiny model)
├── environment.yml         # micromamba environment
├── pyproject.toml          # Project config + ruff rules
├── CLAUDE.md               # AI assistant context
└── SPEC.md                 # Design document

Output files

File	Description
`.txt`	Plain text transcript
`.vtt`	Timestamped captions (WebVTT)
`.srt`	Timestamped captions (SubRip)
`.tsv`	Tab-separated with timestamps
`.json`	Full Whisper output with metadata
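The .json file is the most convenient format for further processing. The sketch below prints one line per segment; it assumes the file holds Whisper's result dictionary unchanged (a "segments" list whose items carry "start", "end", and "text"), and the helper names and example path are hypothetical.

```python
import json
from pathlib import Path


def load_result(json_path: Path) -> dict:
    """Read a transcription result from a .json output file."""
    return json.loads(json_path.read_text(encoding="utf-8"))


def format_segments(result: dict) -> list[str]:
    """Return one "[start -> end] text" line per segment, times in seconds."""
    lines = []
    for seg in result.get("segments", []):
        text = seg["text"].strip()
        lines.append(f"[{seg['start']:.2f}s -> {seg['end']:.2f}s] {text}")
    return lines


# Typical use (the file name is an example):
#   for line in format_segments(load_result(Path("output/demo.json"))):
#       print(line)
```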


