Removing 'um' from a recording is harder than it sounds

Software Development (doug.sh)

Tags: speech-to-text, disfluencies, ums, uhs, ers, ffmpeg, python, cli, open-source

Author: dougcalobrisi

Date: 6/12/2026

Article Summary:

erm is a local CLI tool that automatically removes disfluencies (ums, uhs, and ers) from speech recordings by combining speech-to-text models with audio processing techniques.
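
To make the idea concrete, here is a minimal Python sketch of one step such a pipeline might involve: turning word-level timestamps from a speech-to-text model into the spans of audio to keep. This is not erm's actual code. The `Word` type, the `FILLERS` set, and the `keep_intervals` function are hypothetical names chosen for illustration, and the sketch assumes the transcription step already yields a start and end time for every word, fillers included.

```python
from dataclasses import dataclass

# Assumed filler vocabulary (hypothetical); a real tool would likely make this configurable.
FILLERS = {"um", "uh", "er", "erm"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def keep_intervals(words, duration, fillers=FILLERS):
    """Return (start, end) spans of audio that remain after dropping filler words."""
    keep = []
    cursor = 0.0  # start of the span currently being kept
    for w in sorted(words, key=lambda w: w.start):
        # Normalize the token so "Um," and "um" are treated the same.
        token = w.text.strip(".,!?;:").lower()
        if token in fillers:
            # Close the kept span that ends where the filler begins.
            if w.start > cursor:
                keep.append((cursor, w.start))
            # Resume keeping audio after the filler ends.
            cursor = max(cursor, w.end)
    # Keep whatever follows the last filler.
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


words = [
    Word("So", 0.0, 0.4),
    Word("um,", 0.5, 0.9),
    Word("hello", 1.0, 1.5),
    Word("uh", 1.5, 1.8),
    Word("world", 1.9, 2.4),
]
print(keep_intervals(words, duration=2.5))
```

The resulting spans could then be handed to an audio tool such as ffmpeg to cut and rejoin the recording. As the title suggests, the hard parts lie elsewhere: many speech-to-text models tend to omit fillers from their transcripts, and hard cuts at word boundaries can sound abrupt without padding or crossfades.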