Audio transcription issues

How to fix transcripts that are inaccurate or missing, and jobs that fail.

May 21, 2026

Vulgate transcribes audio uploads using high-quality speech-to-text models. The vast majority of files come back with usable transcripts. This article covers the few issues that come up most often.

The job ends in error

Open Uploads → Processing in the sidebar and click the failed row to see the specific reason. The most common are:

  • Unsupported format — Convert with ffmpeg to MP3 or WAV and re-upload. We accept MP3, M4A, AAC, WAV, FLAC, OGG, Opus, and common video containers (audio track is transcribed).
  • Silent file — A file with less than a few seconds of audible content can't be transcribed. Confirm the file plays audibly in your local player.
  • Very long file — Long files may time out. Split with ffmpeg:
    ffmpeg -i input.mp3 -f segment -segment_time 3600 -c copy chunk_%03d.mp3
    
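    The command cuts the file into chunks of about one hour each (-segment_time is in seconds) without re-encoding. Upload each chunk individually.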
  • Spend cap exceeded — Your plan's ingest credits for the month are spent. Upgrade or wait until next cycle.
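
The format and silence checks above can be run locally before re-uploading. The following is a minimal sketch, assuming ffmpeg is installed; the function names and file names are hypothetical, and the loudness threshold mentioned in the comments is a rough rule of thumb, not a documented Vulgate cutoff.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file names are illustrative assumptions.

# Convert any file ffmpeg can read to 16-bit PCM WAV, dropping any video track.
to_wav() {
  local in="$1" out="$2"
  ffmpeg -nostdin -hide_banner -loglevel error -y -i "$in" -vn -c:a pcm_s16le "$out"
}

# Print the peak level in dB as measured by ffmpeg's volumedetect filter.
# A very low peak (for example below about -60 dB) suggests the file is
# effectively silent; treat that number as a guess, not an official limit.
peak_db() {
  ffmpeg -nostdin -hide_banner -nostats -i "$1" -vn -af volumedetect -f null - 2>&1 |
    sed -n 's/.*max_volume: \(.*\) dB/\1/p'
}

# Example usage (placeholder file names):
#   to_wav lecture.wma lecture.wav
#   peak_db lecture.wav
```

If peak_db prints nothing, ffmpeg found no audio stream it could decode, which usually means the file itself is the problem rather than the upload.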

For other failure messages, copy the exact text and email info@vulgate.ai with the Job ID.

The transcript is inaccurate

Transcription quality depends mostly on the quality of the input audio. Common causes of errors:

  • Background noise. Cafés, crowds, HVAC hum, or wind seriously degrade accuracy. Re-record in a quieter environment if possible.
  • Multiple overlapping speakers. Vulgate detects speaker turns, but overlapping speech often runs together. Recording each speaker on a separate track and uploading the tracks separately produces much better results.
  • Low sample rate. Files sampled at 8 kHz (vintage phone audio, some VoIP) produce more errors than files sampled at 16 kHz or higher.
  • Heavy accents or dialects. Most major dialects are well-supported, but very strong regional accents can produce literal-but-wrong transcriptions ("council" → "consul").
  • Specialized vocabulary. Theological, legal, or scientific terms may be transcribed phonetically.
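
Two of the factors above, sample rate and speaker separation, can be checked or handled locally. The sketch below assumes ffmpeg and ffprobe are installed and, for the split, that the recording is stereo with one speaker on each channel (a common setup for two-microphone interviews, but an assumption here). Function and file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file names are illustrative assumptions.

# Print the sample rate and channel count of the first audio stream,
# one key=value pair per line (for example sample_rate=8000).
audio_info() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate,channels \
    -of default=noprint_wrappers=1 "$1"
}

# Split a stereo recording into two mono files, one per channel.
# Only useful if each speaker was actually recorded on a separate channel.
split_speakers() {
  local in="$1" left="$2" right="$3"
  ffmpeg -nostdin -hide_banner -loglevel error -y -i "$in" \
    -filter_complex "[0:a]channelsplit=channel_layout=stereo[left][right]" \
    -map "[left]" "$left" -map "[right]" "$right"
}

# Example usage (placeholder file names):
#   audio_info interview.wav
#   split_speakers interview.wav host.wav guest.wav
```

Resampling an 8 kHz file to a higher rate does not restore the missing detail, so the sample-rate check is diagnostic only. The fix is a better source recording.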

Mixed-language audio

If a recording switches between languages (e.g. an Italian-language sermon with Latin liturgical passages), the transcript will follow the dominant language and stumble on the secondary one. Best practice: split the file at language boundaries before uploading, even if it's just two files.
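
Splitting at a language boundary can be done with ffmpeg once you know the timestamp where the language changes. This is a minimal sketch, assuming ffmpeg is installed; the function name, file names, and the example timestamp are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper name, file names, and timestamp are illustrative assumptions.

# Split one file into two at a timestamp (seconds or HH:MM:SS) without re-encoding.
# With -c copy the cut lands on the nearest packet boundary, so it may be
# off by a fraction of a second.
split_at() {
  local in="$1" t="$2" first="$3" second="$4"
  # Part 1: from the start up to the timestamp.
  ffmpeg -nostdin -hide_banner -loglevel error -y -i "$in" -to "$t" -c copy "$first"
  # Part 2: from the timestamp to the end.
  ffmpeg -nostdin -hide_banner -loglevel error -y -ss "$t" -i "$in" -c copy "$second"
}

# Example usage (placeholder values):
#   split_at sermon.mp3 00:12:30 sermon_italian.mp3 sermon_latin.mp3
```

For a recording that switches languages more than once, the same function can be applied again to the second output file.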

When to contact us

Email info@vulgate.ai with the Job ID if:

  • You get a failure message not covered above.
  • Transcription quality is dramatically worse than expected on a file you've verified is clean.
  • The same file fails repeatedly despite being well within size and format limits.
