Supported file formats

A reference for every file type Vulgate accepts, plus what works best for high-quality search.

May 21, 2026

Vulgate accepts a wide range of text, image, audio, and video formats. The full list of what's supported today follows, with notes on quality and conversion.

Text

Format          Extensions    Notes
PDF             .pdf          Native or scanned. Scanned PDFs are OCR'd.
Microsoft Word  .docx         Converted to text on ingest.
Plain text      .txt          UTF-8 plain text.
HTML            .html, .htm   Body text is extracted; navigation chrome is stripped.

For best results with PDFs, prefer native (searchable) PDFs over scans. Native PDFs preserve the underlying text and produce cleaner output. If you only have scans, that's fine: Vulgate runs OCR on every scanned page.

Images

Format  Extensions     Notes
JPEG    .jpg, .jpeg    Most common scan format.
PNG     .png           Good for line-art and screenshots.
TIFF    .tif, .tiff    High-quality scans.

Image quality matters. Aim for at least 300 DPI for printed text and 600 DPI for handwritten or worn manuscripts. Very dark or low-contrast scans may need adjustment in a tool like Adobe Acrobat or Apple Preview first.
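To make the DPI guidance concrete, here is a small worked calculation of the pixel dimensions a scan needs to reach a target DPI for a given page size. It is only a sketch: the function names and the constants are illustrative, not part of Vulgate, and the 8.5 x 11 inch page is an assumed example.

```python
import math

# Targets taken from the guidance above.
PRINTED_DPI = 300      # printed text
HANDWRITTEN_DPI = 600  # handwritten or worn manuscripts


def min_pixels(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Smallest pixel dimensions that reach `dpi` for a page measured in inches."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


def effective_dpi(width_px: int, height_px: int,
                  width_in: float, height_in: float) -> float:
    """DPI a scan actually achieves, limited by its weaker dimension."""
    return min(width_px / width_in, height_px / height_in)


# A US Letter page (8.5 x 11 in) scanned for printed text and for handwriting.
print(min_pixels(8.5, 11, PRINTED_DPI))      # (2550, 3300)
print(min_pixels(8.5, 11, HANDWRITTEN_DPI))  # (5100, 6600)
```

If a scan of the same page comes out at 1700 x 2200 pixels, `effective_dpi(1700, 2200, 8.5, 11)` gives 200.0, which is below both targets, so rescanning at a higher setting is likely worthwhile.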

Audio and video

Format       Extensions     Notes
MP3          .mp3           Most common; widely compatible.
M4A / AAC    .m4a, .aac     Apple-friendly.
WAV          .wav           Lossless; larger files.
FLAC         .flac          Lossless; great for archival audio.
OGG / Opus   .ogg, .opus    Open formats.
Video        .mp4, .mov     Audio track is transcribed; audio-only uploads are usually best.

For best transcription, upload clean, single-speaker audio sampled at 16 kHz or higher. Vulgate uses AssemblyAI under the hood, which handles dozens of languages out of the box. See Uploading audio files for the full guide.
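If you want to check a recording against the 16 kHz guidance before uploading, the sketch below reads the sample rate from a WAV file's header using Python's standard `wave` module. It only handles uncompressed WAV files; other formats such as MP3 or FLAC would need a different tool. The function names are illustrative and not part of Vulgate.

```python
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # threshold from the guidance above


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz recorded in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def meets_sample_rate_guidance(path: str) -> bool:
    """True if the WAV file is sampled at 16 kHz or higher."""
    return wav_sample_rate(path) >= MIN_SAMPLE_RATE_HZ
```

For example, `meets_sample_rate_guidance("interview.wav")` would return False for a file recorded at 8 kHz (the filename here is hypothetical). The check covers sample rate only; it says nothing about background noise or the number of speakers.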

What we don't accept

  • Archives (zip, rar, 7z). Unzip first — you can drag-and-drop the resulting folder to upload all files at once.
  • Encrypted or password-protected PDFs. Remove the password before uploading.
  • Spreadsheets (xlsx, csv). Vulgate is optimized for prose, not tabular data. If you need to make a spreadsheet searchable, export the columns you care about as text first.
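The format lists above can be turned into a quick pre-upload check. The sketch below sorts filenames into text, image, audio, video, or unsupported based on extension alone. The extension sets are copied from this page; the function and variable names are illustrative, and how Vulgate itself validates uploads may differ. An extension check also cannot tell whether a PDF is password-protected.

```python
from pathlib import Path

# Extensions copied from the tables on this page.
SUPPORTED = {
    "text": {".pdf", ".docx", ".txt", ".html", ".htm"},
    "image": {".jpg", ".jpeg", ".png", ".tif", ".tiff"},
    "audio": {".mp3", ".m4a", ".aac", ".wav", ".flac", ".ogg", ".opus"},
    "video": {".mp4", ".mov"},
}


def classify(filename: str) -> str:
    """Return the category for a filename, or "unsupported" if none matches."""
    ext = Path(filename).suffix.lower()  # case-insensitive: ".TIFF" matches ".tiff"
    for kind, extensions in SUPPORTED.items():
        if ext in extensions:
            return kind
    return "unsupported"


def split_by_support(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (supported, unsupported), preserving input order."""
    supported = [name for name in filenames if classify(name) != "unsupported"]
    unsupported = [name for name in filenames if classify(name) == "unsupported"]
    return supported, unsupported
```

Running `split_by_support` over the contents of an unzipped folder gives a list of files to upload and a list to convert or leave out, such as spreadsheets and leftover archives.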

If you have a format we don't yet support, let us know via the contact form — we add formats based on user demand.
