Turn any MP4 file into a fully formatted text transcript with timestamps and SRT/VTT export. Free, browser-based, and accurate in 98+ languages. Works on iPhone, Android, and desktop.
MP4 is the lingua franca of video — every phone camera, every screen recorder, every download button. But MP4 is a container, not a document: you can't search it, can't paste quotes from it, can't feed it to an LLM. This page turns an MP4 into something you can. Drop the file into the box above and a browser-side audio extractor, a Whisper API call, and a tidy segment viewer give you text in about one-twentieth of the video's runtime.
Because the audio track is already standard AAC or MP3, ffmpeg.wasm can read it straight out of the container in your browser without decoding the video stream. We then compress it to a speech-quality mono MP3: a 60-minute 1080p MP4 that weighs 600 MB boils down to about 30 MB in under 15 seconds on a modern laptop. That's what actually leaves your device for transcription; the video stays with you.
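The size figure above is easy to sanity-check. The sketch below is not our production code: the 64 kbps bitrate is an assumption inferred from the 60-minute, 30 MB example, and the helper names `estimate_audio_mb` and `ffmpeg_extract_args` are hypothetical. The flags themselves (`-vn`, `-ac`, `-b:a`) are standard ffmpeg options.

```python
from __future__ import annotations


def estimate_audio_mb(duration_s: float, bitrate_kbps: int = 64) -> float:
    """Approximate size of a constant-bitrate audio file in MB (10^6 bytes)."""
    bits = duration_s * bitrate_kbps * 1000
    return bits / 8 / 1_000_000


def ffmpeg_extract_args(src: str, dst: str, bitrate_kbps: int = 64) -> list[str]:
    """Argument list that drops the video stream and writes mono audio."""
    return [
        "-i", src,                    # input container (MP4, MOV, ...)
        "-vn",                        # discard the video stream
        "-ac", "1",                   # downmix to a single (mono) channel
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate (assumed value)
        dst,                          # output format is picked from the extension
    ]


if __name__ == "__main__":
    # One hour of 64 kbps audio lands just under the 30 MB quoted above.
    print(round(estimate_audio_mb(60 * 60), 1))
    print(ffmpeg_extract_args("talk.mp4", "talk.mp3"))
```

The same argument list should work with the ffmpeg command-line tool or an ffmpeg.wasm instance, since both accept standard ffmpeg arguments.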
iPhone and iPad record .mov and .mp4 files with H.264 video and AAC audio, both of which ffmpeg handles without issue. If your file uses the newer HEVC (H.265) video codec from recent iOS versions, it still transcribes: the video codec matters only for playback, and we only touch the audio. You don't need to convert a .mov to MP4 first.
A timestamped transcript rendered segment-by-segment in the browser. Copyable text. One-click SRT and VTT export with the original MP4's timing preserved, ready to burn into the video or upload alongside it. An optional 5-bullet summary if you checked the box. And an optional translation into Chinese, Spanish, French, German, Japanese, or Portuguese, done by Claude Haiku so the translation is paragraph-coherent, not word-for-word robotic.
No. We only upload the compressed audio (about 30 MB for a 60-minute video). The original MP4 never leaves your device.
200 MB per file on the free tier, 5 GB on Pro. Duration limits apply separately: 10 minutes free, 4 hours Pro.
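Because the size and duration limits apply independently, a file can pass one and fail the other. A rough sketch of that check follows; the names `TierLimits`, `LIMITS`, and `check_upload` are hypothetical, and decimal units (1 MB = 10^6 bytes) are an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TierLimits:
    max_bytes: int
    max_seconds: int


# Limits as stated above, assuming decimal megabytes and gigabytes.
LIMITS = {
    "free": TierLimits(max_bytes=200 * 10**6, max_seconds=10 * 60),
    "pro": TierLimits(max_bytes=5 * 10**9, max_seconds=4 * 60 * 60),
}


def check_upload(tier: str, size_bytes: int, duration_s: float) -> list[str]:
    """Return the list of limits a file exceeds (empty means it is accepted)."""
    limits = LIMITS[tier]
    problems = []
    if size_bytes > limits.max_bytes:
        problems.append("file too large")
    if duration_s > limits.max_seconds:
        problems.append("video too long")
    return problems


if __name__ == "__main__":
    # A 150 MB, 12-minute clip fits the free size cap but not the duration cap.
    print(check_upload("free", 150 * 10**6, 12 * 60))
    print(check_upload("pro", 600 * 10**6, 60 * 60))
```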
Yes — screen recordings are a common use case, especially for meeting replays and tutorials. As long as there's an audible voice track, it transcribes.
Silent MP4s will produce an empty transcript — there's nothing to transcribe. The job returns a ‘no speech detected’ state and doesn't count against your daily quota.
Not through the web UI yet. Trim the MP4 with a free editor like CapCut or iMovie first, then upload the trimmed clip.