How to Use Video to Text | Upload, Transcribe & Export
Learn how to upload a file, choose language settings, start transcription, and export your transcript as CSV, SRT, VTT, or TXT.
Using Video to Text is straightforward: sign in, upload a supported file, choose your language settings, wait for transcription, and export the result in the format you need.
This guide covers the current flow in the app. If you want format details first, see Supported Media Formats for Video to Text. If you want language guidance, read Supported Languages in Video to Text.
Step 1: Sign in and prepare your minutes
Before you start a transcription, sign in to your account. The app checks account access and available minutes before it begins processing.
New users receive 30 free minutes. If your remaining minutes are not enough for the file you selected, the app prompts you to top up before transcription starts.
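The minutes check described above can be pictured with a minimal sketch. The guide does not say how the app rounds partial minutes, so the round-up rule and the function names here are assumptions for illustration, not the app's actual logic.

```python
import math

FREE_MINUTES_FOR_NEW_USERS = 30  # stated in this guide


def minutes_needed(duration_seconds: float) -> int:
    """Minutes a file would consume.

    Assumption: partial minutes round up. The real billing rule may differ.
    """
    return math.ceil(duration_seconds / 60)


def needs_top_up(remaining_minutes: int, duration_seconds: float) -> bool:
    """True when the remaining balance cannot cover the selected file."""
    return minutes_needed(duration_seconds) > remaining_minutes
```

Under these assumptions, a new account with 30 free minutes could cover a 30-minute recording, while a 45-minute recording would trigger the top-up prompt.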
Step 2: Upload a supported audio or video file
Click the upload area and choose your file.
Video to Text currently accepts these file types:
- Audio: .aac, .flac, .m4a, .mp3, .oga, .ogg, .opus, .wav
- Video: .mp4, .m4v, .webm
Uploads must also stay within these limits:
- maximum file size: 5 GB
- maximum media duration: under 10 hours
If your file breaks either limit, the app stops the workflow before transcription begins.
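If you want to check a file before uploading, the rules above can be sketched as a small local pre-check. This is an illustration only: the function name is hypothetical, and treating 5 GB as 5 × 1024³ bytes is an assumption, since the guide does not say which definition of a gigabyte the app uses.

```python
from pathlib import Path

# Extensions listed in this guide.
AUDIO_EXTENSIONS = {".aac", ".flac", ".m4a", ".mp3", ".oga", ".ogg", ".opus", ".wav"}
VIDEO_EXTENSIONS = {".mp4", ".m4v", ".webm"}

MAX_FILE_BYTES = 5 * 1024**3          # assumption: 5 GB counted as 5 GiB
MAX_DURATION_SECONDS = 10 * 60 * 60   # duration must be under 10 hours


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file passes this pre-check."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 5 GB")
    if duration_seconds >= MAX_DURATION_SECONDS:
        problems.append("media is 10 hours or longer")
    return problems
```

For example, `check_upload("interview.mp4", 800_000_000, 5400)` returns an empty list, while a `.mov` file or a recording of exactly 10 hours would each produce one problem message.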
Step 3: Choose your language option
After selecting a file, choose the language setup that matches your recording:
- pick a specific language if you know it
- use auto detection if you are not sure
- use multilingual detection if the recording switches between languages
This choice matters most for long interviews, podcasts, and international team meetings.
Step 4: Turn on speaker labels if you need them
Speaker labels are useful for conversations with more than one person. When this option is enabled, the transcript can separate content by speaker, which makes reviews and exports easier to read.
This is especially helpful for:
- meetings
- interviews
- panel discussions
- classroom recordings
Step 5: Wait for upload and transcription to finish
Once you submit the file, the app uploads it to storage and then starts transcription. Progress messages appear while the job is running.
Video to Text is built for quick turnaround, but total time still depends on a few things:
- file duration
- upload speed
- file size
- whether the audio is clear
If you want a closer look at expected timing, read How Long Does Video to Text Take?
Step 6: Export the transcript
When transcription is complete, the app takes you to the export page. You can export the transcript as:
- csv
- srt
- vtt
- txt
Choose the format that matches your next step:
- use srt or vtt for subtitles
- use csv for spreadsheet review or structured handoff
- use txt for plain text editing and note-taking
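To make the difference between the four formats concrete, here is a rough sketch that renders the same two transcript segments in each one. The segment data, function names, and CSV column layout are hypothetical; the files the app exports may be structured differently. The SRT and VTT timestamp styles follow the standard conventions for those subtitle formats.

```python
import csv
import io

# Hypothetical transcript segments: (start_seconds, end_seconds, text)
SEGMENTS = [
    (0.0, 2.5, "Welcome to the meeting."),
    (2.5, 6.04, "Let's review the agenda."),
]


def timestamp(seconds: float, decimal_mark: str) -> str:
    """Format seconds as HH:MM:SS plus milliseconds, with the given separator."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02}{decimal_mark}{ms:03}"


def to_srt(segments) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = [
        f"{index}\n{timestamp(start, ',')} --> {timestamp(end, ',')}\n{text}\n"
        for index, (start, end, text) in enumerate(segments, start=1)
    ]
    return "\n".join(blocks)


def to_vtt(segments) -> str:
    """WebVTT: 'WEBVTT' header line, period before the milliseconds."""
    cues = [
        f"{timestamp(start, '.')} --> {timestamp(end, '.')}\n{text}\n"
        for start, end, text in segments
    ]
    return "WEBVTT\n\n" + "\n".join(cues)


def to_csv(segments) -> str:
    """CSV: one row per segment (column names are an assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    writer.writerows(segments)
    return buffer.getvalue()


def to_txt(segments) -> str:
    """TXT: the spoken text only, one segment per line."""
    return "\n".join(text for _, _, text in segments)
```

Printing `to_srt(SEGMENTS)` and `to_vtt(SEGMENTS)` side by side shows why the two subtitle formats are not interchangeable without conversion, even though they carry the same cues.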
Quick workflow summary
If you just need the short version, here it is:
- Sign in.
- Upload a supported file.
- Confirm the file is under 5 GB and under 10 hours.
- Pick a language option.
- Turn on speaker labels if needed.
- Start transcription.
- Export the finished transcript in csv, srt, vtt, or txt.
Tips for a smoother result
- Upload the cleanest source file you have.
- Use the correct language option whenever possible.
- Turn on speaker labels for group conversations.
- Export in more than one format if different teammates need different outputs.
