13 - Output and Files
Source: src/vocal10n/pipeline/file_writer.py.
File Layout
Outputs land under output/ (configurable via output.directory):
output/
├── audio/ # WAV recordings (when save_wav=true)
├── subtitles/
│   ├── YYYY-MM-DD_HH-MM-SS_source.srt
│   ├── YYYY-MM-DD_HH-MM-SS_source.txt
│   ├── YYYY-MM-DD_HH-MM-SS_target.srt
│   └── YYYY-MM-DD_HH-MM-SS_target.txt
└── training_data/ # Reserved for training collection.
The session timestamp is set when PipelineCoordinator.start_session()
runs and prefixes every file produced for that session.
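The naming scheme above can be sketched as follows. This is a minimal illustration, not the code in file_writer.py: the helper name `session_paths` and the dictionary keys are hypothetical, and only the timestamp pattern and the `subtitles/` layout are taken from the tree shown earlier.

```python
from datetime import datetime
from pathlib import Path


def session_paths(root: Path, started_at: datetime) -> dict[str, Path]:
    """Build the four subtitle/transcript paths for one session.

    Hypothetical helper: the real writer may organise this differently.
    """
    # Same pattern as the layout above: YYYY-MM-DD_HH-MM-SS
    stamp = started_at.strftime("%Y-%m-%d_%H-%M-%S")
    subtitles = root / "subtitles"
    return {
        f"{role}_{ext}": subtitles / f"{stamp}_{role}.{ext}"
        for role in ("source", "target")
        for ext in ("srt", "txt")
    }


if __name__ == "__main__":
    for key, path in session_paths(Path("output"), datetime.now()).items():
        print(key, path)
```

Computing the stamp once per session, rather than once per file, is what keeps all files from the same session grouped under an identical prefix.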
Per-Format Toggles
From config.output (also exposed in the Output tab):
| Key | Content |
|---|---|
| save_source_txt | Corrected source transcript with punctuation. One line per confirmed segment. |
| save_source_srt | Corrected source subtitle file. SRT format, no punctuation, space-separated, with proper timestamps. |
| save_target_txt | Translated transcript with punctuation. |
| save_target_srt | Translated subtitle file. SRT format, no punctuation, with timestamps aligned to the source segment. |
| save_wav | Raw mic audio at 16 kHz mono, suitable for training data collection. |
If every flag is off, no file writer is created and there is no I/O overhead.
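The "no flags, no writer" rule can be sketched like this. The five flag names come from the table above; the `OutputFlags` class, the `enabled_outputs` helper, and the all-off defaults are assumptions made for illustration and may not match the actual config object.

```python
from dataclasses import dataclass, fields


@dataclass
class OutputFlags:
    """Hypothetical stand-in for the config.output toggles."""
    save_source_txt: bool = False  # defaults are assumed, not taken from the project
    save_source_srt: bool = False
    save_target_txt: bool = False
    save_target_srt: bool = False
    save_wav: bool = False


def enabled_outputs(flags: OutputFlags) -> list[str]:
    """Return the names of the enabled outputs, in declaration order."""
    return [f.name for f in fields(flags) if getattr(flags, f.name)]


if __name__ == "__main__":
    flags = OutputFlags(save_source_srt=True, save_wav=True)
    outputs = enabled_outputs(flags)
    # A writer would only be constructed when at least one output is enabled.
    print("create writer" if outputs else "skip writer", outputs)
```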
Async Writes
All writes happen on a background thread inside FileWriter. The Qt
event loop and the inference workers are never blocked on disk I/O. The
writer is created on start_session() and flushed/closed on
stop_session().
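One common way to keep disk I/O off the caller's thread is a queue drained by a single worker thread. The sketch below shows that pattern under stated assumptions: the class name `BackgroundLineWriter` and its methods are hypothetical, and the real FileWriter may use a different mechanism internally.

```python
import queue
import threading
from pathlib import Path


class BackgroundLineWriter:
    """Hypothetical sketch: callers enqueue lines, one thread does the writing."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, path: Path) -> None:
        self._queue: queue.Queue = queue.Queue()
        self._file = path.open("w", encoding="utf-8")
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write_line(self, text: str) -> None:
        # Returns immediately; the caller never waits on the disk.
        self._queue.put(text)

    def close(self) -> None:
        # Drain everything queued so far, then flush and close the file.
        self._queue.put(self._STOP)
        self._thread.join()
        self._file.close()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._file.write(item + "\n")
        self._file.flush()
```

In this sketch, constructing the object corresponds to what happens at start_session() and `close()` to the flush/close at stop_session(). Because the sentinel is queued after all pending lines, nothing written before `close()` is lost.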
SRT Format Details
- Timestamps come from the latency tracker / segment times rather than wall-clock, so the SRT timeline matches the audio that was actually transcribed.
- Punctuation is stripped from the SRT-formatted text and replaced with
spaces, matching the original spec in initialplan.md.
- Source and target SRTs share segment boundaries: when a confirmed source segment is translated, the matching target subtitle inherits the source's start/end times. This guarantees the two languages stay synchronised in playback.
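The three rules above can be illustrated with a short sketch. All function names here are hypothetical, and treating every Unicode punctuation character as strippable is an assumption; the project may use a narrower character set.

```python
import unicodedata


def srt_timestamp(seconds: float) -> str:
    """Format a segment offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def strip_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse repeated whitespace."""
    spaced = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(spaced.split())


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered SRT cue from segment times and raw text."""
    return (
        f"{index}\n"
        f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
        f"{strip_punctuation(text)}\n"
    )


if __name__ == "__main__":
    # The target cue reuses the source segment's start/end times.
    start, end = 1.5, 3.25
    print(srt_cue(1, start, end, "Hello, world!"))
    print(srt_cue(1, start, end, "Bonjour, le monde !"))
```

Passing the same `start` and `end` to both calls is the whole synchronisation mechanism in this sketch: the target cue never computes times of its own.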
WAV
When save_wav=true, PipelineCoordinator keeps a reference to
AudioCapture and the writer streams from the mic ring buffer at the
session offset. AEC processing is not applied to the saved audio:
the file represents the raw mic input, useful for retraining or
debugging.
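For reference, writing 16 kHz mono audio with Python's standard `wave` module might look like the sketch below. The helper name `write_wav` is hypothetical, 16-bit PCM is an assumed sample width (the source states only the rate and channel count), and the chunks stand in for whatever the ring buffer yields.

```python
import wave
from pathlib import Path
from typing import Iterable


def write_wav(path: Path, pcm_chunks: Iterable[bytes], sample_rate: int = 16_000) -> None:
    """Stream raw PCM chunks into a mono WAV file.

    Assumes little-endian 16-bit samples; adjust setsampwidth otherwise.
    """
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM (assumed)
        wav.setframerate(sample_rate)
        for chunk in pcm_chunks:
            wav.writeframes(chunk)


if __name__ == "__main__":
    # One second of silence, delivered as two half-second chunks.
    half_second = b"\x00\x00" * 8_000
    write_wav(Path("example.wav"), [half_second, half_second])
```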