13 — Output and Files

Source: src/vocal10n/pipeline/file_writer.py.

File Layout

Outputs land under output/ (configurable via output.directory):

output/
├── audio/                  # WAV recordings (when save_wav=true)
├── subtitles/
│   ├── YYYY-MM-DD_HH-MM-SS_source.srt
│   ├── YYYY-MM-DD_HH-MM-SS_source.txt
│   ├── YYYY-MM-DD_HH-MM-SS_target.srt
│   └── YYYY-MM-DD_HH-MM-SS_target.txt
└── training_data/          # Reserved for training collection.

The session timestamp is set when PipelineCoordinator.start_session() runs and prefixes every file produced for that session.
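The naming scheme above can be illustrated with a short sketch. The helper name `session_paths` and its return shape are hypothetical, chosen only for illustration; the real logic lives in file_writer.py and may differ.

```python
from datetime import datetime
from pathlib import Path


def session_paths(output_dir: Path, started_at: datetime) -> dict[str, Path]:
    """Build the four subtitle paths for one session (hypothetical helper)."""
    # One timestamp prefix is shared by every file of the session.
    prefix = started_at.strftime("%Y-%m-%d_%H-%M-%S")
    subtitles = output_dir / "subtitles"
    return {
        f"{side}_{ext}": subtitles / f"{prefix}_{side}.{ext}"
        for side in ("source", "target")
        for ext in ("srt", "txt")
    }


paths = session_paths(Path("output"), datetime(2025, 1, 2, 3, 4, 5))
print(paths["source_srt"].as_posix())
# output/subtitles/2025-01-02_03-04-05_source.srt
```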

Per-Format Toggles

From config.output (also exposed in the Output tab):

Key	Content
`save_source_txt`	Corrected source transcript with punctuation. One line per confirmed segment.
`save_source_srt`	Corrected source subtitle file in SRT format. Punctuation is replaced with spaces; timestamps come from the segment times.
`save_target_txt`	Translated transcript with punctuation.
`save_target_srt`	Translated subtitle file. SRT format, no punctuation, with timestamps aligned to the source segment.
`save_wav`	Raw mic audio at 16 kHz mono, suitable for training data collection.

If every flag is off, no file writer is created and there is no I/O overhead.
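The "no writer when every flag is off" rule might look roughly like the sketch below. `OutputFlags` and `maybe_create_writer` are hypothetical names; only the five flag names come from config.output.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Optional


@dataclass
class OutputFlags:
    """Mirror of the five config.output toggles (hypothetical container)."""
    save_source_txt: bool = False
    save_source_srt: bool = False
    save_target_txt: bool = False
    save_target_srt: bool = False
    save_wav: bool = False

    def any_enabled(self) -> bool:
        # True as soon as a single output format is requested.
        return any(getattr(self, f.name) for f in fields(self))


def maybe_create_writer(
    flags: OutputFlags, factory: Callable[[OutputFlags], Any]
) -> Optional[Any]:
    """Only build a writer (and its thread) when something will be written."""
    return factory(flags) if flags.any_enabled() else None
```

With all flags left at their defaults, `maybe_create_writer` returns `None` and the factory is never called, so no thread or file handle is created.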

Async Writes

All writes happen on a background thread inside FileWriter. The Qt event loop and the inference workers are never blocked on disk I/O. The writer is created on start_session() and flushed/closed on stop_session().
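A minimal sketch of the background-thread pattern described above, assuming a queue-fed worker. `BackgroundWriter`, `write_line`, and `close` are illustrative names, not the actual FileWriter API.

```python
import queue
import tempfile
import threading
from pathlib import Path


class BackgroundWriter:
    """Queue-fed writer: callers enqueue text, a worker thread does the I/O."""

    _STOP = object()  # sentinel telling the worker to finish

    def __init__(self, path: Path) -> None:
        self._queue: queue.Queue = queue.Queue()
        self._file = open(path, "w", encoding="utf-8")
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write_line(self, text: str) -> None:
        # Enqueue only; returns at once, so the caller never waits on disk.
        self._queue.put(text)

    def close(self) -> None:
        # Drain everything queued so far, then flush and close the file.
        self._queue.put(self._STOP)
        self._thread.join()
        self._file.close()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self._file.write(item + "\n")
        self._file.flush()


with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_source.txt"
    writer = BackgroundWriter(demo_path)   # analogous to start_session()
    writer.write_line("first confirmed segment")
    writer.write_line("second confirmed segment")
    writer.close()                         # analogous to stop_session()
    demo_text = demo_path.read_text(encoding="utf-8")
```

Because the sentinel is queued after all pending lines, `close()` returns only once every earlier write has reached the file.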

SRT Format Details

Timestamps come from the latency tracker / segment times rather than wall-clock, so the SRT timeline matches the audio that was actually transcribed.
Punctuation is stripped from the SRT-formatted text and replaced with spaces, matching the original spec in initialplan.md.
Source and target SRTs share segment boundaries: when a confirmed source segment is translated, the matching target subtitle inherits the source's start and end times, so the two languages stay synchronised in playback.
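The three rules above can be sketched as follows. The function names are hypothetical, and treating every Unicode "P" category character as punctuation is an assumption; the real writer may use a different character set.

```python
import unicodedata


def srt_timestamp(seconds: float) -> str:
    """Format segment-relative seconds as SRT's HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def strip_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse repeated spaces."""
    spaced = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(spaced.split())


def srt_entry(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered SRT cue."""
    return (
        f"{index}\n"
        f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
        f"{strip_punctuation(text)}\n"
    )


# The target cue reuses the source segment's start and end times.
start, end = 0.0, 2.5
source_cue = srt_entry(1, start, end, "Hello, world!")
target_cue = srt_entry(1, start, end, "Bonjour, le monde !")
print(source_cue)
```

Note that this simple rule also turns apostrophes into spaces, which may or may not match the behaviour specified in initialplan.md.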

WAV

When save_wav=true, PipelineCoordinator keeps a reference to AudioCapture and the writer streams from the mic ring buffer at the session offset. AEC processing is not applied to the saved audio: the file holds the raw mic input, which is useful for retraining or debugging.
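As a rough illustration of the file format only, the sketch below writes 16 kHz mono audio with Python's standard `wave` module. The 16-bit sample width and the `write_wav` helper are assumptions; the source states only the sample rate and channel count, and the real writer reads from the ring buffer rather than a list of chunks.

```python
import tempfile
import wave
from pathlib import Path
from typing import Iterable

SAMPLE_RATE = 16_000   # 16 kHz, as stated for save_wav
CHANNELS = 1           # mono
SAMPLE_WIDTH = 2       # bytes per sample; 16-bit PCM is an assumption


def write_wav(path: Path, pcm_chunks: Iterable[bytes]) -> None:
    """Stream raw PCM chunks into a WAV file (hypothetical helper)."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        for chunk in pcm_chunks:
            wav.writeframes(chunk)


with tempfile.TemporaryDirectory() as tmp:
    wav_path = Path(tmp) / "session.wav"
    half_second = b"\x00\x00" * (SAMPLE_RATE // 2)   # silence, 8000 frames
    write_wav(wav_path, [half_second, half_second])
    with wave.open(str(wav_path), "rb") as check:
        frame_count = check.getnframes()
        frame_rate = check.getframerate()
        channel_count = check.getnchannels()
```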