13 - Output and Files

Source: src/vocal10n/pipeline/file_writer.py.

File Layout

Outputs land under output/ (configurable via output.directory):

output/
├── audio/                  # WAV recordings (when save_wav=true)
├── subtitles/
│   ├── YYYY-MM-DD_HH-MM-SS_source.srt
│   ├── YYYY-MM-DD_HH-MM-SS_source.txt
│   ├── YYYY-MM-DD_HH-MM-SS_target.srt
│   └── YYYY-MM-DD_HH-MM-SS_target.txt
└── training_data/          # Reserved for training collection.

The session timestamp is set when PipelineCoordinator.start_session() runs and prefixes every file produced for that session.
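As a rough illustration of the naming scheme above, the sketch below builds the four subtitle paths from a session start time. The helper name session_paths and its dictionary keys are hypothetical and are not taken from file_writer.py; only the directory layout and the timestamp pattern come from this page.

```python
from datetime import datetime
from pathlib import Path


def session_paths(output_dir: str, started_at: datetime) -> dict[str, Path]:
    """Build the subtitle paths for one session (hypothetical helper)."""
    # Same pattern as the layout above: YYYY-MM-DD_HH-MM-SS
    stamp = started_at.strftime("%Y-%m-%d_%H-%M-%S")
    subtitles = Path(output_dir) / "subtitles"
    return {
        f"{side}_{ext}": subtitles / f"{stamp}_{side}.{ext}"
        for side in ("source", "target")
        for ext in ("srt", "txt")
    }


paths = session_paths("output", datetime(2024, 1, 2, 3, 4, 5))
print(paths["source_srt"].name)
```

Because the stamp is computed once per session, every file from the same run sorts together in a directory listing.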

Per-Format Toggles

From config.output (also exposed in the Output tab):

Key              Content
save_source_txt  Corrected source transcript with punctuation. One line per confirmed segment.
save_source_srt  Corrected source subtitle file. SRT format, no punctuation, space-separated, with proper timestamps.
save_target_txt  Translated transcript with punctuation.
save_target_srt  Translated subtitle file. SRT format, no punctuation, with timestamps aligned to the source segment.
save_wav         Raw mic audio at 16 kHz mono, suitable for training data collection.

If every flag is off, no file writer is created and there is no I/O overhead.
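The "no writer when everything is off" rule can be pictured with a small sketch. The OutputFlags dataclass, its all-False defaults, and the enabled_outputs helper are assumptions made for illustration; only the five key names come from the table above.

```python
from dataclasses import asdict, dataclass


@dataclass
class OutputFlags:
    # Key names mirror config.output; the defaults here are assumed.
    save_source_txt: bool = False
    save_source_srt: bool = False
    save_target_txt: bool = False
    save_target_srt: bool = False
    save_wav: bool = False


def enabled_outputs(flags: OutputFlags) -> list[str]:
    """Return the names of the enabled outputs, in declaration order."""
    return [name for name, on in asdict(flags).items() if on]


flags = OutputFlags(save_target_srt=True, save_wav=True)
if enabled_outputs(flags):
    print("create writer for:", enabled_outputs(flags))
else:
    print("all flags off: skip the writer entirely")
```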

Async Writes

All writes happen on a background thread inside FileWriter. The Qt event loop and the inference workers are never blocked on disk I/O. The writer is created on start_session() and flushed/closed on stop_session().
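One common way to get this behaviour is a queue drained by a single worker thread. The sketch below shows that general pattern under stated assumptions; the class BackgroundWriter and its methods are hypothetical and do not describe the actual FileWriter internals.

```python
import queue
import threading
from pathlib import Path


class BackgroundWriter:
    """Hypothetical stand-in for FileWriter: appends lines on a worker thread."""

    _STOP = object()  # sentinel that tells the worker to finish

    def __init__(self, path):
        self._path = Path(path)
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write_line(self, text: str) -> None:
        # Only enqueues; the calling thread never touches the disk.
        self._queue.put(text)

    def close(self) -> None:
        # Lines queued before the sentinel are written first, then the worker exits.
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self) -> None:
        with self._path.open("a", encoding="utf-8") as handle:
            while True:
                item = self._queue.get()
                if item is self._STOP:
                    break
                handle.write(item + "\n")
        # Leaving the with-block flushes and closes the file.
```

In this sketch, creating the object plays the role of start_session() and close() plays the role of stop_session().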

SRT Format Details

  • Timestamps come from the latency tracker / segment times rather than wall-clock, so the SRT timeline matches the audio that was actually transcribed.
  • Punctuation is stripped from the SRT-formatted text and replaced with spaces, matching the original spec in initialplan.md.
  • Source and target SRTs share segment boundaries: when a confirmed source segment is translated, the matching target subtitle inherits the source’s start/end times. This guarantees the two languages stay synchronised in playback.
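The three points above can be sketched as follows. The function names are hypothetical, and the punctuation rule used here (any character that is neither a word character nor whitespace becomes a space) is an assumption; the exact rule in file_writer.py may differ, for example in how apostrophes are treated.

```python
import re


def srt_timestamp(seconds: float) -> str:
    """Format a segment time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def strip_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse repeated whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text).split())


def srt_block(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered SRT cue from segment times and raw text."""
    timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
    return f"{index}\n{timing}\n{strip_punctuation(text)}\n"


# Source and target cues reuse the same start/end, so they stay aligned.
source = srt_block(1, 0.0, 2.5, "Hello, world!")
target = srt_block(1, 0.0, 2.5, "Bonjour, le monde !")
print(source)
print(target)
```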

WAV

When save_wav=true, PipelineCoordinator keeps a reference to AudioCapture and the writer streams from the mic ring buffer at the session offset. AEC processing is not applied to the saved audio; the file represents the raw mic input, which is useful for retraining or debugging.
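For reference, a 16 kHz mono file can be produced with Python's standard wave module, as in the minimal sketch below. The write_wav helper is hypothetical, and the 16-bit sample width is an assumption; this page only states the sample rate and channel count.

```python
import wave

SAMPLE_RATE = 16_000  # 16 kHz, as stated for save_wav
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample; 16-bit PCM is an assumption


def write_wav(path: str, pcm_chunks) -> int:
    """Stream raw PCM byte chunks into a WAV file; return frames written."""
    frames = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        for chunk in pcm_chunks:
            wav.writeframes(chunk)
            frames += len(chunk) // (CHANNELS * SAMPLE_WIDTH)
    return frames
```

Passing an iterable of chunks mirrors the streaming description above: audio is appended as it arrives rather than buffered for the whole session.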