06 - Pipeline and State
This chapter covers the cross-cutting plumbing that lets the modules cooperate without depending on each other directly.
SystemState - Single Source of Truth
vocal10n.state.SystemState is a QObject carrying every piece of state
that the UI cares about: model statuses, enable flags, languages, and
the four live text buffers (current/accumulated × source/translation).
Key properties:
- stt_status, llm_status, tts_status, tts_qwen3_status: ModelStatus enum (UNLOADED, LOADING, LOADED, ERROR).
- stt_enabled, llm_enabled, tts_enabled, tts_source_enabled, tts_target_enabled, speaker_tagging.
- source_language, target_language: Language enum.
- current_*_text, accumulated_*_text: strings updated by workers.
Each property has a matching *_changed Qt signal that fires only on
real change. UI widgets connect to these signals; worker threads write
through the property setters under an internal RLock.
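The change-only setter pattern can be sketched without Qt. This is a minimal illustration, not the real class: Signal is a plain-Python stand-in for a Qt signal, StateSketch is a hypothetical name, and emitting after the lock is released is a design choice assumed here rather than something this chapter specifies.

```python
import threading
from enum import Enum, auto
from typing import Callable, List


class ModelStatus(Enum):
    UNLOADED = auto()
    LOADING = auto()
    LOADED = auto()
    ERROR = auto()


class Signal:
    """Tiny stand-in for a Qt signal: connect() callbacks, emit() values."""

    def __init__(self) -> None:
        self._slots: List[Callable] = []

    def connect(self, slot: Callable) -> None:
        self._slots.append(slot)

    def emit(self, *args) -> None:
        for slot in list(self._slots):
            slot(*args)


class StateSketch:
    """Hypothetical, Qt-free analogue of SystemState with a single property."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._stt_status = ModelStatus.UNLOADED
        self.stt_status_changed = Signal()

    @property
    def stt_status(self) -> ModelStatus:
        with self._lock:
            return self._stt_status

    @stt_status.setter
    def stt_status(self, value: ModelStatus) -> None:
        with self._lock:
            if value == self._stt_status:
                return  # no real change, so no signal
            self._stt_status = value
        # Assumption: emit after releasing the lock so slow slots
        # do not block other writers.
        self.stt_status_changed.emit(value)


seen = []
state = StateSketch()
state.stt_status_changed.connect(seen.append)
state.stt_status = ModelStatus.LOADING
state.stt_status = ModelStatus.LOADING  # duplicate write, ignored
state.stt_status = ModelStatus.LOADED
```

After these three writes, seen holds only LOADING and LOADED: the duplicate write produced no notification.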
Event Dispatcher
vocal10n.pipeline.events defines:
- EventType (in vocal10n.constants): values such as STT_PARTIAL, STT_CONFIRMED, TRANSLATION_PARTIAL, TRANSLATION_CONFIRMED, TTS_STARTED, TTS_FINISHED, PIPELINE_READY.
- Dataclass payloads: Event, TextEvent, TranslationEvent.
- EventDispatcher: a thread-safe subscribe(type, callback) / publish(event) pub/sub, accessed through get_dispatcher().
Modules publish into the dispatcher and subscribe to whichever event types they need. This is what keeps the wiring loose: the STT module does not know about the LLM or TTS modules.
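As a rough illustration of that decoupling, here is a minimal thread-safe pub/sub in plain Python. DispatcherSketch and the TextEvent field names (type, text) are assumptions made for the example; the real payload fields and dispatcher internals may differ.

```python
import threading
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List


class EventType(Enum):
    STT_CONFIRMED = auto()
    TRANSLATION_CONFIRMED = auto()


@dataclass
class TextEvent:
    type: EventType  # assumed field name
    text: str        # assumed field name


class DispatcherSketch:
    """Hypothetical analogue of EventDispatcher."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._subs: Dict[EventType, List[Callable]] = {}

    def subscribe(self, event_type: EventType, callback: Callable) -> None:
        with self._lock:
            self._subs.setdefault(event_type, []).append(callback)

    def publish(self, event: TextEvent) -> None:
        # Snapshot under the lock, call outside it, so a callback may
        # subscribe or publish without deadlocking.
        with self._lock:
            callbacks = list(self._subs.get(event.type, ()))
        for callback in callbacks:
            callback(event)


dispatcher = DispatcherSketch()
received = []

# The "translator" side only knows the event type, not who publishes it.
dispatcher.subscribe(EventType.STT_CONFIRMED, lambda e: received.append(e.text))

# The "STT" side only knows the dispatcher, not who listens.
dispatcher.publish(TextEvent(EventType.STT_CONFIRMED, "hello"))
dispatcher.publish(TextEvent(EventType.TRANSLATION_CONFIRMED, "bonjour"))
```

Only the STT_CONFIRMED event reaches the subscriber, so received ends up containing just "hello".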
Latency Tracker
vocal10n.pipeline.latency.LatencyTracker records per-stage timestamps
keyed by an utterance id, and exposes rolling-window aggregates that the
UI uses for the Section A metrics:
- stt_partial_ms: mic-frame arrival → first partial.
- stt_confirmed_ms: mic-frame arrival → confirmed segment.
- translation_ms: confirmed → translation produced.
- tts_ttfa_ms: translated → first audio frame heard (TTFA).
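One way to picture per-utterance timestamps feeding a rolling-window aggregate is the sketch below. LatencySketch, its method names (mark, record, average), the stage labels, and the window size are all hypothetical; the real LatencyTracker API is not shown in this chapter.

```python
from collections import deque
from typing import Deque, Dict, Optional


class LatencySketch:
    """Hypothetical analogue of LatencyTracker."""

    def __init__(self, window: int = 50) -> None:
        self._window = window
        # utterance id -> stage name -> timestamp in seconds
        self._marks: Dict[str, Dict[str, float]] = {}
        # metric name -> most recent samples in milliseconds
        self._samples: Dict[str, Deque[float]] = {}

    def mark(self, utterance_id: str, stage: str, timestamp: float) -> None:
        self._marks.setdefault(utterance_id, {})[stage] = timestamp

    def record(self, metric: str, utterance_id: str,
               start_stage: str, end_stage: str) -> None:
        marks = self._marks.get(utterance_id, {})
        if start_stage in marks and end_stage in marks:
            elapsed_ms = (marks[end_stage] - marks[start_stage]) * 1000.0
            window = self._samples.setdefault(
                metric, deque(maxlen=self._window))
            window.append(elapsed_ms)  # oldest sample drops out when full

    def average(self, metric: str) -> Optional[float]:
        samples = self._samples.get(metric)
        return sum(samples) / len(samples) if samples else None


tracker = LatencySketch(window=2)
# Fixed timestamps keep the example deterministic; live code would
# more likely pass time.monotonic().
for uid, mic_t, confirmed_t in [("u1", 0.0, 0.5),
                                ("u2", 1.0, 1.25),
                                ("u3", 2.0, 2.25)]:
    tracker.mark(uid, "mic", mic_t)
    tracker.mark(uid, "confirmed", confirmed_t)
    tracker.record("stt_confirmed_ms", uid, "mic", "confirmed")

mean_ms = tracker.average("stt_confirmed_ms")
```

With a window of two, the 500 ms sample from u1 has already been evicted, so mean_ms reflects only the two 250 ms samples.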
PipelineCoordinator
vocal10n.pipeline.coordinator.PipelineCoordinator owns session
lifecycle rather than per-event flow:
- start_session() creates a FileWriter if any output.* flag is on, subscribes it to STT_CONFIRMED and TRANSLATION_CONFIRMED, and publishes PIPELINE_READY.
- stop_session() flushes pending buffers, unsubscribes the writer, finalises SRT / TXT / WAV files.
- It can hold a reference to AudioCapture so the WAV writer can read the mic ring buffer at the right offset.
The coordinator deliberately does not load or unload models; that lives in each module's controller.
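The session lifecycle can be sketched as follows. Everything here is a simplified stand-in: MiniDispatcher uses string event names, MemoryWriter keeps lines in memory instead of writing SRT / TXT / WAV files, and the flag dictionary, the unsubscribe signature, and the method names on the writer are assumptions for illustration.

```python
from typing import Callable, Dict, List, Optional

WRITER_EVENTS = ("STT_CONFIRMED", "TRANSLATION_CONFIRMED")


class MiniDispatcher:
    """Single-threaded stand-in for the event dispatcher."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable]] = {}

    def subscribe(self, event_type: str, callback: Callable) -> None:
        self._subs.setdefault(event_type, []).append(callback)

    def unsubscribe(self, event_type: str, callback: Callable) -> None:
        self._subs.get(event_type, []).remove(callback)

    def publish(self, event_type: str, payload: str = "") -> None:
        for callback in list(self._subs.get(event_type, [])):
            callback(payload)


class MemoryWriter:
    """Stand-in for FileWriter that records text in memory."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self.finalised = False

    def on_text(self, text: str) -> None:
        self.lines.append(text)

    def finalise(self) -> None:
        self.finalised = True


class CoordinatorSketch:
    """Hypothetical analogue of PipelineCoordinator: lifecycle only."""

    def __init__(self, dispatcher: MiniDispatcher,
                 output_flags: Dict[str, bool]) -> None:
        self._dispatcher = dispatcher
        self._flags = output_flags
        self.writer: Optional[MemoryWriter] = None

    def start_session(self) -> None:
        if any(self._flags.values()):  # writer exists only if output is on
            self.writer = MemoryWriter()
            for event_type in WRITER_EVENTS:
                self._dispatcher.subscribe(event_type, self.writer.on_text)
        self._dispatcher.publish("PIPELINE_READY")

    def stop_session(self) -> None:
        if self.writer is not None:
            for event_type in WRITER_EVENTS:
                self._dispatcher.unsubscribe(event_type, self.writer.on_text)
            self.writer.finalise()


bus = MiniDispatcher()
ready = []
bus.subscribe("PIPELINE_READY", lambda _payload: ready.append(True))

coordinator = CoordinatorSketch(bus, {"srt": True, "wav": False})
coordinator.start_session()
bus.publish("STT_CONFIRMED", "hello")
coordinator.stop_session()
bus.publish("STT_CONFIRMED", "arrives after stop")
```

The writer captures only the text published between start_session() and stop_session(); the coordinator never touches per-event flow beyond wiring the writer in and out.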
Module Controllers
Each backend module exposes a controller that the UI talks to:
- vocal10n.stt.controller.STTController: load/unload, language switching, enable/disable, owns STTWorker.
- vocal10n.llm.controller.LLMController: load/unload, prompt updates, KB hot-reload, debounced translation.
- vocal10n.tts.controller.TTSController: the GPT-SoVITS path (server health, queue, playback).
- vocal10n.tts.qwen3_controller.Qwen3TTSController: same surface for the Qwen3-TTS backend.
Controllers are the only objects the UI tabs hold references to. They
mediate writes to SystemState and subscribe to dispatcher events.
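A minimal sketch of how a controller might mediate status writes during load and unload. ControllerSketch, StateStub, and the load_fn parameter are hypothetical names for illustration; the real controllers also handle workers, queues, and dispatcher subscriptions, which are left out here.

```python
from enum import Enum, auto
from typing import Callable, List, Optional


class ModelStatus(Enum):
    UNLOADED = auto()
    LOADING = auto()
    LOADED = auto()
    ERROR = auto()


class StateStub:
    """Stand-in for SystemState that remembers every status write."""

    def __init__(self) -> None:
        self.history: List[ModelStatus] = [ModelStatus.UNLOADED]

    @property
    def stt_status(self) -> ModelStatus:
        return self.history[-1]

    @stt_status.setter
    def stt_status(self, value: ModelStatus) -> None:
        self.history.append(value)


class ControllerSketch:
    """Hypothetical controller: the only object a UI tab would hold."""

    def __init__(self, state: StateStub, load_fn: Callable[[], object]) -> None:
        self._state = state
        self._load_fn = load_fn  # assumed hook that builds the model
        self._model: Optional[object] = None

    def load(self) -> bool:
        self._state.stt_status = ModelStatus.LOADING
        try:
            self._model = self._load_fn()
        except Exception:
            self._state.stt_status = ModelStatus.ERROR
            return False
        self._state.stt_status = ModelStatus.LOADED
        return True

    def unload(self) -> None:
        self._model = None
        self._state.stt_status = ModelStatus.UNLOADED


def broken_loader() -> object:
    raise RuntimeError("missing weights")


good_state = StateStub()
loaded_ok = ControllerSketch(good_state, load_fn=object).load()

bad_state = StateStub()
loaded_bad = ControllerSketch(bad_state, load_fn=broken_loader).load()
```

The UI never writes the status itself: it calls load() and observes the state, which moves through LOADING to LOADED on success or to ERROR on failure.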