- Implement real-time streaming preview using Parakeet EOU (160ms chunks)
- Add batch transcription on completion for accurate final result
- Prefer Whisper large-v3-turbo (2.7% WER) over Parakeet (6.05% WER) when available
- Remove audio preprocessing that hurts ASR accuracy (gain control, noise reduction)
- Add streaming audio callback support in Recorder and CoreAudioRecorder
- Pass raw audio through unmodified; the SDK handles resampling internally
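
The streaming/batch split above can be sketched roughly as follows. This is an illustrative, language-neutral sketch in Python, not the Swift code in this change: `StreamingChunker`, `feed`, and `finish` are hypothetical names, and the 16 kHz mono sample rate is an assumption (only the 160ms chunk length comes from this commit). Raw samples from the recorder callback are buffered, emitted in fixed 160ms chunks for the live preview, and kept in full for the batch pass on completion.

```python
from typing import Callable, List

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono input (not stated in the commit)
CHUNK_MS = 160           # chunk length named in the commit
CHUNK_SAMPLES = SAMPLE_RATE_HZ * CHUNK_MS // 1000  # samples per 160ms chunk


class StreamingChunker:
    """Hypothetical helper: buffers raw samples and emits fixed-size chunks."""

    def __init__(self, on_chunk: Callable[[List[float]], None]) -> None:
        self._on_chunk = on_chunk
        self._pending: List[float] = []  # samples not yet forming a full chunk
        self._all: List[float] = []      # full recording, kept for the batch pass

    def feed(self, samples: List[float]) -> None:
        """Called from the recorder's audio callback with unprocessed samples."""
        self._all.extend(samples)
        self._pending.extend(samples)
        # Emit every complete chunk; leftover samples wait for the next callback.
        while len(self._pending) >= CHUNK_SAMPLES:
            chunk = self._pending[:CHUNK_SAMPLES]
            del self._pending[:CHUNK_SAMPLES]
            self._on_chunk(chunk)

    def finish(self) -> List[float]:
        """Return the whole recording for the final batch transcription."""
        return list(self._all)
```

In this sketch the preview model would consume each chunk passed to `on_chunk`, while the more accurate batch model would run once on the result of `finish()`.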
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update FluidAudio from v0.7.7 (2dd0bd1) to v0.7.8 (8136bd0)
Performance improvements:
- 5% faster ASR inference
- 10% fewer missing words on long audio files
- 0.5% improved WER for v2 and v3 models
Stability improvements:
- Fixed ANE concurrency crashes (<3% latency impact)
- Switched ASR to a stateless design for better batching support
- Improved concurrency safety
This is a backward-compatible update with no breaking API changes.
No code changes required; the existing Parakeet integration works as-is.
Full changelog: https://github.com/FluidInference/FluidAudio/compare/v0.7.7...v0.7.8
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Update FluidAudio to f47209a which includes ESpeakNG framework fix
- Fix VadConfig API compatibility: threshold -> defaultThreshold
- Add WorkspaceSettings to allow FluidAudio's unsafe build flags
This resolves the dyld crash: "Library not loaded: ESpeakNG.framework/Versions/A/ESpeakNG"
Fixed upstream in FluidInference/FluidAudio#159 and FluidInference/FluidAudio#160
Tested: All 4 UI tests pass