vOOice/QUALITY_OF_LIFE_IMPROVEMENTS.md
tmm22 c3e7263890
Update QUALITY_OF_LIFE_IMPROVEMENTS.md
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-11-03 12:07:57 +10:00

1810 lines
49 KiB
Markdown

# Quality of Life Improvements for VoiceLink Community
**Date:** November 3, 2025
**Analysis Type:** User & Developer Experience Audit
**Scope:** VoiceInk codebase fork analysis
---
## Executive Summary
This document identifies quality of life improvements for both users and developers of VoiceLink Community. The analysis covers UX/UI enhancements, workflow optimizations, accessibility features, code quality improvements, and maintainability enhancements.
**Priority Legend:**
- 🔴 **Critical** - High impact, relatively easy to implement
- 🟠 **High** - Significant improvement, moderate effort
- 🟡 **Medium** - Nice to have, moderate effort
- 🟢 **Low** - Polish items, lower priority
---
## User-Facing Improvements
### 1. Recording & Transcription Workflow
#### 🔴 Critical: Recording State Visual Feedback
**Issue:** Current recorder provides minimal feedback during transcription/enhancement phases.
**Current State:**
- Status changes between `.recording`, `.transcribing`, `.enhancing`, `.busy`
- Limited visual differentiation in the mini recorder
- No progress indicator during long transcriptions
**Proposed Solution:**
```swift
// Add to RecorderStatusDisplay
struct RecorderStatusDisplay: View {
let currentState: RecordingState
let audioMeter: Float
@State private var transcriptionProgress: Double = 0
var statusText: String {
switch currentState {
case .recording: return "Recording..."
case .transcribing: return "Transcribing..."
case .enhancing: return "Enhancing with AI..."
case .busy: return "Processing..."
case .idle: return "Ready"
}
}
var body: some View {
VStack(spacing: 4) {
// Current visualizer
AudioVisualizerView(audioMeter: audioMeter)
// Add progress bar for processing states
if currentState != .recording && currentState != .idle {
ProgressView(value: transcriptionProgress)
.progressViewStyle(.linear)
.frame(height: 2)
}
Text(statusText)
.font(.caption2)
.foregroundColor(.secondary)
}
}
}
```
**Benefits:**
- Users know exactly what's happening during each phase
- Reduces anxiety during long transcriptions
- Clear visual state transitions
---
#### 🔴 Critical: Keyboard Shortcut for Cancel Recording
**Issue:** User must wait for transcription to complete or manually close recorder. Double-tap Escape is not discoverable.
**Current State:**
```swift
// Custom cancel shortcut is optional and hidden
@State private var isCustomCancelEnabled = false
if isCustomCancelEnabled {
KeyboardShortcuts.Recorder(for: .cancelRecorder)
}
```
**Proposed Solution:**
- Make Escape cancellation always available with clear UI indication
- Add cancel button to recorder UI
- Show "Press ESC to cancel" hint during recording
```swift
// In MiniRecorderView
if whisperState.recordingState == .recording {
Button(action: {
Task {
await whisperState.cancelRecording()
}
}) {
Image(systemName: "xmark.circle.fill")
.foregroundColor(.red)
}
.help("Cancel recording (ESC)")
}
```
**Benefits:**
- Immediate control over recording session
- Prevents accidental long transcriptions
- Improved user confidence
---
#### 🟠 High: Quick Retry Last Transcription
**Issue:** Already implemented but could be more discoverable and integrated.
**Current State:**
- Keyboard shortcut exists (`.retryLastTranscription`)
- Not visible in UI
- No indication when retry is in progress
**Proposed Enhancement:**
- Add retry button to transcription history cards
- Show retry indicator in mini recorder
- Add "Retry with different model" option
```swift
// In TranscriptionCard
HStack {
Button("Retry") {
LastTranscriptionService.retryLastTranscription(
from: modelContext,
whisperState: whisperState
)
}
Menu {
ForEach(whisperState.allAvailableModels, id: \.name) { model in
Button(model.displayName) {
// Retry with specific model
}
}
} label: {
Image(systemName: "chevron.down")
}
}
```
---
#### 🟠 High: Recording Length Indicator
**Issue:** No visual indication of recording duration.
**Proposed Solution:**
```swift
// Add to RecorderStatusDisplay
@State private var recordingDuration: TimeInterval = 0
private let timer = Timer.publish(every: 0.1, on: .main, in: .common).autoconnect()
Text(formatDuration(recordingDuration))
.font(.system(.caption, design: .monospaced))
.foregroundColor(.primary)
.onReceive(timer) { _ in
if whisperState.recordingState == .recording {
recordingDuration += 0.1
}
}
private func formatDuration(_ duration: TimeInterval) -> String {
let minutes = Int(duration) / 60
let seconds = Int(duration) % 60
return String(format: "%02d:%02d", minutes, seconds)
}
```
**Benefits:**
- Users know how long they've been recording
- Helps prevent accidentally long recordings
- Visual feedback that recording is active
---
### 2. Transcription History & Management
#### 🔴 Critical: Bulk Actions Performance
**Issue:** Selecting all transcriptions can be slow with large datasets.
**Current Implementation:**
```swift
// Loads all transcriptions into memory
private func selectAllTranscriptions() async {
let allTranscriptions = try modelContext.fetch(allDescriptor)
selectedTranscriptions = Set(allTranscriptions)
}
```
**Proposed Optimization:**
```swift
// Only fetch IDs for selection, lazy load full objects when needed
private func selectAllTranscriptions() async {
var descriptor = FetchDescriptor<Transcription>()
descriptor.propertiesToFetch = [\.id, \.timestamp]
let ids = try modelContext.fetch(descriptor).map { $0.id }
selectedTranscriptions = Set(ids)
}
// Update delete to work with IDs
private func deleteSelectedTranscriptions() {
let predicate = #Predicate<Transcription> { transcription in
selectedTranscriptions.contains(transcription.id)
}
try? modelContext.delete(model: Transcription.self, where: predicate)
}
```
**Benefits:**
- Faster selection on large datasets
- Reduced memory footprint
- More responsive UI
---
#### 🟠 High: Smart Search & Filters
**Issue:** Current search is basic text matching only.
**Proposed Enhancements:**
```swift
struct TranscriptionFilters: View {
@Binding var filters: FilterOptions
var body: some View {
HStack {
// Search text (existing)
TextField("Search", text: $filters.searchText)
// Date range
Menu {
Button("Today") { filters.dateRange = .today }
Button("Last 7 days") { filters.dateRange = .week }
Button("Last 30 days") { filters.dateRange = .month }
Button("Custom...") { filters.showDatePicker = true }
} label: {
Label("Date Range", systemImage: "calendar")
}
// Model filter
Menu {
Button("All Models") { filters.model = nil }
ForEach(availableModels) { model in
Button(model.displayName) {
filters.model = model
}
}
} label: {
Label("Model", systemImage: "brain.head.profile")
}
// Power Mode filter
Menu {
Button("All") { filters.powerMode = nil }
ForEach(powerModes) { mode in
Button("\(mode.emoji) \(mode.name)") {
filters.powerMode = mode
}
}
} label: {
Label("Power Mode", systemImage: "sparkles")
}
// Status filter
Picker("Status", selection: $filters.status) {
Text("All").tag(nil as TranscriptionStatus?)
Text("Completed").tag(TranscriptionStatus.completed)
Text("Failed").tag(TranscriptionStatus.failed)
}
}
}
}
```
**Benefits:**
- Find transcriptions faster
- Filter by context (Power Mode, model used)
- Better organization for power users
---
#### 🟡 Medium: Transcription Tagging System
**Issue:** No way to organize or categorize transcriptions.
**Proposed Solution:**
```swift
// Add to Transcription model
@Model
class Transcription {
// ... existing properties
var tags: [String] = []
var category: String?
}
// UI for tagging
struct TagEditor: View {
@Binding var tags: [String]
@State private var newTag = ""
var body: some View {
VStack(alignment: .leading) {
// Existing tags
FlowLayout {
ForEach(tags, id: \.self) { tag in
TagChip(tag: tag) {
tags.removeAll { $0 == tag }
}
}
}
// Add new tag
HStack {
TextField("Add tag", text: $newTag)
.textFieldStyle(.roundedBorder)
Button("Add") {
if !newTag.isEmpty {
tags.append(newTag)
newTag = ""
}
}
}
}
}
}
```
---
### 3. Audio Input & Device Management
#### 🔴 Critical: Audio Device Switching Without Restart
**Issue:** Changing audio device mid-recording can cause crashes (noted in AudioDeviceManager).
**Current State:**
```swift
// AudioDeviceManager.swift line 36
// No proper cleanup of audio engine before device change
```
**Proposed Fix:**
```swift
func setSelectedDevice(_ deviceID: AudioDeviceID) async throws {
// Stop recording if active
let wasRecording = isRecordingActive
if wasRecording {
await whisperState?.recorder.stopRecording()
}
// Wait for audio engine to release resources
try await Task.sleep(nanoseconds: 100_000_000) // 100ms
// Switch device
selectedDeviceID = deviceID
if let uid = getDeviceUID(deviceID: deviceID) {
UserDefaults.standard.selectedAudioDeviceUID = uid
}
// Restart recording if it was active
if wasRecording {
await whisperState?.recorder.startRecording()
}
}
```
**Benefits:**
- Safe device switching
- No crashes or audio corruption
- Better multi-device workflow
---
#### 🟠 High: Audio Level Monitoring in Settings
**Issue:** Can't test microphone levels before recording.
**Proposed Solution:**
```swift
// Add to AudioInputSettingsView
struct MicrophoneLevelMeter: View {
@StateObject private var monitor = AudioLevelMonitor()
var body: some View {
VStack(alignment: .leading, spacing: 8) {
Text("Microphone Test")
.font(.headline)
HStack {
ProgressView(value: monitor.currentLevel, total: 1.0)
.progressViewStyle(.linear)
.tint(monitor.currentLevel > 0.8 ? .red : .green)
Text("\(Int(monitor.currentLevel * 100))%")
.monospacedDigit()
}
Toggle("Monitor Input", isOn: $monitor.isMonitoring)
}
}
}
class AudioLevelMonitor: ObservableObject {
@Published var currentLevel: Float = 0
@Published var isMonitoring = false
private var audioEngine: AVAudioEngine?
func startMonitoring() {
// Setup audio tap on input node
}
func stopMonitoring() {
audioEngine?.stop()
}
}
```
---
#### 🟡 Medium: Prioritized Device Auto-Selection Improvements
**Issue:** Prioritized device mode exists but UX is unclear.
**Proposed Enhancement:**
```swift
// In AudioInputSettingsView
struct PrioritizedDeviceEditor: View {
@Binding var devices: [PrioritizedDevice]
var body: some View {
VStack(alignment: .leading, spacing: 12) {
Text("Device Priority Order")
.font(.headline)
Text("VoiceLink will automatically use the highest priority available device.")
.font(.caption)
.foregroundColor(.secondary)
List {
ForEach(devices) { device in
HStack {
Image(systemName: "line.3.horizontal")
.foregroundColor(.secondary)
Text(device.name)
Spacer()
Text("Priority \(device.priority)")
.foregroundColor(.secondary)
}
}
.onMove { from, to in
devices.move(fromOffsets: from, toOffset: to)
updatePriorities()
}
}
HStack {
Button("Add Current Device") {
if let current = AudioDeviceManager.shared.selectedDeviceID {
// Add to priority list
}
}
Button("Test Priority Order") {
AudioDeviceManager.shared.selectHighestPriorityAvailableDevice()
}
}
}
}
}
```
---
### 4. Power Mode Enhancements
#### 🟠 High: Power Mode Active Indicator
**Issue:** Hard to tell when Power Mode is active and which config is applied.
**Proposed Solution:**
```swift
// Add to MiniRecorderView
if let activeConfig = PowerModeManager.shared.currentActiveConfiguration,
activeConfig.isEnabled {
HStack(spacing: 4) {
Text(activeConfig.emoji)
Text(activeConfig.name)
.font(.caption2)
}
.padding(.horizontal, 8)
.padding(.vertical, 4)
.background(Color.accentColor.opacity(0.2))
.cornerRadius(8)
}
// Add to MenuBar
if let activeConfig = PowerModeManager.shared.currentActiveConfiguration {
Section("Power Mode Active") {
Text("\(activeConfig.emoji) \(activeConfig.name)")
.font(.system(size: 12, weight: .semibold))
Button("Disable Power Mode") {
Task {
await PowerModeSessionManager.shared.endSession()
}
}
}
}
```
---
#### 🟡 Medium: Power Mode Testing Tools
**Issue:** Hard to test Power Mode configs without switching apps.
**Proposed Solution:**
```swift
// Add to PowerModeView
struct PowerModeTestingPanel: View {
@State private var testURL = ""
@State private var testAppBundleID = ""
var body: some View {
GroupBox("Test Configuration") {
VStack(spacing: 12) {
TextField("App Bundle ID", text: $testAppBundleID)
.textFieldStyle(.roundedBorder)
TextField("Browser URL", text: $testURL)
.textFieldStyle(.roundedBorder)
Button("Simulate Activation") {
// Test which config would activate
let config = PowerModeManager.shared.findMatchingConfiguration(
appBundleID: testAppBundleID,
url: testURL
)
if let config = config {
// Show preview of what would be applied
} else {
// Show "No matching configuration"
}
}
}
}
}
}
```
---
### 5. UI/UX Polish
#### 🔴 Critical: First-Run Setup Improvements
**Issue:** Onboarding could be more streamlined.
**Proposed Enhancements:**
```swift
// Add quick-start preset
struct OnboardingPresetView: View {
var body: some View {
VStack(spacing: 24) {
Text("Choose Your Setup")
.font(.title)
HStack(spacing: 20) {
PresetCard(
title: "Simple",
subtitle: "Just transcription",
icon: "mic.fill"
) {
// Disable AI features, use base model
applySimplePreset()
}
PresetCard(
title: "Powered",
subtitle: "AI enhancement enabled",
icon: "sparkles"
) {
// Enable AI, setup Ollama
applyPoweredPreset()
}
PresetCard(
title: "Custom",
subtitle: "Configure manually",
icon: "slider.horizontal.3"
) {
// Show full onboarding
showFullOnboarding()
}
}
}
}
}
```
---
#### 🟠 High: Keyboard Shortcut Cheat Sheet
**Issue:** Many shortcuts exist but aren't discoverable.
**Proposed Solution:**
```swift
// Add help overlay accessible via Cmd+?
struct ShortcutCheatSheet: View {
var body: some View {
VStack(alignment: .leading, spacing: 12) {
Text("Keyboard Shortcuts")
.font(.title2)
Section("Recording") {
ShortcutRow(
action: "Start/Stop Recording",
shortcut: hotkeyManager.selectedHotkey1.displayName
)
ShortcutRow(
action: "Cancel Recording",
shortcut: "ESC ESC"
)
}
Section("Paste") {
ShortcutRow(
action: "Paste Original",
shortcut: KeyboardShortcuts.getShortcut(for: .pasteLastTranscription)
)
ShortcutRow(
action: "Paste Enhanced",
shortcut: KeyboardShortcuts.getShortcut(for: .pasteLastEnhancement)
)
}
Section("History") {
ShortcutRow(action: "Search", shortcut: "⌘F")
ShortcutRow(action: "Delete", shortcut: "⌫")
}
}
.padding()
.frame(width: 500, height: 600)
}
}
```
---
#### 🟡 Medium: Theme/Appearance Customization
**Issue:** UI is fixed, no customization options.
**Proposed Solution:**
```swift
// Add to Settings
struct AppearanceSettingsView: View {
@AppStorage("recorderOpacity") private var recorderOpacity = 0.9
@AppStorage("recorderScale") private var recorderScale = 1.0
@AppStorage("useCompactUI") private var useCompactUI = false
var body: some View {
SettingsSection(
icon: "paintbrush",
title: "Appearance",
subtitle: "Customize the look of VoiceLink"
) {
VStack(alignment: .leading, spacing: 12) {
HStack {
Text("Recorder Opacity")
Spacer()
Slider(value: $recorderOpacity, in: 0.5...1.0)
.frame(width: 200)
Text("\(Int(recorderOpacity * 100))%")
.monospacedDigit()
}
HStack {
Text("Recorder Size")
Spacer()
Slider(value: $recorderScale, in: 0.8...1.5)
.frame(width: 200)
Text("\(Int(recorderScale * 100))%")
.monospacedDigit()
}
Toggle("Compact UI Mode", isOn: $useCompactUI)
Divider()
Button("Reset to Defaults") {
recorderOpacity = 0.9
recorderScale = 1.0
useCompactUI = false
}
}
}
}
}
```
---
### 6. Accessibility Improvements
#### 🟠 High: Better Screen Reader Support
**Issue:** Some UI elements lack proper accessibility labels.
**Proposed Fixes:**
```swift
// Add to critical UI elements
Button(action: startRecording) {
Image(systemName: "mic.fill")
}
.accessibilityLabel("Start recording")
.accessibilityHint("Tap to begin voice recording")
// Recorder status
Text(statusText)
.accessibilityLabel("Recording status: \(statusText)")
.accessibilityAddTraits(.updatesFrequently)
// Audio visualizer
AudioVisualizerView(audioMeter: meter)
.accessibilityLabel("Audio level: \(Int(meter * 100)) percent")
.accessibilityAddTraits(.updatesFrequently)
```
---
#### 🟡 Medium: High Contrast Mode Support
**Issue:** UI may be hard to read in bright environments.
**Proposed Solution:**
```swift
@Environment(\.accessibilityReduceTransparency) var reduceTransparency
var backgroundView: some View {
if reduceTransparency {
Color.black // Solid background
} else {
// Existing translucent background
ZStack {
Color.black.opacity(0.9)
VisualEffectView(...)
}
}
}
```
---
### 7. Export & Integration Features
#### 🟠 High: Export Format Options
**Issue:** Only CSV export is available.
**Proposed Solution:**
```swift
enum ExportFormat: String, CaseIterable {
case csv = "CSV"
case json = "JSON"
case markdown = "Markdown"
case txt = "Plain Text"
case srt = "Subtitles (SRT)"
}
struct ExportOptionsView: View {
@State private var format: ExportFormat = .csv
@State private var includeAudio = false
@State private var includeMetadata = true
var body: some View {
VStack {
Picker("Format", selection: $format) {
ForEach(ExportFormat.allCases, id: \.self) { format in
Text(format.rawValue).tag(format)
}
}
Toggle("Include audio files", isOn: $includeAudio)
Toggle("Include metadata", isOn: $includeMetadata)
Button("Export") {
exportTranscriptions(
format: format,
includeAudio: includeAudio,
includeMetadata: includeMetadata
)
}
}
}
}
```
---
#### 🟡 Medium: Webhook Integration
**Issue:** No way to send transcriptions to external services automatically.
**Proposed Solution:**
```swift
// Add webhook configuration
struct WebhookSettings: Codable {
var url: String
var enabled: Bool
var includeAudio: Bool
var headers: [String: String]
}
// Trigger after transcription completes
func sendToWebhook(_ transcription: Transcription) async throws {
guard let settings = loadWebhookSettings(),
settings.enabled else { return }
let payload: [String: Any] = [
"text": transcription.text,
"timestamp": transcription.timestamp.ISO8601Format(),
"model": transcription.transcriptionModelName ?? "unknown",
"duration": transcription.duration
]
guard let url = URL(string: settings.url), url.scheme?.lowercased() == "https" else {
throw WebhookError.invalidURL
}
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.httpBody = try JSONSerialization.data(withJSONObject: payload)
// Add custom headers
for (key, value) in settings.headers {
request.addValue(value, forHTTPHeaderField: key)
}
let (_, response) = try await URLSession.shared.data(for: request)
guard (response as? HTTPURLResponse)?.statusCode == 200 else {
throw WebhookError.requestFailed
}
}
```
---
## Developer-Facing Improvements
### 1. Code Architecture & Organization
#### 🔴 Critical: State Management Consolidation
**Issue:** State is scattered across multiple `@Published` properties and UserDefaults.
**Current Problems:**
- 50+ UserDefaults keys spread across files
- No centralized configuration management
- Hard to track what settings exist
- Difficult to implement import/export
**Proposed Solution:**
```swift
// Create centralized app state
@MainActor
class AppState: ObservableObject {
// MARK: - Singleton
static let shared = AppState()
// MARK: - Recording Settings
@AppStorage("RecorderType") var recorderType: RecorderType = .mini
@AppStorage("AppendTrailingSpace") var appendTrailingSpace = true
@AppStorage("UseAppleScriptPaste") var useAppleScriptPaste = false
@AppStorage("preserveTranscriptInClipboard") var preserveClipboard = false
// MARK: - Audio Settings
@AppStorage("selectedAudioDeviceUID") var selectedAudioDeviceUID: String?
@AppStorage("audioInputMode") var audioInputMode: AudioInputMode = .systemDefault
@AppStorage("isSystemMuteEnabled") var isSystemMuteEnabled = false
// MARK: - AI Settings
@AppStorage("enableAIEnhancementFeatures") var enableAIFeatures = false
@AppStorage("IsTextFormattingEnabled") var isTextFormattingEnabled = true
@AppStorage("IsWordReplacementEnabled") var isWordReplacementEnabled = false
// MARK: - UI Settings
@AppStorage("hasCompletedOnboarding") var hasCompletedOnboarding = false
@AppStorage("isMenuBarOnly") var isMenuBarOnly = false
// MARK: - Cleanup Settings
@AppStorage("IsTranscriptionCleanupEnabled") var isTranscriptionCleanupEnabled = false
@AppStorage("TranscriptionCleanupDelay") var cleanupDelay: Double = 0
// MARK: - Export/Import
func exportSettings() -> AppSettings {
AppSettings(
recorderType: recorderType,
appendTrailingSpace: appendTrailingSpace,
// ... all other settings
)
}
func importSettings(_ settings: AppSettings) {
recorderType = settings.recorderType
appendTrailingSpace = settings.appendTrailingSpace
// ... all other settings
}
}
struct AppSettings: Codable {
let recorderType: RecorderType
let appendTrailingSpace: Bool
// ... all settings as codable properties
}
```
**Benefits:**
- Single source of truth
- Type-safe access to settings
- Easy import/export
- Better testability
- Clearer dependencies
---
#### 🟠 High: Service Layer Standardization
**Issue:** Services have inconsistent interfaces and error handling.
**Current State:**
- Some services use protocols, some don't
- Error types vary across services
- Async/await not consistently used
**Proposed Solution:**
```swift
// Standard service protocol
protocol Service: AnyObject {
associatedtype Configuration
associatedtype Error: LocalizedError
var isConfigured: Bool { get }
func configure(_ config: Configuration) async throws
func reset() async
}
// Standard error handling
protocol ServiceError: LocalizedError {
var errorTitle: String { get }
var errorDescription: String? { get }
var recoverySuggestion: String? { get }
var underlyingError: Error? { get }
}
// Example implementation
class TranscriptionServiceBase: Service {
typealias Configuration = TranscriptionConfig
typealias Error = TranscriptionError
var isConfigured: Bool {
// Check if service is ready
}
func configure(_ config: TranscriptionConfig) async throws {
// Setup service
}
func reset() async {
// Cleanup resources
}
}
// Standardized error
enum TranscriptionError: ServiceError {
case modelNotLoaded
case audioProcessingFailed(Error)
case networkError(Error)
case invalidConfiguration
var errorTitle: String {
switch self {
case .modelNotLoaded: return "Model Not Loaded"
case .audioProcessingFailed: return "Audio Processing Failed"
case .networkError: return "Network Error"
case .invalidConfiguration: return "Invalid Configuration"
}
}
var errorDescription: String? {
switch self {
case .modelNotLoaded:
return "The transcription model is not loaded."
case .audioProcessingFailed(let error):
return "Failed to process audio: \(error.localizedDescription)"
case .networkError(let error):
return "Network request failed: \(error.localizedDescription)"
case .invalidConfiguration:
return "Service configuration is invalid."
}
}
var recoverySuggestion: String? {
switch self {
case .modelNotLoaded:
return "Please download or select a transcription model in Settings."
case .audioProcessingFailed:
return "Try recording again or check your audio input settings."
case .networkError:
return "Check your internet connection and API credentials."
case .invalidConfiguration:
return "Review your service configuration in Settings."
}
}
var underlyingError: Error? {
switch self {
case .audioProcessingFailed(let error), .networkError(let error):
return error
default:
return nil
}
}
}
```
---
#### 🟠 High: Dependency Injection Improvements
**Issue:** Many classes create their own dependencies, making testing difficult.
**Current State:**
```swift
class WhisperState {
// Hard-coded dependencies
private var localTranscriptionService: LocalTranscriptionService!
private lazy var cloudTranscriptionService = CloudTranscriptionService()
private lazy var nativeAppleTranscriptionService = NativeAppleTranscriptionService()
}
```
**Proposed Solution:**
```swift
// Create service container
@MainActor
class ServiceContainer {
static let shared = ServiceContainer()
// Services
let transcriptionService: TranscriptionServiceProtocol
let enhancementService: AIEnhancementService
let audioDeviceManager: AudioDeviceManager
let powerModeManager: PowerModeManager
init(
transcriptionService: TranscriptionServiceProtocol? = nil,
enhancementService: AIEnhancementService? = nil,
audioDeviceManager: AudioDeviceManager? = nil,
powerModeManager: PowerModeManager? = nil
) {
self.transcriptionService = transcriptionService ?? LocalTranscriptionService()
self.enhancementService = enhancementService ?? AIEnhancementService()
self.audioDeviceManager = audioDeviceManager ?? AudioDeviceManager.shared
self.powerModeManager = powerModeManager ?? PowerModeManager.shared
}
}
// Updated WhisperState
class WhisperState {
private let services: ServiceContainer
init(
modelContext: ModelContext,
services: ServiceContainer = .shared
) {
self.modelContext = modelContext
self.services = services
}
func transcribeAudio(on transcription: Transcription) async {
let service = services.transcriptionService
let text = try await service.transcribe(...)
}
}
```
**Benefits:**
- Testable with mock services
- Clear dependencies
- Easier to swap implementations
- Better code organization
---
### 2. Testing Infrastructure
#### 🔴 Critical: Unit Testing Setup
**Issue:** No automated tests exist.
**Proposed Solution:**
```swift
// Create test target structure
VoiceInkTests/
├── Models/
├── TranscriptionModelTests.swift
└── PowerModeConfigTests.swift
├── Services/
├── TranscriptionServiceTests.swift
├── AIEnhancementServiceTests.swift
└── AudioDeviceManagerTests.swift
├── Utilities/
├── TextFormatterTests.swift
└── WordReplacementTests.swift
└── Mocks/
├── MockTranscriptionService.swift
├── MockAIService.swift
└── MockAudioDevice.swift
// Example test
import XCTest
@testable import VoiceInk
class TranscriptionServiceTests: XCTestCase {
var service: LocalTranscriptionService!
var mockModelContext: ModelContext!
override func setUp() {
super.setUp()
service = LocalTranscriptionService(
modelsDirectory: testModelsDirectory,
whisperState: mockWhisperState
)
}
func testTranscribeShortAudio() async throws {
let testAudioURL = Bundle(for: type(of: self))
.url(forResource: "test_audio", withExtension: "wav")!
let model = PredefinedModels.whisperBase
let result = try await service.transcribe(
audioURL: testAudioURL,
model: model
)
XCTAssertFalse(result.isEmpty)
XCTAssertTrue(result.contains("test"))
}
func testTranscribeWithInvalidAudio() async {
let invalidURL = URL(fileURLWithPath: "/nonexistent.wav")
do {
_ = try await service.transcribe(
audioURL: invalidURL,
model: PredefinedModels.whisperBase
)
XCTFail("Should throw error")
} catch {
XCTAssertTrue(error is TranscriptionError)
}
}
}
```
---
#### 🟠 High: UI Testing for Critical Flows
**Proposed Tests:**
```swift
class OnboardingUITests: XCTestCase {
func testCompleteOnboarding() {
let app = XCUIApplication()
app.launch()
// Should show onboarding for first run
XCTAssertTrue(app.staticTexts["Welcome to VoiceLink"].exists)
// Step through onboarding
app.buttons["Continue"].tap()
app.buttons["Grant Permissions"].tap()
app.buttons["Select Model"].tap()
app.buttons["Finish"].tap()
// Should show main app
XCTAssertTrue(app.staticTexts["Dashboard"].exists)
}
}
class RecordingUITests: XCTestCase {
func testStartStopRecording() {
let app = XCUIApplication()
app.launch()
// Trigger recording via hotkey
XCUIApplication().typeKey("r", modifierFlags: .command)
// Recorder should appear
XCTAssertTrue(app.windows["MiniRecorder"].exists)
// Stop recording
XCUIApplication().typeKey("r", modifierFlags: .command)
// Should show transcription
let historyTab = app.buttons["History"]
historyTab.tap()
XCTAssertTrue(app.tables["TranscriptionHistory"].cells.count > 0)
}
}
```
---
### 3. Documentation Improvements
#### 🟠 High: API Documentation
**Issue:** Many public APIs lack documentation.
**Proposed Solution:**
```swift
/// Manages transcription of audio files using various AI models.
///
/// `WhisperState` coordinates the entire transcription workflow including:
/// - Audio recording and playback
/// - Model loading and management
/// - Transcription execution
/// - AI enhancement integration
/// - Power Mode session management
///
/// ## Usage
/// ```swift
/// let whisperState = WhisperState(
/// modelContext: modelContext,
/// enhancementService: enhancementService
/// )
///
/// // Start recording
/// await whisperState.toggleRecord()
///
/// // Transcription happens automatically when recording stops
/// ```
///
/// ## Thread Safety
/// This class is marked `@MainActor` and all methods must be called on the main thread.
///
/// ## See Also
/// - ``TranscriptionService``
/// - ``AIEnhancementService``
/// - ``PowerModeSessionManager``
@MainActor
class WhisperState: NSObject, ObservableObject {
/// The current state of the recording/transcription process.
///
/// Possible states:
/// - `.idle`: Ready to start recording
/// - `.recording`: Currently capturing audio
/// - `.transcribing`: Converting audio to text
/// - `.enhancing`: Applying AI enhancement
/// - `.busy`: Processing, user action blocked
@Published var recordingState: RecordingState = .idle
/// Starts or stops recording based on current state.
///
/// When called while idle, begins audio recording. When called during recording,
/// stops capture and automatically begins transcription.
///
/// - Throws: `RecordingError` if audio capture fails to start
/// - Important: Requires microphone permission granted
///
/// ## Example
/// ```swift
/// // Start recording
/// await whisperState.toggleRecord()
///
/// // ... user speaks ...
///
/// // Stop and transcribe
/// await whisperState.toggleRecord()
/// ```
func toggleRecord() async {
// Implementation
}
}
```
---
#### 🟡 Medium: Architecture Decision Records (ADRs)
**Proposed Structure:**
```markdown
# docs/architecture/
├── ADR-001-state-management.md
├── ADR-002-transcription-pipeline.md
├── ADR-003-power-mode-sessions.md
├── ADR-004-audio-device-handling.md
└── ADR-005-error-handling-strategy.md
# Example ADR
# ADR-003: Power Mode Session Management
## Status
Accepted
## Context
Power Mode needs to temporarily override app settings when a specific app/URL
is detected, then restore original settings when the context changes.
## Decision
Use session-based state management with UserDefaults persistence for crash recovery.
## Consequences
Positive:
- Settings survive app crashes
- Clear session lifecycle
- Easy to test and debug
Negative:
- Extra UserDefaults reads/writes
- Need to handle abandoned sessions
## Alternatives Considered
1. In-memory only (loses state on crash)
2. SwiftData models (overkill for ephemeral state)
```
---
### 4. Debugging & Logging
#### 🔴 Critical: Structured Logging System
**Issue:** Logging is inconsistent (mix of `print()`, `Logger`, and `#if DEBUG`).
**Proposed Solution:**
```swift
// Create unified logging system
import OSLog
extension Logger {
/// Logger for transcription operations
static let transcription = Logger(
subsystem: Bundle.main.bundleIdentifier!,
category: "Transcription"
)
/// Logger for audio operations
static let audio = Logger(
subsystem: Bundle.main.bundleIdentifier!,
category: "Audio"
)
/// Logger for Power Mode
static let powerMode = Logger(
subsystem: Bundle.main.bundleIdentifier!,
category: "PowerMode"
)
/// Logger for AI enhancement
static let ai = Logger(
subsystem: Bundle.main.bundleIdentifier!,
category: "AI"
)
}
// Usage
Logger.transcription.info("Starting transcription for audio: \(url.lastPathComponent)")
Logger.transcription.error("Transcription failed: \(error.localizedDescription)")
// Replace all print statements
// ❌ Remove
print("🔄 Starting transcription...")
// ✅ Replace with
Logger.transcription.info("Starting transcription")
```
**Benefits:**
- Structured log filtering
- Performance insights
- Better debugging
- Production-ready logging
---
#### 🟠 High: Debug Menu for Development
**Proposed Addition:**
```swift
#if DEBUG
struct DebugMenu: View {
@EnvironmentObject var whisperState: WhisperState
@EnvironmentObject var enhancementService: AIEnhancementService
var body: some View {
Menu("🐛 Debug") {
Section("State Inspection") {
Button("Print WhisperState") {
printState(whisperState)
}
Button("Print Service Status") {
printServices()
}
Button("Export Logs") {
exportLogs()
}
}
Section("Test Actions") {
Button("Simulate Recording") {
Task { await simulateRecording() }
}
Button("Trigger Test Transcription") {
Task { await testTranscription() }
}
Button("Force Power Mode Session") {
Task { await forcePowerMode() }
}
}
Section("Reset") {
Button("Clear All Transcriptions") {
deleteAllTranscriptions()
}
Button("Reset User Defaults") {
resetUserDefaults()
}
Button("Clear Model Cache") {
clearModelCache()
}
}
}
}
}
#endif
```
---
### 5. Performance Optimizations
#### 🟠 High: Model Loading Performance
**Issue:** Model loading blocks UI during startup.
**Current State:**
```swift
// Loads model synchronously
func loadModel(_ model: WhisperModel) async throws {
let context = try WhisperContext(url: model.url)
whisperContext = context
}
```
**Proposed Optimization:**
```swift
// Add background preloading
class ModelPreloader {
private var preloadedModels: [String: WhisperContext] = [:]
func preloadDefaultModel() async {
guard let defaultModel = UserDefaults.standard.defaultModelName else { return }
Task.detached(priority: .utility) {
do {
let context = try await self.loadModelInBackground(defaultModel)
await MainActor.run {
self.preloadedModels[defaultModel] = context
}
} catch {
Logger.transcription.error("Failed to preload model: \(error)")
}
}
}
func getModel(_ name: String) async throws -> WhisperContext {
if let cached = preloadedModels[name] {
return cached
}
return try await loadModelInBackground(name)
}
}
```
---
#### 🟡 Medium: Transcription History Virtualization
**Current State:**
- Pagination implemented but could be more efficient
- All visible transcriptions kept in memory
**Proposed Enhancement:**
```swift
// Use LazyVGrid with proper item sizing
LazyVStack(spacing: 10, pinnedViews: [.sectionHeaders]) {
ForEach(displayedTranscriptions) { transcription in
TranscriptionCard(transcription: transcription)
.id(transcription.id)
.frame(height: cardHeight(for: transcription))
.onAppear {
if transcription == displayedTranscriptions.last {
Task { await loadMoreContent() }
}
}
}
}
// Cache card heights
private var cardHeights: [UUID: CGFloat] = [:]
private func cardHeight(for transcription: Transcription) -> CGFloat {
if let cached = cardHeights[transcription.id] {
return cached
}
let baseHeight: CGFloat = 100
let isExpanded = expandedTranscription == transcription
let height = isExpanded ? 300 : baseHeight
cardHeights[transcription.id] = height
return height
}
```
---
### 6. Build & Development Workflow
#### 🟠 High: Continuous Integration Setup
**Proposed GitHub Actions:**
```yaml
# .github/workflows/ci.yml
name: CI
on:
push:
branches: [ custom-main-v2 ]
pull_request:
branches: [ custom-main-v2 ]
jobs:
build:
runs-on: macos-14
steps:
- uses: actions/checkout@v3
- name: Setup Xcode
uses: maxim-lobanov/setup-xcode@v1
with:
xcode-version: '15.0'
- name: Cache SPM
uses: actions/cache@v3
with:
path: .build
key: ${{ runner.os }}-spm-${{ hashFiles('**/Package.resolved') }}
- name: Build
run: xcodebuild -project VoiceInk.xcodeproj -scheme VoiceInk -configuration Debug build
- name: Run Tests
run: xcodebuild test -project VoiceInk.xcodeproj -scheme VoiceInk -destination 'platform=macOS'
- name: SwiftLint
run: |
brew install swiftlint
swiftlint lint --reporter github-actions-logging
code-quality:
runs-on: macos-14
steps:
- uses: actions/checkout@v3
- name: Check for TODOs
run: |
if grep -r "TODO\|FIXME\|XXX" VoiceInk/ --exclude-dir={Build,DerivedData} | grep -v "QUALITY_OF_LIFE"; then
echo "⚠️ Found untracked TODOs/FIXMEs"
exit 1
fi
```
---
#### 🟡 Medium: Pre-commit Hooks
**Proposed Setup:**
```bash
#!/bin/bash
# .git/hooks/pre-commit
echo "Running pre-commit checks..."
# Format Swift code
if command -v swiftformat &> /dev/null; then
swiftformat VoiceInk/ --quiet
git add VoiceInk/**/*.swift
fi
# Lint
if command -v swiftlint &> /dev/null; then
swiftlint lint --quiet --config .swiftlint.yml
if [ $? -ne 0 ]; then
echo "❌ SwiftLint found issues"
exit 1
fi
fi
# Check for debug prints
if git diff --cached --name-only | grep "\.swift$" | xargs grep -n "print(" | grep -v "// OK:"; then
echo "❌ Found print() statements. Use Logger instead."
exit 1
fi
# Check for force unwraps in production code
if git diff --cached --name-only | grep "\.swift$" | grep -v "Test" | xargs grep -n "!" | grep -v "// OK:"; then
echo "⚠️ Found force unwraps. Consider safe unwrapping."
fi
echo "✅ Pre-commit checks passed"
```
---
## Implementation Priorities
### Phase 1: Critical User Experience (2-3 weeks)
1. ✅ Recording state visual feedback
2. ✅ Keyboard shortcut for cancel
3. ✅ Recording length indicator
4. ✅ Audio device switching fixes
5. ✅ Bulk actions performance
### Phase 2: Developer Infrastructure (2-3 weeks)
1. ✅ State management consolidation
2. ✅ Structured logging system
3. ✅ Unit testing setup
4. ✅ Service layer standardization
5. ✅ Dependency injection
### Phase 3: Feature Enhancements (3-4 weeks)
1. ✅ Smart search & filters
2. ✅ Power Mode active indicator
3. ✅ First-run setup improvements
4. ✅ Export format options
5. ✅ Keyboard shortcut cheat sheet
### Phase 4: Polish & Optimization (2-3 weeks)
1. ✅ Theme/appearance customization
2. ✅ Accessibility improvements
3. ✅ Performance optimizations
4. ✅ API documentation
5. ✅ CI/CD setup
---
## Metrics for Success
### User Metrics
- **Setup Time**: Reduce first-run to transcription from 5min → 2min
- **Discoverability**: 80%+ users find keyboard shortcuts within first week
- **Error Recovery**: 90%+ users successfully recover from recording failures
- **Performance**: History view remains responsive with 1000+ transcriptions
### Developer Metrics
- **Test Coverage**: Achieve 60%+ code coverage
- **Build Time**: Keep clean build under 2 minutes
- **Code Quality**: Maintain SwiftLint score >95%
- **Documentation**: 100% public API documented
---
## Long-Term Vision
### Advanced Features (Future)
1. **Multi-language Live Translation**
- Transcribe in one language, output in another
- Real-time translation during recording
2. **Voice Commands**
- "Start recording", "Stop recording"
- "Enhance last transcription"
- "Open settings"
3. **Collaborative Features**
- Share transcriptions with team
- Collaborative editing
- Comments and annotations
4. **Advanced Analytics**
- Speaking patterns analysis
- Word frequency insights
- Time-of-day productivity tracking
5. **Plugin System**
- Custom transcription filters
- Third-party AI providers
- Custom export formats
---
## Contributing
To implement these improvements:
1. **Choose an item** from the list above
2. **Create a branch**: `feature/improvement-name`
3. **Implement the change** following AGENTS.md guidelines
4. **Add tests** if applicable
5. **Update documentation** as needed
6. **Submit PR** with before/after screenshots for UI changes
---
## Appendix: Code Snippets Library
### A. Safe Optional Unwrapping Pattern
```swift
// ❌ Avoid
let text = transcription.enhancedText!
// ✅ Use
guard let text = transcription.enhancedText else {
Logger.ai.warning("No enhanced text available")
return transcription.text
}
```
### B. Async Task with Cancellation
```swift
private var task: Task<Void, Never>?
func startBackgroundWork() {
task?.cancel()
task = Task {
do {
try await Task.sleep(for: .seconds(1))
guard !Task.isCancelled else { return }
await performWork()
} catch {
// Handle cancellation
}
}
}
func stopBackgroundWork() {
task?.cancel()
task = nil
}
```
### C. UserDefaults Extension
```swift
extension UserDefaults {
enum Keys {
static let recorderType = "RecorderType"
static let appendTrailingSpace = "AppendTrailingSpace"
static let selectedAudioDeviceUID = "selectedAudioDeviceUID"
}
var recorderType: RecorderType {
get {
guard let raw = string(forKey: Keys.recorderType),
let type = RecorderType(rawValue: raw) else {
return .mini
}
return type
}
set {
set(newValue.rawValue, forKey: Keys.recorderType)
}
}
}
```
### D. View Modifier for Consistent Styling
```swift
struct CardStyle: ViewModifier {
let isSelected: Bool
func body(content: Content) -> some View {
content
.padding()
.background(CardBackground(isSelected: isSelected))
.cornerRadius(10)
.shadow(radius: isSelected ? 4 : 2)
}
}
extension View {
func cardStyle(isSelected: Bool = false) -> some View {
modifier(CardStyle(isSelected: isSelected))
}
}
// Usage
VStack {
Text("Content")
}
.cardStyle(isSelected: true)
```
---
**Last Updated:** November 3, 2025
**Version:** 1.0
**Maintained By:** VoiceLink Community