Quality of Life Improvements for VoiceLink Community

Date: November 3, 2025
Analysis Type: User & Developer Experience Audit
Scope: VoiceInk codebase fork analysis


Executive Summary

This document identifies quality of life improvements for both users and developers of VoiceLink Community. The analysis covers UX/UI enhancements, workflow optimizations, accessibility features, code quality improvements, and maintainability enhancements.

Priority Legend:

  • 🔴 Critical - High impact, relatively easy to implement
  • 🟠 High - Significant improvement, moderate effort
  • 🟡 Medium - Nice to have, moderate effort
  • 🟢 Low - Polish items, lower priority

User-Facing Improvements

1. Recording & Transcription Workflow

🔴 Critical: Recording State Visual Feedback

Issue: Current recorder provides minimal feedback during transcription/enhancement phases.

Current State:

  • Status changes between .recording, .transcribing, .enhancing, .busy
  • Limited visual differentiation in the mini recorder
  • No progress indicator during long transcriptions

Proposed Solution:

// Add to RecorderStatusDisplay
struct RecorderStatusDisplay: View {
    let currentState: RecordingState
    let audioMeter: Float
    @State private var transcriptionProgress: Double = 0
    
    var statusText: String {
        switch currentState {
        case .recording: return "Recording..."
        case .transcribing: return "Transcribing..."
        case .enhancing: return "Enhancing with AI..."
        case .busy: return "Processing..."
        case .idle: return "Ready"
        }
    }
    
    var body: some View {
        VStack(spacing: 4) {
            // Current visualizer
            AudioVisualizerView(audioMeter: audioMeter)
            
            // Add progress bar for processing states
            if currentState != .recording && currentState != .idle {
                ProgressView(value: transcriptionProgress)
                    .progressViewStyle(.linear)
                    .frame(height: 2)
            }
            
            Text(statusText)
                .font(.caption2)
                .foregroundColor(.secondary)
        }
    }
}

Benefits:

  • Users know exactly what's happening during each phase
  • Reduces anxiety during long transcriptions
  • Clear visual state transitions

🔴 Critical: Keyboard Shortcut for Cancel Recording

Issue: User must wait for transcription to complete or manually close recorder. Double-tap Escape is not discoverable.

Current State:

// Custom cancel shortcut is optional and hidden
@State private var isCustomCancelEnabled = false
if isCustomCancelEnabled {
    KeyboardShortcuts.Recorder(for: .cancelRecorder)
}

Proposed Solution:

  • Make Escape cancellation always available with clear UI indication
  • Add cancel button to recorder UI
  • Show "Press ESC to cancel" hint during recording

// In MiniRecorderView
if whisperState.recordingState == .recording {
    Button(action: {
        Task {
            await whisperState.cancelRecording()
        }
    }) {
        Image(systemName: "xmark.circle.fill")
            .foregroundColor(.red)
    }
    .help("Cancel recording (ESC)")
}

Benefits:

  • Immediate control over recording session
  • Prevents accidental long transcriptions
  • Improved user confidence

🟠 High: Quick Retry Last Transcription

Issue: Retry already exists but is not discoverable or well integrated into the UI.

Current State:

  • Keyboard shortcut exists (.retryLastTranscription)
  • Not visible in UI
  • No indication when retry is in progress

Proposed Enhancement:

  • Add retry button to transcription history cards
  • Show retry indicator in mini recorder
  • Add "Retry with different model" option

// In TranscriptionCard
HStack {
    Button("Retry") {
        LastTranscriptionService.retryLastTranscription(
            from: modelContext,
            whisperState: whisperState
        )
    }
    
    Menu {
        ForEach(whisperState.allAvailableModels, id: \.name) { model in
            Button(model.displayName) {
                // Retry with specific model
            }
        }
    } label: {
        Image(systemName: "chevron.down")
    }
}

🟠 High: Recording Length Indicator

Issue: No visual indication of recording duration.

Proposed Solution:

// Add to RecorderStatusDisplay
@State private var recordingDuration: TimeInterval = 0
private let timer = Timer.publish(every: 0.1, on: .main, in: .common).autoconnect()

Text(formatDuration(recordingDuration))
    .font(.system(.caption, design: .monospaced))
    .foregroundColor(.primary)
    .onReceive(timer) { _ in
        if whisperState.recordingState == .recording {
            recordingDuration += 0.1
        }
    }

private func formatDuration(_ duration: TimeInterval) -> String {
    let minutes = Int(duration) / 60
    let seconds = Int(duration) % 60
    return String(format: "%02d:%02d", minutes, seconds)
}

Benefits:

  • Users know how long they've been recording
  • Helps prevent accidentally long recordings
  • Visual feedback that recording is active

2. Transcription History & Management

🔴 Critical: Bulk Actions Performance

Issue: Selecting all transcriptions can be slow with large datasets.

Current Implementation:

// Loads all transcriptions into memory
private func selectAllTranscriptions() async {
    let allTranscriptions = try modelContext.fetch(allDescriptor)
    selectedTranscriptions = Set(allTranscriptions)
}

Proposed Optimization:

// Only fetch IDs for selection; load full objects lazily when needed
private func selectAllTranscriptions() async {
    var descriptor = FetchDescriptor<Transcription>()
    descriptor.propertiesToFetch = [\Transcription.id, \Transcription.timestamp]
    
    if let ids = try? modelContext.fetch(descriptor).map(\.id) {
        selectedTranscriptions = Set(ids)
    }
}

// Update delete to work with IDs; capture the set locally so the
// #Predicate macro doesn't need to reference self
private func deleteSelectedTranscriptions() {
    let ids = selectedTranscriptions
    let predicate = #Predicate<Transcription> { transcription in
        ids.contains(transcription.id)
    }
    try? modelContext.delete(model: Transcription.self, where: predicate)
}

Benefits:

  • Faster selection on large datasets
  • Reduced memory footprint
  • More responsive UI

🟠 High: Smart Search & Filters

Issue: Current search is basic text matching only.

Proposed Enhancements:

struct TranscriptionFilters: View {
    @Binding var filters: FilterOptions
    
    var body: some View {
        HStack {
            // Search text (existing)
            TextField("Search", text: $filters.searchText)
            
            // Date range
            Menu {
                Button("Today") { filters.dateRange = .today }
                Button("Last 7 days") { filters.dateRange = .week }
                Button("Last 30 days") { filters.dateRange = .month }
                Button("Custom...") { filters.showDatePicker = true }
            } label: {
                Label("Date Range", systemImage: "calendar")
            }
            
            // Model filter
            Menu {
                Button("All Models") { filters.model = nil }
                ForEach(availableModels) { model in
                    Button(model.displayName) {
                        filters.model = model
                    }
                }
            } label: {
                Label("Model", systemImage: "brain.head.profile")
            }
            
            // Power Mode filter
            Menu {
                Button("All") { filters.powerMode = nil }
                ForEach(powerModes) { mode in
                    Button("\(mode.emoji) \(mode.name)") {
                        filters.powerMode = mode
                    }
                }
            } label: {
                Label("Power Mode", systemImage: "sparkles")
            }
            
            // Status filter
            Picker("Status", selection: $filters.status) {
                Text("All").tag(nil as TranscriptionStatus?)
                Text("Completed").tag(TranscriptionStatus.completed)
                Text("Failed").tag(TranscriptionStatus.failed)
            }
        }
    }
}

Benefits:

  • Find transcriptions faster
  • Filter by context (Power Mode, model used)
  • Better organization for power users

🟡 Medium: Transcription Tagging System

Issue: No way to organize or categorize transcriptions.

Proposed Solution:

// Add to Transcription model
@Model
class Transcription {
    // ... existing properties
    var tags: [String] = []
    var category: String?
}

// UI for tagging
struct TagEditor: View {
    @Binding var tags: [String]
    @State private var newTag = ""
    
    var body: some View {
        VStack(alignment: .leading) {
            // Existing tags
            FlowLayout {
                ForEach(tags, id: \.self) { tag in
                    TagChip(tag: tag) {
                        tags.removeAll { $0 == tag }
                    }
                }
            }
            
            // Add new tag
            HStack {
                TextField("Add tag", text: $newTag)
                    .textFieldStyle(.roundedBorder)
                Button("Add") {
                    if !newTag.isEmpty {
                        tags.append(newTag)
                        newTag = ""
                    }
                }
            }
        }
    }
}
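
Once tags are stored on the model, the history view can filter on them. A minimal sketch of the filter logic, using a plain struct as a stand-in for the SwiftData model so it is self-contained:

```swift
// Stand-in for the Transcription model above, so the filter logic is self-contained.
struct TaggedTranscription {
    let text: String
    var tags: [String]
}

// Case-insensitive tag filter for the history view.
func filter(_ items: [TaggedTranscription], byTag tag: String) -> [TaggedTranscription] {
    let needle = tag.lowercased()
    return items.filter { $0.tags.contains { $0.lowercased() == needle } }
}
```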

3. Audio Input & Device Management

🔴 Critical: Audio Device Switching Without Restart

Issue: Changing audio device mid-recording can cause crashes (noted in AudioDeviceManager).

Current State:

// AudioDeviceManager.swift line 36
// No proper cleanup of audio engine before device change

Proposed Fix:

func setSelectedDevice(_ deviceID: AudioDeviceID) async throws {
    // Stop recording if active
    let wasRecording = isRecordingActive
    if wasRecording {
        await whisperState?.recorder.stopRecording()
    }
    
    // Wait for audio engine to release resources
    try await Task.sleep(nanoseconds: 100_000_000) // 100ms
    
    // Switch device
    selectedDeviceID = deviceID
    if let uid = getDeviceUID(deviceID: deviceID) {
        UserDefaults.standard.selectedAudioDeviceUID = uid
    }
    
    // Restart recording if it was active
    if wasRecording {
        await whisperState?.recorder.startRecording()
    }
}

Benefits:

  • Safe device switching
  • No crashes or audio corruption
  • Better multi-device workflow

🟠 High: Audio Level Monitoring in Settings

Issue: Can't test microphone levels before recording.

Proposed Solution:

// Add to AudioInputSettingsView
struct MicrophoneLevelMeter: View {
    @StateObject private var monitor = AudioLevelMonitor()
    
    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            Text("Microphone Test")
                .font(.headline)
            
            HStack {
                ProgressView(value: monitor.currentLevel, total: 1.0)
                    .progressViewStyle(.linear)
                    .tint(monitor.currentLevel > 0.8 ? .red : .green)
                
                Text("\(Int(monitor.currentLevel * 100))%")
                    .monospacedDigit()
            }
            
            Toggle("Monitor Input", isOn: $monitor.isMonitoring)
                .onChange(of: monitor.isMonitoring) { monitoring in
                    if monitoring { monitor.startMonitoring() } else { monitor.stopMonitoring() }
                }
        }
    }
}

class AudioLevelMonitor: ObservableObject {
    @Published var currentLevel: Float = 0
    @Published var isMonitoring = false
    private var audioEngine: AVAudioEngine?
    
    func startMonitoring() {
        let engine = AVAudioEngine()
        let input = engine.inputNode
        input.installTap(onBus: 0, bufferSize: 1024, format: input.outputFormat(forBus: 0)) { [weak self] buffer, _ in
            // Simple RMS level from the first channel
            guard let data = buffer.floatChannelData?[0] else { return }
            let frames = Int(buffer.frameLength)
            let rms = sqrt((0..<frames).reduce(Float(0)) { $0 + data[$1] * data[$1] } / Float(max(frames, 1)))
            DispatchQueue.main.async { self?.currentLevel = min(rms * 10, 1) }
        }
        try? engine.start()
        audioEngine = engine
    }
    
    func stopMonitoring() {
        audioEngine?.inputNode.removeTap(onBus: 0)
        audioEngine?.stop()
        audioEngine = nil
    }
}

🟡 Medium: Prioritized Device Auto-Selection Improvements

Issue: Prioritized device mode exists but UX is unclear.

Proposed Enhancement:

// In AudioInputSettingsView
struct PrioritizedDeviceEditor: View {
    @Binding var devices: [PrioritizedDevice]
    
    var body: some View {
        VStack(alignment: .leading, spacing: 12) {
            Text("Device Priority Order")
                .font(.headline)
            
            Text("VoiceLink will automatically use the highest priority available device.")
                .font(.caption)
                .foregroundColor(.secondary)
            
            List {
                ForEach(devices) { device in
                    HStack {
                        Image(systemName: "line.3.horizontal")
                            .foregroundColor(.secondary)
                        Text(device.name)
                        Spacer()
                        Text("Priority \(device.priority)")
                            .foregroundColor(.secondary)
                    }
                }
                .onMove { from, to in
                    devices.move(fromOffsets: from, toOffset: to)
                    updatePriorities()
                }
            }
            
            HStack {
                Button("Add Current Device") {
                    if let current = AudioDeviceManager.shared.selectedDeviceID {
                        // Add to priority list
                    }
                }
                
                Button("Test Priority Order") {
                    AudioDeviceManager.shared.selectHighestPriorityAvailableDevice()
                }
            }
        }
    }
}

4. Power Mode Enhancements

🟠 High: Power Mode Active Indicator

Issue: Hard to tell when Power Mode is active and which config is applied.

Proposed Solution:

// Add to MiniRecorderView
if let activeConfig = PowerModeManager.shared.currentActiveConfiguration,
   activeConfig.isEnabled {
    HStack(spacing: 4) {
        Text(activeConfig.emoji)
        Text(activeConfig.name)
            .font(.caption2)
    }
    .padding(.horizontal, 8)
    .padding(.vertical, 4)
    .background(Color.accentColor.opacity(0.2))
    .cornerRadius(8)
}

// Add to MenuBar
if let activeConfig = PowerModeManager.shared.currentActiveConfiguration {
    Section("Power Mode Active") {
        Text("\(activeConfig.emoji) \(activeConfig.name)")
            .font(.system(size: 12, weight: .semibold))
        
        Button("Disable Power Mode") {
            Task {
                await PowerModeSessionManager.shared.endSession()
            }
        }
    }
}

🟡 Medium: Power Mode Testing Tools

Issue: Hard to test Power Mode configs without switching apps.

Proposed Solution:

// Add to PowerModeView
struct PowerModeTestingPanel: View {
    @State private var testURL = ""
    @State private var testAppBundleID = ""
    
    var body: some View {
        GroupBox("Test Configuration") {
            VStack(spacing: 12) {
                TextField("App Bundle ID", text: $testAppBundleID)
                    .textFieldStyle(.roundedBorder)
                
                TextField("Browser URL", text: $testURL)
                    .textFieldStyle(.roundedBorder)
                
                Button("Simulate Activation") {
                    // Test which config would activate
                    let config = PowerModeManager.shared.findMatchingConfiguration(
                        appBundleID: testAppBundleID,
                        url: testURL
                    )
                    
                    if let config = config {
                        // Show preview of what would be applied
                    } else {
                        // Show "No matching configuration"
                    }
                }
            }
        }
    }
}

5. UI/UX Polish

🔴 Critical: First-Run Setup Improvements

Issue: Onboarding could be more streamlined.

Proposed Enhancements:

// Add quick-start preset
struct OnboardingPresetView: View {
    var body: some View {
        VStack(spacing: 24) {
            Text("Choose Your Setup")
                .font(.title)
            
            HStack(spacing: 20) {
                PresetCard(
                    title: "Simple",
                    subtitle: "Just transcription",
                    icon: "mic.fill"
                ) {
                    // Disable AI features, use base model
                    applySimplePreset()
                }
                
                PresetCard(
                    title: "Powered",
                    subtitle: "AI enhancement enabled",
                    icon: "sparkles"
                ) {
                    // Enable AI, setup Ollama
                    applyPoweredPreset()
                }
                
                PresetCard(
                    title: "Custom",
                    subtitle: "Configure manually",
                    icon: "slider.horizontal.3"
                ) {
                    // Show full onboarding
                    showFullOnboarding()
                }
            }
        }
    }
}

🟠 High: Keyboard Shortcut Cheat Sheet

Issue: Many shortcuts exist but aren't discoverable.

Proposed Solution:

// Add help overlay accessible via Cmd+?
struct ShortcutCheatSheet: View {
    var body: some View {
        VStack(alignment: .leading, spacing: 12) {
            Text("Keyboard Shortcuts")
                .font(.title2)
            
            Section("Recording") {
                ShortcutRow(
                    action: "Start/Stop Recording",
                    shortcut: hotkeyManager.selectedHotkey1.displayName
                )
                ShortcutRow(
                    action: "Cancel Recording",
                    shortcut: "ESC ESC"
                )
            }
            
            Section("Paste") {
                ShortcutRow(
                    action: "Paste Original",
                    shortcut: KeyboardShortcuts.getShortcut(for: .pasteLastTranscription)
                        .map(String.init(describing:)) ?? "Not set"
                )
                ShortcutRow(
                    action: "Paste Enhanced",
                    shortcut: KeyboardShortcuts.getShortcut(for: .pasteLastEnhancement)
                        .map(String.init(describing:)) ?? "Not set"
                )
            }
            
            Section("History") {
                ShortcutRow(action: "Search", shortcut: "⌘F")
                ShortcutRow(action: "Delete", shortcut: "⌫")
            }
        }
        .padding()
        .frame(width: 500, height: 600)
    }
}
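
The ShortcutRow helper used above doesn't exist in the codebase yet; one possible shape, with the action name on the left and the key combo in a keycap-style label on the right:

```swift
import SwiftUI

// Hypothetical helper row for ShortcutCheatSheet.
struct ShortcutRow: View {
    let action: String
    let shortcut: String
    
    var body: some View {
        HStack {
            Text(action)
            Spacer()
            Text(shortcut)
                .font(.system(.caption, design: .monospaced))
                .padding(.horizontal, 6)
                .padding(.vertical, 2)
                .background(Color.secondary.opacity(0.15))
                .cornerRadius(4)
        }
    }
}
```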

🟡 Medium: Theme/Appearance Customization

Issue: UI is fixed, no customization options.

Proposed Solution:

// Add to Settings
struct AppearanceSettingsView: View {
    @AppStorage("recorderOpacity") private var recorderOpacity = 0.9
    @AppStorage("recorderScale") private var recorderScale = 1.0
    @AppStorage("useCompactUI") private var useCompactUI = false
    
    var body: some View {
        SettingsSection(
            icon: "paintbrush",
            title: "Appearance",
            subtitle: "Customize the look of VoiceLink"
        ) {
            VStack(alignment: .leading, spacing: 12) {
                HStack {
                    Text("Recorder Opacity")
                    Spacer()
                    Slider(value: $recorderOpacity, in: 0.5...1.0)
                        .frame(width: 200)
                    Text("\(Int(recorderOpacity * 100))%")
                        .monospacedDigit()
                }
                
                HStack {
                    Text("Recorder Size")
                    Spacer()
                    Slider(value: $recorderScale, in: 0.8...1.5)
                        .frame(width: 200)
                    Text("\(Int(recorderScale * 100))%")
                        .monospacedDigit()
                }
                
                Toggle("Compact UI Mode", isOn: $useCompactUI)
                
                Divider()
                
                Button("Reset to Defaults") {
                    recorderOpacity = 0.9
                    recorderScale = 1.0
                    useCompactUI = false
                }
            }
        }
    }
}

6. Accessibility Improvements

🟠 High: Better Screen Reader Support

Issue: Some UI elements lack proper accessibility labels.

Proposed Fixes:

// Add to critical UI elements
Button(action: startRecording) {
    Image(systemName: "mic.fill")
}
.accessibilityLabel("Start recording")
.accessibilityHint("Tap to begin voice recording")

// Recorder status
Text(statusText)
    .accessibilityLabel("Recording status: \(statusText)")
    .accessibilityAddTraits(.updatesFrequently)

// Audio visualizer
AudioVisualizerView(audioMeter: meter)
    .accessibilityLabel("Audio level: \(Int(meter * 100)) percent")
    .accessibilityAddTraits(.updatesFrequently)

🟡 Medium: High Contrast Mode Support

Issue: UI may be hard to read in bright environments.

Proposed Solution:

@Environment(\.accessibilityReduceTransparency) var reduceTransparency

@ViewBuilder
var backgroundView: some View {
    if reduceTransparency {
        Color.black // Solid background
    } else {
        // Existing translucent background
        ZStack {
            Color.black.opacity(0.9)
            VisualEffectView(...)
        }
    }
}

7. Export & Integration Features

🟠 High: Export Format Options

Issue: Only CSV export is available.

Proposed Solution:

enum ExportFormat: String, CaseIterable {
    case csv = "CSV"
    case json = "JSON"
    case markdown = "Markdown"
    case txt = "Plain Text"
    case srt = "Subtitles (SRT)"
}

struct ExportOptionsView: View {
    @State private var format: ExportFormat = .csv
    @State private var includeAudio = false
    @State private var includeMetadata = true
    
    var body: some View {
        VStack {
            Picker("Format", selection: $format) {
                ForEach(ExportFormat.allCases, id: \.self) { format in
                    Text(format.rawValue).tag(format)
                }
            }
            
            Toggle("Include audio files", isOn: $includeAudio)
            Toggle("Include metadata", isOn: $includeMetadata)
            
            Button("Export") {
                exportTranscriptions(
                    format: format,
                    includeAudio: includeAudio,
                    includeMetadata: includeMetadata
                )
            }
        }
    }
}
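
Of the formats above, SRT is the only one with non-obvious serialization (numbered blocks, HH:MM:SS,mmm timestamps, blank-line separators). A sketch of that formatting, assuming each cue carries a start offset and duration in seconds (the real export would derive these from transcription metadata):

```swift
import Foundation

// Hypothetical cue: start offset and duration in seconds, plus the text.
struct SubtitleCue {
    let start: TimeInterval
    let duration: TimeInterval
    let text: String
}

// SRT timestamps use the form HH:MM:SS,mmm (comma before the milliseconds).
func srtTimestamp(_ t: TimeInterval) -> String {
    let total = Int(t)
    let millis = Int((t - Double(total)) * 1000)
    return String(format: "%02d:%02d:%02d,%03d", total / 3600, (total % 3600) / 60, total % 60, millis)
}

// Each block: index, "start --> end" line, text, then a blank-line separator.
func srtDocument(from cues: [SubtitleCue]) -> String {
    cues.enumerated().map { index, cue in
        "\(index + 1)\n\(srtTimestamp(cue.start)) --> \(srtTimestamp(cue.start + cue.duration))\n\(cue.text)\n"
    }.joined(separator: "\n")
}
```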

🟡 Medium: Webhook Integration

Issue: No way to send transcriptions to external services automatically.

Proposed Solution:

// Add webhook configuration
struct WebhookSettings: Codable {
    var url: String
    var enabled: Bool
    var includeAudio: Bool
    var headers: [String: String]
}

enum WebhookError: Error {
    case invalidURL
    case requestFailed
}

// Trigger after transcription completes
func sendToWebhook(_ transcription: Transcription) async throws {
    guard let settings = loadWebhookSettings(),
          settings.enabled else { return }
    
    let payload: [String: Any] = [
        "text": transcription.text,
        "timestamp": transcription.timestamp.ISO8601Format(),
        "model": transcription.transcriptionModelName ?? "unknown",
        "duration": transcription.duration
    ]
    
    guard let url = URL(string: settings.url), url.scheme?.lowercased() == "https" else {
        throw WebhookError.invalidURL
    }
    
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: payload)
    
    // Add custom headers
    for (key, value) in settings.headers {
        request.addValue(value, forHTTPHeaderField: key)
    }
    
    let (_, response) = try await URLSession.shared.data(for: request)
    
    guard let status = (response as? HTTPURLResponse)?.statusCode,
          (200..<300).contains(status) else {
        throw WebhookError.requestFailed
    }
}

Developer-Facing Improvements

1. Code Architecture & Organization

🔴 Critical: State Management Consolidation

Issue: State is scattered across multiple @Published properties and UserDefaults.

Current Problems:

  • 50+ UserDefaults keys spread across files
  • No centralized configuration management
  • Hard to track what settings exist
  • Difficult to implement import/export

Proposed Solution:

// Create centralized app state
@MainActor
class AppState: ObservableObject {
    // MARK: - Singleton
    static let shared = AppState()
    
    // MARK: - Recording Settings
    @AppStorage("RecorderType") var recorderType: RecorderType = .mini
    @AppStorage("AppendTrailingSpace") var appendTrailingSpace = true
    @AppStorage("UseAppleScriptPaste") var useAppleScriptPaste = false
    @AppStorage("preserveTranscriptInClipboard") var preserveClipboard = false
    
    // MARK: - Audio Settings
    @AppStorage("selectedAudioDeviceUID") var selectedAudioDeviceUID: String?
    @AppStorage("audioInputMode") var audioInputMode: AudioInputMode = .systemDefault
    @AppStorage("isSystemMuteEnabled") var isSystemMuteEnabled = false
    
    // MARK: - AI Settings
    @AppStorage("enableAIEnhancementFeatures") var enableAIFeatures = false
    @AppStorage("IsTextFormattingEnabled") var isTextFormattingEnabled = true
    @AppStorage("IsWordReplacementEnabled") var isWordReplacementEnabled = false
    
    // MARK: - UI Settings
    @AppStorage("hasCompletedOnboarding") var hasCompletedOnboarding = false
    @AppStorage("isMenuBarOnly") var isMenuBarOnly = false
    
    // MARK: - Cleanup Settings
    @AppStorage("IsTranscriptionCleanupEnabled") var isTranscriptionCleanupEnabled = false
    @AppStorage("TranscriptionCleanupDelay") var cleanupDelay: Double = 0
    
    // MARK: - Export/Import
    func exportSettings() -> AppSettings {
        AppSettings(
            recorderType: recorderType,
            appendTrailingSpace: appendTrailingSpace,
            // ... all other settings
        )
    }
    
    func importSettings(_ settings: AppSettings) {
        recorderType = settings.recorderType
        appendTrailingSpace = settings.appendTrailingSpace
        // ... all other settings
    }
}

struct AppSettings: Codable {
    let recorderType: RecorderType
    let appendTrailingSpace: Bool
    // ... all settings as codable properties
}
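
With every setting gathered into a Codable snapshot, export/import reduces to JSON round-tripping. A sketch using a trimmed-down snapshot (two fields only; the real struct would carry every `@AppStorage`-backed value):

```swift
import Foundation

// Trimmed-down stand-in for AppSettings, for illustration only.
struct SettingsSnapshot: Codable, Equatable {
    let recorderType: String
    let appendTrailingSpace: Bool
}

// Write a snapshot to disk as stable, human-readable JSON.
func saveSettings(_ settings: SettingsSnapshot, to url: URL) throws {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    try encoder.encode(settings).write(to: url, options: .atomic)
}

func loadSettings(from url: URL) throws -> SettingsSnapshot {
    try JSONDecoder().decode(SettingsSnapshot.self, from: Data(contentsOf: url))
}
```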

Benefits:

  • Single source of truth
  • Type-safe access to settings
  • Easy import/export
  • Better testability
  • Clearer dependencies

🟠 High: Service Layer Standardization

Issue: Services have inconsistent interfaces and error handling.

Current State:

  • Some services use protocols, some don't
  • Error types vary across services
  • Async/await not consistently used

Proposed Solution:

// Standard service protocol
protocol Service: AnyObject {
    associatedtype Configuration
    associatedtype Error: LocalizedError
    
    var isConfigured: Bool { get }
    func configure(_ config: Configuration) async throws
    func reset() async
}

// Standard error handling
protocol ServiceError: LocalizedError {
    var errorTitle: String { get }
    var errorDescription: String? { get }
    var recoverySuggestion: String? { get }
    var underlyingError: Error? { get }
}

// Example implementation
class TranscriptionServiceBase: Service {
    typealias Configuration = TranscriptionConfig
    typealias Error = TranscriptionError
    
    var isConfigured: Bool {
        // Check if service is ready (placeholder)
        false
    }
    
    func configure(_ config: TranscriptionConfig) async throws {
        // Setup service
    }
    
    func reset() async {
        // Cleanup resources
    }
}

// Standardized error
enum TranscriptionError: ServiceError {
    case modelNotLoaded
    case audioProcessingFailed(Error)
    case networkError(Error)
    case invalidConfiguration
    
    var errorTitle: String {
        switch self {
        case .modelNotLoaded: return "Model Not Loaded"
        case .audioProcessingFailed: return "Audio Processing Failed"
        case .networkError: return "Network Error"
        case .invalidConfiguration: return "Invalid Configuration"
        }
    }
    
    var errorDescription: String? {
        switch self {
        case .modelNotLoaded:
            return "The transcription model is not loaded."
        case .audioProcessingFailed(let error):
            return "Failed to process audio: \(error.localizedDescription)"
        case .networkError(let error):
            return "Network request failed: \(error.localizedDescription)"
        case .invalidConfiguration:
            return "Service configuration is invalid."
        }
    }
    
    var recoverySuggestion: String? {
        switch self {
        case .modelNotLoaded:
            return "Please download or select a transcription model in Settings."
        case .audioProcessingFailed:
            return "Try recording again or check your audio input settings."
        case .networkError:
            return "Check your internet connection and API credentials."
        case .invalidConfiguration:
            return "Review your service configuration in Settings."
        }
    }
    
    var underlyingError: Error? {
        switch self {
        case .audioProcessingFailed(let error), .networkError(let error):
            return error
        default:
            return nil
        }
    }
}

🟠 High: Dependency Injection Improvements

Issue: Many classes create their own dependencies, making testing difficult.

Current State:

class WhisperState {
    // Hard-coded dependencies
    private var localTranscriptionService: LocalTranscriptionService!
    private lazy var cloudTranscriptionService = CloudTranscriptionService()
    private lazy var nativeAppleTranscriptionService = NativeAppleTranscriptionService()
}

Proposed Solution:

// Create service container
@MainActor
class ServiceContainer {
    static let shared = ServiceContainer()
    
    // Services
    let transcriptionService: TranscriptionServiceProtocol
    let enhancementService: AIEnhancementService
    let audioDeviceManager: AudioDeviceManager
    let powerModeManager: PowerModeManager
    
    init(
        transcriptionService: TranscriptionServiceProtocol? = nil,
        enhancementService: AIEnhancementService? = nil,
        audioDeviceManager: AudioDeviceManager? = nil,
        powerModeManager: PowerModeManager? = nil
    ) {
        self.transcriptionService = transcriptionService ?? LocalTranscriptionService()
        self.enhancementService = enhancementService ?? AIEnhancementService()
        self.audioDeviceManager = audioDeviceManager ?? AudioDeviceManager.shared
        self.powerModeManager = powerModeManager ?? PowerModeManager.shared
    }
}

// Updated WhisperState
class WhisperState {
    private let services: ServiceContainer
    
    init(
        modelContext: ModelContext,
        services: ServiceContainer = .shared
    ) {
        self.modelContext = modelContext
        self.services = services
    }
    
    func transcribeAudio(on transcription: Transcription) async throws {
        let service = services.transcriptionService
        let text = try await service.transcribe(...)
    }
}

Benefits:

  • Testable with mock services
  • Clear dependencies
  • Easier to swap implementations
  • Better code organization

2. Testing Infrastructure

🔴 Critical: Unit Testing Setup

Issue: No automated tests exist.

Proposed Solution:

// Create test target structure
VoiceInkTests/
├── Models/
│   ├── TranscriptionModelTests.swift
│   └── PowerModeConfigTests.swift
├── Services/
│   ├── TranscriptionServiceTests.swift
│   ├── AIEnhancementServiceTests.swift
│   └── AudioDeviceManagerTests.swift
├── Utilities/
│   ├── TextFormatterTests.swift
│   └── WordReplacementTests.swift
└── Mocks/
    ├── MockTranscriptionService.swift
    ├── MockAIService.swift
    └── MockAudioDevice.swift

// Example test
import XCTest
@testable import VoiceInk

class TranscriptionServiceTests: XCTestCase {
    var service: LocalTranscriptionService!
    var mockModelContext: ModelContext!
    
    override func setUp() {
        super.setUp()
        service = LocalTranscriptionService(
            modelsDirectory: testModelsDirectory,
            whisperState: mockWhisperState
        )
    }
    
    func testTranscribeShortAudio() async throws {
        let testAudioURL = Bundle(for: type(of: self))
            .url(forResource: "test_audio", withExtension: "wav")!
        
        let model = PredefinedModels.whisperBase
        let result = try await service.transcribe(
            audioURL: testAudioURL,
            model: model
        )
        
        XCTAssertFalse(result.isEmpty)
        XCTAssertTrue(result.contains("test"))
    }
    
    func testTranscribeWithInvalidAudio() async {
        let invalidURL = URL(fileURLWithPath: "/nonexistent.wav")
        
        do {
            _ = try await service.transcribe(
                audioURL: invalidURL,
                model: PredefinedModels.whisperBase
            )
            XCTFail("Should throw error")
        } catch {
            XCTAssertTrue(error is TranscriptionError)
        }
    }
}

🟠 High: UI Testing for Critical Flows

Proposed Tests:

class OnboardingUITests: XCTestCase {
    func testCompleteOnboarding() {
        let app = XCUIApplication()
        app.launch()
        
        // Should show onboarding for first run
        XCTAssertTrue(app.staticTexts["Welcome to VoiceLink"].exists)
        
        // Step through onboarding
        app.buttons["Continue"].tap()
        app.buttons["Grant Permissions"].tap()
        app.buttons["Select Model"].tap()
        app.buttons["Finish"].tap()
        
        // Should show main app
        XCTAssertTrue(app.staticTexts["Dashboard"].exists)
    }
}

class RecordingUITests: XCTestCase {
    func testStartStopRecording() {
        let app = XCUIApplication()
        app.launch()
        
        // Trigger recording via hotkey
        app.typeKey("r", modifierFlags: .command)
        
        // Recorder should appear
        XCTAssertTrue(app.windows["MiniRecorder"].exists)
        
        // Stop recording
        app.typeKey("r", modifierFlags: .command)
        
        // Should show transcription
        let historyTab = app.buttons["History"]
        historyTab.tap()
        
        XCTAssertTrue(app.tables["TranscriptionHistory"].cells.count > 0)
    }
}
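Flows like these are only deterministic if each run starts from a known state. A common pattern (the `-ResetStateForUITests` flag is an assumption here, not something the app currently implements) is to pass a launch argument from the test and honor it early in app startup:

```swift
// In the UI test target:
let app = XCUIApplication()
app.launchArguments = ["-ResetStateForUITests"]
app.launch()

// In the app, early in launch:
if ProcessInfo.processInfo.arguments.contains("-ResetStateForUITests") {
    // Wipe persisted settings so onboarding/recording tests start clean.
    if let bundleID = Bundle.main.bundleIdentifier {
        UserDefaults.standard.removePersistentDomain(forName: bundleID)
    }
}
```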

3. Documentation Improvements

🟠 High: API Documentation

Issue: Many public APIs lack documentation.

Proposed Solution:

/// Manages transcription of audio files using various AI models.
///
/// `WhisperState` coordinates the entire transcription workflow including:
/// - Audio recording and playback
/// - Model loading and management
/// - Transcription execution
/// - AI enhancement integration
/// - Power Mode session management
///
/// ## Usage
/// ```swift
/// let whisperState = WhisperState(
///     modelContext: modelContext,
///     enhancementService: enhancementService
/// )
///
/// // Start recording
/// await whisperState.toggleRecord()
///
/// // Transcription happens automatically when recording stops
/// ```
///
/// ## Thread Safety
/// This class is marked `@MainActor` and all methods must be called on the main thread.
///
/// ## See Also
/// - ``TranscriptionService``
/// - ``AIEnhancementService``
/// - ``PowerModeSessionManager``
@MainActor
class WhisperState: NSObject, ObservableObject {
    
    /// The current state of the recording/transcription process.
    ///
    /// Possible states:
    /// - `.idle`: Ready to start recording
    /// - `.recording`: Currently capturing audio
    /// - `.transcribing`: Converting audio to text
    /// - `.enhancing`: Applying AI enhancement
    /// - `.busy`: Processing, user action blocked
    @Published var recordingState: RecordingState = .idle
    
    /// Starts or stops recording based on current state.
    ///
    /// When called while idle, begins audio recording. When called during recording,
    /// stops capture and automatically begins transcription.
    ///
    /// - Throws: `RecordingError` if audio capture fails to start
    /// - Important: Requires microphone permission granted
    ///
    /// ## Example
    /// ```swift
    /// // Start recording
    /// await whisperState.toggleRecord()
    ///
    /// // ... user speaks ...
    ///
    /// // Stop and transcribe
    /// await whisperState.toggleRecord()
    /// ```
    func toggleRecord() async {
        // Implementation
    }
}

🟡 Medium: Architecture Decision Records (ADRs)

Proposed Structure:

# docs/architecture/
├── ADR-001-state-management.md
├── ADR-002-transcription-pipeline.md
├── ADR-003-power-mode-sessions.md
├── ADR-004-audio-device-handling.md
└── ADR-005-error-handling-strategy.md

# Example ADR
# ADR-003: Power Mode Session Management

## Status
Accepted

## Context
Power Mode needs to temporarily override app settings when a specific app/URL
is detected, then restore original settings when the context changes.

## Decision
Use session-based state management with UserDefaults persistence for crash recovery.

## Consequences
Positive:
- Settings survive app crashes
- Clear session lifecycle
- Easy to test and debug

Negative:
- Extra UserDefaults reads/writes
- Need to handle abandoned sessions

## Alternatives Considered
1. In-memory only (loses state on crash)
2. SwiftData models (overkill for ephemeral state)

4. Debugging & Logging

🔴 Critical: Structured Logging System

Issue: Logging is inconsistent, mixing print() calls, ad-hoc Logger instances, and #if DEBUG-guarded statements.

Proposed Solution:

// Create unified logging system
import OSLog

extension Logger {
    /// Logger for transcription operations
    static let transcription = Logger(
        subsystem: Bundle.main.bundleIdentifier!,
        category: "Transcription"
    )
    
    /// Logger for audio operations
    static let audio = Logger(
        subsystem: Bundle.main.bundleIdentifier!,
        category: "Audio"
    )
    
    /// Logger for Power Mode
    static let powerMode = Logger(
        subsystem: Bundle.main.bundleIdentifier!,
        category: "PowerMode"
    )
    
    /// Logger for AI enhancement
    static let ai = Logger(
        subsystem: Bundle.main.bundleIdentifier!,
        category: "AI"
    )
}

// Usage
Logger.transcription.info("Starting transcription for audio: \(url.lastPathComponent)")
Logger.transcription.error("Transcription failed: \(error.localizedDescription)")

// Replace all print statements
// ❌ Remove
print("🔄 Starting transcription...")

// ✅ Replace with
Logger.transcription.info("Starting transcription")

Benefits:

  • Structured log filtering
  • Performance insights
  • Better debugging
  • Production-ready logging

🟠 High: Debug Menu for Development

Proposed Addition:

#if DEBUG
struct DebugMenu: View {
    @EnvironmentObject var whisperState: WhisperState
    @EnvironmentObject var enhancementService: AIEnhancementService
    
    var body: some View {
        Menu("🐛 Debug") {
            Section("State Inspection") {
                Button("Print WhisperState") {
                    printState(whisperState)
                }
                
                Button("Print Service Status") {
                    printServices()
                }
                
                Button("Export Logs") {
                    exportLogs()
                }
            }
            
            Section("Test Actions") {
                Button("Simulate Recording") {
                    Task { await simulateRecording() }
                }
                
                Button("Trigger Test Transcription") {
                    Task { await testTranscription() }
                }
                
                Button("Force Power Mode Session") {
                    Task { await forcePowerMode() }
                }
            }
            
            Section("Reset") {
                Button("Clear All Transcriptions") {
                    deleteAllTranscriptions()
                }
                
                Button("Reset User Defaults") {
                    resetUserDefaults()
                }
                
                Button("Clear Model Cache") {
                    clearModelCache()
                }
            }
        }
    }
}
#endif

5. Performance Optimizations

🟠 High: Model Loading Performance

Issue: Model loading blocks UI during startup.

Current State:

// Loads model synchronously
func loadModel(_ model: WhisperModel) async throws {
    let context = try WhisperContext(url: model.url)
    whisperContext = context
}

Proposed Optimization:

// Add background preloading
class ModelPreloader {
    private var preloadedModels: [String: WhisperContext] = [:]
    
    func preloadDefaultModel() {  // fire-and-forget; the load runs on a detached task
        guard let defaultModel = UserDefaults.standard.defaultModelName else { return }
        
        Task.detached(priority: .utility) {
            do {
                let context = try await self.loadModelInBackground(defaultModel)
                await MainActor.run {
                    self.preloadedModels[defaultModel] = context
                }
            } catch {
                Logger.transcription.error("Failed to preload model: \(error)")
            }
        }
    }
    
    func getModel(_ name: String) async throws -> WhisperContext {
        if let cached = preloadedModels[name] {
            return cached
        }
        
        return try await loadModelInBackground(name)
    }
}

🟡 Medium: Transcription History Virtualization

Current State:

  • Pagination implemented but could be more efficient
  • All visible transcriptions kept in memory

Proposed Enhancement:

// Use LazyVStack with fixed row heights
LazyVStack(spacing: 10, pinnedViews: [.sectionHeaders]) {
    ForEach(displayedTranscriptions) { transcription in
        TranscriptionCard(transcription: transcription)
            .id(transcription.id)
            .frame(height: cardHeight(for: transcription))
            .onAppear {
                if transcription.id == displayedTranscriptions.last?.id {
                    Task { await loadMoreContent() }
                }
            }
    }
}

// Cache collapsed card heights; expanded cards are sized directly so the
// cache never serves a stale height after the expansion state changes
private var cardHeights: [UUID: CGFloat] = [:]

private func cardHeight(for transcription: Transcription) -> CGFloat {
    if expandedTranscription == transcription {
        return 300
    }
    
    if let cached = cardHeights[transcription.id] {
        return cached
    }
    
    let baseHeight: CGFloat = 100
    cardHeights[transcription.id] = baseHeight
    return baseHeight
}

6. Build & Development Workflow

🟠 High: Continuous Integration Setup

Proposed GitHub Actions:

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [ custom-main-v2 ]
  pull_request:
    branches: [ custom-main-v2 ]

jobs:
  build:
    runs-on: macos-14
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Xcode
      uses: maxim-lobanov/setup-xcode@v1
      with:
        xcode-version: '15.0'
    
    - name: Cache SPM
      uses: actions/cache@v4
      with:
        path: .build
        key: ${{ runner.os }}-spm-${{ hashFiles('**/Package.resolved') }}
    
    - name: Build
      run: xcodebuild -project VoiceInk.xcodeproj -scheme VoiceInk -configuration Debug build
    
    - name: Run Tests
      run: xcodebuild test -project VoiceInk.xcodeproj -scheme VoiceInk -destination 'platform=macOS'
    
    - name: SwiftLint
      run: |
        brew install swiftlint
        swiftlint lint --reporter github-actions-logging

  code-quality:
    runs-on: macos-14
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Check for TODOs
      run: |
        if grep -r "TODO\|FIXME\|XXX" VoiceInk/ --exclude-dir={Build,DerivedData} | grep -v "QUALITY_OF_LIFE"; then
          echo "⚠️ Found untracked TODOs/FIXMEs"
          exit 1
        fi

🟡 Medium: Pre-commit Hooks

Proposed Setup:

#!/bin/bash
# .git/hooks/pre-commit

echo "Running pre-commit checks..."

# Format only the staged Swift files, then re-stage them
if command -v swiftformat &> /dev/null; then
    STAGED_SWIFT=$(git diff --cached --name-only --diff-filter=ACM | grep '\.swift$' || true)
    if [ -n "$STAGED_SWIFT" ]; then
        echo "$STAGED_SWIFT" | xargs swiftformat --quiet
        echo "$STAGED_SWIFT" | xargs git add
    fi
fi

# Lint
if command -v swiftlint &> /dev/null; then
    swiftlint lint --quiet --config .swiftlint.yml
    if [ $? -ne 0 ]; then
        echo "❌ SwiftLint found issues"
        exit 1
    fi
fi

# Check staged Swift files for debug prints
STAGED=$(git diff --cached --name-only --diff-filter=ACM | grep '\.swift$' || true)
if [ -n "$STAGED" ] && echo "$STAGED" | xargs grep -n "print(" | grep -v "// OK:"; then
    echo "❌ Found print() statements. Use Logger instead."
    exit 1
fi

# Warn about likely force unwraps in production code
# (crude heuristic: identifier followed by "!" not part of "!=")
if [ -n "$STAGED" ] && echo "$STAGED" | grep -v "Test" \
    | xargs grep -nE '\w!([^=]|$)' | grep -v "// OK:"; then
    echo "⚠️ Possible force unwraps. Consider safe unwrapping."
fi

echo "✅ Pre-commit checks passed"
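Files in `.git/hooks/` are not tracked by Git, so each contributor would have to copy this script by hand. A sketch of a shared setup using `core.hooksPath` (Git 2.9+); the `hook-demo` directory and the stand-in script are placeholders for this repository and the hook above:

```shell
# Keep hooks in a tracked directory and point Git at it.
git init -q hook-demo
cd hook-demo
mkdir -p .githooks
# Stand-in for the pre-commit script above; commit the real one here.
printf '#!/bin/sh\necho "pre-commit OK"\n' > .githooks/pre-commit
chmod +x .githooks/pre-commit
git config core.hooksPath .githooks
./.githooks/pre-commit   # prints: pre-commit OK
```

After the `git config` step, Git runs `.githooks/pre-commit` on every commit, and the hook travels with the repository.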

Implementation Priorities

Phase 1: Critical User Experience (2-3 weeks)

  1. Recording state visual feedback
  2. Keyboard shortcut for cancel
  3. Recording length indicator
  4. Audio device switching fixes
  5. Bulk actions performance

Phase 2: Developer Infrastructure (2-3 weeks)

  1. State management consolidation
  2. Structured logging system
  3. Unit testing setup
  4. Service layer standardization
  5. Dependency injection

Phase 3: Feature Enhancements (3-4 weeks)

  1. Smart search & filters
  2. Power Mode active indicator
  3. First-run setup improvements
  4. Export format options
  5. Keyboard shortcut cheat sheet

Phase 4: Polish & Optimization (2-3 weeks)

  1. Theme/appearance customization
  2. Accessibility improvements
  3. Performance optimizations
  4. API documentation
  5. CI/CD setup

Metrics for Success

User Metrics

  • Setup Time: Reduce first-run to transcription from 5min → 2min
  • Discoverability: 80%+ users find keyboard shortcuts within first week
  • Error Recovery: 90%+ users successfully recover from recording failures
  • Performance: History view remains responsive with 1000+ transcriptions

Developer Metrics

  • Test Coverage: Achieve 60%+ code coverage
  • Build Time: Keep clean build under 2 minutes
  • Code Quality: Keep SwiftLint violations at zero in CI
  • Documentation: 100% public API documented

Long-Term Vision

Advanced Features (Future)

  1. Multi-language Live Translation

    • Transcribe in one language, output in another
    • Real-time translation during recording
  2. Voice Commands

    • "Start recording", "Stop recording"
    • "Enhance last transcription"
    • "Open settings"
  3. Collaborative Features

    • Share transcriptions with team
    • Collaborative editing
    • Comments and annotations
  4. Advanced Analytics

    • Speaking patterns analysis
    • Word frequency insights
    • Time-of-day productivity tracking
  5. Plugin System

    • Custom transcription filters
    • Third-party AI providers
    • Custom export formats

Contributing

To implement these improvements:

  1. Choose an item from the list above
  2. Create a branch: feature/improvement-name
  3. Implement the change following AGENTS.md guidelines
  4. Add tests if applicable
  5. Update documentation as needed
  6. Submit PR with before/after screenshots for UI changes
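Steps 1 and 2 in shell form (the repository, identity, and branch name below are illustrative):

```shell
git init -q qol-demo
cd qol-demo
git config user.email "dev@example.com" && git config user.name "Dev"
git commit --allow-empty -qm "baseline"
# Branch naming convention from step 2: feature/improvement-name
git checkout -q -b feature/recording-progress-bar
git branch --show-current   # prints: feature/recording-progress-bar
```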

Appendix: Code Snippets Library

A. Safe Optional Unwrapping Pattern

// ❌ Avoid
let text = transcription.enhancedText!

// ✅ Use
guard let text = transcription.enhancedText else {
    Logger.ai.warning("No enhanced text available")
    return transcription.text
}

B. Async Task with Cancellation

private var task: Task<Void, Never>?

func startBackgroundWork() {
    task?.cancel()
    task = Task {
        do {
            try await Task.sleep(for: .seconds(1))   // throws CancellationError if cancelled
            try Task.checkCancellation()             // covers cancellation after the sleep
            await performWork()
        } catch is CancellationError {
            // Cancelled: nothing to clean up
        } catch {
            // Handle other failures
        }
    }
}

func stopBackgroundWork() {
    task?.cancel()
    task = nil
}

C. UserDefaults Extension

extension UserDefaults {
    enum Keys {
        static let recorderType = "RecorderType"
        static let appendTrailingSpace = "AppendTrailingSpace"
        static let selectedAudioDeviceUID = "selectedAudioDeviceUID"
    }
    
    var recorderType: RecorderType {
        get {
            guard let raw = string(forKey: Keys.recorderType),
                  let type = RecorderType(rawValue: raw) else {
                return .mini
            }
            return type
        }
        set {
            set(newValue.rawValue, forKey: Keys.recorderType)
        }
    }
}

D. View Modifier for Consistent Styling

struct CardStyle: ViewModifier {
    let isSelected: Bool
    
    func body(content: Content) -> some View {
        content
            .padding()
            .background(CardBackground(isSelected: isSelected))
            .cornerRadius(10)
            .shadow(radius: isSelected ? 4 : 2)
    }
}

extension View {
    func cardStyle(isSelected: Bool = false) -> some View {
        modifier(CardStyle(isSelected: isSelected))
    }
}

// Usage
VStack {
    Text("Content")
}
.cardStyle(isSelected: true)

Last Updated: November 3, 2025
Version: 1.0
Maintained By: VoiceLink Community