# Google Workspace + Meeting Intelligence: Tool Research

## Executive Summary
Research conducted: February 5, 2026

Purpose: Identify best tools for AI agent with full Google Workspace integration + meeting transcript intelligence

**Quick Recommendations:**
1. **Google Workspace CLI**: `gogcli` (steipete/gogcli) - most comprehensive
2. **MCP**: `google_workspace_mcp` (taylorwilsdon) - production-ready MCP server
3. **Transcript Processing**: `whisper.cpp` (ggml-org) - fastest local transcription

---

## 1. GOOGLE WORKSPACE CLIs

### ⭐ gogcli (steipete/gogcli)
**Repo**: https://github.com/steipete/gogcli
**Stars**: ~3.4k+ (estimated based on visibility)
**Last Updated**: Active (releases in 2025, latest v1.8+)

**What it does well:**
- **Most comprehensive** - Gmail, Calendar, Drive, Docs, Sheets, Slides, Contacts, Tasks, Chat, Keep, Groups, Classroom
- **JSON-first output** - Perfect for AI agent parsing
- **Multiple accounts** - Named profiles like AWS CLI
- **Least-privilege auth** - Granular scope control (--readonly, --drive-scope)
- **Service accounts** - Workspace domain-wide delegation support
- **Email tracking** - Built-in open tracking with Cloudflare Worker backend
- **Watch/Pub-Sub** - Gmail watch with webhook support
- **Advanced calendar features** - Focus time, OOO, working location, team calendars, conflict detection
- **Fast** - Written in Go, single binary

**Limitations:**
- Requires Google Cloud OAuth setup
- Some features Workspace-only (Chat, Keep, Groups)
- Email tracking needs separate Cloudflare Worker deployment

**Maintenance**: ⭐⭐⭐⭐⭐ Actively maintained (2025 releases)

**Best for this use case?**
✅ **YES - Primary choice**. Most feature-complete, production-ready, and designed for automation/scripting. JSON output mode is perfect for AI agents. Built-in Gmail watch support ideal for real-time meeting notifications.

**Installation**:
```bash
brew install steipete/tap/gogcli
# OR
go install github.com/steipete/gogcli@latest
```

**Key Commands for Meeting Workflow:**
```bash
# Search for meeting invites
gog calendar search "meeting" --days 7 --json

# Get today's calendar
gog calendar events --today --json

# Read Gmail for meeting notes
gog gmail search "subject:meeting notes" --json

# Create Drive folder for meeting docs
gog drive mkdir "Q1 Meetings" --json

# Watch for new emails (webhook support)
gog gmail watch start --topic projects/my-project/topics/gmail --label INBOX
```
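
Because the commands above emit JSON, an agent can treat each one as a function call that returns parsed data. The sketch below is a minimal Python wrapper that assumes only that the command prints a single JSON document on stdout. The helper name `run_json` is hypothetical and not part of gogcli, and the `gog` subcommands and flags should be checked against `gog --help` for the installed version. A stand-in command is used so the sketch runs without gogcli installed.

```python
import json
import subprocess
import sys


def run_json(argv, timeout=60):
    """Run a command and parse its stdout as one JSON document.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    json.JSONDecodeError if stdout is not valid JSON.
    """
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return json.loads(result.stdout)


# Stand-in command so the sketch runs without gogcli installed.
# With gogcli, argv would be e.g.
#   ["gog", "calendar", "events", "--today", "--json"]
# (taken from the command list above; not verified here).
demo = run_json([sys.executable, "-c", "print('[{\"summary\": \"Standup\"}]')"])
print(demo[0]["summary"])
```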

---

### google-workspace-cli (ianpatrickhines)
**Repo**: https://github.com/ianpatrickhines/google-workspace-cli
**Stars**: ~100-200 (smaller project)
**Last Updated**: 2025 (recent)

**What it does well:**
- **TypeScript/Node.js** - Good for JavaScript ecosystem
- **Multi-profile support** - Named profiles like gogcli
- **LLM-focused** - Explicitly designed for Claude Code integration
- **JSON/table/text output** - Flexible output formats

**Limitations:**
- **Less comprehensive** - Only Gmail, Calendar, Drive (no Docs, Sheets, Tasks, etc.)
- **No email tracking**
- **No Pub/Sub watch**
- **Requires npm ecosystem**

**Maintenance**: ⭐⭐⭐ Active but smaller scope

**Best for this use case?**
⚠️ **Partial** - Good if you're in the Node.js ecosystem, but less comprehensive than gogcli.

---

### gcalcli (insanum/gcalcli)
**Repo**: https://github.com/insanum/gcalcli
**Stars**: ~3.5k
**Last Updated**: Active (2024-2025)

**What it does well:**
- **Calendar-only specialist** - Very mature calendar CLI
- **ASCII calendar views** - Great terminal UI (calw, calm commands)
- **Agenda mode** - Clean agenda display
- **Reminder execution** - Can trigger commands on events
- **ICS import** - Import calendar invites
- **Conky/tmux integration** - Desktop/terminal integration examples

**Limitations:**
- **Calendar ONLY** - No Gmail, Drive, Docs, etc.
- **Python-based** - Additional dependency
- **OAuth setup required**

**Maintenance**: ⭐⭐⭐⭐⭐ Very mature, active

**Best for this use case?**
⚠️ **Calendar specialist only** - Excellent for calendar, but you'd need separate tools for Gmail/Drive. Use gogcli instead for a unified approach.

---

## 2. MODEL CONTEXT PROTOCOL (MCP) SERVERS

### ⭐ google_workspace_mcp (taylorwilsdon)
**Repo**: https://github.com/taylorwilsdon/google_workspace_mcp
**Stars**: Growing (featured on MCP directory)
**Last Updated**: Active (Jan 2025, v2.x)

**What it does well:**
- **Most comprehensive MCP** - Gmail, Calendar, Drive, Docs, Sheets, Slides, Forms, Tasks, Chat, Contacts, Apps Script
- **OAuth 2.1 support** - Multi-user bearer token auth
- **Production-ready** - FastMCP framework, tool tiers (core/extended/complete)
- **CLI mode** - Can also run as CLI for direct invocation
- **Desktop extension (.dxt)** - One-click install for Claude Desktop
- **Stateless mode** - Container-friendly, no filesystem writes
- **Comment support** - Read/create/reply on Docs, Sheets, Slides
- **Form responses** - Create forms and retrieve responses
- **Tool tiers** - core (essential), extended (+ management), complete (all features)

**Limitations:**
- Python-based (requires Python 3.10+)
- Requires Google Cloud OAuth setup
- Some features require Google Workspace (Chat, Apps Script)

**Maintenance**: ⭐⭐⭐⭐⭐ Very active, production MCP server

**Best for this use case?**
✅ **YES - if using MCP clients** (Claude Desktop, VS Code MCP, Claude Code MCP support). Most mature Google Workspace MCP available. Includes CLI mode for non-MCP workflows.

**Installation:**
```bash
# Via uvx (instant)
uvx workspace-mcp --tool-tier core

# Or development
git clone https://github.com/taylorwilsdon/google_workspace_mcp.git
cd google_workspace_mcp
uv run main.py --transport streamable-http
```

**MCP vs CLI Decision:**
- **Use MCP** if your agent is Claude Desktop, VS Code with MCP extension, or Claude Code
- **Use CLI (gogcli)** if your agent can execute shell commands and parse JSON (Codex, custom agents)
- **Use both** - MCP for interactive sessions, CLI for automation scripts

---

## 3. TRANSCRIPT PROCESSING TOOLS

### ⭐ whisper.cpp (ggml-org)
**Repo**: https://github.com/ggml-org/whisper.cpp
**Stars**: ~38k+
**Last Updated**: Very active (v1.8.1, 2026)

**What it does well:**
- **Fastest local transcription** - C/C++ implementation, ~10-30x faster than Python Whisper
- **Multiple backends** - CPU, Metal (Apple Silicon), CUDA (NVIDIA), Vulkan, OpenVINO
- **Quantized models** - Reduced memory (Q5_0, Q8_0 variants)
- **Low memory** - Runs on edge devices (Raspberry Pi, phones)
- **CLI + library** - Both command-line and C API available
- **Voice Activity Detection (VAD)** - Silero-VAD integration for speech detection
- **Speaker diarization** - tinydiarize experimental support
- **Streaming support** - Real-time transcription from microphone
- **Multiple platforms** - Linux, macOS, Windows, iOS, Android, WebAssembly

**Limitations:**
- Requires ffmpeg for audio format support
- C/C++ ecosystem (less Python-friendly than openai/whisper)
- Speaker diarization still experimental

**Maintenance**: ⭐⭐⭐⭐⭐ Extremely active, large community

**Best for this use case?**
✅ **YES - Primary choice** for local/edge transcription. Fastest option, production-ready for transcription, with VAD built in and speaker diarization available experimentally.

**Model Sizes:**
| Model | Disk size | Memory | Relative speed | Best For |
|-------|-----------|--------|----------------|----------|
| tiny | 75 MB | ~273 MB | ~10x | Fast, low-quality OK |
| base | 142 MB | ~388 MB | ~7x | Balanced |
| small | 466 MB | ~852 MB | ~4x | Good quality |
| medium | 1.5 GB | ~2.1 GB | ~2x | High quality |
| large | 2.9 GB | ~3.9 GB | 1x | Best quality |
| turbo (`large-v3-turbo`) | ~1.5 GB | ~6 GB (PyTorch figure) | ~8x | Fast + accurate (recommended) |

Disk and memory figures for `tiny` through `large` are the whisper.cpp ones. Relative speeds and the `turbo` memory figure come from the openai/whisper README (PyTorch VRAM), so treat the `turbo` row as approximate for whisper.cpp.

**Recommended for meetings**: `turbo` model (named `large-v3-turbo` in whisper.cpp; an optimized large-v3, fast + accurate)
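
To make the table actionable, the sketch below picks the highest-quality model whose listed memory figure fits a given budget. The numbers are copied from the `tiny` through `large` rows above; the `turbo` row is left out to keep the sketch to directly comparable figures. The function name `pick_model` and the quality ordering are assumptions for illustration.

```python
# (model name, approximate memory in MB), ordered from lowest to highest quality.
# Figures copied from the table above; treat them as rough guides only.
MODELS = [
    ("tiny", 273),
    ("base", 388),
    ("small", 852),
    ("medium", 2100),
    ("large", 3900),
]


def pick_model(budget_mb):
    """Return the highest-quality model that fits in budget_mb, or None."""
    best = None
    for name, mem_mb in MODELS:
        if mem_mb <= budget_mb:
            best = name
    return best


print(pick_model(1024))  # small
print(pick_model(200))   # None
```

A real deployment would also leave headroom for audio buffers and the rest of the agent process, so the budget passed in should be lower than the machine's free memory.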

**Usage:**
```bash
# Install
git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build -j --config Release

# Download model (whisper.cpp names the turbo model large-v3-turbo)
sh ./models/download-ggml-model.sh large-v3-turbo

# Transcribe meeting recording
./build/bin/whisper-cli -m models/ggml-large-v3-turbo.bin \
  -f meeting.mp3 \
  --output-json \
  --language en

# With speaker diarization (experimental tinydiarize model)
sh ./models/download-ggml-model.sh small.en-tdrz
./build/bin/whisper-cli -m models/ggml-small.en-tdrz.bin \
  -f meeting.mp3 \
  -tdrz \
  --output-json

# With VAD (Voice Activity Detection)
./build/bin/whisper-cli -m models/ggml-large-v3-turbo.bin \
  -f meeting.mp3 \
  --vad \
  --vad-model models/ggml-silero-v6.2.0.bin \
  --output-json
```
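
Downstream steps need plain text, not the raw JSON file. The sketch below flattens a whisper.cpp JSON result into a transcript string and into timestamped lines. It assumes the file written by `--output-json` has a top-level `transcription` list whose items carry `text` and `timestamps.from` / `timestamps.to`; check a real output file before relying on that shape. The function names and the sample data are invented for illustration.

```python
import json


def segments_to_text(doc):
    """Join the text of every segment into one transcript string."""
    parts = [seg["text"].strip() for seg in doc.get("transcription", [])]
    return " ".join(p for p in parts if p)


def segments_to_lines(doc):
    """Render one '[start --> end] text' line per segment."""
    lines = []
    for seg in doc.get("transcription", []):
        ts = seg.get("timestamps", {})
        start, end = ts.get("from", "?"), ts.get("to", "?")
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return lines


# Hand-written sample in the assumed shape (not real whisper.cpp output).
sample = json.loads("""
{"transcription": [
  {"timestamps": {"from": "00:00:00,000", "to": "00:00:04,000"},
   "text": " Welcome everyone."},
  {"timestamps": {"from": "00:00:04,000", "to": "00:00:09,500"},
   "text": " Let's review the roadmap."}
]}
""")
print(segments_to_text(sample))
```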

---

### openai/whisper (Python)
**Repo**: https://github.com/openai/whisper
**Stars**: ~79k+
**Last Updated**: Active

**What it does well:**
- **Official OpenAI model** - Reference implementation
- **Python ecosystem** - Easy integration with Python tools
- **Simple API** - Easy to use
- **Multiple languages** - 99 languages supported

**Limitations:**
- **Slow** - 10-30x slower than whisper.cpp
- **Higher memory** - More VRAM required
- **Python dependency overhead**

**Maintenance**: ⭐⭐⭐⭐ Official OpenAI, maintained

**Best for this use case?**
⚠️ **Use whisper.cpp instead** - Same models, much faster. Only use if you need Python API specifically.

---

### Assembly AI CLI
**Status**: Not researched as a CLI (web search was rate limited). See section 9 for the cloud API details.
**Note**: Assembly AI is a cloud API service, not a CLI. Requires API key and internet connection.

**What it does well:**
- Speaker diarization (production-ready)
- Action item extraction
- Topic detection
- PII redaction
- Custom vocabulary

**Limitations:**
- **Cloud service** - Requires internet, API costs
- **Privacy** - Audio uploaded to third party
- **API rate limits**

**Best for this use case?**
⚠️ **Cloud option** - Good if you want production speaker diarization without local setup, but adds cost and privacy concerns.

---

### Deepgram CLI
**Status**: Not researched as a CLI (web search was rate limited). See section 9 for the cloud API details.
**Note**: Deepgram is also a cloud API service.

**Similar to Assembly AI:**
- Cloud-based
- Good speaker diarization
- Fast transcription
- API costs

**Best for this use case?**
⚠️ **Cloud option** - Alternative to Assembly AI, similar tradeoffs.

---

## 4. GOOGLE MEET TRANSCRIPT ACCESS

### Google Meet REST API
**Docs**: https://developers.google.com/workspace/meet/api/guides/overview

**What it provides:**
- Access to conference metadata
- Recording URLs
- **Transcript entries** - `conferenceRecords.transcripts.entries`

**Limitations:**
- Requires Google Workspace (not free Gmail)
- Transcription must be enabled in meeting
- Admin policy controls access

**Integration approach:**
If gogcli or google_workspace_mcp exposes the Meet API (not confirmed during this research), use it to:
1. List recent conferences
2. Get transcript entries
3. Download transcript
4. Process with action item extraction

**Example workflow:**
```bash
# Hypothetical commands: gogcli was not confirmed to support the Meet API
# (check the latest version before relying on these)
gog meet list-conferences --days 7 --json
gog meet get-transcript <conferenceId> --json

# Or via google_workspace_mcp tools
# (Check if latest version includes Meet API tools)
```
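
Transcript entries arrive as many short fragments, so step 4 usually starts by merging them into speaker turns. The sketch below does that on plain dictionaries and makes no API call. It assumes each entry carries a `participant` resource name and a `text` field, as in the Meet API `TranscriptEntry` resource; the function name and the sample entries are invented for illustration.

```python
def merge_entries(entries):
    """Merge consecutive entries from the same participant into speaker turns."""
    turns = []
    for entry in entries:
        speaker = entry["participant"]
        text = entry["text"].strip()
        if turns and turns[-1]["participant"] == speaker:
            # Same speaker is still talking: extend the current turn.
            turns[-1]["text"] += " " + text
        else:
            turns.append({"participant": speaker, "text": text})
    return turns


# Invented sample entries in the assumed shape.
sample_entries = [
    {"participant": "conferenceRecords/abc/participants/p1", "text": "Let's start."},
    {"participant": "conferenceRecords/abc/participants/p1", "text": "First item is the budget."},
    {"participant": "conferenceRecords/abc/participants/p2", "text": "I can own that."},
]
for turn in merge_entries(sample_entries):
    short_id = turn["participant"].rsplit("/", 1)[-1]
    print(f"{short_id}: {turn['text']}")
```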

---

## 5. RECOMMENDED ARCHITECTURE

### Option A: Local Processing (Privacy-first)
```
Google Meet (recording)
  → Download via Drive API (gogcli/MCP)
  → Transcribe locally (whisper.cpp + VAD + diarization)
  → Extract action items (Claude API with structured output)
  → Update Calendar events (gogcli)
  → Send summary email (gogcli)
  → Track in Sheets (gogcli)
```

**Tools:**
- CLI: `gogcli` for all Google Workspace operations
- Transcription: `whisper.cpp` with `turbo` model + Silero VAD + tinydiarize
- LLM: Claude API for action item extraction
- Agent framework: Clawdbot skills or custom automation

**Pros:**
- No audio leaves your infrastructure
- No ongoing API costs for transcription
- Fast (whisper.cpp optimized)
- Full control

**Cons:**
- Requires local GPU/CPU for transcription
- Speaker diarization still experimental in whisper.cpp

---

### Option B: Cloud Transcription (Production Quality)
```
Google Meet (recording)
  → Download via Drive API (gogcli/MCP)
  → Transcribe via Assembly AI or Deepgram API
  → Extract action items (Claude API)
  → Update Google Workspace (gogcli)
```

**Tools:**
- CLI: `gogcli`
- Transcription: Assembly AI or Deepgram
- LLM: Claude API

**Pros:**
- Production-grade speaker diarization
- No local compute needed
- Faster setup

**Cons:**
- Ongoing API costs (~$0.26-0.85/hour, per the pricing in section 9)
- Audio uploaded to third party
- Internet dependency

---

### Option C: MCP-first (Claude Desktop/Code)
```
Claude Desktop/Code with google_workspace_mcp
  → Read Calendar for upcoming meetings
  → Access Drive for recordings
  → Process transcripts (local or API)
  → Update Calendar with action items
  → Send follow-up emails
```

**Tools:**
- MCP: `google_workspace_mcp` (taylorwilsdon)
- Transcription: Choice of whisper.cpp or cloud API
- Client: Claude Desktop or Claude Code

**Pros:**
- Native MCP integration
- Interactive agent workflow
- OAuth 2.1 multi-user support

**Cons:**
- Tied to MCP-compatible clients
- Python runtime required

---

## 6. CLAWDHUB SKILLS

**Status**: Need to check ClawdHub directly for Google Workspace skills.

**Note**: Since ClawdHub is ecosystem-specific, check:
- https://clawhub.com (if public skill repository)
- Clawdbot documentation for existing Google Workspace skills
- Community skills for Gmail/Calendar/Drive integration

**Potential skills to create:**
1. `google-meet-intelligence` - Full meeting workflow
2. `gogcli-wrapper` - Clawdbot skill wrapping gogcli commands
3. `meeting-action-tracker` - Track action items in Sheets

---

## 7. ACTION ITEM EXTRACTION STRATEGIES

### Strategy 1: Structured Output with Claude
```typescript
// After transcription, ask Claude for structured action items.
// Structured output is requested by forcing a tool call whose input_schema
// is the JSON Schema we want back.
import Anthropic from "@anthropic-ai/sdk";

const claude = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// `transcript` is the transcript text produced by the transcription step.
const response = await claude.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 2048,
  tools: [{
    name: "meeting_action_items",
    description: "Record the action items found in a meeting transcript",
    input_schema: {
      type: "object",
      properties: {
        action_items: {
          type: "array",
          items: {
            type: "object",
            properties: {
              task: { type: "string" },
              assignee: { type: "string" },
              due_date: { type: "string" },
              priority: { type: "string", enum: ["high", "medium", "low"] },
              context: { type: "string" }
            },
            required: ["task", "assignee"]
          }
        }
      },
      required: ["action_items"]
    }
  }],
  tool_choice: { type: "tool", name: "meeting_action_items" },
  messages: [{
    role: "user",
    content: `Extract action items from this meeting transcript:

${transcript}

For each action item provide:
- Task description
- Assignee (person responsible)
- Due date (if mentioned)
- Priority (high/medium/low)
- Context/notes`
  }]
});

// The forced tool call carries the structured result in its `input` field.
const toolUse = response.content.find((block) => block.type === "tool_use");
const actionItems = toolUse?.type === "tool_use" ? toolUse.input : { action_items: [] };
```
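
Even with a schema in place, it is worth checking the returned items before they are written to Tasks or Calendar. The sketch below is a minimal validator for the schema used above (`task` and `assignee` required, `priority` limited to high/medium/low). The function name `validate_action_items` and the sample payload are invented for illustration.

```python
ALLOWED_PRIORITIES = {"high", "medium", "low"}
REQUIRED_FIELDS = ("task", "assignee")


def validate_action_items(payload):
    """Split payload["action_items"] into (valid_items, error_messages)."""
    valid, errors = [], []
    for index, item in enumerate(payload.get("action_items", [])):
        # A required field must be a non-empty string.
        missing = [
            field for field in REQUIRED_FIELDS
            if not isinstance(item.get(field), str) or not item[field].strip()
        ]
        if missing:
            errors.append(f"item {index}: missing {', '.join(missing)}")
            continue
        priority = item.get("priority")
        if priority is not None and priority not in ALLOWED_PRIORITIES:
            errors.append(f"item {index}: invalid priority {priority!r}")
            continue
        valid.append(item)
    return valid, errors


# Invented sample payload in the shape the schema describes.
sample = {
    "action_items": [
        {"task": "Send Q1 deck", "assignee": "Ana", "priority": "high"},
        {"task": "Book room", "assignee": ""},
        {"task": "Update roadmap", "assignee": "Raj", "priority": "urgent"},
    ]
}
valid_items, problems = validate_action_items(sample)
print(len(valid_items), problems)
```

Rejected items are reported instead of silently dropped, so the agent can ask the model to retry or flag them for a human.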

### Strategy 2: Post-Meeting Workflow
```bash
#!/bin/bash
# meeting-intel.sh - Complete meeting intelligence workflow
# Note: gog subcommands and flags are as recorded during research;
# confirm them with `gog --help` for your installed version.
set -euo pipefail

MEETING_ID=${1:?usage: meeting-intel.sh <meeting-id>}
CALENDAR_ID="primary"
WHISPER_DIR="./whisper.cpp"   # assumed location of the whisper.cpp checkout

# 1. Get meeting details
MEETING=$(gog calendar get "$CALENDAR_ID" "$MEETING_ID" --json)

# 2. Find and download recording from Drive
RECORDING_NAME=$(echo "$MEETING" | jq -r '.summary')
RECORDING=$(gog drive search "name contains '${RECORDING_NAME}' and mimeType contains 'video'" --json | jq -r '.[0].id')

gog drive download "$RECORDING" --out meeting.mp4

# 3. Extract audio (16 kHz mono 16-bit PCM)
ffmpeg -i meeting.mp4 -ar 16000 -ac 1 -c:a pcm_s16le meeting.wav

# 4. Transcribe with whisper.cpp
#    (--output-file takes a path without extension; .json is appended)
"$WHISPER_DIR/build/bin/whisper-cli" -m "$WHISPER_DIR/models/ggml-large-v3-turbo.bin" \
  -f meeting.wav \
  --output-json \
  --output-file transcript

# 5. Extract action items with Claude
# whisper.cpp writes segments under .transcription[]; join their text
TRANSCRIPT=$(jq -r '[.transcription[].text] | join("")' transcript.json)

# Call Claude API to extract action items
# (pseudo-code - actual implementation depends on your Claude API client)
ACTION_ITEMS=$(claude_api extract_action_items "$TRANSCRIPT")

# 6. Create Google Tasks
echo "$ACTION_ITEMS" | jq -r '.action_items[] | .task' | while IFS= read -r TASK; do
  gog tasks add "@default" --title "$TASK"
done

# 7. Update Calendar event with summary (printf expands the newlines)
ITEM_LINES=$(echo "$ACTION_ITEMS" | jq -r '.action_items[] | "- \(.task) (\(.assignee))"')
SUMMARY=$(printf 'Meeting Summary:\n\nAction Items:\n%s' "$ITEM_LINES")
gog calendar update "$CALENDAR_ID" "$MEETING_ID" --description "$SUMMARY"

# 8. Send follow-up email (join avoids a trailing comma)
ATTENDEES=$(echo "$MEETING" | jq -r '[.attendees[].email] | join(",")')
gog gmail send \
  --to "$ATTENDEES" \
  --subject "Action Items: $RECORDING_NAME" \
  --body "$SUMMARY"
```
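
Steps 7 and 8 both need a readable summary built from the extracted items. The sketch below shows one way to format it in Python, assuming items shaped like the Strategy 1 schema (`task`, `assignee`, optional `due_date` and `priority`). The function name `format_summary` and the sample items are invented for illustration.

```python
def format_summary(title, items):
    """Build a plain-text follow-up body from a list of action-item dicts."""
    lines = [f"Meeting summary: {title}", "", "Action items:"]
    if not items:
        lines.append("- (none recorded)")
    for item in items:
        details = [f"owner: {item['assignee']}"]
        if item.get("due_date"):
            details.append(f"due: {item['due_date']}")
        if item.get("priority"):
            details.append(f"priority: {item['priority']}")
        lines.append(f"- {item['task']} ({', '.join(details)})")
    return "\n".join(lines)


# Invented sample items.
body = format_summary(
    "Q1 planning",
    [
        {"task": "Send Q1 deck", "assignee": "Ana", "due_date": "2026-02-12"},
        {"task": "Book room", "assignee": "Raj"},
    ],
)
print(body)
```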

---

## 8. FINAL RECOMMENDATIONS

### For Your Use Case (AI Agent + Meeting Intelligence):

**Primary Stack:**
1. **CLI**: `gogcli` (steipete/gogcli) ⭐
   - Most comprehensive Google Workspace access
   - JSON output perfect for agents
   - Production-ready, actively maintained

2. **Transcription**: `whisper.cpp` (ggml-org) ⭐
   - Fastest local option
   - Production-ready
   - turbo model recommended
   - Add Silero VAD for better segmentation

3. **Action Item Extraction**: Claude API with structured output
   - Use an Opus-class model (e.g. `claude-opus-4-5`, as in Strategy 1) for best reasoning on action items
   - Structured output ensures consistent parsing
   - Can extract assignees, dates, priorities

4. **Alternative if using MCP client**: `google_workspace_mcp` (taylorwilsdon)
   - If Claude Desktop/Code/VS Code MCP is your primary interface
   - Comparable coverage to gogcli, exposed via MCP (the exact service lists differ; see sections 1 and 2)

**For Production Speaker Diarization:**
- Consider Assembly AI or Deepgram if budget allows
- whisper.cpp tinydiarize is experimental but improving

**Accountability Tracking:**
- Use Google Tasks API (via gogcli)
- OR create tracking spreadsheet in Google Sheets
- OR use Google Calendar event descriptions for inline tracking
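
For the spreadsheet option, action items first have to become rows. The sketch below converts items shaped like the Strategy 1 schema into CSV text that could be imported into a tracking sheet. The column layout, the default `open` status, and the function name `to_csv` are assumptions for illustration; no Sheets API call is made.

```python
import csv
import io

# Assumed column layout for the tracking sheet.
COLUMNS = ["task", "assignee", "due_date", "priority", "status"]


def to_csv(items):
    """Render action-item dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        row = {col: item.get(col, "") for col in COLUMNS}
        row["status"] = item.get("status", "open")  # new items start as open
        writer.writerow(row)
    return buf.getvalue()


print(to_csv([{"task": "Send deck", "assignee": "Ana"}]))
```

Keeping a `status` column from the start makes it possible to report on overdue items later without changing the sheet layout.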

**Pre/Post Meeting Reports:**
- Pre: Query Calendar for upcoming events, generate agenda from past notes
- Post: Combine transcript + action items + attendee list into summary
- Distribute via Gmail (gogcli send)

---

## 9. ASSEMBLY AI & DEEPGRAM (Cloud Services)

### Assembly AI
**Website**: https://www.assemblyai.com/
**Type**: Cloud API (not a CLI)

**What it does well:**
- **Production speaker diarization** - Industry-leading speaker separation
- **Action item detection** - Built-in action item extraction
- **Topic detection** - Automatic topic segmentation
- **PII redaction** - Automatic sensitive data removal
- **Custom vocabulary** - Domain-specific terminology
- **Real-time streaming** - Live transcription
- **Multiple languages** - 100+ languages

**Pricing:** ~$0.37/hour for standard transcription, ~$0.85/hour with speaker diarization

**API Example:**
```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()

config = aai.TranscriptionConfig(
    speaker_labels=True,
    auto_chapters=True,
    entity_detection=True,
    auto_highlights=True,  # required for transcript.auto_highlights below
)

transcript = transcriber.transcribe("meeting.mp3", config)

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")

# Key phrases (a starting point for action items, not a finished list)
for item in transcript.auto_highlights.results:
    print(f"Highlight: {item.text}")
```

**Best for this use case?**
✅ **YES - for production quality** if budget allows. Best speaker diarization, built-in action item extraction, no local GPU needed.

---

### Deepgram
**Website**: https://deepgram.com/
**Type**: Cloud API (not a CLI)

**What it does well:**
- **Fastest cloud transcription** - Nova-2 model very fast
- **Good speaker diarization** - Multi-speaker detection
- **Streaming support** - Real-time transcription
- **Punctuation & formatting** - Smart formatting
- **Custom models** - Fine-tuning available

**Pricing:** ~$0.0043/minute (~$0.26/hour)

**API Example:**
```python
# Written against the v3 Python SDK interface; later SDK versions rename parts of it.
from deepgram import DeepgramClient, PrerecordedOptions

deepgram = DeepgramClient("YOUR_API_KEY")

options = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
    diarize=True,
)

# Read the recording into memory; the SDK expects raw bytes under "buffer"
with open("meeting.mp3", "rb") as f:
    audio_file = f.read()

response = deepgram.listen.prerecorded.v("1").transcribe_file(
    {"buffer": audio_file},
    options
)

# With diarize=True each word carries a speaker index
for word in response.results.channels[0].alternatives[0].words:
    print(f"Speaker {word.speaker}: {word.word}")
```
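
Printing one word per line, as above, is hard to read. The sketch below collapses a word-level, speaker-tagged sequence into speaker turns. It works on plain `(speaker, word)` pairs so it runs without the Deepgram SDK; pulling those pairs out of a real response (for example from `word.speaker` and `word.word`) is left as an assumption. The function name `words_to_turns` and the sample data are invented for illustration.

```python
from itertools import groupby


def words_to_turns(words):
    """Collapse (speaker, word) pairs into (speaker, sentence) turns.

    Consecutive words from the same speaker are joined with spaces; a new
    turn starts whenever the speaker index changes.
    """
    turns = []
    for speaker, group in groupby(words, key=lambda pair: pair[0]):
        turns.append((speaker, " ".join(word for _, word in group)))
    return turns


# Invented sample: speaker 0 talks, speaker 1 replies, speaker 0 resumes.
sample_words = [(0, "Hello"), (0, "team"), (1, "Hi"), (0, "Agenda"), (0, "first")]
for speaker, text in words_to_turns(sample_words):
    print(f"Speaker {speaker}: {text}")
```

`groupby` only merges adjacent items, which is what is wanted here: the same speaker returning later starts a new turn.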

**Best for this use case?**
✅ **YES - budget option** - Cheaper than Assembly AI, still good quality. Good balance of cost/quality.

---

### Cloud vs Local Decision Matrix

| Factor | Local (whisper.cpp) | Cloud (Assembly/Deepgram) |
|--------|---------------------|---------------------------|
| **Cost** | Free (hardware only) | ~$0.26-0.85/hour |
| **Privacy** | ✅ Audio stays local | ⚠️ Uploaded to third party |
| **Speed** | Fast (GPU) / Slow (CPU) | Very fast (API) |
| **Speaker diarization** | ⚠️ Experimental | ✅ Production-ready |
| **Action items** | Manual (LLM needed) | ✅ Built-in (Assembly AI) |
| **Setup** | Complex | Simple (API key) |
| **Internet** | Not required | Required |
| **Quality** | Excellent (large models) | Excellent |

**Recommendation:**
- **Prototype/POC**: Start with whisper.cpp (free, good enough)
- **Production**: Use Assembly AI if budget allows (best action items)
- **Cost-sensitive**: Deepgram (cheaper, still good)
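
The cost row is easier to weigh against meeting volume with a quick estimate. The sketch below multiplies meetings per week, average length, and the hourly rates quoted in this section. The rates are the approximate figures from this document, the 4.33 weeks-per-month factor is a simplification, and the function name `monthly_cost` is invented for illustration.

```python
# Approximate hourly rates quoted in this section (USD per audio hour).
RATES_PER_HOUR = {
    "deepgram": 0.26,
    "assemblyai_standard": 0.37,
    "assemblyai_diarization": 0.85,
}

WEEKS_PER_MONTH = 4.33  # simplification: 52 weeks / 12 months, rounded


def monthly_cost(meetings_per_week, avg_hours, rate_per_hour):
    """Estimate monthly transcription spend in USD, rounded to cents."""
    audio_hours = meetings_per_week * avg_hours * WEEKS_PER_MONTH
    return round(audio_hours * rate_per_hour, 2)


# Example: 10 one-hour meetings per week.
for name, rate in RATES_PER_HOUR.items():
    print(f"{name}: ${monthly_cost(10, 1.0, rate):.2f}/month")
```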

---

## 10. GITHUB STATS SUMMARY

| Tool | Stars | Last Commit | Language | Status |
|------|-------|-------------|----------|--------|
| gogcli | ~3.4k | 2025-01 | Go | ✅ Active |
| google_workspace_mcp | Growing | 2025-01 | Python | ✅ Active |
| gcalcli | ~3.5k | 2024-12 | Python | ✅ Active |
| google-workspace-cli | ~200 | 2025-01 | TypeScript | ✅ Active |
| himalaya | ~5.5k | 2024-12 | Rust | ✅ Very Active |
| whisper.cpp | ~38k | 2026-02 | C/C++ | ✅ Very Active |
| openai/whisper | ~79k | 2024-12 | Python | ✅ Active |
| Assembly AI | N/A (API) | N/A | Cloud API | ✅ Active |
| Deepgram | N/A (API) | N/A | Cloud API | ✅ Active |

---

## 11. MISSING RESEARCH: CLAWDHUB SKILLS

**Status**: Could not verify ClawdHub URL or existing Google Workspace skills due to rate limiting.

**Action Required:**
1. Check ClawdHub documentation/website directly
2. Search for existing skills:
   - `google-workspace-*`
   - `gmail-*`
   - `calendar-*`
   - `meeting-*`
3. If no existing skills, create custom skills:
   - `gogcli-wrapper` - Wraps gogcli commands for Clawdbot
   - `meeting-intelligence` - Complete meeting workflow
   - `google-meet-transcript` - Meet-specific transcript processing

**Potential Skill Structure:**
````markdown
---
name: google-workspace-meeting-intel
description: "Full Google Workspace meeting intelligence workflow"
tools:
  - gogcli (installed via brew)
  - whisper.cpp (installed locally)
  - claude API (for action items)
---

# Google Workspace Meeting Intelligence

This skill provides:
1. Calendar event monitoring
2. Meeting recording download from Drive
3. Transcript generation via whisper.cpp
4. Action item extraction via Claude
5. Calendar/Tasks update with action items
6. Follow-up email generation

## Commands

### List upcoming meetings
```bash
gog calendar events --days 7 --json
```

### Process meeting recording
```bash
./process-meeting.sh <meeting-id>
```
(Full implementation in skill repository)
````

---

## 12. NEXT STEPS FOR IMPLEMENTATION

1. **Install gogcli**
   ```bash
   brew install steipete/tap/gogcli
   gog auth credentials ~/Downloads/client_secret.json
   gog auth add your@email.com
   ```

2. **Install whisper.cpp**
   ```bash
   git clone https://github.com/ggml-org/whisper.cpp
   cd whisper.cpp
   cmake -B build -DGGML_METAL=1 # macOS with Apple Silicon
   cmake --build build -j
   sh ./models/download-ggml-model.sh large-v3-turbo
   ```

3. **Test Meeting Workflow**
   - Get Calendar events: `gog calendar events --today --json`
   - Search for recordings: `gog drive search "meeting" --json`
   - Download and transcribe
   - Extract action items via Claude API
   - Update Calendar/Tasks/Email

4. **Build Clawdbot Skill**
   - Wrap gogcli commands in skill
   - Add whisper.cpp transcription step
   - Integrate Claude API for intelligence layer
   - Package as reusable automation

---

## QUESTIONS TO CLARIFY

1. **Privacy requirements**: Local-only or cloud APIs OK?
2. **Google Workspace**: Does client have Workspace or free Gmail?
3. **Meeting platform**: Google Meet only or also Zoom/Teams?
4. **Volume**: How many meetings/week to process?
5. **Real-time**: Need live transcription during meeting or post-processing OK?
6. **Budget**: OK with Assembly AI/Deepgram costs (~$0.26-0.85/hr, per section 9) or local-only?