clawdbot-workspace/google-workspace-meeting-intel-research.md
2026-02-05 23:01:36 -05:00

Google Workspace + Meeting Intelligence: Tool Research

Executive Summary

Research conducted: February 5, 2026
Purpose: Identify the best tools for an AI agent with full Google Workspace integration and meeting transcript intelligence

Quick Recommendations:

  1. Google Workspace CLI: gogcli (steipete/gogcli) - most comprehensive
  2. MCP: google_workspace_mcp (taylorwilsdon) - production-ready MCP server
  3. Transcript Processing: whisper.cpp (ggml-org) - fastest local transcription

1. GOOGLE WORKSPACE CLIs

gogcli (steipete/gogcli)

Repo: https://github.com/steipete/gogcli
Stars: ~3.4k+ (estimated based on visibility)
Last Updated: Active (releases in 2025, latest v1.8+)

What it does well:

  • Most comprehensive - Gmail, Calendar, Drive, Docs, Sheets, Slides, Contacts, Tasks, Chat, Keep, Groups, Classroom
  • JSON-first output - Perfect for AI agent parsing
  • Multiple accounts - Named profiles like AWS CLI
  • Least-privilege auth - Granular scope control (--readonly, --drive-scope)
  • Service accounts - Workspace domain-wide delegation support
  • Email tracking - Built-in open tracking with Cloudflare Worker backend
  • Watch/Pub-Sub - Gmail watch with webhook support
  • Advanced calendar features - Focus time, OOO, working location, team calendars, conflict detection
  • Fast - Written in Go, single binary

Limitations:

  • Requires Google Cloud OAuth setup
  • Some features Workspace-only (Chat, Keep, Groups)
  • Email tracking needs separate Cloudflare Worker deployment

Maintenance: Actively maintained (2025 releases)

Best for this use case? YES - Primary choice. Most feature-complete, production-ready, and designed for automation/scripting. JSON output mode is perfect for AI agents. Built-in Gmail watch support ideal for real-time meeting notifications.
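A Gmail watch delivers change notifications through Cloud Pub/Sub. As a minimal sketch, assuming the webhook receives the standard Pub/Sub push envelope (a JSON body whose `message.data` field is base64-encoded JSON with `emailAddress` and `historyId`), an agent could decode a notification as follows. The function name and the sample payload are illustrative, and whether gogcli forwards the envelope unchanged is an assumption to verify against its documentation.

```python
import base64
import json


def decode_gmail_push(envelope: dict) -> dict:
    """Decode a Pub/Sub push envelope carrying a Gmail watch notification."""
    data_b64 = envelope["message"]["data"]
    payload = json.loads(base64.urlsafe_b64decode(data_b64).decode("utf-8"))
    return {
        "email": payload["emailAddress"],
        # historyId may arrive as a string or a number; normalize to int
        "history_id": int(payload["historyId"]),
    }


# Illustrative payload built locally (not captured from a real webhook)
sample_inner = json.dumps({"emailAddress": "user@example.com", "historyId": "12345"})
sample_envelope = {
    "message": {
        "data": base64.urlsafe_b64encode(sample_inner.encode("utf-8")).decode("ascii"),
        "messageId": "1",
    },
    "subscription": "projects/my-project/subscriptions/gmail-push",
}

notification = decode_gmail_push(sample_envelope)
print(notification)
```

The notification only says that the mailbox changed; the agent still has to query Gmail for what changed since the previous history ID.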

Installation:

brew install steipete/tap/gogcli
# OR
go install github.com/steipete/gogcli@latest

Key Commands for Meeting Workflow:

# Search for meeting invites
gog calendar search "meeting" --days 7 --json

# Get today's calendar
gog calendar events --today --json

# Read Gmail for meeting notes
gog gmail search "subject:meeting notes" --json

# Create Drive folder for meeting docs
gog drive mkdir "Q1 Meetings" --json

# Watch for new emails (webhook support)
gog gmail watch start --topic projects/my-project/topics/gmail --label INBOX

google-workspace-cli (ianpatrickhines)

Repo: https://github.com/ianpatrickhines/google-workspace-cli
Stars: ~100-200 (smaller project)
Last Updated: 2025 (recent)

What it does well:

  • TypeScript/Node.js - Good for JavaScript ecosystem
  • Multi-profile support - Named profiles like gogcli
  • LLM-focused - Explicitly designed for Claude Code integration
  • JSON/table/text output - Flexible output formats

Limitations:

  • Less comprehensive - Only Gmail, Calendar, Drive (no Docs, Sheets, Tasks, etc.)
  • No email tracking
  • No Pub/Sub watch
  • Requires npm ecosystem

Maintenance: Active but smaller scope

Best for this use case? ⚠️ Partial - Good if you're in Node.js ecosystem, but less comprehensive than gogcli.


gcalcli (insanum/gcalcli)

Repo: https://github.com/insanum/gcalcli
Stars: ~3.5k
Last Updated: Active (2024-2025)

What it does well:

  • Calendar-only specialist - Very mature calendar CLI
  • ASCII calendar views - Great terminal UI (calw, calm commands)
  • Agenda mode - Clean agenda display
  • Reminder execution - Can trigger commands on events
  • ICS import - Import calendar invites
  • Conky/tmux integration - Desktop/terminal integration examples

Limitations:

  • Calendar ONLY - No Gmail, Drive, Docs, etc.
  • Python-based - Additional dependency
  • OAuth setup required

Maintenance: Very mature, active

Best for this use case? ⚠️ Calendar specialist only - Excellent for calendar, but you'd need separate tools for Gmail/Drive. Use gogcli instead for unified approach.


2. MODEL CONTEXT PROTOCOL (MCP) SERVERS

google_workspace_mcp (taylorwilsdon)

Repo: https://github.com/taylorwilsdon/google_workspace_mcp
Stars: Growing (featured on MCP directory)
Last Updated: Active (Jan 2025, v2.x)

What it does well:

  • Most comprehensive MCP - Gmail, Calendar, Drive, Docs, Sheets, Slides, Forms, Tasks, Chat, Contacts, Apps Script
  • OAuth 2.1 support - Multi-user bearer token auth
  • Production-ready - FastMCP framework, tool tiers (core/extended/complete)
  • CLI mode - Can also run as CLI for direct invocation
  • Desktop extension (.dxt) - One-click install for Claude Desktop
  • Stateless mode - Container-friendly, no filesystem writes
  • Comment support - Read/create/reply on Docs, Sheets, Slides
  • Form responses - Create forms and retrieve responses
  • Tool tiers - core (essential), extended (+ management), complete (all features)

Limitations:

  • Python-based (requires Python 3.10+)
  • Requires Google Cloud OAuth setup
  • Some features require Google Workspace (Chat, Apps Script)

Maintenance: Very active, production MCP server

Best for this use case? YES - if using MCP clients (Claude Desktop, VS Code MCP, Claude Code MCP support). Most mature Google Workspace MCP available. Includes CLI mode for non-MCP workflows.

Installation:

# Via uvx (instant)
uvx workspace-mcp --tool-tier core

# Or development
git clone https://github.com/taylorwilsdon/google_workspace_mcp.git
cd google_workspace_mcp
uv run main.py --transport streamable-http

MCP vs CLI Decision:

  • Use MCP if your agent is Claude Desktop, VS Code with MCP extension, or Claude Code
  • Use CLI (gogcli) if your agent can execute shell commands and parse JSON (Codex, custom agents)
  • Use both - MCP for interactive sessions, CLI for automation scripts

3. TRANSCRIPT PROCESSING TOOLS

whisper.cpp (ggml-org)

Repo: https://github.com/ggml-org/whisper.cpp
Stars: ~38k+
Last Updated: Very active (v1.8.1, 2026)

What it does well:

  • Fastest local transcription - C/C++ implementation, ~10-30x faster than Python Whisper
  • Multiple backends - CPU, Metal (Apple Silicon), CUDA (NVIDIA), Vulkan, OpenVINO
  • Quantized models - Reduced memory (Q5_0, Q8_0 variants)
  • Low memory - Runs on edge devices (Raspberry Pi, phones)
  • CLI + library - Both command-line and C API available
  • Voice Activity Detection (VAD) - Silero-VAD integration for speech detection
  • Speaker diarization - tinydiarize experimental support
  • Streaming support - Real-time transcription from microphone
  • Multiple platforms - Linux, macOS, Windows, iOS, Android, WebAssembly

Limitations:

  • Requires ffmpeg for audio format support
  • C/C++ ecosystem (less Python-friendly than openai/whisper)
  • Speaker diarization still experimental

Maintenance: Extremely active, large community

Best for this use case? YES - Primary choice for local/edge transcription. Fastest option, production-ready, with VAD support; speaker diarization is available but still experimental (tinydiarize).

Model Sizes:

| Model | Size | VRAM | Speed | Best For |
|---|---|---|---|---|
| tiny | 75 MB | ~273 MB | ~10x | Fast, low-quality OK |
| base | 142 MB | ~388 MB | ~7x | Balanced |
| small | 466 MB | ~852 MB | ~4x | Good quality |
| medium | 1.5 GB | ~2.1 GB | ~2x | High quality |
| large | 2.9 GB | ~3.9 GB | 1x | Best quality |
| turbo | ~800 MB | ~6 GB | ~8x | Fast + accurate (recommended) |

Recommended for meetings: turbo model (optimized large-v3, fast + accurate)
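To turn the memory column into a quick check for a given machine, the figures can be put in a lookup. This sketch copies the approximate values from the table as-is (GB converted at 1 GB = 1000 MB) and does not measure anything itself; the function name is illustrative.

```python
# Approximate memory figures in MB, copied from the model table above.
MODEL_MEMORY_MB = {
    "tiny": 273,
    "base": 388,
    "small": 852,
    "medium": 2100,
    "large": 3900,
    "turbo": 6000,
}


def models_within_budget(budget_mb: int) -> list[str]:
    """Return the models whose listed memory figure fits in budget_mb."""
    return [name for name, mb in MODEL_MEMORY_MB.items() if mb <= budget_mb]


# Example: a device with roughly 1 GB to spare
candidates = models_within_budget(1000)
print(candidates)
```

Treat the result as a first filter only; actual usage depends on the backend, quantization, and audio length.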

Usage:

# Install
git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp
cmake -B build
cmake --build build -j --config Release

# Download model (the turbo model is published as large-v3-turbo)
sh ./models/download-ggml-model.sh large-v3-turbo

# Transcribe meeting recording
./build/bin/whisper-cli -m models/ggml-large-v3-turbo.bin \
  -f meeting.mp3 \
  --output-json \
  --language en

# With speaker diarization (requires the tinydiarize-enabled model)
./build/bin/whisper-cli -m models/ggml-small.en-tdrz.bin \
  -f meeting.mp3 \
  -tdrz \
  --output-json

# With VAD (Voice Activity Detection)
./build/bin/whisper-cli -m models/ggml-large-v3-turbo.bin \
  -f meeting.mp3 \
  --vad \
  --vad-model models/ggml-silero-v6.2.0.bin \
  --output-json
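When the transcription step is driven from an agent rather than typed by hand, it helps to build the argument list in one place. The sketch below only assembles the command, using the flags shown above; the function name and the placeholder paths are assumptions, and nothing is executed.

```python
import shlex
from typing import Optional


def build_whisper_command(
    binary: str,
    model: str,
    audio: str,
    language: Optional[str] = None,
    vad_model: Optional[str] = None,
) -> list[str]:
    """Assemble a whisper-cli invocation that writes JSON output."""
    cmd = [binary, "-m", model, "-f", audio, "--output-json"]
    if language:
        cmd += ["--language", language]
    if vad_model:
        # VAD needs both the switch and a path to the VAD model file
        cmd += ["--vad", "--vad-model", vad_model]
    return cmd


# Placeholder paths; substitute the model files you actually downloaded.
command = build_whisper_command(
    "./build/bin/whisper-cli",
    "models/MODEL.bin",
    "meeting.wav",
    language="en",
    vad_model="models/VAD.bin",
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` avoids shell quoting problems with recording names that contain spaces.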

openai/whisper (Python)

Repo: https://github.com/openai/whisper
Stars: ~79k+
Last Updated: Active

What it does well:

  • Official OpenAI model - Reference implementation
  • Python ecosystem - Easy integration with Python tools
  • Simple API - Easy to use
  • Multiple languages - 99 languages supported

Limitations:

  • Slow - 10-30x slower than whisper.cpp
  • Higher memory - More VRAM required
  • Python dependency overhead

Maintenance: Official OpenAI, maintained

Best for this use case? ⚠️ Use whisper.cpp instead - Same models, much faster. Only use if you need Python API specifically.


Assembly AI CLI

Note: Assembly AI is a cloud API service, not a CLI. It requires an API key and an internet connection. See section 9 for pricing and an API example.

What it does well:

  • Speaker diarization (production-ready)
  • Action item extraction
  • Topic detection
  • PII redaction
  • Custom vocabulary

Limitations:

  • Cloud service - Requires internet, API costs
  • Privacy - Audio uploaded to third party
  • API rate limits

Best for this use case? ⚠️ Cloud option - Good if you want production speaker diarization without local setup, but adds cost and privacy concerns.


Deepgram CLI

Note: Deepgram is also a cloud API service, not a CLI. See section 9 for pricing and an API example.

Similar to Assembly AI:

  • Cloud-based
  • Good speaker diarization
  • Fast transcription
  • API costs

Best for this use case? ⚠️ Cloud option - Alternative to Assembly AI, similar tradeoffs.


4. GOOGLE MEET TRANSCRIPT ACCESS

Google Meet REST API

Docs: https://developers.google.com/workspace/meet/api/guides/overview

What it provides:

  • Access to conference metadata
  • Recording URLs
  • Transcript entries - conferenceRecords.transcripts.entries

Limitations:

  • Requires Google Workspace (not free Gmail)
  • Transcription must be enabled in meeting
  • Admin policy controls access

Integration approach: Use gogcli or google_workspace_mcp with Meet API access to:

  1. List recent conferences
  2. Get transcript entries
  3. Download transcript
  4. Process with action item extraction

Example workflow:

# If gogcli adds Meet API support (check latest version)
gog meet list-conferences --days 7 --json
gog meet get-transcript <conferenceId> --json

# Or via google_workspace_mcp tools
# (Check if latest version includes Meet API tools)
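If neither tool exposes Meet yet, the transcript entries can be read from the REST API directly. The sketch below is written under assumptions: it targets the v2 endpoint `conferenceRecords.transcripts.entries`, expects an OAuth access token with a Meet read scope to be obtained elsewhere, reads only the first page of results, and assumes each entry carries `participant` and `text` fields. Check the linked docs before relying on any of these names.

```python
import json
import urllib.request

MEET_API_BASE = "https://meet.googleapis.com/v2"


def entries_url(transcript_name: str) -> str:
    """Build the entries URL from a transcript resource name.

    transcript_name looks like "conferenceRecords/<id>/transcripts/<id>".
    """
    return f"{MEET_API_BASE}/{transcript_name}/entries"


def fetch_entries(transcript_name: str, access_token: str) -> list:
    """Fetch the first page of transcript entries (pagination not handled)."""
    request = urllib.request.Request(
        entries_url(transcript_name),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body.get("transcriptEntries", [])


def entries_to_text(entries: list) -> str:
    """Flatten entries into one 'participant: text' line per entry."""
    return "\n".join(f"{e['participant']}: {e['text']}" for e in entries)


# Made-up entries to show the flattening step without calling the API
sample_entries = [
    {"participant": "conferenceRecords/abc/participants/1", "text": "Let's start."},
    {"participant": "conferenceRecords/abc/participants/2", "text": "I will send the notes."},
]
flattened = entries_to_text(sample_entries)
print(flattened)
```

The `participant` value is a resource name rather than a display name, so a real workflow would resolve it through the participants endpoint before handing the text to the LLM.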

Option A: Local Processing (Privacy-first)

Google Meet (recording) 
  → Download via Drive API (gogcli/MCP)
  → Transcribe locally (whisper.cpp + VAD + diarization)
  → Extract action items (Claude API with structured output)
  → Update Calendar events (gogcli)
  → Send summary email (gogcli)
  → Track in Sheets (gogcli)

Tools:

  • CLI: gogcli for all Google Workspace operations
  • Transcription: whisper.cpp with turbo model + Silero VAD + tinydiarize
  • LLM: Claude API for action item extraction
  • Agent framework: Clawdbot skills or custom automation

Pros:

  • No audio leaves your infrastructure
  • No ongoing API costs for transcription
  • Fast (whisper.cpp optimized)
  • Full control

Cons:

  • Requires local GPU/CPU for transcription
  • Speaker diarization still experimental in whisper.cpp

Option B: Cloud Transcription (Production Quality)

Google Meet (recording)
  → Download via Drive API (gogcli/MCP)
  → Transcribe via Assembly AI or Deepgram API
  → Extract action items (Claude API)
  → Update Google Workspace (gogcli)

Tools:

  • CLI: gogcli
  • Transcription: Assembly AI or Deepgram
  • LLM: Claude API

Pros:

  • Production-grade speaker diarization
  • No local compute needed
  • Faster setup

Cons:

  • Ongoing API costs (~$0.26-0.85/hour, per the pricing in section 9)
  • Audio uploaded to third party
  • Internet dependency

Option C: MCP-first (Claude Desktop/Code)

Claude Desktop/Code with google_workspace_mcp
  → Read Calendar for upcoming meetings
  → Access Drive for recordings
  → Process transcripts (local or API)
  → Update Calendar with action items
  → Send follow-up emails

Tools:

  • MCP: google_workspace_mcp (taylorwilsdon)
  • Transcription: Choice of whisper.cpp or cloud API
  • Client: Claude Desktop or Claude Code

Pros:

  • Native MCP integration
  • Interactive agent workflow
  • OAuth 2.1 multi-user support

Cons:

  • Tied to MCP-compatible clients
  • Python runtime required

6. CLAWDHUB SKILLS

Status: Need to check ClawdHub directly for Google Workspace skills.

Note: Since ClawdHub is ecosystem-specific, check:

  • https://clawhub.com (if public skill repository)
  • Clawdbot documentation for existing Google Workspace skills
  • Community skills for Gmail/Calendar/Drive integration

Potential skills to create:

  1. google-meet-intelligence - Full meeting workflow
  2. gogcli-wrapper - Clawdbot skill wrapping gogcli commands
  3. meeting-action-tracker - Track action items in Sheets

7. ACTION ITEM EXTRACTION STRATEGIES

Strategy 1: Structured Output with Claude

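// After transcription, have Claude return action items as structured JSON.
// The Anthropic Messages API does not take an OpenAI-style `response_format`;
// forcing a tool call with an input schema is one way to get schema-shaped output.
import Anthropic from "@anthropic-ai/sdk";

const claude = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// `transcript` is assumed to hold the meeting transcript text.
const response = await claude.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 4096, // required by the Messages API
  tools: [{
    name: "record_action_items",
    description: "Record the action items extracted from a meeting transcript",
    input_schema: {
      type: "object",
      properties: {
        action_items: {
          type: "array",
          items: {
            type: "object",
            properties: {
              task: { type: "string" },
              assignee: { type: "string" },
              due_date: { type: "string" },
              priority: { type: "string", enum: ["high", "medium", "low"] },
              context: { type: "string" }
            },
            required: ["task", "assignee"]
          }
        }
      },
      required: ["action_items"]
    }
  }],
  // Force the model to answer through the tool so the output follows the schema
  tool_choice: { type: "tool", name: "record_action_items" },
  messages: [{
    role: "user",
    content: `Extract action items from this meeting transcript:

${transcript}

For each action item provide:
- Task description
- Assignee (person responsible)
- Due date (if mentioned)
- Priority (high/medium/low)
- Context/notes`
  }]
});

// The forced tool call carries the structured result in its `input` field.
const toolUse = response.content.find((block) => block.type === "tool_use");
const actionItems = toolUse.input.action_items;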
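Whatever mechanism produces the JSON, it is worth validating the result before it drives Tasks or email. A minimal sketch, assuming the output follows the `meeting_action_items` shape above (an `action_items` array with required `task` and `assignee`); the function name and the sample data are illustrative.

```python
import json

ALLOWED_PRIORITIES = {"high", "medium", "low"}


def parse_action_items(raw_json: str) -> list[dict]:
    """Parse model output and keep only well-formed action items."""
    data = json.loads(raw_json)
    items = []
    for item in data.get("action_items", []):
        # task and assignee are required by the schema; skip incomplete items
        if not item.get("task") or not item.get("assignee"):
            continue
        priority = item.get("priority")
        if priority not in ALLOWED_PRIORITIES:
            priority = None  # drop values outside the enum instead of failing
        items.append({
            "task": item["task"],
            "assignee": item["assignee"],
            "due_date": item.get("due_date"),
            "priority": priority,
            "context": item.get("context"),
        })
    return items


# Made-up model output: one valid item, one missing an assignee, one bad priority
sample_output = json.dumps({"action_items": [
    {"task": "Send budget draft", "assignee": "Alice", "priority": "high"},
    {"task": "Book room"},
    {"task": "Review contract", "assignee": "Bob", "priority": "urgent"},
]})
parsed = parse_action_items(sample_output)
print(parsed)
```

Skipping malformed items rather than raising keeps one bad entry from blocking the rest of the follow-up workflow; a stricter pipeline could log or reject instead.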

Strategy 2: Post-Meeting Workflow

#!/bin/bash
# meeting-intel.sh - Complete meeting intelligence workflow
# Usage: ./meeting-intel.sh <calendar-event-id>
# Run from the whisper.cpp checkout so the relative binary and model paths resolve.
set -euo pipefail

MEETING_ID="${1:?usage: $0 <calendar-event-id>}"
CALENDAR_ID="primary"

# 1. Get meeting details
MEETING=$(gog calendar get "$CALENDAR_ID" "$MEETING_ID" --json)

# 2. Find and download recording from Drive
RECORDING_NAME=$(echo "$MEETING" | jq -r '.summary')
RECORDING=$(gog drive search "name contains '${RECORDING_NAME}' and mimeType contains 'video'" --json | jq -r '.[0].id')

gog drive download "$RECORDING" --out meeting.mp4

# 3. Extract audio (16 kHz mono 16-bit PCM WAV)
ffmpeg -i meeting.mp4 -ar 16000 -ac 1 -c:a pcm_s16le meeting.wav

# 4. Transcribe with whisper.cpp
# --output-file takes the name without extension; --output-json appends .json
./build/bin/whisper-cli -m models/ggml-large-v3-turbo.bin \
  -f meeting.wav \
  --output-json \
  --output-file transcript

# 5. Extract action items with Claude
# whisper.cpp writes segments under .transcription[]; collect their text fields
TRANSCRIPT=$(jq -r '.transcription[].text' transcript.json)

# Call Claude API to extract action items
# (pseudo-code - claude_api is a placeholder for your own Claude API client)
ACTION_ITEMS=$(claude_api extract_action_items "$TRANSCRIPT")

# 6. Create Google Tasks (IFS= and -r keep whitespace and backslashes intact)
echo "$ACTION_ITEMS" | jq -r '.action_items[].task' | while IFS= read -r TASK; do
  gog tasks add "@default" --title "$TASK"
done

# 7. Update Calendar event with summary
# printf expands \n; a plain double-quoted string would keep it literal
ITEM_LIST=$(echo "$ACTION_ITEMS" | jq -r '.action_items[] | "- \(.task) (\(.assignee))"')
SUMMARY=$(printf 'Meeting Summary:\n\nAction Items:\n%s' "$ITEM_LIST")
gog calendar update "$CALENDAR_ID" "$MEETING_ID" --description "$SUMMARY"

# 8. Send follow-up email (join avoids the trailing comma that tr would leave)
ATTENDEES=$(echo "$MEETING" | jq -r '[.attendees[].email] | join(",")')
gog gmail send \
  --to "$ATTENDEES" \
  --subject "Action Items: $RECORDING_NAME" \
  --body "$SUMMARY"
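The summary that goes into the calendar description and the follow-up email is easier to keep readable when it is rendered from the parsed action items rather than pasted as raw JSON. A sketch, assuming items shaped like the Strategy 1 schema (`task`, `assignee`, optional `due_date`); the function name and sample data are illustrative.

```python
def format_summary(title: str, action_items: list[dict]) -> str:
    """Render a plain-text meeting summary with one numbered line per item."""
    lines = [f"Meeting Summary: {title}", "", "Action Items:"]
    for number, item in enumerate(action_items, start=1):
        due = f" (due {item['due_date']})" if item.get("due_date") else ""
        lines.append(f"{number}. {item['task']} - {item['assignee']}{due}")
    if not action_items:
        lines.append("(none recorded)")
    return "\n".join(lines)


# Made-up items for illustration
summary = format_summary("Q1 Planning", [
    {"task": "Send budget draft", "assignee": "Alice", "due_date": "2026-02-12"},
    {"task": "Review contract", "assignee": "Bob"},
])
print(summary)
```

The resulting string contains real newlines, so it can be passed as a single argument to whichever command sends the email or updates the event.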

8. FINAL RECOMMENDATIONS

For Your Use Case (AI Agent + Meeting Intelligence):

Primary Stack:

  1. CLI: gogcli (steipete/gogcli)

    • Most comprehensive Google Workspace access
    • JSON output perfect for agents
    • Production-ready, actively maintained
  2. Transcription: whisper.cpp (ggml-org)

    • Fastest local option
    • Production-ready
    • turbo model recommended
    • Add Silero VAD for better segmentation
  3. Action Item Extraction: Claude API with structured output

    • Use a Claude Opus model (e.g. claude-opus-4-5, as in Strategy 1) for best reasoning on action items
    • Structured output ensures consistent parsing
    • Can extract assignees, dates, priorities
  4. Alternative if using MCP client: google_workspace_mcp (taylorwilsdon)

    • If Claude Desktop/Code/VS Code MCP is your primary interface
    • Comparable Google Workspace coverage to gogcli, exposed over MCP instead of a shell CLI

For Production Speaker Diarization:

  • Consider Assembly AI or Deepgram if budget allows
  • whisper.cpp tinydiarize is experimental but improving

Accountability Tracking:

  • Use Google Tasks API (via gogcli)
  • OR create tracking spreadsheet in Google Sheets
  • OR use Google Calendar event descriptions for inline tracking
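For the spreadsheet option, one low-effort route is to emit rows as CSV and import or append them to the tracking sheet. The sketch below makes no Sheets API call; the column layout, the `open` default status, and the function name are assumptions chosen for illustration.

```python
import csv
import io

HEADER = ["Meeting", "Task", "Assignee", "Due date", "Priority", "Status"]


def to_tracking_csv(meeting: str, action_items: list[dict]) -> str:
    """Serialize action items as CSV rows for an accountability sheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for item in action_items:
        writer.writerow([
            meeting,
            item["task"],
            item["assignee"],
            item.get("due_date") or "",
            item.get("priority") or "",
            "open",  # every new item starts as open
        ])
    return buffer.getvalue()


# Made-up item for illustration
tracking_csv = to_tracking_csv("Q1 Planning", [
    {"task": "Send budget draft", "assignee": "Alice",
     "due_date": "2026-02-12", "priority": "high"},
])
print(tracking_csv)
```

Using the `csv` module rather than string joins keeps tasks that contain commas or quotes from breaking the column layout.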

Pre/Post Meeting Reports:

  • Pre: Query Calendar for upcoming events, generate agenda from past notes
  • Post: Combine transcript + action items + attendee list into summary
  • Distribute via Gmail (gogcli send)

9. ASSEMBLY AI & DEEPGRAM (Cloud Services)

Assembly AI

Website: https://www.assemblyai.com/
Type: Cloud API (not a CLI)

What it does well:

  • Production speaker diarization - Industry-leading speaker separation
  • Action item detection - Built-in action item extraction
  • Topic detection - Automatic topic segmentation
  • PII redaction - Automatic sensitive data removal
  • Custom vocabulary - Domain-specific terminology
  • Real-time streaming - Live transcription
  • Multiple languages - 100+ languages

Pricing: ~$0.37/hour for standard transcription, ~$0.85/hour with speaker diarization

API Example:

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"
transcriber = aai.Transcriber()

config = aai.TranscriptionConfig(
    speaker_labels=True,    # enables transcript.utterances with speaker labels
    auto_chapters=True,
    entity_detection=True,
    auto_highlights=True,   # required for transcript.auto_highlights below
)

transcript = transcriber.transcribe("meeting.mp3", config)

for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")

# Key phrases (auto highlights): candidate topics to review,
# not assigned action items
for item in transcript.auto_highlights.results:
    print(f"Highlight: {item.text}")

Best for this use case? YES - for production quality if budget allows. Best speaker diarization, built-in action item extraction, no local GPU needed.


Deepgram

Website: https://deepgram.com/
Type: Cloud API (not a CLI)

What it does well:

  • Fastest cloud transcription - Nova-2 model very fast
  • Good speaker diarization - Multi-speaker detection
  • Streaming support - Real-time transcription
  • Punctuation & formatting - Smart formatting
  • Custom models - Fine-tuning available

Pricing: $0.0043/minute ($0.26/hour)

API Example:

from deepgram import DeepgramClient, PrerecordedOptions

deepgram = DeepgramClient("YOUR_API_KEY")

options = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
    diarize=True,
)

# Read the recording into memory; the raw bytes are passed under "buffer"
with open("meeting.mp3", "rb") as f:
    audio_file = f.read()

response = deepgram.listen.prerecorded.v("1").transcribe_file(
    {"buffer": audio_file},
    options,
)

# With diarize=True each word carries a speaker index
for word in response.results.channels[0].alternatives[0].words:
    print(f"Speaker {word.speaker}: {word.word}")
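Printing one line per word is hard to read for a full meeting. A small post-processing step can merge consecutive words from the same speaker into turns. This sketch works on plain dicts with `speaker` and `word` keys, mirroring the attributes used above; converting SDK response objects into such dicts is left as an assumption, and the sample words are made up.

```python
from itertools import groupby


def words_to_turns(words: list[dict]) -> list[tuple]:
    """Merge consecutive words by the same speaker into (speaker, text) turns."""
    turns = []
    # groupby only groups adjacent items, which is what a speaker turn is
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        turns.append((speaker, " ".join(w["word"] for w in group)))
    return turns


sample_words = [
    {"speaker": 0, "word": "Hello"},
    {"speaker": 0, "word": "team"},
    {"speaker": 1, "word": "Hi"},
    {"speaker": 0, "word": "Agenda"},
]
turns = words_to_turns(sample_words)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```

The turn-level text is also a better input for action item extraction than a word stream, since assignee cues usually depend on who is speaking.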

Best for this use case? YES - budget option - Cheaper than Assembly AI, still good quality. Good balance of cost/quality.


Cloud vs Local Decision Matrix

| Factor | Local (whisper.cpp) | Cloud (Assembly/Deepgram) |
|---|---|---|
| Cost | Free (hardware only) | ~$0.26-0.85/hour |
| Privacy | Audio stays local | ⚠️ Uploaded to third party |
| Speed | Fast (GPU) / Slow (CPU) | Very fast (API) |
| Speaker diarization | ⚠️ Experimental | Production-ready |
| Action items | Manual (LLM needed) | Built-in (Assembly AI) |
| Setup | Complex | Simple (API key) |
| Internet | Not required | Required |
| Quality | Excellent (large models) | Excellent |

Recommendation:

  • Prototype/POC: Start with whisper.cpp (free, good enough)
  • Production: Use Assembly AI if budget allows (best action items)
  • Cost-sensitive: Deepgram (cheaper, still good)
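To put the cost row in concrete terms, the hourly rates quoted in this section can be multiplied by expected meeting volume. The rates below are the approximate figures listed above and may be out of date; the dictionary keys and function name are illustrative.

```python
# Approximate USD per audio hour, as quoted in this document.
RATES_PER_HOUR = {
    "deepgram": 0.26,
    "assemblyai_standard": 0.37,
    "assemblyai_diarization": 0.85,
}


def monthly_cost(audio_hours_per_month: float, rate_per_hour: float) -> float:
    """Estimated monthly transcription spend, rounded to cents."""
    return round(audio_hours_per_month * rate_per_hour, 2)


# Example volume: 10 one-hour meetings per week over 4 weeks = 40 audio hours
hours = 10 * 1 * 4
estimates = {name: monthly_cost(hours, rate) for name, rate in RATES_PER_HOUR.items()}
print(estimates)
```

At this example volume the cloud bill stays in the tens of dollars per month, so privacy requirements and diarization quality are likely to weigh more in the decision than price.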

10. GITHUB STATS SUMMARY

| Tool | Stars | Last Commit | Language | Status |
|---|---|---|---|---|
| gogcli | ~3.4k | 2025-01 | Go | Active |
| google_workspace_mcp | Growing | 2025-01 | Python | Active |
| gcalcli | ~3.5k | 2024-12 | Python | Active |
| google-workspace-cli | ~200 | 2025-01 | TypeScript | Active |
| himalaya | ~5.5k | 2024-12 | Rust | Very Active |
| whisper.cpp | ~38k | 2026-02 | C/C++ | Very Active |
| openai/whisper | ~79k | 2024-12 | Python | Active |
| Assembly AI | N/A (API) | N/A | Cloud API | Active |
| Deepgram | N/A (API) | N/A | Cloud API | Active |

11. MISSING RESEARCH: CLAWDHUB SKILLS

Status: Could not verify ClawdHub URL or existing Google Workspace skills due to rate limiting.

Action Required:

  1. Check ClawdHub documentation/website directly
  2. Search for existing skills:
    • google-workspace-*
    • gmail-*
    • calendar-*
    • meeting-*
  3. If no existing skills, create custom skills:
    • gogcli-wrapper - Wraps gogcli commands for Clawdbot
    • meeting-intelligence - Complete meeting workflow
    • google-meet-transcript - Meet-specific transcript processing

Potential Skill Structure:

---
name: google-workspace-meeting-intel
description: "Full Google Workspace meeting intelligence workflow"
tools:
  - gogcli (installed via brew)
  - whisper.cpp (installed locally)
  - claude API (for action items)
---

# Google Workspace Meeting Intelligence

This skill provides:
1. Calendar event monitoring
2. Meeting recording download from Drive
3. Transcript generation via whisper.cpp
4. Action item extraction via Claude
5. Calendar/Tasks update with action items
6. Follow-up email generation

## Commands

### List upcoming meetings
```bash
gog calendar events --days 7 --json
```

### Process meeting recording
```bash
./process-meeting.sh <meeting-id>
```

(Full implementation in skill repository)


12. NEXT STEPS FOR IMPLEMENTATION

  1. Install gogcli

    brew install steipete/tap/gogcli
    gog auth credentials ~/Downloads/client_secret.json
    gog auth add your@email.com

  2. Install whisper.cpp

    git clone https://github.com/ggml-org/whisper.cpp
    cd whisper.cpp
    cmake -B build -DGGML_METAL=1  # macOS with Apple Silicon
    cmake --build build -j
    sh ./models/download-ggml-model.sh large-v3-turbo  # the turbo model

  3. Test Meeting Workflow

    • Get Calendar events: gog calendar events --today --json
    • Search for recordings: gog drive search "meeting" --json
    • Download and transcribe
    • Extract action items via Claude API
    • Update Calendar/Tasks/Email
  4. Build Clawdbot Skill

    • Wrap gogcli commands in skill
    • Add whisper.cpp transcription step
    • Integrate Claude API for intelligence layer
    • Package as reusable automation

QUESTIONS TO CLARIFY

  1. Privacy requirements: Local-only or cloud APIs OK?
  2. Google Workspace: Does client have Workspace or free Gmail?
  3. Meeting platform: Google Meet only or also Zoom/Teams?
  4. Volume: How many meetings/week to process?
  5. Real-time: Need live transcription during meeting or post-processing OK?
  6. Budget: OK with Assembly AI/Deepgram costs (~$0.26-0.85/hr, per section 9) or local-only?