2026-02-21 Session Notes
Running TeichAI Qwen3-14B with Ollama
User inquired about running TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-GGUF (a 14B-parameter model) with Ollama. The model is based on Qwen3 and fine-tuned on Claude Opus 4.5 reasoning datasets, optimized for coding, science, and general-purpose tasks.
Research revealed multiple GGUF quantizations available ranging from 3-bit (6.66GB) to 16-bit (29.5GB). Q4_K_M (9GB) was recommended as the optimal balance between quality and performance. Two approaches were documented: (1) direct pull via Ollama's HuggingFace integration, or (2) manual download with custom Modelfile.
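The two documented approaches can be sketched as shell commands. The Hugging Face reference format follows Ollama's hf.co integration; the exact quant tag and the local model name "teichai-qwen3-14b" are illustrative assumptions, not values confirmed in the session.

```shell
# Option 1: pull the quantized GGUF directly via Ollama's Hugging Face
# integration (format: hf.co/<user>/<repo>:<quant>)
ollama run hf.co/TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-GGUF:Q4_K_M

# Option 2: download the .gguf file manually, register it with a custom
# Modelfile, then run it under a local name of your choosing
ollama create teichai-qwen3-14b -f Modelfile
ollama run teichai-qwen3-14b
```

Option 1 is simpler for a one-off pull; option 2 is useful when you want to pin parameters or a prompt template in a Modelfile.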
It was clarified that the model name is marketing language: the model is a Qwen3-14B fine-tuned on synthetic Claude Opus 4.5 reasoning traces, not a genuine Claude model. User decided to proceed with the setup, using a Modelfile for the Q8_0 quantization (15.7GB, the 8-bit near-lossless variant).
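A Modelfile for the Q8_0 setup might look like the sketch below. The GGUF filename and parameter values are illustrative assumptions; only the FROM line is required, and the quantization choice comes from the session notes.

```
# Modelfile sketch for the manually downloaded Q8_0 GGUF
# (filename and parameter values are assumptions, adjust to your download)
FROM ./Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-Q8_0.gguf
PARAMETER temperature 0.6
PARAMETER num_ctx 8192
```

Build it with `ollama create teichai-qwen3-14b -f Modelfile` (local model name is an arbitrary choice).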