The Problem
Every engineering team carries an invisible tax: context loss. New hires spend weeks reading stale docs. Experienced engineers re-answer the same questions in Slack. Critical architectural decisions live in 2-year-old threads nobody can find.
The average developer spends 23 minutes finding context for a single question. Multiply that across a 20-person team and you've lost days of productivity every week.
The Solution
Thread.ai ingests a team's entire knowledge surface — Slack history, GitHub PRs, Notion pages, Jira tickets, internal docs — and makes it instantly queryable in natural language.
Ask "why did we migrate away from MongoDB?" and get an answer with links to the exact Slack thread, PR, and ADR where that decision was made.
Technical Approach
The core system is a RAG (Retrieval-Augmented Generation) pipeline:
- Ingestion — Documents are chunked with a sliding window (512 tokens, 64-token overlap) and embedded via OpenAI's text-embedding-3-small
- Storage — Embeddings are stored in PostgreSQL with the pgvector extension for fast cosine-similarity search
- Retrieval — On query, the user's question is embedded and top-k chunks retrieved with a minimum similarity threshold of 0.78
- Generation — Retrieved context is fed to GPT-4 with strict source-citation requirements
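The ingestion step above can be sketched in a few lines. This is a minimal illustration, not the production chunker: it treats a pre-tokenized document as a flat list and assumes the stated parameters (512-token window, 64-token overlap); a real pipeline would use the model's actual tokenizer before calling the embedding API.

```python
def chunk_tokens(tokens, window=512, overlap=64):
    """Split a token list into overlapping chunks via a sliding window.

    Each chunk is `window` tokens long (the last may be shorter), and
    consecutive chunks share `overlap` tokens, so no context is lost
    at chunk boundaries.
    """
    stride = window - overlap  # how far the window advances each step
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the final window already covers the tail
    return chunks
```

With the defaults, a 1,000-token document yields three chunks, and the last 64 tokens of each chunk reappear at the start of the next.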
The strict similarity threshold proved to be the key reliability unlock: when no retrieved chunk clears it, the system answers "I don't know" rather than generating from weak context — a far better outcome than a confident hallucination.
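The thresholded retrieval step can be sketched as follows. This is a simplified in-memory version under stated assumptions (the production system runs the equivalent query inside pgvector); the index format and function names here are illustrative, but the logic matches the description: score every chunk by cosine similarity, keep only those at or above 0.78, and return nothing — triggering the "I don't know" path — when no chunk qualifies.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, index, k=5, threshold=0.78):
    """Return up to k (score, chunk) pairs with similarity >= threshold.

    `index` is a list of (chunk_text, embedding) pairs. An empty result
    signals the caller to answer "I don't know" instead of generating.
    """
    scored = sorted(
        ((cosine(query_vec, vec), text) for text, vec in index),
        reverse=True,
    )
    return [(score, text) for score, text in scored if score >= threshold][:k]
```

An empty return value is the point: the generation stage only ever sees context that cleared the bar.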
Results
After six weeks in beta with four enterprise teams:
- Time-to-answer dropped from 23 minutes to 8 minutes
- Re-ask rate dropped 40%
- Developer surveys showed 25% subjective productivity improvement
- One client estimated 30% reduction in knowledge-retrieval hours