Building an MDA Trends Platform — A Case Study
An end-to-end demonstration of a production AI stack: open-world extraction, semantic clustering, and a two-tier LLM design that keeps a real product under $0.50/month in steady state.
U.S. public companies file periodic reports with the SEC (the annual 10-K and the quarterly 10-Q) explaining what's going on in their business. Each report has a section called MD&A (Management's Discussion and Analysis) where executives describe, in plain English, what's changing, what worries them, and where they see things heading. It's the most human part of an otherwise very dry document.
mda-trends asks one question: what are companies starting to say that they weren't saying last quarter? That's where the early signal is.
How it works
Every day the system pulls the latest filings from SEC EDGAR and compares each company's MD&A against its previous one, keeping only what's new. Boilerplate legal language gets filtered out. What's left are the paragraphs where something actually changed.
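As a rough illustration of the "only what's new" step, here is a minimal sketch using just the Python standard library. It assumes paragraphs are separated by blank lines and that a paragraph counts as unchanged when it matches a previous paragraph after lowercasing and whitespace cleanup. The function names and the boilerplate patterns are hypothetical, not taken from the project.

```python
import re

# Hypothetical boilerplate markers; a real filter would be tuned on actual filings.
BOILERPLATE_PATTERNS = [
    re.compile(r"forward-looking statements", re.IGNORECASE),
    re.compile(r"private securities litigation reform act", re.IGNORECASE),
]


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def normalize(paragraph: str) -> str:
    """Lowercase and collapse whitespace so cosmetic edits do not count as new text."""
    return re.sub(r"\s+", " ", paragraph.lower()).strip()


def is_boilerplate(paragraph: str) -> bool:
    """True if the paragraph matches any of the (assumed) boilerplate markers."""
    return any(pattern.search(paragraph) for pattern in BOILERPLATE_PATTERNS)


def new_paragraphs(previous: str, current: str) -> list[str]:
    """Return paragraphs of `current` that are absent from `previous` and not boilerplate."""
    seen = {normalize(p) for p in split_paragraphs(previous)}
    return [
        p
        for p in split_paragraphs(current)
        if normalize(p) not in seen and not is_boilerplate(p)
    ]
```

Exact matching after normalization is the simplest possible choice. A paragraph with one reworded sentence would show up as entirely new, so a fuzzier comparison (for example a similarity ratio with a cutoff) may be a better fit in practice.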
Those paragraphs go to an AI model that extracts structured signals: what the topic is and which direction it's moving. A clustering algorithm then groups similar signals across companies to find themes that are spreading across industries, not just one sector having a bad quarter. A second, more capable model names each cluster and checks whether it actually holds together as a real theme.
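To make the grouping step concrete, here is a small pure-Python sketch of agglomerative clustering with cosine distance. It is not the project's implementation: the average-linkage rule, the `distance_threshold` of 0.3, and the function names are all assumptions for illustration, and toy 2-D vectors stand in for real sentence embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def average_linkage(c1: list[int], c2: list[int], vectors: list[list[float]]) -> float:
    """Mean pairwise cosine distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_cluster(
    vectors: list[list[float]], distance_threshold: float = 0.3
) -> list[list[int]]:
    """Merge the closest pair of clusters until no pair is within the threshold.

    Returns clusters as lists of indices into `vectors`. There is no random
    initialization, and ties go to the first pair found, so the same input
    always gives the same output.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > distance_threshold:
            break  # remaining clusters are too far apart to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

For the toy input `[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]` this returns `[[0, 1], [2, 3]]`: the two near-parallel pairs merge, and the resulting groups stay separate because their average distance is well above the threshold. Using a distance threshold instead of a fixed cluster count means the number of themes does not have to be known in advance.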
The output is a two-lane feed: confirmed trends that have hit a threshold of evidence, and emerging signals that are still forming.
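The two-lane split can be sketched as a simple routing function. The source does not say what the evidence threshold actually measures, so this example assumes it is the number of distinct companies contributing to a cluster, with a made-up cutoff of 5. The `Cluster` type and all names here are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff: distinct companies needed before a theme counts as confirmed.
MIN_COMPANIES_CONFIRMED = 5


@dataclass(frozen=True)
class Cluster:
    name: str
    company_ids: frozenset  # distinct companies whose signals fell into this cluster


def route(clusters: list, min_companies: int = MIN_COMPANIES_CONFIRMED) -> dict:
    """Place each cluster name in the 'confirmed' or 'emerging' lane."""
    feed = {"confirmed": [], "emerging": []}
    for cluster in clusters:
        lane = "confirmed" if len(cluster.company_ids) >= min_companies else "emerging"
        feed[lane].append(cluster.name)
    return feed
```

Counting distinct companies, not raw signals, keeps one very talkative filer from promoting a theme on its own, which fits the goal of finding themes that spread across companies.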
The stack
| Layer | Tool |
| --- | --- |
| Data | SEC EDGAR full-text API (free, no auth required) |
| Orchestration | GitHub Actions cron job — runs daily at 6am UTC |
| Extraction | Claude Haiku 4.5 via Batch API (50% cheaper, schema-enforced output) |
| Embeddings | MiniLM running locally on CPU — no external API |
| Clustering | Agglomerative clustering with cosine distance — fully deterministic |
| Naming & quality gate | Claude Sonnet 4.6 — one call per cluster |
| Database | Supabase (Postgres + pgvector, free tier) |
| Frontend | Next.js 14 on Vercel |
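One way to picture the "schema-enforced output" from the extraction row is a strict parser on the receiving side that rejects anything outside the expected shape. The sketch below assumes each signal is a JSON object with exactly two fields, `topic` and `direction`, and that direction comes from a small fixed set. The field names and allowed values are illustrative guesses, not the project's actual schema, and no model API is called here.

```python
import json
from dataclasses import dataclass

# Hypothetical direction vocabulary; the real schema may differ.
ALLOWED_DIRECTIONS = {"increasing", "decreasing", "stable"}


@dataclass(frozen=True)
class Signal:
    topic: str
    direction: str


def parse_signal(raw: str) -> Signal:
    """Parse one model response and fail loudly if it does not match the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"topic", "direction"}:
        raise ValueError("expected an object with exactly 'topic' and 'direction'")
    topic, direction = data["topic"], data["direction"]
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("topic must be a non-empty string")
    if not isinstance(direction, str) or direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    return Signal(topic=topic.strip(), direction=direction)
```

Keeping the direction field to a closed set while leaving the topic as free text matches the open-world idea in the introduction: topics are not fixed in advance, but every signal still has a shape that the clustering step can rely on.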