I bridge the gap between what AI can do and what users actually need. A decade of shipping product at the intersection of machine learning, language systems, and human behaviour.
Senior product roles across AI-native, enterprise, and research-backed organisations.
Building applied AI tools in the research and knowledge domain. Shipped a production LLM system for corpus-level theological text analysis — semantic search, structured output pipelines, and a conversational interface over 30+ volumes.
Led product strategy for a B2B AI platform serving Fortune 500 customers. Drove 3× ARR growth through ML-powered workflow automation features, tightening the feedback loop between model iterations and user-facing outcomes.
First PM hire at a Berlin-based NLP company. Shaped core product from pre-launch through Series A — API design, developer experience, and annotation tooling that became the company's primary competitive moat.
Translated client ML research into deployed product features across banking, media, and health sectors. Pioneered the firm's shift from ad-hoc analytics to systematic product thinking, cutting delivery cycles by 40%.
Focused on the applied edge of language AI — where research meets real-world deployment.
A production AI system for deep semantic exploration of the Swedenborg corpus — over 30 volumes of 18th-century Latin and Swedish theological writing. Built on a custom RAG pipeline with structured citation, glossary-grounded generation, and multilingual support (EN/DE).
Handles nuanced rhetorical queries requiring cross-volume reasoning, not just keyword matching. Used by researchers and readers seeking coherent synthesis across a complex canon.
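The pipeline described above (retrieval over a multi-volume corpus, answers grounded in structured citations) can be sketched in miniature. This is an illustrative toy, not the production system: the passage data, names, and lexical scoring below are all stand-ins — real deployment would use embedding-based semantic search and glossary-grounded LLM generation.

```python
# Toy sketch of a retrieval step with structured citations.
# All volume names, section numbers, and texts are illustrative placeholders.
from dataclasses import dataclass


@dataclass
class Passage:
    volume: str
    section: str
    text: str


CORPUS = [
    Passage("Volume I", "§12", "charity and faith considered together"),
    Passage("Volume II", "§7", "correspondence between natural and spiritual things"),
]


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    # Toy lexical-overlap scoring standing in for semantic (embedding) search.
    terms = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda p: -len(terms & set(p.text.lower().split())),
    )[:k]


def answer(query: str, corpus: list[Passage]) -> dict:
    hits = retrieve(query, corpus)
    return {
        # In production, this context plus glossary entries would ground an LLM prompt.
        "context": "\n".join(p.text for p in hits),
        # Structured citations: every claim traceable to volume and section.
        "citations": [f"{p.volume}, {p.section}" for p in hits],
    }
```

The design point is that citations are computed from the retrieved passages themselves, so the generation layer can only reference material it was actually given.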
Civic-tech tools at the intersection of AI, participatory governance, and public deliberation — a platform for data-driven democratic engagement.
A practitioner's perspective on building product strategy around probabilistic outputs — evaluation frameworks, trust design, and failure mode taxonomy.
The model is rarely the bottleneck. The gap is almost always in how AI output gets embedded into people's existing mental models and workflows — friction in the seam, not the system.
You can't ship AI responsibly without knowing what 'good' looks like. Building evaluation infrastructure is product work, not engineering chore — it's where the strategy lives.
Users don't just want answers — they need to know when to trust them. Designing for calibrated confidence is a first-class product requirement, not a footnote.
Every user interaction is a signal about where the model's world model diverges from the user's. Products that capture this systematically compound in quality; those that don't stagnate.
Matthew has an unusually clear mental model for where AI creates genuine user value versus where it creates noise. He cuts through hype faster than anyone I've worked with.
He's the rare PM who can write a PRD, question an evaluation metric, and reframe the whole strategy in the same meeting — without losing the thread.
Matthew brought real intellectual depth to our domain. The research engine he built isn't a chatbot — it's a genuinely useful tool for people who spend their lives with this material.