Aleks Gotsa
I build LLM applications, agent workflows, and the routing and evaluation infrastructure that makes them reliable in production. My work sits at the seam between research-style techniques (distillation, LoRA, RAG) and production realities such as cost routing, fallback layers, and monitoring without a dedicated SRE.
Currently shipping at GAZDA in Uzhhorod, where I've built an MCP-based automation layer wiring GPT-4 and Claude into internal tooling, and put ~40 production agent workflows in front of the business. On the side I run Tiny Minds, a 3-person experimental AI lab where I'm building a teacher-expert distillation runtime, and recently shipped Cortex, an open-source multi-pass RAG agent with per-claim verification.
Now 2026
--distill flag, 79/79 tests passing; end-to-end benchmark at 96.25% across 80 prompts.
Projects 02
Cortex · Shipped · MIT
A six-stage async research engine — plan → gather → detect gaps → synthesize → verify → remember — exposed via REST + SSE, a CLI, and an MCP server with four tools. About 6.7K LOC, ~30–90s end-to-end with live stage streaming.
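To make the stage flow concrete, here is a minimal sketch of a six-stage async pipeline that emits an event as each stage starts and finishes, which is the kind of stream an SSE endpoint could forward. It is not Cortex's actual code: the names `run_stage`, `run_pipeline`, and `collect`, the event shape, and the placeholder stage body are all assumptions made for illustration.

```python
import asyncio
from typing import AsyncIterator

# Stage names follow the description above; everything else is hypothetical.
STAGES = ["plan", "gather", "detect_gaps", "synthesize", "verify", "remember"]


async def run_stage(name: str, state: dict) -> dict:
    """Placeholder stage body. A real stage would call a model or a retriever."""
    await asyncio.sleep(0)  # yield control, as a real awaitable call would
    return {**state, "completed": state.get("completed", []) + [name]}


async def run_pipeline(query: str) -> AsyncIterator[dict]:
    """Run the stages in order, yielding one event per stage transition."""
    state: dict = {"query": query}
    for name in STAGES:
        yield {"stage": name, "status": "started"}
        state = await run_stage(name, state)
        yield {"stage": name, "status": "done"}
    yield {"stage": "pipeline", "status": "finished", "state": state}


async def collect(query: str) -> list:
    """Drain the event stream into a list (a server would stream it instead)."""
    return [event async for event in run_pipeline(query)]


events = asyncio.run(collect("example question"))
```

An async generator keeps the stages strictly ordered while still letting a consumer see progress before the run completes.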
The load-bearing piece is the per-claim verifier: it re-reads every cited source and returns confirmed / weakened / unsupported verdicts, catching synthesizer over-claims that no amount of extra retrieval can fix. Cost-routed by task shape — bounded JSON to Haiku, open-ended semantic work to Sonnet — for a ~3–4× cost reduction.
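The per-claim pass can be sketched as follows, under stated assumptions. Each claim is checked against the text of the source it cites and receives one of the three verdicts named above. The `Claim` type, the `verify_claims` function, and especially `toy_judge` are hypothetical: the toy judge uses word overlap purely as a deterministic stand-in for the model call a real verifier would make.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    CONFIRMED = "confirmed"
    WEAKENED = "weakened"
    UNSUPPORTED = "unsupported"


@dataclass
class Claim:
    text: str
    source_id: str  # id of the source the synthesizer cited for this claim


Judge = Callable[[str, str], Verdict]


def toy_judge(claim_text: str, source_text: str) -> Verdict:
    """Stand-in judge based on word overlap. Thresholds are arbitrary."""
    words = [w.strip(".,").lower() for w in claim_text.split()]
    source_words = {w.strip(".,").lower() for w in source_text.split()}
    ratio = sum(w in source_words for w in words) / len(words) if words else 0.0
    if ratio >= 0.8:
        return Verdict.CONFIRMED
    if ratio >= 0.4:
        return Verdict.WEAKENED
    return Verdict.UNSUPPORTED


def verify_claims(claims, sources, judge: Judge = toy_judge):
    """Re-read the cited source for every claim and attach a verdict."""
    results = []
    for claim in claims:
        source_text = sources.get(claim.source_id)
        # A citation that points at nothing cannot support the claim.
        verdict = Verdict.UNSUPPORTED if source_text is None else judge(claim.text, source_text)
        results.append((claim, verdict))
    return results


sources = {"s1": "The cache cut latency by 20 percent in staging."}
claims = [
    Claim("The cache cut latency by 20 percent", "s1"),
    Claim("The cache eliminated all latency everywhere", "s1"),
    Claim("Throughput doubled", "missing"),
]
verdicts = [verdict for _, verdict in verify_claims(claims, sources)]
```

Keeping the judge pluggable means the verification loop stays the same whether the verdict comes from a heuristic, as here, or from a model.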
Tiny Mind · In Progress
A local-first multi-service runtime with confidence-thresholded escalation: request → router → local expert → teacher fallback. Router, expert, memory, and teacher run as separate FastAPI processes with SQLite-backed memory.
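The escalation rule can be illustrated with a short sketch, again under assumptions. The local expert returns an answer together with a confidence score, and the request goes to the teacher only when that score falls below a threshold. The 0.7 threshold, the `ExpertAnswer` type, and the stub `style_expert` and `teacher` functions are hypothetical, and the separate FastAPI processes are collapsed into plain function calls.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ExpertAnswer:
    text: str
    confidence: float  # expert's self-reported score in [0, 1]


def answer_with_escalation(
    prompt: str,
    expert: Callable[[str], ExpertAnswer],
    teacher: Callable[[str], str],
    threshold: float = 0.7,  # hypothetical default, not a documented value
) -> Tuple[str, str]:
    """Return (answer, served_by), escalating when confidence is too low."""
    local = expert(prompt)
    if local.confidence >= threshold:
        return local.text, "expert"
    return teacher(prompt), "teacher"


def style_expert(prompt: str) -> ExpertAnswer:
    """Stub expert: confident on short prompts, unsure on long ones."""
    confidence = 0.9 if len(prompt) < 40 else 0.3
    return ExpertAnswer(text=f"local: {prompt}", confidence=confidence)


def teacher(prompt: str) -> str:
    """Stub teacher standing in for the larger fallback model."""
    return f"teacher: {prompt}"


short_result = answer_with_escalation("Tighten this sentence.", style_expert, teacher)
long_result = answer_with_escalation("x" * 80, style_expert, teacher)
```

Returning which tier served the request makes it straightforward to log how often the teacher is needed, which is the figure a distillation loop would aim to drive down.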
Three local experts (style, planning, retrieval) with per-expert confidence scoring and an isolation evaluation harness: all three score 20/20 in isolation after the v0.5 hotfixes, and the end-to-end benchmark sits at 96.25% across 80 prompts. The LoRA fine-tuning pipeline on SmolLM2-135M runs end-to-end for the teacher-trace-to-expert distillation loop; the thesis, that small experts can absorb teacher work without quality collapse, is still pending empirical validation.
Contact Open
Reach me at gotsaaleks@gmail.com. I read everything and reply to most.