Interlinear
Learn Latin From Caesar. Eventually.
The language tutor that knows why you're wrong—not just that you are
Screenshots
Interlinear landing page - Language learning that actually sticks
Results
5 error types, different fixes: typo ≠ case error ≠ vocabulary gap
Any text becomes a course: exercises, dialogs, vocabulary
Classical languages that work: Latin morphology, not pattern matching
Ever wanted to learn Latin from Caesar? Practice Greek with Cleopatra? Have Cicero correct your subjunctive?
That's where this is going.
But first, the foundation. Most language apps give you the same feedback: ❌ Wrong. Try again. That's not teaching; it's a slot machine with educational branding. The difference between a typo and a conceptual gap is the difference between "you fat-fingered it" and "you don't understand the accusative case." A human tutor sees this instantly. Duolingo doesn't try.
Interlinear does. When you write puellam instead of puella, it identifies a case error and explains why nominative was expected. Different errors, different interventions. The same evaluation harness patterns we use for production LLMs—applied to language learning.
Then: course generation. Upload any reading and the system generates a full course—comprehension exercises, translation drills, contextual dialogs, vocabulary cards. All calibrated to your demonstrated level.
Then: the dialogs come alive. AI-generated talking heads. Real-time conversation practice with historical figures, native speakers, patient tutors who never get frustrated. The error correction, the course generation, the pedagogical engine—all of it feeds into conversations that actually teach.
First course in development: Introduction to Old Norse. Academic rigor. Research preparation. And eventually, a conversation with a skald.
For Engineers
Architecture
Mastra orchestrates multi-stage content generation and error analysis. The error classifier distinguishes five categories, each triggering a different pedagogical response. CEFR calibration is continuous and per-skill (you might be B2 in vocabulary but A2 in the subjunctive). Classical language support includes real inflection analysis, not pattern matching.
Key Decisions
Error Taxonomy Over Binary Feedback
Production LLM evals distinguish retrieval failures from reasoning errors from formatting issues. Same logic applies to language learning. Different root causes need different interventions.
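A minimal sketch of how such a taxonomy might route a single answer, using the puellam/puella example. Only the three categories this page names (typo, case error, vocabulary gap) are modeled. The form table, the one-edit typo threshold, and the intervention strings are illustrative assumptions, not the production classifier.

```python
from enum import Enum


class ErrorType(Enum):
    CORRECT = "correct"
    TYPO = "typo"
    CASE_ERROR = "case_error"
    VOCABULARY_GAP = "vocabulary_gap"


# Hypothetical lookup: singular forms of "puella" and the cases each can express
# (vowel length is not marked, so the ablative shares a spelling with the nominative).
PUELLA_SINGULAR = {
    "puella": ["nominative", "vocative", "ablative"],
    "puellae": ["genitive", "dative"],
    "puellam": ["accusative"],
}

# Assumed interventions, one per error type.
INTERVENTIONS = {
    ErrorType.TYPO: "Show the corrected spelling and move on.",
    ErrorType.CASE_ERROR: "Explain which case the sentence requires and why.",
    ErrorType.VOCABULARY_GAP: "Schedule the word for vocabulary review.",
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def classify(answer: str, expected: str, paradigm: dict) -> ErrorType:
    answer = answer.strip().lower()
    if answer == expected:
        return ErrorType.CORRECT
    # A real form of the right word in the wrong slot is a grammar problem,
    # even when it is only one letter away from the expected form.
    if answer in paradigm:
        return ErrorType.CASE_ERROR
    if edit_distance(answer, expected) <= 1:
        return ErrorType.TYPO
    return ErrorType.VOCABULARY_GAP


result = classify("puellam", "puella", PUELLA_SINGULAR)
print(result.value, "->", INTERVENTIONS[result])
```

Checking the paradigm before the edit distance matters: puellam is one keystroke from puella, so a distance-only check would call it a typo and skip the case explanation.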
Course Generation From Any Text
Fixed curricula are the bottleneck. Let students learn from texts they actually care about. The platform becomes the course authoring tool.
Classical Language First
Latin morphology is the hardest test case. If you can handle a dozen case-and-number slots per noun, well over a hundred forms per verb, and proper case analysis, modern languages are comparatively easy. Built the hard thing first.
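To illustrate what inflection analysis buys over string matching, here is a small sketch for a first-declension noun. The ending table is standard Latin grammar with vowel length ignored; the function names and the stem-passing interface are hypothetical and not taken from Interlinear.

```python
# First-declension endings by (case, number); macrons are omitted,
# so some slots share a spelling.
FIRST_DECLENSION = {
    ("nominative", "singular"): "a",
    ("genitive", "singular"): "ae",
    ("dative", "singular"): "ae",
    ("accusative", "singular"): "am",
    ("ablative", "singular"): "a",
    ("vocative", "singular"): "a",
    ("nominative", "plural"): "ae",
    ("genitive", "plural"): "arum",
    ("dative", "plural"): "is",
    ("accusative", "plural"): "as",
    ("ablative", "plural"): "is",
    ("vocative", "plural"): "ae",
}


def decline(stem: str) -> dict:
    """Build every case/number form for a first-declension stem."""
    return {slot: stem + ending for slot, ending in FIRST_DECLENSION.items()}


def analyze(form: str, stem: str) -> list:
    """Return every (case, number) reading that the written form can express."""
    return [slot for slot, candidate in decline(stem).items() if candidate == form]


print(analyze("puellam", "puell"))  # one reading: accusative singular
print(analyze("puellae", "puell"))  # four readings; context must disambiguate
```

A bare string match would accept puellae without knowing which of its readings the sentence needs. Returning all of them lets the surrounding syntax choose, which is the information a case-error explanation depends on.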
Continuous CEFR Calibration
Students are uneven. A2 in one skill, B2 in another. The system tracks competency per grammatical concept rather than assigning one overall level per student.
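One way per-concept tracking could look, as a rough sketch: each concept keeps its own running score, updated after every answer, and the score is mapped to a CEFR band on demand. The exponential moving average, the smoothing factor, and the band thresholds are all assumptions made for illustration; the page does not say how Interlinear computes its calibration.

```python
from collections import defaultdict

# Assumed score thresholds for each CEFR band (illustrative, not calibrated).
CEFR_BANDS = [(0.0, "A1"), (0.2, "A2"), (0.4, "B1"), (0.6, "B2"), (0.8, "C1"), (0.95, "C2")]


class ConceptTracker:
    """Keeps one continuously updated score per grammatical concept."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha  # weight given to the newest answer
        self.scores = defaultdict(float)  # unseen concepts start at 0.0

    def record(self, concept: str, correct: bool) -> None:
        # Exponential moving average: recent answers count more than old ones.
        target = 1.0 if correct else 0.0
        self.scores[concept] += self.alpha * (target - self.scores[concept])

    def level(self, concept: str) -> str:
        # Pick the highest band whose threshold the score has reached.
        label = CEFR_BANDS[0][1]
        for threshold, name in CEFR_BANDS:
            if self.scores[concept] >= threshold:
                label = name
        return label


tracker = ConceptTracker(alpha=0.5)
for ok in (True, True, False, True):
    tracker.record("vocabulary", ok)
for ok in (True, False):
    tracker.record("subjunctive", ok)

print(tracker.level("vocabulary"))   # B2
print(tracker.level("subjunctive"))  # A2
```

Because nothing is stored at the student level, the same learner can sit at B2 for vocabulary and A2 for the subjunctive at the same time, and exercise selection can read each concept independently.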
What Was Hard
Generating coherent course progressions, not just random exercises. Each generated course needs narrative structure—skills build on each other. Still iterating on progression algorithms.
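Since the progression algorithm is still in flux, the following is only a baseline for the "skills build on each other" constraint, not the project's method: treat skills as a prerequisite graph and emit them in topological order. The skill names and the prerequisite map are invented for the example; `graphlib.TopologicalSorter` is part of the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each skill lists the skills that must come first.
PREREQUISITES = {
    "nominative case": set(),
    "accusative case": {"nominative case"},
    "present indicative": set(),
    "simple sentences": {"nominative case", "accusative case", "present indicative"},
    "present subjunctive": {"present indicative"},
}


def order_skills(prereqs: dict) -> list:
    """Return the skills in an order where every prerequisite comes first.

    Raises graphlib.CycleError if the prerequisites are circular.
    """
    return list(TopologicalSorter(prereqs).static_order())


def respects_prerequisites(order: list, prereqs: dict) -> bool:
    """Check that no skill appears before one of its prerequisites."""
    position = {skill: index for index, skill in enumerate(order)}
    return all(
        position[needed] < position[skill]
        for skill, needs in prereqs.items()
        for needed in needs
    )


lesson_order = order_skills(PREREQUISITES)
print(lesson_order)
print(respects_prerequisites(lesson_order, PREREQUISITES))
```

A topological order only guarantees that nothing is taught before its prerequisites. It says nothing about pacing or narrative, which is the part described above as still hard.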
Demo Videos
Tutor roleplay - Practice dialogs with AI feedback