Research instrument · visual storytelling primitives exposed to an instrumented agent
Graphix
AI-Native Graphic Novels
201 MCP tools exploring the structural primitives of visual storytelling, from narrative decomposition to print-ready pages
Results

- 201 MCP tools: projects, characters, panels, generation, composition
- 3,400+ tests passing: core, server, UI across Bun + Vitest + Playwright
- 4 packages: core, server, client, UI monorepo
A story goes in as natural language. What comes out is beats, images, layouts, compositions, print-ready pages. Somewhere between the input and the output, something interesting happens: the narrative gets decomposed into structural primitives that scaffold everything downstream.
That decomposition is what I'm actually studying. What is the fundamental representation of the kernels of storytelling? Not "how do you generate pretty pictures with AI," but: what are the minimal structural units that a story needs to become a visual sequence? Graphix is the first tool that binds everything required to explore that question in one place, exposed to an instrumented agent that lets me begin approaching it experimentally.
The product layer works: you write story beats, and the system scaffolds panels from narrative structure. Characters maintain visual consistency through IP-Adapter embeddings and LoRA associations. ControlNet handles pose and composition. A generation tree tracks every variant so you can branch and compare. 201 MCP tools give Claude control over the full pipeline from project creation through page layout and PDF export, with the ComfyUI MCP server handling the actual image generation.
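The structural primitives can be sketched, in heavily simplified form, as plain types. The names here (`Beat`, `Panel`, `scaffoldPanels`) and the weight-based panel allocation are illustrative assumptions, not Graphix's actual API:

```typescript
// Hypothetical, simplified primitives -- illustrative names, not Graphix's real API.

interface Beat {
  id: string;
  summary: string;      // natural-language description of the story moment
  characters: string[]; // character ids appearing in this beat
  weight: number;       // narrative importance, used to allocate panels
}

interface Panel {
  beatId: string;
  index: number; // position within the beat's panel run
  prompt: string; // seed prompt handed to the generation layer
}

// One plausible mapping: more important beats get more panels.
function scaffoldPanels(beats: Beat[], panelBudget: number): Panel[] {
  const totalWeight = beats.reduce((sum, b) => sum + b.weight, 0);
  const panels: Panel[] = [];
  for (const beat of beats) {
    const count = Math.max(1, Math.round((beat.weight / totalWeight) * panelBudget));
    for (let i = 0; i < count; i++) {
      panels.push({
        beatId: beat.id,
        index: i,
        prompt: `${beat.summary} (panel ${i + 1} of ${count})`,
      });
    }
  }
  return panels;
}

const panels = scaffoldPanels(
  [
    { id: "b1", summary: "The courier arrives at dusk", characters: ["kei"], weight: 1 },
    { id: "b2", summary: "The standoff on the bridge", characters: ["kei", "mara"], weight: 3 },
  ],
  8
);
console.log(panels.length); // → 8
```

The interesting research question is exactly what the real version of `Beat` needs to carry so that the mapping to panels stops being this rigid.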
Right now this handles static graphic novels: panels, pages, consistent characters, print-ready output. The next milestone is interactive panels with click-to-animate via image-to-video. After that, animated shorts: the full text-to-image to image-to-video to video-to-video (T2I to I2V to V2V) pipeline.
A note on artists: this doesn't replace them. The structural decomposition I'm studying is exactly the kind of thing working illustrators and storyboard artists already do intuitively. The goal is to understand and formalize that intuition, not to automate it away. If anything, better formal representations of narrative structure make the case for why human editorial judgment matters more, not less: I'm hoping this project lets us show not just that what we value about creativity is uniquely human, but how.
For Engineers
Architecture
Monorepo with four packages. Core handles business logic and domain models: projects, characters, panels, pages, generation trees. Server exposes both REST (Hono) and MCP (stdio) interfaces, so Claude and the web UI talk to the same backend.
The client package is auto-generated from OpenAPI specs for type-safe frontend communication. The UI is React 19 with Zustand state management, Fabric.js for canvas-based page composition, and D3 for generation tree visualization.
The server delegates image generation to the ComfyUI MCP server, which handles the distributed GPU pipeline. Character consistency flows through IP-Adapter embeddings and LoRA associations managed at the Graphix layer. SQLite locally, optional Turso for cloud sync.
Key Decisions
Story-First, Not Image-First
Most AI image tools start with a prompt. Graphix starts with a narrative. Story beats scaffold panel generation, so the visual output serves the story rather than the other way around.
MCP as Primary Interface
The UI exists for visual composition and review. The creative direction happens through Claude. With 201 tools, the agent can manage the full pipeline without touching the GUI.
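In the real server the tools are registered over MCP's stdio transport via the MCP SDK; the dependency-free sketch below only illustrates the underlying registry-and-dispatch pattern. The tool names and return shapes are invented for the example:

```typescript
// Dependency-free sketch of the tool-registry pattern behind an MCP server.
// Tool names and payload shapes here are illustrative, not Graphix's real tools.

type ToolHandler = (args: Record<string, unknown>) => unknown;

const tools = new Map<string, ToolHandler>();

function registerTool(name: string, handler: ToolHandler): void {
  tools.set(name, handler);
}

function callTool(name: string, args: Record<string, unknown>): unknown {
  const handler = tools.get(name);
  if (!handler) throw new Error(`unknown tool: ${name}`);
  return handler(args);
}

// Hypothetical tools spanning the pipeline, project creation through layout.
registerTool("create_project", ({ title }) => ({ id: "p1", title }));
registerTool("add_panel", ({ projectId, prompt }) => ({ projectId, prompt, status: "queued" }));

const project = callTool("create_project", { title: "Dusk Courier" }) as { id: string };
console.log(project.id); // → "p1"
```

The payoff of routing everything through one registry is that the web UI and the agent exercise the same backend code paths, so there is no behavior that exists only behind the GUI.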
Character Consistency Engine
IP-Adapter embeddings and LoRA associations maintain visual identity across panels. The system manages this automatically rather than requiring manual prompt engineering per panel.
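A minimal sketch of what "managed automatically" means: identity metadata rides along with the character record and is folded into every panel prompt, rather than being hand-engineered per panel. The field names (`ipAdapterEmbedding`, `loras`) and the tag syntax are assumptions for illustration:

```typescript
// Sketch: identity references attached to each panel prompt automatically.
// Field names and the <lora:...> tag syntax are illustrative assumptions.

interface CharacterIdentity {
  name: string;
  ipAdapterEmbedding: string; // reference to a stored embedding
  loras: { id: string; strength: number }[];
}

function buildPanelPrompt(basePrompt: string, cast: CharacterIdentity[]): string {
  // Consistency metadata travels with the character, not with the prompt author.
  const identityTags = cast.flatMap((c) =>
    c.loras.map((l) => `<lora:${l.id}:${l.strength}>`)
  );
  return [basePrompt, ...identityTags].join(" ");
}

const kei: CharacterIdentity = {
  name: "Kei",
  ipAdapterEmbedding: "emb_kei_v3",
  loras: [{ id: "kei_style", strength: 0.8 }],
};
console.log(buildPanelPrompt("Kei crosses the bridge at dusk", [kei]));
// → "Kei crosses the bridge at dusk <lora:kei_style:0.8>"
```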
Local-First Storage
SQLite for local development, Turso for optional cloud sync. No vendor lock-in, no mandatory account, full data ownership.
What Was Hard
Visual consistency is the central unsolved problem.
- IP-Adapter influence varies with pose and lighting changes; a character can drift across panels even with the same embedding
- Generation trees grow fast. Better pruning heuristics are needed to keep them navigable without losing useful variants
- The narrative decomposition layer is functional but the mapping from story beats to panel structure is still too rigid. More expressive structural primitives would help
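One candidate pruning heuristic for the generation-tree problem above: keep each node's top-k children by score, but never drop variants the user pinned. The node shape and scoring are assumptions for the sketch, not the actual tree model:

```typescript
// Sketch of a top-k pruning heuristic for a generation tree.
// Node shape (score, pinned) is an illustrative assumption.

interface GenNode {
  id: string;
  score: number;   // e.g. a user rating or automatic quality estimate
  pinned: boolean; // never prune variants the user marked as keepers
  children: GenNode[];
}

function prune(node: GenNode, keepPerNode: number): GenNode {
  const kept = [...node.children]
    .sort((a, b) => Number(b.pinned) - Number(a.pinned) || b.score - a.score)
    .slice(0, keepPerNode)
    .map((child) => prune(child, keepPerNode));
  return { ...node, children: kept };
}

function countNodes(node: GenNode): number {
  return 1 + node.children.reduce((n, c) => n + countNodes(c), 0);
}

const root: GenNode = {
  id: "root", score: 0, pinned: false,
  children: [
    { id: "a", score: 0.9, pinned: false, children: [] },
    { id: "b", score: 0.2, pinned: true, children: [] },
    { id: "c", score: 0.5, pinned: false, children: [] },
  ],
};

console.log(countNodes(prune(root, 2))); // → 3 (root + pinned "b" + top-scored "a")
```

The open question is what replaces `score`: a fixed-k cutoff discards exactly the kind of low-scoring-but-structurally-different variant that makes branching useful in the first place.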
Stack