- JSON chart-spec with versioned schema
- Spec → Plotly figure renderer (server-side)
- Spec → Jupyter notebook exporter
PlotForge — plain-language plots, refined in turns.
A plotting interface aimed at students, educators, and lightweight analysts who would rather describe a chart than write the Matplotlib for it. PlotForge turns a natural-language request into a working plot, then lets the user refine — "log scale," "colour by cohort," "swap to a step line" — without losing the data or the previous chart.
What problem this solves.
The barrier to good exploratory plotting is rarely the data; it is the tooling. Matplotlib is powerful and unforgiving; spreadsheet charts are forgiving but hard to reproduce. Students and analysts who want to ask a chart a question end up either fighting an API or producing slides that nobody can re-render.
Group 04 chose the middle of that gap: a tool where the request is in English, the output is a real Plotly figure with a reproducible spec, and the refinement loop is a conversation, not a documentation hunt.
The system, end to end.
PlotForge runs each request through a function-calling LLM whose tools correspond to plot operations: select columns, apply transforms, choose a chart type, set scales, group, colour, annotate. The model emits a chart spec — JSON, not code — that the renderer turns into a Plotly figure. The spec is the contract: every refinement turn modifies the spec, not the rendered output, which means undo and re-export are both trivial.
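A minimal sketch of how tool calls could edit a spec rather than a figure. The spec fields (`chart`, `scales`, `color`), the tool names, and the `{"name", "arguments"}` call shape are illustrative assumptions, not PlotForge's actual schema or tool set.

```python
import copy
import json

# Hypothetical spec shape; field names are assumptions for illustration.
BASE_SPEC = {
    "schema_version": "1.0",
    "chart": "line",
    "x": "week",
    "y": "score",
    "color": None,
    "scales": {"x": "linear", "y": "linear"},
}


def set_scale(spec, axis, kind):
    """Return a copy of the spec with one axis scale changed."""
    if axis not in ("x", "y"):
        raise ValueError(f"unknown axis: {axis}")
    if kind not in ("linear", "log"):
        raise ValueError(f"unknown scale: {kind}")
    new_spec = copy.deepcopy(spec)
    new_spec["scales"][axis] = kind
    return new_spec


def set_color(spec, column):
    """Return a copy of the spec that colours marks by the given column."""
    new_spec = copy.deepcopy(spec)
    new_spec["color"] = column
    return new_spec


# Registry mapping tool names (as the model would emit them) to functions.
TOOLS = {"set_scale": set_scale, "set_color": set_color}


def apply_tool_call(spec, call):
    """Apply one tool call of the assumed form {"name": ..., "arguments": {...}}."""
    tool = TOOLS[call["name"]]
    return tool(spec, **call["arguments"])


# Two refinement turns: "log scale", then "colour by cohort".
calls = [
    {"name": "set_scale", "arguments": {"axis": "y", "kind": "log"}},
    {"name": "set_color", "arguments": {"column": "cohort"}},
]
spec = BASE_SPEC
for call in calls:
    spec = apply_tool_call(spec, call)

print(json.dumps(spec, indent=2))
```

Because each tool returns a new spec and leaves its input untouched, earlier versions stay available for undo or re-export.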
Two design choices keep the tool honest. First, the LLM never gets the full dataset, only the schema, a few sample rows, and summary statistics. That keeps prompts cheap and limits the model's room to invent values that are not in the data. Second, every spec round-trips to a downloadable Python notebook, so users can take the chart out of PlotForge and into their own analysis without lock-in.
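As a rough illustration of the first choice, the sketch below builds a compact prompt context from a CSV using only the standard library. The function name `build_prompt_context`, the output keys, and the sample data are assumptions made for this example; the project itself lists DuckDB and Pandas for ingest and transforms.

```python
import csv
import io
import statistics


def build_prompt_context(csv_text, sample_size=3):
    """Summarise a CSV as schema, a few sample rows, and per-column stats."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    context = {
        "row_count": len(rows),
        "columns": [],
        "sample_rows": rows[:sample_size],  # only a handful of rows leave the data layer
    }
    for name in columns:
        values = [row[name] for row in rows if row[name]]
        try:
            numbers = [float(value) for value in values]
        except ValueError:
            numbers = None  # at least one value is not numeric
        if numbers:
            summary = {
                "name": name,
                "type": "number",
                "min": min(numbers),
                "max": max(numbers),
                "mean": statistics.fmean(numbers),
            }
        else:
            summary = {"name": name, "type": "string", "distinct": len(set(values))}
        context["columns"].append(summary)
    return context


# Made-up data, for illustration only.
csv_text = "week,cohort,score\n1,A,10\n2,A,20\n3,B,30\n4,B,40\n"
context = build_prompt_context(csv_text)
```

The resulting dictionary grows with the number of columns, not the number of rows, which is what keeps the prompt small.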
What it's built on.
| Layer | Choice |
|---|---|
| Spec language | JSON chart-spec with versioned schema |
| LLM tool layer | Function-calling tools mirror spec operations |
| Data backbone | DuckDB for in-browser CSV / Parquet ingest |
| Interaction | Streamlit shell hosts the chat + chart panes |
- Function-calling tools mirror spec operations
- Prompt context: schema + sample rows + summary stats
- Per-turn diff against the previous spec
- DuckDB for in-browser CSV / Parquet ingest
- Pandas for transform execution
- Type-coerced columns with explicit cast warnings
- Streamlit shell hosts the chat + chart panes
- Per-message replay so any turn can be re-rendered
- Notebook download surfaces the full spec history
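The per-turn diff listed above could look something like the following sketch, which walks two nested specs and reports what was added, removed, or changed. The function name `diff_specs` and the tuple format are assumptions for illustration, as are the spec fields.

```python
def diff_specs(old, new, prefix=""):
    """Return (kind, path, old_value, new_value) tuples for two nested dicts."""
    changes = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        if key not in old:
            changes.append(("added", path, None, new[key]))
        elif key not in new:
            changes.append(("removed", path, old[key], None))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Recurse so nested settings are reported by dotted path.
            changes.extend(diff_specs(old[key], new[key], prefix=path + "."))
        elif old[key] != new[key]:
            changes.append(("changed", path, old[key], new[key]))
    return changes


# Hypothetical specs before and after one refinement turn.
previous = {"chart": "line", "color": None, "scales": {"x": "linear", "y": "linear"}}
current = {
    "chart": "line",
    "color": "cohort",
    "scales": {"x": "linear", "y": "log"},
    "line_shape": "hv",
}

changes = diff_specs(previous, current)
for change in changes:
    print(change)
```

A diff like this is small enough to show in the chat pane next to each turn, so the user can see what a request actually changed.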
What the team shipped.
What sets this capstone apart.
Refine the contract, not the canvas.
Every chart is a JSON spec underneath. Refinement turns edit the spec; the renderer is downstream. That single boundary buys undo, re-export, and a clean handoff to a notebook for free.
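One way to picture why undo comes cheaply: if every turn commits a full spec, undo is just stepping back through a list. The `SpecHistory` class below is a hypothetical sketch under that assumption, not the project's implementation.

```python
import copy


class SpecHistory:
    """Keeps every committed spec so any turn can be restored or re-rendered."""

    def __init__(self, initial_spec):
        self._versions = [copy.deepcopy(initial_spec)]

    @property
    def current(self):
        # Hand out copies so callers cannot mutate stored versions.
        return copy.deepcopy(self._versions[-1])

    def commit(self, spec):
        """Record the spec produced by a refinement turn."""
        self._versions.append(copy.deepcopy(spec))

    def undo(self):
        """Drop the latest turn, keeping the initial spec as a floor."""
        if len(self._versions) > 1:
            self._versions.pop()
        return self.current

    def at_turn(self, index):
        """Return the spec as it stood after a given turn (0 is the initial spec)."""
        return copy.deepcopy(self._versions[index])


history = SpecHistory({"chart": "line", "scales": {"y": "linear"}})

edited = history.current
edited["scales"]["y"] = "log"
history.commit(edited)
after_edit = history.current

after_undo = history.undo()
```

The renderer only ever needs `history.current`, which is the sense in which it sits downstream of the spec.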
Schema and stats, never the full dataset.
The LLM sees column names, types, a few sample rows, and summary stats, not the full dataset. Prompts stay small, the bill stays small, and the model has far less room to invent a datapoint it never saw.
Every chart exits as a notebook.
When you're done, PlotForge hands you a Jupyter notebook that reproduces the chart end-to-end. No lock-in, no proprietary format, no "export to image" dead-end.
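A hedged sketch of what a spec-to-notebook export might involve, building the notebook JSON (nbformat 4) by hand with the standard library. The function name `spec_to_notebook`, the spec fields, and the choice of `plotly.express.line` in the generated cell are assumptions; this version only handles a line chart and is not the project's exporter.

```python
import json


def spec_to_notebook(spec, csv_path):
    """Build a notebook dict (nbformat 4) whose code cell redraws the chart."""
    code = "\n".join([
        "import pandas as pd",
        "import plotly.express as px",
        "",
        f"spec = {spec!r}",
        f"df = pd.read_csv({csv_path!r})",
        "fig = px.line(",
        "    df,",
        '    x=spec["x"],',
        '    y=spec["y"],',
        '    color=spec["color"],',
        '    log_y=spec["scales"]["y"] == "log",',
        ")",
        "fig.show()",
    ])
    return {
        "cells": [
            {
                "cell_type": "markdown",
                "metadata": {},
                "source": "Chart reproduced from an exported spec.",
            },
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": code,
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# Hypothetical spec and file name, for illustration only.
spec = {"chart": "line", "x": "week", "y": "score", "color": "cohort",
        "scales": {"x": "linear", "y": "log"}}
notebook = spec_to_notebook(spec, "data.csv")
notebook_json = json.dumps(notebook, indent=1)  # text to save as an .ipynb file
```

Embedding the spec in the generated cell keeps the notebook self-describing: the reader can see the exact settings that produced the chart and change them by hand.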
How this project landed.
Two-person projects are often the hardest to scope. The proposal began with the all-too-familiar "a better chart tool" framing, a brief broad enough to dissolve under contact with a deadline. The first review forced the team to name the audience: students and lightweight analysts, not data scientists.
Once the audience was named, the spec-first architecture fell out of it. The team defended the JSON spec against a dozen reasonable shortcuts and ended up with a tool whose seams are visible and reusable. The capstone shipped on time, with the notebook-export path doing exactly what it promised.