Group 04 · Capstone · Fall 2025

PlotForge — plain-language plots, refined in turns.

A plotting interface aimed at students, educators, and lightweight analysts who would rather describe a chart than write the Matplotlib for it. PlotForge turns a natural-language request into a working plot, then lets the user refine — "log scale," "colour by cohort," "swap to a step line" — without losing the data or the previous chart.

Domain Tooling · data

Stack PandasPlotlyStreamlitOpenAI function-callingDuckDB

Demoed · Fall 2025

IWhat we built

What problem this solves.

The barrier to good exploratory plotting is rarely the data — it is the tooling. Matplotlib is powerful and unforgiving; spreadsheet charts forgive nothing about reproducibility. Students and analysts who want to ask a chart a question end up either fighting an API or producing slides that nobody can re-render.

Group 04 chose the middle of that gap: a tool where the request is in English, the output is a real Plotly figure with a reproducible spec, and the refinement loop is a conversation, not a documentation hunt.

IIHow it works

The system, end to end.

PlotForge runs each request through a function-calling LLM whose tools correspond to plot operations: select columns, apply transforms, choose a chart type, set scales, group, colour, annotate. The model emits a chart spec — JSON, not code — that the renderer turns into a Plotly figure. The spec is the contract: every refinement turn modifies the spec, not the rendered output, which means undo and re-export are both trivial.

Two design choices keep the tool honest. First, the LLM never gets the raw dataset — only the schema, sample rows, and summary statistics. That keeps prompts cheap and prevents the model from hallucinating values that are not in the data. Second, every spec round-trips to a downloadable Python notebook, so users can take the chart out of PlotForge and into their own analysis without lock-in.

Pipeline · PlotForge

Ingest

Plot intent

user prompt

Transform

Spec language

JSON chart-spec, versioned

Model

LLM tool layer

function calls; schema + samples

Storage

Data backbone

DuckDB ingest, Pandas transforms

Transform

Renderer

spec → Plotly figure

Surface

Interaction

Streamlit chat + chart panes

Ingest Transform Model Storage Surface Feedback

IIIThe stack

What it's built on.

Layer · tool / library
Spec language	JSON chart-spec with versioned schema
LLM tool layer	Function-calling tools mirror spec operations
Data backbone	DuckDB for in-browser CSV / Parquet ingest
Interaction	Streamlit shell hosts the chat + chart panes

Spec language

JSON chart-spec with versioned schema
Spec → Plotly figure renderer (server-side)
Spec → Jupyter notebook exporter

LLM tool layer

Function-calling tools mirror spec operations
Prompt context: schema + sample rows + summary stats
Per-turn diff against the previous spec

Data backbone

DuckDB for in-browser CSV / Parquet ingest
Pandas for transform execution
Type-coerced columns with explicit cast warnings

Interaction

Streamlit shell hosts the chat + chart panes
Per-message replay so any turn can be re-rendered
Notebook download surfaces the full spec history

IVDeliverables

What the team shipped.

Source repository GitHub · code, tests, README

Demo video Capstone day · screen recording, 4–6 min

Write-up PDF Final brief · methods, evaluation, reflection

Slide deck Capstone presentation · 10 slides

VWhat sets it apart

What sets this capstone apart.

Takeaway 01 · Spec over render

Refine the contract, not the canvas.

Every chart is a JSON spec underneath. Refinement turns edit the spec; the renderer is downstream. That single boundary buys undo, re-export, and a clean handoff to a notebook for free.

Takeaway 02 · No raw data to the model

Schema and stats, never the values.

The LLM sees columns, types, and summary stats — not the rows. Prompts stay small, the bill stays small, and the model can't hallucinate a datapoint it never saw.

Takeaway 03 · Take it home

Every chart exits as a notebook.

When you're done, PlotForge hands you a Jupyter notebook that reproduces the chart end-to-end. No lock-in, no proprietary format, no "export to image" dead-end.

VIIInstructor note

How this project landed.

Two-person teams are often the hardest to scope. The proposal began with the all-too-familiar "a better chart tool" framing — a brief broad enough to dissolve under contact with a deadline. The first review forced the team to name the audience: students and lightweight analysts, not data scientists.

Once the audience was named, the spec-first architecture fell out of it. The team defended the JSON spec against a dozen reasonable shortcuts and ended up with a tool whose seams are visible and reusable. The capstone shipped on time, with the notebook-export path doing exactly what it promised.

PlotForge — plain-language plots, refined in turns.

Refine the contract, not the canvas.

Schema and stats, never the values.

Every chart exits as a notebook.

Related work in the cohort.

CodeCaster

W&M Degree Map