Group 05 · Capstone · Spring 2026

BURT++ — the bug report, translated.

A guided assistant that turns the kind of bug report a non-technical user actually writes ("the page is weird now") into the kind of ticket an engineering team can act on: reproduction steps, environment, a severity hint, and the obvious next question, asked back before the ticket is filed.

Domain Dev tools · QA
Stack LLM clarification · Jira-shaped output · React widget · Node API · Postgres
Demoed · Spring 2026
I · What we built

What problem this solves.

The cost of a bad bug report is enormous. A title like "it's broken" routes the wrong way, sits in triage, and resurfaces a week later when the user is even angrier. Most of the friction is upstream: the user who hits the bug is rarely the one who can describe it in the words a sprint board expects.

BURT++ steps into that gap. Instead of asking the user for a perfect ticket, it asks the user for the report they would write anyway — then runs a clarification loop that fills in the gaps with the smallest number of follow-up questions it can get away with.

II · How it works

The system, end to end.

Each report goes through a structured clarification cycle. The model reads the user's freeform description, identifies the missing slots — reproduction steps, browser / OS, expected vs actual, frequency — and asks only for what is missing. Questions are scored by information gain, so the user is never asked something the report already implied.
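The "ask only for what is missing" step can be sketched as a threshold check over per-slot confidences. This is a minimal illustration, not the team's implementation: the slot names come from the list above, while the `SlotConfidence` shape, the `missingSlots` helper, and the threshold values are assumptions made for the example.

```typescript
// Slot names taken from the list in the text; the identifiers are hypothetical.
type SlotName = "reproSteps" | "environment" | "expectedVsActual" | "frequency";

// Assumed classifier output: for each slot, a confidence in [0, 1] that the
// freeform report already answers it. The real output shape is not documented.
type SlotConfidence = Record<SlotName, number>;

// Assumed per-slot thresholds (illustrative values, not calibrated ones).
const FILLED_THRESHOLD: Record<SlotName, number> = {
  reproSteps: 0.7,
  environment: 0.6,
  expectedVsActual: 0.6,
  frequency: 0.5,
};

// Return the slots whose confidence falls below their threshold,
// in the order the slots are declared above.
function missingSlots(confidence: SlotConfidence): SlotName[] {
  const slotNames = Object.keys(FILLED_THRESHOLD) as SlotName[];
  return slotNames.filter((slot) => confidence[slot] < FILLED_THRESHOLD[slot]);
}

// Example: a report like "the page is weird now on my iPhone" implies an
// environment but little else.
const example: SlotConfidence = {
  reproSteps: 0.1,
  environment: 0.8,
  expectedVsActual: 0.2,
  frequency: 0.05,
};
const stillMissing = missingSlots(example);
// stillMissing is ["reproSteps", "expectedVsActual", "frequency"]
```

Because the environment is already implied, it never becomes a follow-up question; only the three remaining slots are candidates.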

When the slots are filled, BURT++ emits a Jira-shaped ticket: title, description, repro steps, environment, severity hint. The hint is explicit about uncertainty ("likely P3, escalate if reproducible in production") so the triage engineer gets a starting point, not a foregone conclusion.
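As a rough sketch of the emission step, the five fields named above can be assembled from the filled slots as follows. The `Ticket` and `FilledSlots` interfaces, the `emitTicket` helper, the 120-character title limit, and the sample report are all hypothetical; the source specifies only which fields the ticket carries.

```typescript
// Hypothetical ticket shape using the five fields the text names.
interface Ticket {
  title: string;
  description: string;
  reproSteps: string[];
  environment: string;
  severityHint: string;
}

// Assumed shape of the slots once the clarification loop has finished.
interface FilledSlots {
  summary: string;
  reproSteps: string[];
  environment: string;
  expected: string;
  actual: string;
  frequency: string;
}

// Build the ticket; optional reporter notes are copied through unchanged.
function emitTicket(slots: FilledSlots, severityHint: string, notes?: string): Ticket {
  const description = [
    `Expected: ${slots.expected}`,
    `Actual: ${slots.actual}`,
    `Frequency: ${slots.frequency}`,
    ...(notes ? [`Reporter notes (verbatim): ${notes}`] : []),
  ].join("\n");
  return {
    title: slots.summary.trim().slice(0, 120), // assumed title length limit
    description,
    reproSteps: slots.reproSteps,
    environment: slots.environment,
    severityHint,
  };
}

// Illustrative input only; not data from the project.
const ticket = emitTicket(
  {
    summary: "Checkout page layout breaks after login",
    reproSteps: ["Log in", "Open the cart", "Click Checkout"],
    environment: "Safari on iOS",
    expected: "Payment form is visible",
    actual: "Payment form is hidden behind the footer",
    frequency: "Every time",
  },
  "likely P3, escalate if reproducible in production",
  "the page is weird now",
);
```

Serializing `ticket` with `JSON.stringify` gives a JSON object that a tracker-specific adapter could then map onto its own field names.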

Pipeline · BURT++
  • Ingest · Bug draft · reporter writes
  • Model · Slot detection · structured-output classifier
  • Transform · Gain-ranked Qs · information-gain ranking
  • Surface · Follow-up · ≤2 follow-up turns
  • Transform · Ticket emission · Jira-shaped JSON
  • Surface · Embed surface · Jira / Linear / GitHub Issues
Legend · Ingest, Transform, Model, Surface
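The follow-up stage of the pipeline, with its limit of two turns, might look like the loop below. It is a sketch under assumptions: `runClarification`, the synchronous `ask` callback, and the `unresolved` field are invented for illustration, and the ranked list of missing slots is taken as given.

```typescript
type Slot = "reproSteps" | "environment" | "expectedVsActual" | "frequency";

// The pipeline caps the conversation at two follow-up turns.
const MAX_FOLLOW_UP_TURNS = 2;

interface ClarificationResult {
  slots: Partial<Record<Slot, string>>;
  turnsUsed: number;
  unresolved: Slot[]; // slots still empty when the system commits
}

// Ask for missing slots in ranked order until the turn cap is reached.
// `ask` is a hypothetical stand-in for one chat round-trip with the reporter.
function runClarification(
  filled: Partial<Record<Slot, string>>,
  rankedMissing: Slot[],
  ask: (slot: Slot) => string,
): ClarificationResult {
  const slots: Partial<Record<Slot, string>> = { ...filled };
  let turnsUsed = 0;
  for (const slot of rankedMissing) {
    if (turnsUsed >= MAX_FOLLOW_UP_TURNS) break;
    if (slots[slot] !== undefined) continue; // already answered
    slots[slot] = ask(slot);
    turnsUsed += 1;
  }
  const unresolved = rankedMissing.filter((slot) => slots[slot] === undefined);
  return { slots, turnsUsed, unresolved };
}
```

With three missing slots, the loop asks about the first two and reports the third as unresolved, so the ticket can still be emitted with the gap made visible rather than hidden.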
III · The stack

What it's built on.

Layer · tool / library
Slot detection · Structured-output LLM call against the ticket schema
Follow-up generation · Information-gain ranking over candidate questions
Ticket emission · Jira-shaped JSON with title / desc / repro / env / severity
Embed surface · Drop-in React widget that hosts the chat in-app
Slot detection
  • Structured-output LLM call against the ticket schema
  • Per-slot "is-filled" classification with calibrated thresholds
  • Optional free-form notes preserved verbatim into the ticket
Follow-up generation
  • Information-gain ranking over candidate questions
  • Cap of two follow-up turns before the system commits
  • Per-question reasoning kept in the audit log, not surfaced
Ticket emission
  • Jira-shaped JSON with title / desc / repro / env / severity
  • Webhook adapters for Jira, Linear, and GitHub Issues
  • Severity hint phrased as a confidence-tagged statement
Embed surface
  • Drop-in React widget that hosts the chat in-app
  • Token-scoped per-tenant; no shared LLM context
  • Telemetry surface for triage-engineer feedback signals
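One of the webhook adapters listed above could be sketched as a pure mapping from the ticket to a GitHub Issues payload. GitHub's create-issue endpoint accepts `title`, `body`, and `labels`; everything else here is an assumption, including the `Ticket` shape, the `toGitHubIssue` name, the Markdown layout, and the label names. The HTTP call itself, and the Jira and Linear adapters, are left out.

```typescript
// Hypothetical ticket shape using the five fields the stack table names.
interface Ticket {
  title: string;
  description: string;
  reproSteps: string[];
  environment: string;
  severityHint: string;
}

// Subset of the request body for GitHub's "create an issue" endpoint.
interface GitHubIssuePayload {
  title: string;
  body: string;
  labels: string[];
}

// Map a ticket to an issue payload; the section layout is an assumption.
function toGitHubIssue(ticket: Ticket): GitHubIssuePayload {
  const steps = ticket.reproSteps
    .map((step, index) => `${index + 1}. ${step}`)
    .join("\n");
  const body = [
    ticket.description,
    "",
    "### Steps to reproduce",
    steps,
    "",
    "### Environment",
    ticket.environment,
    "",
    "### Severity hint",
    ticket.severityHint,
  ].join("\n");
  // Label names are hypothetical placeholders.
  return { title: ticket.title, body, labels: ["bug", "burt-intake"] };
}
```

Keeping the adapter free of network code makes the mapping easy to test on its own; the caller would send the returned object as the JSON body of the request.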
IV · Deliverables

What the team shipped.

Source repository · GitHub · code, tests, README
Demo video · Capstone day · screen recording, 4–6 min
Write-up PDF · Final brief · methods, evaluation, reflection
Slide deck · Capstone presentation · 10 slides
V · What sets it apart

What sets this capstone apart.

Takeaway 01 · Slots, not forms

Detect what's missing, ask for that.

BURT++ doesn't show a 12-field form. It reads the user's description, identifies the slots that are actually empty, and asks for those — and only those. The user feels heard; the engineer gets a complete ticket.

Takeaway 02 · Gain-ranked questions

One good follow-up beats five bad ones.

Each follow-up is scored by expected information gain. The system asks for severity-determining context first and never asks for stylistic details. The whole interaction is at most two follow-up turns, not ten.
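Expected information gain has a standard definition: the entropy of the current belief minus the expected entropy after hearing the answer. The sketch below applies that definition to a toy case. How BURT++ actually estimates these quantities is not described in the source, so the two hypotheses, the likelihood tables, and all identifiers are assumptions.

```typescript
// Shannon entropy in bits; zero-probability terms contribute nothing.
function entropy(dist: number[]): number {
  return dist.reduce((h, p) => (p > 0 ? h - p * Math.log2(p) : h), 0);
}

interface CandidateQuestion {
  text: string;
  // likelihood[a][h] = P(answer a | hypothesis h); each column sums to 1.
  likelihood: number[][];
}

// Expected reduction in entropy over the hypotheses from asking the question.
function expectedGain(prior: number[], question: CandidateQuestion): number {
  let expectedPosteriorEntropy = 0;
  for (const answerRow of question.likelihood) {
    const joint = answerRow.map((l, h) => l * (prior[h] ?? 0));
    const pAnswer = joint.reduce((sum, j) => sum + j, 0);
    if (pAnswer === 0) continue; // this answer can never occur
    const posterior = joint.map((j) => j / pAnswer);
    expectedPosteriorEntropy += pAnswer * entropy(posterior);
  }
  return entropy(prior) - expectedPosteriorEntropy;
}

// Sort a copy of the candidates, highest expected gain first.
function rankByGain(prior: number[], questions: CandidateQuestion[]): CandidateQuestion[] {
  return [...questions].sort((a, b) => expectedGain(prior, b) - expectedGain(prior, a));
}

// Two hypotheses, [blocks the user, cosmetic only], equally likely a priori.
const prior = [0.5, 0.5];
const candidates: CandidateQuestion[] = [
  { text: "What colour is the button?", likelihood: [[0.5, 0.5], [0.5, 0.5]] },
  { text: "Can you still complete the purchase?", likelihood: [[0.1, 0.9], [0.9, 0.1]] },
];
const ranked = rankByGain(prior, candidates);
```

The stylistic question has zero expected gain because its answer is equally likely under both hypotheses, so it sorts last; the purchase question gains roughly half a bit and sorts first.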

Takeaway 03 · Hint with calibration

Severity, with the uncertainty attached.

The output ticket includes a severity hint phrased with its own confidence — "likely P3, escalate if reproducible in production." Triage starts from a sane prior instead of a guess.
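A confidence-tagged hint of this kind can be produced from a probability over severity levels. The following is a sketch, assuming the probabilities arrive already calibrated from an upstream step; the wording bands in `confidenceWord`, the `severityHint` helper, and the sample numbers are illustrative choices, since the source shows only the "likely" phrasing.

```typescript
type Severity = "P1" | "P2" | "P3" | "P4";

// Assumed wording bands for the top severity's probability.
function confidenceWord(probability: number): string {
  if (probability >= 0.8) return "very likely";
  if (probability >= 0.5) return "likely";
  return "possibly";
}

// Phrase the most probable severity with its confidence, plus an optional
// escalation condition supplied by the caller.
function severityHint(probs: Record<Severity, number>, escalateIf?: string): string {
  const order: Severity[] = ["P1", "P2", "P3", "P4"];
  let best: Severity = "P1";
  for (const level of order) {
    if (probs[level] > probs[best]) best = level; // ties keep the more severe level
  }
  const base = `${confidenceWord(probs[best])} ${best}`;
  return escalateIf ? `${base}, escalate if ${escalateIf}` : base;
}

// Illustrative probabilities only.
const hint = severityHint(
  { P1: 0.05, P2: 0.15, P3: 0.6, P4: 0.2 },
  "reproducible in production",
);
// hint is "likely P3, escalate if reproducible in production"
```

Resolving ties toward the more severe level is a deliberate bias in this sketch: an over-escalated ticket costs a triage glance, while an under-escalated one can sit unnoticed.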

VII · Instructor note

How this project landed.

Capstones are scope traps. The first proposal had BURT++ doing slot detection, ticket emission, telemetry, and a separate auto-deduplication system on top: four projects in one. The first review cut the dedup work to a future-work footnote and made the team commit to the conversational core.

What shipped is tight: a working clarification loop, a clean Jira-shaped output, and a defensible answer to the "why two turns and not five" question that came up in critique. The capstone reads as one focused product, not four sketches.