What Is a Software Regression? Causes, Detection & Prevention
A software regression is something that used to work and broke after a change. Here is what causes regressions, how to detect them against a known-good baseline, and why AI-generated code is making them more common.

Definition
A software regression is a defect where functionality that previously worked stops working after a change — a code edit, dependency upgrade, configuration shift, or merge. It is degradation caused by change, not new untested behavior. Because a known-good baseline exists, the offending change can usually be found by bisecting.
The canonical definition comes from the testing-standards body. ISTQB defines a regression as 'a degradation in the quality of a component or system due to a change.' Read that sentence carefully: the trigger is a change, and the failure is degradation of something that already existed. That is the whole distinction. A net-new bug is behavior that never worked; a regression is behavior that worked yesterday and does not work today, which means there is a prior version to compare against.
That baseline is not a footnote — it is the single most useful property of a regression. It is why git bisect exists. If the feature passed on commit A and fails on commit Z, the breaking change is somewhere between them, and you can binary-search the range to find it. A brand-new bug gives you no such anchor. So the first question to ask of any defect is not 'where did it throw' but 'did this ever work' — the answer routes you to two completely different debugging strategies.
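The binary search described above can be sketched in a few lines. This is an illustration of the idea behind git bisect, not its implementation: the commit names and the `is_good` probe are hypothetical, and the sketch assumes the first commit is known good, the last is known bad, and the behavior stays broken once it breaks.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_good() is False.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    list has at least two entries ordered oldest to newest.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid  # breakage happened after mid
        else:
            hi = mid  # breakage happened at or before mid
    return commits[hi]


# Hypothetical history: 16 commits, behavior breaks at c11.
history = [f"c{i}" for i in range(16)]
print(first_bad_commit(history, lambda c: int(c[1:]) < 11))  # prints c11
```

Each probe halves the remaining range, so 16 commits need at most four checks. In practice the probe would be a build plus the failing test run against that commit.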
Why it matters
Regressions matter because they are the failure mode that scales with how fast you ship. The production-facing version even has a metric: DORA tracks change fail rate — the share of deployments that cause a failure in production requiring a hotfix, rollback, or patch — as one of its four core delivery measures. That makes the regression a benchmarkable number, not a vibe. If your change fail rate is climbing, your safeguards are losing the race against your deploy frequency.
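As a rough sketch of the arithmetic, change fail rate is the number of deployments that needed remediation divided by all deployments in the window. The `Deployment` record and its field names are assumptions made for illustration; how you decide that a deployment "failed" depends on your own incident and release data.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    id: str
    needed_remediation: bool  # True if a hotfix, rollback, or patch followed


def change_fail_rate(deployments: list[Deployment]) -> float:
    """Share of deployments in the window that required remediation."""
    if not deployments:
        raise ValueError("no deployments in the window")
    failed = sum(1 for d in deployments if d.needed_remediation)
    return failed / len(deployments)


# Hypothetical window: 40 deployments, 3 of which needed a rollback or hotfix.
window = [Deployment(f"d{i}", needed_remediation=i in (4, 17, 31)) for i in range(40)]
print(f"{change_fail_rate(window):.1%}")  # prints 7.5%
```

Tracking this per week or per release makes a climbing rate visible before it shows up as a pattern of incidents.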
And the pool of code they emerge from is enormous. CISQ put the cost of poor software quality in the US at $2.41 trillion in its 2022 report, including roughly $1.52 trillion of accumulated technical debt. Technical debt is precisely the brittle, under-tested code from which regressions repeatedly emerge whenever someone touches it.
The harder problem is that detection is a needle-in-a-haystack search: in Google's 'Taming Google-Scale Continuous Testing' study, only 1.23% of test executions actually caught a real breakage, while about 84% of observed pass-to-fail transitions were flaky tests rather than genuine regressions. Most of the failure signal your suite produces is noise, which is why a stable baseline and reliable tests matter more than raw test count.
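One common way to separate flaky failures from candidate regressions is to rerun the failing test on the same commit. The sketch below assumes that approach; it is not taken from the Google study, and `run_test` is a hypothetical callable that returns True when the test passes.

```python
from typing import Callable


def classify_failure(run_test: Callable[[], bool], reruns: int = 5) -> str:
    """Rerun a failed test on the same commit and label the failure.

    Returns "flaky" if any rerun passes, otherwise "consistent".
    """
    for _ in range(reruns):
        if run_test():
            return "flaky"  # same code, different outcome
    return "consistent"  # fails every time: worth bisecting
```

A "consistent" result is only a candidate regression, not proof of one: a test that fails rarely can still survive a handful of reruns, so the rerun count trades CI time against misclassification.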
How this shows up in a real BugMojo bug report
Here is the honest limit of a stack trace on a regression, and where BugMojo fits. A trace tells you where code failed and the path that got there. But a regression's root cause is the change that altered behavior, and that change is frequently nowhere near the line that threw — a config flag, an upgraded dependency, a different API response shape. The failing code is not necessarily the code that changed. So a clean trace points you at a symptom while the cause sits three files and one deploy away.
In a BugMojo report the trace does not arrive alone. The browser extension captures the failure with its surrounding state — an rrweb session replay, the console output, and the network request that fed the bad data — so the frame at Pricing.tsx:142 sits next to the exact GET /api/plan response whose new tier field your code never handled. That is the difference between 'something broke near line 142' and 'the plan endpoint started returning a shape this branch did not account for.' The state is what tells you a change caused it.
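To make the "new shape" idea concrete, here is a minimal sketch that compares a captured response against a known-good one and lists what changed. The payloads and the helpers `shape` and `shape_diff` are hypothetical illustrations; this is not BugMojo's API or report format, and only the `tier` field comes from the example above.

```python
def shape(value, prefix=""):
    """Flatten a JSON-like value into a {dotted.path: type name} map.

    Only dicts are walked; lists and scalars are treated as leaves.
    """
    if isinstance(value, dict):
        out = {}
        for key, child in value.items():
            out.update(shape(child, f"{prefix}.{key}" if prefix else key))
        return out
    return {prefix: type(value).__name__}


def shape_diff(baseline, current):
    """Report fields added, removed, or retyped relative to the baseline."""
    old, new = shape(baseline), shape(current)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "retyped": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical payloads for GET /api/plan before and after the change.
known_good = {"plan": {"name": "Pro", "price": 29}}
captured = {"plan": {"name": "Pro", "price": 29, "tier": "team"}}
print(shape_diff(known_good, captured)["added"])  # prints ['plan.tier']
```

A diff like this points at the change in the data rather than at the line that threw, which is the distinction the captured network response makes visible.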
Then BugMojo hands that whole bundle to an AI agent (Claude Code, Cursor) over an MCP server. The agent reads the replay, the console, and the network response together with your repository, so it can correlate the failing state with the diff that introduced it instead of guessing from a trace. That is the gap BugMojo targets: production error monitors attach a trace with breadcrumbs, whereas the MCP layer lets an agent read the captured session behind the regression.
| Capability | BugMojo | Prod error monitor (Sentry/BugSnag) |
|---|---|---|
| Stack trace attached to the report | ✓ | ✓ |
| rrweb session replay of the regression | ✓ | ✗ |
| Console + network captured with the failure | ✓ | Breadcrumbs |
| Captured bug bundle handed to an AI agent over MCP | ✓ | ✗ |
| Aggregate uncaught exceptions across a production fleet | ✗ | ✓ |
| Release health and change-fail-rate trends at scale | ✗ | ✓ |
BugMojo captures the failing session with its rrweb replay, console, and network — then hands the whole bundle to Claude Code or Cursor over MCP, so your agent reads the state behind the regression and can find the change that broke it.
Sources
- Regression — "A degradation in the quality of a component or system due to a change" (ISTQB Glossary) — ISTQB (International Software Testing Qualifications Board) (2025)
- Announcing the 2024 DORA report — AI adoption vs. delivery stability and throughput — Google Cloud / DORA (2024-10)
- Accelerate State of DevOps Report 2024 — change fail rate as a core delivery metric — DORA (DevOps Research and Assessment) (2024)
- AI Copilot Code Quality: 2025 Research — code churn and copy/paste across 211M changed lines — GitClear (2025-02)
- The Cost of Poor Software Quality in the US: A 2022 Report — $2.41T total, $1.52T technical debt — CISQ (Consortium for Information & Software Quality) (2022-12)
- Taming Google-Scale Continuous Testing — 1.23% of test runs catch a real breakage; ~84% of pass-to-fail transitions are flaky — Google Research / IEEE ICSE-SEIP (2017)
- Introducing the Model Context Protocol — open standard for connecting AI agents to tools and data — Anthropic (2024-11)

