The GitHub Merge Queue Incident: How a Flawed Feature Flag Caused Silent Code Deletion

Introduction

On April 23rd, 2026, at 16:05 UTC, a subtle but devastating bug slipped into GitHub's merge queue. For over three hours, developers worldwide approved pull requests, saw green checks, and had no idea that code was vanishing from their main branches. This wasn't a bad commit or a rogue developer; it was GitHub itself quietly deleting lines of code. The official incident report likely downplayed the scale, but the reality was far worse. Let's dive into what happened, why it was so insidious, and what it means for how we trust our development tools.

The Day GitHub's Merge Queue Betrayed Trust

At first glance, everything seemed normal. Engineers reviewed pull requests with clean diffs, clicked merge, and saw the familiar green checkmarks. No warnings, no failed checks, no outage banners. Yet behind the scenes, a silent catastrophe was unfolding. A PR showing a modest +29 / -34 diff would merge into a commit of +245 / -1,137 lines. Thousands of lines of previously shipped code—approved, reviewed, and assumed safe—simply disappeared. Every subsequent merge built on this corrupted history.

GitHub's user interface lied without hesitation. The status page remained green. The platform gave no sign of trouble, leaving teams to discover the damage only when they noticed critical code missing from their main branch. It was a perfect storm: invisible deletion paired with a UI that looked completely trustworthy.

Under the Hood: The Merge Base Computation Failure

GitHub's merge queue operates by creating temporary branches for each queued pull request. Normally, these branches start from the tip of main plus the PR's changes. Continuous integration runs against this temporary branch, and if it passes, the branch is merged to main.

On April 23rd, the queue began building these temporary branches from an incorrect starting point. Instead of branching from the current tip of main, it branched from the point where the feature branch had originally diverged from main. For a feature branch that was 50 commits behind main, the temporary branch was built on that stale base and was therefore missing those 50 newer commits. When the result was pushed to main, it effectively removed the 50 commits that other engineers had added in the meantime. The temporary branch was internally consistent, so CI passed without issue, but main suffered a massive rollback.
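
The effect described above can be reproduced with a toy model. The sketch below is not GitHub's implementation: it treats a branch as a plain list of change labels, and the function name build_temp_branch and all of the values are made up for illustration. It only shows why choosing the fork point as the base silently drops everything main gained afterwards.

```python
def build_temp_branch(main, fork_index, pr_changes, use_fork_point):
    """Return the changes contained in a queue's temporary branch.

    main           -- ordered list of change labels currently on main
    fork_index     -- number of main commits that existed when the PR forked
    pr_changes     -- change labels contributed by the pull request
    use_fork_point -- True simulates the buggy base selection
    """
    base = main[:fork_index] if use_fork_point else main
    return list(base) + list(pr_changes)


# Hypothetical repository: 60 changes on main, PR forked 50 commits ago.
main = [f"c{i}" for i in range(60)]
fork_index = 10
pr_changes = ["pr-change"]

correct = build_temp_branch(main, fork_index, pr_changes, use_fork_point=False)
buggy = build_temp_branch(main, fork_index, pr_changes, use_fork_point=True)

# Changes that disappear from main if the buggy branch replaces it.
lost = [c for c in main if c not in buggy]

print(len(correct), len(buggy), len(lost))
```

In this model the buggy branch still contains the PR's change and is self-consistent, which is why a CI run against it has nothing to complain about. The loss only shows up when the result is compared with what main held before the merge.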

The root cause? A new code path designed to adjust merge base computation was intended to be gated behind a feature flag for an unreleased feature. The gating was incomplete. The new behavior leaked into production and applied to all squash merge groups, causing the miscalculation.
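
To make "incomplete gating" concrete, here is a minimal sketch of how one unguarded branch can leak flagged behavior. Everything in it is hypothetical: the MergeGroup type, the flag name new_merge_base, and both functions are invented for illustration, and GitHub's actual code is not public in the source material.

```python
from dataclasses import dataclass


@dataclass
class MergeGroup:
    method: str      # "merge", "squash", or "rebase"
    main_tip: str    # commit id at the current tip of main
    fork_point: str  # commit id where the feature branch diverged


def new_base_enabled(flags):
    # Hypothetical flag lookup; defaults to off when the flag is absent.
    return flags.get("new_merge_base", False)


def base_incompletely_gated(group, flags):
    # One path reaches the new behavior without consulting the flag.
    if group.method == "squash":
        return group.fork_point  # leaked: no flag check on this path
    if new_base_enabled(flags):
        return group.fork_point
    return group.main_tip


def base_fully_gated(group, flags):
    # Every path to the new behavior goes through a single flag check.
    if new_base_enabled(flags):
        return group.fork_point
    return group.main_tip


group = MergeGroup(method="squash", main_tip="tip", fork_point="fork")
print(base_incompletely_gated(group, {}))  # flag off, new behavior still runs
print(base_fully_gated(group, {}))         # flag off, old behavior preserved
```

One way to catch this class of bug is to exercise every merge method with the flag off and assert that the result matches the old behavior.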

Why This Bug Was Particularly Dangerous

Three factors amplified the severity of this incident:

The PR UI lied: Developers approved a +29/-34 change but got a commit worth +245/-1,137. The most fundamental contract of a code review system—that what you review is what merges—was broken.
Silent operation: No merge conflicts, no failed checks, no banners. Teams had no immediate alert. Only manual inspection of main revealed the missing code.
Scaling with activity: In busier repositories, feature branches fall further behind main, so each bad merge removed more commits. Active projects suffered disproportionately.

Lessons Learned: Trust but Verify

This incident underscores a critical principle: even the most trusted tools can fail silently. GitHub's merge queue is a powerful productivity booster, but its safety relies on correct merge base computation. When that fails, the consequences cascade invisibly.

What can teams do? First, implement post-merge validation scripts that compare expected vs. actual commit diffs. Second, treat merge queue merges as suspicious until proven clean—don't assume the UI is accurate. Third, advocate for better testing of feature flags in production-like environments. Finally, maintain a culture of skepticism: when code goes missing, check the tools, not just the commits.
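
A post-merge check of the first kind could look roughly like the sketch below. It assumes you have already captured two pieces of text in the format produced by git diff --numstat (added, deleted, and path separated by tabs): one for the pull request as reviewed and one for the commit that landed on main. The function names, the tolerance parameter, and the file paths are hypothetical; only the +29 / -34 and +245 / -1,137 totals come from the incident described above.

```python
def parse_numstat(text):
    """Sum added and deleted line counts from numstat-formatted text."""
    added = deleted = 0
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        a, d = parts[0], parts[1]
        if a == "-" or d == "-":
            continue  # binary files report "-" instead of line counts
        added += int(a)
        deleted += int(d)
    return added, deleted


def merge_looks_clean(expected, actual, tolerance=0):
    """Return True if the landed diff size matches the reviewed diff size."""
    return (abs(expected[0] - actual[0]) <= tolerance
            and abs(expected[1] - actual[1]) <= tolerance)


# Hypothetical numstat text for the reviewed PR and for the landed commit.
reviewed = "20\t30\tsrc/app.py\n9\t4\tREADME.md\n"
landed = "200\t1000\tsrc/app.py\n45\t137\tlib/util.py\n"

expected = parse_numstat(reviewed)
actual = parse_numstat(landed)

if not merge_looks_clean(expected, actual):
    print(f"ALERT: reviewed +{expected[0]}/-{expected[1]}, "
          f"landed +{actual[0]}/-{actual[1]}")
```

Line counts are a coarse signal, and legitimate merges can differ slightly from the reviewed diff, so a real check would probably want a small tolerance and a per-file comparison. Even this rough version would have flagged a +29 / -34 review landing as +245 / -1,137.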

GitHub has since fixed the root cause and likely improved their flag gating. But the incident serves as a reminder that infrastructure we rely on can become a liability. The best defense is proactive monitoring and a healthy distrust of green checkmarks.