You do not need a tool to measure DORA. Two of the four numbers are already in GitHub, and the other two are one decision away.



Most teams can recite the four DORA metrics and could not tell you their own. The usual reasons are that measuring them sounds like a project, or that it means buying a platform. Neither is true if you deploy through GitHub, because the data you need is already there.

The gap between teams is real and worth measuring against. The 2024 DORA report put the top performers at multiple deploys a day, a lead time under a day, a change failure rate around 5 per cent, and recovery in under an hour, and only about 19 per cent of teams reached that bar (gitdailies, 2026). One note before you start chasing a tier, though: the 2025 DORA programme retired the elite-to-low buckets entirely in favour of percentiles, so the benchmarks are a rough compass, not a finish line (gitdailies, 2026).

Here is the honest split. Deployment frequency and lead time come almost directly from GitHub Actions, because GitHub records every workflow run, commit, and merge with a timestamp. Change failure rate and time to restore need one more thing from you, a definition of what counts as a failure and what counts as recovered, because that signal does not live in the pipeline (gitdailies, 2026). This post shows how to compute all four from your own data, with real queries.

Deployment frequency: count what actually reached production

Deployment frequency is the simplest metric and the easiest to get wrong by counting the wrong thing. You count the successful runs of your production deploy workflow in a window, but the subtlety is choosing the right event: the run, release, or merge that means code actually went live, not a CI build and not a staging deploy (gitdailies, 2026).

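# Deployment frequency: successful prod-deploy runs in the last 30 days
export SINCE=$(date -u -d '30 days ago' +%FT%TZ)   # GNU date syntax
# -X GET is needed: gh api switches to POST as soon as -f fields are passed.
# Emit one id per run and count lines, so the total survives pagination.
gh api -X GET --paginate -f branch=main -f status=success \
  "repos/$OWNER/$REPO/actions/workflows/deploy.yml/runs" \
  --jq '.workflow_runs[] | select(.created_at >= env.SINCE) | .id' | wc -l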

This asks GitHub for the runs of deploy.yml on main that succeeded, keeps the ones inside the window, and counts them. Divide by the days in the window to turn the count into a rate.
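The division is small enough to get wrong by hand, so here is a minimal sketch of it. The function name `deploy_rate` is hypothetical, and it assumes you pass in the count from the query and the window length in days.

```shell
# deploy_rate COUNT DAYS: turn a deploy count and a window length into rates.
# Hypothetical helper, not part of gh or GitHub Actions.
deploy_rate() {
  awk -v c="$1" -v d="$2" 'BEGIN {
    if (d <= 0) { print "window must be positive" > "/dev/stderr"; exit 1 }
    printf "per_day=%.2f per_week=%.2f\n", c / d, c * 7 / d
  }'
}

# Example: 45 successful deploys in a 30-day window
deploy_rate 45 30
```

A weekly rate is often easier to read than a daily one for teams that ship a few times a week, which is why the sketch prints both.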

The trade-off is that the number is only as honest as your workflow boundaries. If one deploy run actually ships ten services, or you deploy by a path the query cannot see, the count lies, so pick the single event that unambiguously means production.

As a reference, the 2024 report's top tier, reached by roughly 19 per cent of teams, deployed multiple times a day (gitdailies, 2026; Taskade, 2026). Shipping at least once a week already puts you ahead of most of the survey.

Lead time for changes: from commit to live

Lead time is the most controllable metric and the most useful for spotting where the pipeline drags. It is the time from a commit to that commit running in production (gitmore, 2026), and the clean place to capture it is the moment a deployment reports success, when both timestamps are available.

# .github/workflows/lead-time.yml
on:
  deployment_status:

jobs:
  lead-time:
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - name: Record commit-to-deploy minutes
        run: |
          COMMIT_TS=$(git show -s --format=%ct ${{ github.sha }})
          NOW=$(date +%s)
          echo "lead_time_minutes=$(( (NOW - COMMIT_TS) / 60 ))"

The workflow fires when a deployment succeeds, reads the commit timestamp from git, subtracts it from now, and emits the lead time in minutes, which you append to a log or push to a dashboard (Aviator, 2025).

The trade-off is that the deployed commit is a rough start point. It is usually the last commit or the merge commit, so the days earlier commits spent on the branch and in review are not counted. For a stricter number, start from the first commit on the branch, which is closer to the DORA definition, or from the pull request creation time if you want to measure from the start of review.
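Whichever start point you choose, the calculation is the gap between two timestamps. A minimal sketch, assuming GNU date (as on ubuntu-latest) and ISO 8601 inputs; `minutes_between` is a hypothetical helper name, and the commented usage lines assume a `PR_NUMBER` variable you would have to supply yourself.

```shell
# minutes_between START END: whole minutes between two ISO 8601 timestamps.
# Relies on GNU date's -d parsing; hypothetical helper, not a gh feature.
minutes_between() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 60 ))
}

# Possible sources for the two timestamps (assumptions, adjust to your setup):
#   START=$(gh pr view "$PR_NUMBER" --json createdAt --jq .createdAt)
#   END=the deployment_status created_at value from the workflow event
minutes_between 2025-01-06T09:00:00Z 2025-01-06T11:30:00Z
```

Using the deployment's own timestamp as the end point, rather than the time the measuring job happens to run, keeps queue delays in the workflow out of the number.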

Change failure rate: define failure, then it is arithmetic

Change failure rate is the share of deployments that need an urgent rollback, hotfix, or patch (gitmore, 2026). The metric is trivial division. The hard part is deciding what counts as a failure, because GitHub knows a deploy failed but not whether a deploy that went green then broke production.

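# Change failure rate: failed deploys / all deploys, last 30 days
export SINCE=$(date -u -d '30 days ago' +%FT%TZ)
# -X GET is needed: gh api switches to POST once -f fields are passed.
# Emit one conclusion per line so the counts survive pagination.
runs=$(gh api -X GET --paginate -f branch=main -f status=completed \
  "repos/$OWNER/$REPO/actions/workflows/deploy.yml/runs" \
  --jq '.workflow_runs[] | select(.created_at >= env.SINCE) | .conclusion')

total=$(printf '%s\n' "$runs" | grep -c . || true)
failed=$(printf '%s\n' "$runs" | grep -c '^failure$' || true)
# Guard against an empty window before dividing
[ "$total" -gt 0 ] && echo "change_failure_rate=$(( failed * 100 / total ))%"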

This counts failed deploy runs over total deploy runs. That catches deploys that failed to apply, but not a deploy that went green and then caused an incident, so most teams also count revert and hotfix deploys, which tools detect automatically through rollbacks, revert pull requests, and hotfix labels (Datadog, 2026). Top teams sit near 5 per cent, low performers far higher (gitmore, 2026).

The trade-off is that a pure pipeline view undercounts, because the worst failures are the ones that deploy cleanly and break things later. The honest version links deployments to your incident source; until then, treat the CI-only number as a floor, not the truth.
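Until incidents are wired in, one rough way to lift that floor is to treat revert and hotfix deploys as failures too. The sketch below is an assumption-heavy proxy, not how any particular tool does it: `change_failure_rate` is a hypothetical function, it expects `conclusion<TAB>title` lines on stdin, and the title patterns (`Revert` at the start, `hotfix` anywhere) are conventions you would need to match to your own repository.

```shell
# change_failure_rate: reads "conclusion<TAB>title" lines on stdin.
# A deploy counts as failed if the run failed OR its title looks like a
# revert or hotfix. The title patterns are assumptions about naming habits.
change_failure_rate() {
  awk -F'\t' '
    NF == 0 { next }                      # skip blank lines
    { total++ }
    $1 == "failure" || tolower($2) ~ /^revert|hotfix/ { failed++ }
    END {
      if (total == 0) { print "change_failure_rate=n/a"; exit }
      printf "change_failure_rate=%.1f%% (%d of %d)\n", failed * 100 / total, failed, total
    }'
}

# Example with made-up deploy data
printf 'success\tShip search\nfailure\tShip cart\nsuccess\tRevert "Ship cart"\nsuccess\tAdd docs\n' \
  | change_failure_rate
```

The input could come from the same runs query by emitting the run's conclusion and title as tab-separated values, but a label or incident link is a more reliable signal than a title match if your team already uses one.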

Optimising deployment frequency while ignoring change failure rate just means shipping bugs faster. The four are designed to be read as a set, never one at a time, and never as a league table that ranks a frontend team against an embedded one (gitmore, 2026).

Time to restore: pair the failure with the recovery

Time to restore measures how quickly service is fully restored after a change caused a failure (gitmore, 2026). As a pipeline proxy, that is the time from a failed deploy to the next successful one. The truer version runs from when the incident opened to when it closed.

# Time to restore (CI proxy): pair each failed deploy with the next success
# -X GET is needed: gh api switches to POST once -f fields are passed
gh api -X GET --paginate -f branch=main \
  "repos/$OWNER/$REPO/actions/workflows/deploy.yml/runs" \
  --jq '.workflow_runs[] | [.created_at, .conclusion] | @tsv' | sort | \
awk -F'\t' '$2=="failure" && !f { f=1; ft=$1 }
            $2=="success" && f  { print ft, $1; f=0 }'

It lists every deploy as a timestamp and an outcome, sorts them oldest first, and for the first failure in each streak prints the next success, giving you the failure-to-recovery pairs to average.

The trade-off is that the proxy assumes the next green deploy is the fix, which is often but not always true, and it misses failures resolved by a config change or a rollback that is not itself a deploy. For a real number, measure from when the incident opened to when it closed.
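The pairs from the proxy still need to be turned into a single figure. A minimal sketch, assuming GNU date and input lines of the form `failed_at restored_at` separated by a space; `mean_restore_minutes` is a hypothetical name.

```shell
# mean_restore_minutes: reads "failed_at restored_at" pairs on stdin
# (ISO 8601, space separated) and prints the mean gap in whole minutes.
mean_restore_minutes() {
  local failed_at restored_at total=0 n=0
  while read -r failed_at restored_at; do
    [ -n "$restored_at" ] || continue     # skip blank or incomplete lines
    total=$(( total + $(date -u -d "$restored_at" +%s) - $(date -u -d "$failed_at" +%s) ))
    n=$(( n + 1 ))
  done
  if [ "$n" -eq 0 ]; then
    echo "no failure/recovery pairs"
    return 0
  fi
  echo "pairs=$n mean_restore_minutes=$(( total / n / 60 ))"
}

# Example with two made-up pairs: 30 minutes and 90 minutes
printf '2025-01-06T10:00:00Z 2025-01-06T10:30:00Z\n2025-01-08T14:00:00Z 2025-01-08T15:30:00Z\n' \
  | mean_restore_minutes
```

A mean is easy to compute but a single long outage will dominate it, so with more than a handful of pairs a median may describe the typical recovery better.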

Make it run itself, and read all four together

None of this is useful as a one-off. DORA is about direction over time, and you need a couple of months of data before any of it means anything, because a single sprint is not a baseline (gitmore, 2026). Put the computation in a scheduled workflow that runs weekly and publishes the four numbers somewhere visible.

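# .github/workflows/dora-report.yml
on:
  schedule:
    - cron: '0 6 * * 1'   # every Monday, 06:00 UTC

jobs:
  report:
    runs-on: ubuntu-latest
    permissions:
      actions: read     # lets GITHUB_TOKEN list workflow runs
      contents: read    # needed by actions/checkout
    steps:
      - uses: actions/checkout@v4
      - name: Compute the four numbers
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          OWNER: ${{ github.repository_owner }}
        run: |
          # The schedule payload may not carry github.event.repository,
          # so take the name from GITHUB_REPOSITORY ("owner/name") instead.
          export REPO="${GITHUB_REPOSITORY#*/}"
          ./scripts/dora.sh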

A Monday-morning run executes the same queries, computes the four metrics, and writes them to a report, an issue, or a dashboard, so the numbers stay live instead of being recomputed by hand whenever someone asks.
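One low-effort place to publish is the job summary that GitHub Actions renders on the run page. The sketch below assumes the four values have already been computed; `publish_report` is a hypothetical function, and the units in the table are assumptions to adapt. `GITHUB_STEP_SUMMARY` is the file path Actions provides for summaries, and the function falls back to standard output when it is unset.

```shell
# publish_report FREQ_PER_DAY LEAD_MINUTES CFR_PERCENT RESTORE_MINUTES
# Appends a small Markdown table to the Actions job summary, or prints it
# to stdout when run outside Actions. Hypothetical helper.
publish_report() {
  local out="${GITHUB_STEP_SUMMARY:-/dev/stdout}"
  {
    echo "### DORA report, $(date -u +%F)"
    echo "| Metric | Value |"
    echo "| --- | --- |"
    echo "| Deployment frequency | $1 per day |"
    echo "| Lead time for changes | $2 min |"
    echo "| Change failure rate | $3% |"
    echo "| Time to restore | $4 min |"
  } >> "$out"
}

# Example with made-up numbers
publish_report 1.50 42 5 60
```

The summary only lives with that run, so for a trend over months you would also append the same four values to a file or an issue that persists between runs.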

The trade-off, and it is the important one, is that a dashboard invites misuse. The moment these numbers are tied to performance reviews or used to rank teams, people optimise the number instead of the work, skipping reviews to cut lead time or avoiding risky but valuable changes to protect a failure rate (gitmore, 2026). Measure to find bottlenecks, not to grade people.

The part worth sitting with

So before you scope a metrics project or sign up for a platform, open your repository and ask one question: which workflow run, release, or merge means a change actually reached production? Answer that, and deployment frequency and lead time fall out of data GitHub already keeps, in a handful of queries you can put on a weekly schedule this afternoon. Change failure rate and time to restore take one more decision from you, what counts as a failure and what counts as recovered, and they will be approximate until you wire in your incidents, but an approximate number you watch every week beats a perfect one you never compute. The point was never to score a tier, especially now that the tiers are gone. It is to see, in your own data, where a change slows down and how often it breaks, so you can fix the pipeline instead of arguing about it. The numbers are already in there. You just have to count them.

Author note

I am Mohan Gopi, an Associate DevOps Engineer at Frigga Cloud Labs. I work across AWS, GCP, and Azure, with GitHub Actions as the deployment backbone for everything I ship, which means almost everything DORA wants to measure is already passing through my workflows. The pattern I keep seeing is teams treating these four numbers as something you buy rather than something you count, and then never looking at them because the tool was one more thing to set up. I started by computing just deployment frequency and lead time straight from the Actions API on a weekly cron, and that alone surfaced a slow review stage I had been blaming on the build. I added a rough change failure rate from failed and reverted deploys next, and only later linked it to incidents for the honest version. My advice is to measure badly and soon rather than perfectly and never, watch the trend, and never once put these numbers next to someone's name. If you want to swap the exact queries I run, I am on LinkedIn as Mohan Gopi.
