The engineering team looks fine from the outside. Standups happen. Tickets close. Demos run. The velocity chart is not obviously catastrophic. And yet, something is off. The sprint reviews feel a little defensive. The senior engineers seem quieter than usual. Deploys are happening less often than they did six months ago, but nobody has said why.
This is the gap that breaks companies quietly. Not in a dramatic moment of failure, but in a slow accumulation of signals that look like normal engineering friction until the resignation letter lands or the outage happens and nobody can explain how things got to this point.
Jellyfish's 2024 State of Engineering Management Report found that while 46% of engineers reported their teams were experiencing burnout, only 34% of executives reported the same. That 12-point gap is not a rounding error. It is the distance between what is actually happening in your engineering organization and what is visible from the leadership level.
If you are a CEO reading this, forward it to your CTO. If you are a CTO, read it and see whether any of these patterns sound familiar.
Sign one
The on-call rotation has become one person's problem.
This starts gradually. One engineer is more comfortable with the production environment than the others. They respond faster. They know where to look. So informally, incidents route to them. Then it becomes an expectation. Then it becomes resentment.
On-call, when it is distributed properly, is tolerable. When it concentrates on one or two people, it becomes a persistent low-grade emergency that consumes cognitive capacity even when nothing is actively broken. Those engineers are always half-present, always half-waiting for the page. They stop doing deep work because they have learned not to trust their own availability.
The Catchpoint SRE Report 2025 found that operational toil rose to 30% of engineering time in 2025, the first increase in five years. 88% of developers reported working more than 40 hours per week, with on-call obligations cited as a primary driver. The engineers carrying the most on-call load are usually your most experienced ones, which means the people you can least afford to lose are the ones absorbing the most damage.
The question to ask is simple: if someone asked your on-call engineer right now who is covering next week, could they answer without having to check Slack?
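You do not have to rely on impressions to answer the broader question either. Here is a minimal sketch, assuming your paging tool can export the last quarter's incidents as a CSV with an acknowledged_by column (the file and column names are placeholders for whatever your tool actually emits):

```python
# Tally who actually acknowledged pages over the last 90 days and see how
# skewed the distribution is. If one name owns most of the rows, on-call has
# quietly become that person's problem.
import csv
from collections import Counter

def page_load_by_engineer(path: str) -> Counter:
    counts: Counter = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["acknowledged_by"]] += 1
    return counts

if __name__ == "__main__":
    load = page_load_by_engineer("pages_last_90_days.csv")
    total = sum(load.values())
    for engineer, pages in load.most_common():
        print(f"{engineer:20s} {pages:4d} pages ({pages / total:.0%} of total)")
```

If the top name accounts for more than half the acknowledgements, the rotation exists on paper only.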
Sign two
Deployments have quietly become rarer and more ceremonial.
There is a pattern that starts with a bad deployment six months ago. Something broke in production. The postmortem was thorough. And then, without anyone explicitly deciding this, deployments started happening less often. More people got added to the approval chain. Someone started doing manual checks before every release. The deploy started requiring the senior engineer to be available.
Nobody called this a process change. It just happened, one cautious decision at a time, until deploying to production became a ceremony with specific participants and a specific time window and a specific level of anxiety attached to it.
According to DORA benchmarks, high-performing engineering teams deploy multiple times per day. Low performers deploy monthly or quarterly. The gap between those two numbers is not a gap in tooling. It is a gap in confidence — confidence in the pipeline, in the rollback mechanism, in the monitoring. When teams lose that confidence, they compensate with friction. And friction compounds.
Manual deployments are also where human error concentrates. Systems that require manual intervention during a release put a person in the loop at exactly the moment when a mistake is most expensive. The engineering hours spent on deployment ceremony are hours not spent building the product, and the irony is that the ceremony usually makes things less safe, not more.
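Deployment frequency is one of the easier things here to measure rather than debate. A minimal sketch, assuming each production deploy leaves a git tag matching deploy-* (if your pipeline records releases in a deploys table or a CD tool instead, the same counting applies to that source):

```python
# Count production deploys per month from git tags so the six-month trend
# is visible at a glance. Assumes deploy tags named like "deploy-2025-01-17-a".
import subprocess
from collections import Counter

def deploys_per_month(repo_path: str = ".") -> Counter:
    out = subprocess.run(
        ["git", "-C", repo_path, "tag", "--list", "deploy-*",
         "--format=%(creatordate:format:%Y-%m)"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(line for line in out.splitlines() if line)

if __name__ == "__main__":
    for month, count in sorted(deploys_per_month().items()):
        print(f"{month}: {count} deploys")
```

No chart required. If the monthly numbers have been sliding for two quarters, the ceremony has been growing for two quarters.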
Sign three
Nobody trusts the CI pipeline anymore.
You can tell a pipeline has lost the team's trust when engineers start re-running failed tests without looking at why they failed. This is called "flake blindness" — when tests are unreliable often enough that a red build no longer means something broke. It just means the pipeline needs another attempt.
Flaky tests and unreliable CI are not minor annoyances. They erode the fundamental contract of automated testing: that a passing build means something real. Mary Moore-Simmons, VP of Engineering at Keebo, put it plainly: "Engineers are typically the most expensive people in a company and making them wait for builds to finish or forcing them to manually fix flaky tests is a major productivity killer." Research from JetBrains found that engineering teams lose as much as 20% of weekly time to inefficiencies, technical debt, and tooling issues rather than product work.
The downstream effect is subtler than lost hours. When engineers learn not to trust automated checks, they start doing things manually that should be automated. They double-check deployments. They run scripts locally before pushing. They ask someone to "take a look" before merging. Each of these manual steps is a signal that the automation has lost its authority. The pipeline is still running. It just is not being believed.
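One way to put a rough number on this, assuming you can export recent CI runs as a CSV with commit_sha and status columns (placeholder names; most CI providers can produce an equivalent through their API):

```python
# Find commits where CI both failed and passed with no code change in between.
# Each one is a retry that should not have been necessary: a direct measure
# of how much the team is paying the flake tax.
import csv
from collections import defaultdict

def suspected_flaky_commits(path: str) -> list[str]:
    outcomes: dict[str, set[str]] = defaultdict(set)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            outcomes[row["commit_sha"]].add(row["status"])
    # Same SHA, both red and green: the code did not change, the test run did.
    return [sha for sha, statuses in outcomes.items()
            if {"failed", "passed"} <= statuses]

if __name__ == "__main__":
    flaky = suspected_flaky_commits("ci_builds_last_30_days.csv")
    print(f"{len(flaky)} commits needed a retry to go green")
```

The absolute number matters less than the direction. If it grows month over month, so does the team's habit of hitting re-run without reading the failure.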
Sign four
The alert inbox has become noise that everyone agrees to ignore.
This one is particularly dangerous because it looks like a solved problem. "We get a lot of alerts, but we know which ones matter." That is almost always a rationalization for an alert system that has stopped working.
Research cited by incident.io shows teams receive over 2,000 alerts weekly, with only 3% requiring immediate action. 73% of organizations experienced outages directly linked to ignored alerts, according to Splunk's State of Observability 2025. The pattern is predictable: teams add alerts after incidents to make sure they never miss that thing again. The alerts accumulate. Nobody removes them. Eventually the signal-to-noise ratio inverts and the only way to survive the volume is to stop reading it carefully.
The test for whether your team has alert fatigue is not whether they complain about alerts. It is whether, during a recent incident, the relevant alert had already fired before anyone noticed the problem. If the answer is yes, the system is alerting but nobody is listening. That is not an observability problem. That is an organizational problem wearing an observability costume.
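The ratio itself is measurable if your monitoring stack can export alert events with an alert name and some record of whether anyone acted on them. A minimal sketch, with resulted_in_action as a placeholder for whatever your tooling actually records:

```python
# Report what fraction of fired alerts led to any human action, and which
# alert names generate the most ignored noise. Column names are placeholders.
import csv
from collections import Counter

def alert_noise_report(path: str) -> None:
    fired: Counter = Counter()
    ignored: Counter = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            fired[row["alert_name"]] += 1
            if row["resulted_in_action"].strip().lower() != "true":
                ignored[row["alert_name"]] += 1
    total = sum(fired.values())
    acted = total - sum(ignored.values())
    print(f"{acted}/{total} alerts led to action ({acted / total:.1%})")
    print("Noisiest alerts:")
    for name, count in ignored.most_common(10):
        print(f"  {name}: ignored {count} of {fired[name]} firings")

if __name__ == "__main__":
    alert_noise_report("alert_events_last_30_days.csv")
```

The top of that second list is where alert pruning should start: every entry on it is training the team to stop reading.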
Sign five
Your best engineers have stopped suggesting improvements.
This is the quietest signal and the most serious one. Good engineers always have opinions about how to make systems better. They notice the technical debt. They see the process inefficiency. They have ideas about what to automate. When they stop surfacing those ideas, it is not because the ideas have dried up. It is because they have learned that surfacing them leads nowhere, or costs them more energy than they have available.
The Harness State of the Developer Experience 2024 Report, based on 500 engineering leaders and practitioners, found that 52% of developers say burnout is the primary reason their peers leave. 62% have experienced scope creep in their roles. The engineers who leave first are rarely the disengaged ones. They are the ones who cared the most, tried the hardest to improve things, and eventually concluded that the energy required was not coming back to them.
The withdrawal happens in stages. First, they stop suggesting systemic improvements. Then they stop volunteering for things outside their sprint. Then they start updating their LinkedIn. By the time the resignation letter arrives, the signs were present for months. They were just not the kind of signs that show up in a velocity chart or a sprint review.
The thing these five signals share is that none of them look like emergencies from the outside. On-call is covered. Deployments still happen. The pipeline runs. Alerts fire. Engineers are still in meetings. The organization looks functional right up until the moment it is not.
What they actually represent is accumulated friction, the slow buildup of toil and mistrust and exhaustion that happens when engineering systems and processes are not maintained with the same care as the product itself. The product gets sprints and reviews and retrospectives. The infrastructure around the engineering team quietly degrades.
If you recognized your organization in two or more of these, that is worth taking seriously before it becomes urgent. Not with a survey or a team offsite, but with specific questions: who is actually carrying the on-call load? What is the deployment frequency right now compared to six months ago? When did an engineer last suggest something that changed how you work? The answers will tell you more than any metric dashboard.

