Sometime around the third call with a new prospect, we end up telling them about a project we walked away from the week before. The prospect leans in. They want to know why we would turn down work.
The honest answer is we have learned to recognize the shape of work that will not finish. Same engagement, same failure, three different times. If the symptoms are present on the diagnostic call, the build will not survive week eight. So we walk.
Below is the rough taxonomy: three failure modes that account for almost every ops project we have seen die. None of them are about the technology. All of them are about sequencing. They are also why we turn down 40% of inbound.
Most ops projects we see die in roughly the same way. The team finds the wrong problem, builds for the demo, and then leaves the client unable to operate what was built. The project rots, gets rebuilt by someone else, or quietly stops running and nobody admits it. It looks like a series of unrelated mistakes. It is actually the same three patterns, replayed.
Our diagnostic call now exists to spot those patterns early. If we sense any of the three in the first conversation, we will probably pass. We say so up front. People appreciate it more than you would expect.
Failure mode one: building before diagnosing
Here is the conversation. Founder calls. They say, "we want to build an AI agent for X." They have already decided what to build. Our job, in their head, is to implement an answer they have already arrived at.
This is the deadliest mode because it feels like progress. The buyer is decisive, has scope ready, and just wants someone to ship. The trouble is that the answer they arrived at is almost always one or two layers above the actual bottleneck. They did not diagnose. They guessed.
A client called us once and said they needed an AI sales coach for their reps. Two weeks into the diagnostic, we found that 60% of their reps were not logging activity in the CRM. No coach in the world fixes a system with no data to coach on. The fix was a much smaller, much less interesting build that surfaced the missing logs back into the reps' workflow. Six weeks after it shipped, pipeline cleanliness was at 91%. The coach was no longer needed.
Failure mode two: solving for the demo
The system looks great in a Friday afternoon meeting. The founder is impressed. The board update writes itself. Two weeks later, real data hits production and the wheels come off.
Solving for the demo happens when the team optimizes for one or two golden examples that hide the long tail of real cases. The model works on sample inputs and falls apart on the messy ones nobody bothered to test. The fix is mundane. Build the eval set first. Demo against the messy data, not the curated kind.
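To make "eval set first" concrete, here is a minimal sketch of the kind of scrappy harness we mean. Every name and case in it is invented for illustration; what matters is the structure: tag each case as golden or messy, and report pass rates separately so the long tail cannot hide behind the happy path.

```python
# eval_harness.py: a deliberately small eval harness sketch.
# All names and cases here are invented for illustration.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class EvalCase:
    input_text: str
    expected: str
    tag: str  # "golden" for the happy path, "messy" for the long tail

CASES = [
    EvalCase("Refund please, order #1234", "refund", "golden"),
    # The messy cases are the point: ambiguous, compound, half-garbled.
    EvalCase("refund?? also change my addr, why was i charged 2x", "refund", "messy"),
    EvalCase("", "unknown", "messy"),  # empty input must not crash the system
]

def run_evals(system: Callable[[str], str]) -> Dict[str, float]:
    """Report the pass rate per tag, so a perfect golden score
    cannot mask a failing long tail."""
    counts = {"golden": [0, 0], "messy": [0, 0]}  # tag -> [passed, total]
    for case in CASES:
        counts[case.tag][1] += 1
        if system(case.input_text) == case.expected:
            counts[case.tag][0] += 1
    return {tag: passed / total for tag, (passed, total) in counts.items()}

if __name__ == "__main__":
    # Demo against the messy data: a toy system, scored on every case.
    toy = lambda text: "refund" if "refund" in text.lower() else "unknown"
    print(run_evals(toy))  # {'golden': 1.0, 'messy': 1.0}
```

The separate per-tag numbers are the whole trick: a build that scores 1.0 on golden and 0.4 on messy is a demo, not a system.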
If you cannot answer "what makes this fail?" in the second meeting, you are building for the demo, not the system.
A useful gut check we have started using: ask the buyer to walk us through three genuinely nasty real examples. Edge cases, exceptions, the ones the team groans about. If they cannot produce three, they have not been close enough to the problem to know its shape, and we recommend a longer diagnostic.
Failure mode three: skipping the handoff
The build ships. The team is happy. The contract ends. Six weeks later, the system has stopped running. Nobody on the client side knows how to fix it. They quietly call the original team back, or they let the system die.
The handoff is the unglamorous work that determines whether the project lives past 90 days. We treat it as a milestone, not an event. Two days of operator training. A runbook for the top five common failures. A 30-day post-launch Slack channel where the client's ops team can ask questions without paying us extra. The first sign that an engagement is at risk is the buyer who says "just have your team handle changes after launch." That arrangement is how systems rot.
The build is the easy part. Operating it is the work. If your buyer cannot picture what their Tuesday morning looks like 90 days after launch, the project is already at risk.
The unglamorous habits that prevent them
Three habits, applied in order, fix all three failure modes.
- Two-week paid diagnostic before any build. The diagnostic is not a sales call. It is a small piece of paid work where we get to see the problem from the inside. If we cannot find a real bottleneck, we say so and refund.
- Every system ships with an eval harness. A small, scrappy one is fine. The eval set lives in the same repo as the code. Failing evals means the build does not deploy; a minimal gate is sketched after this list.
- Handoff is a milestone, not an event. Two days of training. A runbook. 30 days of Slack support. Operators on the client side own the system at the end, not us.
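To make "failing evals means the build does not deploy" mechanical rather than aspirational, the gate can be a short script that CI runs before the deploy step, where a nonzero exit blocks the build. A sketch, reusing the hypothetical harness above (`my_system.handle` stands in for whatever the real entry point is):

```python
# gate.py: CI runs this before deploying; a nonzero exit blocks the build.
import sys

from eval_harness import run_evals  # the sketch from earlier
from my_system import handle        # hypothetical entry point of the real system

THRESHOLDS = {"golden": 1.0, "messy": 0.9}  # illustrative, not a recommendation

def main() -> int:
    rates = run_evals(handle)
    failing = {tag: rate for tag, rate in rates.items() if rate < THRESHOLDS[tag]}
    if failing:
        print(f"Eval gate FAILED: {failing} (thresholds: {THRESHOLDS})")
        return 1  # CI treats this as a failed step; the deploy never starts
    print(f"Eval gate passed: {rates}")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```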
The 90-day check
The thing nobody tells you about ops projects is that the failure shows up as silence. The system gets quietly mothballed. The team stops mentioning it in standup. The dashboard URL does not get clicked. There is no dramatic blow-up, just a slow fade.
We have started checking in 90 days post-launch. Not for an upsell. To make sure the system is still running. If it is not, that is information we want to feed back into the next engagement. The 90-day check is the cheapest piece of customer success we do.
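The check-in call scales better when something mechanical flags the silence first. Here is a sketch of what we mean, assuming the system overwrites a heartbeat file with a UTC timestamp on every run; the path and the three-day threshold are invented for the example.

```python
# staleness_check.py: notice the quiet death before the 90-day call does.
# Assumes each run writes datetime.now(timezone.utc).isoformat() to the file.
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path

HEARTBEAT = Path("/var/log/ops_system/last_run.txt")  # hypothetical path
MAX_SILENCE = timedelta(days=3)

def main() -> int:
    if not HEARTBEAT.exists():
        print("No heartbeat file; the system may never have run here.")
        return 1
    last_run = datetime.fromisoformat(HEARTBEAT.read_text().strip())
    silence = datetime.now(timezone.utc) - last_run
    if silence > MAX_SILENCE:
        print(f"Silent for {silence.days} days; last run {last_run.isoformat()}.")
        return 1
    print(f"Healthy; last run {silence} ago.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Run from cron on the client's side, it turns the slow fade into an alert someone actually sees.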
If any of these failure modes feel familiar, you can book a diagnostic. We are happy to walk away with a "you do not need us yet."