In early March, a founder we know forwarded an email. "My CFO is quitting, my month-end takes nine days, and I think we have an AP problem. Can you look?" That was the entire pitch. Two weeks later we knew what was wrong. Two months later it was fixed.
This is a lightly anonymized walkthrough of what we found, what we changed, and (importantly) what we left alone. There is a pattern in AI consulting right now where every engagement turns into "and now everything is an agent." That is not how operations work, it is not how this engagement worked, and the result is better for it.
The intake call
We do every diagnostic the same way. A 30-minute call to figure out if we should even be on the engagement, then two weeks of paid diagnostic if we should. The intake call has three questions:
- What is the function that absorbs every new ops hire? Whatever that is, that is your bottleneck.
- What does the CFO complain about on Friday afternoons? That is your near-term cost center.
- If you could automate one thing tomorrow with no risk, what would it be? That is the founder's gut talking. Usually right.
For this client all three answers were the same. AP. They were running a five-person AP function on $8M in revenue. The benchmark for a CPG company at that scale is one to two people. Five was a tell.
What was broken
The team had bought four pieces of software in three years. Each one promised to fix AP. Each one fixed a slice of it. None of them spoke to each other. The result was that the AP team's actual job had become reconciling the AP tools against each other.
Three of the five AP people were not paying invoices. They were copying data between systems so the books would close. This is the most common operational failure mode we see, by a mile.
The job people are paid to do is rarely the job they are actually doing. The diagnostic is finding the gap.
What we kept
Almost everything. The accounting system was fine. The bill-pay rails were fine. The expense tool was fine. We kept all three. The fix was a single integration layer that read from all three, deduplicated, and wrote back to the system of record.
No new SaaS subscriptions. No re-implementing the GL. No "let's move you to NetSuite." The least-glamorous answer is almost always the right one when the bottleneck is connective tissue.
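The core of that connective tissue is unglamorous: read invoice records from each tool, collapse duplicates of the same invoice, and write one clean record back. A minimal sketch of the dedupe step, with illustrative field names and data (the real systems and identity rules are not ours to share):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Invoice:
    vendor: str
    number: str
    amount_cents: int
    source: str  # which tool the record came from

def dedupe(records):
    """Collapse the same invoice seen in multiple tools into one record.

    Identity key here is (normalized vendor, invoice number, amount);
    the first source encountered wins. A real layer would also handle
    partial matches and currency, which we omit.
    """
    seen = {}
    for r in records:
        key = (r.vendor.strip().lower(), r.number.strip(), r.amount_cents)
        seen.setdefault(key, r)
    return list(seen.values())

# The same invoice arriving from two of three tools (names are made up)
records = [
    Invoice("Acme Corp", "INV-1001", 125000, "bill_pay"),
    Invoice("acme corp ", "INV-1001", 125000, "expense_tool"),
    Invoice("Beta LLC", "INV-2002", 48000, "accounting"),
]
merged = dedupe(records)  # two unique invoices remain
```

The point of the sketch is how small the fix is relative to the problem: a keyed dictionary, not a platform migration.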
Where AI showed up
In two places. One: classifying invoices into the right GL accounts. The team had been doing this manually because the existing tool's auto-categorization was 64% accurate and they did not trust it. We fine-tuned a small model on three years of their own historical data and got to 96%. The remaining 4% routes to a human for review with the model's confidence and reasoning attached.
Two: drafting vendor follow-up emails when an invoice was missing PO context. The model knew the tone, the standard ask, the escalation history. The AP person reviewed and sent. This was a 70% time reduction on a task that was eating four hours a day.
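The review routing in the first use case is simple enough to show. A hedged sketch, assuming the classifier returns a dict with a GL account, a confidence score, and a reasoning string (the threshold and field names are illustrative, not the client's actual values):

```python
CONFIDENCE_FLOOR = 0.90  # illustrative cutoff, tuned on held-out data in practice

def route(prediction):
    """Auto-post high-confidence classifications; queue the rest for a human.

    The reviewer sees the model's suggestion, confidence, and reasoning,
    so the low-confidence 4% is a quick check, not a from-scratch decision.
    """
    if prediction["confidence"] >= CONFIDENCE_FLOOR:
        return ("auto_post", prediction["gl_account"])
    return ("human_review", {
        "suggested": prediction["gl_account"],
        "confidence": prediction["confidence"],
        "reasoning": prediction["reasoning"],
    })
```

The design choice worth copying is that the model never posts silently below the floor; trust came back because the team could see why it was unsure.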
We measure every AI step against a deterministic baseline. If the model is not measurably better than a rule, we ship the rule and move on.
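That policy fits in a few lines. A minimal sketch, assuming both candidates are callables scored on the same labeled holdout set (the margin is a placeholder, not our actual bar):

```python
def accuracy(predict, labeled):
    """Fraction of (input, label) pairs the predictor gets right."""
    hits = sum(1 for x, y in labeled if predict(x) == y)
    return hits / len(labeled)

def choose(model_predict, rule_predict, holdout, margin=0.02):
    """Ship the model only if it beats the deterministic rule by a clear margin."""
    if accuracy(model_predict, holdout) >= accuracy(rule_predict, holdout) + margin:
        return "model"
    return "rule"

# Toy holdout: the model gets 2 of 3 right, the rule gets 1 of 3
holdout = [(1, 1), (2, 0), (3, 1)]
winner = choose(lambda x: 1, lambda x: 0, holdout)
```

The margin matters: a model that ties a rule loses, because the rule is cheaper to run and easier to audit.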
Where AI did not
Approval routing. Fraud checks. Anything that touched a payment going out the door. Those stay deterministic, audited, and human-owned. We do not build AI into the path of money leaving your bank account. Anyone who tells you they do is either lying or about to.
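What "stays deterministic" looks like in practice: plain rules you can read, test, and show an auditor. A hypothetical sketch, with made-up thresholds and role names:

```python
def approval_chain(amount_cents, vendor_is_new):
    """Pure rules, no model: the path of outgoing money is fully deterministic.

    Every branch is auditable, and changing a threshold is a code review,
    not a retraining run.
    """
    chain = ["ap_lead"]           # every payment gets at least one reviewer
    if vendor_is_new:
        chain.append("controller")  # new vendors get a second set of eyes
    if amount_cents >= 10_000_00:   # $10,000+: illustrative CFO threshold
        chain.append("cfo")
    return chain
```

The same structure extends to fraud checks: explicit predicates, not learned scores, anywhere a failure means money leaves the account.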
The handoff
Six weeks after kickoff, the system was live. Eight weeks in, the second AP hire transitioned to AR (where the same diagnostic now applies, a story for another note). Twelve weeks in, we ran the final review with the founder, the CFO, and the new lead operator. The KPIs we had signed off on at the start were hit. We documented the system, ran two days of training, and stepped out.
That is the engagement. AI is in there. It is not the point.
Names and a few specifics have been blurred at the client's request. Numbers are unchanged. If this matches your shape of pain, book a diagnostic.