Cost · Guide · 5 min

Estimate cost before adding more AI calls

A practical cost model for deciding whether a workflow needs one model call, several calls, or a cheaper deterministic step.

Real example

Example: reduce a five-call workflow to two calls

A prototype uses one AI call to summarize, one to classify, one to extract, one to draft, and one to validate. It works, but every document triggers all five calls, so cost grows quickly with document volume.

Combine extraction and classification into one structured response, move validation into code, cache document facts, and reserve drafting for records a human actually approves.

The product keeps the useful AI behavior while cutting unnecessary calls from every low-value record.
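A minimal sketch of the two-call shape described above, not a definitive implementation. The `call_model` parameter stands in for whatever model client the workflow uses, and the category set and field names are hypothetical. Folding the summary into the first structured response is also an assumption, since the example does not say where the summary ends up.

```python
import hashlib
import json
from typing import Callable, Optional

# Hypothetical label set; replace with the categories the workflow really uses.
ALLOWED_CATEGORIES = {"invoice", "contract", "other"}

# In-memory cache of document facts, keyed by a hash of the document text.
_fact_cache: dict = {}


def validate_facts(facts: dict) -> list:
    """Deterministic checks that replace the separate validation call."""
    errors = []
    if facts.get("category") not in ALLOWED_CATEGORIES:
        errors.append("unknown category")
    if not isinstance(facts.get("summary"), str) or not facts["summary"].strip():
        errors.append("missing summary")
    if not isinstance(facts.get("fields"), dict):
        errors.append("fields must be an object")
    return errors


def get_facts(text: str, call_model: Callable[[str], str]) -> dict:
    """Call 1: summary, category, and extracted fields in one structured response."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key in _fact_cache:
        return _fact_cache[key]  # cached facts cost nothing to reuse
    facts = json.loads(call_model(text))
    errors = validate_facts(facts)
    if errors:
        raise ValueError(f"invalid model output: {errors}")
    _fact_cache[key] = facts
    return facts


def maybe_draft(facts: dict, approved: bool,
                call_model: Callable[[str], str]) -> Optional[str]:
    """Call 2: drafting runs only for records a human approved."""
    if not approved:
        return None
    return call_model(json.dumps(facts))
```

With this shape, a record that is never approved costs one model call the first time its text is seen and no model calls after that.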

Tutorial path

How to implement it

Step 01
List every model call in the workflow and mark whether it is user-facing or background.
Step 02
For each call, estimate input size, output size, frequency, retry rate, and failure rate.
Step 03
Move deterministic classification, formatting, and permission logic into code.
Step 04
Cache stable outputs and reuse extracted document facts across later steps.
Step 05
Compare model cost against reviewer time saved and revenue impact.
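The first two steps can be sketched as a small per-workflow estimator. This is an illustration under stated assumptions: the `ModelCall` fields, the token counts, and the per-1k-token prices are hypothetical placeholders, not real provider pricing.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    name: str
    input_tokens: int
    output_tokens: int
    runs_per_workflow: float    # frequency: share of workflows that reach this call
    retry_rate: float           # extra attempts per run, e.g. 0.1 means 10% more
    input_price_per_1k: float   # hypothetical price in dollars per 1k input tokens
    output_price_per_1k: float  # hypothetical price in dollars per 1k output tokens

    def cost_per_workflow(self) -> float:
        """Expected spend on this call for one workflow, retries included."""
        per_attempt = (
            self.input_tokens / 1000 * self.input_price_per_1k
            + self.output_tokens / 1000 * self.output_price_per_1k
        )
        attempts = self.runs_per_workflow * (1 + self.retry_rate)
        return per_attempt * attempts


def workflow_cost(calls: list) -> float:
    """Sum the expected cost of every model call in the workflow."""
    return sum(call.cost_per_workflow() for call in calls)


# Example inventory with made-up numbers.
calls = [
    ModelCall("extract_and_classify", 2000, 500, 1.0, 0.1, 0.001, 0.002),
    ModelCall("draft", 1000, 1000, 0.2, 0.0, 0.001, 0.002),
]
estimate = workflow_cost(calls)
```

Keeping `runs_per_workflow` below 1.0 for calls such as drafting makes the saving from gating a call visible in the estimate.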
Checklist

Ready when these are true

Workflow-level cost estimate
Retries included
Caching opportunities found
Smaller model tested
Human review cost included
Field notes

What matters in practice

01
The cheapest AI call is the one replaced by validation, caching, or a smaller model.
02
Cost should be estimated per completed workflow, not only per request.
03
Human review time belongs in the cost model too.
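One way to put the last two notes into numbers is sketched below. The function name and parameters are hypothetical. It assumes that failed workflows still spend their full model budget and that human review happens only on completed workflows.

```python
def cost_per_completed_workflow(
    model_cost_per_attempt: float,
    failure_rate: float,
    review_minutes: float,
    reviewer_hourly_rate: float,
) -> float:
    """Total cost of one completed workflow, including failed attempts and review."""
    if not 0 <= failure_rate < 1:
        raise ValueError("failure_rate must be in [0, 1)")
    # Failed workflows still spend model budget, so spread that spend
    # over the workflows that actually complete.
    model_cost = model_cost_per_attempt / (1 - failure_rate)
    # Assumption: a reviewer spends review_minutes on each completed workflow.
    review_cost = review_minutes / 60 * reviewer_hourly_rate
    return model_cost + review_cost
```

In many workflows the review term is far larger than the model term, which is why leaving it out can make a cost estimate misleading.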
Avoid these mistakes

Common failure modes

01
Do not optimize cost by removing validation.
02
Do not count only successful calls; retries and failures cost money too.
03
Do not run drafting before the workflow knows the item is worth drafting.
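To avoid counting only successful calls, every attempt can be recorded with its cost and outcome. The `CallLedger` class below is a hypothetical sketch of that bookkeeping, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class CallLedger:
    """Records the cost of every attempt, whether or not it succeeded."""
    attempts: list = field(default_factory=list)

    def record(self, name: str, cost: float, succeeded: bool) -> None:
        self.attempts.append((name, cost, succeeded))

    def total_cost(self) -> float:
        """Spend across all attempts, including retries and failures."""
        return sum(cost for _, cost, _ in self.attempts)

    def successful_cost(self) -> float:
        """Spend on attempts that succeeded."""
        return sum(cost for _, cost, ok in self.attempts if ok)

    def hidden_cost(self) -> float:
        """Spend that a success-only report would miss."""
        return self.total_cost() - self.successful_cost()
```

Comparing `total_cost()` with `successful_cost()` over a week of traffic gives a measured retry and failure overhead to feed back into the estimate.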
Practical tip
Estimate cost per completed business outcome, not per model request.