Production AI Playbook: Deterministic Steps & AI Steps

Eliott Ardisson

Founder & CEO - Basalt Studio

Apr 7, 2026

Updated Jun 8, 2026

A practical guide to building production-ready AI workflows by combining deterministic logic with AI steps — fewer errors, lower costs, more reliable automation.

Key Takeaways

Pure AI workflows break in production — not because the models are bad, but because unstructured data and missing validation create compounding failures
Deterministic steps (validation, routing, formatting) are effectively free, near-instant, and repeatable; AI steps are best reserved for tasks that genuinely require interpretation or generation
The most reliable production systems layer deterministic pre-processing, structured AI calls, and deterministic output validation — in that order
Every AI output should pass through at least one business-logic validation check before it touches a downstream system
Cost control in AI workflows is mostly about identifying which steps don’t need AI at all

The Reliability Problem with Pure AI Workflows

Most AI workflow failures in production don’t happen because the model got something wrong. They happen because unclean input, missing validation, or unchecked output created a cascade that no one designed for.

A common pattern: a team connects an LLM to a support ticket workflow, tests it on a clean sample, gets sharp results, and ships it. Three weeks later, a customer name with special characters breaks the parser. A sarcastic complaint gets classified as positive feedback. A generated reply references a product feature that doesn’t exist.

None of these are model failures in the traditional sense. They’re architectural failures — places where deterministic guardrails should have existed but didn’t.

The fix isn’t a more capable model. It’s building the workflow correctly: deterministic logic where rules apply cleanly, AI where interpretation is genuinely needed, and validation gates at every handoff between the two.

What Deterministic and AI Steps Actually Mean

These terms get used loosely, so it’s worth being precise.

Deterministic steps follow explicit rules and always produce the same output for the same input. There is no probability involved. Examples:

Email format validation via regex
Date parsing and arithmetic
Required field checks
Database lookups and joins
Conditional routing based on field values
Whitespace trimming, case normalization, character escaping

AI steps handle tasks where the answer depends on context, interpretation, or generation. Examples:

Classifying the intent behind a customer message
Summarizing a long document
Extracting structured data from an invoice with no fixed format
Generating a personalized reply
Detecting sentiment across varied writing styles

The practical principle: deterministic steps cost nothing at inference time and never hallucinate. AI steps consume tokens, add latency, and introduce probabilistic variance. Design accordingly.

Where Deterministic Logic Should Always Win

Input Validation

Before any data touches an AI model, deterministic checks should confirm it’s worth processing. This includes:

Required field presence: if a field is mandatory, check field != null && field.trim() !== "" (this also rejects undefined and whitespace-only values); do not ask a model to infer completeness
Format validation: emails, phone numbers, URLs, and dates should pass format checks before reaching a prompt
Data type enforcement: a numeric field containing “N/A” will confuse downstream logic; catch it early
Text normalization: trim whitespace, standardize case, remove or escape characters that break JSON or distort prompts

Feeding messy input to an AI model and hoping it figures things out is how you get garbage-in-garbage-out at scale.
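
The checks above could be sketched as a single gate that runs before any prompt is built. This is a minimal sketch rather than a definitive implementation: the field names (email, message, orderValue), the error codes, and the deliberately simple email pattern are all assumptions made for illustration.

```javascript
// Hypothetical input gate. Field names and rules are illustrative assumptions.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // simple on purpose, not RFC-complete

function validateInput(raw) {
  const errors = [];
  const isBlank = (value) => value == null || String(value).trim() === "";

  // Required field presence: reject null, undefined, and whitespace-only values.
  for (const field of ["email", "message"]) {
    if (isBlank(raw[field])) errors.push(`missing:${field}`);
  }

  // Text normalization: trim, lowercase the email, collapse runs of whitespace.
  const email = isBlank(raw.email) ? "" : String(raw.email).trim().toLowerCase();
  const message = isBlank(raw.message) ? "" : String(raw.message).trim().replace(/\s+/g, " ");

  // Format validation, applied only when the field is present.
  if (email !== "" && !EMAIL_PATTERN.test(email)) errors.push("invalid:email");

  // Data type enforcement: "N/A" must not pass as a number.
  let orderValue = null;
  if (!isBlank(raw.orderValue)) {
    orderValue = Number(raw.orderValue);
    if (!Number.isFinite(orderValue)) errors.push("invalid:orderValue");
  }

  return { ok: errors.length === 0, errors, clean: { email, message, orderValue } };
}
```

Only records with ok set to true would move on to the AI layer; everything else can be rejected or routed to review with the specific error codes attached.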

Routing and Conditional Logic

If you can write the rule explicitly, write it explicitly:

function routeTicket(ticket, account) {
    if (ticket.priority === "urgent") return "urgent_queue";
    if (account.tier === "enterprise") return "enterprise_team";
    return "general_support";
}

This executes in milliseconds, costs nothing, and behaves identically every time. Asking a model to infer the correct queue from ticket metadata is slower, more expensive, and introduces unnecessary variance.

Calculations and State Transitions

Invoice totals, tax calculations, date arithmetic, unit conversions, order status progressions — these should always be deterministic. There is no ambiguity to resolve. Using AI for arithmetic is both a reliability risk and a waste of tokens.
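
As a rough illustration, both kinds of logic fit in a few lines of plain code. The sketch below assumes amounts are stored as integer cents and tax rates as basis points (to avoid floating-point drift), and the order statuses in the transition table are hypothetical.

```javascript
// Illustrative only: the line-item shape, tax representation, and status names are assumptions.
function invoiceTotalCents(lineItems, taxRateBasisPoints) {
  const subtotal = lineItems.reduce(
    (sum, item) => sum + item.unitPriceCents * item.quantity,
    0
  );
  // 10000 basis points = 100%. Round once, at the end.
  const tax = Math.round((subtotal * taxRateBasisPoints) / 10000);
  return { subtotal, tax, total: subtotal + tax };
}

// Order status progression as an explicit table of allowed moves.
const ALLOWED_TRANSITIONS = new Map([
  ["pending", ["paid", "cancelled"]],
  ["paid", ["shipped", "refunded"]],
  ["shipped", ["delivered"]],
]);

function canTransition(from, to) {
  return (ALLOWED_TRANSITIONS.get(from) ?? []).includes(to);
}
```

Nothing here needs interpretation, so the same input always yields the same total and the same allowed transitions.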

Where AI Genuinely Earns Its Place

Understanding Unstructured Text

This is where AI has a real advantage over rule-based systems. A customer complaint written in British understatement, a legal query with embedded context, a support ticket mixing three languages — deterministic rules can’t reliably handle these. AI can.

Specific tasks where AI adds clear value:

Intent classification from natural language where tone and phrasing matter
Sentiment analysis across varying writing styles and cultural registers
Document summarization where preserving nuance matters more than extracting keywords
Named entity extraction from documents with no fixed structure (invoices, contracts, resumes)

Generation with Context

When a response needs to account for customer history, account tier, prior interactions, and current context, AI-generated output is substantially better than template-based alternatives. The key constraint: the generation should be tightly prompted and the output should be validated before it’s sent anywhere.

Pattern Recognition That’s Hard to Codify

Some patterns are genuinely difficult to encode as explicit rules — fraud signals in behavioral data, early churn indicators in support ticket language, emerging themes in customer feedback. AI handles these well when the inputs are clean and the outputs are validated.

The Four-Layer Hybrid Architecture

Building reliable hybrid workflows means thinking in layers, not steps.

Layer 1: Deterministic Pre-Processing

Every workflow starts here. Before anything reaches an AI model:

Validate required fields exist and are non-empty
Enforce format rules on structured fields
Normalize text (case, whitespace, encoding)
Extract structured metadata that can be used for routing later
Confirm the input is actually worth processing

A practical example: a customer inquiry comes in as " URGENT!!! order #9823 missing — email me: Jane.DOE@Company.COM ". Deterministic pre-processing extracts the order number, normalizes the email, strips excess punctuation, sets a priority flag based on keyword detection, and passes clean, structured data to the AI layer. The model gets a well-formed input; the workflow gets predictable outputs.
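
A minimal sketch of that pre-processing step might look like the following. The regular expressions, the keyword list, and the returned field names are assumptions for illustration; a real workflow would tune them to its own data.

```javascript
// Hypothetical pre-processing step. Patterns and keyword list are illustrative assumptions.
const ORDER_PATTERN = /#(\d+)/;
const EMAIL_IN_TEXT = /[^\s@]+@[^\s@]+\.[^\s@]+/;
const URGENT_KEYWORDS = ["urgent", "asap", "immediately"];

function preprocessInquiry(rawText) {
  const text = rawText.trim();

  // Extract structured metadata before any cleanup can remove it.
  const orderMatch = text.match(ORDER_PATTERN);
  const emailMatch = text.match(EMAIL_IN_TEXT);

  // Keyword-based priority flag: a plain rule, no model involved.
  const lowered = text.toLowerCase();
  const priority = URGENT_KEYWORDS.some((word) => lowered.includes(word)) ? "high" : "normal";

  // Collapse repeated punctuation ("!!!" becomes "!") and runs of whitespace.
  const cleanText = text.replace(/([!?.])\1+/g, "$1").replace(/\s+/g, " ");

  return {
    orderNumber: orderMatch ? orderMatch[1] : null,
    email: emailMatch ? emailMatch[0].toLowerCase() : null,
    priority,
    cleanText,
  };
}
```

Applied to the inquiry quoted above, this returns order number "9823", email "jane.doe@company.com", and priority "high", alongside a cleaned copy of the text for the model.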

Layer 2: Structured AI Interactions

Vague prompts produce vague outputs. When calling a model, specify exactly what you need back:

Instead of: “Analyze this support email and tell me what to do.”

Use: “Classify this inquiry into one of: refund_request, order_status, technical_support, billing_question, general_inquiry. Also return: customer_emotion as one of [positive, neutral, frustrated, angry], urgency_level as one of [low, medium, high], and requires_escalation as true or false.”

Structured outputs mean the next deterministic layer has predictable values to work with. This is not a nice-to-have — it’s what makes the rest of the workflow reliable.
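
One way to keep the prompt and the later validation in sync is to define the allowed values once and generate the prompt from them. The sketch below assumes the model is asked for a JSON object and that the intent label is returned under a field named classification; both are illustrative choices, and no particular model API is implied.

```javascript
// Illustrative schema. The label sets come from the example prompt above;
// the field name "classification" and the JSON reply format are assumptions.
const CLASSIFICATION_SCHEMA = {
  classification: ["refund_request", "order_status", "technical_support", "billing_question", "general_inquiry"],
  customer_emotion: ["positive", "neutral", "frustrated", "angry"],
  urgency_level: ["low", "medium", "high"],
  requires_escalation: [true, false],
};

function buildClassificationPrompt(inquiryText) {
  const fieldLines = Object.entries(CLASSIFICATION_SCHEMA)
    .map(([field, allowed]) => `- ${field}: one of ${JSON.stringify(allowed)}`)
    .join("\n");

  return [
    "Classify the customer inquiry below.",
    "Respond with a single JSON object containing exactly these fields:",
    fieldLines,
    "Do not add any other fields or commentary.",
    "",
    "Inquiry:",
    // JSON-escaping keeps quotes and newlines in the inquiry from distorting the prompt.
    JSON.stringify(inquiryText),
  ].join("\n");
}
```

Because the allowed values live in one object, a deterministic validator can check the reply against the same source of truth.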

Layer 3: Deterministic Output Validation

Never pass AI output directly to a downstream system without checking it. Specifically:

Confirm required fields are populated
Verify classification values fall within the allowed set
Check that generated text doesn’t contain placeholder strings or template artifacts
Validate extracted numeric values are actually numeric and within plausible ranges
Confirm extracted dates parse correctly
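
The first two checks could be sketched like this. It assumes the model was asked to reply with a JSON object using the field names from the Layer 2 example, with the intent label under a hypothetical classification field.

```javascript
// Hypothetical validator for the model's raw reply. Field names are assumptions.
const ALLOWED_VALUES = {
  classification: ["refund_request", "order_status", "technical_support", "billing_question", "general_inquiry"],
  customer_emotion: ["positive", "neutral", "frustrated", "angry"],
  urgency_level: ["low", "medium", "high"],
};

function validateClassification(rawReply) {
  let parsed;
  try {
    parsed = JSON.parse(rawReply);
  } catch {
    return { ok: false, errors: ["not_json"] };
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return { ok: false, errors: ["not_object"] };
  }

  const errors = [];
  // A missing field is undefined, which is never in the allowed set.
  for (const [field, allowed] of Object.entries(ALLOWED_VALUES)) {
    if (!allowed.includes(parsed[field])) errors.push(`invalid:${field}`);
  }
  // Booleans must be real booleans, not the strings "true" or "false".
  if (typeof parsed.requires_escalation !== "boolean") {
    errors.push("invalid:requires_escalation");
  }

  return { ok: errors.length === 0, errors, value: parsed };
}
```

Anything that fails here should take an explicit failure path (retry or human review) rather than flow onward.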

Then route based on validated outputs:

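// `ai` is the validated model output; order_value comes from the order record, not the model.
function routeValidated(ai, order_value) {
    if (ai.classification === "refund_request" && order_value > 500) {
        return "senior_support";
    } else if (ai.urgency_level === "high" || ai.requires_escalation === true) {
        return "priority_queue";
    } else {
        return "standard_queue";
    }
}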

This routing logic is deterministic. The AI provided the classification; the workflow decides what to do with it.

Layer 4: Quality Monitoring and Continuous Improvement

Deterministic monitoring checks run after execution:

Track confidence scores where the model or a separate scoring step exposes them, and flag low-confidence outputs for human review
Monitor classification distributions over time — a sudden shift often signals a prompt regression or data quality change
Log every validation failure with enough context to diagnose the cause
Measure token usage per workflow run to catch cost regressions early
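
The distribution check in particular is cheap to automate. The sketch below compares the share of each label between a baseline window and the current window; the 0.15 threshold is an arbitrary assumption and would need tuning against real traffic.

```javascript
// Illustrative drift check. The default threshold is an assumption, not a recommendation.
function labelShares(labels) {
  const counts = new Map();
  for (const label of labels) counts.set(label, (counts.get(label) ?? 0) + 1);

  const shares = new Map();
  for (const [label, count] of counts) shares.set(label, count / labels.length);
  return shares;
}

function driftedLabels(baselineLabels, currentLabels, threshold = 0.15) {
  const baseline = labelShares(baselineLabels);
  const current = labelShares(currentLabels);
  const allLabels = new Set([...baseline.keys(), ...current.keys()]);

  // Flag any label whose share of traffic moved by more than the threshold.
  return [...allLabels]
    .filter((label) => Math.abs((current.get(label) ?? 0) - (baseline.get(label) ?? 0)) > threshold)
    .sort();
}
```

A non-empty result does not say what went wrong; it only says the mix changed enough that someone should look at recent prompt edits or upstream data.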

Practical Workflow Examples

Customer Support Triage

Naive approach: customer email → LLM → response → send

Hybrid approach:

1. Extract customer ID and any order numbers deterministically
2. Look up account tier, order history, and open tickets in the CRM
3. AI classifies intent and sentiment with structured output
4. Deterministic routing based on account tier plus classification
5. AI generates reply using full customer context
6. Deterministic check confirms no placeholders, appropriate length, no hallucinated product names
7. Log interaction, set follow-up flag if needed

The reliability difference between these two approaches is significant. In our work with founder-led service businesses deploying first-generation support agents, the most common failure is the one caught at step 6, the deterministic reply check: generated responses that pass intent checks but contain factual errors about the product or service. A deterministic fact-check layer against a known product catalogue catches these before they reach customers.
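
A reply check of that kind might be sketched as follows. It assumes the generation step is prompted to return the product names it referenced next to the reply text (a hypothetical mentionedProducts array), which makes the catalogue comparison a simple set lookup. The catalogue entries, length limits, and placeholder patterns are also illustrative. A product named in the text but omitted from that array would slip through, so this is a first line of defense rather than a complete fact-check.

```javascript
// Hypothetical reply gate. Catalogue, limits, and patterns are illustrative assumptions.
const PRODUCT_CATALOGUE = new Set(["starter plan", "pro plan", "analytics add-on"]);
const PLACEHOLDER_PATTERN = /\{\{[^}]*\}\}|\[[A-Z][A-Z_ ]*\]/; // {{name}} or [CUSTOMER NAME]
const MIN_LENGTH = 40;
const MAX_LENGTH = 1500;

function checkReply(draft) {
  const errors = [];
  const text = draft.reply.trim();

  if (PLACEHOLDER_PATTERN.test(text)) errors.push("placeholder");
  if (text.length < MIN_LENGTH || text.length > MAX_LENGTH) errors.push("length");

  // Every product the model says it mentioned must exist in the catalogue.
  for (const name of draft.mentionedProducts) {
    if (!PRODUCT_CATALOGUE.has(name.trim().toLowerCase())) {
      errors.push(`unknown_product:${name}`);
    }
  }

  return { ok: errors.length === 0, errors };
}
```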

Lead Qualification Pipeline

Deterministic validation of form submission fields
Enrichment lookup for company size and industry
AI analysis of inquiry text for buying intent, pain points, and urgency signals
Deterministic scoring using explicit rules (company size bracket, mentioned budget, timeline language)
Combined qualification score merges AI insights with rule-based score
Routing to appropriate sales rep based on territory and score threshold
AI drafts personalized first-touch email with all available context
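
The deterministic scoring and the merge with the AI signal could look roughly like this. The size brackets, point values, timeline phrases, and the 60/40 weighting are all assumptions chosen for illustration, and aiIntentScore is a hypothetical 0 to 100 value returned by the analysis step.

```javascript
// Illustrative scoring rules. Brackets, weights, and phrases are assumptions.
const TIMELINE_PATTERN = /\b(this (week|month|quarter)|asap|immediately)\b/i;

function ruleScore(lead) {
  let score = 0;

  // Company size bracket.
  if (lead.companySize >= 200) score += 40;
  else if (lead.companySize >= 50) score += 25;
  else if (lead.companySize >= 10) score += 10;

  // Explicit budget mention and near-term timeline language.
  if (lead.mentionedBudget) score += 30;
  if (TIMELINE_PATTERN.test(lead.inquiryText)) score += 30;

  return score; // 0 to 100
}

function qualificationScore(lead, aiIntentScore) {
  // Clamp the model's score so an out-of-range value cannot distort the blend.
  const ai = Math.min(100, Math.max(0, aiIntentScore));
  return Math.round((60 * ruleScore(lead) + 40 * ai) / 100);
}
```

Keeping the rule-based part explicit means a sales lead can ask why a score is what it is and get an answer that does not depend on the model.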

Invoice and Document Processing

File format and readability check
Text extraction with deterministic cleanup of common OCR artifacts
AI extraction of vendor name, amounts, dates, and line items
Deterministic validation: amounts are numeric, dates parse correctly, required fields present
Business rule checks: amounts within expected range, vendor on approved list
Route failed validations to human review with specific error context
Sync validated data to accounting system
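
The validation and business-rule steps might be sketched as below. The extracted field names, the approved vendor list, the amount ceiling, and the expectation that dates arrive as YYYY-MM-DD are all assumptions for illustration.

```javascript
// Hypothetical invoice checks. Vendor list, ceiling, and field names are assumptions.
const APPROVED_VENDORS = new Set(["acme supplies", "northwind traders"]);
const MAX_EXPECTED_AMOUNT = 50000;

function isValidIsoDate(value) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const date = new Date(`${value}T00:00:00Z`);
  // Rejects both unparseable dates and silent rollovers such as February 30.
  return !Number.isNaN(date.getTime()) && date.toISOString().slice(0, 10) === value;
}

function validateInvoice(extracted) {
  const errors = [];
  const present = (field) =>
    extracted[field] != null && String(extracted[field]).trim() !== "";

  for (const field of ["vendorName", "totalAmount", "invoiceDate"]) {
    if (!present(field)) errors.push(`missing:${field}`);
  }
  if (present("totalAmount")) {
    const amount = Number(extracted.totalAmount);
    if (!Number.isFinite(amount)) errors.push("amount_not_numeric");
    else if (amount <= 0 || amount > MAX_EXPECTED_AMOUNT) errors.push("amount_out_of_range");
  }
  if (present("invoiceDate") && !isValidIsoDate(String(extracted.invoiceDate).trim())) {
    errors.push("date_unparseable");
  }
  if (present("vendorName") && !APPROVED_VENDORS.has(String(extracted.vendorName).trim().toLowerCase())) {
    errors.push("vendor_not_approved");
  }

  // Failed items go to human review with the specific reasons attached.
  return { route: errors.length === 0 ? "accounting_sync" : "human_review", errors };
}
```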

McKinsey research on back-office automation suggests document processing workflows designed with explicit validation layers maintain substantially higher accuracy rates than end-to-end AI approaches, particularly for high-volume, compliance-sensitive use cases.

Common Pitfalls

Using AI for tasks that have clear rules. If you can write an if/then statement that handles 95% of cases, do that. Reserve AI for the remaining 5% that genuinely require interpretation. This applies especially to routing, formatting, and validation.

Skipping output validation. Developers often validate inputs carefully and then trust AI outputs implicitly. This is backwards. AI outputs are where validation matters most.

Designing only for the happy path. The workflow that handles clean, well-formed inputs is straightforward to build. The reliable workflow handles malformed inputs, model timeouts, unexpected classification values, and downstream API failures gracefully.

No fallback logic. When an AI step fails — timeout, API error, low-confidence output — the workflow needs an explicit path. That path is usually: log the failure, route to human review, do not silently drop the item.
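
A fallback path can be as small as a wrapper around the AI step. In the sketch below, aiStep is a stand-in for a real model call and is synchronous only to keep the example short (a production call would normally be asynchronous); the 0.7 confidence threshold and the result shape are assumptions.

```javascript
// Illustrative fallback wrapper. `aiStep`, the result shape, and the threshold are assumptions.
const MIN_CONFIDENCE = 0.7;

function runWithFallback(item, aiStep, reviewQueue, failureLog) {
  let failureReason = null;
  let result = null;

  try {
    result = aiStep(item);
    if (result.confidence < MIN_CONFIDENCE) failureReason = "low_confidence";
  } catch (error) {
    // Timeouts and API errors land here.
    failureReason = `error:${error.message}`;
  }

  if (failureReason === null) {
    return { action: "continue", value: result.value };
  }

  // Explicit path: log the failure and hand the item to a human. Never drop it.
  failureLog.push({ itemId: item.id, reason: failureReason });
  reviewQueue.push(item);
  return { action: "human_review", reason: failureReason };
}
```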

Over-engineering before proving value. Complex multi-model pipelines with cascading AI steps are hard to debug and expensive to run. Start with one AI step surrounded by deterministic logic. Prove it works at volume before adding sophistication.

Cost Control in AI Workflows

Token costs compound quickly at scale. The most effective cost controls are architectural:

Deterministic shortcuts for obvious cases: if an email begins with the word “unsubscribe”, route it deterministically rather than classifying it with a model
Caching for repeated inputs: once deterministic pre-processing has normalized them, identical inputs can reuse a prior output instead of triggering a new model call
Confidence thresholds: only invoke a more capable (more expensive) model when a lighter-weight check returns low confidence
Batching where latency allows: grouping similar requests reduces per-call overhead for certain API configurations
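
The first three controls combine naturally in one wrapper. In this sketch, cheapModel and strongModel are injected stand-ins for real model calls (synchronous here for brevity), and the 0.8 confidence threshold is an assumption; batching is left out because it depends on the API being used.

```javascript
// Illustrative cost controls. The model functions are hypothetical stand-ins.
function createClassifier({ cheapModel, strongModel, minConfidence = 0.8 }) {
  const cache = new Map();
  const stats = { shortcuts: 0, cacheHits: 0, cheapCalls: 0, strongCalls: 0 };

  function classify(rawText) {
    // Normalize first so trivially different inputs share one cache entry.
    const text = rawText.trim().toLowerCase().replace(/\s+/g, " ");

    // Deterministic shortcut: no model call at all.
    if (/^unsubscribe\b/.test(text)) {
      stats.shortcuts += 1;
      return "unsubscribe";
    }

    if (cache.has(text)) {
      stats.cacheHits += 1;
      return cache.get(text);
    }

    // Try the lighter model; escalate only when it is unsure.
    stats.cheapCalls += 1;
    let result = cheapModel(text);
    if (result.confidence < minConfidence) {
      stats.strongCalls += 1;
      result = strongModel(text);
    }

    cache.set(text, result.label);
    return result.label;
  }

  return { classify, stats };
}
```

The stats object doubles as a cheap cost report: the ratio of shortcuts and cache hits to model calls shows how much traffic never reached a model.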

Gartner has noted that AI infrastructure cost management is becoming a primary operational concern for organizations scaling from pilot to production — and that most overspend comes from invoking AI for tasks that don’t require it.

Measuring What Matters

Useful metrics for hybrid workflows fall into three categories:

Reliability: error rate per 1,000 workflow executions; percentage of runs completing without human intervention; exception rate by failure type

Cost efficiency: token usage per successful execution; cost per processed item; ratio of deterministic to AI steps (higher is generally better, all else equal)

Quality: accuracy of AI classifications validated against human review samples; compliance with business rules in generated outputs; consistency of output format over time
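
Several of these numbers fall straight out of the run log. The sketch below assumes each logged run has a status ("success" or "failed"), a humanIntervention flag, and a tokens count; those field names are hypothetical. Tokens spent on failed runs are counted against successful ones, since they were still paid for.

```javascript
// Illustrative metrics over a run log. The log entry shape is an assumption.
function summarizeRuns(runs) {
  const total = runs.length;
  const failed = runs.filter((run) => run.status === "failed").length;
  const successes = runs.filter((run) => run.status === "success").length;
  const unattended = runs.filter(
    (run) => run.status === "success" && !run.humanIntervention
  ).length;
  const totalTokens = runs.reduce((sum, run) => sum + run.tokens, 0);

  return {
    errorsPer1000: total === 0 ? 0 : (failed / total) * 1000,
    unattendedRate: total === 0 ? 0 : unattended / total,
    // All tokens, including those spent on failed runs, per successful execution.
    tokensPerSuccess: successes === 0 ? 0 : totalTokens / successes,
  };
}
```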

Review these monthly. The most common optimization finding is an AI step that could be replaced by a deterministic rule, usually revealed by recurring patterns in the failure logs.

Where to Go From Here

The principle underlying all of this is straightforward: use deterministic logic wherever rules apply cleanly, use AI where interpretation is genuinely required, and validate every handoff between the two. Workflows built this way are cheaper to run, easier to debug, and meaningfully more reliable in production than pure AI approaches.

Most teams find that starting with a workflow audit — mapping existing processes to identify where AI adds value versus where deterministic rules suffice — shortens implementation time considerably and surfaces cost optimization opportunities before any code is written.

If you’d like to think through the architecture for your specific workflows, you’re welcome to book a strategy call: https://cal.com/eliott-ardisson-kzq7zs/ai-strategy-call