Rule-Based vs LLM: When to Skip the API Call

The Context

I'm building an instant lead response system. When someone fills out a contact form, the system needs to classify their intent, score the lead quality, generate a personalized response, and send it via email — all in under 60 seconds.

My first instinct: use an LLM. Claude Haiku is fast, cheap (~$0.002/request), and handles nuance well.

But then I hit a blocker on API access. While figuring it out, I thought: could I just not use an LLM at all?

Three hours later, I had a working rule-based system.

The Rule-Based Version

Intent classification: Keyword and phrase matching with weighted scoring. Keywords score 1 point each, phrases score 3 (stronger signal). Highest score wins.
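A minimal sketch of that scoring logic, assuming illustrative intent names, keywords, and phrases (these are placeholders, not the actual lists from the repo):

```python
# Hypothetical keyword/phrase tables -- illustrative, not the repo's actual signals.
INTENT_SIGNALS = {
    "pricing": {"keywords": ["price", "cost", "plan"], "phrases": ["how much does it cost"]},
    "demo":    {"keywords": ["demo", "trial", "walkthrough"], "phrases": ["can i see a demo"]},
    "support": {"keywords": ["broken", "error", "help"], "phrases": ["something is not working"]},
}

def classify_intent(message: str) -> str:
    text = message.lower()
    scores = {}
    for intent, signals in INTENT_SIGNALS.items():
        score = sum(1 for kw in signals["keywords"] if kw in text)   # keywords: 1 point each
        score += sum(3 for ph in signals["phrases"] if ph in text)   # phrases: 3 points each
        scores[intent] = score
    # Highest score wins; fall back to a default when nothing matches.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general_inquiry"
```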

Lead scoring: Signal detection. Start at 5 baseline, add points for urgency signals ("asap", "ready to buy", "budget approved"), quality signals ("enterprise", "team", "growing"), and contact completeness. Subtract for weak signals ("just browsing", "student", "personal project"). Clamp to 1-10.
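Roughly, in code (the point weights and signal lists below are placeholders, not the production values):

```python
# Illustrative signal lists -- the real system's lists are longer.
URGENCY = ["asap", "ready to buy", "budget approved"]
QUALITY = ["enterprise", "team", "growing"]
WEAK    = ["just browsing", "student", "personal project"]

def score_lead(message: str, has_phone: bool, has_company: bool) -> int:
    text = message.lower()
    score = 5                                             # baseline
    score += 2 * sum(1 for s in URGENCY if s in text)     # urgency signals (weight assumed)
    score += 1 * sum(1 for s in QUALITY if s in text)     # quality signals (weight assumed)
    score -= 2 * sum(1 for s in WEAK if s in text)        # weak signals (weight assumed)
    score += int(has_phone) + int(has_company)            # contact completeness
    return max(1, min(10, score))                         # clamp to 1-10
```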

Response generation: Template + variable substitution. Five templates (one per intent), fill in name, company, intent summary, and recommended plan based on score.
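A sketch of the template step; the template text, plan names, and score thresholds here are made up for illustration:

```python
# Two illustrative templates; the real system has five (one per intent).
TEMPLATES = {
    "pricing": (
        "Hi {name},\n\n"
        "Thanks for reaching out about pricing for {company}. Based on what you "
        "described ({intent_summary}), the {plan} plan is likely the best fit."
    ),
    "general_inquiry": (
        "Hi {name},\n\n"
        "Thanks for getting in touch about {company}. From your note "
        "({intent_summary}), I'd suggest starting with the {plan} plan."
    ),
}

def generate_response(intent: str, name: str, company: str,
                      intent_summary: str, score: int) -> str:
    # Plan names and score thresholds are placeholders, not the repo's actual tiers.
    plan = "Enterprise" if score >= 8 else "Pro" if score >= 5 else "Starter"
    template = TEMPLATES.get(intent, TEMPLATES["general_inquiry"])
    return template.format(name=name, company=company,
                           intent_summary=intent_summary, plan=plan)
```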

568 lines of Python. No ML. No dependencies beyond FastAPI.

The Results

| Metric          | LLM Version | Rule-Based              |
|-----------------|-------------|-------------------------|
| Response time   | 200-400ms   | 102ms                   |
| Cost per lead   | $0.002      | $0                      |
| Intent accuracy | ~95%        | ~85-90%                 |
| Dependencies    | API key     | None                    |
| Nuance          | Excellent   | Limited                 |
| Edge cases      | Graceful    | Needs explicit patterns |

The rule-based version is faster and cheaper. The LLM version is smarter.

When to Use Which

Rule-Based

When intents are predictable — you've seen 1,000 leads and know the 5 patterns that cover 95% of them. When templates work. When cost matters. When speed is critical. When you want zero external dependencies.

LLM

When intents are unpredictable. When context is complex and you need to extract subtle meaning. When quality matters more than cost. When responses need variety and natural language. When you're iterating fast and want to change behavior via prompt rather than code.

The Hybrid Approach

What I'd do in production:

  1. Start rule-based. Deploy fast, zero cost, learn the patterns.
  2. Log everything. Track which leads get misclassified.
  3. Add LLM for ambiguous cases. If the keyword score falls below a confidence threshold, call the LLM as a fallback (sketched after this list).
  4. Extract new patterns. When the LLM handles a case 50+ times, codify it into rules.
  5. Optimize cost. Keep pushing complexity into rules, reserve LLM for edge cases.
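
Step 3 is the interesting seam. Here's a sketch of that fallback, reusing the INTENT_SIGNALS table from the classification sketch above and assuming a hypothetical llm_classify helper that wraps the API call; the threshold value is illustrative:

```python
CONFIDENCE_THRESHOLD = 3  # illustrative cutoff; tune against logged misclassifications

def classify_with_fallback(message: str) -> tuple[str, str]:
    """Return (intent, source) so logs can track which path handled each lead."""
    text = message.lower()
    scores = {
        intent: sum(1 for kw in s["keywords"] if kw in text)
              + sum(3 for ph in s["phrases"] if ph in text)
        for intent, s in INTENT_SIGNALS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] >= CONFIDENCE_THRESHOLD:
        return best, "rules"
    # Ambiguous lead: fall back to the LLM, but degrade gracefully if it's unavailable.
    try:
        return llm_classify(message), "llm"   # hypothetical wrapper around the API call
    except Exception:
        return "general_inquiry", "rules-default"
```

Logging the source alongside the intent is what makes steps 2 and 4 possible: you can see which cases the LLM keeps handling and turn the frequent ones into rules.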

This gives you 90%+ accuracy, sub-$0.001 per lead cost, and graceful degradation if the LLM goes down.

The Lesson

Not every problem needs an LLM.

We're in an era where the default answer to "how should I build this?" is "throw an LLM at it." And LLMs are incredible — I use one for 90% of my work.

But sometimes the right answer is keyword matching, templates, and 568 lines of code.

The question isn't "LLM or rule-based?" It's "what's the simplest thing that works?"

Choose based on the problem, not the hype.


Both versions are on GitHub: TheAuroraAI/instant-lead-response
