Limitations
MicroResolve is a lexical decision engine: it matches words against learned term weights rather than understanding meaning. These properties have real consequences.
Cold start accuracy
With only a handful of seed phrases per intent, expect 60–75% exact match accuracy. The engine improves rapidly with corrections — CLINC150 goes from 74.8% to 90.3% in three correction rounds — but the cold start is real.
Mitigation: Use the LLM import pipeline to generate diverse seed phrases at setup time. 10–20 phrases per intent significantly improve cold-start accuracy.
Out-of-vocabulary terms
If a user writes something with no vocabulary overlap with any intent’s phrases, the engine will return no matches or low-confidence matches.
```python
# No phrases contain "refund" or "money back" → poor match
ns.resolve("I want my money back")  # low confidence if not trained
```
Mitigation: Add synonym phrases to the intent. The LLM import pipeline does this automatically for common terms.
Polysemy and ambiguity
Words shared across intents reduce confidence. “Cancel” appears in both cancel_order and cancel_subscription — without additional context words, the engine may be uncertain.
Mitigation: The scoring layer handles many cases automatically through IDF weighting. For persistent ambiguity, raise the gap parameter to require a stronger score separation before committing to a single intent.
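The interaction between IDF weighting and the gap check can be sketched in a few lines. Everything below is illustrative (toy seed data, and `score`/`resolve` helpers invented for this sketch, not the MicroResolve API); the point is that a term shared by every intent, like "cancel", earns zero IDF weight and stops driving the score, while the gap check refuses to commit when the top two intents are too close.

```python
import math

# Toy intent → seed-phrase mapping (illustrative, not real training data).
intents = {
    "cancel_order":        ["cancel my order", "cancel the order I placed"],
    "cancel_subscription": ["cancel my subscription", "cancel the monthly plan"],
}

# Document frequency: in how many intents does each term appear?
df = {}
for phrases in intents.values():
    for term in {t for p in phrases for t in p.split()}:
        df[term] = df.get(term, 0) + 1

# "cancel" appears in every intent → idf = log(2/2) = 0.
idf = {t: math.log(len(intents) / c) for t, c in df.items()}

def score(query):
    """Sum IDF weights of query terms found in each intent's vocabulary."""
    q = set(query.split())
    ranked = []
    for name, phrases in intents.items():
        vocab = {t for p in phrases for t in p.split()}
        ranked.append((name, sum(idf[t] for t in q & vocab)))
    return sorted(ranked, key=lambda x: x[1], reverse=True)

def resolve(query, gap=0.1):
    """Commit to the top intent only if it beats the runner-up by `gap`."""
    ranked = score(query)
    if ranked[0][1] - ranked[1][1] >= gap:
        return ranked[0][0]
    return None  # ambiguous: defer, ask a follow-up, or fall back

resolve("cancel")                  # → None ("cancel" alone carries no weight)
resolve("cancel my subscription")  # → "cancel_subscription"
```

Raising `gap` trades coverage for precision: more queries return ambiguous, but the ones that commit are more trustworthy.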
Not a semantic search engine
MicroResolve does not understand paraphrase, metaphor, or abstract language:
"I'm done with this service" → may not match cancel_subscription"my delivery is lost in space" → may not match track_orderMitigation: Use MicroResolve as a prefilter. High-confidence matches (score > 0.7) handle directly. Low-confidence queries fall through to an LLM.
100+ intent scale
At 100+ intents with overlapping terminology, exact-match accuracy decreases but recall stays high (94.7% in our MCP benchmark). The correct intent is usually in the top results.
Mitigation: Use top-K results with a secondary LLM pass for disambiguation. This is cheaper than full LLM classification on every query.
When to use MicroResolve
| Good fit | Not a good fit |
|---|---|
| Structured domains (support, e-commerce, tools) | Free-form creative or conversational queries |
| Known intent taxonomy | Fully open-ended intent discovery |
| Low-latency classification at the edge | One-off queries with no training data |
| Prefilter before LLM | Cases requiring deep semantic understanding |
| Cost-sensitive at scale | Small volume where LLM cost does not matter |
Next
- Benchmarks — measured accuracy under real conditions
- Threshold Tuning — reduce false positives without sacrificing recall
- Concepts — how the scoring model works