The LuBot Cascade
4-Tier Intent Routing — 95% of decisions cost zero LLM tokens
Watch the Cascade Route a Query
Example query types: correlation query, greeting, data query, capabilities, follow-up, complex reasoning, web search, PhD analysis, concentration risk, aggregation, what-if scenario, root cause, identity, anomaly, forecast
Without the Cascade, every query costs ~300 tokens. With the Cascade, only the ~5% of queries that reach the Tier 3 LLM fallback spend any LLM tokens.
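The savings claim can be checked with a little arithmetic. The sketch below uses the two figures stated on this page (~300 tokens per query, 95% token-free) and assumes, for illustration only, that a query falling through to the LLM still costs the full ~300 tokens. The function and constant names are hypothetical.

```python
def expected_tokens_per_query(baseline_tokens: float, token_free_rate: float) -> float:
    """Average LLM tokens per query when only the non-token-free share reaches the LLM."""
    return baseline_tokens * (1.0 - token_free_rate)


BASELINE = 300     # ~300 tokens per query without the Cascade (figure from the page)
TOKEN_FREE = 0.95  # 95% of decisions cost zero LLM tokens (figure from the page)

# Assumption: a query that reaches the LLM still costs the full baseline.
with_cascade = expected_tokens_per_query(BASELINE, TOKEN_FREE)
saved = BASELINE - with_cascade

print(f"avg tokens/query with Cascade: {with_cascade:.0f}")
print(f"avg tokens saved per query:    {saved:.0f}")
```

Under those assumptions the average routing cost drops from about 300 tokens to about 15 tokens per query.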
Tier 0 - Deterministic Detection
0ms – Regex
PhD Analysis
Correlation
Concentration
Anomaly
Web Search Fast-Path
not caught → pass down
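A minimal sketch of what a Tier 0 regex pass might look like. The patterns and the `tier0_detect` name are assumptions made for illustration; the actual LuBot rules are not shown on this page. The key property is that a miss returns `None`, so the query passes down to the next tier.

```python
import re
from typing import Optional

# Hypothetical patterns; the real Tier 0 rules are not shown in the source.
TIER0_PATTERNS = [
    ("PHD_ANALYSIS", re.compile(r"\bphd\b.*\banalysis\b", re.I)),
    ("CORRELATION", re.compile(r"\bcorrelat(?:e|es|ed|ion|ions)\b", re.I)),
    ("CONCENTRATION", re.compile(r"\bconcentration\b", re.I)),
    ("ANOMALY", re.compile(r"\b(?:anomal(?:y|ies)|outliers?)\b", re.I)),
    ("WEB_SEARCH", re.compile(r"^\s*(?:search the web|google)\b", re.I)),
]


def tier0_detect(query: str) -> Optional[str]:
    """Return the first matching Tier 0 label, or None to pass the query down."""
    for label, pattern in TIER0_PATTERNS:
        if pattern.search(query):
            return label
    return None


for q in ["Is there a correlation between price and volume?", "hello there"]:
    print(q, "->", tier0_detect(q))
```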
Tier 1 - Core Intents (80% of queries)
0ms – Pattern Match
GREETING
IDENTITY
CAPABILITIES
MEMORY_RECALL
DATA_QUERY
WEB_SEARCH
DOCUMENT_QA
PREDICTION
DATA_LIBRARY
DOCUMENT_GENERATION
not caught → semantic search
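Tier 1 could be sketched as a phrase lookup over the core intents. The trigger phrases below are invented for illustration and cover only a few of the intents listed above; `tier1_match` is a hypothetical name, not the project's API.

```python
import re
from typing import Optional

# Hypothetical trigger phrases for a few of the Tier 1 intents.
TIER1_PHRASES = {
    "GREETING": ("hello", "hi", "hey", "good morning"),
    "IDENTITY": ("who are you", "what are you"),
    "CAPABILITIES": ("what can you do",),
    "MEMORY_RECALL": ("do you remember", "what did i say"),
    "PREDICTION": ("forecast", "predict"),
}


def tier1_match(query: str) -> Optional[str]:
    """Return a core intent if a trigger phrase matches, else None."""
    words = re.findall(r"[a-z']+", query.lower())
    text = " ".join(words)
    for intent, phrases in TIER1_PHRASES.items():
        for phrase in phrases:
            # Single words must equal a whole token (so "hi" does not match "this");
            # multi-word phrases are matched against the normalized text.
            if phrase in words or (" " in phrase and phrase in text):
                return intent
    return None


for q in ["Hi, there", "Who are you?", "Explain this tradeoff"]:
    print(q, "->", tier1_match(q))
```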
Tier 2 - NVIDIA Embeddings (15% of queries)
5ms – Semantic
ADVICE_REQUEST
FOLLOWUP
CLARIFICATION
DEEP_DIVE
COMPARISON
still ambiguous → call LLM
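The semantic tier can be illustrated with a nearest-exemplar search under a similarity threshold. This sketch does not call NVIDIA's embedding service: `toy_embed` is a bag-of-words stand-in so the example runs offline, and the exemplar sentences and the 0.35 threshold are assumptions. With a real embedding model, only `toy_embed` would change.

```python
import math
import re
from collections import Counter
from typing import Optional


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical exemplar utterances, one per Tier 2 intent.
EXEMPLARS = {
    "ADVICE_REQUEST": "what should i do what do you recommend",
    "FOLLOWUP": "and what about last year tell me more about that",
    "CLARIFICATION": "what do you mean by that can you explain",
    "DEEP_DIVE": "dig deeper into the details break it down further",
    "COMPARISON": "compare these two versus which is better",
}
EXEMPLAR_VECS = {intent: toy_embed(text) for intent, text in EXEMPLARS.items()}


def tier2_match(query: str, threshold: float = 0.35) -> Optional[str]:
    """Return the closest intent if it clears the threshold, else None (call the LLM)."""
    q = toy_embed(query)
    best_intent, best_score = None, 0.0
    for intent, vec in EXEMPLAR_VECS.items():
        score = cosine(q, vec)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


for q in ["compare these two options", "xyzzy plugh"]:
    print(q, "->", tier2_match(q))
```

The threshold is what makes the tier safe to place before the LLM: a weak best match is treated as "still ambiguous" instead of being forced into an intent.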
Tier 3 - NVIDIA LLM Fallback (5% of queries)
100ms – LLM
Ambiguous Queries
Complex Multi-Intent
Edge Cases
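Putting the tiers together, the Cascade amounts to trying each detector in order and calling the LLM only when all of them return nothing. The sketch below uses deliberately tiny stand-in detectors and a stubbed LLM; every name here (`route`, `Decision`, the `toy_*` functions) is hypothetical, and the stub does not represent the NVIDIA LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Detector = Callable[[str], Optional[str]]


@dataclass
class Decision:
    intent: str
    tier: str
    used_llm: bool


def route(query: str, tiers: List[Tuple[str, Detector]], llm_fallback: Callable[[str], str]) -> Decision:
    """Try each tier in order; fall back to the LLM only if every tier passes."""
    for name, detect in tiers:
        intent = detect(query)
        if intent is not None:
            return Decision(intent, name, used_llm=False)
    return Decision(llm_fallback(query), "tier3_llm", used_llm=True)


# Toy stand-ins for the real tiers, for illustration only.
def toy_tier0(q: str) -> Optional[str]:
    return "CORRELATION" if "correlation" in q.lower() else None


def toy_tier1(q: str) -> Optional[str]:
    return "GREETING" if q.lower().strip("!. ") in {"hi", "hello"} else None


def toy_tier2(q: str) -> Optional[str]:
    return "COMPARISON" if "compare" in q.lower() else None


def toy_llm(q: str) -> str:
    return "AMBIGUOUS"  # a real system would call the LLM here


TIERS = [("tier0_regex", toy_tier0), ("tier1_patterns", toy_tier1), ("tier2_semantic", toy_tier2)]
QUERIES = ["Hello!", "correlation of price and volume", "compare north and south", "hmm, maybe both?"]

decisions = [route(q, TIERS, toy_llm) for q in QUERIES]
token_free_rate = sum(not d.used_llm for d in decisions) / len(decisions)

for q, d in zip(QUERIES, decisions):
    print(f"{q!r} -> {d.intent} via {d.tier}")
print(f"token-free rate: {token_free_rate:.0%}")
```

Ordering the tiers from cheapest to most expensive is the design choice that keeps most decisions token-free: the LLM is reached only by queries that no cheaper tier could classify.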