The LuBot Cascade

Without the Cascade, every query costs ~300 tokens. With the Cascade, queries resolved by the deterministic tiers are zero-token decisions.
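As a rough illustration of the savings, the sketch below combines the ~300-token figure with the tier shares quoted in the tier list (80% / 15% / 5%). It assumes that only the LLM fallback tier spends tokens and that a fallback call costs the same ~300 tokens; neither assumption is confirmed by the page, and the function names are hypothetical.

```python
# Figures taken from the page: ~300 tokens per query without the Cascade,
# and tier shares of 80% / 15% / 5% for Tiers 1, 2 and 3.
TOKENS_PER_LLM_CALL = 300
TIER_SHARE_PERCENT = {"tier1_pattern": 80, "tier2_embedding": 15, "tier3_llm": 5}


def expected_tokens_per_query(share_percent, tokens_per_llm_call):
    """Average routing cost per query.

    Assumption: only the LLM fallback tier spends tokens.
    """
    return share_percent["tier3_llm"] * tokens_per_llm_call / 100


def token_free_rate_percent(share_percent):
    """Share of queries routed without any LLM call (same assumption)."""
    return 100 - share_percent["tier3_llm"]


average = expected_tokens_per_query(TIER_SHARE_PERCENT, TOKENS_PER_LLM_CALL)
saved = TOKENS_PER_LLM_CALL - average
print(f"average tokens per query: {average}")
print(f"tokens saved per query:   {saved}")
print(f"token-free rate:          {token_free_rate_percent(TIER_SHARE_PERCENT)}%")
```

Under these assumptions the average routing cost falls from ~300 tokens to about 15 tokens per query.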
Tier 0 - Deterministic Detection
0ms – Regex
PhD Analysis, Correlation, Concentration, Anomaly, Web Search Fast-Path
not caught → pass down
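A minimal sketch of what a deterministic regex tier could look like. The page names the Tier 0 detectors but does not show their patterns, so every regex and label string below is a hypothetical stand-in, and `tier0_detect` is an illustrative name rather than LuBot's actual function.

```python
import re

# Hypothetical patterns: the page lists the detectors, not their regexes.
TIER0_PATTERNS = [
    ("CORRELATION", re.compile(r"\bcorrelat(?:e|es|ed|ion|ions)\b", re.I)),
    ("CONCENTRATION", re.compile(r"\bconcentrat(?:e|ed|ion)\b", re.I)),
    ("ANOMALY", re.compile(r"\b(?:anomal(?:y|ies|ous)|outliers?)\b", re.I)),
    ("WEB_SEARCH_FAST_PATH", re.compile(r"^\s*(?:search the web|google)\b", re.I)),
]


def tier0_detect(query):
    """Return the first matching label, or None to pass the query down."""
    for label, pattern in TIER0_PATTERNS:
        if pattern.search(query):
            return label
    return None
```

Returning `None` models the "not caught → pass down" step: the caller hands the query to the next tier.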
Tier 1 - Core Intents (80% of queries)
0ms – Pattern Match
GREETING, IDENTITY, CAPABILITIES, MEMORY_RECALL, DATA_QUERY, WEB_SEARCH, DOCUMENT_QA, PREDICTION, DATA_LIBRARY, DOCUMENT_GENERATION
not caught → semantic search
Tier 2 - NVIDIA Embeddings (15% of queries)
5ms – Semantic
ADVICE_REQUEST, FOLLOWUP, CLARIFICATION, DEEP_DIVE, COMPARISON
still ambiguous → call LLM
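The semantic tier can be sketched as nearest-prototype matching with a similarity threshold. This example does not call NVIDIA's embedding service: `toy_embed` is a bag-of-words stand-in so the sketch runs on its own. The example phrases and the 0.5 threshold are assumptions for illustration.

```python
import math
from collections import Counter


def toy_embed(text):
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical prototype phrases, one per intent.
PROTOTYPES = {
    "ADVICE_REQUEST": toy_embed("what should i do about this"),
    "COMPARISON": toy_embed("compare this with that"),
    "DEEP_DIVE": toy_embed("tell me more detail about this"),
}


def tier2_match(query, threshold=0.5):
    """Return the closest intent, or None if the best score is below threshold."""
    q = toy_embed(query)
    scored = [(intent, cosine(q, proto)) for intent, proto in PROTOTYPES.items()]
    best_intent, best_score = max(scored, key=lambda pair: pair[1])
    return best_intent if best_score >= threshold else None
```

A `None` result corresponds to "still ambiguous", which is the point where the query would go on to the LLM fallback.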
Tier 3 - NVIDIA LLM Fallback (5% of queries)
100ms – LLM
Ambiguous Queries, Complex Multi-Intent, Edge Cases
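Taken together, the four tiers form a first-match-wins chain. The sketch below shows that control flow only. Each tier function is a deliberately tiny stub: the Tier 2 stub uses a keyword test in place of embeddings, and the Tier 3 stub makes no LLM call. The token costs (0 for Tiers 0 to 2, 300 for Tier 3) are assumptions based on the ~300-token figure above, and all names are hypothetical.

```python
import re
from typing import Callable, List, NamedTuple, Optional, Tuple


class Decision(NamedTuple):
    intent: str
    tier: int    # index of the tier that caught the query
    tokens: int  # assumed LLM tokens spent on routing


Classifier = Callable[[str], Optional[str]]


def tier0(query: str) -> Optional[str]:
    # Stub for deterministic regex detection.
    return "CORRELATION" if re.search(r"\bcorrelation\b", query, re.I) else None


def tier1(query: str) -> Optional[str]:
    # Stub for core-intent pattern matching.
    return "GREETING" if re.match(r"\s*(hi|hello|hey)\b", query, re.I) else None


def tier2(query: str) -> Optional[str]:
    # Stub standing in for embedding similarity.
    return "COMPARISON" if " versus " in f" {query.lower()} " else None


def tier3(query: str) -> Optional[str]:
    # Stub standing in for the LLM fallback; it always returns an answer.
    return "LLM_CLASSIFIED"


# (classifier, assumed token cost) in cascade order.
TIERS: List[Tuple[Classifier, int]] = [(tier0, 0), (tier1, 0), (tier2, 0), (tier3, 300)]


def route(query: str, tiers: List[Tuple[Classifier, int]] = TIERS) -> Decision:
    """Try each tier in order and stop at the first one that returns an intent."""
    for index, (classify, token_cost) in enumerate(tiers):
        intent = classify(query)
        if intent is not None:
            return Decision(intent, index, token_cost)
    raise ValueError("no tier handled the query")


for q in ["hello there", "revenue versus cost", "hmm, not sure what I need"]:
    print(q, "->", route(q))
```

Ordering the tiers from cheapest to most expensive means tokens are spent only on queries that reach the last tier.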