Last Tuesday at 3 AM, our AI system started hallucinating product prices. Instead of "$29.99", it confidently told customers items cost "approximately the GDP of Luxembourg." Here's how we built error boundaries that prevented a PR disaster.
The Problem: LLMs Fail in Creative Ways
Traditional software fails predictably. Null pointer? Exception. Network timeout? Error code. But LLMs? They fail like improv comedians having a bad night.
Real failures from our production logs:
- Model starts speaking in Base64 mid-response
- Infinite loops of "I understand you want me to..."
- Sudden language switches (English query → Mandarin response)
- Hallucinated API endpoints that sound plausible
- Token limits hit mid-senten
You can't catch these with try/catch. You need error boundaries.
React's Error Boundaries: The Inspiration
React introduced Error Boundaries to prevent one component's failure from crashing the entire app. The concept is brilliant: isolate failures, provide fallbacks, maintain system stability.
// React Error Boundary
class ErrorBoundary extends React.Component {
state = { hasError: false };
static getDerivedStateFromError(error) {
return { hasError: true };
}
componentDidCatch(error, errorInfo) {
logErrorToService(error, errorInfo);
}
render() {
if (this.state.hasError) {
return Something went wrong.
;
}
return this.props.children;
}
}
I realized: AI systems need the same pattern, but for different failure modes.
The AI Error Boundary Pattern
Here's the core pattern that's saved us countless times:
class AIErrorBoundary {
constructor(config) {
this.validators = config.validators || [];
this.fallbacks = config.fallbacks || [];
this.monitors = config.monitors || [];
this.maxRetries = config.maxRetries || 3;
}
async execute(operation, context) {
let lastError;
for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
try {
// Pre-execution validation
await this.validateInput(operation.input, context);
// Execute with monitoring
const result = await this.executeWithMonitoring(operation, context);
// Post-execution validation
await this.validateOutput(result, context);
return result;
} catch (error) {
lastError = error;
// Classify the error
const errorType = this.classifyError(error);
// Try recovery strategies
const recovered = await this.attemptRecovery(errorType, error, context);
if (recovered) return recovered;
// Use fallback if available
const fallback = this.selectFallback(errorType, context);
if (fallback && attempt === this.maxRetries) {
return await fallback.execute(context);
}
}
}
throw new AIBoundaryError('All recovery attempts failed', lastError);
}
classifyError(error) {
if (error.code === 'rate_limit_exceeded') return 'RATE_LIMIT';
if (error.message?.includes('timeout')) return 'TIMEOUT';
if (error.response?.includes('```')) return 'FORMAT_ERROR';
if (this.isHallucination(error)) return 'HALLUCINATION';
if (this.isInfiniteLoop(error)) return 'INFINITE_LOOP';
return 'UNKNOWN';
}
isHallucination(error) {
const patterns = [
/\$\d{10,}/, // Absurd prices
/year \d{5,}/, // Far future years
/[A-Z]{20,}/, // Long uppercase sequences
/base64:[A-Za-z0-9+/=]{50,}/ // Base64 in text
];
return patterns.some(p => p.test(error.response));
}
}
Layer 1: Input Validation Boundaries
Stop garbage before it reaches the model:
class InputBoundary {
constructor() {
this.rules = new Map();
this.stats = new Map();
}
addRule(name, validator, sanitizer) {
this.rules.set(name, { validator, sanitizer });
}
async validate(input, context) {
const violations = [];
for (const [name, rule] of this.rules) {
try {
const isValid = await rule.validator(input, context);
if (!isValid) {
violations.push({
rule: name,
severity: this.calculateSeverity(name, input)
});
}
} catch (error) {
// Validator itself failed - this is critical
violations.push({
rule: name,
severity: 'critical',
error: error.message
});
}
}
if (violations.some(v => v.severity === 'critical')) {
throw new ValidationError('Critical validation failure', violations);
}
// Attempt to sanitize non-critical issues
return this.sanitize(input, violations);
}
sanitize(input, violations) {
let sanitized = { ...input };
for (const violation of violations) {
const rule = this.rules.get(violation.rule);
if (rule.sanitizer) {
sanitized = rule.sanitizer(sanitized);
}
}
return sanitized;
}
}
// Usage
const inputBoundary = new InputBoundary();
inputBoundary.addRule('prompt_injection',
(input) => !input.text.includes('ignore previous instructions'),
(input) => ({
...input,
text: input.text.replace(/ignore previous instructions/gi, '')
})
);
inputBoundary.addRule('token_limit',
(input) => countTokens(input.text) < 4000,
(input) => ({
...input,
text: truncateToTokenLimit(input.text, 4000)
})
);
inputBoundary.addRule('language_consistency',
(input) => detectLanguage(input.text) === input.expectedLanguage,
(input) => ({
...input,
text: translateIfNeeded(input.text, input.expectedLanguage)
})
);
Layer 2: Execution Monitoring Boundaries
Watch for failures during execution:
class ExecutionBoundary {
constructor() {
this.monitors = [];
this.killSwitches = new Map();
}
async executeWithMonitoring(operation, context) {
const controller = new AbortController();
const monitoring = this.startMonitoring(controller, context);
try {
const result = await Promise.race([
operation.execute(controller.signal),
monitoring.failurePromise
]);
monitoring.stop();
return result;
} catch (error) {
monitoring.stop();
throw this.enhanceError(error, monitoring.metrics);
}
}
startMonitoring(controller, context) {
const metrics = {
startTime: Date.now(),
tokenCount: 0,
repetitions: new Map(),
languageChanges: 0
};
let failurePromise;
const failurePromiseExecutor = (_, reject) => {
// Token rate monitoring
const tokenMonitor = setInterval(() => {
if (metrics.tokenCount > context.tokenLimit) {
clearInterval(tokenMonitor);
controller.abort();
reject(new Error('Token limit exceeded during generation'));
}
}, 100);
// Infinite loop detection
const loopDetector = setInterval(() => {
const repeats = this.detectRepetitions(metrics.buffer);
if (repeats > 5) {
clearInterval(loopDetector);
controller.abort();
reject(new Error('Infinite loop detected'));
}
}, 500);
// Store cleanup functions
metrics.cleanup = () => {
clearInterval(tokenMonitor);
clearInterval(loopDetector);
};
};
failurePromise = new Promise(failurePromiseExecutor);
return {
failurePromise,
metrics,
stop: () => metrics.cleanup()
};
}
detectRepetitions(buffer) {
if (!buffer || buffer.length < 100) return 0;
// Look for repeated phrases
const phrases = buffer.match(/.{20,50}/g) || [];
const counts = new Map();
for (const phrase of phrases) {
counts.set(phrase, (counts.get(phrase) || 0) + 1);
}
return Math.max(...counts.values());
}
}
Layer 3: Output Validation Boundaries
Catch hallucinations and format errors:
class OutputBoundary {
constructor(config) {
this.validators = config.validators || [];
this.confidence = config.confidenceThreshold || 0.8;
}
async validate(output, context) {
const scores = await Promise.all(
this.validators.map(v => v.validate(output, context))
);
const overallScore = scores.reduce((a, b) => a + b, 0) / scores.length;
if (overallScore < this.confidence) {
throw new ValidationError(`Output confidence too low: ${overallScore}`);
}
return this.transform(output, scores);
}
transform(output, scores) {
// Apply transformations based on validation results
let transformed = output;
// Example: Redact potential hallucinations
if (scores.hallucinationScore < 0.5) {
transformed = this.redactSuspiciousContent(transformed);
}
// Example: Fix formatting issues
if (scores.formatScore < 0.7) {
transformed = this.fixFormatting(transformed);
}
return transformed;
}
redactSuspiciousContent(output) {
const suspicious = [
/\$[\d,]+\.\d{2}(?=\s*(?:billion|trillion|quadrillion))/gi,
/(?:API|endpoint|URL):\s*[^\s]+_fake_[^\s]+/gi,
/[A-Z]{30,}/g,
/(\w+\s+){5,}\1{5,}/g // Repeated phrases
];
let redacted = output;
for (const pattern of suspicious) {
redacted = redacted.replace(pattern, '[REDACTED]');
}
return redacted;
}
}
// Specialized validators
class HallucinationDetector {
async validate(output, context) {
const checks = [
this.checkFactualAccuracy(output, context),
this.checkNumericReasonability(output),
this.checkTemporalConsistency(output),
this.checkEntityConsistency(output, context)
];
const results = await Promise.all(checks);
return results.reduce((a, b) => a + b, 0) / results.length;
}
checkNumericReasonability(output) {
const numbers = output.match(/\$?[\d,]+\.?\d*/g) || [];
for (const num of numbers) {
const value = parseFloat(num.replace(/[$,]/g, ''));
// Flag suspiciously large numbers
if (value > 1000000000) return 0.3;
// Flag too many decimal places
if (num.includes('.') && num.split('.')[1].length > 2) return 0.5;
}
return 1.0;
}
}
Layer 4: Fallback Strategies
When all else fails, degrade gracefully:
class FallbackChain {
constructor(strategies) {
this.strategies = strategies;
}
async execute(context, originalError) {
const errors = [originalError];
for (const strategy of this.strategies) {
try {
if (await strategy.canHandle(context, errors)) {
const result = await strategy.execute(context);
// Validate fallback result too
if (await this.validateFallback(result, context)) {
return {
result,
fallbackUsed: strategy.name,
degraded: true
};
}
}
} catch (error) {
errors.push(error);
}
}
// Ultimate fallback
return this.ultimateFallback(context, errors);
}
ultimateFallback(context, errors) {
return {
result: {
message: "I'm having trouble processing this request. Please try again or contact support.",
errorId: generateErrorId(),
degraded: true,
fallbackUsed: 'ultimate'
},
errors
};
}
}
// Example fallback strategies
const cacheFallback = {
name: 'cache',
canHandle: async (context) => {
return await cache.hasSimilar(context.input);
},
execute: async (context) => {
const cached = await cache.getSimilar(context.input);
return {
...cached,
confidence: 0.7,
note: 'Using cached response for similar query'
};
}
};
const simplifierFallback = {
name: 'simplifier',
canHandle: async (context, errors) => {
return errors.some(e => e.type === 'COMPLEXITY');
},
execute: async (context) => {
const simplified = await simplifyQuery(context.input);
return await executeWithSimpleModel(simplified);
}
};
const templateFallback = {
name: 'template',
canHandle: async (context) => {
return templateMatcher.hasMatch(context.input);
},
execute: async (context) => {
const template = templateMatcher.match(context.input);
return template.fill(context);
}
};
Real-World Implementation
Here's how it all comes together in production:
class ProductionAIService {
constructor() {
this.boundary = new AIErrorBoundary({
validators: [
new InputBoundary(),
new OutputBoundary({
validators: [
new HallucinationDetector(),
new FormatValidator(),
new SafetyValidator()
]
})
],
fallbacks: new FallbackChain([
cacheFallback,
simplifierFallback,
templateFallback
]),
monitors: [
new LatencyMonitor(),
new CostMonitor(),
new QualityMonitor()
],
maxRetries: 3
});
}
async query(input, options = {}) {
const context = {
...options,
timestamp: Date.now(),
requestId: generateRequestId(),
userId: options.userId,
tokenLimit: options.tokenLimit || 2000,
confidenceThreshold: options.confidenceThreshold || 0.85
};
try {
const result = await this.boundary.execute(
{
input,
execute: async (signal) => {
return await this.llmClient.complete({
prompt: this.buildPrompt(input),
maxTokens: context.tokenLimit,
temperature: 0.7,
signal
});
}
},
context
);
await this.logSuccess(context, result);
return result;
} catch (error) {
await this.logFailure(context, error);
// Return safe error response
return {
error: true,
message: this.getSafeErrorMessage(error),
requestId: context.requestId,
degraded: true
};
}
}
getSafeErrorMessage(error) {
// Never expose internal errors to users
const safeMessages = {
'RATE_LIMIT': 'Service is busy. Please try again in a moment.',
'TIMEOUT': 'Request took too long. Please try a simpler query.',
'HALLUCINATION': 'Unable to provide accurate information for this query.',
'FORMAT_ERROR': 'Having trouble formatting the response properly.',
'VALIDATION_ERROR': 'Please rephrase your question.',
'UNKNOWN': 'Something went wrong. Please try again.'
};
return safeMessages[error.type] || safeMessages.UNKNOWN;
}
}
Monitoring & Learning
Error boundaries generate valuable data:
class ErrorBoundaryAnalytics {
async analyze(timeRange) {
const errors = await this.fetchErrors(timeRange);
return {
// Failure patterns
topFailureModes: this.groupBy(errors, 'type'),
// Recovery success rates
recoveryRates: this.calculateRecoveryRates(errors),
// Fallback effectiveness
fallbackSuccess: this.analyzeFallbacks(errors),
// Cost impact
costSavings: this.calculateCostSavings(errors),
// User impact
userImpact: this.assessUserImpact(errors),
// Improvement opportunities
recommendations: this.generateRecommendations(errors)
};
}
generateRecommendations(errors) {
const recommendations = [];
// High hallucination rate?
const hallucinationRate = errors.filter(e => e.type === 'HALLUCINATION').length / errors.length;
if (hallucinationRate > 0.05) {
recommendations.push({
priority: 'high',
type: 'model_tuning',
message: 'Consider fine-tuning or prompt engineering to reduce hallucinations',
estimatedImpact: '60% reduction in hallucination errors'
});
}
// Many timeout errors?
const timeoutRate = errors.filter(e => e.type === 'TIMEOUT').length / errors.length;
if (timeoutRate > 0.1) {
recommendations.push({
priority: 'medium',
type: 'infrastructure',
message: 'Increase timeout limits or optimize model serving',
estimatedImpact: '80% reduction in timeout errors'
});
}
return recommendations;
}
}
Results: 99.97% Uptime
After implementing error boundaries:
- Uptime: 99.92% → 99.97%
- User-facing errors: 1,247/day → 23/day
- Cost from retries: -$4,200/month (fewer panic retries)
- Support tickets: -78% (clearer error messages)
- Recovery rate: 94% of errors handled automatically
Key Lessons
- Fail fast at the boundary - Don't let bad data propagate
- Monitor everything - You can't fix what you can't see
- Degrade gracefully - Some response > no response
- Learn from failures - Each error makes the system stronger
- Test chaos scenarios - Intentionally break things in dev
Implementation Checklist
Start here:
- □ Implement input validation for common attacks
- □ Add timeout protection with AbortController
- □ Build hallucination detection for your domain
- □ Create cache-based fallbacks
- □ Set up error analytics dashboard
- □ Test with chaos engineering
- □ Document error scenarios for ops team
Remember: LLMs are powerful but unpredictable. Error boundaries turn that unpredictability from a liability into a manageable risk.