Error Boundaries for AI: What React Taught Me About LLM Failures

Last Tuesday at 3 AM, our AI system started hallucinating product prices. Instead of "$29.99", it confidently told customers items cost "approximately the GDP of Luxembourg." Here's how we built error boundaries that prevented a PR disaster.

The Problem: LLMs Fail in Creative Ways

Traditional software fails predictably. Null pointer? Exception. Network timeout? Error code. But LLMs? They fail like improv comedians having a bad night.

Real failures from our production logs:

Model starts speaking in Base64 mid-response
Infinite loops of "I understand you want me to..."
Sudden language switches (English query → Mandarin response)
Hallucinated API endpoints that sound plausible
Token limits hit mid-senten

You can't catch these with try/catch. You need error boundaries.

React's Error Boundaries: The Inspiration

React introduced Error Boundaries to prevent one component's failure from crashing the entire app. The concept is brilliant: isolate failures, provide fallbacks, maintain system stability.

// React Error Boundary
class ErrorBoundary extends React.Component {
  state = { hasError: false };
  
  static getDerivedStateFromError(error) {
    return { hasError: true };
  }
  
  componentDidCatch(error, errorInfo) {
    logErrorToService(error, errorInfo);
  }
  
  render() {
    if (this.state.hasError) {
      return Something went wrong.;
    }
    return this.props.children;
  }
}

I realized: AI systems need the same pattern, but for different failure modes.

The AI Error Boundary Pattern

Here's the core pattern that's saved us countless times:

class AIErrorBoundary {
  constructor(config) {
    this.validators = config.validators || [];
    this.fallbacks = config.fallbacks || [];
    this.monitors = config.monitors || [];
    this.maxRetries = config.maxRetries || 3;
  }

  async execute(operation, context) {
    let lastError;
    
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        // Pre-execution validation
        await this.validateInput(operation.input, context);
        
        // Execute with monitoring
        const result = await this.executeWithMonitoring(operation, context);
        
        // Post-execution validation
        await this.validateOutput(result, context);
        
        return result;
      } catch (error) {
        lastError = error;
        
        // Classify the error
        const errorType = this.classifyError(error);
        
        // Try recovery strategies
        const recovered = await this.attemptRecovery(errorType, error, context);
        if (recovered) return recovered;
        
        // Use fallback if available
        const fallback = this.selectFallback(errorType, context);
        if (fallback && attempt === this.maxRetries) {
          return await fallback.execute(context);
        }
      }
    }
    
    throw new AIBoundaryError('All recovery attempts failed', lastError);
  }

  classifyError(error) {
    if (error.code === 'rate_limit_exceeded') return 'RATE_LIMIT';
    if (error.message?.includes('timeout')) return 'TIMEOUT';
    if (error.response?.includes('```')) return 'FORMAT_ERROR';
    if (this.isHallucination(error)) return 'HALLUCINATION';
    if (this.isInfiniteLoop(error)) return 'INFINITE_LOOP';
    return 'UNKNOWN';
  }

  isHallucination(error) {
    const patterns = [
      /\$\d{10,}/,           // Absurd prices
      /year \d{5,}/,         // Far future years
      /[A-Z]{20,}/,          // Long uppercase sequences
      /base64:[A-Za-z0-9+/=]{50,}/ // Base64 in text
    ];
    
    return patterns.some(p => p.test(error.response));
  }
}

Layer 1: Input Validation Boundaries

Stop garbage before it reaches the model:

class InputBoundary {
  constructor() {
    this.rules = new Map();
    this.stats = new Map();
  }

  addRule(name, validator, sanitizer) {
    this.rules.set(name, { validator, sanitizer });
  }

  async validate(input, context) {
    const violations = [];
    
    for (const [name, rule] of this.rules) {
      try {
        const isValid = await rule.validator(input, context);
        if (!isValid) {
          violations.push({
            rule: name,
            severity: this.calculateSeverity(name, input)
          });
        }
      } catch (error) {
        // Validator itself failed - this is critical
        violations.push({
          rule: name,
          severity: 'critical',
          error: error.message
        });
      }
    }

    if (violations.some(v => v.severity === 'critical')) {
      throw new ValidationError('Critical validation failure', violations);
    }

    // Attempt to sanitize non-critical issues
    return this.sanitize(input, violations);
  }

  sanitize(input, violations) {
    let sanitized = { ...input };
    
    for (const violation of violations) {
      const rule = this.rules.get(violation.rule);
      if (rule.sanitizer) {
        sanitized = rule.sanitizer(sanitized);
      }
    }
    
    return sanitized;
  }
}

// Usage
const inputBoundary = new InputBoundary();

inputBoundary.addRule('prompt_injection', 
  (input) => !input.text.includes('ignore previous instructions'),
  (input) => ({
    ...input,
    text: input.text.replace(/ignore previous instructions/gi, '')
  })
);

inputBoundary.addRule('token_limit',
  (input) => countTokens(input.text) < 4000,
  (input) => ({
    ...input,
    text: truncateToTokenLimit(input.text, 4000)
  })
);

inputBoundary.addRule('language_consistency',
  (input) => detectLanguage(input.text) === input.expectedLanguage,
  (input) => ({
    ...input,
    text: translateIfNeeded(input.text, input.expectedLanguage)
  })
);

Layer 2: Execution Monitoring Boundaries

Watch for failures during execution:

class ExecutionBoundary {
  constructor() {
    this.monitors = [];
    this.killSwitches = new Map();
  }

  async executeWithMonitoring(operation, context) {
    const controller = new AbortController();
    const monitoring = this.startMonitoring(controller, context);
    
    try {
      const result = await Promise.race([
        operation.execute(controller.signal),
        monitoring.failurePromise
      ]);
      
      monitoring.stop();
      return result;
    } catch (error) {
      monitoring.stop();
      throw this.enhanceError(error, monitoring.metrics);
    }
  }

  startMonitoring(controller, context) {
    const metrics = {
      startTime: Date.now(),
      tokenCount: 0,
      repetitions: new Map(),
      languageChanges: 0
    };

    let failurePromise;
    const failurePromiseExecutor = (_, reject) => {
      // Token rate monitoring
      const tokenMonitor = setInterval(() => {
        if (metrics.tokenCount > context.tokenLimit) {
          clearInterval(tokenMonitor);
          controller.abort();
          reject(new Error('Token limit exceeded during generation'));
        }
      }, 100);

      // Infinite loop detection
      const loopDetector = setInterval(() => {
        const repeats = this.detectRepetitions(metrics.buffer);
        if (repeats > 5) {
          clearInterval(loopDetector);
          controller.abort();
          reject(new Error('Infinite loop detected'));
        }
      }, 500);

      // Store cleanup functions
      metrics.cleanup = () => {
        clearInterval(tokenMonitor);
        clearInterval(loopDetector);
      };
    };

    failurePromise = new Promise(failurePromiseExecutor);

    return {
      failurePromise,
      metrics,
      stop: () => metrics.cleanup()
    };
  }

  detectRepetitions(buffer) {
    if (!buffer || buffer.length < 100) return 0;
    
    // Look for repeated phrases
    const phrases = buffer.match(/.{20,50}/g) || [];
    const counts = new Map();
    
    for (const phrase of phrases) {
      counts.set(phrase, (counts.get(phrase) || 0) + 1);
    }
    
    return Math.max(...counts.values());
  }
}

Layer 3: Output Validation Boundaries

Catch hallucinations and format errors:

class OutputBoundary {
  constructor(config) {
    this.validators = config.validators || [];
    this.confidence = config.confidenceThreshold || 0.8;
  }

  async validate(output, context) {
    const scores = await Promise.all(
      this.validators.map(v => v.validate(output, context))
    );

    const overallScore = scores.reduce((a, b) => a + b, 0) / scores.length;

    if (overallScore < this.confidence) {
      throw new ValidationError(`Output confidence too low: ${overallScore}`);
    }

    return this.transform(output, scores);
  }

  transform(output, scores) {
    // Apply transformations based on validation results
    let transformed = output;

    // Example: Redact potential hallucinations
    if (scores.hallucinationScore < 0.5) {
      transformed = this.redactSuspiciousContent(transformed);
    }

    // Example: Fix formatting issues
    if (scores.formatScore < 0.7) {
      transformed = this.fixFormatting(transformed);
    }

    return transformed;
  }

  redactSuspiciousContent(output) {
    const suspicious = [
      /\$[\d,]+\.\d{2}(?=\s*(?:billion|trillion|quadrillion))/gi,
      /(?:API|endpoint|URL):\s*[^\s]+_fake_[^\s]+/gi,
      /[A-Z]{30,}/g,
      /(\w+\s+){5,}\1{5,}/g  // Repeated phrases
    ];

    let redacted = output;
    for (const pattern of suspicious) {
      redacted = redacted.replace(pattern, '[REDACTED]');
    }

    return redacted;
  }
}

// Specialized validators
class HallucinationDetector {
  async validate(output, context) {
    const checks = [
      this.checkFactualAccuracy(output, context),
      this.checkNumericReasonability(output),
      this.checkTemporalConsistency(output),
      this.checkEntityConsistency(output, context)
    ];

    const results = await Promise.all(checks);
    return results.reduce((a, b) => a + b, 0) / results.length;
  }

  checkNumericReasonability(output) {
    const numbers = output.match(/\$?[\d,]+\.?\d*/g) || [];
    
    for (const num of numbers) {
      const value = parseFloat(num.replace(/[$,]/g, ''));
      
      // Flag suspiciously large numbers
      if (value > 1000000000) return 0.3;
      
      // Flag too many decimal places
      if (num.includes('.') && num.split('.')[1].length > 2) return 0.5;
    }
    
    return 1.0;
  }
}

Layer 4: Fallback Strategies

When all else fails, degrade gracefully:

class FallbackChain {
  constructor(strategies) {
    this.strategies = strategies;
  }

  async execute(context, originalError) {
    const errors = [originalError];

    for (const strategy of this.strategies) {
      try {
        if (await strategy.canHandle(context, errors)) {
          const result = await strategy.execute(context);
          
          // Validate fallback result too
          if (await this.validateFallback(result, context)) {
            return {
              result,
              fallbackUsed: strategy.name,
              degraded: true
            };
          }
        }
      } catch (error) {
        errors.push(error);
      }
    }

    // Ultimate fallback
    return this.ultimateFallback(context, errors);
  }

  ultimateFallback(context, errors) {
    return {
      result: {
        message: "I'm having trouble processing this request. Please try again or contact support.",
        errorId: generateErrorId(),
        degraded: true,
        fallbackUsed: 'ultimate'
      },
      errors
    };
  }
}

// Example fallback strategies
const cacheFallback = {
  name: 'cache',
  canHandle: async (context) => {
    return await cache.hasSimilar(context.input);
  },
  execute: async (context) => {
    const cached = await cache.getSimilar(context.input);
    return {
      ...cached,
      confidence: 0.7,
      note: 'Using cached response for similar query'
    };
  }
};

const simplifierFallback = {
  name: 'simplifier',
  canHandle: async (context, errors) => {
    return errors.some(e => e.type === 'COMPLEXITY');
  },
  execute: async (context) => {
    const simplified = await simplifyQuery(context.input);
    return await executeWithSimpleModel(simplified);
  }
};

const templateFallback = {
  name: 'template',
  canHandle: async (context) => {
    return templateMatcher.hasMatch(context.input);
  },
  execute: async (context) => {
    const template = templateMatcher.match(context.input);
    return template.fill(context);
  }
};

Real-World Implementation

Here's how it all comes together in production:

class ProductionAIService {
  constructor() {
    this.boundary = new AIErrorBoundary({
      validators: [
        new InputBoundary(),
        new OutputBoundary({
          validators: [
            new HallucinationDetector(),
            new FormatValidator(),
            new SafetyValidator()
          ]
        })
      ],
      fallbacks: new FallbackChain([
        cacheFallback,
        simplifierFallback,
        templateFallback
      ]),
      monitors: [
        new LatencyMonitor(),
        new CostMonitor(),
        new QualityMonitor()
      ],
      maxRetries: 3
    });
  }

  async query(input, options = {}) {
    const context = {
      ...options,
      timestamp: Date.now(),
      requestId: generateRequestId(),
      userId: options.userId,
      tokenLimit: options.tokenLimit || 2000,
      confidenceThreshold: options.confidenceThreshold || 0.85
    };

    try {
      const result = await this.boundary.execute(
        {
          input,
          execute: async (signal) => {
            return await this.llmClient.complete({
              prompt: this.buildPrompt(input),
              maxTokens: context.tokenLimit,
              temperature: 0.7,
              signal
            });
          }
        },
        context
      );

      await this.logSuccess(context, result);
      return result;

    } catch (error) {
      await this.logFailure(context, error);
      
      // Return safe error response
      return {
        error: true,
        message: this.getSafeErrorMessage(error),
        requestId: context.requestId,
        degraded: true
      };
    }
  }

  getSafeErrorMessage(error) {
    // Never expose internal errors to users
    const safeMessages = {
      'RATE_LIMIT': 'Service is busy. Please try again in a moment.',
      'TIMEOUT': 'Request took too long. Please try a simpler query.',
      'HALLUCINATION': 'Unable to provide accurate information for this query.',
      'FORMAT_ERROR': 'Having trouble formatting the response properly.',
      'VALIDATION_ERROR': 'Please rephrase your question.',
      'UNKNOWN': 'Something went wrong. Please try again.'
    };

    return safeMessages[error.type] || safeMessages.UNKNOWN;
  }
}

Monitoring & Learning

Error boundaries generate valuable data:

class ErrorBoundaryAnalytics {
  async analyze(timeRange) {
    const errors = await this.fetchErrors(timeRange);
    
    return {
      // Failure patterns
      topFailureModes: this.groupBy(errors, 'type'),
      
      // Recovery success rates
      recoveryRates: this.calculateRecoveryRates(errors),
      
      // Fallback effectiveness
      fallbackSuccess: this.analyzeFallbacks(errors),
      
      // Cost impact
      costSavings: this.calculateCostSavings(errors),
      
      // User impact
      userImpact: this.assessUserImpact(errors),
      
      // Improvement opportunities
      recommendations: this.generateRecommendations(errors)
    };
  }

  generateRecommendations(errors) {
    const recommendations = [];

    // High hallucination rate?
    const hallucinationRate = errors.filter(e => e.type === 'HALLUCINATION').length / errors.length;
    if (hallucinationRate > 0.05) {
      recommendations.push({
        priority: 'high',
        type: 'model_tuning',
        message: 'Consider fine-tuning or prompt engineering to reduce hallucinations',
        estimatedImpact: '60% reduction in hallucination errors'
      });
    }

    // Many timeout errors?
    const timeoutRate = errors.filter(e => e.type === 'TIMEOUT').length / errors.length;
    if (timeoutRate > 0.1) {
      recommendations.push({
        priority: 'medium',
        type: 'infrastructure',
        message: 'Increase timeout limits or optimize model serving',
        estimatedImpact: '80% reduction in timeout errors'
      });
    }

    return recommendations;
  }
}

Results: 99.97% Uptime

After implementing error boundaries:

Uptime: 99.92% → 99.97%
User-facing errors: 1,247/day → 23/day
Cost from retries: -$4,200/month (fewer panic retries)
Support tickets: -78% (clearer error messages)
Recovery rate: 94% of errors handled automatically

Key Lessons

Fail fast at the boundary - Don't let bad data propagate
Monitor everything - You can't fix what you can't see
Degrade gracefully - Some response > no response
Learn from failures - Each error makes the system stronger
Test chaos scenarios - Intentionally break things in dev

Implementation Checklist

Start here:

□ Implement input validation for common attacks
□ Add timeout protection with AbortController
□ Build hallucination detection for your domain
□ Create cache-based fallbacks
□ Set up error analytics dashboard
□ Test with chaos engineering
□ Document error scenarios for ops team

Remember: LLMs are powerful but unpredictable. Error boundaries turn that unpredictability from a liability into a manageable risk.