# Error Handling
This document describes all error responses returned by the LLMG gateway, their HTTP status codes, and how to handle them.
## Error Response Format

All errors follow a consistent JSON format:

```json
{
  "error": {
    "type": "error_type",
    "message": "Human-readable description",
    "code": "ERROR_CODE"
  }
}
```

## Error Types
| Error Type | HTTP Status | Description |
|---|---|---|
| `authentication_error` | 401 | Invalid or missing API key |
| `rate_limit_error` | 429 | Rate limit exceeded |
| `invalid_request_error` | 400 | Malformed request or invalid parameters |
| `not_found_error` | 404 | Requested resource does not exist |
| `provider_error` | 502 | Upstream provider returned an error |
| `timeout_error` | 504 | Request timed out |
| `internal_error` | 500 | Unexpected server error |
| `unsupported_feature` | 501 | Feature not supported by the provider |
## Detailed Error Reference

### Authentication Error (401)

Returned when the API key is missing, invalid, or expired.

Example:

```json
{
  "error": {
    "type": "authentication_error",
    "message": "Authentication failed",
    "code": "AUTH_ERROR"
  }
}
```

Common Causes:
- Missing `Authorization` header
- Invalid API key format
- Expired or revoked API key
- Provider API key not configured
Resolution:

```sh
# Verify your API key is set
echo $OPENAI_API_KEY

# Include it in the request
curl -H "Authorization: Bearer $OPENAI_API_KEY" ...
```

### Rate Limit Error (429)

Returned when request limits are exceeded for the provider or gateway.
Example:

```json
{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded",
    "code": "RATE_LIMIT_ERROR",
    "retry_after": 60
  }
}
```

Common Causes:
- Exceeded provider’s requests-per-minute limit
- Exceeded gateway’s configured rate limit
- Token-based rate limiting (input/output tokens)
Resolution:
- Implement exponential backoff
- Check the `Retry-After` header
- Reduce request frequency
- Upgrade your provider plan for higher limits
Retry Logic Example:

```python
import time
import random

def retry_with_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            sleep_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(sleep_time)
```

### Invalid Request Error (400)

Returned when the request is malformed or contains invalid parameters.
Example - Missing Required Field:

```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: missing required field 'messages'",
    "code": "INVALID_REQUEST"
  }
}
```

Example - Invalid Model:

```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: model 'gpt-99' not found",
    "code": "INVALID_REQUEST"
  }
}
```

Example - Invalid Header:

```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: Invalid header name: ...",
    "code": "INVALID_REQUEST"
  }
}
```

Common Causes:
- Missing required fields (`model`, `messages`)
- Invalid JSON syntax
- Unsupported model name
- Invalid parameter types or values
- Malformed headers
Resolution:
- Validate JSON before sending
- Check the API reference for required fields
- Verify the model name is available: `GET /v1/models`
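The model check above can be scripted: fetch `GET /v1/models` and compare the requested model id against the returned list before sending a request. A minimal sketch, assuming the gateway returns an OpenAI-style models list (`{"object": "list", "data": [{"id": ...}]}`); the ids below are illustrative, not a real gateway response:

```python
import json

def available_model_ids(models_body: str) -> set:
    """Extract model ids from a GET /v1/models response body
    (assumed to follow the OpenAI-style list format)."""
    payload = json.loads(models_body)
    return {entry["id"] for entry in payload.get("data", [])}

# Example response body (illustrative ids)
body = '{"object": "list", "data": [{"id": "openai/gpt-4"}, {"id": "groq/llama-3"}]}'

assert "openai/gpt-4" in available_model_ids(body)
assert "gpt-99" not in available_model_ids(body)
```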
### Not Found Error (404)

Returned when the requested resource does not exist.

Example:

```json
{
  "error": {
    "type": "not_found_error",
    "message": "Resource not found",
    "code": "NOT_FOUND"
  }
}
```

Common Causes:
- Invalid endpoint URL
- Non-existent model
- Deleted or unavailable resource
Resolution:
- Verify the endpoint URL
- Check available models with `GET /v1/models`
### Provider Error (502)

Returned when the upstream LLM provider returns an error.

Example:

```json
{
  "error": {
    "type": "provider_error",
    "message": "Provider error: OpenAI API returned 500",
    "code": "PROVIDER_ERROR",
    "provider": "openai"
  }
}
```

Example - Provider API Error:

```json
{
  "error": {
    "type": "provider_error",
    "message": "API error: 400 - context length exceeded",
    "code": "API_ERROR",
    "provider": "anthropic",
    "status": 400
  }
}
```

Common Causes:
- Provider service outage
- Context length exceeded
- Provider-specific validation errors
- Quota exceeded at provider level
Resolution:
- Check provider status page
- Reduce context window (fewer messages or shorter content)
- Verify provider API key has quota remaining
- Try a fallback provider
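The fallback strategy can also be applied client-side: try each configured provider in order and move on when one fails. A sketch, with `ProviderError` as a stand-in for whatever exception your client raises on a 502 (both the class and the stubbed `send` function below are hypothetical):

```python
class ProviderError(Exception):
    """Stand-in for the client-side exception raised on provider_error (502)."""

def call_with_fallback(providers, send):
    """Try each provider in order; re-raise the last error if all fail."""
    last_error = None
    for provider in providers:
        try:
            return send(provider)
        except ProviderError as exc:
            last_error = exc  # this provider failed; try the next one
    raise last_error

# Usage with a stubbed send: the first provider fails, the second answers
def send(provider):
    if provider == "openai":
        raise ProviderError("OpenAI API returned 500")
    return f"answer from {provider}"

assert call_with_fallback(["openai", "anthropic"], send) == "answer from anthropic"
```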
### Timeout Error (504)

Returned when a request exceeds the configured timeout.

Example:

```json
{
  "error": {
    "type": "timeout_error",
    "message": "Request timed out",
    "code": "TIMEOUT"
  }
}
```

Common Causes:
- Complex request requiring long processing
- Provider experiencing high latency
- Network connectivity issues
- Timeout configured too low
Resolution:
- Increase the timeout in configuration (`timeout` in `llmg.toml`)
- Implement client-side timeouts with retry
- For long requests, consider streaming responses
- Check provider status for latency issues
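Client-side, the retry point above can be a thin wrapper that retries once or twice on a timeout before surfacing the error. A minimal sketch using Python's built-in `TimeoutError`; substitute the timeout exception your HTTP client actually raises (e.g. `requests.Timeout` or `httpx.TimeoutException`):

```python
import time

def with_timeout_retry(func, retries=2, base_delay=1.0):
    """Call func, retrying on TimeoutError with a linearly growing delay."""
    for attempt in range(retries + 1):
        try:
            return func()
        except TimeoutError:
            if attempt == retries:
                raise  # out of retries; surface the timeout
            time.sleep(base_delay * (attempt + 1))

# Usage with a stub that times out once, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("request timed out")
    return "ok"

assert with_timeout_retry(flaky, base_delay=0) == "ok"
```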
Configuration:

```toml
[server]
timeout = 120  # Increase from the default 60 seconds
```

### Internal Error (500)

Returned when an unexpected server error occurs.
Example:

```json
{
  "error": {
    "type": "internal_error",
    "message": "Internal provider error: serialization failed",
    "code": "INTERNAL_ERROR"
  }
}
```

Example - Serialization Error:

```json
{
  "error": {
    "type": "internal_error",
    "message": "Serialization error: invalid unicode",
    "code": "SERIALIZATION_ERROR"
  }
}
```

Common Causes:
- Unexpected server condition
- Serialization/deserialization failure
- Bug in gateway code
Resolution:
- Retry the request (these are typically transient)
- Check gateway logs for details
- Report persistent issues to the LLMG team
### Unsupported Feature Error (501)

Returned when a feature is not supported by the selected provider.

Example:

```json
{
  "error": {
    "type": "unsupported_feature",
    "message": "Feature not supported by this provider",
    "code": "UNSUPPORTED_FEATURE"
  }
}
```

Common Causes:
- Streaming not supported by provider
- Embeddings not available for the provider
- `list_models()` not implemented
Resolution:
- Use a different provider that supports the feature
- Check provider documentation for supported features
- Use fallback provider configuration
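For the streaming case specifically, one pattern is to catch the 501 and reissue the same request without streaming. A sketch, where `UnsupportedFeatureError` and the stubbed `create` function are hypothetical stand-ins for your client's 501 exception and request call:

```python
class UnsupportedFeatureError(Exception):
    """Stand-in for the client-side exception raised on unsupported_feature (501)."""

def chat_with_stream_fallback(create, request):
    """Try a streaming request first; fall back to non-streaming on a 501."""
    try:
        return create(stream=True, **request)
    except UnsupportedFeatureError:
        return create(stream=False, **request)

# Usage with a stub provider that rejects streaming
def create(stream, **request):
    if stream:
        raise UnsupportedFeatureError("streaming not supported")
    return "non-streaming response"

assert chat_with_stream_fallback(create, {"model": "x"}) == "non-streaming response"
```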
## HTTP Error Mapping

| LlmError Variant | HTTP Status | Error Type |
|---|---|---|
| `HttpError` | 502 | `provider_error` |
| `ApiError` | varies* | `provider_error` |
| `AuthError` | 401 | `authentication_error` |
| `RateLimitError` | 429 | `rate_limit_error` |
| `InvalidRequest` | 400 | `invalid_request_error` |
| `ProviderError` | 502 | `provider_error` |
| `SerializationError` | 500 | `internal_error` |
| `Unknown` | 500 | `internal_error` |
| `UnsupportedFeature` | 501 | `unsupported_feature` |
| `NotFound` | 404 | `not_found_error` |
| `InternalError` | 500 | `internal_error` |
| `Timeout` | 504 | `timeout_error` |
\* `ApiError` passes through the provider's HTTP status code (e.g., 400 for a bad request at the provider)
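One practical consequence of this mapping: the status code alone tells a client whether retrying makes sense. Rate limits, timeouts, and 5xx errors are worth retrying with backoff, while auth, validation, and unsupported-feature errors need the request fixed first. A sketch of that classification (the sets below are an interpretation of the table, not a gateway API):

```python
# Statuses worth retrying with backoff (transient or rate-related)
RETRYABLE_STATUSES = {429, 500, 502, 504}
# Statuses where the request or configuration must change first
NON_RETRYABLE_STATUSES = {400, 401, 404, 501}

def should_retry(status: int) -> bool:
    """Decide retryability from the HTTP status in the mapping table."""
    return status in RETRYABLE_STATUSES

assert should_retry(429) and should_retry(502)
assert not should_retry(400) and not should_retry(501)
```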
## Error Handling Best Practices

### 1. Implement Retry Logic

```python
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10),
    retry=retry_if_exception_type((RateLimitError, TimeoutError)),
)
def call_llmg_api(request):
    return client.chat.completions.create(**request)
```

### 2. Handle Specific Errors
```python
try:
    response = client.chat.completions.create(**request)
except AuthenticationError:
    # Refresh API key or alert admin
    refresh_api_key()
except RateLimitError as e:
    # Back off and retry
    time.sleep(e.retry_after)
    retry_request()
except InvalidRequestError as e:
    # Fix the request and don't retry
    log_error(e)
    raise
```

### 3. Use Fallback Providers
Configure the gateway with multiple providers to handle provider-specific errors:

```toml
[providers]
primary = "openai"
fallback = ["anthropic", "groq"]
```

### 4. Monitor Error Rates
Track error types in your monitoring:

```python
from prometheus_client import Counter

# Prometheus metrics
error_counter = Counter('llmg_errors_total', 'Total errors', ['type'])

def handle_error(error):
    error_counter.labels(type=error.type).inc()
    # ... handle error
```

### 5. Validate Before Sending
```python
import jsonschema

# Validate the request against a schema before sending
schema = {
    "type": "object",
    "required": ["model", "messages"],
    "properties": {
        "model": {"type": "string"},
        "messages": {"type": "array"}
    }
}

def validate_request(request):
    jsonschema.validate(request, schema)
```

## Streaming Error Handling
When using streaming responses, errors may occur mid-stream:

```python
import openai

client = openai.OpenAI(base_url="http://localhost:8080/v1")

try:
    stream = client.chat.completions.create(
        model="openai/gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
except openai.RateLimitError as e:
    # Handle a rate limit hit mid-stream
    print(f"Rate limited. Retry after: {e.response.headers.get('retry-after')}")
except openai.APIError as e:
    # Handle other API errors
    print(f"API error: {e.message}")
```

## Debugging Errors
### Enable Debug Logging

```sh
LLMG_LOG_LEVEL=debug ./llmg-gateway
```

### Check Provider Response

```sh
# Verbose curl to see the full request/response
# Note: The gateway uses server-side API keys (env vars), not client-side auth headers
curl -v -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Hi"}]}'
```

### Common Debugging Steps
- Check gateway health: `GET /health`
- List available models: `GET /v1/models`
- Verify environment variables: all `*_API_KEY` vars are set
- Check provider status: visit the provider's status page
- Review gateway logs: look for detailed error traces
## Getting Help

If you encounter persistent errors:
- Check the troubleshooting guide
- Search existing issues
- Create a new issue with:
  - Error message and code
  - Request details (without API keys)
  - Gateway version (`GET /health`)
  - Provider being used