
Error Handling

This document describes all error responses returned by the LLMG gateway, their HTTP status codes, and how to handle them.

Error Response Format

All errors follow a consistent JSON format:

{
  "error": {
    "type": "error_type",
    "message": "Human-readable description",
    "code": "ERROR_CODE"
  }
}
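Clients should dispatch on the `type` and `code` fields rather than matching on `message`, which is human-readable and may change between versions. A minimal parsing sketch, using only the field names shown in the format above:

```python
import json

def parse_error(body: str):
    """Extract (type, code, message) from a gateway error response body."""
    err = json.loads(body).get("error", {})
    return err.get("type"), err.get("code"), err.get("message")

etype, code, msg = parse_error(
    '{"error": {"type": "rate_limit_error", '
    '"message": "Rate limit exceeded", "code": "RATE_LIMIT_ERROR"}}'
)
# etype == "rate_limit_error", code == "RATE_LIMIT_ERROR"
```

Using `.get()` keeps the parser tolerant of optional fields such as `retry_after` or `provider`, which only appear on some error types.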
Error Types

Error Type             HTTP Status  Description
authentication_error   401          Invalid or missing API key
rate_limit_error       429          Rate limit exceeded
invalid_request_error  400          Malformed request or invalid parameters
not_found_error        404          Requested resource does not exist
provider_error         502          Upstream provider returned an error
timeout_error          504          Request timed out
internal_error         500          Unexpected server error
unsupported_feature    501          Feature not supported by the provider

Authentication Error (401)

Returned when the API key is missing, invalid, or expired.

Example:

{
  "error": {
    "type": "authentication_error",
    "message": "Authentication failed",
    "code": "AUTH_ERROR"
  }
}

Common Causes:

  • Missing Authorization header
  • Invalid API key format
  • Expired or revoked API key
  • Provider API key not configured

Resolution:

# Verify your API key is set
echo $OPENAI_API_KEY
# Include in request
curl -H "Authorization: Bearer $OPENAI_API_KEY" ...
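It can also help to fail fast at startup when a key is missing, instead of discovering it later via a 401. A small sketch (the helper name is illustrative, not part of any published API):

```python
import os

def require_key(var: str) -> str:
    """Return the value of an API-key environment variable, or fail loudly."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; requests will fail with authentication_error")
    return key
```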

Rate Limit Error (429)

Returned when request limits are exceeded for the provider or gateway.

Example:

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded",
    "code": "RATE_LIMIT_ERROR",
    "retry_after": 60
  }
}

Common Causes:

  • Exceeded provider’s requests-per-minute limit
  • Exceeded gateway’s configured rate limit
  • Token-based rate limiting (input/output tokens)

Resolution:

  • Implement exponential backoff
  • Check the Retry-After header
  • Reduce request frequency
  • Upgrade your provider plan for higher limits

Retry Logic Example:

import time
import random

def retry_with_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:  # as raised by your client library
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            sleep_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(sleep_time)

Invalid Request Error (400)

Returned when the request is malformed or contains invalid parameters.

Example - Missing Required Field:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: missing required field 'messages'",
    "code": "INVALID_REQUEST"
  }
}

Example - Invalid Model:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: model 'gpt-99' not found",
    "code": "INVALID_REQUEST"
  }
}

Example - Invalid Header:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid request: Invalid header name: ...",
    "code": "INVALID_REQUEST"
  }
}

Common Causes:

  • Missing required fields (model, messages)
  • Invalid JSON syntax
  • Unsupported model name
  • Invalid parameter types or values
  • Malformed headers

Resolution:

  • Validate JSON before sending
  • Check the API reference for required fields
  • Verify model name is available: GET /v1/models
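The model check can be automated against the `GET /v1/models` response. This sketch assumes the OpenAI-style `{"data": [{"id": ...}]}` list shape; verify the exact shape against your gateway:

```python
def model_available(models_response: dict, model: str) -> bool:
    """Check a requested model id against a parsed GET /v1/models response."""
    return any(m.get("id") == model for m in models_response.get("data", []))

# Example with a parsed response (model ids here are illustrative):
models = {"data": [{"id": "openai/gpt-4"}, {"id": "anthropic/claude-3-opus"}]}
model_available(models, "openai/gpt-4")  # True
model_available(models, "gpt-99")        # False
```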

Not Found Error (404)

Returned when the requested resource does not exist.

Example:

{
  "error": {
    "type": "not_found_error",
    "message": "Resource not found",
    "code": "NOT_FOUND"
  }
}

Common Causes:

  • Invalid endpoint URL
  • Non-existent model
  • Deleted or unavailable resource

Resolution:

  • Verify the endpoint URL
  • Check available models with GET /v1/models

Provider Error (502)

Returned when the upstream LLM provider returns an error.

Example:

{
  "error": {
    "type": "provider_error",
    "message": "Provider error: OpenAI API returned 500",
    "code": "PROVIDER_ERROR",
    "provider": "openai"
  }
}

Example - Provider API Error:

{
  "error": {
    "type": "provider_error",
    "message": "API error: 400 - context length exceeded",
    "code": "API_ERROR",
    "provider": "anthropic",
    "status": 400
  }
}

Common Causes:

  • Provider service outage
  • Context length exceeded
  • Provider-specific validation errors
  • Quota exceeded at provider level

Resolution:

  • Check provider status page
  • Reduce context window (fewer messages or shorter content)
  • Verify provider API key has quota remaining
  • Try a fallback provider
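The fallback bullet can be implemented client-side as a simple loop over providers. A sketch under the assumption that your client raises an exception per failed provider (`send` and the stub below are illustrative, not a published API):

```python
def complete_with_fallback(send, providers, base_model, messages):
    """Try each provider in order; return the first successful response."""
    last_err = None
    for provider in providers:
        try:
            # Model ids on this gateway are provider-prefixed, e.g. "openai/gpt-4"
            return send(model=f"{provider}/{base_model}", messages=messages)
        except Exception as err:  # in practice, catch your client's provider error class
            last_err = err
    raise last_err

# Usage with a stub transport that fails for the first provider:
def fake_send(model, messages):
    if model.startswith("openai/"):
        raise RuntimeError("Provider error: OpenAI API returned 500")
    return {"model": model, "content": "ok"}

result = complete_with_fallback(fake_send, ["openai", "anthropic"], "gpt-4", [])
# result["model"] == "anthropic/gpt-4"
```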

Timeout Error (504)

Returned when a request exceeds the configured timeout.

Example:

{
  "error": {
    "type": "timeout_error",
    "message": "Request timed out",
    "code": "TIMEOUT"
  }
}

Common Causes:

  • Complex request requiring long processing
  • Provider experiencing high latency
  • Network connectivity issues
  • Timeout configured too low

Resolution:

  • Increase timeout in configuration (timeout in llmg.toml)
  • Implement client-side timeouts with retry
  • For long requests, consider streaming responses
  • Check provider status for latency issues

Configuration:

[server]
timeout = 120 # Increase from default 60 seconds
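Pair the server-side setting with a client-side timeout so callers fail fast rather than hang. A standard-library sketch (the base URL and endpoint path mirror the examples elsewhere on this page):

```python
import json
import urllib.request

def build_chat_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build the POST request for the chat completions endpoint."""
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(base_url: str, payload: dict, timeout: float = 30.0) -> dict:
    req = build_chat_request(base_url, payload)
    # urlopen raises an error if the gateway does not respond within `timeout` seconds
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Set the client-side timeout slightly above the gateway's, so the gateway's 504 (with its structured error body) arrives before the client gives up.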

Internal Error (500)

Returned when an unexpected server error occurs.

Example:

{
  "error": {
    "type": "internal_error",
    "message": "Internal provider error: serialization failed",
    "code": "INTERNAL_ERROR"
  }
}

Example - Serialization Error:

{
  "error": {
    "type": "internal_error",
    "message": "Serialization error: invalid unicode",
    "code": "SERIALIZATION_ERROR"
  }
}

Common Causes:

  • Unexpected server condition
  • Serialization/deserialization failure
  • Bug in gateway code

Resolution:

  • Retry the request (these are typically transient)
  • Check gateway logs for details
  • Report persistent issues to the LLMG team

Unsupported Feature (501)

Returned when a feature is not supported by the selected provider.

Example:

{
  "error": {
    "type": "unsupported_feature",
    "message": "Feature not supported by this provider",
    "code": "UNSUPPORTED_FEATURE"
  }
}

Common Causes:

  • Streaming not supported by provider
  • Embeddings not available for the provider
  • list_models() not implemented

Resolution:

  • Use a different provider that supports the feature
  • Check provider documentation for supported features
  • Use fallback provider configuration
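One way to apply the fallback bullet: treat a 501 as a signal to try the next provider rather than as a retryable failure. A sketch with a stubbed transport (all names are illustrative, not a published client API):

```python
def embed_with_fallback(embed, providers, text):
    """Return the first provider's result, skipping any that answer 501."""
    for provider in providers:
        status, result = embed(provider, text)
        if status == 501:  # unsupported_feature: try the next provider
            continue
        return result
    raise RuntimeError("no configured provider supports this feature")

# Stub: the first provider lacks embeddings, the second supports them
def fake_embed(provider, text):
    if provider == "provider_a":
        return 501, None
    return 200, [0.1, 0.2, 0.3]

vec = embed_with_fallback(fake_embed, ["provider_a", "provider_b"], "hello")
# vec == [0.1, 0.2, 0.3]
```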

Error Code Mapping

LlmError Variant     HTTP Status  Error Type
HttpError            502          provider_error
ApiError             varies*      provider_error
AuthError            401          authentication_error
RateLimitError       429          rate_limit_error
InvalidRequest       400          invalid_request_error
ProviderError        502          provider_error
SerializationError   500          internal_error
Unknown              500          internal_error
UnsupportedFeature   501          unsupported_feature
NotFound             404          not_found_error
InternalError        500          internal_error
Timeout              504          timeout_error

*ApiError passes through the provider’s HTTP status code (e.g., 400 for bad request at provider)
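For client-side classification, the table collapses into a status lookup plus a retry predicate. A sketch that mirrors the table above (it is not a published client API; note 501 is excluded from retries, since a provider will not start supporting a feature on retry):

```python
# HTTP status -> gateway error type, per the mapping table
STATUS_TO_ERROR_TYPE = {
    400: "invalid_request_error",
    401: "authentication_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "internal_error",
    501: "unsupported_feature",
    502: "provider_error",
    504: "timeout_error",
}

def retryable(status: int) -> bool:
    # Rate limits and transient server/provider failures are worth retrying;
    # client errors (other 4xx) and unsupported features (501) are not.
    return status in (429, 500, 502, 504)

retryable(502)  # True
retryable(400)  # False
```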


Client-Side Error Handling

Retry with tenacity:

from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10),
    retry=retry_if_exception_type((RateLimitError, TimeoutError)),
)
def call_llmg_api(request):
    return client.chat.completions.create(**request)

Handle specific error types:

try:
    response = client.chat.completions.create(**request)
except AuthenticationError:
    # Refresh API key or alert admin
    refresh_api_key()
except RateLimitError as e:
    # Backoff and retry
    time.sleep(e.retry_after)
    retry_request()
except InvalidRequestError as e:
    # Fix request and don't retry
    log_error(e)
    raise

Fallback Providers

Configure the gateway with multiple providers to handle provider-specific errors:

[providers]
primary = "openai"
fallback = ["anthropic", "groq"]

Error Monitoring

Track error types in your monitoring:

from prometheus_client import Counter

# Prometheus metrics
error_counter = Counter('llmg_errors_total', 'Total errors', ['type'])

def handle_error(error):
    error_counter.labels(type=error.type).inc()
    # ... handle error

Request Validation

Validate requests against a schema before sending:

import jsonschema

schema = {
    "type": "object",
    "required": ["model", "messages"],
    "properties": {
        "model": {"type": "string"},
        "messages": {"type": "array"},
    },
}

def validate_request(request):
    jsonschema.validate(request, schema)

Streaming Error Handling

When using streaming responses, errors may occur mid-stream:

import openai

client = openai.OpenAI(base_url="http://localhost:8080/v1")

try:
    stream = client.chat.completions.create(
        model="openai/gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
except openai.RateLimitError as e:
    # Handle rate limit mid-stream
    print(f"Rate limited. Retry after: {e.response.headers.get('retry-after')}")
except openai.APIError as e:
    # Handle other API errors
    print(f"API error: {e.message}")

Debugging

Enable debug logging on the gateway:

LLMG_LOG_LEVEL=debug ./llmg-gateway

Inspect the full request and response with verbose curl:

# Verbose curl to see full request/response
# Note: The gateway uses server-side API keys (env vars), not client-side auth headers
curl -v -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Hi"}]}'

Quick diagnostic checklist:
  1. Check gateway health: GET /health
  2. List available models: GET /v1/models
  3. Verify environment variables: All *_API_KEY vars set
  4. Check provider status: Visit provider’s status page
  5. Review gateway logs: Look for detailed error traces

Getting Help

If you encounter persistent errors:

  1. Check the troubleshooting guide
  2. Search existing issues
  3. Create a new issue with:
    • Error message and code
    • Request details (without API keys)
    • Gateway version (GET /health)
    • Provider being used