Z.AI (GLM)
Z.AI provides high-performance large language models, including the GLM-4 and GLM-5 series. LLMG supports both the General Inference and Coding Plan endpoints.
Configuration
LLMG supports two distinct provider identifiers for Z.AI to accommodate the different plans:
- `z_ai`: for general inference.
- `z_ai_coding`: for the Coding Plan.
Use these as the routing prefix in model names (e.g. z_ai/glm-5, z_ai_coding/GLM-4.7).
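The routing prefix is simply the portion of the model name before the first `/`. The shell one-liner below is only an illustration of the naming convention, not LLMG's actual parsing code:

```shell
# Split a prefixed model name into its provider and model parts.
# Illustration of the naming convention only, not LLMG internals.
MODEL="z_ai_coding/GLM-4.7"
PROVIDER="${MODEL%%/*}"   # everything before the first slash
NAME="${MODEL#*/}"        # everything after it
echo "$PROVIDER -> $NAME"
```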
Environment Variables
| Variable | Required | Description |
|---|---|---|
| `Z_AI_API_KEY` | Yes | Your Z.AI API key. |
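Before starting the gateway, export the key in your shell. The variable name comes from the table above; the key value here is a placeholder:

```shell
# Placeholder value; substitute your real Z.AI API key.
export Z_AI_API_KEY="your-api-key"
```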
Gateway
You can use either the general or the coding endpoint by prefixing the model with the appropriate provider name.
General Inference
```shell
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "z_ai/glm-5",
    "messages": [{"role": "user", "content": "How do I implement a binary search in Rust?"}]
  }'
```
Coding Plan
```shell
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "z_ai_coding/GLM-4.7",
    "messages": [{"role": "user", "content": "Fix the bugs in this snippet..."}]
  }'
```
Library
```rust
use llmg_providers::z_ai::ZaiClient;
use llmg_core::provider::Provider;

// General Plan
let client = ZaiClient::new("your-api-key");

// Coding Plan
let coding_client = ZaiClient::coding("your-api-key");
```
Features
- OpenAI Compatible: Fully supports the OpenAI chat completion protocol.
- Dynamic Model Loading: Attempts to fetch the latest available models from Z.AI automatically.
- SSE Streaming: Supports real-time streaming of model responses.
- Dual Endpoints: Easy switching between general and coding-specialized APIs.
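Because the gateway speaks the OpenAI chat completion protocol, SSE streaming is requested the standard way: add `"stream": true` to the request body. A sketch, assuming the same local gateway as above (the `curl` line is commented out so the snippet does not require a running server; `jq` is used only to sanity-check the payload):

```shell
# Request body with streaming enabled via the OpenAI-compatible "stream" flag.
BODY='{"model": "z_ai/glm-5", "stream": true, "messages": [{"role": "user", "content": "Say hi"}]}'

# With a gateway running locally, -N keeps curl from buffering the SSE chunks:
# curl -N -X POST http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$BODY"

# Verify the payload is valid JSON with streaming turned on.
echo "$BODY" | jq -e '.stream == true'
```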