Language Models API

A Claude API proxy with pay-per-token pricing, compatible with the Anthropic API format.

Overview

The Messages API is compatible with the official Anthropic API format. It supports streaming, vision, tool use, and extended thinking features.

Endpoint

POST /api/v1/messages

Available Models

Prices are per 1M tokens

Model	Input	Output	Description
claude-opus-4-8	$3.6765/1M	$18.3824/1M	Most capable Opus, long-horizon work
claude-opus-4-7	$3.6765/1M	$18.3824/1M	1M context, adaptive thinking
claude-opus-4-7-thinking	$3.6765/1M	$18.3824/1M	Opus 4.7 + Thinking
claude-haiku-4-5-20251001	$0.3530/1M	$1.7648/1M	Fast, affordable
claude-haiku-4-5-20251001-thinking	$0.3530/1M	$1.7648/1M	Haiku + Thinking
claude-sonnet-4-20250514	$1.0589/1M	$5.2942/1M	Balanced
claude-sonnet-4-20250514-thinking	$1.0589/1M	$5.2942/1M	Sonnet + Thinking
claude-sonnet-4-5-20250929	$1.0589/1M	$5.2942/1M	Sonnet 4.5
claude-sonnet-4-5-20250929-thinking	$1.0589/1M	$5.2942/1M	Sonnet 4.5 + Thinking
claude-sonnet-4-6	$2.6471/1M	$13.2353/1M	Latest Sonnet
claude-sonnet-4-6-thinking	$2.6471/1M	$13.2353/1M	Latest Sonnet + Thinking
claude-opus-4-6	$4.4118/1M	$22.0589/1M	Latest Opus
claude-opus-4-6-thinking	$4.4118/1M	$22.0589/1M	Latest Opus + Thinking
claude-opus-4-20250514	$5.2942/1M	$26.4706/1M	Opus 4
claude-opus-4-20250514-thinking	$5.2942/1M	$26.4706/1M	Opus 4 + Thinking
claude-opus-4-5-20251101	$1.7648/1M	$8.8236/1M	Opus 4.5
claude-opus-4-5-20251101-thinking	$1.7648/1M	$8.8236/1M	Opus 4.5 + Thinking

Request Parameters

model (required, string)

The model to use (see the table above).

messages (required, array)

Array of messages, each with a role (user or assistant) and content.

max_tokens (integer)

Maximum number of tokens to generate. Default: 1024

system (string)

System prompt that sets context for the conversation.

stream (boolean)

Enable streaming responses. Default: false

temperature (number)

Sampling temperature, from 0 to 1. Default: 1

thinking (object)

Enables extended thinking for *-thinking models.

Basic Request

curl -X POST https://apimodels.app/api/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'
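
For callers who prefer a script to curl, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `send` are hypothetical, and only the URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://apimodels.app/api/v1/messages"  # endpoint from the curl example


def build_request(api_key: str, prompt: str,
                  model: str = "claude-sonnet-4-5-20250929",
                  max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `send(build_request("YOUR_API_KEY", "Hello, Claude!"))` would issue the request. Building and sending are kept separate so the payload can be inspected without touching the network.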

Response

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-sonnet-4-5-20250929",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 10
  }
}
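
Because `content` is an array of typed blocks rather than a single string, the reply text has to be collected from the blocks. The sketch below shows one way to do that in Python. The helper name `extract_text` is hypothetical, and it assumes that only blocks of type `text` carry reply text, as in the sample response above.

```python
def extract_text(response: dict) -> str:
    """Concatenate the text of every content block whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


# The sample response from this section, trimmed to the fields used here.
sample = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 10},
}

print(extract_text(sample))
```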

Extended Thinking

For *-thinking models, you can enable extended thinking for complex reasoning tasks.

curl -X POST https://apimodels.app/api/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929-thinking",
    "max_tokens": 16000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 10000
    },
    "messages": [
      {"role": "user", "content": "Solve this step by step: What is 15% of 340?"}
    ]
  }'
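
The request body above can also be assembled programmatically. The following Python sketch builds the same payload and checks that `budget_tokens` is smaller than `max_tokens`. That constraint comes from the official Anthropic API, and whether this proxy enforces it is an assumption. The helper name `thinking_payload` is hypothetical.

```python
def thinking_payload(prompt: str,
                     model: str = "claude-sonnet-4-5-20250929-thinking",
                     max_tokens: int = 16000,
                     budget_tokens: int = 10000) -> dict:
    """Build a request body with extended thinking enabled."""
    # Assumption: as in the official Anthropic API, the thinking budget
    # must leave room for the final answer within max_tokens.
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


payload = thinking_payload("Solve this step by step: What is 15% of 340?")
```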

Streaming

Set stream: true to receive Server-Sent Events (SSE).

curl -X POST https://apimodels.app/api/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a short poem about coding."}
    ]
  }'
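
A streamed response arrives as lines of SSE text that the client must parse. The Python sketch below pulls the text fragments out of such a stream. It assumes the proxy relays the event shapes of the official Anthropic API (`content_block_delta` events carrying a `text_delta`), which this page does not confirm. The function name `iter_text_deltas` and the sample lines are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines, ignoring all other events."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON
        if not isinstance(event, dict):
            continue
        # Assumed event shape, following the official Anthropic stream format.
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                yield delta.get("text", "")


# Hand-written sample lines in the assumed format, not captured output.
sample_lines = [
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
]

print("".join(iter_text_deltas(sample_lines)))
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, read incrementally as they arrive.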

Billing

Credits are calculated based on actual token usage:

Cost = (input_tokens * input_price + output_tokens * output_price) / 1,000,000
Credits = Cost(CNY)  // credits are in ¥
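
As a worked example of the formula, the Python sketch below prices the sample response from this page (12 input tokens, 10 output tokens) at the listed claude-sonnet-4-5-20250929 rates. The function name `request_cost` is hypothetical, and the result is in whatever currency unit the price table uses.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Apply the billing formula; prices are per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# claude-sonnet-4-5-20250929 rates from the model table: 1.0589 in, 5.2942 out.
cost = request_cost(12, 10, input_price=1.0589, output_price=5.2942)
print(f"{cost:.10f}")
```

Here the numerator is 12 * 1.0589 + 10 * 5.2942 = 65.6488, so the request costs roughly 0.0000656 in the table's currency unit.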

Error Codes

400: Invalid request parameters

401: Invalid or missing API key

402: Insufficient credits

404: Model not found

429: Rate limit exceeded

500: Internal server error
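
Client code often maps these status codes to a message and a retry decision. The Python sketch below shows one possible policy. Treating 429 and 500 as retryable and backing off exponentially are assumptions, not behavior documented here, and the names `describe_error`, `should_retry`, and `backoff_delays` are hypothetical.

```python
ERROR_MEANINGS = {
    400: "Invalid request parameters",
    401: "Invalid or missing API key",
    402: "Insufficient credits",
    404: "Model not found",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption: only rate limits and server errors are worth retrying;
# the other codes indicate a problem the caller has to fix first.
RETRYABLE = {429, 500}


def describe_error(status: int) -> str:
    """Return the documented meaning of a status code."""
    return ERROR_MEANINGS.get(status, "Unknown error")


def should_retry(status: int) -> bool:
    """Decide whether a failed request may be retried unchanged."""
    return status in RETRYABLE


def backoff_delays(attempts: int, base: float = 1.0) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** i for i in range(attempts)]
```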
