The OpenAI Chat Completions API endpoint lets you send chat messages and receive AI-generated responses using the familiar OpenAI request format. You can use GPT, Claude, and Gemini models through this single endpoint, so you can switch between model families without changing your integration code.

Base URL: http://apillm.globalaiopc.com/gw_llm_power
Endpoint: POST /v1/chat/completions

Authentication

Authenticate every request using the Authorization header with your API key:
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
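
As a minimal sketch, the two headers above can be assembled in Python as follows. The helper name `build_headers` is hypothetical and only illustrates the header layout; it is not part of any official client.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two request headers described above.

    `build_headers` is an illustrative helper, not an official API.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # The key is sent as a Bearer token in the Authorization header.
        "Authorization": f"Bearer {api_key}",
        # Request bodies are JSON.
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")
```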

Supported Models

GPT Models

Standard | -official | -hc | -fd | -low
gpt-5.4 | gpt-5.4-official | gpt-5.4-hc | gpt-5.4-fd | gpt-5.4-low
gpt-5.5 | gpt-5.5-official | gpt-5.5-hc | gpt-5.5-fd | gpt-5.5-low

Claude Models

Standard | -official | -hc | -fd
claude-haiku-4-5 | claude-haiku-4-5-official | claude-haiku-4-5-hc | claude-haiku-4-5-fd
claude-opus-4-5 | claude-opus-4-5-official | claude-opus-4-5-hc | claude-opus-4-5-fd
claude-opus-4-6 | claude-opus-4-6-official | claude-opus-4-6-hc | claude-opus-4-6-fd
claude-opus-4-7 | claude-opus-4-7-official | claude-opus-4-7-hc | claude-opus-4-7-fd
claude-sonnet-4-5 | claude-sonnet-4-5-official | claude-sonnet-4-5-hc | claude-sonnet-4-5-fd
claude-sonnet-4-6 | claude-sonnet-4-6-official | claude-sonnet-4-6-hc | claude-sonnet-4-6-fd

Gemini Models

Standard | -official | -low
gemini-2.5-flash-lite | (not listed) | (not listed)
gemini-2.5-pro | gemini-2.5-pro-official | gemini-2.5-pro-low
gemini-3-flash-preview | gemini-3-flash-preview-official | gemini-3-flash-preview-low
gemini-3.1-flash-lite-preview | (not listed) | (not listed)
gemini-3.1-pro-preview | gemini-3.1-pro-preview-official | gemini-3.1-pro-preview-low

Model Suffix Reference

Suffix | Description
(none) | Standard version
-official | Official version
-hc | High-quality pool (AWS or premium account pool)
-fd | Proxy pool / mixed account pool
-low | Budget version
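
To illustrate how the suffixes compose with base model names, here is a small sketch that splits a model name into its base name and suffix. The function `split_model_name` and the constant `KNOWN_SUFFIXES` are hypothetical names introduced for this example; the suffix list is taken from the reference above.

```python
# Suffixes from the reference above; the standard version has no suffix.
KNOWN_SUFFIXES = ("-official", "-hc", "-fd", "-low")


def split_model_name(name: str) -> tuple[str, str]:
    """Split a model name into (base_name, suffix).

    Returns an empty suffix for the standard version.
    """
    for suffix in KNOWN_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix
    return name, ""


base, suffix = split_model_name("claude-opus-4-7-hc")
```

Note that names such as gemini-2.5-flash-lite end in "-lite", which is part of the base name rather than a pool suffix, so the helper returns them unchanged.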

Multimodal Support

Capability | Supported Models
Image analysis | All models
Video analysis | Gemini models only
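
For image analysis, a user message in the standard OpenAI multimodal shape might look like the sketch below. It assumes this endpoint accepts the usual OpenAI `text` and `image_url` content parts; the helper name `image_message` and the example URL are hypothetical. The video payload shape is not specified on this page, so it is not shown.

```python
def image_message(prompt: str, image_url: str) -> dict:
    """Build a user message that pairs a text prompt with one image.

    Assumes the OpenAI-style content-part array; `image_message` is an
    illustrative helper, not part of the API.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL for illustration only.
message = image_message("Describe this picture.", "https://example.com/cat.png")
```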

Request Parameters

model (string, required)
The name of the model to use. For example: gpt-5.4, claude-opus-4-7, or gemini-3.1-pro-preview.

messages (array, required)
An array of chat message objects forming the conversation history. Each object must include a role and content field.

messages[].role (string, required)
The role of the message author. Accepted values: system, user, or assistant.

messages[].content (string | array, required)
The content of the message. Pass a plain text string for standard text input, or an OpenAI-compatible multimodal array for image or video analysis.

temperature (number, optional)
Controls the randomness of the model’s output. Values range from 0 to 2. Lower values produce more focused, deterministic responses; higher values produce more varied output. We recommend adjusting either temperature or top_p, but not both simultaneously.

top_p (number, optional)
Nucleus sampling parameter. The model considers only the tokens comprising the top top_p probability mass. We recommend adjusting either top_p or temperature, but not both simultaneously.

stream (boolean, optional)
When set to true, the response is returned as a stream of Server-Sent Events (SSE). The stream ends with a final data: [DONE] message.

max_completion_tokens (integer, optional)
The maximum number of tokens the model may generate in its response.

stop (string | array, optional)
One or more sequences at which the model will stop generating further tokens. Pass a single string or an array of up to four strings.
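
The constraints described above can be checked on the client before a request is sent. The following is a minimal sketch; `build_payload` is a hypothetical helper, and the checks mirror only what this page states (accepted roles, the temperature range of 0 to 2, and at most four stop sequences).

```python
from typing import Optional, Sequence, Union

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_payload(
    model: str,
    messages: Sequence[dict],
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stream: bool = False,
    max_completion_tokens: Optional[int] = None,
    stop: Union[str, Sequence[str], None] = None,
) -> dict:
    """Assemble a request body, omitting optional fields left unset."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("each message needs a content field")

    payload: dict = {"model": model, "messages": list(messages)}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature
    if top_p is not None:
        payload["top_p"] = top_p
    if max_completion_tokens is not None:
        payload["max_completion_tokens"] = max_completion_tokens
    if stop is not None:
        # A single string is allowed; an array may hold up to four strings.
        if not isinstance(stop, str) and len(stop) > 4:
            raise ValueError("stop accepts at most four sequences")
        payload["stop"] = stop if isinstance(stop, str) else list(stop)
    if stream:
        payload["stream"] = True
    return payload


payload = build_payload(
    "gpt-5.4",
    [{"role": "user", "content": "What is artificial intelligence?"}],
    temperature=0.7,
)
```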

Response Fields

id (string)
A unique identifier for the request.

object (string)
The type of the returned object. Always chat.completion for non-streaming responses.

created (integer)
The Unix timestamp (in seconds) of when the response was created.

model (string)
The name of the model that was used to generate the response.

choices[].message.role (string)
The role of the generated message. Always assistant.

choices[].message.content (string)
The text content generated by the model.

choices[].finish_reason (string)
The reason the model stopped generating tokens. Common values include stop (natural end) and length (token limit reached).

usage.prompt_tokens (integer)
The number of tokens in the input messages.

usage.completion_tokens (integer)
The number of tokens in the generated response.

usage.total_tokens (integer)
The total number of tokens used in the request (prompt + completion).
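
As a sketch of how these fields are typically read, the snippet below pulls the generated text, the finish reason, and the token usage out of a non-streaming response body. The helper name `summarize_response` and the sample body are hypothetical; the field paths follow the list above.

```python
import json


def summarize_response(body: str) -> dict:
    """Extract the first choice's text, finish reason, and token usage."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # "length" means the token limit cut the answer short.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


# Hand-written sample body for illustration only.
sample = json.dumps({
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello!"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
})
summary = summarize_response(sample)
```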

Code Examples

curl -X POST http://apillm.globalaiopc.com/gw_llm_power/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is artificial intelligence?"}
    ],
    "temperature": 0.7
  }'
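
The same request can be sketched in Python with only the standard library. The function names `make_request` and `send` are hypothetical; the URL, headers, and body mirror the curl command above. `send` performs the network call and is defined here but not invoked.

```python
import json
import urllib.request

BASE_URL = "http://apillm.globalaiopc.com/gw_llm_power"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for /v1/chat/completions."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = make_request(
    "YOUR_API_KEY",
    {
        "model": "gpt-5.4",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is artificial intelligence?"},
        ],
        "temperature": 0.7,
    },
)
# result = send(request)  # uncomment to perform the network call
```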

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1761635478,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) is the simulation of human intelligence processes by computer systems, including learning, reasoning, and self-correction."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 34,
    "total_tokens": 54
  }
}

Streaming

To receive a streaming response, set "stream": true in your request body. The API returns a series of Server-Sent Events (SSE), each containing a partial response delta. The stream terminates with a final data: [DONE] message.
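
A streamed response can be consumed line by line. The sketch below assumes each event follows the usual OpenAI chunk shape, with the partial text at choices[].delta.content; this page does not spell out the chunk layout, so treat that path as an assumption. The function name `iter_deltas` and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield partial text from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # final message: the stream is complete
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed OpenAI-style chunk: text lives in delta.content.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events for illustration only.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_deltas(sample_lines))
```

In practice the lines would come from the HTTP response body, decoded as UTF-8, rather than from a list.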