OpenAI Chat Completions API for GPT, Claude, and Gemini Models

The OpenAI Chat Completions API endpoint lets you send chat messages and receive AI-generated responses using the familiar OpenAI request format. You can use GPT, Claude, and Gemini models through this single endpoint, making it easy to switch between model families without changing your integration code.

Base URL: http://apillm.globalaiopc.com/gw_llm_power
Endpoint: POST /v1/chat/completions

Authentication

Authenticate every request using the Authorization header with your API key:

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
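
As a minimal sketch, the two headers above can be built once and reused for every call. The environment variable name LLM_GATEWAY_API_KEY and both helper names are assumptions made for this illustration; they are not defined by the API.

```python
import os


def key_from_env(var="LLM_GATEWAY_API_KEY", env=None):
    """Read the API key from an environment mapping (variable name is hypothetical)."""
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} to your API key before sending requests")
    return key


def build_headers(api_key):
    """Return the two headers every request must carry."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Example with an explicit mapping instead of the real process environment.
headers = build_headers(key_from_env(env={"LLM_GATEWAY_API_KEY": "YOUR_API_KEY"}))
```

Keeping the key out of source code this way is a common practice rather than a requirement of the endpoint.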

Supported Models

GPT Models

Standard	`-official`	`-hc`	`-fd`	`-low`
`gpt-5.4`	`gpt-5.4-official`	`gpt-5.4-hc`	`gpt-5.4-fd`	`gpt-5.4-low`
`gpt-5.5`	`gpt-5.5-official`	`gpt-5.5-hc`	`gpt-5.5-fd`	`gpt-5.5-low`

Claude Models

Standard	`-official`	`-hc`	`-fd`
`claude-haiku-4-5`	`claude-haiku-4-5-official`	`claude-haiku-4-5-hc`	`claude-haiku-4-5-fd`
`claude-opus-4-5`	`claude-opus-4-5-official`	`claude-opus-4-5-hc`	`claude-opus-4-5-fd`
`claude-opus-4-6`	`claude-opus-4-6-official`	`claude-opus-4-6-hc`	`claude-opus-4-6-fd`
`claude-opus-4-7`	`claude-opus-4-7-official`	`claude-opus-4-7-hc`	`claude-opus-4-7-fd`
`claude-sonnet-4-5`	`claude-sonnet-4-5-official`	`claude-sonnet-4-5-hc`	`claude-sonnet-4-5-fd`
`claude-sonnet-4-6`	`claude-sonnet-4-6-official`	`claude-sonnet-4-6-hc`	`claude-sonnet-4-6-fd`

Gemini Models

Standard	`-official`	`-low`
`gemini-2.5-flash-lite`	—	—
`gemini-2.5-pro`	`gemini-2.5-pro-official`	`gemini-2.5-pro-low`
`gemini-3-flash-preview`	`gemini-3-flash-preview-official`	`gemini-3-flash-preview-low`
`gemini-3.1-flash-lite-preview`	—	—
`gemini-3.1-pro-preview`	`gemini-3.1-pro-preview-official`	`gemini-3.1-pro-preview-low`

Model Suffix Reference

Suffix	Description
(none)	Standard version
`-official`	Official version
`-hc`	High-quality pool (AWS or premium account pool)
`-fd`	Proxy pool / mixed account pool
`-low`	Budget version
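
The naming scheme in the tables above can be sketched as a small helper that appends a suffix only when the model family lists it. The helper name model_variant is hypothetical, and the check covers only the suffix rules shown here; it does not verify that the base model itself is currently available.

```python
from typing import Optional

# Suffixes each family lists in the tables above.
FAMILY_SUFFIXES = {
    "gpt": {"official", "hc", "fd", "low"},
    "claude": {"official", "hc", "fd"},
    "gemini": {"official", "low"},
}

# Gemini models that are listed without any suffixed variant.
STANDARD_ONLY = {"gemini-2.5-flash-lite", "gemini-3.1-flash-lite-preview"}


def model_variant(base: str, suffix: Optional[str] = None) -> str:
    """Return the model name for a base model and an optional suffix."""
    family = base.split("-", 1)[0]
    if family not in FAMILY_SUFFIXES:
        raise ValueError(f"Unknown model family: {family!r}")
    if suffix is None:
        return base  # standard version, no suffix
    if base in STANDARD_ONLY or suffix not in FAMILY_SUFFIXES[family]:
        raise ValueError(f"{base!r} is not listed with a -{suffix} variant")
    return f"{base}-{suffix}"


name = model_variant("gpt-5.4", "hc")  # "gpt-5.4-hc"
```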

Multimodal Support

Capability	Supported Models
Image analysis	All models
Video analysis	Gemini models only

Request Parameters

model

string

required

The name of the model to use. For example: gpt-5.4, claude-opus-4-7, or gemini-3.1-pro-preview.

messages

array

required

An array of chat message objects forming the conversation history. Each object must include a role and content field.

messages[].role

string

required

The role of the message author. Accepted values: system, user, or assistant.

messages[].content

string | array

required

The content of the message. Pass a plain text string for standard text input, or an OpenAI-compatible multimodal array for image or video analysis.
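
The sketch below shows both content shapes for a user message: a plain string, and a multimodal array using the text and image_url part types from the OpenAI Chat Completions format. The helper names are hypothetical, and the assumption is that this gateway accepts those part types unchanged. Video input is left out because this page does not specify its part format.

```python
import base64


def data_url(image_bytes, mime="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def user_message(text, *image_urls):
    """Build a user message: a plain string, or a multimodal array when images are given."""
    if not image_urls:
        return {"role": "user", "content": text}
    parts = [{"type": "text", "text": text}]
    for url in image_urls:
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": parts}


text_only = user_message("What is artificial intelligence?")
with_image = user_message("Describe this picture.", "https://example.com/cat.png")
```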

temperature

number

Controls the randomness of the model’s output. Values range from 0 to 2. Lower values produce more focused, deterministic responses; higher values produce more varied output. We recommend adjusting either temperature or top_p, but not both simultaneously.

top_p

number

Nucleus sampling parameter. The model samples only from the most likely tokens whose cumulative probability reaches top_p, so lower values narrow the set of candidate tokens. We recommend adjusting either top_p or temperature, but not both simultaneously.
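
To make the idea of nucleus sampling concrete, here is a toy illustration on a four-token distribution. It only demonstrates the general technique; the function name and probabilities are invented for the example, and it is not a description of how any model behind this endpoint implements sampling.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities to sum to 1."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    kept = {}
    cumulative = 0.0
    # Walk tokens from most to least likely.
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


toy = {"cat": 0.5, "dog": 0.3, "eel": 0.15, "fox": 0.05}
candidates = nucleus_filter(toy, 0.75)  # only "cat" and "dog" remain
```

With top_p set to 1 every token stays eligible, which is why lowering top_p and lowering temperature both make output more focused.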

stream

boolean

When set to true, the response is returned as a stream of Server-Sent Events (SSE). The stream ends with a final data: [DONE] message.

max_completion_tokens

integer

The maximum number of tokens the model may generate in its response.

stop

string | array

One or more sequences at which the model will stop generating further tokens. Pass a single string or an array of up to four strings.
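
The parameters above can be combined into a request body with some client-side checks, as in the following sketch. The function name build_payload is hypothetical, and the endpoint may enforce its limits differently. The 0 to 1 range for top_p is an assumption taken from the OpenAI convention, since this page does not state it.

```python
import warnings

ROLES = {"system", "user", "assistant"}


def build_payload(model, messages, *, temperature=None, top_p=None,
                  stream=False, max_completion_tokens=None, stop=None):
    """Assemble a request body, checking the limits described above."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ROLES or "content" not in message:
            raise ValueError("each message needs a valid role and a content field")
    payload = {"model": model, "messages": list(messages)}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature
    if top_p is not None:
        if not 0 <= top_p <= 1:  # assumed range, see the note above
            raise ValueError("top_p must be between 0 and 1")
        payload["top_p"] = top_p
    if temperature is not None and top_p is not None:
        warnings.warn("adjust temperature or top_p, not both")
    if stream:
        payload["stream"] = True
    if max_completion_tokens is not None:
        if max_completion_tokens < 1:
            raise ValueError("max_completion_tokens must be positive")
        payload["max_completion_tokens"] = max_completion_tokens
    if stop is not None:
        stops = [stop] if isinstance(stop, str) else list(stop)
        if not 1 <= len(stops) <= 4:
            raise ValueError("stop accepts one string or up to four strings")
        payload["stop"] = stop if isinstance(stop, str) else stops
    return payload


payload = build_payload(
    "claude-opus-4-7",
    [{"role": "user", "content": "List three colors."}],
    temperature=0.2,
    max_completion_tokens=64,
    stop=["\n\n"],
)
```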

Response Fields

id

string

A unique identifier for the request.

object

string

The type of the returned object. Always chat.completion for non-streaming responses.

created

integer

The Unix timestamp (in seconds) of when the response was created.

model

string

The name of the model that was used to generate the response.

choices[].message.role

string

The role of the generated message. Always assistant.

choices[].message.content

string

The text content generated by the model.

choices[].finish_reason

string

The reason the model stopped generating tokens. Common values include stop (natural end) and length (token limit reached).

usage.prompt_tokens

integer

The number of tokens in the input messages.

usage.completion_tokens

integer

The number of tokens in the generated response.

usage.total_tokens

integer

The total number of tokens used in the request (prompt + completion).
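
Reading these fields from a parsed response might look like the sketch below. The function name summarize and the sample dictionary are illustrative only; the sample simply mirrors the field layout described above.

```python
def summarize(response):
    """Pull the reply text, truncation status, and token usage from a response."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    prompt = usage.get("prompt_tokens")
    completion = usage.get("completion_tokens")
    total = usage.get("total_tokens")
    return {
        "text": choice["message"]["content"],
        # finish_reason "length" means the token limit cut the reply short.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": total,
        "usage_adds_up": None not in (prompt, completion, total)
        and prompt + completion == total,
    }


# Illustrative response in the documented shape.
sample = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1761635478,
    "model": "gpt-5.4",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 20, "completion_tokens": 34, "total_tokens": 54},
}
summary = summarize(sample)
```

Checking for a finish_reason of length is a simple way to decide whether to retry with a larger max_completion_tokens value.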

Code Examples

curl -X POST http://apillm.globalaiopc.com/gw_llm_power/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is artificial intelligence?"}
    ],
    "temperature": 0.7
  }'
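
The same request can be sketched in Python using only the standard library. The function names make_request and chat are hypothetical, and error handling is omitted; the URL, headers, and body match the curl command above.

```python
import json
import urllib.request

BASE_URL = "http://apillm.globalaiopc.com/gw_llm_power"


def make_request(api_key, payload):
    """Build the POST request for /v1/chat/completions without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key, payload, timeout=60):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(make_request(api_key, payload), timeout=timeout) as resp:
        return json.load(resp)


payload = {
    "model": "gpt-5.4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is artificial intelligence?"},
    ],
    "temperature": 0.7,
}
request = make_request("YOUR_API_KEY", payload)
# To actually send it: reply = chat("YOUR_API_KEY", payload)
```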

Example Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1761635478,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Artificial intelligence (AI) is the simulation of human intelligence processes by computer systems, including learning, reasoning, and self-correction."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 34,
    "total_tokens": 54
  }
}

To receive a streaming response, set "stream": true in your request body. The API will return a series of Server-Sent Events (SSE), each containing a partial response delta. The stream terminates with a final data: [DONE] message.
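
A streamed reply can be reassembled by reading the data: lines until [DONE] arrives, as in this sketch. It assumes each event carries a JSON chunk with the text under choices[].delta.content, which is the OpenAI streaming layout; this page does not spell out the chunk shape, so treat that field path and the sample lines as assumptions.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from an iterable of SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                yield piece


# Hypothetical event lines in the assumed chunk format.
sample_stream = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_deltas(sample_stream))
```

In a real client the lines would come from the open HTTP response, decoded from bytes to text one line at a time.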
