Create Chat Request (openai)
Text Series
Create Chat Request (openai)
POST
Create Chat Request (openai)
Introduction
Universal text chat API supporting OpenAI-compatible large language models for generating conversational responses. Through a unified API interface, you can call multiple mainstream large models including OpenAI, Claude, DeepSeek, Grok, and Tongyi Qianwen.Authentication
Bearer Token, e.g.
Bearer sk-xxxxxxxxxxRequest Parameters
Model identifier, supported models include:
- OpenAI series:
o4-mini,o3-mini,gpt-5.2,gpt-5.1,gpt-4o,gpt-4o-mini, etc. - Claude series:
claude-opus-4-6,claude-sonnet-4-5-20250929,claude-haiku-4-5-20251001, etc. - DeepSeek series:
deepseek-v3-1-250821,deepseek-v3,deepseek-r1, etc. - Grok series:
grok-4,grok-4-fast-reasoning,grok-3, etc. - Gemini series:
gemini-3-pro-preview,gemini-3-flash-preview,nano-banana-proand-thinking/-nothinking/-thinking-<budget>/-thinking-low/-thinking-highvariants - Domestic models:
glm-5,glm-4.7,doubao-seed-1-8-251228(Doubao Seed series),qwen3-coder-plus,kimi-k2.5, etc.
Conversation message list, each element contains
role (user/system/assistant) and contentRandomness control, 0-2, higher values = more random responses
Whether to enable streaming output, returns SSE format chunked data
Maximum number of tokens to generate, controls response length
Nucleus sampling parameter, 0-1, controls generation diversity
Basic Examples
- Non-Streaming Request
- Streaming Request (SSE)
- Python Example
Advanced Features
Tool Calling (Functions / Tools)
Supports OpenAI-compatible tool calling format, applicable to GPT, Claude, DeepSeek, Grok, Tongyi Qianwen, and other models.- Phase 1: Model Returns Tool Call
- Phase 2: Return Tool Execution Result
Structured Output (JSON Schema)
Supports controlling output format throughresponse_format parameter, applicable to GPT, Claude, Grok, and other models.
Thinking Capability
Some models support thinking capability (Thinking/Reasoning), which can display the reasoning process when generating responses. Different models implement this differently:- DeepSeek
- Tongyi Qianwen
- Gemini
DeepSeek models support enabling thinking capability through the
thinking field:- Default
thinking.typeis"disabled", need to explicitly set to"enabled"to enable - The output form of thinking capability may vary by model version
- It is recommended to use with
stream: truefor better interactive experience
Tongyi Qianwen Extended Features
Tongyi Qianwen models support extended features such as search, speech recognition, etc. All extended parameters need to be placed in theparameters object.
- Search Feature
- Speech Recognition
All extended parameters for Tongyi Qianwen (such as
enable_search, search_options, asr_options, temperature, top_p, etc.) need to be placed in the parameters object, not at the top level of the request body.Web Search Features
Some models support real-time web search, allowing access to the latest information and including citation sources in responses.- Claude Web Search
- Grok Live Search
Claude models do not support enabling web search functionality through the Example with Location Information (showing tool call flow):
web_search_options parameter, so it can only be implemented through tool calls, and may be unstable due to network and prompt reasons. For details, see Tool Calling (Functions / Tools) above.Basic Example (showing tool call flow):- Search functionality will increase response time and token consumption (including search result content)
- Search results will automatically include citation sources in the response
- Supported models include Claude Sonnet 4, Claude 3 Opus, etc.
- In multi-turn conversations, tool calls and results will be visible in message history, and the model can continue the conversation based on previous search results
Stability Notice:
- Web search functionality depends on upstream proxy services and external search services, and may have the following instabilities:
- Network fluctuations: Network connection issues may cause search requests to timeout or fail
- Service limitations: Search services may have rate limits, timeout limits, or temporary unavailability
- Search result quality: Some queries may not find relevant information, or search results may be of poor quality
- Model judgment: The model will automatically determine whether a search is needed based on the question, and in some cases may not trigger a search
- This is an inherent characteristic of web search functionality. It is recommended to:
- Implement retry mechanisms in critical scenarios
- Handle search failures with graceful degradation (e.g., using the model’s knowledge base to answer)
- Avoid relying entirely on web search in scenarios with extremely high real-time requirements
GPT File Input (Responses API)
GPT-5 and other models support file input functionality, which needs to be called through the/v1/responses endpoint, not /v1/chat/completions.
- Upload via File URL
- Upload via Base64 Encoding
You can upload PDF files by linking external URLs:
- File size limit: Single file not exceeding 50 MB, total size of all files in a single request not exceeding 50 MB
- Supported models:
gpt-4o,gpt-4o-mini,gpt-5-chat, and other models that support text and image input - Reasoning models (o1, o3-mini, o4-mini) should also use the
/v1/responsesendpoint if they need to use reasoning capability
Grok Reasoning Capability
Grok models (especiallygrok-4-fast-reasoning) support reasoning capability. The usage in the response distinguishes between completion_tokens and reasoning_tokens:
completion_tokens - reasoning_tokens
Response Format
- Non-Streaming Response
- Streaming Response
Error Handling
| Exception Type | Trigger Scenario | Return Message |
|---|---|---|
| AuthenticationError | Invalid or unauthorized API key | Error: Invalid or unauthorized API key |
| NotFoundError | Model does not exist or is not supported | Error: Model [model] does not exist or is not supported |
| APIConnectionError | Network interruption or server not responding | Error: Cannot connect to API server |
| APIError | Request format error and other server-side exceptions | API request failed: [error details] |
Supported Model Series
OpenAI Series
- GPT-4.1, GPT-4o, GPT-4o Mini, GPT-3.5-turbo
- Reasoning models: o3-mini, o4-mini (need to use
/v1/responsesendpoint)
Claude Series (Anthropic)
- Claude Sonnet 4, Claude 3 Opus, Claude 3 Haiku
DeepSeek Series
- DeepSeek V3, DeepSeek R1
Grok Series (xAI)
- Grok-4, Grok-3, Grok-3-fast, Grok-4-fast-reasoning
Tongyi Qianwen Series (Qwen)
- Qwen3-omni-flash, etc.
Doubao Seed Series
- doubao-seed-1-8-251228, etc.
Other Models
- Gemini series, GLM series (including glm-5), Kimi series, etc.
Notes
- In the
messageslist,systemrole is used to set model behavior,userrole is for user questions - Multi-turn conversations require appending history (including
assistantrole responses) - Requires
openailibrary:pip install openai - Different models may have different levels of support for certain features, it is recommended to check the specific model documentation before use
Related Resources
FAQ
View FAQ for chat interface
Model List
View all supported model information
