Introduction to TCB LLM API Supported Protocols
CloudBaseAI supports the following protocols: Chat Completions, Responses API, and Anthropic Messages API. Any AI tool that supports custom APIs can be directly integrated.
Prerequisites
- You have activated a TCB environment and obtained the environment ID cloudBaseEnvID.
- Enable the required model in the AI console.
- Obtain the Base URL and API Key in the AI console.
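As a minimal sketch of how the environment ID and API Key fit together, the helpers below assemble an endpoint URL and the common request headers. The function names are hypothetical, and the URL pattern is an assumption taken from the curl examples in this document; if the Base URL shown in your AI console differs, use that value instead.

```python
# Sketch only: endpoint() and auth_headers() are hypothetical helper names.
# The host pattern is assumed from the curl examples in this document.

def endpoint(env_id: str, path: str) -> str:
    """Join the assumed Base URL with a protocol path such as 'chat/completions'."""
    base = f"https://{env_id}.api.tcloudbasegateway.com/v1/ai/cloudbase"
    return f"{base}/{path.lstrip('/')}"


def auth_headers(api_key: str) -> dict:
    """Headers sent with every request shown in this document."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Keeping these two values in one place makes it easier to switch between the three protocols, since only the path suffix changes.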
Chat Completions
The most widely used LLM chat protocol. It is OpenAI-compatible and is the default in the vast majority of AI tools.
Taking tool calling as an example, the complete process requires two API requests:
First Request: Send the user message and available tool definitions
curl https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-d '{
"model": "hy3-preview",
"messages": [
{ "role": "user", "content": "Please check the weather in Beijing for me today." }
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Obtain the weather forecast for a specified city",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" }
},
"required": ["location"]
}
}
}
]
}'
The model returns finish_reason: "tool_calls", indicating that the client needs to execute a tool. After the client calls the tool and obtains the result, it appends the assistant's tool_calls message and the tool's return value to messages, then initiates the second request:
Second Request: Send the tool call result
curl https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-d '{
"model": "hy3-preview",
"messages": [
{ "role": "user", "content": "Please check the weather in Beijing for me today." },
{
"role": "assistant",
"tool_calls": [{
"id": "call_123",
"type": "function",
"function": { "name": "get_current_weather", "arguments": "{\"location\": \"Beijing\"}" }
}]
},
{
"role": "tool",
"tool_call_id": "call_123",
"content": "{\"temperature\": \"25℃\", \"weather\": \"sunny\"}"
}
]
}'
The model generates the final response based on the result returned by the tool.
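The two-request flow above can be sketched in client code as follows. This is an illustration, not a definitive implementation: get_current_weather is a stub with made-up data, append_tool_results is a hypothetical helper, and the response shape (choices[0].message.tool_calls) is assumed to follow the OpenAI-compatible format.

```python
import json


def get_current_weather(location: str) -> dict:
    # Hypothetical local tool; a real client would query a weather service.
    return {"temperature": "25℃", "weather": "sunny"}


# Maps tool names declared in the request to local Python functions.
TOOLS = {"get_current_weather": get_current_weather}


def append_tool_results(messages: list, response: dict) -> list:
    """Return the messages list to send in the second request."""
    choice = response["choices"][0]
    if choice.get("finish_reason") != "tool_calls":
        return messages  # no tool to execute, nothing to append

    assistant = choice["message"]
    updated = messages + [assistant]  # keep the assistant's tool_calls turn
    for call in assistant["tool_calls"]:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        result = TOOLS[fn["name"]](**args)
        updated.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id issued by the model
            "content": json.dumps(result, ensure_ascii=False),
        })
    return updated
```

The returned list replaces messages in the second request body; model and the other fields stay the same.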
Responses API
OpenAI's newer-generation API. It supports state management natively: multi-turn conversations are linked through previous_response_id, so clients do not need to concatenate historical messages manually.
First Request: Send the user message and tool definitions
curl "https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-d '{
"model": "hy3-preview",
"input": "What is the weather like in Beijing today?",
"tools": [
{
"type": "function",
"name": "get_current_weather",
"description": "Obtain the real-time weather for a specified city",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" }
},
"required": ["location"]
}
}
],
"tool_choice": "auto"
}'
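The model returns a response with an ID (e.g., resp_xxx) whose output contains an item of type function_call carrying a call_id. After the client calls the tool, it sends the result as a function_call_output item with the same call_id and links the context through previous_response_id: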
Second Request: Send the tool result with previous_response_id
curl "https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/responses" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-d '{
"model": "hy3-preview",
"previous_response_id": "resp_570bc50971764d1fb1c167d90fc1a584",
"input": [
{
"type": "function_call_output",
"call_id": "call_abc",
"output": "{\"temperature\": \"26°C\", \"condition\": \"sunny\"}"
}
]
}'
With the Responses API, the client only needs to pass previous_response_id to link the context; the complete conversation history is still maintained on the server.
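A minimal sketch of building the second request body from the first response is shown below. build_followup and run_tool are hypothetical names, and the response shape (a top-level id plus output items with type, name, arguments, and call_id) is assumed to match the OpenAI Responses API format.

```python
import json
from typing import Callable


def build_followup(response: dict,
                   run_tool: Callable[[str, dict], dict],
                   model: str = "hy3-preview") -> dict:
    """Build the second request body from the first response.

    run_tool(name, args) is supplied by the caller and executes the tool locally.
    """
    outputs = []
    for item in response.get("output", []):
        if item.get("type") != "function_call":
            continue  # skip text or other output items
        args = json.loads(item["arguments"])  # arguments arrive as a JSON string
        result = run_tool(item["name"], args)
        outputs.append({
            "type": "function_call_output",
            "call_id": item["call_id"],  # ties the result to the model's call
            "output": json.dumps(result, ensure_ascii=False),
        })
    return {
        "model": model,
        "previous_response_id": response["id"],  # links to the stored context
        "input": outputs,
    }
```

Only the tool outputs are sent in input; the earlier user message is not repeated because previous_response_id refers to it.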
Anthropic Messages API
The native API protocol of Anthropic's Claude models.
TCB uses the Authorization: Bearer method for authentication, not the native Anthropic x-api-key header. Requests must include the anthropic-version header.
Request Example (with tool call):
curl "https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/v1/messages" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "hy3-preview",
"max_tokens": 1024,
"tools": [
{
"name": "get_weather",
"description": "Obtain the real-time weather for a specified city",
"input_schema": {
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}
}
],
"messages": [
{ "role": "user", "content": "What is the weather like in Beijing today?" }
]
}'
When the model returns stop_reason: "tool_use", the content array contains a block with type: "tool_use". Record the id in this block; it is needed as tool_use_id when the tool result is sent back.
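As a sketch of that follow-up step, the helpers below build the headers described above and the messages for the next request, following the standard Anthropic Messages format (a tool_result block referencing tool_use_id). The function names and the run_tool callback are hypothetical, and TCB is assumed to accept the same message shape.

```python
import json
from typing import Callable


def anthropic_headers(api_key: str, version: str = "2023-06-01") -> dict:
    """TCB authenticates with a Bearer token rather than the x-api-key header."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        "anthropic-version": version,
    }


def with_tool_results(messages: list, response: dict,
                      run_tool: Callable[[str, dict], dict]) -> list:
    """Return the messages list for the request that carries the tool results."""
    if response.get("stop_reason") != "tool_use":
        return messages  # the model did not ask for a tool

    results = []
    for block in response["content"]:
        if block.get("type") != "tool_use":
            continue  # text blocks may precede the tool_use block
        result = run_tool(block["name"], block["input"])  # input is already a dict
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # the id recorded from the tool_use block
            "content": json.dumps(result, ensure_ascii=False),
        })
    return messages + [
        {"role": "assistant", "content": response["content"]},
        {"role": "user", "content": results},
    ]
```

Unlike Chat Completions, the tool result travels inside a user message, and the tool input arrives as a JSON object rather than a JSON string.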
Using Cache
The Anthropic Messages API supports two caching modes:
Explicit Caching
Add "cache_control": {"type": "ephemeral"} as a field on specific content blocks to mark cache breakpoints. A maximum of 4 breakpoints can be set per request. This is suitable for scenarios requiring precise control over cache locations:
curl "https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/v1/messages" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "hy3-preview",
"max_tokens": 1024,
"system": [
{
"type": "text",
"text": "You are a professional weather query assistant.",
"cache_control": {"type": "ephemeral"}
}
],
"tools": [
{
"name": "get_weather",
"description": "Obtain the real-time weather for a specified city",
"input_schema": {
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
},
"cache_control": {"type": "ephemeral"}
}
],
"messages": [
{ "role": "user", "content": "What is the weather like in Beijing today?" }
]
}'
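Because a request may carry at most 4 breakpoints, a client can check the body before sending it. The sketch below is one way to do that; mark_cacheable, count_breakpoints, and check_breakpoints are hypothetical helpers, and the assumption is that breakpoints appear only on system blocks, tool definitions, and message content blocks.

```python
MAX_BREAKPOINTS = 4  # limit stated for explicit caching


def mark_cacheable(block: dict) -> dict:
    """Return a copy of a content block with an explicit cache breakpoint."""
    return {**block, "cache_control": {"type": "ephemeral"}}


def count_breakpoints(body: dict) -> int:
    """Count blocks in system, tools, and messages that carry cache_control."""
    blocks = []
    system = body.get("system")
    if isinstance(system, list):  # a plain string system prompt has no blocks
        blocks.extend(system)
    blocks.extend(body.get("tools", []))
    for message in body.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):
            blocks.extend(content)
    return sum(1 for b in blocks if isinstance(b, dict) and "cache_control" in b)


def check_breakpoints(body: dict) -> int:
    """Raise ValueError if the body exceeds the breakpoint limit."""
    count = count_breakpoints(body)
    if count > MAX_BREAKPOINTS:
        raise ValueError(f"{count} cache breakpoints set, maximum is {MAX_BREAKPOINTS}")
    return count
```

For the request above, check_breakpoints would count two breakpoints: one on the system block and one on the tool definition.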
Automatic Caching
Declare "cache_control": {"type": "ephemeral"} at the outermost level of the request body. The system automatically identifies repeated static prefixes for caching, making it suitable for multi-turn conversation scenarios:
curl "https://{{YOUR-ENV-ID}}.api.tcloudbasegateway.com/v1/ai/cloudbase/v1/messages" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {{YOUR-API-KEY}}" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "hy3-preview",
"max_tokens": 1024,
"cache_control": {"type": "ephemeral"},
"system": "You are a professional weather query assistant.",
"tools": [
{
"name": "get_weather",
"description": "Obtain the real-time weather for a specified city",
"input_schema": {
"type": "object",
"properties": {
"city": { "type": "string", "description": "City name" }
},
"required": ["city"]
}
}
],
"messages": [
{ "role": "user", "content": "What is the weather like in Beijing today?" }
]
}'
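For multi-turn use, the point of automatic caching is that the static prefix (system prompt and tool definitions) stays byte-for-byte identical while messages grows. The sketch below illustrates this under stated assumptions: build_turn is a hypothetical helper, and the field layout is copied from the request above.

```python
SYSTEM = "You are a professional weather query assistant."

WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Obtain the real-time weather for a specified city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}


def build_turn(history: list, user_text: str) -> dict:
    """Build a request body for the next turn with automatic caching enabled."""
    return {
        "model": "hy3-preview",
        "max_tokens": 1024,
        "cache_control": {"type": "ephemeral"},  # top level enables automatic mode
        "system": SYSTEM,          # unchanged across turns
        "tools": [WEATHER_TOOL],   # unchanged across turns
        "messages": list(history) + [{"role": "user", "content": user_text}],
    }


# Two consecutive turns share the same system prompt, tools, and first message.
first = build_turn([], "What is the weather like in Beijing today?")
history = first["messages"] + [{"role": "assistant", "content": "It is sunny in Beijing."}]
second = build_turn(history, "And in Shanghai?")
```

The assistant reply in this example is placeholder text; in practice it would be the content returned by the previous response.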