Chat Completions

The Chat Completions endpoint generates model responses from a conversation. It follows the OpenAI Chat Completions format.

Endpoint

POST https://api.cline.bot/api/v1/chat/completions

Request Headers

Header	Required	Description
`Authorization`	Yes	`Bearer YOUR_API_KEY`
`Content-Type`	Yes	`application/json`
`HTTP-Referer`	No	Your application URL (for usage tracking)
`X-Title`	No	Your application name (for usage logs)
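
As a rough sketch of how the headers in the table above might be assembled, the helper below adds the two optional headers only when values are supplied. The function name `build_headers` and its parameter names are illustrative assumptions, not part of the API.

```python
def build_headers(api_key, referer=None, title=None):
    """Return request headers; the two optional ones are added only if given."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if referer:
        headers["HTTP-Referer"] = referer  # optional: application URL
    if title:
        headers["X-Title"] = title  # optional: application name
    return headers


# Hypothetical application URL and name, used only for illustration.
headers = build_headers("YOUR_API_KEY", referer="https://example.com", title="My App")
print(headers)
```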

Request Body

Parameter	Type	Required	Default	Description
`model`	string	Yes		Model ID in `provider/model` format. See Models.
`messages`	array	Yes		Conversation messages. Each has a `role` (`system`, `user`, `assistant`, or `tool` for tool results) and `content`.
`stream`	boolean	No	`true`	Return the response as a stream of Server-Sent Events.
`tools`	array	No		Tool/function definitions in OpenAI format.
`temperature`	number	No	Model default	Sampling temperature (0.0 to 2.0). Lower values are more deterministic.
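
One possible way to assemble a request body from the parameters above is sketched below. The helper name `build_chat_request` is hypothetical, and the client-side checks (a `/` in the model ID, temperature within 0.0 to 2.0) simply mirror the table; the server may validate differently. Optional fields are omitted when unset so the documented defaults apply.

```python
import json


def build_chat_request(model, messages, stream=True, temperature=None, tools=None):
    """Build a request body dict from the documented parameters."""
    if "/" not in model:
        raise ValueError("model must be in provider/model format")
    body = {"model": model, "messages": list(messages), "stream": stream}
    if temperature is not None:
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        body["temperature"] = temperature
    if tools:
        body["tools"] = list(tools)
    return body


payload = build_chat_request(
    "anthropic/claude-sonnet-4-6",
    [{"role": "user", "content": "Hello"}],
    stream=False,
    temperature=0.2,
)
print(json.dumps(payload, indent=2))
```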

Message Format

Each message in the messages array has this structure:

{
  "role": "user",
  "content": "Your message here"
}

Roles:

Role	Purpose
`system`	Sets the model’s behavior and persona. Place first in the array.
`user`	The human’s input.
`assistant`	Previous model responses (for multi-turn conversations).

Multi-Turn Conversation

Include previous messages to maintain context:

{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "What is a closure in JavaScript?"},
    {"role": "assistant", "content": "A closure is a function that..."},
    {"role": "user", "content": "Can you show me an example?"}
  ]
}
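
A minimal sketch of keeping conversation history on the client side follows. The `Conversation` class and its method names are assumptions made for illustration; the only behavior taken from this page is that earlier messages are resent with each request.

```python
class Conversation:
    """Accumulates messages so each request carries the full history."""

    def __init__(self, model, system_prompt=None):
        self.model = model
        self.messages = []
        if system_prompt:
            # The system message goes first in the array.
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": text})

    def to_request(self):
        # Copy the list so later turns do not mutate an already-built request.
        return {"model": self.model, "messages": list(self.messages)}


conv = Conversation("anthropic/claude-sonnet-4-6", "You are a helpful coding assistant.")
conv.add_user("What is a closure in JavaScript?")
conv.add_assistant("A closure is a function that...")
conv.add_user("Can you show me an example?")
request_body = conv.to_request()
print([m["role"] for m in request_body["messages"]])
```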

Streaming Response

When stream: true (the default), the response is a series of Server-Sent Events:

data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}],"model":"anthropic/claude-sonnet-4-6"}

data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,"finish_reason":"stop"}],"model":"anthropic/claude-sonnet-4-6","usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}

data: [DONE]

Each data: line contains a JSON chunk, except the final data: [DONE] line, which marks the end of the stream. Key fields:

Field	Description
`id`	Generation ID, consistent across all chunks
`choices[0].delta.content`	The new text in this chunk
`choices[0].delta.reasoning`	Reasoning/thinking content (for reasoning models)
`choices[0].finish_reason`	`stop` when complete, `tool_calls` when the model requests a tool call, `error` on failure
`usage`	Token counts and cost (included in the final chunk)
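
The stream format above could be consumed roughly as follows. This is a sketch that works on an iterable of already-decoded text lines; the function names `iter_chunks` and `collect_stream` are illustrative, and the sample lines are copied from the example above rather than fetched from the API.

```python
import json


def iter_chunks(lines):
    """Yield parsed JSON chunks from SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_stream(lines):
    """Concatenate delta.content and capture finish_reason and usage."""
    parts, finish_reason, usage = [], None, None
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), finish_reason, usage


sample = [
    'data: {"id":"gen-abc123","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":"The capital"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" of France"},"index":0}]}',
    '',
    'data: {"id":"gen-abc123","choices":[{"delta":{"content":" is Paris."},"index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":14,"completion_tokens":8,"cost":0.000066}}',
    '',
    'data: [DONE]',
]
text, finish_reason, usage = collect_stream(sample)
print(text)
```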

Usage Object

The final chunk includes token usage and cost:

{
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "cost": 0.000315
  }
}

Field	Description
`prompt_tokens`	Total input tokens
`completion_tokens`	Total output tokens
`prompt_tokens_details.cached_tokens`	Tokens served from cache (reduces cost)
`cost`	Total cost in USD for this request
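
As an illustration, the usage fields might be condensed for logging as shown below. The helper `summarize_usage` is hypothetical, and the sketch assumes that `cached_tokens` is a subset of `prompt_tokens`, which this page does not state explicitly. Missing fields default to zero, since not every example response on this page includes `cost`.

```python
def summarize_usage(usage):
    """Flatten a usage object into a few derived numbers."""
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return {
        "total_tokens": prompt + completion,
        # Assumption: cached tokens are counted inside prompt_tokens.
        "uncached_prompt_tokens": prompt - cached,
        "cached_tokens": cached,
        "cost_usd": usage.get("cost", 0.0),
    }


summary = summarize_usage({
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "prompt_tokens_details": {"cached_tokens": 0},
    "cost": 0.000315,
})
print(summary)
```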

Non-Streaming Response

When stream: false, the response is a single JSON object:

{
  "id": "gen-abc123",
  "model": "anthropic/claude-sonnet-4-6",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 8
  }
}
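
Reading the reply out of a non-streaming response could look like the sketch below. The function name `first_message` is an assumption, and the sample dict is the example response above.

```python
def first_message(response):
    """Return (content, finish_reason) from the first choice."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    choice = choices[0]
    message = choice.get("message", {})
    # content may be absent, for example when the model returns tool_calls.
    return message.get("content"), choice.get("finish_reason")


response = {
    "id": "gen-abc123",
    "model": "anthropic/claude-sonnet-4-6",
    "choices": [
        {
            "message": {"role": "assistant", "content": "The capital of France is Paris."},
            "finish_reason": "stop",
            "index": 0,
        }
    ],
    "usage": {"prompt_tokens": 14, "completion_tokens": 8},
}
content, reason = first_message(response)
print(content, reason)
```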

Tool Calling

You can define tools that the model can call using the OpenAI function calling format:

{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}

When the model decides to call a tool, the response includes a tool_calls array:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

To continue the conversation after a tool call, include the tool result:

{
  "model": "anthropic/claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": "What's the weather in San Francisco?"},
    {"role": "assistant", "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"location\": \"San Francisco, CA\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 62, \"condition\": \"foggy\"}"}
  ]
}
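
The round trip above might be handled on the client roughly as follows. This is a sketch: the `TOOL_REGISTRY` mapping, the `run_tool_calls` helper, and the local `get_weather` function (which returns fixed data) are all hypothetical. Note that `arguments` arrives as a JSON string and has to be decoded before use.

```python
import json


def get_weather(location):
    # Hypothetical stand-in; a real implementation would query a weather service.
    return {"temperature": 62, "condition": "foggy"}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Execute each requested tool and return the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        func = TOOL_REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments is a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # links the result to the originating call
            "content": json.dumps(func(**args)),
        })
    return results


assistant_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}",
        },
    }],
}
tool_messages = run_tool_calls(assistant_message)
print(tool_messages)
```

The returned messages would then be appended after the assistant message in the next request's messages array.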

Reasoning Models

Some models support extended thinking (reasoning). When using these models, the response may include reasoning content in the streaming delta:

{"choices":[{"delta":{"reasoning":"Let me think about this step by step..."}}]}

Reasoning tokens are separate from the main content and appear in the delta.reasoning field. Some providers return encrypted reasoning blocks via delta.reasoning_details that can be passed back in subsequent requests to preserve the reasoning trace.

Not all models support reasoning. See Models for which models have reasoning capabilities.
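
A sketch of keeping reasoning text apart from the answer text while streaming is shown below. It operates on already-parsed chunk dicts, the function name `split_reasoning` is illustrative, and it does not handle `delta.reasoning_details`, whose structure is not described on this page.

```python
def split_reasoning(chunks):
    """Accumulate delta.reasoning and delta.content into separate strings."""
    reasoning, content = [], []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("reasoning"):
                reasoning.append(delta["reasoning"])
            if delta.get("content"):
                content.append(delta["content"])
    return "".join(reasoning), "".join(content)


# Hand-written sample chunks, not real API output.
chunks = [
    {"choices": [{"delta": {"reasoning": "Let me think about this step by step..."}}]},
    {"choices": [{"delta": {"content": "The answer"}}]},
    {"choices": [{"delta": {"content": " is 4."}}]},
]
thinking, answer = split_reasoning(chunks)
print(answer)
```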

Complete Example

curl -X POST https://api.cline.bot/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
      {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
      {"role": "user", "content": "Explain what an API is."}
    ],
    "stream": true
  }'
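
For comparison, the same request might be built in Python with only the standard library, as sketched below. The helper names `make_request` and `stream_lines` are assumptions, not part of any SDK; the request is only constructed here, and nothing is sent until `stream_lines` is iterated.

```python
import json
import urllib.request

API_URL = "https://api.cline.bot/api/v1/chat/completions"


def make_request(api_key, body):
    """Build (but do not send) the POST request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_lines(request):
    """Send the request and yield decoded response lines (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            yield raw.decode("utf-8").rstrip("\r\n")


body = {
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [
        {"role": "system", "content": "You are a concise assistant. Answer in one sentence."},
        {"role": "user", "content": "Explain what an API is."},
    ],
    "stream": True,
}
request = make_request("YOUR_API_KEY", body)
print(request.get_method(), request.full_url)
```

With a real key, iterating over `stream_lines(request)` would yield the `data:` lines described under Streaming Response.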

Related

Models

Browse available models and their capabilities.

Errors

Handle errors and implement retry logic.

SDK Examples

Use this endpoint from Python, Node.js, and more.

Authentication

API key management and security practices.
