API Reference

Complete API documentation for Jan Server services.

Overview

Jan Server provides four main APIs:

  • LLM API - OpenAI-compatible chat completions, conversations, and models
  • Response API - Multi-step tool orchestration and AI responses
  • Media API - Image upload, storage, and management
  • MCP Tools API - Model Context Protocol tools for search, scraping, and execution

Base URLs

Recommended: Point all public clients at the Kong gateway (port 8000) for consistent authentication, rate limiting, and routing.

Authentication

All API endpoints require authentication via Kong Gateway (port 8000).

Two Ways to Authenticate

1. Bearer Token (Recommended for development):

# Get guest token
curl -X POST http://localhost:8000/llm/auth/guest-login

# Response
{
  "access_token": "eyJhbGci...",
  "refresh_token": "eyJhbGci...",
  "expires_in": 300
}

# Use token
curl -H "Authorization: Bearer <access_token>" \
  http://localhost:8000/v1/models

2. API Key (Recommended for production):

curl -H "X-API-Key: sk_your_api_key_here" \
  http://localhost:8000/v1/chat/completions

Token Refresh

Guest tokens expire after 5 minutes. Use the refresh token to get a new access token:

curl -X POST http://localhost:8000/llm/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{"refresh_token": "your-refresh-token"}'
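In a client, the guest-login and refresh calls above are usually wrapped in a small helper so the token is only refreshed when it is about to expire. A minimal standard-library sketch; the 30-second safety margin and the helper names are my own choices, while the endpoint path is the `/llm/auth/refresh` route shown above:

```python
import json
import urllib.request

AUTH_BASE = "http://localhost:8000/llm/auth"  # Kong gateway routes from this guide

def is_expired(issued_at: float, expires_in: int, now: float, margin: int = 30) -> bool:
    """Treat a token as expired `margin` seconds early so requests never race the server."""
    return now >= issued_at + expires_in - margin

def refresh(refresh_token: str) -> dict:
    """Exchange a refresh token for a new access token (network call)."""
    body = json.dumps({"refresh_token": refresh_token}).encode()
    req = urllib.request.Request(
        f"{AUTH_BASE}/refresh",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `is_expired(...)` before each request and refreshing only when it returns true avoids a refresh round-trip on every call.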

Quick Start

1. Get Authentication Token

TOKEN=$(curl -s -X POST http://localhost:8000/llm/auth/guest-login | jq -r '.access_token')

2. Chat Completion

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-0.5b-instruct",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

3. List Models

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8000/v1/models

4. Use MCP Tools

curl -X POST http://localhost:8000/v1/mcp \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "google_search",
      "arguments": {"q": "AI news"}
    }
  }'
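The same tools/call request can be issued from Python with the standard library. Note that the sketch adds a JSON-RPC `id` field, which many JSON-RPC 2.0 servers require even though the curl example above omits it; treat that field as an assumption and drop it if your server rejects it:

```python
import json
import urllib.request

def mcp_payload(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call envelope for the MCP endpoint."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def call_tool(token: str, tool: str, arguments: dict) -> dict:
    """POST the envelope to the gateway's MCP route (network call)."""
    req = urllib.request.Request(
        "http://localhost:8000/v1/mcp",
        data=json.dumps(mcp_payload(tool, arguments)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```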

LLM API

OpenAI-compatible API for chat completions, conversations, and models.

Port: 8080 (direct) / 8000 (via gateway)

Features:

  • Chat completions with streaming
  • Conversation management
  • Project organization
  • Multi-model support
  • Vision/image support via jan_* IDs

Endpoints:

  • POST /v1/chat/completions - Generate AI responses
  • GET /v1/models - List available models
  • GET /v1/conversations - List conversations
  • POST /v1/conversations - Create conversation
  • GET /v1/projects - List projects
  • POST /v1/projects - Create project


Response API

Multi-step orchestration engine that executes tools and generates responses.

Port: 8082 (direct) / 8000 (via gateway)

Features:

  • Chain multiple tools together
  • Up to 8 execution steps
  • Automatic tool selection
  • Final answer generation

Endpoints:

  • POST /v1/responses - Create response with tool execution
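The request body for POST /v1/responses is not specified in this overview. The sketch below assumes an OpenAI Responses-style schema with `model` and `input` fields; verify those names against the service's own documentation before relying on them:

```python
import json
import urllib.request

def response_body(prompt: str, model: str = "qwen2.5-0.5b-instruct") -> dict:
    # "model" and "input" are assumed field names (OpenAI Responses-style schema).
    return {"model": model, "input": prompt}

def create_response(token: str, prompt: str) -> dict:
    """POST a response request; the orchestrator runs up to 8 tool steps before answering."""
    req = urllib.request.Request(
        "http://localhost:8000/v1/responses",
        data=json.dumps(response_body(prompt)).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```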

Media API

Image upload and storage with S3 integration.

Port: 8285 (direct) / 8000 (via gateway)

Features:

  • Upload from URL or base64
  • S3 cloud storage
  • jan_* ID system
  • Presigned URL generation
  • Duplicate detection

Endpoints:

  • POST /v1/media/upload - Upload media
  • POST /v1/media/resolve - Resolve jan_* IDs
  • POST /v1/media/presigned-url - Get presigned URL
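For base64 uploads, the image bytes are base64-encoded before being placed in the JSON body. A minimal sketch; the upload field names (`data`, `mime_type`) are assumptions not confirmed by this overview, so check the Media API docs for the exact schema:

```python
import base64
import json
import urllib.request

def encode_image(raw: bytes) -> str:
    """Base64-encode raw image bytes for a JSON upload body."""
    return base64.b64encode(raw).decode("ascii")

def upload_media(token: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    # "data" and "mime_type" are assumed field names; verify against the Media API docs.
    body = json.dumps({"data": encode_image(image_bytes), "mime_type": mime_type}).encode()
    req = urllib.request.Request(
        "http://localhost:8000/v1/media/upload",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # the response should include a jan_* media ID
```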

MCP Tools API

Model Context Protocol tools for search, scraping, vector search, and code execution.

Port: 8091 (direct) / 8000 (via gateway)

Available Tools:

  • google_search - Web search via Serper/SearXNG
  • scrape - Fetch and parse web pages
  • file_search_index / file_search_query - Vector search
  • python_exec - Sandboxed Python execution

Endpoints:

  • POST /v1/mcp (method: tools/list) - List tools
  • POST /v1/mcp (method: tools/call) - Execute tool

Response Format

All successful responses return JSON:

{
  "data": {...},
  "meta": {...}
}

Error Format

All errors follow this structure:

{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "message": "Parameter 'model' is required",
    "param": "model",
    "request_id": "req_123xyz"
  }
}

Error Types

| Type | Description | HTTP Status |
|------|-------------|-------------|
| invalid_request_error | Invalid request parameters | 400 |
| auth_error | Authentication failed | 401 |
| permission_error | Insufficient permissions | 403 |
| not_found_error | Resource not found | 404 |
| rate_limit_error | Too many requests | 429 |
| internal_error | Server error | 500 |

Rate Limiting

Requests through Kong Gateway are rate-limited:

  • Development: 100 requests/minute per IP
  • Headers: X-RateLimit-Limit-minute, X-RateLimit-Remaining-minute
  • Exceeding limit returns HTTP 429
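A client should back off when it sees HTTP 429 rather than retrying immediately. One common pattern is capped exponential backoff; the helper below is a client-side sketch, not part of Jan Server:

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Capped exponential backoff: 1s, 2s, 4s, ... up to `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def get_with_retry(url: str, token: str, max_attempts: int = 5) -> bytes:
    """GET a gateway URL, sleeping and retrying when Kong answers HTTP 429."""
    for attempt in range(max_attempts):
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # only rate-limit errors are retried
            time.sleep(backoff_delay(attempt))
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

The `X-RateLimit-Remaining-minute` header can be checked proactively to slow down before the limit is hit.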

OpenAI SDK Compatibility

Jan Server is fully compatible with OpenAI SDKs. Simply point the SDK to your Jan Server instance:

Python:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-token"
)

response = client.chat.completions.create(
    model="qwen2.5-0.5b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'your-token'
});

const response = await client.chat.completions.create({
  model: 'qwen2.5-0.5b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
});


Base URL

All API requests should be made to:

https://api.Jan.ai

Authentication

API Key Authentication

Include your API key in the Authorization header:

Authorization: Bearer sk-Jan-...

OAuth 2.0 Authentication

For applications that need user-specific access:

Authorization: Bearer <oauth_token>

API Versioning

All API endpoints are versioned. The current version is v1:

https://api.Jan.ai/v1/chat/completions

Request Format

All requests must include:

  • Content-Type: application/json header
  • JSON request body
  • Proper authentication headers

Response Format

All responses are returned as JSON with the following structure:

{
  "id": "resp_1234567890",
  "object": "response",
  "created": 1640995200,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

Error Handling

Error Response Format

{
  "error": {
    "message": "Invalid request parameters",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "param": "model"
  }
}
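Clients often flatten this envelope into a single log line before surfacing it. A minimal sketch; `describe_error` is a hypothetical helper name, not part of any SDK:

```python
import json

def describe_error(body: str) -> str:
    """Turn the error envelope above into a one-line log message."""
    err = json.loads(body)["error"]
    parts = [err.get("type", ""), err.get("code", ""), err.get("message", "")]
    if err.get("param"):
        parts.append(f"(param: {err['param']})")
    return " ".join(p for p in parts if p)
```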

Common Error Codes

| Code | Description |
|------|-------------|
| invalid_parameter | Invalid request parameters |
| rate_limit_exceeded | Rate limit exceeded |
| insufficient_quota | Insufficient API quota |
| model_not_found | Specified model not available |
| unauthorized | Invalid or missing API key |
| forbidden | Access denied |
| not_found | Resource not found |
| server_error | Internal server error |

Rate Limits

Rate limits are applied per API key:

| Tier | Requests per Minute | Requests per Day |
|------|---------------------|------------------|
| Free | 3 | 50 |
| Pro | 60 | 100,000 |
| Enterprise | Custom | Custom |

Rate limit headers are included in responses:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1640995260

Pagination

List endpoints support pagination:

GET /v1/conversations?limit=10&offset=20

Response includes pagination metadata:

{
  "object": "list",
  "data": [...],
  "has_more": true,
  "total_count": 150
}
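Fetching every item is a loop over `offset` until `has_more` turns false. The sketch below takes the page-fetching function as a parameter, so the paging logic stays independent of HTTP details:

```python
from typing import Callable, Iterator

def iter_all(fetch_page: Callable[[int, int], dict], limit: int = 10) -> Iterator[dict]:
    """Walk a limit/offset list endpoint until has_more is false.

    `fetch_page(limit, offset)` should return a page in the JSON shape shown above.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page.get("has_more"):
            break
        offset += limit
```

Wiring in a real fetcher is one closure over the conversations endpoint and the auth header.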

Streaming

Some endpoints support streaming responses:

curl -X POST https://api.Jan.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming responses use Server-Sent Events (SSE):

data: {"id":"resp_123","object":"response","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"resp_123","object":"response","choices":[{"delta":{"content":" there"}}]}

data: [DONE]
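Each `data:` line carries an independent JSON document, and the literal `[DONE]` sentinel ends the stream. A minimal parser for lines in that format:

```python
import json
from typing import Iterable, Iterator

def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON events from an SSE stream, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```

With the OpenAI SDKs shown earlier, passing `stream=True` handles this parsing for you; the helper is only needed when consuming the raw HTTP stream.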
