API Reference

Complete API documentation for Jan Server services.

Overview

Jan Server provides four main APIs:

  • LLM API - OpenAI-compatible chat completions, conversations, and models
  • Response API - Multi-step tool orchestration and AI responses
  • Media API - Image upload, storage, and management
  • MCP Tools API - Model Context Protocol tools for search, scraping, and execution

Base URLs

Recommended: Point all public clients at the Kong gateway (port 8000) for consistent authentication, rate limiting, and routing.

Authentication

All API endpoints require authentication via Kong Gateway (port 8000).

Two Ways to Authenticate

1. Bearer Token (Recommended for development):

# Get guest token
curl -X POST http://localhost:8000/llm/auth/guest-login

# Response
{
  "access_token": "eyJhbGci...",
  "refresh_token": "eyJhbGci...",
  "expires_in": 300
}

# Use token
curl -H "Authorization: Bearer <access_token>" \
  http://localhost:8000/v1/models

2. API Key (Recommended for production):

curl -H "X-API-Key: sk_your_api_key_here" \
  http://localhost:8000/v1/chat/completions

Token Refresh

Guest tokens expire after 5 minutes. Use the refresh token to get a new access token:

curl -X POST http://localhost:8000/llm/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{"refresh_token": "your-refresh-token"}'
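In a client, the guest-login and refresh calls above are usually wrapped in a small helper so the token is only refreshed when it is about to expire. A minimal standard-library sketch; the 30-second safety margin and the helper names are my own choices, while the endpoint path is the `/llm/auth/refresh` route shown above:

```python
import json
import urllib.request

AUTH_BASE = "http://localhost:8000/llm/auth"  # Kong gateway routes from this guide

def is_expired(issued_at: float, expires_in: int, now: float, margin: int = 30) -> bool:
    """Treat a token as expired `margin` seconds early so requests never race the server."""
    return now >= issued_at + expires_in - margin

def refresh(refresh_token: str) -> dict:
    """Exchange a refresh token for a new access token (network call)."""
    body = json.dumps({"refresh_token": refresh_token}).encode()
    req = urllib.request.Request(
        f"{AUTH_BASE}/refresh",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `is_expired(...)` before each request and refreshing only when it returns true avoids a refresh round-trip on every call.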

Quick Start

1. Get Authentication Token

TOKEN=$(curl -s -X POST http://localhost:8000/llm/auth/guest-login | jq -r '.access_token')

2. Chat Completion

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-0.5b-instruct",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

3. List Models

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8000/v1/models

4. Use MCP Tools

curl -X POST http://localhost:8000/v1/mcp \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "google_search",
      "arguments": {"q": "AI news"}
    }
  }'
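The same tools/call request can be issued from Python with the standard library. Note that the sketch adds a JSON-RPC `id` field, which many JSON-RPC 2.0 servers require even though the curl example above omits it; treat that field as an assumption and drop it if your server rejects it:

```python
import json
import urllib.request

def mcp_payload(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call envelope for the MCP endpoint."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def call_tool(token: str, tool: str, arguments: dict) -> dict:
    """POST the envelope to the gateway's MCP route (network call)."""
    req = urllib.request.Request(
        "http://localhost:8000/v1/mcp",
        data=json.dumps(mcp_payload(tool, arguments)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```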

LLM API

OpenAI-compatible API for chat completions, conversations, and models.

Port: 8080 (direct) / 8000 (via gateway)

Features:

  • Chat completions with streaming
  • Conversation management
  • Project organization
  • Multi-model support
  • Vision/image support via jan_* IDs

Endpoints:

  • POST /v1/chat/completions - Generate AI responses
  • GET /v1/models - List available models
  • GET /v1/conversations - List conversations
  • POST /v1/conversations - Create conversation
  • GET /v1/projects - List projects
  • POST /v1/projects - Create project


Response API

Multi-step orchestration engine that executes tools and generates responses.

Port: 8082 (direct) / 8000 (via gateway)

Features:

  • Chain multiple tools together
  • Up to 8 execution steps
  • Automatic tool selection
  • Final answer generation

Endpoints:

  • POST /v1/responses - Create response with tool execution
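The request body for POST /v1/responses is not specified in this overview. The sketch below assumes an OpenAI Responses-style schema with `model` and `input` fields; verify those names against the service's own documentation before relying on them:

```python
import json
import urllib.request

def response_body(prompt: str, model: str = "qwen2.5-0.5b-instruct") -> dict:
    # "model" and "input" are assumed field names (OpenAI Responses-style schema).
    return {"model": model, "input": prompt}

def create_response(token: str, prompt: str) -> dict:
    """POST a response request; the orchestrator runs up to 8 tool steps before answering."""
    req = urllib.request.Request(
        "http://localhost:8000/v1/responses",
        data=json.dumps(response_body(prompt)).encode(),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```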

Media API

Image upload and storage with S3 integration.

Port: 8285 (direct) / 8000 (via gateway)

Features:

  • Upload from URL or base64
  • S3 cloud storage
  • jan_* ID system
  • Presigned URL generation
  • Duplicate detection

Endpoints:

  • POST /v1/media/upload - Upload media
  • POST /v1/media/resolve - Resolve jan_* IDs
  • POST /v1/media/presigned-url - Get presigned URL
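For base64 uploads, the image bytes are base64-encoded before being placed in the JSON body. A minimal sketch; the upload field names (`data`, `mime_type`) are assumptions not confirmed by this overview, so check the Media API docs for the exact schema:

```python
import base64
import json
import urllib.request

def encode_image(raw: bytes) -> str:
    """Base64-encode raw image bytes for a JSON upload body."""
    return base64.b64encode(raw).decode("ascii")

def upload_media(token: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    # "data" and "mime_type" are assumed field names; verify against the Media API docs.
    body = json.dumps({"data": encode_image(image_bytes), "mime_type": mime_type}).encode()
    req = urllib.request.Request(
        "http://localhost:8000/v1/media/upload",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # the response should include a jan_* media ID
```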

MCP Tools API

Model Context Protocol tools for search, scraping, vector search, and code execution.

Port: 8091 (direct) / 8000 (via gateway)

Available Tools:

  • google_search - Web search via Serper/SearXNG
  • scrape - Fetch and parse web pages
  • file_search_index / file_search_query - Vector search
  • python_exec - Sandboxed Python execution

Endpoints:

  • POST /v1/mcp (method: tools/list) - List tools
  • POST /v1/mcp (method: tools/call) - Execute tool

Response Format

All successful responses return JSON:

{
  "data": {...},
  "meta": {...}
}

Error Format

All errors follow this structure:

{
  "error": {
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "message": "Parameter 'model' is required",
    "param": "model",
    "request_id": "req_123xyz"
  }
}

Error Types

| Type | Description | HTTP Status |
|------|-------------|-------------|
| invalid_request_error | Invalid request parameters | 400 |
| auth_error | Authentication failed | 401 |
| permission_error | Insufficient permissions | 403 |
| not_found_error | Resource not found | 404 |
| rate_limit_error | Too many requests | 429 |
| internal_error | Server error | 500 |

Rate Limiting

Requests through Kong Gateway are rate-limited:

  • Development: 100 requests/minute per IP
  • Headers: X-RateLimit-Limit-minute, X-RateLimit-Remaining-minute
  • Exceeding limit returns HTTP 429
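A client should back off when it sees HTTP 429 rather than retrying immediately. One common pattern is capped exponential backoff; the helper below is a client-side sketch, not part of Jan Server:

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Capped exponential backoff: 1s, 2s, 4s, ... up to `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def get_with_retry(url: str, token: str, max_attempts: int = 5) -> bytes:
    """GET a gateway URL, sleeping and retrying when Kong answers HTTP 429."""
    for attempt in range(max_attempts):
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # only rate-limit errors are retried
            time.sleep(backoff_delay(attempt))
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

The `X-RateLimit-Remaining-minute` header can be checked proactively to slow down before the limit is hit.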

OpenAI SDK Compatibility

Jan Server is fully compatible with OpenAI SDKs. Simply point the SDK to your Jan Server instance:

Python:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-token"
)

response = client.chat.completions.create(
    model="qwen2.5-0.5b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'your-token'
});

const response = await client.chat.completions.create({
  model: 'qwen2.5-0.5b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
});


Base URL

All API requests should be made to:

https://api.Jan.ai

Authentication

API Key Authentication

Include your API key in the Authorization header:

Authorization: Bearer sk-Jan-...

OAuth 2.0 Authentication

For applications that need user-specific access:

Authorization: Bearer <oauth_token>

API Versioning

All API endpoints are versioned. The current version is v1:

https://api.Jan.ai/v1/chat/completions

Request Format

All requests must include:

  • Content-Type: application/json header
  • JSON request body
  • Proper authentication headers

Response Format

All responses are returned as JSON with the following structure:

{
  "id": "resp_1234567890",
  "object": "response",
  "created": 1640995200,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

Error Handling

Error Response Format

{
  "error": {
    "message": "Invalid request parameters",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "param": "model"
  }
}
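Clients often flatten this envelope into a single log line before surfacing it. A minimal sketch; `describe_error` is a hypothetical helper name, not part of any SDK:

```python
import json

def describe_error(body: str) -> str:
    """Turn the error envelope above into a one-line log message."""
    err = json.loads(body)["error"]
    parts = [err.get("type", ""), err.get("code", ""), err.get("message", "")]
    if err.get("param"):
        parts.append(f"(param: {err['param']})")
    return " ".join(p for p in parts if p)
```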

Common Error Codes

| Code | Description |
|------|-------------|
| invalid_parameter | Invalid request parameters |
| rate_limit_exceeded | Rate limit exceeded |
| insufficient_quota | Insufficient API quota |
| model_not_found | Specified model not available |
| unauthorized | Invalid or missing API key |
| forbidden | Access denied |
| not_found | Resource not found |
| server_error | Internal server error |

Rate Limits

Rate limits are applied per API key:

| Tier | Requests per Minute | Requests per Day |
|------|---------------------|------------------|
| Free | 3 | 50 |
| Pro | 60 | 100,000 |
| Enterprise | Custom | Custom |

Rate limit headers are included in responses:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1640995260

Pagination

List endpoints support pagination:

GET /v1/conversations?limit=10&offset=20

Response includes pagination metadata:

{
  "object": "list",
  "data": [...],
  "has_more": true,
  "total_count": 150
}
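Fetching every item is a loop over `offset` until `has_more` turns false. The sketch below takes the page-fetching function as a parameter, so the paging logic stays independent of HTTP details:

```python
from typing import Callable, Iterator

def iter_all(fetch_page: Callable[[int, int], dict], limit: int = 10) -> Iterator[dict]:
    """Walk a limit/offset list endpoint until has_more is false.

    `fetch_page(limit, offset)` should return a page in the JSON shape shown above.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["data"]
        if not page.get("has_more"):
            break
        offset += limit
```

Wiring in a real fetcher is one closure over the conversations endpoint and the auth header.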

Streaming

Some endpoints support streaming responses:

curl -X POST https://api.Jan.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming responses use Server-Sent Events (SSE):

data: {"id":"resp_123","object":"response","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"resp_123","object":"response","choices":[{"delta":{"content":" there"}}]}

data: [DONE]
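Each `data:` line carries an independent JSON document, and the literal `[DONE]` sentinel ends the stream. A minimal parser for lines in that format:

```python
import json
from typing import Iterable, Iterator

def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON events from an SSE stream, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)
```

With the OpenAI SDKs shown earlier, passing `stream=True` handles this parsing for you; the helper is only needed when consuming the raw HTTP stream.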
