# Getting Started with Jan Server
Welcome! This guide will help you get Jan Server up and running in minutes.
**Note:** This guide covers Docker Compose setup for local development. For Kubernetes deployment (production/staging), see:
- Kubernetes Setup Guide - Complete step-by-step Kubernetes deployment
- Deployment Guide - All deployment options (Kubernetes, Docker Compose, Hybrid)
## Prerequisites

Before you begin, ensure you have:
- Docker Desktop (Windows/macOS) or Docker Engine + Docker Compose (Linux)
- Make (usually pre-installed on macOS/Linux, install on Windows)
- Git
- At least 8GB RAM available
- For GPU inference: NVIDIA GPU with CUDA support
Optional (for development):
- Go 1.21+ (1.23+ required for `jan-cli api-test`)
## Quick Setup

### 1. Clone the Repository

```bash
git clone https://github.com/janhq/jan-server.git
cd jan-server
```

### 2. Run the Setup Wizard (Recommended)

```bash
make quickstart
```

`make quickstart` launches the jan-cli wizard. It prompts for your LLM provider (local vLLM vs. a remote API), MCP search provider, and Media API preference, then writes `.env` plus `config/secrets.env`. When configuration finishes, it automatically starts Docker Compose. Re-run the command at any time to update settings (answer Y when asked to overwrite `.env`).
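Once the wizard completes, you can sanity-check what it wrote without exposing secret values. A minimal sketch; it assumes only the two file paths this guide says the wizard generates:

```bash
# List which configuration keys were written, hiding the values.
for f in .env config/secrets.env; do
  if [ -f "$f" ]; then
    echo "== $f =="
    # print only KEY names from KEY=value lines, never the values
    grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$f" | cut -d= -f1
  else
    echo "missing: $f"
  fi
done
```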
#### Wizard options at a glance
- LLM provider: Local vLLM (GPU, downloads models) or remote OpenAI-compatible endpoint (no GPU required).
- MCP search: Serper (API key), SearXNG (local, no key), or None (MCP Tools still run without search).
- Media API: Enable for uploads and jan_* IDs, or disable if not needed.
#### Example configurations
- Full local (GPU): Local vLLM + SearXNG + Media enabled → everything runs locally.
- Cloud LLM + Serper: Remote API endpoint + Serper key + Media enabled → light local footprint, best search.
- Minimal: Remote API endpoint + no search + Media disabled → smallest local runtime.
#### Manual configuration (if you cannot run the wizard)

```bash
# Copy templates
cp .env.template .env
cp config/secrets.env.example config/secrets.env

# Edit with your values
nano .env
nano config/secrets.env

# Populate defaults and validate
make setup
```

`make setup` uses jan-cli in non-interactive mode to check dependencies, ensure directories exist, and pull base images.
Configuration details:
- Canonical defaults live in `config/defaults.yaml` (generated from Go structs)
- Secrets belong in `config/secrets.env` (copied from `config/secrets.env.example`)
- Environment templates (Docker/Kubernetes) are documented in Configuration System
### 3. Start Services (skip if quickstart already did this)

```bash
# Start full stack (CPU inference)
make up-full

# Optional: start monitoring stack
make monitor-up
```

Wait for all services to start (30-60 seconds). You can monitor progress with:

```bash
make logs
```

### 4. What the Wizard Does
- Prompts for LLM/search/media choices.
- Writes `.env` and `config/secrets.env`.
- Checks Docker availability and networks.
- Starts the Compose stack and waits for health (about 30 seconds).
### 5. Verify Installation

```bash
make health-check
```

You should see all services reporting as healthy.
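If `make health-check` reports problems, a quick way to see which service ports are even accepting connections is a plain bash probe. A rough sketch using bash's built-in `/dev/tcp`; the ports are the Docker Compose defaults from the Access Services table below and may differ if you changed them in `.env`:

```bash
# Probe the default service ports; "open" means something is listening.
# Requires bash (the /dev/tcp redirection is a bash feature).
for port in 8000 8080 8082 8090 8091 8186 8285; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
    echo "port $port: open"
  else
    echo "port $port: closed"
  fi
done
```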
## Access Services
Once running, you can access:
| Service | URL | Credentials |
|---|---|---|
| API Gateway | http://localhost:8000 | - |
| API Documentation | http://localhost:8000/api/swagger/index.html | - |
| LLM API | http://localhost:8080 | Authorization: Bearer <token> |
| Response API | http://localhost:8082 | Authorization: Bearer <token> |
| Media API | http://localhost:8285 | Authorization: Bearer <token> |
| MCP Tools | http://localhost:8091 | Authorization: Bearer <token> |
| Memory Tools | http://localhost:8090 | Authorization: Bearer <token> |
| Realtime API | http://localhost:8186 | Authorization: Bearer <token> |
| Keycloak Console | http://localhost:8085 | admin/admin |
| Grafana Dashboards | http://localhost:3331 | admin/admin (after make monitor-up) |
| Prometheus | http://localhost:9090 | - (after make monitor-up) |
| Jaeger Tracing | http://localhost:16686 | - (after make monitor-up) |
Service activation:
- vLLM starts only if you choose the local provider (GPU). Use the remote provider option for a lighter stack.
- MCP Tools and Vector Store always run; the search choice only affects which search provider is used.
- Media API is optional based on the wizard choice.
## Your First API Call

### 1. Get a Guest Token via Kong

```bash
curl -X POST http://localhost:8000/llm/auth/guest-login
```

All traffic to http://localhost:8000 flows through the Kong gateway, which validates Keycloak-issued JWTs or API keys (use `Authorization: Bearer <token>` or `X-API-Key: sk_*` headers).
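For scripting the steps below, it is handy to capture the token in a shell variable. A sketch assuming `python3` is available for JSON parsing (`jq` would work equally well):

```bash
# Request a guest token and extract access_token from the JSON response.
TOKEN=$(curl -s -X POST http://localhost:8000/llm/auth/guest-login \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["access_token"])')
echo "Got token: ${TOKEN:0:12}..."
```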
Response:

```json
{
  "access_token": "eyJhbGci...",
  "refresh_token": "eyJhbGci...",
  "expires_in": 300
}
```

### 2. Make a Chat Completion Request
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jan-v1-4b",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "stream": false
  }'
```

### 3. Try Streaming
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jan-v1-4b",
    "messages": [
      {"role": "user", "content": "Tell me a short story"}
    ],
    "stream": true
  }'
```

### 4. Use MCP Tools
```bash
# List available tools
curl -X POST http://localhost:8000/v1/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```
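The raw JSON-RPC response can be noisy; to list just the tool names, you can filter it. A sketch assuming `python3` and the standard MCP response shape (`result.tools[].name`):

```bash
# List available MCP tool names only.
curl -s -X POST http://localhost:8000/v1/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}' \
  | python3 -c '
import json, sys
resp = json.load(sys.stdin)
for tool in resp.get("result", {}).get("tools", []):
    print(tool["name"])'
```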
```bash
# Google search
curl -X POST http://localhost:8000/v1/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
      "name": "google_search",
      "arguments": {
        "q": "latest AI news"
      }
    }
  }'
```

## Enable Monitoring (Optional)
To enable the full observability stack:

```bash
make monitor-up
```

Access:
- Grafana: http://localhost:3331 (admin/admin)
- Prometheus: http://localhost:9090
- Jaeger: http://localhost:16686
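As a quick smoke test of the monitoring stack, you can ask Prometheus which scrape targets are up via its HTTP query API. A sketch assuming the default port above and `python3` for parsing:

```bash
# Query the built-in `up` metric: 1 = target healthy, 0 = scrape failing.
curl -s 'http://localhost:9090/api/v1/query?query=up' \
  | python3 -c '
import json, sys
data = json.load(sys.stdin)
for r in data.get("data", {}).get("result", []):
    print(r["metric"].get("job", "?"), "up" if r["value"][1] == "1" else "down")'
```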
## Common Commands

```bash
# View logs
make logs             # All services
make logs-api         # API profile (LLM, Response, Media)
make logs-mcp         # MCP Tools profile

# Check status
make health-check     # Hit health endpoints
docker compose ps     # Container status

# Restart services
make restart-full     # Restart everything
make restart-api      # Restart API profile
make restart-llm-api  # Restart only LLM API

# Stop services
make down             # Stop all containers (keeps volumes)
make down-clean       # Stop containers and remove volumes
```

## Troubleshooting
### Services won't start

```bash
# Check Docker
docker --version
docker compose version

# Check status
make health-check
docker compose ps

# View errors
make logs

# Full reset
make down
make down-clean
make setup
make up-full
```

### Port conflicts
If you get port binding errors:

```bash
# Check what's using ports
# Windows PowerShell:
netstat -ano | findstr "8000 8080 8085"

# macOS/Linux:
lsof -i:8000
lsof -i:8080
lsof -i:8085

# Kill conflicting processes or change ports in .env
```

### vLLM GPU issues
```bash
# Verify GPU availability
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```

If no GPU is detected:
- Rerun `make quickstart` and choose the remote API option (skips local vLLM)
- Or run `make up-vllm-cpu` to start the CPU-only vLLM profile when testing locally
### Database connection errors

```bash
# Reset database
make db-reset

# Check database logs
docker compose logs api-db

# Verify connection
make db-console
```

### API returns 401 Unauthorized

- Check that the token hasn't expired (default: 5 minutes)
- Get a new guest token: `curl -X POST http://localhost:8000/llm/auth/guest-login`
- Check that the `Authorization: Bearer <token>` header is set
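Because JWT payloads are just base64url-encoded JSON, you can inspect a token's expiry locally instead of guessing. A sketch assuming `python3`; `jwt_exp` is a hypothetical helper name, not part of jan-server:

```bash
# Print the `exp` claim of a JWT and whether it is already in the past.
jwt_exp() {
  python3 - "$1" <<'EOF'
import base64, json, sys, time
payload = sys.argv[1].split(".")[1]
payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
claims = json.loads(base64.urlsafe_b64decode(payload))
exp = claims.get("exp", 0)
print("expires at:", exp, "| expired:", time.time() > exp)
EOF
}

# Example with a dummy token whose exp claim is 0 (long expired):
jwt_exp "eyJhbGciOiJIUzI1NiJ9.eyJleHAiOjB9.sig"
# → expires at: 0 | expired: True
```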
### Update configuration later

```bash
make quickstart   # rerun the wizard and choose Y to overwrite .env
```

The wizard safely reuses existing values and updates your choices without manual edits.
## What's Next?
Now that you have Jan Server running:
- Explore the API:
- Learn Development:
  - Development Guide
  - Development Guide - Dev-Full Mode (recommended for development)
  - Testing Guide
- Understand Architecture:
- Deploy to Production:
## Need Help?

Quick Reference: `make help` | All Commands: `make help-all`