AI Service API Reference

Complete REST API documentation for AI service endpoints.

Base URL

All AI endpoints are prefixed with /ai:

http://localhost:8000/ai

Endpoints Overview

  • POST /ai/chat - Send a chat message and receive an AI response
  • POST /ai/chat/stream - Stream AI responses with Server-Sent Events
  • GET /ai/conversations - List user conversations with metadata
  • GET /ai/conversations/{id} - Get a conversation with full message history
  • GET /ai/health - Check AI service health status
  • GET /ai/version - Get service version and capabilities

Chat Endpoints

POST /ai/chat

Send a chat message and receive an AI response.

Request Body:

Field            Type           Required  Description
message          string         Yes       User's chat message
conversation_id  string | null  No        Existing conversation ID (creates new if null)
user_id          string         No        User identifier (default: "api-user")

Response:

Response Schema
{
  "message_id": "uuid",
  "content": "AI response text",
  "conversation_id": "uuid",
  "response_time_ms": 1234.5
}

Examples:

curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain FastAPI in one sentence",
    "user_id": "my-user"
  }'
Basic Chat Request
import httpx

response = httpx.post(  # (1)!
    "http://localhost:8000/ai/chat",
    json={
        "message": "What is async/await in Python?",  # (2)!
        "user_id": "my-user"  # (3)!
    }
)

data = response.json()
print(f"AI: {data['content']}")  # (4)!
print(f"Conversation: {data['conversation_id']}")  # (5)!
  1. POST request to the chat endpoint
  2. The user's message - this is what gets sent to the AI
  3. User identifier for conversation tracking (optional, defaults to "api-user")
  4. Extract and print the AI's response text
  5. Save this conversation_id to continue the conversation in future requests
Fetch API Example
const response = await fetch('http://localhost:8000/ai/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'How do I handle errors in async functions?',
    user_id: 'web-user'
  })
});

const data = await response.json();
console.log(`AI: ${data.content}`);

Continue Conversation:

# First message
curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is FastAPI?"}' \
  | jq -r '.conversation_id' > conv_id.txt

# Follow-up message (maintains context)
curl -X POST http://localhost:8000/ai/chat \
  -H "Content-Type: application/json" \
  -d "{
    \"message\": \"Can you show me an example?\",
    \"conversation_id\": \"$(cat conv_id.txt)\"
  }"

POST /ai/chat/stream

Stream the chat response with Server-Sent Events (SSE).

Request Body:

Same as /ai/chat:

{
  "message": "string",
  "conversation_id": "string | null",
  "user_id": "string"
}

Response:

Server-Sent Events stream with the following event types:

Event: connect

event: connect
data: {"status": "connected", "message": "Streaming started"}

Event: chunk (repeated for each content chunk)

event: chunk
data: {
  "content": "text delta",
  "is_final": false,
  "is_delta": true,
  "message_id": "uuid",
  "conversation_id": "uuid",
  "timestamp": "2024-01-15T10:30:00Z"
}

Event: final

event: final
data: {
  "content": "complete response",
  "is_final": true,
  "is_delta": false,
  "message_id": "uuid",
  "conversation_id": "uuid",
  "timestamp": "2024-01-15T10:30:05Z",
  "response_time_ms": 1234.5,
  "provider": "groq",
  "model": "llama-3.1-70b-versatile"
}

Event: complete

event: complete
data: {"status": "completed", "message": "Stream finished"}

Event: error (on error)

event: error
data: {"error": "AI service error", "detail": "error message"}

Examples:

curl -X POST http://localhost:8000/ai/chat/stream \
  -H "Content-Type: application/json" \
  -d '{"message": "Write a Python hello world"}' \
  --no-buffer
SSE Streaming Client
// EventSource only issues GET requests, so use fetch to POST and read the SSE stream
const response = await fetch('http://localhost:8000/ai/chat/stream', {  // (1)!
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'Explain async programming',
    user_id: 'web-user'
  })
});

const reader = response.body.getReader();  // (2)!
const decoder = new TextDecoder();
let buffer = '';
let eventType = '';
let fullResponse = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  const lines = buffer.split('\n');
  buffer = lines.pop();  // keep any partial line for the next read

  for (const line of lines) {
    if (line.startsWith('event:')) {
      eventType = line.slice('event:'.length).trim();
    } else if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice('data:'.length));
      if (eventType === 'chunk') {  // (3)!
        fullResponse += data.content;
        updateUI(fullResponse);
      } else if (eventType === 'final') {  // (4)!
        console.log('Complete response:', data.content);
        console.log('Response time:', data.response_time_ms);
      } else if (eventType === 'error') {  // (5)!
        console.error('Error:', data.detail);
      }
    }
  }
}

console.log('Stream complete');  // (6)!
  1. POST the chat request to the streaming endpoint; EventSource is GET-only, so fetch is used to read the SSE stream
  2. Read raw bytes from the response body and decode them incrementally
  3. Accumulate each streamed chunk and update the UI in real time
  4. Handle the final event with the complete response and timing
  5. Log errors reported by the server
  6. The loop exits when the reader signals done and the stream is complete
httpx Streaming
import httpx
import json

url = "http://localhost:8000/ai/chat/stream"
payload = {"message": "Explain decorators in Python", "user_id": "my-user"}

event_type = None
with httpx.stream("POST", url, json=payload) as response:  # (1)!
    for line in response.iter_lines():  # (2)!
        if line.startswith('event:'):
            event_type = line[len('event:'):].strip()
        elif line.startswith('data:'):
            data = json.loads(line[len('data:'):])

            if event_type == 'chunk':  # (3)!
                print(data['content'], end='', flush=True)
            elif event_type == 'final':  # (4)!
                print(f"\n\nResponse time: {data['response_time_ms']}ms")
  1. Open streaming connection with context manager
  2. Iterate through Server-Sent Events line by line
  3. Print each chunk as it arrives for real-time output
  4. Show final response metadata when stream completes
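
The same loop also works asynchronously with httpx.AsyncClient - a sketch against the same endpoint and event format, not an additional documented example:

import asyncio
import json

import httpx

async def stream_chat(message: str) -> None:
    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            "http://localhost:8000/ai/chat/stream",
            json={"message": message, "user_id": "my-user"},
        ) as response:
            event_type = None
            async for line in response.aiter_lines():
                if line.startswith('event:'):
                    event_type = line[len('event:'):].strip()
                elif line.startswith('data:') and event_type == 'chunk':
                    # Print each content delta as it arrives
                    chunk = json.loads(line[len('data:'):])
                    print(chunk['content'], end='', flush=True)

asyncio.run(stream_chat("Explain decorators in Python"))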

Conversation Management

GET /ai/conversations

List conversations for a user.

Query Parameters:

Parameter  Type     Required  Default     Description
user_id    string   No        "api-user"  User identifier
limit      integer  No        50          Maximum conversations to return

Response:

[
  {
    "id": "uuid",
    "title": "Conversation title",
    "message_count": 5,
    "last_activity": "2024-01-15T10:30:00Z",
    "provider": "groq",
    "model": "llama-3.1-70b-versatile"
  }
]

Example:

# List conversations
curl "http://localhost:8000/ai/conversations?user_id=my-user&limit=10"

# With Python
import httpx

response = httpx.get(
    "http://localhost:8000/ai/conversations",
    params={"user_id": "my-user", "limit": 10}
)

conversations = response.json()
for conv in conversations:
    print(f"{conv['id']}: {conv['title']} ({conv['message_count']} messages)")

GET /ai/conversations/{conversation_id}

Get a specific conversation with full message history.

Path Parameters:

Parameter        Type    Description
conversation_id  string  Conversation UUID

Query Parameters:

Parameter  Type    Required  Default     Description
user_id    string  No        "api-user"  User identifier for access control

Response:

{
  "id": "uuid",
  "title": "Conversation title",
  "provider": "groq",
  "model": "llama-3.1-70b-versatile",
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:30:00Z",
  "message_count": 5,
  "messages": [
    {
      "id": "msg-uuid-1",
      "role": "user",
      "content": "What is FastAPI?",
      "timestamp": "2024-01-15T10:00:00Z"
    },
    {
      "id": "msg-uuid-2",
      "role": "assistant",
      "content": "FastAPI is a modern web framework...",
      "timestamp": "2024-01-15T10:00:02Z"
    }
  ],
  "metadata": {
    "user_id": "my-user",
    "last_response_time_ms": 1234.5
  }
}

Example:

curl "http://localhost:8000/ai/conversations/CONVERSATION_ID?user_id=my-user"

Service Status

GET /ai/health

AI service health status and configuration.

Response:

{
  "service": "ai",
  "status": "healthy",
  "enabled": true,
  "provider": "groq",
  "model": "llama-3.1-70b-versatile",
  "agent_ready": true,
  "total_conversations": 42,
  "configuration_valid": true,
  "validation_errors": []
}

Status Values:

  • healthy - Service operational and properly configured
  • unhealthy - Configuration issues or service errors
  • error - Critical service failure

Example:

curl http://localhost:8000/ai/health | jq
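
A small Python check that acts on the documented status values (the alerting logic here is illustrative, not part of the API):

import httpx

health = httpx.get("http://localhost:8000/ai/health").json()

if health["status"] != "healthy":
    # validation_errors lists configuration problems when the service is unhealthy
    print(f"AI service is {health['status']}: {health['validation_errors']}")
else:
    print(f"OK: {health['provider']}/{health['model']}, "
          f"{health['total_conversations']} conversations")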

GET /ai/version

Service version and feature information.

Response:

{
  "service": "ai",
  "engine": "pydantic-ai",
  "version": "1.0",
  "features": [
    "chat",
    "conversation_management",
    "multi_provider_support",
    "health_monitoring",
    "api_endpoints",
    "cli_commands"
  ],
  "providers_supported": [
    "openai",
    "anthropic",
    "google",
    "groq",
    "mistral",
    "cohere"
  ]
}

Example:

curl http://localhost:8000/ai/version | jq
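
Clients can use this endpoint for capability checks before enabling functionality - a brief sketch against the fields shown above:

import httpx

version = httpx.get("http://localhost:8000/ai/version").json()

print(f"Engine: {version['engine']} v{version['version']}")
if "groq" in version["providers_supported"]:
    print("Groq provider is supported")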

Error Handling

HTTP Status Codes

Status  Description
200     Success
400     Bad request (invalid conversation_id, missing required fields)
403     Forbidden (conversation access denied)
404     Not found (conversation doesn't exist)
500     Internal server error
502     Bad gateway (AI provider error)
503     Service unavailable (AI service disabled or misconfigured)

Error Response Format

{
  "detail": "Error message description"
}

Common Errors

AI Service Disabled:

{
  "detail": "AI service error: AI service is disabled"
}

Missing API Key:

{
  "detail": "AI service error: Missing API key for openai provider. Set OPENAI_API_KEY environment variable."
}

Provider Error:

{
  "detail": "AI provider error: Rate limit exceeded"
}

Conversation Not Found:

{
  "detail": "Conversation error: Conversation abc-123 not found"
}

Access Denied:

{
  "detail": "Access denied"
}
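
A client-side handling sketch that maps these statuses to actions (status codes from the table above; the retry-on-404 policy is an assumption, not documented API behavior):

import httpx

def send_chat(message: str, conversation_id: str | None = None) -> dict:
    response = httpx.post(
        "http://localhost:8000/ai/chat",
        json={"message": message, "conversation_id": conversation_id},
    )
    if response.status_code == 404:
        # Stale or deleted conversation: start a fresh one (illustrative policy)
        return send_chat(message, conversation_id=None)
    if response.status_code in (502, 503):
        # Provider or service problem: surface the detail to the caller
        raise RuntimeError(response.json()["detail"])
    response.raise_for_status()  # covers 400, 403, 500
    return response.json()

print(send_chat("What is FastAPI?")["content"])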

