Skip to content

AI CLI Commands

Part of the Generated Project CLI - See CLI Reference for complete overview.

The AI service provides three command groups: ai (chat and status), llm (model catalog), and rag (document indexing and search).

Command Overview

# AI Chat & Status
my-app ai status          # Show configuration and validation
my-app ai providers       # List all providers
my-app ai chat "message"  # Send single message
my-app ai chat            # Interactive chat session (Illiana)
my-app ai conversations   # List conversations
my-app ai history <id>    # View conversation history

# LLM Catalog
my-app llm sync           # Sync ~2000 models from cloud/Ollama
my-app llm status         # Catalog statistics
my-app llm vendors        # List vendors
my-app llm modalities     # List modalities
my-app llm list <pattern> # Search models
my-app llm current        # Show current model config
my-app llm use <model>    # Switch active model
my-app llm info <model>   # Detailed model info

# RAG
my-app rag index <path>   # Index documents
my-app rag add <file>     # Add/update single file
my-app rag remove <path>  # Remove file from index
my-app rag files          # List indexed files
my-app rag search <query> # Semantic search
my-app rag list           # List collections
my-app rag delete <name>  # Delete collection
my-app rag status         # RAG configuration
my-app rag install-model  # Pre-download embedding model

AI Commands

ai status

Show AI service status, configuration, and validation:

my-app ai status
AI Service Status
========================================
Engine: pydantic-ai
Status: Enabled
Provider: groq
Model: llama-3.1-70b-versatile
Temperature: 0.7
Max Tokens: 1000
API Key: Set

✓ Configuration valid
  Free tier
  Streaming supported

Available providers: 3 (run 'ai providers' to list)

ai providers

List all available AI providers and their status:

my-app ai providers
           AI Providers
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Provider ┃ Status                   ┃ Free ┃ Features         ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ public   │ Available (current)      │ Yes  │ Basic            │
│ groq     │ Need GROQ_API_KEY        │ Yes  │ Stream           │
│ openai   │ Need OPENAI_API_KEY      │ No   │ Stream, Functions│
│ anthropic│ Need ANTHROPIC_API_KEY   │ No   │ Stream, Vision   │
│ google   │ Need GOOGLE_API_KEY      │ Yes  │ Stream           │
│ mistral  │ Need MISTRAL_API_KEY     │ No   │ Stream           │
│ cohere   │ Need COHERE_API_KEY      │ No   │ Stream           │
└──────────┴──────────────────────────┴──────┴──────────────────┘

ai chat

Send messages to Illiana or start interactive sessions.

Single Message:

my-app ai chat "Explain async/await in Python"

Options:

Flag Description
--stream / --no-stream Enable/disable streaming (default: enabled)
--conversation-id, -c Continue an existing conversation
--user-id, -u User identifier (default: cli-user)
--verbose, -v Show conversation metadata
--rag Enable RAG context
--collection RAG collection to search
--top-k Number of RAG results (default: 5)
--sources Show source file references

Examples:

# Simple message
my-app ai chat "What is this project's architecture?"

# Continue a conversation
my-app ai chat -c abc123 "Tell me more about that"

# Chat with codebase context (RAG)
my-app ai chat --rag --collection code --top-k 20 --sources \
  "How does the auth service work?"

Interactive Mode (no message argument):

$ my-app ai chat
Illiana v0.6.3
Provider: groq | Model: llama-3.1-70b-versatile

You: What is FastAPI?
Illiana: FastAPI is a modern Python web framework...

You: /model gpt-4o
 Switched to OpenAI/gpt-4o

You: /rag code
 RAG enabled with collection: code

You: /status
Provider: openai
Model: gpt-4o
RAG: ON (code)

You: /exit
Goodbye!

Slash Commands (Interactive Mode)

Command Description
/help Show available commands
/model [name] Show current model or switch
/status Show current configuration
/new Start a new conversation
/rag [off\|collection] Toggle RAG or select collection
/sources [enable\|disable] Toggle source references
/clear Clear the screen
/exit Exit the chat session

ai conversations

List conversations for a user:

my-app ai conversations
my-app ai conversations -u "user-456" -l 20

ai history

View message history of a conversation:

my-app ai history <conversation-id>

LLM Catalog Commands

Manage the local model catalog. See LLM Catalog for full documentation.

llm sync

Sync model data from cloud APIs or local Ollama:

# Default: sync chat models from cloud (OpenRouter + LiteLLM)
my-app llm sync

# Sync only Ollama models
my-app llm sync --source ollama

# Sync everything (cloud + Ollama)
my-app llm sync --source all --mode all

# Preview without saving
my-app llm sync --dry-run

# Full refresh (truncate + re-sync)
my-app llm sync --refresh

Options:

Flag Values Default Description
--mode, -m chat, embedding, all chat Model type filter
--source, -s cloud, ollama, all cloud Data source
--dry-run, -n flag off Preview without saving
--refresh, -r flag off Truncate tables first
Sync Results
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric              ┃ Count ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Vendors Added       │ 32    │
│ Models Added        │ 1847  │
│ Deployments Synced  │ 2103  │
│ Prices Synced       │ 1952  │
│ Modalities Synced   │ 3891  │
│ Duration            │ 12.4s │
└─────────────────────┴───────┘

llm status

Show catalog statistics:

my-app llm status
LLM Catalog Summary
┏━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric      ┃ Count ┃
┡━━━━━━━━━━━━━╇━━━━━━━┩
│ Vendors     │ 32    │
│ Models      │ 1847  │
│ Deployments │ 2103  │
│ Prices      │ 1952  │
└─────────────┴───────┘

llm vendors

List all vendors:

my-app llm vendors

llm modalities

List all modalities (text, image, audio, video) with model counts:

my-app llm modalities

llm list

Search and filter models:

# Search by pattern
my-app llm list claude

# Filter by vendor
my-app llm list gpt-4 --vendor openai

# Filter by modality
my-app llm list --vendor anthropic --modality image

# Include disabled models
my-app llm list --vendor openai --all

# Limit results
my-app llm list --vendor google --limit 10
LLM Models (5 results)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Model ID                    ┃ Vendor    ┃ Context  ┃ Input $/1M┃ Output $/1M┃ Released   ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ claude-sonnet-4-20250514    │ Anthropic │ 200,000  │ $3.00     │ $15.00     │ 2025-05-14 │
│ claude-haiku-4-5-20251001   │ Anthropic │ 200,000  │ $0.80     │ $4.00      │ 2025-10-01 │
│ claude-opus-4-6             │ Anthropic │ 200,000  │ $15.00    │ $75.00     │ 2025-06-01 │
└─────────────────────────────┴───────────┴──────────┴───────────┴────────────┴────────────┘

llm current

Show current LLM configuration from .env, enriched with catalog data:

my-app llm current
Current LLM Configuration
├── Provider: openai
├── Model: gpt-4o
├── Temperature: 0.7
└── Max Tokens: 1,000

Model Details (from catalog)
├── Context Window: 128,000
├── Input Price: $2.50 / 1M tokens
├── Output Price: $10.00 / 1M tokens
└── Modalities: text, image

llm use

Switch to a different model:

# Auto-detects vendor and updates AI_PROVIDER in .env
my-app llm use gpt-4o
my-app llm use claude-sonnet-4-20250514

# Force any model string (skip catalog validation)
my-app llm use my-custom-model --force

llm info

Show detailed model information:

my-app llm info gpt-4o
╭─────────────── gpt-4o ───────────────╮
│ GPT-4o                               │
│                                       │
│ Model ID: gpt-4o                     │
│ Vendor: OpenAI                       │
│                                       │
│ Context Window: 128,000 tokens       │
│ Streamable: Yes                      │
│ Enabled: Yes                         │
│ Released: 2024-05-13                 │
│                                       │
│ Pricing (per 1M tokens)             │
│   Input: $2.50                       │
│   Output: $10.00                     │
│                                       │
│ Modalities: text, image              │
╰───────────────────────────────────────╯

RAG Commands

Manage document indexing and search. See RAG for full documentation.

rag index

Index documents from a path into a collection:

# Index current directory
my-app rag index . --collection my-codebase

# Index specific directory with extensions
my-app rag index ./app --collection code --extensions .py,.ts
╭──────── Collection: code ─────────╮
│ Successfully indexed 1,523 chunks │
│ from 87 files                     │
│                                    │
│ Extensions: .py                   │
│ Duration: 8.3s                    │
│   Loading:  1.2s (14%)            │
│   Chunking: 0.8s (10%)           │
│   Indexing: 6.3s (76%)           │
│ Throughput: 183.5 chunks/sec     │
│ Collection size: 1,523 chunks    │
╰────────────────────────────────────╯

rag add

Add or update a single file (upsert semantics):

my-app rag add app/services/auth.py --collection code
my-app rag add app/services/auth.py -c code --show-ids

rag remove

Remove a file's chunks from the collection:

my-app rag remove /path/to/file.py --collection code
my-app rag remove /path/to/file.py -c code --force

rag files

List indexed files in a collection:

my-app rag files --collection code
Indexed Files: code
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ File                                  ┃ Chunks ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ app/services/ai/service.py            │ 45     │
│ app/services/auth/service.py          │ 23     │
│ app/components/backend/api/ai/router… │ 12     │
└───────────────────────────────────────┴────────┘

Total: 87 files, 1,523 chunks

Semantic search across indexed documents:

# Basic search
my-app rag search "how does authentication work" --collection code

# Show full content
my-app rag search "database connection" -c code --content

# More results
my-app rag search "error handling" -c code --top-k 10

rag list

List all collections:

my-app rag list
RAG Collections
┏━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Collection  ┃ Documents ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ code        │ 1,523     │
│ docs        │ 342       │
└─────────────┴───────────┘

rag delete

Delete a collection:

my-app rag delete code
my-app rag delete code --force

rag status

Show RAG service status and configuration:

my-app rag status
╭──────── RAG Service Status ────────╮
│ Enabled: Yes                       │
│ Persist Directory: .chromadb       │
│ Embedding Model: all-MiniLM-L6-v2 │
│ Model Status: Installed            │
│ Chunk Size: 1000                   │
│ Chunk Overlap: 200                 │
│ Default Top K: 5                   │
│ Collections: 2                     │
╰────────────────────────────────────╯

rag install-model

Pre-download the embedding model for offline use:

# Download to default location
my-app rag install-model

# Custom cache directory
my-app rag install-model --cache-dir /path/to/models

# Specific model
my-app rag install-model --model sentence-transformers/all-MiniLM-L6-v2

Note

Not needed for OpenAI embeddings - only for local sentence-transformers models (~400MB download).


See Also

Was this page helpful?