Model Configuration

Configure language models for different Orchestre operations to optimize performance, cost, and results.

Overview

Orchestre uses multiple LLMs for different tasks:

  • Primary Model: Code generation and execution
  • Planning Model: Analysis and strategic planning
  • Review Models: Multi-perspective code review

Configuration Location

Model settings can be configured in:

  1. Model Configuration (models.config.json in project root)
  2. Environment Variables (for API keys)
  3. Runtime Parameters (command-specific overrides)
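The three layers above form a precedence chain: runtime parameters override the project config file, which overrides built-in defaults. A minimal sketch of that resolution order, with hypothetical function and path names (this is not Orchestre's actual API):

```python
import json
import os

# Built-in defaults, taken from the default configuration documented below.
DEFAULTS = {
    "primary": "claude-3-opus",
    "planning": "gemini-2.0-flash-thinking-exp",
    "review": ["gpt-4o", "claude-3-sonnet"],
}

def resolve_models(config_path="models.config.json", overrides=None):
    """Merge defaults, the project config file, and runtime overrides.

    Later layers win: overrides > models.config.json > DEFAULTS.
    """
    settings = dict(DEFAULTS)
    if os.path.exists(config_path):
        with open(config_path) as f:
            settings.update(json.load(f))
    settings.update(overrides or {})
    return settings

# A runtime override wins over both the file and the defaults:
print(resolve_models(overrides={"primary": "gpt-4o"})["primary"])  # gpt-4o
```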

Default Configuration

```json
// models.config.json
{
  "primary": "claude-3-opus",
  "planning": "gemini-2.0-flash-thinking-exp",
  "review": ["gpt-4o", "claude-3-sonnet"]
}
```

Available Models

Claude Models (Anthropic)

```json
// models.config.json
{
  "primary": "claude-3-opus"    // Most capable
  // or "claude-3-sonnet"       // Balanced performance
  // or "claude-3-haiku"        // Fast and economical
}
```

Best for: Code generation, complex reasoning, nuanced instructions

Gemini Models (Google)

```json
// models.config.json
{
  "planning": "gemini-2.0-flash-thinking-exp"  // Latest reasoning model
  // or "gemini-pro"                          // General purpose
  // or "gemini-pro-vision"                   // Multimodal capabilities
}
```

Best for: Structured analysis, planning, research

GPT Models (OpenAI)

```json
// models.config.json
{
  "review": [
    "gpt-4o",           // Latest multimodal
    "gpt-4-turbo",      // High performance
    "gpt-3.5-turbo"     // Fast and economical
  ]
}
```

Best for: Code review, API design, documentation

Model Selection Guide

By Task Type

Code Generation

```json
// Complex features
{
  "primary": "claude-3-opus"
}

// Simple tasks
{
  "primary": "claude-3-haiku"
}
```

Analysis & Planning

```json
// Deep analysis
{
  "planning": "gemini-2.0-flash-thinking-exp"
}

// Quick assessment
{
  "planning": "gemini-pro"
}
```

Code Review

```json
// Comprehensive review
{
  "review": ["gpt-4o", "claude-3-sonnet", "gemini-pro"]
}

// Quick review
{
  "review": ["gpt-3.5-turbo"]
}
```

By Project Phase

Early Development

```json
// Focus on speed and iteration
{
  "primary": "claude-3-haiku",
  "planning": "gemini-pro",
  "review": ["gpt-3.5-turbo"]
}
```

Production

```json
// Focus on quality and security
{
  "primary": "claude-3-opus",
  "planning": "gemini-2.0-flash-thinking-exp",
  "review": ["gpt-4o", "claude-3-sonnet"]
}
```

Cost Optimization

Model Pricing (Approximate)

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
| --- | --- | --- |
| Claude 3 Opus | $15 | $75 |
| Claude 3 Sonnet | $3 | $15 |
| Claude 3 Haiku | $0.25 | $1.25 |
| GPT-4o | $5 | $15 |
| GPT-4 Turbo | $10 | $30 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
| Gemini Pro | $0.50 | $1.50 |
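Per-call cost follows directly from the table: multiply token counts by the per-million rates. A quick estimator using the approximate prices above (prices change; treat the numbers as illustrative):

```python
# model: (input $/1M tokens, output $/1M tokens), from the table above
PRICES = {
    "claude-3-opus": (15.00, 75.00),
    "claude-3-sonnet": (3.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gpt-4o": (5.00, 15.00),
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-3.5-turbo": (0.50, 1.50),
    "gemini-pro": (0.50, 1.50),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the approximate USD cost of a single call."""
    inp, outp = PRICES[model]
    return (input_tokens * inp + output_tokens * outp) / 1_000_000

# e.g. a 10k-token prompt with a 2k-token reply on Opus:
print(f"${estimate_cost('claude-3-opus', 10_000, 2_000):.2f}")  # $0.30
```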

Cost-Saving Strategies

  1. Use Appropriate Models

    ```json
    // Don't use Opus for simple tasks
    {
      "primary": "claude-3-haiku"  // For routine coding
    }
    ```
  2. Limit Review Models

    ```json
    // Development: Single reviewer
    {
      "review": ["gpt-3.5-turbo"]
    }

    // Production: Multiple reviewers
    {
      "review": ["gpt-4o", "claude-3-sonnet"]
    }
    ```
  3. Enable Debug Mode

    ```bash
    # For cost tracking
    export ORCHESTRE_DEBUG=true
    ```

Advanced Configuration

Model Parameters

```json
// models.config.json
{
  "primary": "claude-3-opus",
  "parameters": {
    "temperature": 0.7,    // Creativity (0.0-1.0)
    "maxTokens": 4096,     // Maximum response length
    "topP": 0.9            // Nucleus sampling
  }
}
```
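Out-of-range parameters are a common source of silent misbehavior, so it can help to sanity-check them before use. A hypothetical validator for the ranges documented above (the field names mirror models.config.json; the validator itself is not part of Orchestre):

```python
import json

def validate_parameters(params):
    """Return a list of problems with a 'parameters' object; empty means OK."""
    errors = []
    t = params.get("temperature")
    if t is not None and not 0.0 <= t <= 1.0:
        errors.append(f"temperature {t} outside 0.0-1.0")
    m = params.get("maxTokens")
    if m is not None and (not isinstance(m, int) or m <= 0):
        errors.append(f"maxTokens {m} must be a positive integer")
    p = params.get("topP")
    if p is not None and not 0.0 < p <= 1.0:
        errors.append(f"topP {p} outside (0.0, 1.0]")
    return errors

config = json.loads('{"parameters": {"temperature": 0.7, "maxTokens": 4096, "topP": 0.9}}')
print(validate_parameters(config["parameters"]))  # []
```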

Provider-Specific Settings

Environment Variables

```bash
# Claude settings
export ANTHROPIC_API_VERSION="2023-06-01"

# OpenAI settings
export OPENAI_ORG_ID="org-xxx"
export OPENAI_API_BASE="https://api.openai.com"

# Gemini settings (handled by SDK)
```

Alternative Model Providers

For custom or alternative providers, use environment variables:

```bash
# Set custom endpoints
export OPENAI_API_BASE="https://your-custom-endpoint.com/v1"
```

Dynamic Model Selection

Command-Specific Overrides

```bash
# Use specific model for one command
/orchestrate --model gpt-4o "Complex analysis task"

# Override planning model
/analyze-project --planning-model gemini-pro
```

Context-Aware Selection

Orchestre automatically selects appropriate models based on:

  • Task complexity
  • Available context
  • Performance requirements

You can guide this with natural language:

```bash
/orchestrate "Use fast models for this simple task"
```

Performance Tuning

Response Time Optimization

```bash
# Environment variables for performance
export ORCHESTRE_PARALLEL_AGENTS=5    # More parallelism
export ORCHESTRE_TIMEOUT=60000        # Longer timeout for complex tasks
```
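Settings like these are typically consumed by reading the environment with a fallback for missing or malformed values. A sketch of that pattern (the variable names come from the docs above; the default values here are assumptions for illustration):

```python
import os

def env_int(name, default):
    """Read an integer environment variable, falling back on bad or missing values."""
    try:
        return int(os.environ.get(name, default))
    except ValueError:
        return default

parallel_agents = env_int("ORCHESTRE_PARALLEL_AGENTS", 3)   # assumed default
timeout_ms = env_int("ORCHESTRE_TIMEOUT", 30_000)           # assumed default
```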

Quality vs Speed Trade-offs

```json
// High Quality (models.config.json)
{
  "primary": "claude-3-opus",
  "parameters": {
    "temperature": 0.3,
    "maxTokens": 8192
  }
}

// High Speed
{
  "primary": "claude-3-haiku",
  "parameters": {
    "temperature": 0.7,
    "maxTokens": 2048
  }
}
```

Monitoring & Debugging

Enable Debug Logging

```bash
# Environment variables for debugging
export ORCHESTRE_DEBUG=true
export ORCHESTRE_LOG_LEVEL=debug  # error, warn, info, debug, verbose
```

Monitor Performance

When debug mode is enabled, Orchestre logs:

  • Model selection decisions
  • API call latency
  • Token usage estimates
  • Error details

Best Practices

1. Start with Defaults

Orchestre's defaults are optimized for most use cases:

```json
// Default configuration (used if no models.config.json)
{
  "primary": "claude-3-opus",
  "planning": "gemini-2.0-flash-thinking-exp",
  "review": ["gpt-4o", "claude-3-sonnet"]
}
```

2. Adjust Based on Results

Monitor output quality and adjust models:

  • If code quality is low → upgrade primary model
  • If analysis is shallow → upgrade planning model
  • If reviews miss issues → add more review models

3. Consider Context

  • Large codebases: Use models with larger context windows
  • Complex logic: Use more capable models
  • Simple CRUD: Use faster, cheaper models

4. Test Different Models

```bash
# Compare outputs
/orchestrate --model claude-3-opus "Build feature X"
/orchestrate --model gpt-4o "Build feature X"
```

Troubleshooting

Common Issues

Rate Limit Errors

```bash
# Reduce parallel operations
export ORCHESTRE_PARALLEL_AGENTS=1

# Wait and retry
```

Token Limit Exceeded

```json
// Reduce token usage in models.config.json
{
  "parameters": {
    "maxTokens": 2048
  }
}
```

API Key Issues

```bash
# Check environment variables
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY
echo $GEMINI_API_KEY
```

Migration Guide

From Earlier Versions

```json
// Old format (if any)
{
  "llm": {
    "model": "gpt-4"
  }
}

// Current format (models.config.json)
{
  "primary": "gpt-4o",
  "planning": "gemini-2.0-flash-thinking-exp",
  "review": ["gpt-4o", "claude-3-sonnet"]
}
```
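The migration above can be mechanized. A minimal sketch that converts the old single-model shape to the current layout; the "gpt-4" to "gpt-4o" mapping and the filled-in planning/review defaults are assumptions, not a documented migration tool:

```python
import json

def migrate(old):
    """Convert an old-format config ({"llm": {...}}) to the current format."""
    if "llm" not in old:          # already in the current format
        return old
    model = old["llm"].get("model", "gpt-4o")
    if model == "gpt-4":
        model = "gpt-4o"          # assumed upgrade for the deprecated name
    return {
        "primary": model,
        "planning": "gemini-2.0-flash-thinking-exp",
        "review": ["gpt-4o", "claude-3-sonnet"],
    }

print(json.dumps(migrate({"llm": {"model": "gpt-4"}}), indent=2))
```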

Switching Providers

```bash
# Migrate from OpenAI to Claude
/orchestrate "Update model configuration to use Claude as primary"
```

Built with ❤️ for the AI Coding community, by Praney Behl