command-assistant/docs/developer/api-integration.md
2025-04-09 09:34:15 +02:00

API Integration

This document explains how Edison integrates with the OpenAI API to translate natural language into shell commands.

Overview

flowchart LR
    A[User Query] --> B[Prompt Construction]
    B --> C[API Request]
    C --> D[Response Processing]
    D --> E[Command Extraction]
    E --> F[Command Validation]
    F --> G[Command Execution]
    
    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#eeeeee,stroke:#333,stroke-width:2px
    style C fill:#d3f6db,stroke:#333,stroke-width:2px
    style D fill:#d3f6f5,stroke:#333,stroke-width:2px
    style E fill:#d5f6d5,stroke:#333,stroke-width:2px
    style F fill:#f6f6d5,stroke:#333,stroke-width:2px
    style G fill:#f5d5f5,stroke:#333,stroke-width:2px

API Client Module

The api_client.py module is responsible for all OpenAI API interactions:

def get_api_key(config):
    """Get the OpenAI API key from various sources."""
    # Find API key from environment, file, or config

def create_client(config):
    """Create and initialize an OpenAI client."""
    # Initialize client with API key

def call_api(client, config, query):
    """Call the OpenAI API with the given query."""
    # Send request to API and extract response

def generate_command(client, config, query, max_retries=3):
    """Generate a command using the OpenAI API with retry logic."""
    # Call API with retries for rate limits

API Key Management

Edison supports multiple methods for supplying the OpenAI API key, processed in this order:

  1. Environment Variable: OPENAI_API_KEY (including a value loaded from a .env file by dotenv.load_dotenv())
  2. API Key File: ~/.openai.apikey
  3. Configuration File: openai_api_key in edison.yaml

get_api_key() implements this lookup:

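import os

import dotenv

def get_api_key(config):
    """Get the OpenAI API key from various sources."""
    # Load variables from a local .env file into the environment, if one exists
    dotenv.load_dotenv()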
    
    # Method 1: Environment variable
    api_key = os.getenv("OPENAI_API_KEY")
    
    # Method 2: File in home directory
    if not api_key:
        home_path = os.path.expanduser("~")
        api_key_path = os.path.join(home_path, ".openai.apikey")
        if os.path.exists(api_key_path):
            with open(api_key_path, 'r') as f:
                api_key = f.read().strip()
    
    # Method 3: Configuration file
    if not api_key:
        api_key = config.get("openai_api_key")
    
    if not api_key:
        raise ValueError("No OpenAI API key found")
        
    return api_key
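
The lookup order can be exercised without touching the real environment. The sketch below is a simplified stand-in for get_api_key(), not Edison's code: resolve_api_key and its env and home_dir parameters are hypothetical, introduced only so the three sources can be injected, and the .env loading step is left out.

```python
import tempfile
from pathlib import Path


def resolve_api_key(env, home_dir, config):
    """Return (key, source) using the order: environment, key file, config."""
    key = env.get("OPENAI_API_KEY")
    if key:
        return key, "environment"

    key_file = Path(home_dir) / ".openai.apikey"
    if key_file.exists():
        key = key_file.read_text().strip()
        if key:
            return key, "file"

    key = config.get("openai_api_key")
    if key:
        return key, "config"

    raise ValueError("No OpenAI API key found")


config = {"openai_api_key": "key-from-config"}

with tempfile.TemporaryDirectory() as home:
    # A key file exists, so it wins whenever the environment variable is unset
    Path(home, ".openai.apikey").write_text("key-from-file\n")
    from_env = resolve_api_key({"OPENAI_API_KEY": "key-from-env"}, home, config)
    from_file = resolve_api_key({}, home, config)

with tempfile.TemporaryDirectory() as empty_home:
    # No environment variable and no key file: fall back to the config value
    from_config = resolve_api_key({}, empty_home, config)

print(from_env, from_file, from_config)
```

Each source is consulted only when the previous one yields nothing, so an exported OPENAI_API_KEY always overrides both the key file and the configuration file.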

Prompt Construction

Edison builds each prompt from a template that is filled in with the user's query and the target shell:

sequenceDiagram
    participant User
    participant PromptManager
    participant Template
    participant APIClient
    participant OpenAI
    
    User->>APIClient: Query: "list all files"
    APIClient->>PromptManager: get_full_prompt(query, shell)
    PromptManager->>Template: Load template
    Template-->>PromptManager: Template content
    PromptManager->>PromptManager: Format template with query
    PromptManager-->>APIClient: Formatted prompt
    APIClient->>OpenAI: Send API request
    OpenAI-->>APIClient: Command response

The prompt_manager.py module handles this:

def load_prompt_template(shell="bash"):
    """Load the prompt template with shell-specific considerations."""
    # Load and return the appropriate template

def get_full_prompt(query, shell="bash"):
    """Get the full prompt for the given query and shell."""
    # Format prompt with query and shell

Prompt Template

Edison uses a prompt template (edison.prompt) to structure requests to the AI model. The template:

  1. Provides context about the desired output format
  2. Includes examples of good responses
  3. Specifies the shell environment
  4. Encourages safe commands
  5. Includes the user's query
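
As a rough illustration of the five points above, the following sketch fills a small template with a shell name and a query. The template text, the $shell and $query placeholders, and the build_prompt name are all assumptions made for this example; the actual contents of edison.prompt are not shown in this document.

```python
from string import Template

# Hypothetical template text; the real edison.prompt may look quite different.
EXAMPLE_TEMPLATE = Template(
    "You translate requests into a single $shell command.\n"
    "Reply with the command only, without explanation or markdown.\n"
    "Example: 'show disk usage' -> df -h\n"
    "Prefer commands that do not delete or overwrite data.\n"
    "Request: $query\n"
)


def build_prompt(query, shell="bash", template=EXAMPLE_TEMPLATE):
    """Fill the template with the target shell and the user's query."""
    return template.substitute(shell=shell, query=query)


prompt = build_prompt("list all files", shell="zsh")
print(prompt)
```

string.Template is used here so that braces in example commands need no escaping; a literal dollar sign in the template text would have to be written as $$.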

API Request

Edison uses the OpenAI Python client library for API requests:

def call_api(client, config, query):
    """Call the OpenAI API with the given query."""
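    prompt = prompt_manager.get_full_prompt(query, config.get("shell", "bash"))
    # The first line of the prompt doubles as the system message;
    # the full prompt (including that line) is sent as the user message.
    system_prompt = prompt.split('\n', 1)[0]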
    
    response = client.chat.completions.create(
        model=config.get("model", "gpt-3.5-turbo"),
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        temperature=config.get("temperature", 0),
        max_tokens=config.get("max_tokens", 500),
    )
    
    return response.choices[0].message.content.strip()
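
To make the message layout concrete, the sketch below reproduces only the splitting step from call_api() and prints the resulting messages. build_messages is a hypothetical helper name, and the two-line prompt is invented for the example.

```python
def build_messages(prompt):
    """Mirror the message layout used by call_api()."""
    # Everything before the first newline becomes the system message
    system_prompt = prompt.split("\n", 1)[0]
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]


messages = build_messages("Translate to bash.\nRequest: list all files")
for message in messages:
    print(message["role"], "->", repr(message["content"]))

# A prompt without a newline is used unchanged for both roles
single_line = build_messages("list all files")
```

Note that the user message repeats the first line, because the whole prompt is sent again after the system message.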

Streaming API Integration

Edison also supports streaming command generation, which provides a more responsive user experience:

def call_api_streaming(client, config, query, callback):
    """Call the OpenAI API with streaming enabled.
    
    Args:
        client: The OpenAI client
        config: The configuration dictionary
        query: The user query
        callback: Function to call with each token
    """
    prompt = prompt_manager.get_full_prompt(query, config.get("shell", "bash"))
    # Same message layout as call_api(): first line as system message,
    # full prompt as user message.
    system_prompt = prompt.split('\n', 1)[0]
    
    response = client.chat.completions.create(
        model=config.get("model", "gpt-3.5-turbo"),
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        temperature=config.get("temperature", 0),
        max_tokens=config.get("max_tokens", 500),
        stream=True  # Enable streaming
    )
    
    # Collect the full response while calling the callback for each chunk
    full_response = ""
    
    for chunk in response:
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            callback(content)  # Call the callback with each chunk
            
    return full_response.strip()

The UI passes a callback to this function and uses it to update the display in real time as tokens arrive.
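
The callback contract can be tried without network access. In this sketch, fake_stream and collect_stream are hypothetical names: fake_stream builds objects shaped like the chunks the loop above reads (chunk.choices[0].delta.content), and collect_stream repeats the accumulation loop from call_api_streaming().

```python
from types import SimpleNamespace


def fake_stream(pieces):
    """Yield objects shaped like streamed chat-completion chunks."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def collect_stream(stream, callback):
    """Accumulate streamed text, invoking callback for each non-empty piece."""
    full_response = ""
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            callback(content)
    return full_response.strip()


received = []
# The trailing None imitates a final chunk that carries no content
result = collect_stream(fake_stream(["ls", " -", "la", None]), received.append)
print(result, received)
```

A UI would pass its own display-update function in place of received.append; chunks without content never reach the callback.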

Request Parameters

The API request includes these key parameters:

| Parameter | Description | Default |
|---|---|---|
| model | The OpenAI model to use | gpt-3.5-turbo |
| temperature | Sampling randomness (0-2; 0 is the most deterministic) | 0 |
| max_tokens | Maximum tokens in response | 500 |
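
The defaults above apply whenever a key is missing from the configuration. A minimal sketch of that fallback, assuming the configuration is a plain dictionary (DEFAULT_PARAMS and resolve_params are hypothetical names, not part of api_client.py):

```python
# Defaults taken from the parameter table in this section
DEFAULT_PARAMS = {"model": "gpt-3.5-turbo", "temperature": 0, "max_tokens": 500}


def resolve_params(config):
    """Return the request parameters, falling back to the documented defaults."""
    return {name: config.get(name, default) for name, default in DEFAULT_PARAMS.items()}


# Only temperature is overridden; unrelated keys such as shell are ignored
params = resolve_params({"temperature": 0.2, "shell": "zsh"})
print(params)
```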

Error Handling and Retries

The generate_command() function implements retry logic to handle rate limiting:

import logging
import time

logger = logging.getLogger(__name__)

def generate_command(client, config, query, max_retries=3):
    """Generate a command using the OpenAI API with retry logic."""
    for attempt in range(max_retries):
        try:
            return call_api(client, config, query)
        except Exception as e:
            # Retry only rate-limit errors, and only while attempts remain
            if "rate limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
                logger.warning(f"Rate limited. Retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                # Any other error, or the final failed attempt, is logged and re-raised
                logger.error(f"Error after {attempt+1} attempts: {str(e)}")
                raise

Key features:

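  • Exponential backoff for rate limits (waits of 1 s, 2 s, 4 s, ...)
  • Configurable attempt limit (max_retries, default 3)
  • A warning logged for each retry and an error logged before the exception is re-raised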
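
The backoff schedule can be checked in isolation. The sketch below is a generic variant rather than Edison's implementation: RateLimited is a hypothetical stand-in exception, and the sleep parameter is injected so the waits can be recorded instead of actually slept. With version 1 or later of the OpenAI Python library, openai.RateLimitError could take the place of the stand-in, which would avoid matching on the error message text.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def retry_with_backoff(func, max_retries=3, sleep=time.sleep):
    """Call func(), retrying on RateLimited with waits of 1, 2, 4, ... seconds."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the error to the caller
            sleep(2 ** attempt)


# Demonstration: fail twice, then succeed; record waits instead of sleeping.
calls = {"count": 0}
waits = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited("Rate limit exceeded")
    return "ls -la"


result = retry_with_backoff(flaky, max_retries=3, sleep=waits.append)
print(result, waits)
```

Exceptions other than RateLimited propagate immediately, matching the behavior described above where only rate-limit errors are retried.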

Command Processing and Validation

After receiving the API response, Edison:

  1. Extracts the command from the response
  2. Validates the command for safety
  3. Checks for markdown or other formatting issues
  4. Prepares the command for execution
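
These steps are not shown in code in this document, so the following is only a sketch of what extraction and a basic safety check might look like. extract_command, is_probably_safe, and the pattern list are hypothetical, and the denylist is illustrative rather than exhaustive; it should not be read as Edison's actual validation rules.

```python
import re

# Matches a response wrapped in a fenced block, with an optional language tag
FENCE_RE = re.compile(r"^```[\w-]*\n(.*?)\n?```$", re.DOTALL)

# Illustrative denylist only; real validation would need far more care
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-\w*[rR]\w*\s+/(\s|$)"),  # recursive delete of the root directory
    re.compile(r"\bmkfs(\.\w+)?\b"),              # formatting a filesystem
    re.compile(r"\bdd\b.*\bof=/dev/"),            # raw writes to a device
]


def extract_command(response):
    """Strip surrounding whitespace and markdown code formatting."""
    text = response.strip()
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1).strip()
    elif len(text) >= 2 and text.startswith("`") and text.endswith("`"):
        text = text[1:-1].strip()
    return text


def is_probably_safe(command):
    """Return False if the command matches any denylisted pattern."""
    return not any(pattern.search(command) for pattern in DANGEROUS_PATTERNS)


command = extract_command("```bash\nls -la\n```")
print(command, is_probably_safe(command))
```

Stripping the formatting before validating ensures the safety check sees the same text that would be executed.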

Extending the API Integration

To support additional AI providers or models:

  1. Create a new client factory function:

    def create_anthropic_client(config):
        """Create and initialize an Anthropic client."""
        # Initialize Anthropic client
    
  2. Add a model selection mechanism:

    def get_client_for_model(config):
        model = config.get("model", "gpt-3.5-turbo")
        if model.startswith("claude"):
            return create_anthropic_client(config)
        else:
            return create_client(config)
    
  3. Implement a provider-specific API call function:

    def call_anthropic_api(client, config, query):
        """Call the Anthropic API with the given query."""
        # Format request for Anthropic API
    
  4. Update the command generation logic:

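    def generate_command(client, config, query, max_retries=3):
        model = config.get("model", "gpt-3.5-turbo")
        # Pick the provider-specific call. The retry loop shown under
        # "Error Handling and Retries" should still wrap whichever call is chosen.
        if model.startswith("claude"):
            return call_anthropic_api(client, config, query)
        else:
            return call_api(client, config, query)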
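
If more providers are added, the repeated startswith checks could be gathered into one lookup. The sketch below shows one possible shape, not part of Edison: PROVIDER_CALLS, select_call, and the two stub functions are hypothetical, and the stubs return marker strings instead of calling any API.

```python
def call_openai_stub(client, config, query):
    """Placeholder for call_api(); returns a marker instead of calling an API."""
    return f"openai:{query}"


def call_anthropic_stub(client, config, query):
    """Placeholder for call_anthropic_api(); returns a marker string."""
    return f"anthropic:{query}"


# Model-name prefix -> call function. Checked in order; the first match wins.
PROVIDER_CALLS = [("claude", call_anthropic_stub)]


def select_call(model, default=call_openai_stub):
    """Pick the call function for a model name, defaulting to the OpenAI path."""
    for prefix, call in PROVIDER_CALLS:
        if model.startswith(prefix):
            return call
    return default


call = select_call("claude-example")  # hypothetical model name
print(call(None, {}, "list all files"))
```

Keeping the selection in one place means the retry logic only has to wrap the function returned by select_call, whichever provider it belongs to.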
    

API Response Examples

Successful Response

{
  "choices": [
    {
      "message": {
        "content": "ls -la",
        "role": "assistant"
      },
      "index": 0,
      "finish_reason": "stop"
    }
  ]
}

Error Response

{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "param": null,
    "code": null
  }
}
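
For code that handles raw JSON payloads like the two above, the command or the error can be pulled out as sketched below. This is an assumption-laden example: command_from_payload is a hypothetical helper, and the Python client library used elsewhere in this document returns response objects and raises exceptions rather than handing back these dictionaries.

```python
import json

SUCCESS_JSON = (
    '{"choices": [{"message": {"content": "ls -la", "role": "assistant"},'
    ' "index": 0, "finish_reason": "stop"}]}'
)
ERROR_JSON = (
    '{"error": {"message": "Rate limit exceeded", "type": "rate_limit_error",'
    ' "param": null, "code": null}}'
)


def command_from_payload(raw):
    """Return the command text, or raise RuntimeError for an error payload."""
    payload = json.loads(raw)
    if "error" in payload:
        error = payload["error"]
        raise RuntimeError(f"{error['type']}: {error['message']}")
    return payload["choices"][0]["message"]["content"].strip()


print(command_from_payload(SUCCESS_JSON))

try:
    command_from_payload(ERROR_JSON)
except RuntimeError as exc:
    print("API error:", exc)
```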

Performance Considerations

To optimize API usage and performance:

  1. Use Efficient Models: Default to gpt-3.5-turbo for lower latency
  2. Limit Token Usage: Keep max_tokens reasonable (default 500)
  3. Request Caching: Consider implementing caching for common queries
  4. Concurrent Requests: For batch processing, consider async requests
  5. Prompt Optimization: Keep prompt templates concise but effective
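
Request caching (item 3) could look roughly like the sketch below, which keys an in-memory cache on model, shell, and the whitespace-normalized query. make_cached_generator is a hypothetical helper, and fake_generate stands in for a real API call. Caching is most defensible with the default temperature of 0, where repeated queries are expected to produce near-identical commands.

```python
def make_cached_generator(generate):
    """Wrap generate(query) with an in-memory cache keyed on model, shell, and query."""
    cache = {}

    def cached(query, model="gpt-3.5-turbo", shell="bash"):
        # Collapse runs of whitespace so trivially different queries share an entry
        key = (model, shell, " ".join(query.split()))
        if key not in cache:
            cache[key] = generate(query)
        return cache[key]

    return cached


api_calls = []


def fake_generate(query):
    """Stand-in for an API call; records each invocation."""
    api_calls.append(query)
    return "ls -la"


cached_generate = make_cached_generator(fake_generate)
first = cached_generate("List all files")
second = cached_generate("  List   all files ")  # served from the cache
third = cached_generate("List all files", shell="zsh")  # different shell, new entry
print(first, second, len(api_calls))
```

The cache lives only for the lifetime of the process; persisting it across runs would need an eviction policy and is left out of this sketch.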