
# API Integration

This document explains how Edison integrates with the OpenAI API to translate natural language into shell commands.

## Overview

```mermaid
flowchart LR
    A[User Query] --> B[Prompt Construction]
    B --> C[API Request]
    C --> D[Response Processing]
    D --> E[Command Extraction]
    E --> F[Command Validation]
    F --> G[Command Execution]

    style A fill:#f9d5e5,stroke:#333,stroke-width:2px
    style B fill:#eeeeee,stroke:#333,stroke-width:2px
    style C fill:#d3f6db,stroke:#333,stroke-width:2px
    style D fill:#d3f6f5,stroke:#333,stroke-width:2px
    style E fill:#d5f6d5,stroke:#333,stroke-width:2px
    style F fill:#f6f6d5,stroke:#333,stroke-width:2px
    style G fill:#f5d5f5,stroke:#333,stroke-width:2px
```

## API Client Module

The `api_client.py` module is responsible for all OpenAI API interactions:

```python
def get_api_key(config):
    """Get the OpenAI API key from various sources."""
    # Find API key from environment, file, or config

def create_client(config):
    """Create and initialize an OpenAI client."""
    # Initialize client with API key

def call_api(client, config, query):
    """Call the OpenAI API with the given query."""
    # Send request to API and extract response

def generate_command(client, config, query, max_retries=3):
    """Generate a command using the OpenAI API with retry logic."""
    # Call API with retries for rate limits
```

## API Key Management

Edison looks for the OpenAI API key in three places, in this order, and uses the first one it finds:

1. **Environment Variable**: `OPENAI_API_KEY`, including a value loaded from a `.env` file
2. **API Key File**: `~/.openai.apikey`
3. **Configuration File**: `openai_api_key` in `edison.yaml`

This lookup is implemented in `get_api_key()`:

```python
import os

import dotenv


def get_api_key(config):
    """Get the OpenAI API key from various sources."""
    dotenv.load_dotenv()

    # Method 1: Environment variable
    api_key = os.getenv("OPENAI_API_KEY")

    # Method 2: File in home directory
    if not api_key:
        home_path = os.path.expanduser("~")
        api_key_path = os.path.join(home_path, ".openai.apikey")
        if os.path.exists(api_key_path):
            with open(api_key_path, 'r') as f:
                api_key = f.read().strip()

    # Method 3: Configuration file
    if not api_key:
        api_key = config.get("openai_api_key")

    if not api_key:
        raise ValueError("No OpenAI API key found")

    return api_key
```

## Prompt Construction

Edison builds each prompt from a template. The sequence below shows how a query travels from the user through the prompt manager to the API:

```mermaid
sequenceDiagram
    participant User
    participant PromptManager
    participant Template
    participant APIClient
    participant OpenAI

    User->>APIClient: Query: "list all files"
    APIClient->>PromptManager: get_full_prompt(query, shell)
    PromptManager->>Template: Load template
    Template-->>PromptManager: Template content
    PromptManager->>PromptManager: Format template with query
    PromptManager-->>APIClient: Formatted prompt
    APIClient->>OpenAI: Send API request
    OpenAI-->>APIClient: Command response
```

The `prompt_manager.py` module handles this:

```python
def load_prompt_template(shell="bash"):
    """Load the prompt template with shell-specific considerations."""
    # Load and return the appropriate template

def get_full_prompt(query, shell="bash"):
    """Get the full prompt for the given query and shell."""
    # Format prompt with query and shell
```

### Prompt Template

Edison uses a prompt template (`edison.prompt`) to structure requests to the AI model. The template:

1. Provides context about the desired output format
2. Includes examples of good responses
3. Specifies the shell environment
4. Encourages safe commands
5. Includes the user's query
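
As an illustration of those five elements, the sketch below fills a hypothetical template using `string.Template`. The template text, the `$shell` and `$query` placeholders, and the `render_prompt` helper are assumptions made for this example; the actual contents of `edison.prompt` are not reproduced here.

```python
from string import Template

# Hypothetical template text; the real edison.prompt may differ.
EXAMPLE_TEMPLATE = Template(
    "You translate natural language into a single $shell command.\n"
    "Reply with the command only, no explanation and no markdown.\n"
    "Prefer safe, non-destructive commands.\n"
    "\n"
    "Example:\n"
    "Q: show disk usage\n"
    "A: df -h\n"
    "\n"
    "Q: $query\n"
    "A:"
)


def render_prompt(query: str, shell: str = "bash") -> str:
    """Fill the template placeholders with the shell name and user query."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which surfaces template typos early.
    return EXAMPLE_TEMPLATE.substitute(shell=shell, query=query.strip())


if __name__ == "__main__":
    print(render_prompt("list all files", shell="zsh"))
```

Because `call_api()` sends the first line of the prompt as the system message, a template written this way would want its first line to stand on its own as an instruction.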

## API Request

Edison uses the OpenAI Python client library for API requests:

```python
def call_api(client, config, query):
    """Call the OpenAI API with the given query."""
    prompt = prompt_manager.get_full_prompt(query, config.get("shell", "bash"))
    # The first line of the prompt doubles as the system message
    system_prompt = prompt.split("\n", 1)[0]

    response = client.chat.completions.create(
        model=config.get("model", "gpt-3.5-turbo"),
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        temperature=config.get("temperature", 0),
        max_tokens=config.get("max_tokens", 500),
    )

    # content can be None (for example on a filtered response), so fall back to ""
    return (response.choices[0].message.content or "").strip()
```

### Streaming API Integration

Edison also supports streaming command generation, which provides a more responsive user experience:

```python
def call_api_streaming(client, config, query, callback):
    """Call the OpenAI API with streaming enabled.

    Args:
        client: The OpenAI client
        config: The configuration dictionary
        query: The user query
        callback: Function called with each content chunk as it arrives
    """
    prompt = prompt_manager.get_full_prompt(query, config.get("shell", "bash"))
    # The first line of the prompt doubles as the system message
    system_prompt = prompt.split("\n", 1)[0]

    response = client.chat.completions.create(
        model=config.get("model", "gpt-3.5-turbo"),
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        temperature=config.get("temperature", 0),
        max_tokens=config.get("max_tokens", 500),
        stream=True  # Enable streaming
    )

    # Collect the full response while calling the callback for each chunk
    full_response = ""

    for chunk in response:
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            callback(content)  # Call the callback with each chunk

    return full_response.strip()
```

The UI hooks into streaming through the callback, which updates the display in real time as each chunk arrives.
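
A callback can be as simple as an object that writes each chunk to the terminal. The sketch below is a hypothetical example (the `StreamPrinter` class is not part of Edison); it is driven by a plain list of strings standing in for the chunks that `call_api_streaming()` would pass to it.

```python
import sys
from typing import List, TextIO


class StreamPrinter:
    """Hypothetical callback: echo each chunk as it arrives and keep a copy."""

    def __init__(self, out: TextIO = sys.stdout) -> None:
        self.out = out
        self.chunks: List[str] = []

    def __call__(self, content: str) -> None:
        self.chunks.append(content)
        self.out.write(content)
        self.out.flush()  # Flush so partial output is visible immediately

    @property
    def text(self) -> str:
        """Everything received so far, joined into one string."""
        return "".join(self.chunks)


if __name__ == "__main__":
    printer = StreamPrinter()
    # Stand-in for the chunks a streaming response would deliver.
    for piece in ["ls", " -", "la"]:
        printer(piece)
    print()
```

An instance would be passed as the last argument, as in `call_api_streaming(client, config, query, printer)`.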

### Request Parameters

The API request includes these key parameters:

| Parameter | Description | Default |
|-----------|-------------|---------|
| model | The OpenAI model to use | gpt-3.5-turbo |
| temperature | Sampling randomness (0-2); 0 gives the most deterministic output | 0 |
| max_tokens | Maximum tokens in response | 500 |
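
The following sketch shows one way these defaults might be resolved from the configuration dictionary before a request is made. The `build_request_params` helper and its validation rules are assumptions for illustration, not part of `api_client.py`; the 0 to 2 bound is the temperature range documented for the OpenAI chat completions endpoint.

```python
DEFAULTS = {"model": "gpt-3.5-turbo", "temperature": 0, "max_tokens": 500}


def build_request_params(config: dict) -> dict:
    """Merge user configuration over the documented defaults."""
    params = {key: config.get(key, default) for key, default in DEFAULTS.items()}

    # Reject values the API would refuse, so mistakes surface before the request.
    if not 0 <= params["temperature"] <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if params["max_tokens"] <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return params


if __name__ == "__main__":
    print(build_request_params({"max_tokens": 200}))
```

Validating up front keeps configuration mistakes from being reported as API errors, which are harder to trace back to `edison.yaml`.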

## Error Handling and Retries

The `generate_command()` function retries rate-limited requests with exponential backoff. Any other error is logged and re-raised immediately:

```python
import logging
import time

logger = logging.getLogger(__name__)


def generate_command(client, config, query, max_retries=3):
    """Generate a command using the OpenAI API with retry logic."""
    for attempt in range(max_retries):
        try:
            return call_api(client, config, query)
        except Exception as e:
            if "rate limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                logger.warning(f"Rate limited. Retrying in {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                logger.error(f"Error after {attempt+1} attempts: {str(e)}")
                raise
```

Key features:
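- Exponential backoff on rate limits (waits of 1 second, then 2 seconds with the default settings)
- At most `max_retries` attempts (default 3)
- Errors logged with the attempt count before being re-raised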
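
Matching on the text of an exception message is fragile, so a variant that retries on exception types may be worth considering. The sketch below is a generic helper written under that assumption; `retry_with_backoff`, its parameters, and the added jitter are not part of Edison.

```python
import logging
import random
import time
from typing import Callable, Tuple, Type, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types with exponential backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")

    for attempt in range(max_retries):
        try:
            return func()
        except retry_on as exc:
            if attempt == max_retries - 1:
                logger.error("Giving up after %d attempts: %s", attempt + 1, exc)
                raise
            # 1s, 2s, 4s, ... plus up to 10% jitter so clients do not retry in lockstep
            delay = base_delay * (2 ** attempt)
            delay += random.uniform(0, delay * 0.1)
            logger.warning("Retryable error. Retrying in %.1f seconds...", delay)
            sleep(delay)
    raise AssertionError("unreachable")  # The loop always returns or raises
```

With version 1.x of the `openai` package, which exposes `openai.RateLimitError`, a call might look like `retry_with_backoff(lambda: call_api(client, config, query), (openai.RateLimitError,))`. Exceptions of any other type propagate on the first attempt, and the injectable `sleep` lets tests run without waiting.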

## Command Processing and Validation

After receiving the API response, Edison:

1. Extracts the command from the response
2. Validates the command for safety
3. Checks for markdown or other formatting issues
4. Prepares the command for execution
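
The extraction and safety steps could be sketched roughly as follows. Both helpers, `extract_command` and `looks_dangerous`, are hypothetical, and the patterns are examples only; Edison's actual validation rules are not shown in this document.

```python
import re

FENCE = "`" * 3  # Markdown code fence marker, built indirectly for readability

# Captures the body of the first fenced block, with or without a language tag.
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"[\w-]*\n(.*?)" + re.escape(FENCE), re.DOTALL
)

# Illustrative patterns only; a real safety check would need to be far more thorough.
_DANGEROUS = [
    re.compile(r"\brm\s+-\w*[rR]\w*\s+/\s*$"),  # recursive delete of the root
    re.compile(r"\bmkfs\b"),                    # formatting a filesystem
    re.compile(r"\bdd\b.*\bof=/dev/"),          # raw write to a device
]


def extract_command(response_text: str) -> str:
    """Strip a markdown fence, wrapping backticks, and a leading '$ ' prompt."""
    text = response_text.strip()
    match = _FENCE_RE.search(text)
    if match:
        text = match.group(1).strip()
    # Unwrap `command` only when the backticks are a single surrounding pair,
    # so command substitution inside a longer command is left alone.
    if len(text) >= 2 and text[0] == "`" == text[-1] and text.count("`") == 2:
        text = text[1:-1].strip()
    if text.startswith("$ "):
        text = text[2:]
    return text


def looks_dangerous(command: str) -> bool:
    """Return True if the command matches one of the example patterns."""
    return any(pattern.search(command) for pattern in _DANGEROUS)
```

A pattern list like this can only catch known-bad shapes, so it complements rather than replaces asking the user to confirm before execution.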

## Extending the API Integration

To support additional AI providers or models:

1. **Create a new client factory function**:
   ```python
   def create_anthropic_client(config):
       """Create and initialize an Anthropic client."""
       raise NotImplementedError  # Initialize the Anthropic client here
   ```

2. **Add a model selection mechanism**:
   ```python
   def get_client_for_model(config):
       model = config.get("model", "gpt-3.5-turbo")
       if model.startswith("claude"):
           return create_anthropic_client(config)
       else:
           return create_client(config)
   ```

3. **Implement a provider-specific API call function**:
   ```python
   def call_anthropic_api(client, config, query):
       """Call the Anthropic API with the given query."""
       raise NotImplementedError  # Format the request for the Anthropic API here
   ```

4. **Update the command generation logic**:
   ```python
   def generate_command(client, config, query, max_retries=3):
       # Dispatch only, for brevity; keep this inside the existing retry loop
       model = config.get("model", "gpt-3.5-turbo")
       if model.startswith("claude"):
           return call_anthropic_api(client, config, query)
       else:
           return call_api(client, config, query)
   ```
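
If more than two providers are expected, the repeated `startswith` checks in steps 2 and 4 could be folded into a single lookup table. The sketch below shows one possible arrangement; the `Provider` tuple, the `PROVIDERS` table, `get_provider`, and the stub functions are all hypothetical.

```python
from typing import Callable, List, NamedTuple, Tuple


class Provider(NamedTuple):
    """Pairs a client factory with the function that sends a query through it."""

    create_client: Callable[[dict], object]
    call: Callable[[object, dict, str], str]


# Stubs standing in for the real factories and call functions.
def _openai_client(config: dict) -> object:
    return "openai-client"


def _openai_call(client: object, config: dict, query: str) -> str:
    return f"[{client}] {query}"


def _anthropic_client(config: dict) -> object:
    return "anthropic-client"


def _anthropic_call(client: object, config: dict, query: str) -> str:
    return f"[{client}] {query}"


# Model-name prefix -> provider, checked in order. The empty prefix matches
# every model name, so the last entry acts as the default.
PROVIDERS: List[Tuple[str, Provider]] = [
    ("claude", Provider(_anthropic_client, _anthropic_call)),
    ("", Provider(_openai_client, _openai_call)),
]


def get_provider(config: dict) -> Provider:
    """Pick the provider whose prefix matches the configured model name."""
    model = config.get("model", "gpt-3.5-turbo")
    for prefix, provider in PROVIDERS:
        if model.startswith(prefix):
            return provider
    raise ValueError(f"No provider registered for model {model!r}")
```

With this layout, adding a provider means adding one table entry, and `generate_command()` can call `provider.call(client, config, query)` inside its retry loop without knowing which provider is in use.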

## API Response Examples

### Successful Response

```json
{
  "choices": [
    {
      "message": {
        "content": "ls -la",
        "role": "assistant"
      },
      "index": 0,
      "finish_reason": "stop"
    }
  ]
}
```

### Error Response

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "param": null,
    "code": null
  }
}
```
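
The OpenAI Python client normally turns error payloads into exceptions, so code that uses it rarely sees the error JSON directly. When working with raw JSON instead (for example in tests or log analysis), the two shapes above could be told apart along these lines. The `ApiError` class and `command_from_payload` helper are hypothetical.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception carrying the error type from the payload."""

    def __init__(self, message: str, error_type: Optional[str] = None) -> None:
        super().__init__(message)
        self.error_type = error_type


def command_from_payload(raw: str) -> str:
    """Return the command from a success payload, or raise ApiError."""
    payload = json.loads(raw)

    if "error" in payload:
        err = payload["error"]
        raise ApiError(err.get("message", "unknown error"), err.get("type"))

    choices = payload.get("choices") or []
    if not choices:
        raise ApiError("response contained no choices")

    # content may be null in the JSON, so fall back to an empty string
    content = choices[0]["message"].get("content") or ""
    return content.strip()
```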

## Performance Considerations

To optimize API usage and performance:

1. **Use Efficient Models**: Default to `gpt-3.5-turbo` for lower latency
2. **Limit Token Usage**: Keep `max_tokens` modest (default 500); generated shell commands are short
3. **Request Caching**: Consider implementing caching for common queries
4. **Concurrent Requests**: For batch processing, consider async requests
5. **Prompt Optimization**: Keep prompt templates concise but effective
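
As a sketch of the request caching idea in item 3, the class below keeps generated commands in memory, keyed by model, shell, and query. `CommandCache` is hypothetical and deliberately minimal: it has no size limit or expiry, and it assumes that repeating a query with the same settings should return the same command.

```python
from typing import Callable, Dict, Tuple


class CommandCache:
    """Hypothetical in-memory cache in front of a command generator."""

    def __init__(self, generate: Callable[[str], str], model: str, shell: str) -> None:
        self._generate = generate
        self._model = model
        self._shell = shell
        self._store: Dict[Tuple[str, str, str], str] = {}
        self.hits = 0

    def get(self, query: str) -> str:
        """Return the cached command for this query, generating it on a miss."""
        # Only surrounding whitespace is normalized; case and inner spacing
        # can be significant (file names, quoted strings).
        key = (self._model, self._shell, query.strip())
        if key in self._store:
            self.hits += 1
            return self._store[key]
        command = self._generate(query)
        self._store[key] = command
        return command
```

It could wrap the existing entry point as `CommandCache(lambda q: generate_command(client, config, q), model, shell)`. Including the model and shell in the key avoids serving a command that was generated for a different shell.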