Kode-cli/DEPLOYMENT_GUIDE.md
CrazyBoyM 02c7c31a31 feat: Add GPT-5 Responses API support and model adapter system
- Implement model adapter factory for unified API handling
- Add response state manager for conversation continuity
- Support GPT-5 Responses API with continuation tokens
- Add model capabilities type system
- Include deployment guide and test infrastructure
- Enhance error handling and debugging for model interactions
2025-08-22 13:22:48 +08:00

5.2 KiB

Kode Responses API Support - Deployment Guide

🚀 Overview

The new capability-based model system has been successfully implemented to support GPT-5 and other Responses API models. The system replaces hardcoded model detection with a flexible, extensible architecture.

What's New

1. Capability-Based Architecture

  • Models are now defined by their capabilities rather than name-based detection
  • Automatic API selection (Responses API vs Chat Completions)
  • Seamless fallback mechanism for compatibility

2. New Files Created

src/
├── types/modelCapabilities.ts          # Type definitions
├── constants/modelCapabilities.ts      # Model capability registry
├── services/
│   ├── modelAdapterFactory.ts         # Adapter factory
│   └── adapters/                      # Pure adapters
│       ├── base.ts                    # Base adapter class
│       ├── responsesAPI.ts            # Responses API adapter
│       └── chatCompletions.ts         # Chat Completions adapter
└── test/testAdapters.ts               # Test suite

3. Supported Models

  • GPT-5 Series: gpt-5, gpt-5-mini, gpt-5-nano
  • GPT-4 Series: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4
  • Claude Series: All Claude models
  • O1 Series: o1, o1-mini, o1-preview
  • Future Models: GPT-6, GLM-5, and more through configuration

🔧 How to Use

Enable the New System

# Enable new adapter system (default)
export USE_NEW_ADAPTERS=true

# Use legacy system (fallback)
export USE_NEW_ADAPTERS=false

Add Support for New Models

Edit src/constants/modelCapabilities.ts:

// Add your model to the registry
export const MODEL_CAPABILITIES_REGISTRY: Record<string, ModelCapabilities> = {
  // ... existing models ...
  
  'your-model-name': {
    apiArchitecture: {
      primary: 'responses_api',  // or 'chat_completions'
      fallback: 'chat_completions'  // optional
    },
    parameters: {
      maxTokensField: 'max_completion_tokens',  // or 'max_tokens'
      supportsReasoningEffort: true,
      supportsVerbosity: true,
      temperatureMode: 'flexible'  // or 'fixed_one' or 'restricted'
    },
    toolCalling: {
      mode: 'custom_tools',  // or 'function_calling' or 'none'
      supportsFreeform: true,
      supportsAllowedTools: true,
      supportsParallelCalls: true
    },
    stateManagement: {
      supportsResponseId: true,
      supportsConversationChaining: true,
      supportsPreviousResponseId: true
    },
    streaming: {
      supported: false,
      includesUsage: true
    }
  }
}

🧪 Testing

Run Adapter Tests

npx tsx src/test/testAdapters.ts

Verify TypeScript Compilation

npx tsc --noEmit

🏗️ Architecture

Request Flow

User Input
    ↓
query.ts
    ↓
claude.ts (queryLLM)
    ↓
ModelAdapterFactory
    ↓
[Capability Check]
    ↓
ResponsesAPIAdapter or ChatCompletionsAdapter
    ↓
API Call (openai.ts)
    ↓
Response

Key Components

  1. ModelAdapterFactory: Determines which adapter to use based on model capabilities
  2. ResponsesAPIAdapter: Handles GPT-5 Responses API format
  3. ChatCompletionsAdapter: Handles traditional Chat Completions format
  4. Model Registry: Central configuration for all model capabilities

🔄 Migration from Legacy System

The system is designed for zero-downtime migration:

  1. Phase 1 : Infrastructure created (no impact on existing code)
  2. Phase 2 : Integration with environment variable toggle
  3. Phase 3: Remove legacy hardcoded checks (optional)

📊 Performance

  • Zero overhead: Capabilities are cached after first lookup
  • Smart fallback: Automatically uses Chat Completions for custom endpoints
  • Streaming aware: Falls back when streaming is needed but not supported

🛡️ Safety Features

  1. 100% backward compatible: Legacy system preserved
  2. Environment variable toggle: Easy rollback if needed
  3. Graceful degradation: Falls back to Chat Completions when needed
  4. Type-safe: Full TypeScript support

🎯 Benefits

  1. No more hardcoded model checks: Clean, maintainable code
  2. Easy to add new models: Just update the registry
  3. Future-proof: Ready for GPT-6, GLM-5, and beyond
  4. Unified interface: Same code handles all API types

📝 Notes

  • The system automatically detects official OpenAI endpoints
  • Custom endpoints automatically use Chat Completions API
  • Streaming requirements are handled transparently
  • All existing model configurations are preserved

🚨 Troubleshooting

Models not using correct API

  • Check if USE_NEW_ADAPTERS=true is set
  • Verify model is in the registry
  • Check if custom endpoint is configured (forces Chat Completions)

Type errors

  • Run npx tsc --noEmit to check for issues
  • Ensure all imports are correct

Runtime errors

  • Check console for adapter selection logs
  • Verify API keys and endpoints are correct

📞 Support

For issues or questions:

  1. Check the test output: npx tsx src/test/testAdapters.ts
  2. Review the model registry in src/constants/modelCapabilities.ts
  3. Check adapter selection logic in src/services/modelAdapterFactory.ts

Status: Production Ready with Environment Variable Toggle