CrazyBoyM 02c7c31a31 feat: Add GPT-5 Responses API support and model adapter system

- Implement model adapter factory for unified API handling
- Add response state manager for conversation continuity
- Support GPT-5 Responses API with continuation tokens
- Add model capabilities type system
- Include deployment guide and test infrastructure
- Enhance error handling and debugging for model interactions

2025-08-22 13:22:48 +08:00

5.2 KiB


Kode Responses API Support - Deployment Guide

🚀 Overview

Kode now uses a capability-based model system to support GPT-5 and other Responses API models. It replaces hardcoded, name-based model detection with a registry of per-model capabilities that drives API selection.

✅ What's New

1. Capability-Based Architecture

Models are now defined by their capabilities rather than name-based detection
Automatic API selection (Responses API vs Chat Completions)
Fallback to Chat Completions when the Responses API cannot be used (for example, with custom endpoints or unsupported streaming)

2. New Files Created

src/
├── types/modelCapabilities.ts          # Type definitions
├── constants/modelCapabilities.ts      # Model capability registry
├── services/
│   ├── modelAdapterFactory.ts         # Adapter factory
│   └── adapters/                      # Pure adapters
│       ├── base.ts                    # Base adapter class
│       ├── responsesAPI.ts            # Responses API adapter
│       └── chatCompletions.ts         # Chat Completions adapter
└── test/testAdapters.ts               # Test suite

3. Supported Models

GPT-5 Series: gpt-5, gpt-5-mini, gpt-5-nano
GPT-4 Series: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4
Claude Series: All Claude models
o1 Series: o1, o1-mini, o1-preview
Future Models: GPT-6, GLM-5, and more through configuration

🔧 How to Use

Enable the New System

# Enable new adapter system (default)
export USE_NEW_ADAPTERS=true

# Use legacy system (fallback)
export USE_NEW_ADAPTERS=false
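
The guide names the USE_NEW_ADAPTERS variable but does not show how it is read. The following is a minimal sketch of one way the flag might be interpreted, assuming that any value other than "false" enables the new system. The function name shouldUseNewAdapters and the parsing rule are hypothetical, not taken from the Kode source.

```typescript
// Hypothetical helper: only the variable name USE_NEW_ADAPTERS comes from the guide.
// Assumption: unset or any value other than "false" means the new system is on,
// which matches the guide's statement that the new adapters are the default.
type Env = Record<string, string | undefined>;

function shouldUseNewAdapters(env: Env): boolean {
  const raw = env["USE_NEW_ADAPTERS"];
  if (raw === undefined) {
    return true; // unset: fall through to the default (new system)
  }
  // Tolerate stray whitespace and capitalization in shell exports.
  return raw.trim().toLowerCase() !== "false";
}
```

In a Node process this would typically be called as shouldUseNewAdapters(process.env). Taking the environment as a parameter keeps the check easy to test.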

Add Support for New Models

Edit src/constants/modelCapabilities.ts:

// Add your model to the registry
export const MODEL_CAPABILITIES_REGISTRY: Record<string, ModelCapabilities> = {
  // ... existing models ...
  
  'your-model-name': {
    apiArchitecture: {
      primary: 'responses_api',  // or 'chat_completions'
      fallback: 'chat_completions'  // optional
    },
    parameters: {
      maxTokensField: 'max_completion_tokens',  // or 'max_tokens'
      supportsReasoningEffort: true,
      supportsVerbosity: true,
      temperatureMode: 'flexible'  // or 'fixed_one' or 'restricted'
    },
    toolCalling: {
      mode: 'custom_tools',  // or 'function_calling' or 'none'
      supportsFreeform: true,
      supportsAllowedTools: true,
      supportsParallelCalls: true
    },
    stateManagement: {
      supportsResponseId: true,
      supportsConversationChaining: true,
      supportsPreviousResponseId: true
    },
    streaming: {
      supported: false,
      includesUsage: true
    }
  }
}
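
The registry entry above implies a shape for ModelCapabilities. The sketch below reconstructs that shape from the field names and the values listed in the comments. The actual definitions in src/types/modelCapabilities.ts may differ. The FALLBACK_CAPABILITIES constant and the lookupCapabilities helper are hypothetical, and treating unknown models as plain Chat Completions models is an assumption, not documented behavior.

```typescript
// Shape inferred from the example entry; the real type file may differ.
type ApiKind = "responses_api" | "chat_completions";

interface ModelCapabilities {
  apiArchitecture: { primary: ApiKind; fallback?: ApiKind };
  parameters: {
    maxTokensField: "max_completion_tokens" | "max_tokens";
    supportsReasoningEffort: boolean;
    supportsVerbosity: boolean;
    temperatureMode: "flexible" | "fixed_one" | "restricted";
  };
  toolCalling: {
    mode: "custom_tools" | "function_calling" | "none";
    supportsFreeform: boolean;
    supportsAllowedTools: boolean;
    supportsParallelCalls: boolean;
  };
  stateManagement: {
    supportsResponseId: boolean;
    supportsConversationChaining: boolean;
    supportsPreviousResponseId: boolean;
  };
  streaming: { supported: boolean; includesUsage: boolean };
}

// Hypothetical conservative default for models missing from the registry.
const FALLBACK_CAPABILITIES: ModelCapabilities = {
  apiArchitecture: { primary: "chat_completions" },
  parameters: {
    maxTokensField: "max_tokens",
    supportsReasoningEffort: false,
    supportsVerbosity: false,
    temperatureMode: "flexible",
  },
  toolCalling: {
    mode: "function_calling",
    supportsFreeform: false,
    supportsAllowedTools: false,
    supportsParallelCalls: false,
  },
  stateManagement: {
    supportsResponseId: false,
    supportsConversationChaining: false,
    supportsPreviousResponseId: false,
  },
  streaming: { supported: true, includesUsage: false },
};

// Hypothetical lookup: own-property check avoids matching inherited keys
// such as "constructor" on the registry object.
function lookupCapabilities(
  registry: Record<string, ModelCapabilities>,
  modelName: string,
): ModelCapabilities {
  return Object.prototype.hasOwnProperty.call(registry, modelName)
    ? registry[modelName]
    : FALLBACK_CAPABILITIES;
}
```

Typing the string fields as literal unions lets the compiler reject a misspelled value such as 'response_api' when a new entry is added.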

🧪 Testing

Run Adapter Tests

npx tsx src/test/testAdapters.ts

Verify TypeScript Compilation

npx tsc --noEmit

🏗️ Architecture

Request Flow

User Input
    ↓
query.ts
    ↓
claude.ts (queryLLM)
    ↓
ModelAdapterFactory
    ↓
[Capability Check]
    ↓
ResponsesAPIAdapter or ChatCompletionsAdapter
    ↓
API Call (openai.ts)
    ↓
Response

Key Components

ModelAdapterFactory: Determines which adapter to use based on model capabilities
ResponsesAPIAdapter: Handles GPT-5 Responses API format
ChatCompletionsAdapter: Handles traditional Chat Completions format
Model Registry: Central configuration for all model capabilities
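
To make the flow and the components above concrete, here is a minimal sketch of a factory choosing between two adapters that build different request bodies from one unified request. It is an illustration under assumptions, not the code in src/services: the AdapterCaps and UnifiedRequest types, the createAdapter function, and the createRequest method are hypothetical. The request field names (input, max_output_tokens, previous_response_id, messages) follow the public OpenAI API.

```typescript
type ApiKind = "responses_api" | "chat_completions";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical, trimmed-down view of the capabilities an adapter needs.
interface AdapterCaps {
  primary: ApiKind;
  maxTokensField: "max_completion_tokens" | "max_tokens";
}

// Hypothetical API-neutral request handed to whichever adapter is selected.
interface UnifiedRequest {
  model: string;
  messages: ChatMessage[];
  maxTokens: number;
  previousResponseId?: string;
}

abstract class ModelAPIAdapter {
  constructor(protected readonly caps: AdapterCaps) {}
  abstract readonly endpoint: string;
  abstract createRequest(req: UnifiedRequest): Record<string, unknown>;
}

class ResponsesAPIAdapter extends ModelAPIAdapter {
  readonly endpoint = "/v1/responses";

  createRequest(req: UnifiedRequest): Record<string, unknown> {
    const body: Record<string, unknown> = {
      model: req.model,
      input: req.messages,
      max_output_tokens: req.maxTokens,
    };
    // Conversation chaining: point at the previous response when one exists.
    if (req.previousResponseId) {
      body["previous_response_id"] = req.previousResponseId;
    }
    return body;
  }
}

class ChatCompletionsAdapter extends ModelAPIAdapter {
  readonly endpoint = "/v1/chat/completions";

  createRequest(req: UnifiedRequest): Record<string, unknown> {
    // The token-limit field name comes from the model's capabilities.
    return {
      model: req.model,
      messages: req.messages,
      [this.caps.maxTokensField]: req.maxTokens,
    };
  }
}

// Hypothetical factory function: the capability check is the only branch.
function createAdapter(caps: AdapterCaps): ModelAPIAdapter {
  return caps.primary === "responses_api"
    ? new ResponsesAPIAdapter(caps)
    : new ChatCompletionsAdapter(caps);
}
```

Because callers only see ModelAPIAdapter, the code that issues the HTTP call does not need to know which API format was chosen.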

🔄 Migration from Legacy System

The system is designed for zero-downtime migration:

Phase 1 ✅: Infrastructure created (no impact on existing code)
Phase 2 ✅: Integration with environment variable toggle
Phase 3: Remove legacy hardcoded checks (optional)

📊 Performance

Low overhead: Capabilities are cached after the first lookup, so later requests reuse the cached result
Smart fallback: Automatically uses Chat Completions for custom endpoints
Streaming aware: Falls back to Chat Completions when streaming is required but the model's capabilities mark it as unsupported
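
The caching and streaming behavior described above can be sketched as a memoized lookup plus a small decision function. Everything here is illustrative: the model names model-a and model-b, their capability values, the registryReads counter, and the function names are assumptions made for the example, and the real cache may be structured differently.

```typescript
type ApiKind = "responses_api" | "chat_completions";

// Hypothetical, reduced capability record for this example.
interface StreamCaps {
  primary: ApiKind;
  fallback?: ApiKind;
  streamingSupported: boolean;
}

// Illustrative registry with made-up models and values.
const SAMPLE_REGISTRY: Record<string, StreamCaps> = {
  "model-a": { primary: "responses_api", fallback: "chat_completions", streamingSupported: false },
  "model-b": { primary: "chat_completions", streamingSupported: true },
};

const capsCache = new Map<string, StreamCaps>();
let registryReads = 0; // counts cache misses, only to make the caching visible

function getCachedCaps(model: string): StreamCaps | undefined {
  const hit = capsCache.get(model);
  if (hit) {
    return hit; // cache hit: the registry is not consulted again
  }
  registryReads++;
  const caps = Object.prototype.hasOwnProperty.call(SAMPLE_REGISTRY, model)
    ? SAMPLE_REGISTRY[model]
    : undefined;
  if (caps) {
    capsCache.set(model, caps);
  }
  return caps;
}

// Use the primary API unless the caller needs streaming and the model's
// capabilities say it is unsupported; then use the declared fallback.
function chooseApi(caps: StreamCaps, needsStreaming: boolean): ApiKind {
  if (needsStreaming && !caps.streamingSupported) {
    return caps.fallback ?? "chat_completions";
  }
  return caps.primary;
}
```

The lookup cost after the first request is a single Map read, which is the sense in which the overhead is negligible rather than literally zero.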

🛡️ Safety Features

100% backward compatible: Legacy system preserved
Environment variable toggle: Easy rollback if needed
Graceful degradation: Falls back to Chat Completions when needed
Type-safe: Full TypeScript support

🎯 Benefits

No more hardcoded model checks: Clean, maintainable code
Easy to add new models: Just update the registry
Future-proof: Ready for GPT-6, GLM-5, and beyond
Unified interface: Same code handles all API types

📝 Notes

The system automatically detects official OpenAI endpoints
Custom endpoints automatically use Chat Completions API
Streaming requirements are handled without extra configuration: when streaming is needed but unsupported, the fallback is applied automatically
All existing model configurations are preserved
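
The guide does not say how an official OpenAI endpoint is recognized. One plausible approach, shown here purely as an assumption, is to compare the hostname of the configured base URL against api.openai.com and to route any other host to Chat Completions. The function names and the treatment of an unset base URL as official are hypothetical.

```typescript
type ApiKind = "responses_api" | "chat_completions";

// Assumption: "official" means the host is exactly api.openai.com.
// An unset base URL is treated as the default official endpoint.
function isOfficialOpenAIEndpoint(baseURL: string | undefined): boolean {
  if (!baseURL) {
    return true;
  }
  try {
    // Compare the parsed hostname rather than using a substring match,
    // so "api.openai.com.example.net" is not mistaken for the real host.
    return new URL(baseURL).hostname === "api.openai.com";
  } catch {
    return false; // unparseable URLs are treated as custom endpoints
  }
}

// Custom endpoints are forced onto Chat Completions, as the notes describe.
function resolveApiForEndpoint(primary: ApiKind, baseURL: string | undefined): ApiKind {
  return isOfficialOpenAIEndpoint(baseURL) ? primary : "chat_completions";
}
```

Forcing custom endpoints onto Chat Completions is a conservative choice, since many OpenAI-compatible servers implement only that API.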

🚨 Troubleshooting

Models not using correct API

Check that USE_NEW_ADAPTERS is not set to false (the new adapter system is the default)
Verify model is in the registry
Check if custom endpoint is configured (forces Chat Completions)

Type errors

Run npx tsc --noEmit to check for issues
Ensure all imports are correct

Runtime errors

Check console for adapter selection logs
Verify API keys and endpoints are correct

📞 Support

For issues or questions:

Check the test output: npx tsx src/test/testAdapters.ts
Review the model registry in src/constants/modelCapabilities.ts
Check adapter selection logic in src/services/modelAdapterFactory.ts

Status: ✅ Production Ready with Environment Variable Toggle
