# Kode Responses API Support - Deployment Guide

## 🚀 Overview
The new capability-based model system has been successfully implemented to support GPT-5 and other Responses API models. The system replaces hardcoded model detection with a flexible, extensible architecture.
## ✅ What's New

### 1. Capability-Based Architecture
- Models are now defined by their capabilities rather than name-based detection
- Automatic API selection (Responses API vs Chat Completions)
- Seamless fallback mechanism for compatibility
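
For reference, the capability shape implied by the registry example under "Add Support for New Models" below looks roughly like this (a sketch inferred from that example; the authoritative definitions live in `src/types/modelCapabilities.ts`):

```typescript
// Sketch of the capability shape, inferred from the registry example
// below; the authoritative types live in src/types/modelCapabilities.ts.
export interface ModelCapabilities {
  apiArchitecture: {
    primary: 'responses_api' | 'chat_completions'
    fallback?: 'chat_completions' // optional fallback API
  }
  parameters: {
    maxTokensField: 'max_completion_tokens' | 'max_tokens'
    supportsReasoningEffort: boolean
    supportsVerbosity: boolean
    temperatureMode: 'flexible' | 'fixed_one' | 'restricted'
  }
  toolCalling: {
    mode: 'custom_tools' | 'function_calling' | 'none'
    supportsFreeform: boolean
    supportsAllowedTools: boolean
    supportsParallelCalls: boolean
  }
  stateManagement: {
    supportsResponseId: boolean
    supportsConversationChaining: boolean
    supportsPreviousResponseId: boolean
  }
  streaming: {
    supported: boolean
    includesUsage: boolean
  }
}
```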
### 2. New Files Created

```
src/
├── types/modelCapabilities.ts       # Type definitions
├── constants/modelCapabilities.ts   # Model capability registry
├── services/
│   ├── modelAdapterFactory.ts       # Adapter factory
│   └── adapters/                    # Pure adapters
│       ├── base.ts                  # Base adapter class
│       ├── responsesAPI.ts          # Responses API adapter
│       └── chatCompletions.ts       # Chat Completions adapter
└── test/testAdapters.ts             # Test suite
```
### 3. Supported Models
- GPT-5 Series: gpt-5, gpt-5-mini, gpt-5-nano
- GPT-4 Series: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4
- Claude Series: All Claude models
- O1 Series: o1, o1-mini, o1-preview
- Future Models: GPT-6, GLM-5, and more through configuration
## 🔧 How to Use

### Enable the New System

```bash
# Enable new adapter system (default)
export USE_NEW_ADAPTERS=true

# Use legacy system (fallback)
export USE_NEW_ADAPTERS=false
```

### Add Support for New Models

Edit `src/constants/modelCapabilities.ts`:
```typescript
// Add your model to the registry
export const MODEL_CAPABILITIES_REGISTRY: Record<string, ModelCapabilities> = {
  // ... existing models ...
  'your-model-name': {
    apiArchitecture: {
      primary: 'responses_api',    // or 'chat_completions'
      fallback: 'chat_completions' // optional
    },
    parameters: {
      maxTokensField: 'max_completion_tokens', // or 'max_tokens'
      supportsReasoningEffort: true,
      supportsVerbosity: true,
      temperatureMode: 'flexible' // or 'fixed_one' or 'restricted'
    },
    toolCalling: {
      mode: 'custom_tools', // or 'function_calling' or 'none'
      supportsFreeform: true,
      supportsAllowedTools: true,
      supportsParallelCalls: true
    },
    stateManagement: {
      supportsResponseId: true,
      supportsConversationChaining: true,
      supportsPreviousResponseId: true
    },
    streaming: {
      supported: false,
      includesUsage: true
    }
  }
}
```
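
Once the entry is registered, the factory should route requests for the model automatically. A hypothetical spot check (the factory method name and import path are illustrative, not confirmed):

```typescript
// Hypothetical spot check; the exact factory API may differ from this.
import { ModelAdapterFactory } from '../services/modelAdapterFactory'

const adapter = ModelAdapterFactory.createAdapter('your-model-name')
console.log(adapter.constructor.name) // expected: ResponsesAPIAdapter
```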
## 🧪 Testing

### Run Adapter Tests

```bash
npx tsx src/test/testAdapters.ts
```

### Verify TypeScript Compilation

```bash
npx tsc --noEmit
```
## 🏗️ Architecture

### Request Flow

```
User Input
    ↓
query.ts
    ↓
claude.ts (queryLLM)
    ↓
ModelAdapterFactory
    ↓
[Capability Check]
    ↓
ResponsesAPIAdapter or ChatCompletionsAdapter
    ↓
API Call (openai.ts)
    ↓
Response
```

### Key Components
- ModelAdapterFactory: Determines which adapter to use based on model capabilities
- ResponsesAPIAdapter: Handles GPT-5 Responses API format
- ChatCompletionsAdapter: Handles traditional Chat Completions format
- Model Registry: Central configuration for all model capabilities
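
Conceptually, the selection logic looks something like the sketch below. This is illustrative, not the actual implementation: the function signature, the adapter constructor arguments, and the options shape are assumptions; the real logic lives in `src/services/modelAdapterFactory.ts`.

```typescript
// Conceptual sketch of adapter selection; names and signatures are
// illustrative, not the actual modelAdapterFactory.ts implementation.
import type { ModelCapabilities } from '../types/modelCapabilities'
import { ResponsesAPIAdapter } from './adapters/responsesAPI'
import { ChatCompletionsAdapter } from './adapters/chatCompletions'

interface AdapterOptions {
  customEndpoint?: string  // non-official base URL, if configured
  needsStreaming?: boolean // caller requires streaming responses
}

function createAdapter(caps: ModelCapabilities, opts: AdapterOptions) {
  // Custom (non-official OpenAI) endpoints always use Chat Completions.
  if (opts.customEndpoint) {
    return new ChatCompletionsAdapter(caps)
  }

  // Fall back when streaming is required but the primary API entry
  // does not support it.
  if (opts.needsStreaming && !caps.streaming.supported && caps.apiArchitecture.fallback) {
    return new ChatCompletionsAdapter(caps)
  }

  return caps.apiArchitecture.primary === 'responses_api'
    ? new ResponsesAPIAdapter(caps)
    : new ChatCompletionsAdapter(caps)
}
```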
## 🔄 Migration from Legacy System
The system is designed for zero-downtime migration:
- Phase 1 ✅: Infrastructure created (no impact on existing code)
- Phase 2 ✅: Integration with environment variable toggle
- Phase 3: Remove legacy hardcoded checks (optional)
## 📊 Performance
- Zero overhead: Capabilities are cached after first lookup
- Smart fallback: Automatically uses Chat Completions for custom endpoints
- Streaming aware: Falls back when streaming is needed but not supported
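
The cached lookup could be as simple as a memoized read of the registry (an illustrative sketch; the actual caching code may differ):

```typescript
// Illustrative memoized lookup; the real caching strategy may differ.
import type { ModelCapabilities } from '../types/modelCapabilities'
import { MODEL_CAPABILITIES_REGISTRY } from '../constants/modelCapabilities'

const capabilityCache = new Map<string, ModelCapabilities>()

function getModelCapabilities(modelName: string): ModelCapabilities | undefined {
  const cached = capabilityCache.get(modelName)
  if (cached) return cached

  const caps = MODEL_CAPABILITIES_REGISTRY[modelName]
  if (caps) capabilityCache.set(modelName, caps)
  return caps
}
```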
## 🛡️ Safety Features
- 100% backward compatible: Legacy system preserved
- Environment variable toggle: Easy rollback if needed
- Graceful degradation: Falls back to Chat Completions when needed
- Type-safe: Full TypeScript support
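
The rollback path boils down to a single environment check at the call site (a sketch; the actual wiring in `claude.ts`/`queryLLM` may differ):

```typescript
// Sketch of the rollback toggle: the new system is the default, and
// only an explicit USE_NEW_ADAPTERS=false takes the legacy path.
// The actual call-site wiring in claude.ts may differ.
const useNewAdapters = process.env.USE_NEW_ADAPTERS !== 'false'

if (useNewAdapters) {
  // route the request through ModelAdapterFactory
} else {
  // take the legacy hardcoded path
}
```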
## 🎯 Benefits
- No more hardcoded model checks: Clean, maintainable code
- Easy to add new models: Just update the registry
- Future-proof: Ready for GPT-6, GLM-5, and beyond
- Unified interface: Same code handles all API types
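
One plausible shape for that unified interface is a small base class that each adapter implements (hypothetical; the actual contract is defined in `src/services/adapters/base.ts`):

```typescript
// Hypothetical base-adapter contract; method names and the request/
// response shapes are illustrative. The real class is in
// src/services/adapters/base.ts.
import type { ModelCapabilities } from '../../types/modelCapabilities'

// Placeholder shapes, for illustration only.
type UnifiedRequest = { messages: unknown[]; stream?: boolean }
type UnifiedResponse = { content: string; responseId?: string }

export abstract class BaseModelAdapter {
  constructor(protected readonly capabilities: ModelCapabilities) {}

  // Translate the unified internal request into the wire format of the
  // target API (Responses API or Chat Completions).
  abstract buildRequest(request: UnifiedRequest): unknown

  // Normalize the provider's raw response back into the unified shape.
  abstract parseResponse(raw: unknown): UnifiedResponse
}
```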
## 📝 Notes
- The system automatically detects official OpenAI endpoints
- Custom endpoints automatically use Chat Completions API
- Streaming requirements are handled transparently
- All existing model configurations are preserved
## 🚨 Troubleshooting

### Models not using the correct API

- Check that `USE_NEW_ADAPTERS=true` is set
- Verify the model is in the registry
- Check whether a custom endpoint is configured (custom endpoints force the Chat Completions API)

### Type errors

- Run `npx tsc --noEmit` to check for issues
- Ensure all imports are correct

### Runtime errors

- Check the console for adapter selection logs
- Verify that API keys and endpoints are correct
## 📞 Support

For issues or questions:

- Check the test output: `npx tsx src/test/testAdapters.ts`
- Review the model registry in `src/constants/modelCapabilities.ts`
- Check the adapter selection logic in `src/services/modelAdapterFactory.ts`
**Status**: ✅ Production Ready with Environment Variable Toggle