Kode-cli/DEPLOYMENT_GUIDE.md
CrazyBoyM 02c7c31a31 feat: Add GPT-5 Responses API support and model adapter system
- Implement model adapter factory for unified API handling
- Add response state manager for conversation continuity
- Support GPT-5 Responses API with continuation tokens
- Add model capabilities type system
- Include deployment guide and test infrastructure
- Enhance error handling and debugging for model interactions
2025-08-22 13:22:48 +08:00


# Kode Responses API Support - Deployment Guide
## 🚀 Overview
The new capability-based model system adds support for GPT-5 and other Responses API models. It replaces hardcoded, name-based model detection with a flexible, extensible architecture.
## ✅ What's New
### 1. **Capability-Based Architecture**
- Models are now defined by their capabilities rather than name-based detection
- Automatic API selection (Responses API vs Chat Completions)
- Seamless fallback mechanism for compatibility
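
The difference between name-based detection and a capability lookup can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `legacySelectApi`, `selectApi`, and the two-entry `registry` are hypothetical, and defaulting unknown models to Chat Completions is an assumption. Only the `primary` and `fallback` field names are taken from the registry example in this guide.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's exports.
type ApiType = 'responses_api' | 'chat_completions';

interface ApiArchitecture {
  primary: ApiType;
  fallback?: ApiType;
}

// A tiny stand-in registry; the real one lives in src/constants/modelCapabilities.ts.
const registry: Record<string, ApiArchitecture> = {
  'gpt-5': { primary: 'responses_api', fallback: 'chat_completions' },
  'gpt-4o': { primary: 'chat_completions' },
};

// Name-based detection (the style being replaced): brittle string matching.
function legacySelectApi(model: string): ApiType {
  return model.startsWith('gpt-5') ? 'responses_api' : 'chat_completions';
}

// Capability-based selection: look the model up and read its declared API.
// Assumption: models missing from the registry default to Chat Completions.
function selectApi(model: string): ApiType {
  return registry[model]?.primary ?? 'chat_completions';
}
```

With the lookup in place, supporting a new model means adding a registry entry rather than editing a string check.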
### 2. **New Files Created**
```
src/
├── types/modelCapabilities.ts          # Type definitions
├── constants/modelCapabilities.ts      # Model capability registry
├── services/
│   ├── modelAdapterFactory.ts          # Adapter factory
│   └── adapters/                       # Pure adapters
│       ├── base.ts                     # Base adapter class
│       ├── responsesAPI.ts             # Responses API adapter
│       └── chatCompletions.ts          # Chat Completions adapter
└── test/testAdapters.ts                # Test suite
```
### 3. **Supported Models**
- **GPT-5 Series**: gpt-5, gpt-5-mini, gpt-5-nano
- **GPT-4 Series**: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4
- **Claude Series**: All Claude models
- **o1 Series**: o1, o1-mini, o1-preview
- **Future Models**: GPT-6, GLM-5, and more through configuration
## 🔧 How to Use
### Enable the New System
```bash
# Enable new adapter system (default)
export USE_NEW_ADAPTERS=true
# Use legacy system (fallback)
export USE_NEW_ADAPTERS=false
```
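
As a rough sketch of how such a toggle might be interpreted in code: the helper below is hypothetical (the guide does not show how the project reads the variable) and assumes that any value other than the literal string `false` leaves the new system enabled, which is consistent with it being the default.

```typescript
// Hypothetical helper: the project's real parsing logic may differ.
// Assumption: only the exact string "false" switches to the legacy system;
// an unset variable keeps the new adapter system on (the documented default).
function useNewAdapters(env: Record<string, string | undefined>): boolean {
  return env.USE_NEW_ADAPTERS !== 'false';
}
```

In a Node.js process this would be called as `useNewAdapters(process.env)`.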
### Add Support for New Models
Edit `src/constants/modelCapabilities.ts`:
```typescript
// Add your model to the registry
export const MODEL_CAPABILITIES_REGISTRY: Record<string, ModelCapabilities> = {
  // ... existing models ...

  'your-model-name': {
    apiArchitecture: {
      primary: 'responses_api',     // or 'chat_completions'
      fallback: 'chat_completions'  // optional
    },
    parameters: {
      maxTokensField: 'max_completion_tokens', // or 'max_tokens'
      supportsReasoningEffort: true,
      supportsVerbosity: true,
      temperatureMode: 'flexible'   // or 'fixed_one' or 'restricted'
    },
    toolCalling: {
      mode: 'custom_tools',         // or 'function_calling' or 'none'
      supportsFreeform: true,
      supportsAllowedTools: true,
      supportsParallelCalls: true
    },
    stateManagement: {
      supportsResponseId: true,
      supportsConversationChaining: true,
      supportsPreviousResponseId: true
    },
    streaming: {
      supported: false,
      includesUsage: true
    }
  }
}
```
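
For orientation, the shape of a registry entry can be written out as a type. The interface below is inferred purely from the fields and inline comments in the example above; the actual definitions live in `src/types/modelCapabilities.ts` and may differ (for instance in which fields are optional). The `exampleChatModel` values are illustrative assumptions, not a real entry from the registry.

```typescript
// Inferred from the registry example; not copied from src/types/modelCapabilities.ts.
type ApiType = 'responses_api' | 'chat_completions';

interface ModelCapabilities {
  apiArchitecture: { primary: ApiType; fallback?: ApiType };
  parameters: {
    maxTokensField: 'max_tokens' | 'max_completion_tokens';
    supportsReasoningEffort: boolean;
    supportsVerbosity: boolean;
    temperatureMode: 'flexible' | 'fixed_one' | 'restricted';
  };
  toolCalling: {
    mode: 'none' | 'function_calling' | 'custom_tools';
    supportsFreeform: boolean;
    supportsAllowedTools: boolean;
    supportsParallelCalls: boolean;
  };
  stateManagement: {
    supportsResponseId: boolean;
    supportsConversationChaining: boolean;
    supportsPreviousResponseId: boolean;
  };
  streaming: { supported: boolean; includesUsage: boolean };
}

// Illustrative entry for a plain Chat Completions model (values are assumptions).
const exampleChatModel: ModelCapabilities = {
  apiArchitecture: { primary: 'chat_completions' },
  parameters: {
    maxTokensField: 'max_tokens',
    supportsReasoningEffort: false,
    supportsVerbosity: false,
    temperatureMode: 'flexible',
  },
  toolCalling: {
    mode: 'function_calling',
    supportsFreeform: false,
    supportsAllowedTools: false,
    supportsParallelCalls: true,
  },
  stateManagement: {
    supportsResponseId: false,
    supportsConversationChaining: false,
    supportsPreviousResponseId: false,
  },
  streaming: { supported: true, includesUsage: true },
};
```

Typing the registry this way lets `npx tsc --noEmit` catch a misspelled field or an invalid mode before the entry is ever used at runtime.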
## 🧪 Testing
### Run Adapter Tests
```bash
npx tsx src/test/testAdapters.ts
```
### Verify TypeScript Compilation
```bash
npx tsc --noEmit
```
## 🏗️ Architecture
### Request Flow
```
User Input
    ↓
query.ts
    ↓
claude.ts (queryLLM)
    ↓
ModelAdapterFactory
    ↓
[Capability Check]
    ↓
ResponsesAPIAdapter or ChatCompletionsAdapter
    ↓
API Call (openai.ts)
    ↓
Response
```
### Key Components
1. **ModelAdapterFactory**: Determines which adapter to use based on model capabilities
2. **ResponsesAPIAdapter**: Handles GPT-5 Responses API format
3. **ChatCompletionsAdapter**: Handles traditional Chat Completions format
4. **Model Registry**: Central configuration for all model capabilities
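
To make the division of labour concrete, here is a minimal sketch of how two adapters might turn one unified request into different payloads. All class, interface, and function names are hypothetical and do not mirror the files under `src/services/adapters/`. The payload field names (`messages` and `max_tokens` for Chat Completions; `input`, `max_output_tokens`, and `previous_response_id` for the Responses API) follow OpenAI's public API.

```typescript
// Hypothetical sketch of the adapter split; names are assumptions, not project code.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface UnifiedRequest {
  model: string;
  messages: Message[];
  maxTokens: number;
  previousResponseId?: string;
}

interface Adapter {
  buildRequest(req: UnifiedRequest): Record<string, unknown>;
}

// Chat Completions sends the whole history as `messages`.
class ChatCompletionsSketch implements Adapter {
  buildRequest(req: UnifiedRequest): Record<string, unknown> {
    return { model: req.model, messages: req.messages, max_tokens: req.maxTokens };
  }
}

// The Responses API takes `input`, caps output with `max_output_tokens`,
// and can chain turns through `previous_response_id`.
class ResponsesApiSketch implements Adapter {
  buildRequest(req: UnifiedRequest): Record<string, unknown> {
    const body: Record<string, unknown> = {
      model: req.model,
      input: req.messages,
      max_output_tokens: req.maxTokens,
    };
    if (req.previousResponseId) {
      body['previous_response_id'] = req.previousResponseId;
    }
    return body;
  }
}

// The factory hides the choice of adapter from the caller.
function createAdapter(api: 'responses_api' | 'chat_completions'): Adapter {
  return api === 'responses_api' ? new ResponsesApiSketch() : new ChatCompletionsSketch();
}
```

Callers work only with `UnifiedRequest` and `Adapter`, so the calling code stays the same whichever API the model uses.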
## 🔄 Migration from Legacy System
The system is designed for zero-downtime migration:
1. **Phase 1** ✅: Infrastructure created (no impact on existing code)
2. **Phase 2** ✅: Integration with environment variable toggle
3. **Phase 3**: Remove legacy hardcoded checks (optional)
## 📊 Performance
- **Low overhead**: Capabilities are cached after the first lookup, so repeated requests for the same model skip the registry
- **Smart fallback**: Automatically uses Chat Completions for custom endpoints
- **Streaming aware**: Falls back when streaming is needed but not supported
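
The three behaviours above can be sketched together as follows. This is an illustration under stated assumptions, not the project's implementation: `getCaps`, `resolveApi`, the flattened `Caps` shape, and the exact order of the fallback rules are all hypothetical, as is treating unregistered models as plain Chat Completions models.

```typescript
// Hypothetical sketch; names and the exact fallback order are assumptions.
type ApiType = 'responses_api' | 'chat_completions';

interface Caps {
  primary: ApiType;
  fallback?: ApiType;
  streamingSupported: boolean;
}

const registry: Record<string, Caps> = {
  'gpt-5': { primary: 'responses_api', fallback: 'chat_completions', streamingSupported: false },
};

const cache = new Map<string, Caps>();
let registryLookups = 0; // counts cache misses, for illustration only

function getCaps(model: string): Caps {
  const cached = cache.get(model);
  if (cached) return cached;
  registryLookups += 1;
  // Assumption: unregistered models behave like plain Chat Completions models.
  const caps: Caps = registry[model] ?? { primary: 'chat_completions', streamingSupported: true };
  cache.set(model, caps);
  return caps;
}

function resolveApi(
  model: string,
  opts: { customEndpoint: boolean; stream: boolean },
): ApiType {
  const caps = getCaps(model);
  // Custom endpoints always use Chat Completions.
  if (opts.customEndpoint) return 'chat_completions';
  // Streaming requested but unsupported on the primary API: fall back.
  if (opts.stream && !caps.streamingSupported) return caps.fallback ?? 'chat_completions';
  return caps.primary;
}
```

Only the first call for a given model touches the registry; later calls are served from the `Map`.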
## 🛡️ Safety Features
1. **100% backward compatible**: Legacy system preserved
2. **Environment variable toggle**: Easy rollback if needed
3. **Graceful degradation**: Falls back to Chat Completions when needed
4. **Type-safe**: Full TypeScript support
## 🎯 Benefits
1. **No more hardcoded model checks**: Clean, maintainable code
2. **Easy to add new models**: Just update the registry
3. **Future-proof**: Ready for GPT-6, GLM-5, and beyond
4. **Unified interface**: Same code handles all API types
## 📝 Notes
- The system automatically detects official OpenAI endpoints
- Custom endpoints automatically use Chat Completions API
- Streaming requirements are handled transparently
- All existing model configurations are preserved
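
One plausible way to distinguish official from custom endpoints is to compare the host of the configured base URL. The function below is a hypothetical sketch: the guide does not describe the project's real detection logic, and both the `api.openai.com` host check and the treatment of a missing base URL as "official" are assumptions.

```typescript
// Hypothetical check; the project's real detection logic may differ.
// Assumptions: "official" means the host is exactly api.openai.com, and a
// missing base URL means the client default (the official endpoint) is used.
function isOfficialOpenAIEndpoint(baseURL?: string): boolean {
  if (!baseURL) return true;
  try {
    return new URL(baseURL).hostname === 'api.openai.com';
  } catch {
    return false; // unparsable URLs are treated as custom endpoints
  }
}
```

Comparing the parsed hostname rather than searching for a substring avoids misclassifying hosts such as `api.openai.com.example.org`.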
## 🚨 Troubleshooting
### Models not using the correct API
- Check that `USE_NEW_ADAPTERS` is not set to `false` (the new system is the default)
- Verify that the model is in the registry
- Check whether a custom endpoint is configured (this forces Chat Completions)
### Type errors
- Run `npx tsc --noEmit` to check for issues
- Ensure all imports are correct
### Runtime errors
- Check console for adapter selection logs
- Verify API keys and endpoints are correct
## 📞 Support
For issues or questions:
1. Check the test output: `npx tsx src/test/testAdapters.ts`
2. Review the model registry in `src/constants/modelCapabilities.ts`
3. Check adapter selection logic in `src/services/modelAdapterFactory.ts`
---
**Status**: ✅ Production Ready with Environment Variable Toggle