chore: initialize sandbox and overwrite remote content
Some checks failed
Pre-commit / run (ubuntu-latest) (push) Has been cancelled
Deploy Sphinx documentation to Pages / build_en (ubuntu-latest, 3.10) (push) Has been cancelled
Deploy Sphinx documentation to Pages / build_zh (ubuntu-latest, 3.10) (push) Has been cancelled
Python Unittest Coverage / test (macos-15, 3.10) (push) Has been cancelled
Python Unittest Coverage / test (macos-15, 3.11) (push) Has been cancelled
Python Unittest Coverage / test (macos-15, 3.12) (push) Has been cancelled
Python Unittest Coverage / test (ubuntu-latest, 3.10) (push) Has been cancelled
Python Unittest Coverage / test (ubuntu-latest, 3.11) (push) Has been cancelled
Python Unittest Coverage / test (ubuntu-latest, 3.12) (push) Has been cancelled
Python Unittest Coverage / test (windows-latest, 3.10) (push) Has been cancelled
Python Unittest Coverage / test (windows-latest, 3.11) (push) Has been cancelled
Python Unittest Coverage / test (windows-latest, 3.12) (push) Has been cancelled
160
examples/workflows/multiagent_realtime/README.md
Normal file
@@ -0,0 +1,160 @@
# Multi-Agent Realtime Voice Interaction Example

This example demonstrates how to use AgentScope's `ChatRoom` class to create a multi-agent real-time voice interaction system where two AI agents can have autonomous conversations without user input.

## Features

- 🗣️ **Real-time Voice Interaction**: Two agents communicate through voice in real-time
- 🤖 **Autonomous Conversation**: Agents converse with each other without user intervention
- ⚙️ **Customizable Configuration**: Configure agent names and instructions through the web interface
- 🎨 **Modern UI**: Clean, shadcn-inspired interface for easy interaction
- 📊 **Live Transcript**: See the conversation transcripts in real-time

## Architecture

The example uses:
- **Backend**: FastAPI server with WebSocket support
- **Frontend**: HTML5 with Web Audio API for audio playback
- **AgentScope Components**:
  - `ChatRoom`: Manages multiple `RealtimeAgent` instances
  - `RealtimeAgent`: Handles real-time voice interaction with AI models
  - `DashScopeRealtimeModel`: DashScope's Qwen3-Omni realtime model

## Prerequisites

1. **Python Dependencies**:
   ```bash
   # Quote the extra so shells such as zsh do not treat the brackets as a glob
   pip install "agentscope[dashscope]"
   pip install fastapi uvicorn
   ```

2. **DashScope API Key**:
   - Set your DashScope API key as an environment variable:
   ```bash
   export DASHSCOPE_API_KEY="your-api-key-here"
   ```

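Before starting the server, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch, not part of the example itself: the helper name `check_api_key` is hypothetical, and only the variable name `DASHSCOPE_API_KEY` comes from this README.

```python
import os


def check_api_key(env=None, var_name="DASHSCOPE_API_KEY"):
    """Return the key if it is set and non-empty, otherwise raise.

    `check_api_key` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before starting the server."
        )
    return key


if __name__ == "__main__":
    try:
        check_api_key()
        print("DASHSCOPE_API_KEY is set.")
    except RuntimeError as err:
        print(err)
```

Failing early with a clear message is usually easier to debug than an authentication error that surfaces later from the realtime connection.
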
## Usage

1. **Start the Server**:
   ```bash
   python run_server.py
   ```

2. **Open the Web Interface**:
   - Navigate to `http://localhost:8000` in your web browser

3. **Configure Agents**:
   - Set names and instructions for both Agent 1 and Agent 2
   - Example configurations:
     - **Agent 1 (Alice)**: "You are Alice, a cheerful and optimistic person who loves to share stories and ask questions. Keep your responses brief and conversational."
     - **Agent 2 (Bob)**: "You are Bob, a thoughtful and analytical person who enjoys deep conversations. Keep your responses brief and conversational."

4. **Start the Conversation**:
   - Click the "▶️ Start Conversation" button
   - The agents will begin conversing autonomously
   - You'll see transcripts and system messages in the message panel
   - Audio playback will stream in real-time

5. **Stop the Conversation**:
   - Click the "⏹️ Stop Conversation" button when you want to end the session

## How It Works

### Backend Flow

1. **WebSocket Connection**: Client connects via WebSocket to `/ws/{user_id}/{session_id}`
2. **Session Creation**:
   - Client sends `client_session_create` event with agent configurations
   - Server creates two `RealtimeAgent` instances with specified names and instructions
   - Server creates a `ChatRoom` with both agents
   - Server starts the chat room and returns `session_created` event
3. **Message Broadcasting**:
   - `ChatRoom` automatically broadcasts messages between agents
   - All events (audio, transcripts, etc.) are forwarded to the frontend
4. **Session End**: Client sends `client_session_end` event to stop the conversation

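The client side of the backend flow can be sketched as a few message builders. Only the path `/ws/{user_id}/{session_id}` and the event names `client_session_create` and `client_session_end` come from this README; the JSON keys (`type`, `agents`, `name`, `instructions`) and the helper names are assumptions made for illustration, so check `run_server.py` for the schema the server actually expects.

```python
import json


def ws_path(user_id, session_id):
    """Build the WebSocket path named in the README."""
    return f"/ws/{user_id}/{session_id}"


def session_create_event(agents):
    """Serialize a client_session_create event.

    The "type" and "agents" keys are hypothetical, not a confirmed schema.
    `agents` is a list of (name, instructions) pairs.
    """
    return json.dumps(
        {
            "type": "client_session_create",
            "agents": [
                {"name": name, "instructions": instructions}
                for name, instructions in agents
            ],
        }
    )


def session_end_event():
    """Serialize a client_session_end event (the "type" key is assumed)."""
    return json.dumps({"type": "client_session_end"})


if __name__ == "__main__":
    print(ws_path("user1", "session1"))
    print(session_create_event([("Alice", "Be cheerful."), ("Bob", "Be analytical.")]))
    print(session_end_event())
```
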
### Frontend Flow

1. **WebSocket Setup**: Establishes connection and waits for server events
2. **Session Management**: Sends configuration and manages conversation state
3. **Audio Playback**:
   - Receives base64-encoded PCM16 audio chunks
   - Decodes and queues audio data
   - Uses Web Audio API `ScriptProcessorNode` for streaming playback at 24 kHz
4. **Transcript Display**: Shows real-time transcripts from both agents

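The decoding step in the audio playback path can be illustrated outside the browser. The sketch below is written in Python rather than the frontend's JavaScript, and it assumes little-endian, mono samples (the README states only "PCM16" and 24 kHz). It converts one base64 chunk into floats in the range the Web Audio API works with.

```python
import base64
import struct

SAMPLE_RATE_HZ = 24_000  # playback rate stated in the README


def decode_pcm16_chunk(b64_chunk):
    """Decode base64 PCM16 (assumed little-endian, mono) to floats in [-1, 1)."""
    raw = base64.b64decode(b64_chunk)
    usable = len(raw) - (len(raw) % 2)  # drop a dangling odd byte, if any
    samples = struct.unpack(f"<{usable // 2}h", raw[:usable])
    return [sample / 32768.0 for sample in samples]


def chunk_duration_seconds(samples, sample_rate=SAMPLE_RATE_HZ):
    """Length of a decoded chunk in seconds at the given sample rate."""
    return len(samples) / sample_rate


if __name__ == "__main__":
    chunk = base64.b64encode(struct.pack("<3h", 0, 16384, -32768)).decode()
    print(decode_pcm16_chunk(chunk))
```

Dividing by 32768 maps the signed 16-bit range onto [-1, 1), which is the sample range an audio buffer expects.
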
## Key Components

### ChatRoom

The `ChatRoom` class manages multiple `RealtimeAgent` instances:
- Establishes connections for all agents
- Broadcasts messages between agents automatically
- Forwards events to the frontend
- Handles lifecycle management (start/stop)

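The broadcasting behaviour described above can be pictured with a toy model. This is not AgentScope's implementation: `ToyAgent` and `ToyRoom` are hypothetical stand-ins that only show the idea of forwarding each message to every agent except its sender.

```python
import asyncio


class ToyAgent:
    """Hypothetical stand-in for a realtime agent; it only records input."""

    def __init__(self, name):
        self.name = name
        self.heard = []

    async def receive(self, sender, message):
        self.heard.append((sender, message))


class ToyRoom:
    """Hypothetical room that forwards a message to all agents but the sender."""

    def __init__(self, agents):
        self.agents = list(agents)

    async def broadcast(self, sender, message):
        listeners = [agent for agent in self.agents if agent is not sender]
        await asyncio.gather(
            *(agent.receive(sender.name, message) for agent in listeners)
        )


async def demo():
    alice, bob, carol = ToyAgent("Alice"), ToyAgent("Bob"), ToyAgent("Carol")
    room = ToyRoom([alice, bob, carol])
    await room.broadcast(alice, "Hello, everyone!")
    return alice.heard, bob.heard, carol.heard


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Excluding the sender keeps an agent from hearing its own output, which would otherwise feed back into the conversation.
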
### RealtimeAgent

Each `RealtimeAgent`:
- Connects to the DashScope realtime API
- Processes audio input from other agents
- Generates voice responses
- Emits events for transcripts, audio, and status updates

## Customization

### Changing the Model

To use a different model, modify the `DashScopeRealtimeModel` configuration in `run_server.py`:

```python
model=DashScopeRealtimeModel(
    model_name="your-model-name",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
)
```

### Adding More Agents

To add more agents, modify the agent creation section in `run_server.py`:

```python
agent3 = RealtimeAgent(
    name=agent3_name,
    sys_prompt=agent3_instructions,
    model=DashScopeRealtimeModel(
        model_name="qwen3-omni-flash-realtime",
        api_key=os.getenv("DASHSCOPE_API_KEY"),
    ),
)

chat_room = ChatRoom(agents=[agent1, agent2, agent3])
```

Then update the frontend to include configuration fields for the additional agents.

## Troubleshooting

### No Audio Playback
- Ensure your browser supports the Web Audio API
- Check the browser console for audio-related errors
- Verify the audio format matches the expected PCM16 at 24 kHz

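One way to check the audio format is to wrap a captured chunk in a WAV container and open it in any audio player. The sketch below uses only the Python standard library; the helper name `pcm16_to_wav_bytes` is hypothetical, and mono audio is an assumption, since the README specifies only PCM16 at 24 kHz.

```python
import io
import wave


def pcm16_to_wav_bytes(pcm_bytes, sample_rate=24_000, channels=1):
    """Wrap raw PCM16 bytes in a WAV container (mono is an assumption)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


if __name__ == "__main__":
    # One second of silence as placeholder data; substitute decoded chunk bytes.
    with open("debug_chunk.wav", "wb") as out_file:
        out_file.write(pcm16_to_wav_bytes(b"\x00\x00" * 24_000))
```

If the resulting file plays too fast, too slow, or as noise, the sample rate or sample width assumed here probably does not match what the server is sending.
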
### Connection Issues
- Verify your DashScope API key is set correctly
- Check that port 8000 is not blocked by a firewall
- Review the server logs for error messages

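To tell a server that is not running apart from a blocked port, a quick TCP probe can help. This is a minimal sketch using the standard library; `port_is_open` is a hypothetical helper, and the default host and port simply mirror the `http://localhost:8000` address used in this README.

```python
import socket


def port_is_open(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:8000 is {state}")
```

Run it on the same machine as the server first; if the port is reachable locally but not from another machine, a firewall or bind address is the more likely cause.
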
### Agents Not Responding
- Ensure both agent configurations have valid instructions
- Check that the instructions encourage conversational behavior
- Review the console logs for API errors

## References

- [AgentScope Documentation](https://modelscope.github.io/agentscope/)
- [DashScope API Documentation](https://help.aliyun.com/zh/model-studio/)
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [Web Audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API)