Features - Echo

Speech Recognition

⚡ Real-time Streaming

Words appear as you speak. Streaming speech recognition shows text in real time, so there is no waiting for the recording to finish.

First-token latency as low as 180 ms
Multi-engine: Deepgram, ByteDance Volcano, OpenAI Whisper
Intelligent routing selects the optimal path
Mixed Chinese-English recognition

🚀 Stream Mode — Speed First

Speak and see text instantly. Spot and fix errors in real time.

Fastest
Real-time
Fix as you speak

📦 Batch Mode — Quality First

Processes your speech after you finish speaking. Better for deep polish.

Better for polish
Stable results
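The difference between the two modes can be sketched in a few lines. This is only an illustration, not Echo's implementation: `fake_recognize`, `stream_mode`, and `batch_mode` are hypothetical names, and the "audio chunks" are plain strings standing in for real audio.

```python
from typing import Iterable, Iterator, List


def fake_recognize(chunk: str) -> str:
    """Hypothetical stand-in for an ASR call; the 'chunk' is already text."""
    return chunk.strip()


def stream_mode(chunks: Iterable[str]) -> Iterator[str]:
    """Speed first: yield the transcript so far after every chunk."""
    words: List[str] = []
    for chunk in chunks:
        words.append(fake_recognize(chunk))
        yield " ".join(words)


def batch_mode(chunks: Iterable[str]) -> str:
    """Quality first: consume all chunks, then return one final transcript."""
    return " ".join(fake_recognize(chunk) for chunk in chunks)


audio = ["hello ", "from ", "Echo"]
partials = list(stream_mode(audio))  # one partial transcript per chunk
final = batch_mode(audio)            # a single result at the end
```

In the stream sketch the caller sees text after every chunk and can react early; in the batch sketch nothing is returned until all input has arrived, which leaves room for heavier post-processing.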

AI Polish

🎯 V2 Smart Presets

Five preset modes from pure transcript to deep polish. Choose the best processing for your scenario with one click.

Pure Transcript

Raw transcript only

StreamFast

Speed first

Smart Polish ⭐

Balanced (default)

Deep Edit

Maximum rewrite
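As a rough sketch, the presets could be modelled as an enumeration with a default. The names `Preset`, `DEFAULT_PRESET`, and `choose_preset` are assumptions for illustration, not Echo's API, and only the four presets named above are included.

```python
from enum import Enum
from typing import Optional


class Preset(Enum):
    """Hypothetical model of the presets named in the text."""
    PURE_TRANSCRIPT = "Raw transcript only"
    STREAM_FAST = "Speed first"
    SMART_POLISH = "Balanced"
    DEEP_EDIT = "Maximum rewrite"


# The text marks Smart Polish as the default.
DEFAULT_PRESET = Preset.SMART_POLISH


def choose_preset(name: Optional[str] = None) -> Preset:
    """Return the named preset, or the default when no name is given."""
    if name is None:
        return DEFAULT_PRESET
    return Preset[name]  # raises KeyError for an unknown preset name
```

For example, `choose_preset()` falls back to Smart Polish, while `choose_preset("DEEP_EDIT")` selects the maximum-rewrite mode.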

Speech Recognition

✨ Smart ASR Engines

Industry-leading speech recognition with intelligent multi-engine routing for optimal results.

Deepgram — English leader, fast streaming
ByteDance Volcano — Chinese optimized, low latency
OpenAI Whisper — Offline backup, high accuracy
Smart switching automatically picks the optimal path
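One way to picture the routing described above is a small rule-based selector. This is a minimal sketch under stated assumptions: the function `pick_engine`, the language codes, and the rule that unlisted languages fall back to Whisper are all hypothetical, and Echo's actual routing logic is not documented here.

```python
def pick_engine(language: str, online: bool = True) -> str:
    """Pick an ASR engine label from a language code and connectivity.

    Assumed rules, based only on the engine descriptions above:
    - offline: Whisper (described as the offline backup)
    - Chinese ("zh..."): ByteDance Volcano
    - English ("en..."): Deepgram
    - anything else: Whisper (an assumption, not stated in the text)
    """
    if not online:
        return "whisper"
    code = language.lower()
    if code.startswith("zh"):
        return "volcano"
    if code.startswith("en"):
        return "deepgram"
    return "whisper"
```

For example, `pick_engine("en-US")` selects Deepgram, `pick_engine("zh-CN")` selects Volcano, and any language with `online=False` falls back to Whisper.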

Real-time Translation

🌍 Real-time Translation

Speak in Chinese, get English. Speak in English, get Chinese. International meetings, talking with foreign colleagues, and learning languages all become simpler.

Chinese ↔ English bidirectional
Stream translation as you speak
Works with other polish modes

Input Experience

📝 Editable by Design

Transcribed text is inserted directly into your current input field, and you can edit it at any time. It is not just voice: it is voice and text fused together.

Auto-insert into the focused input
macOS global hotkey support
iOS keyboard extension
Real-time editing

💡 Echo's Flexibility Design

We believe everyone has different needs. Echo provides dual flexibility:

1. API Flexibility: Logged-in users get a cloud proxy that protects API keys. Users who are not logged in can bring their own keys, so flexibility meets security.

2. Mode Switching: Stream mode for speed, Batch mode for quality — choose based on your scenario.
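The API flexibility in point 1 amounts to choosing how a request is authorized. The sketch below is illustrative only: `Session`, `resolve_transport`, and the returned labels are hypothetical names, and the behavior for a user who is neither logged in nor has a key is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical view of the user's state."""
    logged_in: bool
    own_api_key: Optional[str] = None


def resolve_transport(session: Session) -> str:
    """Decide how ASR requests would be authorized for this session."""
    if session.logged_in:
        # Logged-in users go through the cloud proxy, so no key
        # needs to be stored on the device.
        return "cloud-proxy"
    if session.own_api_key:
        # Users who are not logged in supply their own provider key.
        return "direct-with-user-key"
    # Assumption: without a login or a key, no request can be made.
    raise ValueError("not logged in and no API key provided")
```

For example, `resolve_transport(Session(logged_in=False, own_api_key="sk-example"))` returns `"direct-with-user-key"`, while a logged-in session always resolves to the proxy.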

Powerful Yet Simple