GitHub - modelscope/FunASR
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
Overview
FunASR is an industrial-grade, open-source speech recognition toolkit developed by the ModelScope team. It is quoted as running about 170 times faster than real time and supports both batch and streaming recognition. It covers more than 50 languages and includes speaker diarization (identifying who spoke when), emotion detection, and an OpenAI-compatible API for integration with existing clients. Typical uses include voice assistants, transcription services, and real-time captioning.
Key Features
170x real-time speech recognition
Support for 50+ languages
Speaker diarization
Emotion detection
Streaming (real-time) recognition
OpenAI-compatible API
Pre-trained models for various domains
Customizable and fine-tunable
Lightweight deployment options
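As a rough illustration of what "OpenAI-compatible API" could mean in practice, the sketch below builds (without sending) a multipart request in the style of OpenAI's public audio transcription endpoint. The path `/v1/audio/transcriptions` and the `file`/`model` form fields come from OpenAI's API; whether a FunASR server exposes exactly this path, along with the host, port, and model name used here, are assumptions for illustration only.

```python
import urllib.request
import uuid


def _field(boundary, header, payload):
    """Encode one multipart/form-data part."""
    return f"--{boundary}\r\n{header}\r\n\r\n".encode() + payload + b"\r\n"


def build_transcription_request(base_url, audio_bytes, filename, model):
    """Build, but do not send, an OpenAI-style transcription request.

    base_url and model are placeholders (assumptions), not values
    documented by FunASR.
    """
    boundary = uuid.uuid4().hex
    body = (
        _field(boundary, 'Content-Disposition: form-data; name="model"',
               model.encode())
        + _field(
            boundary,
            f'Content-Disposition: form-data; name="file"; filename="{filename}"'
            "\r\nContent-Type: application/octet-stream",
            audio_bytes,
        )
        + f"--{boundary}--\r\n".encode()
    )
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Hypothetical local server and model name, for illustration only.
req = build_transcription_request(
    "http://localhost:8000", b"RIFF", "sample.wav", "placeholder-model"
)
```

Against a running server, the request could be sent with `urllib.request.urlopen(req)` and the JSON response parsed with the `json` module. Building the request separately from sending it keeps the sketch inspectable without a live endpoint.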
Pros & Cons
Pros
- High speed (170x real-time)
- Multi-language support (50+)
- Comprehensive features (diarization, emotion)
- Streaming and batch modes
- OpenAI-compatible API for easy integration
- Open-source and free
- Active community and updates
Cons
- Requires technical expertise for setup
- Documentation may be incomplete for some languages
- Emotion detection accuracy varies by language and context
Pricing Details
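FunASR is open-source and free to use, with no licensing fees. The project's README names the MIT License for the code (not Apache 2.0), and pretrained models are subject to a separate model license agreement, so check the repository's LICENSE file for the terms that apply to your use. For enterprise support or custom deployment, contact ModelScope.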
FAQ
What is the real-time factor?
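FunASR is quoted at 170x real time, meaning it processes 170 seconds of audio in about 1 second. Expressed as a conventional real-time factor (processing time divided by audio duration), that is roughly 0.006.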
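The arithmetic behind that figure can be sketched as follows. The 170x speedup is the number quoted on this page; actual throughput depends on hardware, model, and audio, so treat the default as an assumption rather than a guarantee.

```python
def processing_seconds(audio_seconds, speedup=170.0):
    """Estimated wall-clock time to transcribe audio at a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def real_time_factor(speedup=170.0):
    """Conventional RTF: processing time divided by audio duration."""
    return 1.0 / speedup


one_hour = processing_seconds(3600)  # about 21.2 seconds for one hour of audio
rtf = real_time_factor()             # about 0.0059
```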
Does FunASR support streaming?
Yes, it supports streaming recognition with low latency.
Can I deploy FunASR on-premises?
Yes, it is open-source and can be deployed anywhere.
How many languages does it support?
Over 50 languages, including major world languages.
Is speaker diarization included?
Yes, FunASR can identify different speakers in audio.
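To make "who spoke when" concrete, here is a minimal sketch of post-processing diarized output into speaker turns. The `Segment` structure and the speaker labels are hypothetical and do not represent FunASR's actual output schema; the sketch only shows the general idea of merging consecutive segments from the same speaker.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape of one diarized segment (not FunASR's schema).
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_turns(segments):
    """Merge time-ordered, consecutive segments that share a speaker."""
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1].speaker == seg.speaker:
            last = turns[-1]
            turns[-1] = Segment(
                last.speaker,
                last.start,
                max(last.end, seg.end),
                f"{last.text} {seg.text}",
            )
        else:
            turns.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return turns


example = [
    Segment("spk0", 0.0, 1.0, "hello"),
    Segment("spk0", 1.0, 2.0, "there"),
    Segment("spk1", 2.0, 3.0, "hi"),
]
turns = merge_turns(example)
```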
What about emotion detection?
It includes emotion recognition capabilities.
Smart Alternatives & Comparison
Compare FunASR side-by-side with other leading tools in the same category.
| Criteria | FunASR (this tool) | | | |
|---|---|---|---|---|
| Overview | Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API. | Manage local and cloud AI agents from your editor. Plan, delegate, review, and ship code without context switching. | Spawn background coding agents in the cloud. Hand off tasks from Slack, Linear, or GitHub and get a PR back. | Ship API integrations instantly. Forward deploys AI agents into your customers' repos, auto-writing and testing integrations in minutes. |
| Pricing Model | Free (open-source) | Freemium | Freemium | Freemium |
| Community Rating | 0.0 (0) | 0.0 (0) | 0.0 (0) | 0.0 (0) |
| Developer API | Available (OpenAI-compatible) | Not Available | Not Available | Not Available |
| Open Source | Yes | Proprietary | Proprietary | Proprietary |