1 change: 1 addition & 0 deletions fern/assistants/examples/multilingual-agent.mdx
@@ -1155,6 +1155,7 @@ For a more structured approach with explicit language selection, see our compreh
## Provider Support Summary

**Speech-to-Text (Transcription):**
- **Gladia**: Solaria with automatic language detection and code-switching
- **Deepgram**: Nova 2, Nova 3 with "Multi" language setting
- **Google**: Latest models with "Multilingual" language setting
- **All other providers**: Single language only, no automatic detection
8 changes: 4 additions & 4 deletions fern/customization/multilingual.mdx
@@ -29,9 +29,9 @@ Set up your transcriber to automatically detect and process multiple languages.
2. Create a new assistant or edit an existing one
3. In the **Transcriber** section:
- **Provider**: Select `Deepgram` (recommended), `Google`, or `Gladia`
- **Model**: For Deepgram, choose `Nova 2` or `Nova 3`; for Google, choose `Latest`; for Gladia, choose your preferred Gladia model
- **Language / Mode**: Set `Multi` (Deepgram), `Multilingual` (Google), or enable automatic language detection (Gladia)
4. **Other providers**: May require a single language and not auto-detect
- **Model**: For Deepgram, choose `Nova 2` or `Nova 3`; for Google, choose `Latest`; for Gladia, choose `Solaria`
- **Language / Mode**: Set `Multi` (Deepgram), `Multilingual` (Google), or choose the language you want to transcribe (Gladia)
4. **Other providers**: May require a single language and not auto-detect
5. Click **Save** to apply the configuration
</Tab>
<Tab title="TypeScript (Server SDK)">
@@ -460,10 +460,10 @@ Validate your configuration with different languages and scenarios.
|----------|---------------------|-----------|-------|
| **Deepgram** | ✅ Full auto-detection | 100+ | **Recommended**: Nova 2/Nova 3 with "Multi" language setting |
| **Google STT** | ✅ Full auto-detection | 125+ | Latest models with "Multilingual" language setting |
| **Gladia** | ✅ Full auto-detection | 110+ | Supports automatic language detection and code-switching |
| **Assembly AI** | ❌ English only | English | No multilingual support |
| **Azure STT** | ❌ Single language | 100+ | Many languages, but no auto-detection |
| **OpenAI Whisper** | ❌ Single language | 90+ | Many languages, but no auto-detection |
| **Gladia** | ✅ Full auto-detection | 110+ | Supports automatic language detection and code-switching |
| **Speechmatics** | ❌ Single language | 50+ | Many languages, but no auto-detection |
| **Talkscriber** | ❌ Single language | 40+ | Many languages, but no auto-detection |
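
As a rough sketch, the comparison table can be encoded as data so that application code can pick only the transcribers with full auto-detection. The type, constant, and function names below are illustrative assumptions and not part of any Vapi SDK; the values are copied from the table rows.

```typescript
// Auto-detection support per provider, transcribed from the comparison table.
// All names here are hypothetical helpers, not an SDK API.
type AutoDetect = "full" | "single-language" | "english-only";

interface ProviderInfo {
  name: string;
  autoDetect: AutoDetect;
  languages: string; // language count exactly as listed in the table
}

const TRANSCRIBERS: ProviderInfo[] = [
  { name: "Deepgram", autoDetect: "full", languages: "100+" },
  { name: "Google STT", autoDetect: "full", languages: "125+" },
  { name: "Gladia", autoDetect: "full", languages: "110+" },
  { name: "Assembly AI", autoDetect: "english-only", languages: "English" },
  { name: "Azure STT", autoDetect: "single-language", languages: "100+" },
  { name: "OpenAI Whisper", autoDetect: "single-language", languages: "90+" },
  { name: "Speechmatics", autoDetect: "single-language", languages: "50+" },
  { name: "Talkscriber", autoDetect: "single-language", languages: "40+" },
];

// Return the names of providers that detect the spoken language automatically.
function autoDetectingProviders(): string[] {
  return TRANSCRIBERS
    .filter((p) => p.autoDetect === "full")
    .map((p) => p.name);
}

console.log(autoDetectingProviders());
```

Keeping the table as a typed array means a provider added later with a misspelled `autoDetect` value fails at compile time rather than silently dropping out of the filter.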

1 change: 1 addition & 0 deletions fern/debugging.mdx
@@ -83,6 +83,7 @@ Start with these immediate checks before diving deeper:
- [Anthropic Status](https://status.anthropic.com/) for Anthropic language models
- [ElevenLabs Status](https://status.elevenlabs.io/) for ElevenLabs voice synthesis
- [Deepgram Status](https://status.deepgram.com/) for Deepgram speech-to-text
- [Gladia Status](https://status.gladia.io/) for Gladia speech-to-text
- And other providers' status pages as needed
</Step>
</Steps>
138 changes: 46 additions & 92 deletions fern/providers/transcriber/gladia.mdx
@@ -1,111 +1,65 @@
---
title: Gladia
subtitle: What is Gladia?
slug: providers/transcriber/gladia
title: Gladia
slug: providers/transcriber/gladia
---

## What is Gladia?
Gladia is a state-of-the-art audio transcription and intelligence platform. It provides **real-time** speech-to-text for audio and video and adds advanced audio-intelligence features so you can turn unstructured audio into actionable insights. It integrates easily and scales so you can focus on building features instead of transcription infrastructure.
<Tip>Try Gladia on their [playground](https://app.gladia.io/?utm_source=vapi) to get a feel for the product!</Tip>

**What is Gladia?**
## Why choose Gladia on Vapi?

Gladia is an advanced AI platform specializing in real-time transcription, translation, and audio intelligence. By leveraging state-of-the-art ASR (Automatic Speech Recognition), NLP (Natural Language Processing), and GenAI (Generative AI) models, Gladia helps businesses extract valuable insights from unstructured audio data. Their enterprise-grade API offers scalable, secure, and efficient solutions for various applications, from virtual meetings to customer service.
### Real-time speech-to-text
- Low-latency live transcription (often under 300 ms) for calls and streaming audio.
- Fast partial transcripts (~100 ms) for immediate response processing.
- Word-level timestamps and detailed custom vocabulary to power downstream workflows.
- Mixed-language and code-switch support for natural conversations.

### Global language coverage
- Support for **110+ languages** and dialects.
- Robust handling of multilingual and mixed-language audio.

**The Evolution of AI Transcription:**
### Audio intelligence add-ons
- Translation in one API call to one or more target languages.
- Post-call summarization, plus sentiment analysis and named-entity recognition in real time.
- Build meeting notes, customer-call insights, and content production workflows on top of transcripts.

AI transcription has significantly evolved, moving from basic speech recognition systems to advanced platforms capable of real-time transcription, translation, and audio intelligence. Innovations in machine learning and natural language processing have enhanced accuracy and efficiency. Gladia utilizes these advancements to deliver top-tier transcription services tailored for modern business needs.
### API and integrations
- Developer-friendly REST/JSON endpoints, webhooks, and callbacks.
- Telephony compatibility (SIP/VoIP) and noise resistance for live use cases.
- Real-time streaming with low-latency interfaces for platforms and contact centers.

**Overview of Gladia’s Offerings:**
## Getting started

Gladia provides a comprehensive suite of AI-driven tools:
1. Go to the **Assistants** tab in the left-hand navigation.
2. Create a new assistant, or select the voice assistant you want to configure.
3. Open the **Transcriber** tab in the top navigation (or scroll to the Transcriber module).
4. In the **Provider** dropdown, select **Gladia**.

<Tip>Watch the [Vapi x Gladia demo video](https://youtu.be/7EoYnMOHR5A?si=dIDTTXw2L--DY-QY) to see real-time features in action!</Tip>
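
The dashboard selection described in the steps above can also be pictured as a small typed object. This is only a sketch: the `TranscriberSelection` shape, the `gladiaTranscriber` helper, and the string values are assumptions that mirror the dashboard labels (`Gladia`, `Solaria`, a chosen language), not a confirmed Vapi API schema.

```typescript
// Hypothetical shape: field names and values mirror dashboard labels,
// not a confirmed API schema.
interface TranscriberSelection {
  provider: string;
  model: string;
  language: string;
}

// Build the Gladia selection for a given language, rejecting empty input.
function gladiaTranscriber(language: string): TranscriberSelection {
  const trimmed = language.trim();
  if (trimmed === "") {
    throw new Error("language must not be empty");
  }
  return { provider: "Gladia", model: "Solaria", language: trimmed };
}

const selection = gladiaTranscriber("French");
console.log(JSON.stringify(selection));
```

Validating the language up front keeps an empty selection from reaching whatever code later submits the assistant configuration.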

**Speech-to-Text:**
## Best practices

Gladia’s core offering is its AI-powered speech-to-text technology, delivering highly accurate and real-time transcription. This service supports automatic language detection (including code‑switching within a conversation) and 90+ languages, and includes speaker diarization.
- **Region selection**: Use the region closest to your users; EU and US options are available for data residency and latency.
- **Custom vocabulary**: Add domain-specific terms (product names, acronyms) to improve accuracy.
- **Timestamps**: Use word-level timestamps when you need precise analytics or subtitles.
- **Translation**: Use built-in translation when you need multilingual outputs from a single stream.

**Audio Intelligence:**
## Use cases

Gladia’s audio intelligence add-ons offer features like summarization, chapterization, and sentiment analysis, providing deeper insights into audio data.
- **Voice agents**: Real-time transcription, speaker attribution, translation, and post-call summaries.
- **Virtual meetings**: Live transcription, speaker attribution, translation, and meeting notes.
- **Customer service / contact centers**: Live call transcription, sentiment/keyword extraction, multilingual agent assistance.
- **Sales enablement**: Capture names, emails, and details across languages and accents; feed CRMs.
- **Media & content creation**: Transcribe/edit audio/video, generate subtitles (SRT/VTT), and translate for global distribution.

**API:**
## Data protection and compliance

Gladia’s robust API allows seamless integration of speech-to-text capabilities into applications, ensuring low latency and high availability.
Gladia offers enterprise-grade data governance, secure hosting options, and alignment with privacy and compliance frameworks such as GDPR. EU and US regions are available for data residency.

**AI Transcription Technology:**
## Useful links
- **Playground**: [app.gladia.io](https://app.gladia.io/?utm_source=vapi)
- **Website**: [gladia.io](https://gladia.io/?utm_source=vapi)
- **Documentation**: [docs.gladia.io](https://docs.gladia.io/?utm_source=vapi)

Gladia’s AI transcription technology offers several key features and benefits:

**Features:**

- High Accuracy: Industry-leading transcription accuracy.
- Real-time and Async Transcription: Instantaneous and batch processing options.
- Multilingual Support: Supports transcription and translation in 99 languages.

**Benefits:**

- Efficiency: Reduces the time needed for transcription and analysis.
- Scalability: Handles large volumes of data efficiently.
- Cost-Effective: Provides high performance at a competitive cost.

**Real-time Transcription and Translation:**

Gladia excels in providing real-time transcription and translation:


**Multilingual Support:**

- Automatic language recognition: Detects the spoken language automatically and handles code‑switching
- 90+ languages: Supports a wide range of languages and dialects
- Real-time Translation: Near-instantaneous translation for diverse applications

**Use Cases:**

- Virtual Meetings: Provides real-time transcriptions, note-taking, and video captions.
- Content Creation: Transcribes and translates videos and podcasts for global audiences.

**Developer API:**

Gladia offers a comprehensive API for easy integration:

**Integration:**

- SDKs: Available for multiple programming languages.
- Comprehensive Documentation: Detailed guides and support for seamless implementation.

**Use Cases:**

- Application Development: Enhance applications with advanced AI capabilities.
- Business Solutions: Improve operational efficiency and customer service.

**Use Cases for Gladia:**

Gladia supports a wide range of applications:

**Content Creation:**

Enhance content creation with high-quality transcription, translation, and subtitling.


**Customer Service:**

Improve customer service with accurate call transcriptions and emotion detection.

**Market Research:**

Gain valuable insights into market trends and customer preferences through advanced speech analysis.

**Impact on Business Operations:**

Gladia is revolutionizing business operations by providing tools that enhance productivity and insights. By automating transcription and audio intelligence, businesses can focus on innovation and strategy rather than manual processes.

**Innovation and Research:**

Gladia is committed to continuous innovation and research in AI transcription. Their team of experts focuses on advancing the capabilities of ASR and NLP technologies, exploring new applications, and refining existing tools to stay at the forefront of the industry.

**AI Safety and Ethics:**

Ensuring the ethical use of AI is a core principle at Gladia. They implement robust safeguards to prevent misuse of their technology and are actively involved in promoting responsible AI development. Protecting user data and maintaining transparency in AI operations are central to their mission.

**Integrations and Compatibility:**

Gladia’s API allows seamless integration with various platforms and applications. This ensures that users can incorporate Gladia’s AI capabilities into their existing systems effortlessly, enhancing functionality and improving user experience.
---
2 changes: 1 addition & 1 deletion fern/quickstart/introduction.mdx
@@ -30,7 +30,7 @@ Every Vapi assistant combines three core technologies:
</Card>
</CardGroup>

You have full control over each component, with dozens of providers and models to choose from; OpenAI, Anthropic, Google, Deepgram, ElevenLabs, and many, many more.
You have full control over each component, with dozens of providers and models to choose from; OpenAI, Anthropic, Google, Gladia, Deepgram, ElevenLabs, and many, many more.

## Two ways to build voice agents
