diff --git a/fern/assistants/examples/multilingual-agent.mdx b/fern/assistants/examples/multilingual-agent.mdx
index 3a2813bac..8afe1fc52 100644
--- a/fern/assistants/examples/multilingual-agent.mdx
+++ b/fern/assistants/examples/multilingual-agent.mdx
@@ -1155,6 +1155,7 @@ For a more structured approach with explicit language selection, see our compreh
 ## Provider Support Summary
 **Speech-to-Text (Transcription):**
+- **Gladia**: Solaria, automatic language detection and code-switching.
 - **Deepgram**: Nova 2, Nova 3 with "Multi" language setting
 - **Google**: Latest models with "Multilingual" language setting
 - **All other providers**: Single language only, no automatic detection
diff --git a/fern/customization/multilingual.mdx b/fern/customization/multilingual.mdx
index cfef4bb88..94c3686c9 100644
--- a/fern/customization/multilingual.mdx
+++ b/fern/customization/multilingual.mdx
@@ -29,9 +29,9 @@ Set up your transcriber to automatically detect and process multiple languages.
 2. Create a new assistant or edit an existing one
 3. In the **Transcriber** section:
  - **Provider**: Select `Deepgram` (recommended), `Google`, or `Gladia`
- - **Model**: For Deepgram, choose `Nova 2` or `Nova 3`; for Google, choose `Latest`; for Gladia, choose your preferred Gladia model
- - **Language / Mode**: Set `Multi` (Deepgram), `Multilingual` (Google), or enable automatic language detection (Gladia)
- 4. **Other providers**: May require a single language and not auto-detect
+ - **Model**: For Deepgram, choose `Nova 2` or `Nova 3`; for Google, choose `Latest`; for Gladia, choose `Solaria`
+ - **Language / Mode**: Set `Multi` (Deepgram), `Multilingual` (Google), or choose the language you want to transcribe (Gladia)
+ 4. **Other providers**: May require a single language and not auto-detect
 5. Click **Save** to apply the configuration
@@ -460,10 +460,10 @@ Validate your configuration with different languages and scenarios.
 |----------|---------------------|-----------|-------|
 | **Deepgram** | ✅ Full auto-detection | 100+ | **Recommended**: Nova 2/Nova 3 with "Multi" language setting |
 | **Google STT** | ✅ Full auto-detection | 125+ | Latest models with "Multilingual" language setting |
+| **Gladia** | ✅ Full auto-detection | 110+ | Supports automatic language detection and code-switching |
 | **Assembly AI** | ❌ English only | English | No multilingual support |
 | **Azure STT** | ❌ Single language | 100+ | Many languages, but no auto-detection |
 | **OpenAI Whisper** | ❌ Single language | 90+ | Many languages, but no auto-detection |
-| **Gladia** | ✅ Full auto-detection | 110+ | Supports automatic language detection and code-switching |
 | **Speechmatics** | ❌ Single language | 50+ | Many languages, but no auto-detection |
 | **Talkscriber** | ❌ Single language | 40+ | Many languages, but no auto-detection |
diff --git a/fern/debugging.mdx b/fern/debugging.mdx
index 8ac703a52..2c9f95b53 100644
--- a/fern/debugging.mdx
+++ b/fern/debugging.mdx
@@ -83,6 +83,7 @@ Start with these immediate checks before diving deeper:
  - [Anthropic Status](https://status.anthropic.com/) for Anthropic language models
  - [ElevenLabs Status](https://status.elevenlabs.io/) for ElevenLabs voice synthesis
  - [Deepgram Status](https://status.deepgram.com/) for Deepgram speech-to-text
+ - [Gladia Status](https://status.gladia.io/) for Gladia speech-to-text
  - And other providers' status pages as needed
diff --git a/fern/providers/transcriber/gladia.mdx b/fern/providers/transcriber/gladia.mdx
index 12780e10f..4d6da4d62 100644
--- a/fern/providers/transcriber/gladia.mdx
+++ b/fern/providers/transcriber/gladia.mdx
@@ -1,111 +1,57 @@
 ---
-title: Gladia
-subtitle: What is Gladia?
-slug: providers/transcriber/gladia
+title: Gladia
+slug: providers/transcriber/gladia
 ---
+## What is Gladia?
+Gladia is a state-of-the-art audio transcription and intelligence platform.
It provides **real-time** speech-to-text for audio and video and adds advanced audio-intelligence features so you can turn unstructured audio into actionable insights. It integrates easily and scales so you can focus on building features instead of transcription infrastructure.
+Try Gladia on their [playground](https://app.gladia.io/?utm_source=vapi) to get a feel for the product!
-**What is Gladia?**
+## Why choose Gladia on Vapi for speech-to-text?
-Gladia is an advanced AI platform specializing in real-time transcription, translation, and audio intelligence. By leveraging state-of-the-art ASR (Automatic Speech Recognition), NLP (Natural Language Processing), and GenAI (Generative AI) models, Gladia helps businesses extract valuable insights from unstructured audio data. Their enterprise-grade API offers scalable, secure, and efficient solutions for various applications, from virtual meetings to customer service.
+### Low-latency transcription
+Gladia delivers low-latency live transcription for calls and streaming audio, often under 600 ms, with partial results in around 300 ms for immediate response processing. It also provides word-level timestamps and custom vocabulary to power downstream workflows.
+### Global language coverage
+Gladia supports **110+ languages** and dialects and handles multilingual and mixed-language audio, including code-switching within natural conversations.
-**The Evolution of AI Transcription:**
+### Audio intelligence add-ons
+Translation to one or more target languages is available in a single API call. Gladia also offers post-call summarization, sentiment analysis, and named-entity recognition in real time, enabling meeting notes, customer-call insights, and content production workflows on top of transcripts.
-AI transcription has significantly evolved, moving from basic speech recognition systems to advanced platforms capable of real-time transcription, translation, and audio intelligence. Innovations in machine learning and natural language processing have enhanced accuracy and efficiency. Gladia utilizes these advancements to deliver top-tier transcription services tailored for modern business needs.
+### API and integrations
+Gladia offers telephony compatibility (SIP/VoIP) and noise resistance for live use cases, and supports real-time streaming with low-latency interfaces for platforms and contact centers. It also provides a developer-friendly playground to test and monitor your transcription workflows.
-**Overview of Gladia’s Offerings:**
+## Getting started
-Gladia provides a comprehensive suite of AI-driven tools:
+1. Go to the **Assistants** tab in the left-hand navigation.
+2. Create a new assistant, or select the voice assistant you want to configure.
+3. Open the **Transcriber** tab in the top navigation (or scroll to the Transcriber module).
+4. In the **Provider** dropdown, select **Gladia**.
+Watch the [Vapi x Gladia demo video](https://youtu.be/7EoYnMOHR5A?si=dIDTTXw2L--DY-QY) to see real-time features in action!
-**Speech-to-Text:**
+## Best practices
-Gladia’s core offering is its AI-powered speech-to-text technology, delivering highly accurate and real-time transcription. This service supports automatic language detection (including code‑switching within a conversation) and 90+ languages, and includes speaker diarization.
+- **Region selection**: Use the region closest to your users; EU and US options are available for data residency and latency.
+- **Custom vocabulary**: Add domain-specific terms (product names, acronyms) to improve accuracy.
+- **Timestamps**: Use word-level timestamps when you need precise analytics or subtitles.
+- **Translation**: Use built-in translation when you need multilingual outputs from a single stream.
-**Audio Intelligence:**
+## Use cases
-Gladia’s audio intelligence add-ons offer features like summarization, chapterization, and sentiment analysis, providing deeper insights into audio data.
+- **Voice agents**: Real-time transcription, speaker attribution, translation, and post-call summaries.
+- **Virtual meetings**: Live transcription, speaker attribution, translation, and meeting notes.
+- **Customer service / contact centers**: Live call transcription, sentiment/keyword extraction, multilingual agent assistance.
+- **Sales enablement**: Capture names, emails, and details across languages and accents; feed CRMs.
+- **Media & content creation**: Transcribe/edit audio/video, generate subtitles (SRT/VTT), and translate for global distribution.
-**API:**
+## Data protection and compliance
-Gladia’s robust API allows seamless integration of speech-to-text capabilities into applications, ensuring low latency and high availability.
+Gladia offers enterprise-grade data governance, secure hosting options, and alignment with privacy and compliance frameworks such as GDPR. EU and US regions are available for data residency.
-**AI Transcription Technology:**
+## Useful links
+- **Playground**: [app.gladia.io](https://app.gladia.io/?utm_source=vapi)
+- **Website**: [gladia.io](https://gladia.io/?utm_source=vapi)
+- **Documentation**: [docs.gladia.io](https://docs.gladia.io/?utm_source=vapi)
-Gladia’s AI transcription technology offers several key features and benefits:
-
-**Features:**
-
-- High Accuracy: Industry-leading transcription accuracy.
-- Real-time and Async Transcription: Instantaneous and batch processing options.
-- Multilingual Support: Supports transcription and translation in 99 languages.
-
-**Benefits:**
-
-- Efficiency: Reduces the time needed for transcription and analysis.
-- Scalability: Handles large volumes of data efficiently.
-- Cost-Effective: Provides high performance at a competitive cost.
-
-**Real-time Transcription and Translation:**
-
-Gladia excels in providing real-time transcription and translation:
-
-
-**Multilingual Support:**
-
-- Automatic language recognition: Detects the spoken language automatically and handles code‑switching
-- 90+ languages: Supports a wide range of languages and dialects
-- Real-time Translation: Near-instantaneous translation for diverse applications
-
-**Use Cases:**
-
-- Virtual Meetings: Provides real-time transcriptions, note-taking, and video captions.
-- Content Creation: Transcribes and translates videos and podcasts for global audiences.
-
-**Developer API:**
-
-Gladia offers a comprehensive API for easy integration:
-
-**Integration:**
-
-- SDKs: Available for multiple programming languages.
-- Comprehensive Documentation: Detailed guides and support for seamless implementation.
-
-**Use Cases:**
-
-- Application Development: Enhance applications with advanced AI capabilities.
-- Business Solutions: Improve operational efficiency and customer service.
-
-**Use Cases for Gladia:**
-
-Gladia supports a wide range of applications:
-
-**Content Creation:**
-
-Enhance content creation with high-quality transcription, translation, and subtitling.
-
-
-**Customer Service:**
-
-Improve customer service with accurate call transcriptions and emotion detection.
-
-**Market Research:**
-
-Gain valuable insights into market trends and customer preferences through advanced speech analysis.
-
-**Impact on Business Operations:**
-
-Gladia is revolutionizing business operations by providing tools that enhance productivity and insights. By automating transcription and audio intelligence, businesses can focus on innovation and strategy rather than manual processes.
-
-**Innovation and Research:**
-
-Gladia is committed to continuous innovation and research in AI transcription.
Their team of experts focuses on advancing the capabilities of ASR and NLP technologies, exploring new applications, and refining existing tools to stay at the forefront of the industry.
-
-**AI Safety and Ethics:**
-
-Ensuring the ethical use of AI is a core principle at Gladia. They implement robust safeguards to prevent misuse of their technology and are actively involved in promoting responsible AI development. Protecting user data and maintaining transparency in AI operations are central to their mission.
-
-**Integrations and Compatibility:**
-
-Gladia’s API allows seamless integration with various platforms and applications. This ensures that users can incorporate Gladia’s AI capabilities into their existing systems effortlessly, enhancing functionality and improving user experience.
\ No newline at end of file
+---
\ No newline at end of file
diff --git a/fern/quickstart/introduction.mdx b/fern/quickstart/introduction.mdx
index 99fc8e800..d8a57219c 100644
--- a/fern/quickstart/introduction.mdx
+++ b/fern/quickstart/introduction.mdx
@@ -30,7 +30,7 @@ Every Vapi assistant combines three core technologies:
-You have full control over each component, with dozens of providers and models to choose from; OpenAI, Anthropic, Google, Deepgram, ElevenLabs, and many, many more.
+You have full control over each component, with dozens of providers and models to choose from: OpenAI, Anthropic, Google, Gladia, Deepgram, ElevenLabs, and many, many more.
 ## Two ways to build voice agents