Speechmatics

An all-in-one, highly accurate voice API for real-time and batch transcription in multiple languages.

About Speechmatics

Introduction

Speechmatics is a speech technology company that provides AI-powered speech recognition for businesses worldwide. Its voice API offers real-time and batch transcription, multilingual support, and post-processing features that improve transcript readability. Founded by Dr Tony Robinson in the 1980s, Speechmatics uses machine learning to transcribe human speech accurately, irrespective of demographics, age, gender, accent, dialect, or location. The company has received several accolades, including being named one of Fast Company's Most Innovative Companies in AI for 2023 and appearing in the FT 1000 list of Europe's Fastest Growing Companies (2019 - 2023). It offers an all-in-one speech API with two pricing options for businesses seeking a scalable, accurate, and reliable speech recognition solution.

TLDR

Speechmatics provides an AI-powered speech recognition API built for accuracy and flexibility. It offers real-time and batch transcription, multilingual support, and post-processing features, and aims to transcribe speech accurately irrespective of demographics, age, gender, accent, dialect, or location. The all-in-one API has two pricing options and suits businesses seeking a scalable, accurate, and reliable speech recognition solution.

Company Overview

Speechmatics is a leading speech technology company that offers the most inclusive and accurate speech API ever released. The company was founded in the 1980s by Dr Tony Robinson, who pioneered the approach of applying neural networks to the problem of speech recognition at Cambridge University. Today, Speechmatics is changing the way companies work by offering speech technology that is accurate and fast.

Speechmatics exists to understand every voice. The company's speech API is available for solution and service providers to integrate into their stack, irrespective of industry or use case. Businesses around the world use Speechmatics' machine learning models to transcribe human speech accurately regardless of demographic, age, gender, accent, dialect, or location.

The aim to "Understand Every Voice" extends beyond technology. Speechmatics cares about its customers and about the impact its actions have on the world. The company believes in putting people first, providing the best fit, and helping its team members develop their skills. It trusts its teams to deliver and empowers individuals to debate freely, make timely decisions, and commit to outcomes while balancing the complex and the simple.

Several well-known venture capital firms have invested in Speechmatics, including Susquehanna Growth Equity, AlbionVC, IQ Capital, and Amadeus Capital Partners. These firms recognize the company's disruptive technical approach to speech recognition and support Speechmatics' vision to be a world-leading speech platform.

Speechmatics has been widely recognized for its achievements. It was named one of Fast Company's Most Innovative Companies in AI for 2023, appeared in the FT 1000 list of Europe's Fastest Growing Companies (2019 - 2023), and received the Queen's Award for Enterprise for Innovation in 2019. It was also recognized for Outstanding AI/ML Industry Project at Computing's AI and ML Awards 2022 and as Santander Technology Business of the Year at the Growing Business Awards 2022, among other accolades.

If you are interested in joining the Speechmatics team, the company lists its currently available roles.

Features

Real-time and Batch Transcription

Unmatched Accuracy

Speechmatics' Speech API offers comprehensive speech-to-text features that help you deliver an exceptional user experience. Our models are built for fast, high-quality transcription whether you choose batch or real-time mode. In batch mode, you can quickly transcribe large quantities of pre-recorded video or audio files. This accuracy makes the API a good fit for businesses that depend on precise transcripts.

Flexible Deployment

We offer low-latency, accurate transcription of live audio streams from meetings, calls, or broadcast events. You'll get initial transcriptions in milliseconds, followed by context-driven accuracy improvements as more audio arrives. This improves workflow efficiency and minimizes latency, which helps you serve a wider market with diverse customer needs.

Speechmatics supports both Cloud and on-prem deployments. You can switch seamlessly between the two and combine them if needed. You can also host our API in your own environment to meet your architecture, security, and compliance needs, deploying Speechmatics using Docker Containers or preconfigured Virtual Appliances.
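The real-time behaviour described above, fast initial results that are refined as more context arrives, is commonly handled on the client by treating early results as provisional and replacing them when a settled result comes in. The following is a minimal sketch of that pattern only. The message shape (`type` set to `"partial"` or `"final"`, plus a `text` field) and the `LiveTranscript` class are hypothetical and are not taken from the Speechmatics API.

```python
class LiveTranscript:
    """Keeps settled text and the latest provisional text separately.

    Assumption: each incoming message is a dict with a "type" of
    "partial" or "final" and a "text" field. This shape is hypothetical.
    """

    def __init__(self):
        self.final_segments = []  # text that will no longer change
        self.partial = ""         # latest provisional text, may be replaced

    def handle(self, message):
        if message["type"] == "partial":
            # A newer partial always overwrites the previous one.
            self.partial = message["text"]
        elif message["type"] == "final":
            # A final result supersedes whatever partial was shown.
            self.final_segments.append(message["text"])
            self.partial = ""

    def display_text(self):
        parts = list(self.final_segments)
        if self.partial:
            parts.append(self.partial)
        return " ".join(parts)


live = LiveTranscript()
for msg in [
    {"type": "partial", "text": "hello"},
    {"type": "partial", "text": "hello wor"},
    {"type": "final", "text": "Hello world."},
    {"type": "partial", "text": "how are"},
]:
    live.handle(msg)

print(live.display_text())
```

Keeping provisional and settled text apart means a caption display can update immediately without ever showing stale partial text next to its corrected version.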

Instant, Secure, and Scalable Access

Get instant, secure, and scalable access to our API through our Cloud deployment. Avoid the cost and complexity of building a high-availability system from scratch while getting instant access to all our new features, languages, and updates.

Multi-Lingual Support

Comprehensive Coverage

Speechmatics helps you maximize your total addressable market. It delivers for multilingual, multicultural, and multinational businesses with coverage of nearly half the world's languages across a range of dialects and accents. We support 48 languages, covering most native languages with unmatched accuracy. Whether you need Brazilian Portuguese or Canadian French, a single language model per language covers all associated accents and dialects. Transcribe and translate audio to and from English for over 30 languages using a single API call.

Customizability Options

The vocabulary used in different contexts and different domains can vary widely. Our customization options allow you to achieve high accuracy with even the most unique words and phrases. Boost accuracy for proper nouns, acronyms, or industry-specific terms by providing a list of custom words. Increase accuracy for a use-case or domain by using a relevant corpus of textual content to customize default models. We're developing English language packs optimized for industries with sector-specific terminology. Finance is available now, with more to follow soon.
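As a rough illustration of supplying a custom word list, the sketch below assembles a request configuration in Python and prints it as JSON. The helper `build_transcription_config` is hypothetical, and the field names (`transcription_config`, `additional_vocab`, `content`, `sounds_like`) are assumptions about how such a configuration might be shaped. Check them against the official API reference before use.

```python
import json


def build_transcription_config(language, custom_words):
    """Build a job config carrying a custom word list.

    custom_words maps each term to a list of optional pronunciation hints.
    All field names below are assumptions, not confirmed API fields.
    """
    vocab = []
    for term, hints in custom_words.items():
        entry = {"content": term}
        if hints:
            # Only include hints when the caller supplied some.
            entry["sounds_like"] = list(hints)
        vocab.append(entry)
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "additional_vocab": vocab,
        },
    }


config = build_transcription_config(
    "en",
    {"Speechmatics": [], "EBITDA": ["ee bit dah"]},
)
print(json.dumps(config, indent=2))
```

Building the word list from a plain dictionary keeps domain terms in one place, so the same list can be reused across jobs.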

Diarization and Speaker Labeling

Speechmatics' diarization enriches the transcript with accurate speaker labels, so your users can identify every speaker in a conversation. You can track who said what and when with speaker labeling for each word, available for both batch and real-time transcription. Capture exactly what was said, even when there is crosstalk between speakers, with separate transcription on each channel.
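Per-word speaker labels are usually easier to read once consecutive words from the same speaker are grouped into turns. The sketch below shows one way to do that. The input shape (a list of dicts with `word`, `speaker`, and `start` keys) and the labels `S1` and `S2` are assumptions made for illustration, not the API's actual output format.

```python
from itertools import groupby

# Hypothetical per-word output: each word carries a speaker label and start time.
words = [
    {"word": "Good", "speaker": "S1", "start": 0.00},
    {"word": "morning", "speaker": "S1", "start": 0.35},
    {"word": "Hi", "speaker": "S2", "start": 1.10},
    {"word": "there", "speaker": "S2", "start": 1.30},
    {"word": "Shall", "speaker": "S1", "start": 2.05},
    {"word": "we", "speaker": "S1", "start": 2.30},
    {"word": "start", "speaker": "S1", "start": 2.45},
]


def speaker_turns(words):
    """Merge consecutive words with the same speaker label into one turn."""
    turns = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        turns.append(
            {
                "speaker": speaker,
                "start": group[0]["start"],
                "text": " ".join(w["word"] for w in group),
            }
        )
    return turns


turns = speaker_turns(words)
for turn in turns:
    print(f'[{turn["start"]:.2f}s] {turn["speaker"]}: {turn["text"]}')
```

`groupby` only merges adjacent items, which is the desired behaviour here: a speaker who talks again later starts a new turn.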

Post-Processing Features

Improved Readability

Written and spoken language differ. From punctuation to the formatting of numbers and dates, our API includes a number of features that accurately turn conversation into transcript. Identify and correctly format numbers, dates, and currencies automatically to improve transcript readability and enable effective post-processing. Improve readability with language-specific capitalization and punctuation, including commas, question marks, and exclamation marks.

Automated Profanity and Hesitation Detection

Simplify integration and ensure accurate transcription with automatic detection of the language spoken. Aid comprehensibility and compliance by detecting and optionally removing words that are considered profanities or hesitations.

Rich Set of Metadata

Easily push a variety of media formats to the API and get a rich set of metadata to support your post-processing needs. Get accurate timestamps for every word in the transcript to support post-processing and a better end-user experience. Collect confidence scores for every word in the transcript to enable efficient human review and editing. Minimize the effort needed to prepare audio or video files with support for all major audio and video formats, along with automatic sample rate detection.
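Word-level timestamps and confidence scores can drive a simple human-review queue. The sketch below flags words that fall under a chosen confidence threshold and reports where they occur in the audio. The data shape (`word`, `start`, `end`, `confidence`) and the 0.8 threshold are assumptions chosen for illustration.

```python
# Hypothetical per-word metadata: timestamps in seconds, confidence in [0, 1].
words = [
    {"word": "Revenue", "start": 0.00, "end": 0.48, "confidence": 0.98},
    {"word": "from", "start": 0.48, "end": 0.62, "confidence": 0.99},
    {"word": "Ursa", "start": 0.62, "end": 1.05, "confidence": 0.61},
    {"word": "grew", "start": 1.05, "end": 1.30, "confidence": 0.95},
    {"word": "tenfold", "start": 1.30, "end": 1.92, "confidence": 0.74},
]


def flag_for_review(words, threshold=0.8):
    """Return the words whose confidence is below the threshold."""
    return [w for w in words if w["confidence"] < threshold]


flagged = flag_for_review(words)
for w in flagged:
    print(f'{w["start"]:.2f}s-{w["end"]:.2f}s  {w["word"]!r}  confidence={w["confidence"]:.2f}')
```

A reviewer can then jump straight to the flagged time ranges instead of listening to the whole recording.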

Pricing

Speechmatics offers an all-in-one speech API with no hidden costs, providing businesses with the most accurate speech recognition solution available on the market. With access to world-leading Ursa generation models in 48 languages, the company caters to businesses looking to test and scale their ASR needs.

Speechmatics offers two pricing options: Batch (Pre-recorded) or Real-Time (Live Stream). The standard pricing for Batch is $1.25/hr, while the enhanced pricing is $1.90/hr. For real-time services, the standard pricing is $1.65/hr, while the enhanced pricing is $2.15/hr. All pricing plans include all features and capabilities, as well as SaaS deployment and online support, making it an excellent option for businesses seeking an all-in-one solution.
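To make the rates above concrete, the sketch below estimates a bill from hours of audio. It assumes cost scales linearly with hours at the listed per-hour rate and ignores any free trial, volume discount, or rounding policy, none of which are modelled here. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Per-hour rates in USD, as listed in the pricing text above.
RATES_USD_PER_HOUR = {
    ("batch", "standard"): Decimal("1.25"),
    ("batch", "enhanced"): Decimal("1.90"),
    ("real-time", "standard"): Decimal("1.65"),
    ("real-time", "enhanced"): Decimal("2.15"),
}


def estimate_cost(mode, tier, hours):
    """Estimate cost in USD, assuming simple linear per-hour billing."""
    rate = RATES_USD_PER_HOUR[(mode, tier)]
    # Convert via str so float inputs such as 2.5 stay exact.
    return (rate * Decimal(str(hours))).quantize(Decimal("0.01"))


print(estimate_cost("batch", "standard", 100))     # 100 h of pre-recorded audio
print(estimate_cost("real-time", "enhanced", 40))  # 40 h of live streams
```

For example, 100 hours of standard batch transcription comes to $125.00, and 40 hours of enhanced real-time transcription comes to $86.00 under these assumptions.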

For businesses with custom integrations, SLAs, or large volumes, Speechmatics offers customized pricing plans that cater to their specific needs. Whether you need on-premises or cloud deployment, the flexible plan options provide the enterprise-level support and personalized service required to streamline your ASR needs.

Speechmatics offers a broad range of features that help it stand out from its competitors. These include transcription across 48 languages, translation, support for accents and dialects, language detection, support for all major file formats, speaker and channel diarization, numeral formatting, advanced punctuation, profanity detection, word timing, and confidence scores. The company also offers a free 8-hour trial, allowing businesses to test the technology before purchasing.

For businesses that will be sending large volumes of content (over 5,000 hours per year) through Speechmatics technology, the company provides volume discounts. Billing is done on the 1st of each month for the previous month's usage, with users having 15 days to pay. Simply add your credit card details in the "manage billing" section of the portal to increase your usage. If you need assistance or have any questions, Speechmatics' customer support team is always available to help.

Speechmatics Alternatives

AppTek's Automatic Dubbing Technology streamlines the dubbing process with cutting-edge speech recognition, translation, and text-to-speech capabilities for expanded content viewership.

AssemblyAI offers APIs for developers to integrate cutting-edge, state-of-the-art artificial intelligence models into their products and apps.

Provides fast and accurate audio transcription services for various formats.

Transform written text into lifelike, natural-sounding audio using 600+ TTS voices in 142 languages and accents.