speech

Increase in the Number of Children Who Receive Federal Disability Benefits for Speech and Language Disorders Similar to Trends in the General Population, Says New Report

The increase in the number of children from low-income families who are receiving federal disability benefits for speech and language disorders over the past decade parallels the rise in the prevalence of these disorders among all U.S. children, says a new report by the National Academies of Sciences, Engineering, and Medicine.




speech

EMAM, Inc. Partners with Deepgram to Elevate Video and Audio Management with Advanced Speech-to-Text and Audio Intelligence Integration

The joint offering allows organizations to easily find the best media for any use.




speech

Sensory Unveils TrulyNatural Speech-To-Text on the Edge

TrulyNatural Speech-To-Text technology offers unrivaled accuracy and multilingual support in a compact package.




speech

Verbal Transactions Simulation Software, ACES, Adds Interactive Speech & Deep Analytics Capabilities to Articulate Storyline & other eLearning Platforms

ACES (accelerated cognitive engagement system) can embed any eLearning content to provide richer user experiences and capture real-time skill gap analysis.




speech

Auphonic Speech Recognition Engine using Whisper by OpenAI (Beta)

Today we release our first self-hosted Auphonic Speech Recognition Engine using the open-source Whisper model by OpenAI!
With Whisper, you can now integrate automatic speech recognition in 99 languages into your Auphonic audio post-production workflow, without creating an external account and without extra costs!

Whisper Speech Recognition in Auphonic

Until now, Auphonic users had to choose one of our integrated external service providers (Wit.ai, Google Cloud Speech, Amazon Transcribe, Speechmatics) for speech recognition, so audio files were transferred to an external server and processed with external computing power that users had to pay for through their external accounts.

The new Auphonic Speech Recognition uses Whisper, which was published by OpenAI as an open-source project. Open source means that the publicly shared GitHub repository contains a complete Whisper package including source code, examples, and research results.
However, automatic speech recognition is a very time- and hardware-intensive process that can be incredibly slow on a standard home computer without special GPUs. So we decided to integrate this service and offer you automatic speech recognition (ASR) by Whisper processed on our own hardware, just like any other Auphonic processing task, which gives you several benefits:

  • No external account is needed anymore to run ASR in Auphonic.
  • Your data doesn't leave our Auphonic servers for ASR processing.
  • No extra costs for external ASR services.
  • Additional Auphonic pre- and post-processing for more accurate ASR, especially for Multitrack Productions.
  • The quality of Whisper ASR is absolutely comparable to the “best” services in our comparison table.

How to use Whisper?

To use the Auphonic Whisper integration, you just have to create a production or preset as you are used to and select “Auphonic Whisper ASR” as “Service” in the section Speech Recognition.
This option will automatically appear for Beta and paying users. If you are a free user but want to try Whisper: please just ask for access!

When your Auphonic speech recognition is done, you can download your transcript in different formats and may edit or share your transcript with the Auphonic Transcript Editor.
For more details about all our integrated speech recognition services, please visit our Speech Recognition Help and watch this channel for Whisper updates – soon to come.
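
To give an idea of what "different formats" can mean for a transcript, here is a minimal Python sketch that renders the same timestamped segments as SRT and as WebVTT subtitles. It is only an illustration: the `Segment` class and the function names are hypothetical and do not describe the actual Auphonic export or the Whisper output format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped piece of a transcript (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def format_timestamp(seconds: float, decimal_sep: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_sep}{millis:03d}"


def to_srt(segments):
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg.start, ",")
        end = format_timestamp(seg.end, ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


def to_webvtt(segments):
    """WebVTT: 'WEBVTT' header, dot before the milliseconds."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg.start, ".")
        end = format_timestamp(seg.end, ".")
        lines += [f"{start} --> {end}", seg.text.strip(), ""]
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "Hello and welcome."),
            Segment(2.5, 5.0, "This is a test.")]
    print(to_srt(demo))
    print(to_webvtt(demo))
```

The two formats differ mainly in the header line, the cue numbering, and the decimal separator in the timestamps, so one list of segments can feed both writers.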

Why Beta?

We decided to launch Whisper for Beta and paying users only, as Whisper was published only at the end of September and there has not been enough time to test every single use case sufficiently.
Another issue is the required computing power: to scale the GPU infrastructure suitably, we need a beta phase to test the service while we monitor the hardware usage and make sure there are no server overloads.

Conclusion

Automatic speech recognition services are evolving very quickly, and we've seen major improvements over the past few years.
With Whisper, we can now perform speech recognition without extra costs on our own GPU hardware; external services are no longer required.

Auphonic Whisper ASR is available for Beta and paying users now, and free users can ask for Beta access.
You are very welcome to send us feedback (directly in the production interface or via email), whether you notice something that works particularly well or discover any problems.
Your feedback is a great help to improve the system!







speech

Integrating Image-To-Text And Text-To-Speech Models (Part 1)

Joas Pambou built an app that integrates vision language models (VLMs) and text-to-speech (TTS) AI technologies to describe images audibly with speech. This audio description tool can be a big help for people with sight challenges to understand what’s in an image. But how does this even work? Joas explains how these AI systems work and their potential uses, including how he built the app and ways to further improve it.




speech

Integrating Image-To-Text And Text-To-Speech Models (Part 2)

In the second part of this series, Joas Pambou aims to build a more advanced version of the previous application that performs conversational analyses on images or videos, much like a chatbot assistant. This means you can ask and learn more about your input content.




speech

Who Really Supports Free Speech?

Judd Legum: For actual examples of government censorship, consider what is happening right now in a state that has embraced MAGA-style politics: Florida.




speech

UK: The King’s Speech and What it Means for Employment Law

  • The King’s Speech was delivered on July 17, setting forth the UK Government’s legislative agenda for the next Parliamentary Session.  
  • Highlights include the introduction of an Employment Rights Bill within the first 100 days, publication of a Draft Equality (Race and Disability) Bill, and a living wage that accounts for the current cost of living and eliminates age bands.




speech

ETSI workshop: improving Quality of Emerging Services for Speech and Audio


Sophia Antipolis, 23 November 2022

The ETSI STQ (Speech and multimedia Transmission Quality) Workshop that took place on 21-22 November 2022 in Bratislava (Slovakia) was hosted by Amazon. It focused on a user-centred perspective of the Quality of Emerging Services for Speech and Audio.

The event was attended by organizations providing a rich mix of inputs and perspectives from industry, regulators, and academia. Through presentations, discussions and professional networking, this STQ Workshop demonstrated a very high level of engagement by all participants, with stimulating interaction among all speakers and the audience.





speech

Episode 408: Mike McCourt on Voice and Speech Analysis

Felienne spoke with Mike McCourt on difficulties in processing voice data using machine learning.







speech

The King's Speech Screening

Followed by Q&A with actor Colin Firth and director Tom Hooper




speech

The King’s Speech – New Resources

Celebrate the film’s release on DVD and Blu-Ray with additional study materials for KS4 English and Media




speech

Curtis LeGeyt Speech at Media Institute Communications Forum Luncheon

WASHINGTON, D.C. – National Association of Broadcasters President and CEO Curtis LeGeyt was the featured speaker at The Media Institute’s Communications Forum luncheon today.




speech

Azerbaijani president & UN climate summit host calls oil a ‘gift of God’ in COP29 speech – ‘The people need them’ – Slams Western ‘fake news media’

Azerbaijani President Ilham Aliyev has accused Western "fake news media" and environmental organizations of a slander campaign against his country, in his address to fellow leaders... Aliyev repeated his controversial quote that Azerbaijan's oil and gas reserves are a "gift of the God [sic]." "Countries should not be blamed for having them and should not be blamed for bringing these resources to the market because the market needs them, the people need them," he said. Oil and gas are natural resources, just like gold, copper, wind or the sun. "To accuse us that we have oil is the same like [sic] to accuse us that we have more than 250 sunny days a year in Baku," he said.




speech

[ P.804 (10/17) ] - Subjective diagnostic test method for conversational speech quality analysis





speech

[ P.811 (01/19) ] - Subjective test methodology for evaluating Speech oriented stereo communication systems over headphones





speech

[ P.1140 (03/17) ] - Speech communication requirements for emergency calls originating from vehicles





speech

[ P.808 (06/18) ] - Subjective evaluation of speech quality with a crowdsourcing approach





speech

[ P.700 (06/19) ] - Calculation of loudness for speech communication





speech

GSTP-CSS - The composite source signal as a measuring signal and a summary of various investigations on speech echo cancellers





speech

[ G.191 (01/19) ] - Software tools for speech and audio coding standardization





speech

2023 Vertical Markets Spotlight: Speech Technology in Government

Public-sector entities can improve customer service, delivery of benefits and services, efficiency, transparency, and more.




speech

2023 Vertical Markets Spotlight: Speech Technology in Financial Services

Rise in voice banking requires a rise in voice-based support, verification, and AI.




speech

2023 Vertical Markets Spotlight: Speech Technology in Consumer Electronics

Interoperability's been the biggest challenge, but standards are emerging.




speech

2023 Vertical Markets Spotlight: A Speech Technology Special Report

Our roundup of the speech innovation taking place in eight important industry verticals.




speech

How Speech Analytics Helps Improve Coaching/Training

Data-driven guidance provides a better agent and customer experience.




speech

Speech Analytics Can Help Steer Chatbot Interactions

Companies are beginning to apply traditional speech analytics to their automated conversations.




speech

2023 Speech Industry Award Winner: SyncWords Leads in Live Captioning, Dubbing, and Subtitling

Last year, the New York-based company reportedly captioned and subtitled more than 15 million minutes of video on demand, processed more than 500,000 minutes of content with speech recognition, and live-captioned more than 300,000 minutes of events.




speech

2023 Speech Industry Award Winner: Speechmatics Inches Closer Toward a Universal Translator

Speechmatics, a provider of automatic speech recognition software based on recurrent neural networks and statistical language modeling, is on a mission to make its speech-to-text technology usable by 70 percent of the world's population in the next three years.




speech

2023 Speech Industry Award Winner: SoundHound AI Brings Speech Breakthroughs to the Mainstream

SoundHound AI, based in Santa Clara, Calif., this year launched, among other things, Chat AI, a voice-enabled digital assistant with generative artificial intelligence; and Smart Answering, which uses voice AI to handle inbound customer calls.




speech

2023 Speech Industry Award Winner: Resemble AI Fights for Responsible Use of Voice Clones

Resemble AI, a provider of a platform that uses generative AI to create realistic-sounding voices, in July released Resemble Detect, which can validate the authenticity of audio data to expose speech deepfakes in real time.




speech

2023 Speech Industry Award Winner: ReadSpeaker Embeds TTS in Many More Platforms

With almost 25 years of experience in developing text-to-speech solutions, ReadSpeaker today offers one of the largest selections of expressive, humanlike voices in the industry.




speech

2023 Speech Industry Award Winner: OpenAI and Its ChatGPT Upended Everything

When it comes to new technologies, few have had as much of an impact as generative artificial intelligence, ushered in by OpenAI in November 2022 with its ChatGPT launch.




speech

2023 Speech Industry Award Winner: NVIDIA Is Making Voice AI Better for Almost Everyone

NVIDIA saw blowout second-quarter results, surging margins, and incredible demand, which prompted one analyst from Constellation Insights to conclude that "it's clear the company has little competition and a lot of pricing power."




speech

2023 Speech Industry Award Winner: Microsoft's VALL-E Breaks the Mold in AI Training

VALL-E, one of Microsoft's latest forays into artificial intelligence, is a transformer-based text-to-speech model that can re-create any voice from just a three-second sample clip.




speech

2023 Speech Industry Award Winner: ID R&D Pioneers Liveness Detection

ID R&D, a New York-based provider of liveness detection and voice biometrics, has quickly become a leader in addressing AI-powered fraud by combining passive facial liveness and voice anti-spoofing technologies.




speech

2023 Speech Industry Award Winner: D-ID Gives a Human Face and Voice to AI

D-ID, an Israeli company founded in 2017, is providing superpowers to individual creators and businesses alike, uniquely enabling them to transform any picture into an interactive video in seconds.




speech

The Top Speech Technologies and Vendors: The 2023 Speech Industry Awards

AI, AI, and more AI: The technology is disrupting everything, and it's found everywhere in our speech industry achievements for 2023.




speech

Industry-Standard Speech App Building Blocks Take Shape

Interface interoperability is becoming closer to reality, but more work is needed.




speech

2024 State of AI in the Speech Technology Industry: AI Is Revolutionizing Translation, Dubbing, and Subtitling

Improved accuracy, wider language choices, and real-time options are among the benefits.




speech

2024 State of AI in the Speech Technology Industry: Voice Biometrics Both Profits From and Is Plagued by AI

Deepfake threats advance, and technology is challenged to keep up.




speech

2024 State of AI in the Speech Technology Industry: AI Is Enabling Audiovisual Enhancements

Voice and video creation tools continue to benefit from AI augmentation.




speech

2024 State of AI in the Speech Technology Industry: AI's Impact on Natural Language Processing

Speech systems will continue to get more conversational with AI advances.




speech

2024 State of AI in the Speech Technology Industry: GenAI-Fueled Speech Analytics Enable Real-Time Results

The insights come faster and are more comprehensive.




speech

2024 State of AI in the Speech Technology Industry

How generative (and other types of) artificial intelligence is impacting five important sectors.




speech

2024 Vertical Market Case Studies: Speech Technology in Entertainment

TrueFan is a true fan of Resemble AI's TTS