audi

Infosys says audit committee finds no evidence of financial impropriety or executive misconduct






audi

Qatar to exit Opec amid tension with Saudi Arabia






audi

1.34 lakh Indians return, government watching situation in Saudi Arabia

Asked whether Indians are being targeted abroad, Overseas Indian Affairs Minister Vayalar Ravi said, "One cannot make such a general statement."




audi

Fresh drive to help illegal Indian workers in Saudi Arabia

The number of Indians who left during the grace period till October-end is 1,34,281, the Indian Embassy here said in a statement.




audi

4870 Indians return from Saudi after job-related issues: Government

V K Singh said presently, a section of Indian workers facing difficulties in two major Saudi companies -- Saudi Oger and the Saad Group -- are being brought back to India.




audi

20,000 Indians to return from Saudi Arabia via amnesty scheme

Around 1,500 blue collar workers from Tamil Nadu are among those who are using the amnesty to come back to the country.




audi

Crude oil prices rise on Saudi crude price increase, China export bounce

US West Texas Intermediate (WTI) crude rose $1.78, or 7.4 per cent, to $25.77, on track for its highest close since April 6 and up more than 30 per cent this week.




audi

Saudi, American firms eye stakes in Reliance's Jio

Three deals in three weeks injected a combined $8 billion into the group and helped it pare its debt.




audi

Norwalk Man Sentenced to Prison for Defrauding Service Member




audi

Accounts and audit determination for public health entities in NSW




audi

Saudi PiF plot move for second football club: NUFC evening update

All the latest Newcastle United takeover headlines with updates on the Saudi Public Investment Fund and Gary Neville




audi

Edison Research, NPR Release 2020 Smart Audio Report

EDISON RESEARCH and NPR released the findings in its 2020 Smart Audio Report on smart speaker and voice-controlled device usage THURSDAY (4/30)  in a webinar hosted by EDISON's TOM … more




audi

RAB 'Open For Business' Live Video Series Offers Presentation By Entercom's Jennifer Morrelli On Audience Engagement

The RADIO ADVERTISING BUREAU's next webinar in its "Business Unusual" program's "Open for Business" series, “Creating Audience Engagement,” will … more




audi

TopLine By Futuri Presents Nielsen Audio April '20 PPMs Released Monday

NIELSEN AUDIO PPM APRIL '20 MONTHLY results arrive MONDAY, MAY 11th for NEW YORK; LOS ANGELES; CHICAGO; SAN FRANCISCO; DALLAS; HOUSTON; PHILADELPHIA; ATLANTA; NASSAU-SUFFOLK; … more




audi

AudioSweets Make New PopCore Volume Available

AUDIOSWEETS has released the latest in its imaging POPCORE series, POPCORE VOL. 14 from ASX. POPCORE VOL. 14 features 220 imaging elements with 11 categories in the update including Artist … more




audi

John Harrington-WHAT WE USE - Audio and sound kit

Here's a video segment on the Audio and Sound Kit that we use. A transcription of the video is available after the jump.




audi

John Harrington-WHAT WE USE - Audio Entertainment Kit

Here's a video segment on the Audio Entertainment Kit that we use. A transcription of the video is available after the jump.




audi

TopLine By Futuri Presents Nielsen Audio March '20 Ratings Released Today

NIELSEN AUDIO MARCH '20 results arrive TODAY for SYRACUSE; AKRON; MONTEREY-SALINAS-SANTA CRUZ; and CHARLESTON, SC. Find the ratings for the subscribing stations in the ALLACCESS.COM … more




audi

TopLine By Futuri Presents Nielsen Audio March '20 Ratings Released Today

NIELSEN AUDIO MARCH '20 results arrive TODAY for DES MOINES; COLORADO SPRINGS; MOBILE; WICHITA; and SPOKANE. Find the ratings for the subscribing stations in the ALLACCESS.COM NIELSEN … more




audi

Push The Boundaries Of Creativity And Fun During COVID-19 -- Your Radio Audience Will Thank You

Rally your troops to get virtual to bring new creative ideas to your radio stations. Get on ZOOM and have a brainstorming session with your creative teams and clients. Time to squeeze new … more




audi

Jacobs Media: Stop Worrying About The Ratings And Focus On Serving Your Audience

The focus of JACOBS MEDIA President FRED JACOBS' blog is simple and to the point as he writes: "Welcome to BIZARRO WORLD! "If you were a reader of SUPERMAN comics back in the … more




audi

TopLine By Futuri Presents Nielsen Audio March '20 Ratings Released Today

NIELSEN AUDIO MARCH '20 results arrive TODAY for CHATTANOOGA; MADISON; HUNTSVILLE, AL; and JACKSON, MS. Find the ratings for the subscribing stations in the ALLACCESS.COM NIELSEN AUDIO  … more




audi

Your Audience Depends On The Immediacy Of Radio -- What Have You Done Today To Earn Their Ears?

During the COVID-19 lockdown, and during the gradual re-opening of communities, cities and businesses there is a lot of information you've got that your listeners need. And, they are looking … more




audi

Be There When Your Audience Needs You -- Right Now, During The COVID-19 Pandemic

During the COVID-19 lockdown, and during the gradual re-opening of communities, cities and businesses there is a lot of information you've got that your listeners need. And, they are looking … more




audi

What Will You And Your Station Do Differently Today To Make A Difference In Your Audience's Life In The Middle Of The COVID-19 Pandemic?

During the COVID-19 lockdown, and during the gradual re-opening of communities, PPM meters are now coming back online and meter counts are inching up as more people get in cars and resume a … more




audi

Report: Is Warner Music Group Entertaining An Offer To Sell To Saudi Arabia's Public Investment Fund?

MUSIC BUSINESS WORLDWIDE and the HOLLYWOOD REPORTER are both carrying reports on rumors that SAUDI ARABIA's PUBLIC INVESTMENT FUND is making an offer to buy WARNER MUSIC GROUP. The fund … more




audi

Hooked: How to engage your website audience in one second or less

You have less than one second to make the right impression. Almost immediately after landing on your website users will make an uninformed, mostly subconscious judgment about what type of organization they’re interacting with. This initial judgment will largely be influenced by layout, design, and visual tone. It will not only influence the rest of […]

The post Hooked: How to engage your website audience in one second or less appeared first on Psychology of Web Design | 3.7 Blog.




audi

WordPress Audio Player Plugin

I recently went looking for a good audio player for WordPress. I came across WPAudioPlayer from 1 pixel out. The plugin is extremely simple to use and has a really awesome automatic color detection tool which will match to your site with ease. For more info visit the demo page at http://www.1pixelout.net/code/audio-player-wordpress-plugin/

The post WordPress Audio Player Plugin appeared first on WPCult.




audi

Facebook Live Streaming and Audio/Video Hosting connected to Auphonic

Facebook is not only a social media giant; the company also provides valuable tools for broadcasting. Today we release a connection to Facebook, which allows you to use the Facebook tools for video/audio production and publishing within Auphonic and our connected services.

The following workflows are possible with Facebook and Auphonic:
  • Use Facebook for live streaming, then import, process and distribute the audio/video with Auphonic.
  • Post your Auphonic audio or video productions directly to the news feed of your Facebook Page or User.
  • Use Facebook as a general media hosting service and share the link or embed the audio/video on any webpage (also visible to non-Facebook users).

Connect to Facebook

First, connect to a Facebook account on our External Services Page by clicking the "Facebook" button.

Select if you want to connect to your personal Facebook User or to a Facebook Page:

It is always possible to remove or edit the connection in your Facebook Settings (Tab Business Integrations).

Import (Live) Videos from Facebook to Auphonic

Facebook Live is an easy (and free) way to stream live videos:

We implemented an interface to use Facebook as an Incoming External Service. Please select a (live or non-live) video from your Facebook Page/User as the source of a production and then process it with Auphonic:

This workflow allows you to use Facebook for live streaming, import and process the audio/video with Auphonic, then publish a podcast and video version of your live video to any of our connected services.

Export from Auphonic to Facebook

Similar to YouTube, it is possible to use Facebook for media file hosting.
Please add your Facebook Page/User as an External Service in your Productions or Presets to upload the Auphonic results directly to Facebook:

Options for the Facebook export:
  • Distribution Settings
    • Post to News Feed: The exported video is posted directly to your news feed / timeline.
    • Exclude from News Feed: The exported video is visible in the videos tab of your Facebook Page/User (see for example Auphonic's video tab), but it is not posted to your news feed (you can do that later if you want).
    • Secret: Only you can see the exported video, it is not shown in the Facebook video tab and it is not posted to your news feed (you can do that later if you want).
  • Embeddable
    Choose if the exported video should be embeddable in third-party websites.

It is always possible to change the distribution/privacy and embeddable options later directly on Facebook. For example, you can export a video to Facebook as Secret and publish it to your news feed whenever you want.


If your production is audio-only, we automatically generate a video track from the Cover Image and (possible) Chapter Images.
Alternatively you can select an Audiogram Output File, if you want to add an Audiogram (audio waveform visualization) to your Facebook video - for details please see Auphonic Audiogram Generator.

Auphonic Title and Description metadata fields are exported to Facebook as well.
If you add Speech Recognition to your production, we create an SRT file with the speech recognition results and add it to your Facebook video as captions.
See the example below.

Facebook Video Hosting Example with Audiogram and Automatic Captions

Facebook can be used as a general video hosting service: even if you export videos as Secret, you will get a direct link to the video which can be shared or embedded in any third-party websites. Users without a Facebook account are also able to view these videos.

In the example below, we automatically generate an Audiogram Video for an audio-only production, use our integrated Speech Recognition system to create captions and export the video as Secret to Facebook.
Afterwards it can be embedded directly into this blog post (enable Captions if they don't show up per default) - for details please see How to embed a video:

It is also possible to just use the generated result URL from Auphonic to share the link to your video (also visible to non-Facebook users):
https://www.facebook.com/auphonic/videos/1687244844638091/

Important Note:
Facebook needs some time to process an exported video (up to a few minutes) and the direct video link won't work before the processing is finished - please try again a bit later!
On Facebook Pages, you can see the processing progress in your Video Library.

Conclusion

Facebook has many broadcasting tools to offer and is a perfect addition to Auphonic.
Both systems and our other external services can be used to create automated processing and publishing workflows. Furthermore, the export and import to/from Facebook is also fully supported in the Auphonic API.

Please contact us if you have any questions or further ideas!




audi

Auphonic Audio Inspector Release

At the Subscribe 9 Conference, we presented the first version of our new Audio Inspector:
The Auphonic Audio Inspector is shown on the status page of a finished production and displays details about what our algorithms are changing in audio files.

A screenshot of the Auphonic Audio Inspector on the status page of a finished Multitrack Production.
Please click on the screenshot to see it in full resolution!

It is possible to zoom and scroll within audio waveforms, and the Audio Inspector can be used to manually check production results and input files.

In this blog post, we will discuss the usage and all current visualizations of the Inspector.
If you just want to try the Auphonic Audio Inspector yourself, take a look at this Multitrack Audio Inspector Example.

Inspector Usage

Control bar of the Audio Inspector with scrollbar, play button, current playback position and length, button to show input audio file(s), zoom in/out, toggle legend and a button to switch to fullscreen mode.

Seek in Audio Files
Click or tap inside the waveform to seek in files. The red playhead will show the current audio position.
Zoom In/Out
Use the zoom buttons ([+] and [-]), the mouse wheel or zoom gestures on touch devices to zoom in/out the audio waveform.
Scroll Waveforms
If zoomed in, use the scrollbar or drag the audio waveform directly (with your mouse or on touch devices).
Show Legend
Click the [?] button to show or hide the Legend, which describes details about the visualizations of the audio waveform.
Show Stats
Use the Show Stats link to display Audio Processing Statistics of a production.
Show Input Track(s)
Click Show Input to show or hide input track(s) of a production: now you can see and listen to input and output files for a detailed comparison. Please click directly on the waveform to switch/unmute a track - muted tracks are grayed out slightly:

Showing four input tracks and the Auphonic output of a multitrack production.

Please click on the fullscreen button (bottom right) to switch to fullscreen mode.
Now the audio tracks use all available screen space to see all waveform details:

A multitrack production with output and all input tracks in fullscreen mode.
Please click on the screenshot to see it in full resolution.

In fullscreen mode, it’s also possible to control playback and zooming with keyboard shortcuts:
Press [Space] to start/pause playback, use [+] to zoom in and [-] to zoom out.

Singletrack Algorithms Inspector

First, we discuss the analysis data of our Singletrack Post Production Algorithms.

The audio levels of output and input files, measured according to the ITU-R BS.1770 specification, are displayed directly as the audio waveform. Click on Show Input to see the input and output file. Only one file is played at a time, click directly on the Input or Output track to unmute a file for playback:

Singletrack Production with opened input file.
See the first Leveler Audio Example to try the audio inspector yourself.

Waveform Segments: Music and Speech (gold, blue)
Music/Speech segments are displayed directly in the audio waveform: Music segments are plotted in gold/yellow, speech segments in blue (or light/dark blue).
Waveform Segments: Leveler High/No Amplification (dark, light blue)
Speech segments can be displayed in normal, dark or light blue: Dark blue means that the input signal was very quiet and contains speech, therefore the Adaptive Leveler has to use a high amplification value in this segment.
In light blue regions, the input signal was very quiet as well, but our classifiers decided that the signal should not be amplified (breathing, noise, background sounds, etc.).

Yellow/orange background segments display leveler fades.

Background Segments: Leveler Fade Up/Down (yellow, orange)
If the volume of an input file changes quickly, the Adaptive Leveler volume curve will increase/decrease very fast as well (= fade), and these fades should be placed in speech pauses. Otherwise, if fades are too slow or occur during active speech, one will hear pumping speech artifacts.
Exact fade regions are plotted as yellow (fade up, volume increase) and orange (fade down, volume decrease) background segments in the audio inspector.

Horizontal red lines display noise and hum reduction profiles.

Horizontal Lines: Noise and Hum Reduction Profiles (red)
Our Noise and Hiss Reduction and Hum Reduction algorithms segment the audio file into regions with different background noise characteristics, which are displayed as red horizontal lines in the audio inspector (top lines for noise reduction, bottom lines for hum reduction).
Then a noise print is extracted in each region and a classifier decides if and how much noise reduction is necessary - this is plotted as a value in dB below the top red line.
The hum base frequency (50Hz or 60Hz) and the strength of all its partials are also classified in each region. The value in Hz above the bottom red line indicates the base frequency; if no red line is shown, no hum reduction is necessary in that region.

You can try the singletrack audio inspector yourself with our Leveler, Noise Reduction and Hum Reduction audio examples.

Multitrack Algorithms Inspector

If our Multitrack Post Production Algorithms are used, additional analysis data is shown in the audio inspector.

The audio levels of the output and all input tracks are measured according to the ITU-R BS.1770 specification and are displayed directly as the audio waveform. Click on Show Input to see all the input files with track labels and the output file. Only one file is played at a time, click directly into the track to unmute a file for playback:

Input Tracks: Waveform Segments, Background Segments and Horizontal Lines
Input tracks are displayed below the output file including their track names. The same data as in our Singletrack Algorithms Inspector is calculated and plotted separately in each input track:
Output Waveform Segments: Multiple Speakers and Music
Each speaker is plotted in a separate, blue-like color - in the example above we have 3 speakers (normal, light and dark blue) and you can see directly in the waveform when and which speaker is active.
Audio from music input tracks is always plotted in gold/yellow in the output waveform. Please try not to mix music and speech parts in music tracks (see also Multitrack Best Practice)!

You can try the multitrack audio inspector yourself with our Multitrack Audio Inspector Example or our general Multitrack Audio Examples.

Ducking, Background and Foreground Segments

Music tracks can be set to Ducking, Foreground, Background or Auto - for more details please see Automatic Ducking, Foreground and Background Tracks.

Ducking Segments (light, dark orange)
In Ducking, the level of a music track is reduced if one of the speakers is active, which is plotted as a dark orange background segment in the output track.
Foreground music parts, where no speaker is active and the music track volume is not reduced, are displayed as light orange background segments in the output track.
Background Music Segments (dark orange background)
Here the whole music track is set to Background and won’t be amplified when speakers are inactive.
Background music parts are plotted as dark orange background segments in the output track.
Foreground Music Segments (light orange background)
Here the whole music track is set to Foreground and its level won’t be reduced when speakers are active.
Foreground music parts are plotted as light orange background segments in the output track.

You can try the ducking/background/foreground audio inspector yourself: Fore/Background/Ducking Audio Examples.

Audio Search, Chapter Marks and Video

Audio Search and Transcriptions
If our Automatic Speech Recognition Integration is used, a time-aligned transcription text will be shown above the waveform. You can use the search field to search and seek directly in the audio file.
See our Speech Recognition Audio Examples to try it yourself.
Chapter Marks
Chapter Mark start times are displayed in the audio waveform as black vertical lines.
The current chapter title is written above the waveform - see “This is Chapter 2” in the screenshot above.

A video production with output waveform, input waveform and transcriptions in fullscreen mode.
Please click on the screenshot to see it in full resolution.

Video Display
If you add a Video Format or Audiogram Output File to your production, the audio inspector will also show a separate video track in addition to the audio output and input tracks. The video playback will be synced to the audio of output and input tracks.

Supported Audio Formats

We use the native HTML5 audio element for playback and the aurora.js javascript audio decoders to support all common audio formats:

WAV, MP3, AAC/M4A and Opus
These formats are supported in all major browsers: Firefox, Chrome, Safari, Edge, iOS Safari and Chrome for Android.
FLAC
FLAC is supported in Firefox, Chrome, Edge and Chrome for Android - see FLAC audio format.
In Safari and iOS Safari, we use aurora.js to directly decode FLAC files in javascript, which works but uses much more CPU compared to native decoding!
ALAC
ALAC is not supported by any browser so far, therefore we use aurora.js to directly decode ALAC files in javascript. This works but uses much more CPU compared to native decoding!
Ogg Vorbis
Only supported by Firefox, Chrome and Chrome for Android - for details please see Ogg Vorbis audio format.

We suggest using a recent Firefox or Chrome browser for best performance.
Decoding FLAC and ALAC files also works in Safari and iOS with the help of aurora.js, but JavaScript decoders need a lot of CPU and they sometimes have problems with exact scrolling and seeking.
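
The format support described above can be summarized as a small lookup. The following Python sketch only restates the browser lists from this section; the names `NATIVE_SUPPORT`, `JS_DECODED` and `decoder_for` are hypothetical and are not part of the Audio Inspector code. For Ogg Vorbis in other browsers, the text does not describe a fallback, so the sketch returns `None` in that case.

```python
# Browser lists as stated in this section (a snapshot, not a live compatibility table).
ALL_BROWSERS = {"Firefox", "Chrome", "Safari", "Edge", "iOS Safari", "Chrome for Android"}

NATIVE_SUPPORT = {
    "wav": ALL_BROWSERS,
    "mp3": ALL_BROWSERS,
    "aac": ALL_BROWSERS,
    "opus": ALL_BROWSERS,
    "flac": {"Firefox", "Chrome", "Edge", "Chrome for Android"},
    "alac": set(),  # no native browser support
    "vorbis": {"Firefox", "Chrome", "Chrome for Android"},
}

# Formats the text says are decoded with aurora.js when native support is missing.
JS_DECODED = {"flac", "alac"}


def decoder_for(fmt, browser):
    """Return "native", "aurora.js", or None if the text describes no option."""
    if browser in NATIVE_SUPPORT[fmt]:
        return "native"
    if fmt in JS_DECODED:
        return "aurora.js"
    return None


print(decoder_for("flac", "Firefox"))   # native
print(decoder_for("flac", "Safari"))    # aurora.js
print(decoder_for("vorbis", "Safari"))  # None
```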

Please see our blog post Audio File Formats and Bitrates for Podcasts for more details about audio formats.

Mobile Audio Inspector

Multiple responsive layouts were created to optimize the screen space usage on Android and iOS devices, so that the audio inspector is fully usable on mobile devices as well: tap into the waveform to set the playhead location, scroll horizontally to scroll waveforms, scroll vertically to scroll between tracks, use zoom gestures to zoom in/out, etc.

Unfortunately the fullscreen mode is not available on iOS devices (thanks to Apple), but it works on Android and is a really great way to inspect everything using all the available screen space:

Audio inspector in horizontal fullscreen mode on Android.

Conclusion

Try the Auphonic Audio Inspector yourself: take a look at our Audio Example Page or play with the Multitrack Audio Inspector Example.

The Audio Inspector will be shown in all productions which are created in our Web Service.
It might be used to manually check production result/input files and to send us detailed feedback about audio processing results.

Please let us know if you have some feedback or questions - more visualizations will be added in future!







audi

Auphonic Add-ons for Adobe Audition and Adobe Premiere

The new Auphonic Audio Post Production Add-ons for Adobe allow you to use the Auphonic Web Service directly within Adobe Audition and Adobe Premiere (Mac and Windows):

Audition Multitrack Editor with the Auphonic Audio Post Production Add-on.
The Auphonic Add-on can be embedded directly inside the Adobe user interface.


It is possible to export tracks/projects from Audition/Premiere and process them with the Auphonic audio post production algorithms (loudness, leveling, noise reduction - see Audio Examples), use our Encoding/Tagging, Chapter Marks, Speech Recognition and trigger Publishing with one click.
Furthermore, you can import the result file of an Auphonic Production into Audition/Premiere.


Download the Auphonic Audio Post Production Add-ons for Adobe:

Auphonic Add-on for Adobe Audition

Audition Waveform Editor with the Auphonic Audio Post Production Add-on.
Metadata, Marker times and titles will be exported to Auphonic as well.

Export from Audition to Auphonic

You can upload the audio of your current active document (a Multitrack Session or a Single Audio File) to our Web Service.
In case of a Multitrack Session, a mixdown will be computed automatically to create a Singletrack Production in our Web Service.
Unfortunately, it is not possible to export the individual tracks in Audition, which could be used to create Multitrack Productions.

Metadata and Markers
All metadata (see tab Metadata in Audition) and markers (see tab Marker in Audition and the Waveform Editor Screenshot) will be exported to Auphonic as well.
Marker times and titles are used to create Chapter Marks (Enhanced Podcasts) in your Auphonic output files.
Auphonic Presets
You can optionally choose an Auphonic Preset to use previously stored settings for your production.
Start Production and Upload & Edit Buttons
Click Upload & Edit to upload your audio and create a new Production for further editing. After the upload, a web browser will be started to edit/adjust the production and start it manually.
Click Start Production to upload your audio, create a new Production and start it directly without further editing. A web browser will be started to see the results of your production.
Audio Compression
Uncompressed Multitrack Sessions or audio files in Audition (WAV, AIFF, RAW, etc.) will be compressed automatically with lossless codecs to speed up the upload time without a loss in audio quality.
FLAC is used as the lossless codec on Windows and Mac OS (>= 10.13). Older Mac OS systems (< 10.13) do not support FLAC and use ALAC instead.
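
The codec rule above fits in a few lines. This is a minimal Python sketch of the decision as described, not the add-on's actual code; the function name `choose_upload_codec` and the way the OS version is passed in are assumptions made for illustration.

```python
def choose_upload_codec(system, macos_version=None):
    """Pick the lossless upload codec following the rule stated above.

    system: "Windows" or "Darwin" (the value platform.system() reports on Mac OS).
    macos_version: tuple such as (10, 12); only relevant on Mac OS.
    """
    # Mac OS versions below 10.13 do not support FLAC, so ALAC is used there.
    if system == "Darwin" and macos_version is not None and macos_version < (10, 13):
        return "alac"
    return "flac"


print(choose_upload_codec("Windows"))           # flac
print(choose_upload_codec("Darwin", (10, 12)))  # alac
print(choose_upload_codec("Darwin", (10, 14)))  # flac
```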

Import Auphonic Productions in Audition

To import the result of an Auphonic Production into Audition, choose the corresponding production and click Import.
The result file will be downloaded from the Auphonic servers and can be used within Audition. If the production contains multiple Output File Formats, the output file with the highest bitrate (or uncompressed/lossless if available) will be chosen.
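
The selection rule above (prefer an uncompressed/lossless file, otherwise take the highest bitrate) could look roughly like this in Python. The data layout, the `LOSSLESS` set and the function name `pick_import_file` are assumptions for illustration only and do not reflect the add-on's internals.

```python
# Assumed set of uncompressed/lossless formats for this sketch.
LOSSLESS = {"wav", "flac", "alac"}


def pick_import_file(outputs):
    """outputs: list of dicts like {"format": "mp3", "bitrate": 128}."""
    # Prefer the first uncompressed/lossless output file, if any exists.
    for output in outputs:
        if output["format"] in LOSSLESS:
            return output
    # Otherwise fall back to the file with the highest bitrate.
    return max(outputs, key=lambda output: output.get("bitrate", 0))


lossy_only = [{"format": "mp3", "bitrate": 128}, {"format": "aac", "bitrate": 192}]
print(pick_import_file(lossy_only))
print(pick_import_file(lossy_only + [{"format": "flac"}]))
```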

Auphonic Add-on for Adobe Premiere

Premiere Video Editor with the Auphonic Audio Post Production Add-on.
The Auphonic Add-on can be embedded directly inside the Adobe Premiere user interface.

Export from Premiere to Auphonic

You can upload the audio of your current Active Sequence in Premiere to our Web Service.

We will automatically create an audio-only mixdown of all enabled audio tracks in your current Active Sequence.
Video/Image tracks are ignored: no video will be rendered or uploaded to Auphonic!
If you want to export a specific audio track, please just mute the other tracks.

Start Production and Upload & Edit Buttons
Click Upload & Edit to upload your audio and create a new Production for further editing. After the upload, a web browser will be started to edit/adjust the production and start it manually.
Click Start Production to upload your audio, create a new Production and start it directly without further editing. A web browser will be started to see the results of your production.
Auphonic Presets
You can optionally choose an Auphonic Preset to use previously stored settings for your production.
Chapter Markers
Chapter Markers in Premiere (not all the other marker types!) will be exported to Auphonic as well and are used to create Chapter Marks (Enhanced Podcasts) in your Auphonic output files.
Audio Compression
The mixdown of your Active Sequence in Premiere will be compressed automatically with lossless codecs to speed up the upload time without a loss in audio quality.
FLAC is used as the lossless codec on Windows and Mac OS (>= 10.13). Older Mac OS systems (< 10.13) do not support FLAC and use ALAC instead.

Import Auphonic Productions in Premiere

To import the result of an Auphonic Production into Premiere, choose the corresponding production and click Import.
The result file will be downloaded from the Auphonic servers and can be used within Premiere. If the production contains multiple Output File Formats, the output file with the highest bitrate (or uncompressed/lossless if available) will be chosen.

Installation

Install our Add-ons for Audition and Premiere directly on the Adobe Add-ons website:

Auphonic Audio Post Production for Adobe Audition:
https://exchange.adobe.com/addons/products/20433

Auphonic Audio Post Production for Adobe Premiere:
https://exchange.adobe.com/addons/products/20429

The installation requires the Adobe Creative Cloud desktop application and might take a few minutes. Please also try restarting Audition/Premiere if the installation does not work (on Windows it was once even necessary to restart the computer to trigger the installation).


After the installation, you can start our Add-ons directly in Audition/Premiere:
navigate to Window -> Extensions and click Auphonic Post Production.

Enjoy

Thanks a lot to Durin Gleaves and Charles Van Winkle from Adobe for their great support!

Please let us know if you have any questions or feedback!







audi

Audio Manipulations and Dynamic Ad Insertion with the Auphonic API

We are pleased to announce a new Audio Inserts feature in the Auphonic API: audio inserts are separate audio files (like intros/outros), which will be inserted into your production at a defined offset.
This blog post shows how one can use this feature for Dynamic Ad Insertion and discusses other Audio Manipulation Methods of the Auphonic API.

API-only Feature

For the general podcasting hobbyist, or even for someone producing a regular podcast, the features that are accessible via our web interface are more than sufficient.

However, some of our users, like podcasting companies who integrate our services as part of their products, asked us for dynamic ad insertions. We teamed up with them to develop a way of making this work within the Auphonic API.

We are therefore pleased to announce audio inserts, a new feature that has been made part of our API. This feature is not available through the web interface; it requires the use of our API.

Before we talk about audio inserts, let's talk about what you need to know about dynamic ad insertion!

Dynamic Ad Insertion

There are two ways of dealing with adverts within podcasts. In the first, adverts are recorded or edited into the podcast and are fixed, or baked in. The second method is to use dynamic insertion, whereby the adverts are not part of the podcast recording/file but can be inserted into the podcast afterwards, at any time.

This second approach would allow you to run new ad campaigns across your entire catalog of shows. As a podcaster this allows you to potentially generate new revenue from your old content.

As a hosting company, dynamic ad insertion allows you to choose up to date and relevant adverts across all the podcasts you host. You can make these adverts relevant by subject or location, for instance.

Your users can define the time for the ads and their podcast episode; you are then in control of the adverts you insert.

Audio Inserts in Auphonic

Whichever approach to adverts you are taking, using audio inserts can help you.

Audio inserts are separate audio files which will be inserted into your main single or multitrack production at your defined offset (in seconds).

When a separate audio file is inserted as part of your production, it creates a gap in the podcast audio file, shifting the audio back by the length of the insert. Helpfully, chapters and other time-based information like transcriptions are also shifted back when an insert is used.
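
To make the shifting concrete, here is a minimal Python sketch of how chapter start times move when inserts are added. It is not Auphonic's implementation: the function name `shift_chapters` and the data layout are hypothetical, and it assumes that insert offsets refer to the original main audio and that a chapter starting exactly at an insert's offset is shifted as well.

```python
def shift_chapters(chapters, inserts):
    """Shift chapter start times by the inserts placed at or before them.

    chapters: list of (start_seconds, title) tuples.
    inserts:  list of (offset_seconds, duration_seconds) tuples, with offsets
              given relative to the original (unshifted) main audio.
    """
    shifted = []
    for start, title in chapters:
        # Every insert at or before this chapter delays it by the insert's length.
        delay = sum(duration for offset, duration in inserts if offset <= start)
        shifted.append((start + delay, title))
    return shifted


chapters = [(0.0, "Intro"), (60.0, "Interview"), (300.0, "Outro")]
inserts = [(20.5, 30.0), (120.3, 15.0)]  # a 30 s ad at 20.5 s, a 15 s ad at 120.3 s
print(shift_chapters(chapters, inserts))
# [(0.0, 'Intro'), (90.0, 'Interview'), (345.0, 'Outro')]
```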

The biggest advantage of this is that Auphonic will apply loudness normalization to the audio insert so, from an audio point of view, it matches the rest of the podcast.

Although created with dynamic ad insertion in mind, this feature can be used for any type of audio inserts: adverts, music songs, individual parts of a recording, etc. In the case of baked-in adverts, you could upload your already processed advert audio as an insert, without having to edit it into your podcast recording using a separate audio editing application.

Please note that audio inserts should already be edited and processed before using them in production. (This is usually the case with pre-recorded adverts anyway). The only algorithm that Auphonic applies to an audio insert is loudness normalization in order to match the loudness of the entire production. Auphonic does not add any other processing (i.e. no leveling, noise reduction etc).

Audio Inserts Coding Example

Here is a brief overview of how to use our API for audio inserts. Be warned, this section is coding heavy, so if this isn't your thing, feel free to move along to the next section!

You can add audio insert files with a call to https://auphonic.com/api/production/{uuid}/multi_input_files.json, where uuid is the UUID of your production.
Here is an example with two audio inserts from an https URL. The offset/position in the main audio file must be given in seconds:

curl -X POST -H "Content-Type: application/json" \
    https://auphonic.com/api/production/{uuid}/multi_input_files.json \
    -u username:password \
    -d '[
            {
                "input_file": "https://mydomain.com/my_audio_insert_1.wav",
                "type": "insert",
                "offset": 20.5
            },
            {
                "input_file": "https://mydomain.com/my_audio_insert_2.wav",
                "type": "insert",
                "offset": 120.3
            }
        ]'

More details showing how to use audio inserts in our API can be seen here.
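If you prefer to script this, the same request can be built in Python. The following is a minimal sketch using only the standard library: the endpoint, the JSON fields and the basic authentication mirror the curl call above, while the helper name `build_insert_request` and the UUID placeholder are hypothetical. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request

API_URL = "https://auphonic.com/api/production/{uuid}/multi_input_files.json"


def build_insert_request(uuid, username, password, inserts):
    """Build (but do not send) the POST request for a list of audio inserts."""
    # Each insert is a (file URL, offset in seconds) pair.
    payload = [
        {"input_file": url, "type": "insert", "offset": offset}
        for url, offset in inserts
    ]
    # Same effect as curl's "-u username:password" (HTTP basic authentication).
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        API_URL.format(uuid=uuid),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


request = build_insert_request(
    "YOUR_PRODUCTION_UUID",  # placeholder, replace with the UUID of your production
    "username",
    "password",
    [
        ("https://mydomain.com/my_audio_insert_1.wav", 20.5),
        ("https://mydomain.com/my_audio_insert_2.wav", 120.3),
    ],
)
# To actually send it: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```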

Additional API Audio Manipulations

In addition to audio inserts, using the Auphonic API offers a number of other audio manipulation options, which are not available via the web interface:

Cut start/end of audio files: See Docs
In Single-track productions, this feature allows the user to cut the start and/or the end of the uploaded audio file. Crucially, time-based information such as chapters etc. will be shifted accordingly.
Fade In/Out time of audio files: See Docs
This allows you to set the fade in/out time (in ms) at the start/end of output files. The default fade time is 100ms, but values can be set between 0ms and 5000ms.
This feature is also available in our Auphonic Leveler Desktop App.
Adding intro and outro: See Docs
Automatically add intros and outros to your main audio input file. This feature is also available in our web interface.
Add multiple intros or outros: See Docs
Using our API, you can also add multiple intros or outros to a production. These intros or outros are played in series.
Overlapping intros/outros: See Docs
This feature allows intros/outros to overlap either the main audio or the following/previous intros/outros.

Conclusion

If you haven't explored our API already, the new audio inserts feature allows for greater flexibility and also dynamic ad insertion.
If you offer online services to podcasters, the Auphonic API would also then allow you to pass on Auphonic's audio processing algorithms to your customers.

If this is of interest to you or you have any new feature suggestions that you feel could benefit your company, please get in touch. We are always happy to extend the functionality of our products!







audi

Leveler Presets, LRA Target and Advanced Audio Parameters (Beta)

Lots of users have asked us about more customization and control over the sound of our audio algorithms in the past, so today, we have introduced some advanced algorithm parameters for our singletrack version in a private beta program!

The following new parameters are available:

UPDATE Nov. 2018:
We released a complete rework of the Adaptive Leveler parameters and the description here is not valid anymore!
Please see Auphonic Adaptive Leveler Customization (Beta Update)!

Please join our private beta program and let us know how you use these new features or if you need even more control!

Leveler Presets

Our Adaptive Leveler corrects level differences between speakers, between music and speech and will also apply dynamic range compression to achieve a balanced overall loudness. If you don't know about the Leveler yet, take a look at our Audio Examples.

Leveler presets are basically completely new leveling algorithms, which we have been working on over the past few months:
Our current Leveler tries to normalize all speakers to the same loudness. However, in some cases, you might want more or less loudness differences (dynamic range / loudness range) between the speakers and music segments, or more or less compression, etc.
For these use cases, we have developed additional Leveler Presets and the parameter Maximum Loudness Range.

The following Leveler presets are now available:
Preset Medium:
This is our current leveling algorithm as demonstrated in the Audio Examples.
Preset Hard:
The hard preset reacts faster and applies more gain and compression compared to the medium preset. It is built for recordings with extreme loudness differences, for example very quiet questions from the audience in a lecture recording, extremely soft and loud voices within one audio track, etc.
Preset Soft:
This preset reacts slower, applies less gain and compression compared to the medium preset. Use it if you want to keep more loudness differences (dynamic narration), if you want your voices to sound "less compressed/processed", for dynamic music (concert/classical recordings), background music, etc.
Preset Softer:
Like soft, but softer :)
Preset Speech Medium, Music Soft:
Uses the medium preset in speech segments and the soft preset in music segments. It is built for music live recordings or dynamic music mixes, where you want to amplify all speakers but keep the loudness differences within and between music segments.
Preset Medium, No Compressor:
Like the medium preset, but only (mid-term) leveling and no (short-term) compression is applied. This preset is optimal if you just use a Maximum Loudness Range Target and want to avoid any additional compression as much as possible.
Please let us know your use case, if you need more/other controls or if anything is confusing. The Leveler presets are still in private beta and can be changed as necessary!

Maximum Loudness Range (LRA) Target

The loudness range (LRA) indicates the variation of loudness over the course of a program and is measured in LU (loudness units) - for more details see Loudness Measurement and Normalization or EBU Tech 3342.

The parameter Max Loudness Range controls how much leveling is applied:
volume changes of our Adaptive Leveler will be restricted so that the loudness range of the output file is below the selected value.
High loudness range values will result in very dynamic output files, low loudness range values in compressed output audio. If the LRA value of your input file is already below the maximum loudness range value, no leveling at all will be applied.

The selected Leveler Preset also matters: for example, if you use the soft(er) preset, it won't be possible to achieve very low loudness range targets.

Also, the Max Loudness Range parameter is not as precise a target value as the Loudness Target. The LRA of your output file might be off by a few LU, as it is not reasonable to reach the exact target value.

Use Cases: The Maximum LRA parameter allows you to control the strength of our leveling algorithms, in combination with the parameter Leveler Preset. This might be used for automatic mixdowns with different LRA values for different target platforms (very compressed ones like mobile devices or Alexa, very dynamic ones like home cinema, etc.).

Maximum True Peak Level

This parameter sets the maximum allowed true peak level of the processed output file, which is controlled by the True Peak Limiter after our Global Loudness Normalization algorithms.

If set to Auto (which is the current default), a reasonable value according to the selected loudness target is used: -1dBTP for -23 LUFS (EBU R128) and higher, -2dBTP for -24 LUFS (ATSC A/85) and lower loudness targets.
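As a rough illustration, the Auto rule can be written as a small Python function. This is a sketch of the rule as described here, reading the EBU R128 threshold as -23 LUFS; the function name is hypothetical and the handling of targets between -24 and -23 LUFS is an assumption, as the text does not cover that range.

```python
def auto_max_true_peak(loudness_target_lufs: float) -> float:
    """Return the Auto maximum true peak level (in dBTP) for a loudness target."""
    # Rule from the text: -1 dBTP for -23 LUFS and higher,
    # -2 dBTP for -24 LUFS and lower.
    # Assumption: targets strictly between -24 and -23 LUFS are treated like -24.
    return -1.0 if loudness_target_lufs >= -23.0 else -2.0


for target in (-16.0, -23.0, -24.0, -31.0):
    print(target, "LUFS ->", auto_max_true_peak(target), "dBTP")
```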

The maximum true peak level parameter is already available in our desktop program.

Better Hum and Noise Reduction Controls

In addition to the parameter (Noise) Reduction Amount, we now offer two more parameters to control the combination of our Noise and Hum Reduction algorithms:
Hum Base Frequency:
Set the hum base frequency to 50Hz or 60Hz (if you know it), or use Auto to automatically detect the hum base frequency in each speech region.
Hum Reduction Amount:
Maximum hum reduction amount in dB, higher values remove more noise.
In Auto mode, a classifier decides how much hum reduction is necessary in each speech region. Set it to a custom value (> 0), if you prefer more hum reduction or want to bypass our classifier. Use Disable Dehum to disable hum reduction and use our noise reduction algorithms only.

Behavior of noise and hum reduction parameter combinations:

Noise Reduction Amount | Hum Base Frequency | Hum Reduction Amount | Behavior
Auto                   | Auto               | Auto                 | Automatic hum and noise reduction
Auto or > 0            | *                  | Disabled             | No hum reduction, only denoise
Disabled               | 50Hz               | Auto or > 0          | Force 50Hz hum reduction, no denoise
Disabled               | Auto               | Auto or > 0          | Automatic dehum, no denoise
12dB                   | 60Hz               | Auto or > 0          | Always do dehum (60Hz) and denoise (12dB)
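The combinations above follow a simple pattern: each algorithm runs unless its amount is disabled. A minimal Python sketch of that reading follows; the function name, the constants and the wording of the returned summary are hypothetical and only meant to make the parameter combinations easier to reason about.

```python
AUTO = "auto"
DISABLED = "disabled"


def describe_reduction(noise_amount, hum_frequency, hum_amount):
    """Summarize which algorithms run for one parameter combination.

    Amounts are AUTO, DISABLED or a number of dB > 0;
    hum_frequency is AUTO, 50 or 60.
    """
    denoise = noise_amount != DISABLED
    dehum = hum_amount != DISABLED
    parts = []
    if dehum:
        freq = "automatic base frequency" if hum_frequency == AUTO else f"{hum_frequency}Hz"
        parts.append(f"dehum ({freq})")
    if denoise:
        amount = "automatic amount" if noise_amount == AUTO else f"{noise_amount}dB"
        parts.append(f"denoise ({amount})")
    return " and ".join(parts) if parts else "no reduction"


print(describe_reduction(AUTO, AUTO, AUTO))
print(describe_reduction(12, 60, AUTO))
print(describe_reduction(DISABLED, 50, 6))
```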

Advanced Parameters Private Beta and Feedback

At the moment the advanced algorithm parameters are for beta users only. This is to allow us to get user feedback, so we can change the parameters to suit user needs.
Please let us know your case studies, if you need any other algorithm parameters or if you have any questions!

Here are some private beta invitation codes:

y6KCBI4yo0 ksIFEsmI1y BDZec2a21V i4XRGLlVm2 0UDxuS0vbu aaBxi35sKN aaiDSZUbmY bu8lPF80Ih eMsSl6Sf8K DaWpsUnyjo
2YM00m8zDW wh7K2pPmSa jCX7mMy2OJ ZGvvhzCpTF HI0lmGhjVO eXqVhN6QLU t4BH0tYcxY LMjQREVuOx emIogTCAth 0OTPNB7Coz
VIFY8STj2f eKzRSWzOyv 40cMMKKCMN oBruOxBkqS YGgPem6Ne7 BaaFG9I1xZ iSC0aNXoLn ZaS4TykKIa l32bTSBbAx xXWraxS40J
zGtwRJeAKy mVsx489P5k 6SZM5HjkxS QmzdFYOIpf 500AHHtEFA 7Kvk6JRU66 z7ATzwado6 4QEtpzeKzC c9qt9Z1YXx pGSrDzbEED
MP3JUTdnlf PDm2MOLJIG 3uDietVFSL 1i7jZX0Y9e zPkSgmAqqP 5OhcmHIZUP E0vNsPxZ4s FzTIyZIG2r 5EywA0M7r5 FMhpcFkVN5
oRLbRGcRmI 2LTh8GlN7h Cjw6Z3cveP fayCewjE55 GbkyX89Lxu 4LpGZGZGgc iQV7CXYwkH pGLyQPgaha e3lhKDRUMs Skrei1tKIa
We are happy to send further invitation codes to all interested users - please do not hesitate to contact us!

If you have an invitation code, you can enter it here to activate the advanced audio algorithm parameters:
Auphonic Algorithm Parameters Private Beta Activation







audi

Advanced Multitrack Audio Algorithms Release (Beta)

Last weekend, at the Subscribe10 conference, we released Advanced Audio Algorithm Parameters for Multitrack Productions:

We launched our advanced audio algorithm parameters for Singletrack Productions last year. Now these settings (and more) are available for Multitrack Algorithms as well, which gives you detailed control for each track of your production.

The following new parameters are available:

Please join our private beta program and let us know how you use these new features or if you need even more control!

Fore/Background Settings

The parameter Fore/Background controls whether a track should be in foreground, in background, ducked, or unchanged, which is especially important for music or clip tracks.
For more details, please see Automatic Ducking, Foreground and Background Tracks .

We have now added the new option Unchanged and a new parameter to set the level of background segments/tracks:
Unchanged (Foreground):
We sometimes received complaints from users who produced very complex music or clip tracks that Auphonic changed the levels too much.
If you set the parameter Fore/Background to the new option Unchanged (Foreground), level relations within this track won’t be changed at all. The track will be added to the final mixdown so that its foreground/solo parts are as loud as (foreground) speech from other tracks.
Background Level:
It is now possible to set the level of background segments/tracks (compared to foreground segments) in background and ducking tracks. By default, background and ducking segments are 18dB softer than foreground segments.
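To get a feeling for what a background level in dB means in terms of signal amplitude, the standard decibel conversion can be used. The helper name below is hypothetical; the formula itself is the usual amplitude ratio of 10^(dB/20).

```python
def db_to_gain(level_db: float) -> float:
    """Convert a level difference in dB to a linear amplitude factor."""
    return 10 ** (level_db / 20)


# Default background level from the text: 18dB softer than foreground.
print(round(db_to_gain(-18.0), 3))
```

With the default setting, background segments therefore end up at roughly one eighth of the foreground amplitude.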

Leveler Parameters

Similar to our Singletrack Advanced Leveler Parameters (see this previous blog post), we also released leveling parameters for Multitrack Productions now.
The following advanced parameters for our Multitrack Adaptive Leveler can be set for each track and allow you to customize which parts of the audio should be leveled, how much they should be leveled, how much dynamic range compression should be applied and to set the stereo panorama (balance):

Leveler Preset:
Select the Speech or Music Leveler for this track.
If set to Automatic (default), a classifier will decide if this is a music or speech track.
Dynamic Range:
The parameter Dynamic Range controls how much leveling is applied: Higher values result in more dynamic output audio files (less leveling). If you want to increase the dynamic range by 3dB (or LU), just increase the Dynamic Range parameter by 3dB.
For more details, please see Multitrack Leveler Parameters.
Compressor:
Select a preset for Micro-Dynamics Compression: Auto, Soft, Medium, Hard or Off.
The Compressor adjusts short-term dynamics, whereas the Leveler adjusts mid-term level differences.
For more details, please see Multitrack Leveler Parameters.
Stereo Panorama (Balance):
Change the stereo panorama (balance for stereo input files) of the current track.
Possible values: L100, L75, L50, L25, Center, R25, R50, R75 and R100.

If you understand German and want to know more about our Advanced Leveler Parameters and audio dynamics in general, watch our talk at the Subscribe10 conference:
Video: Audio Lautheit und Dynamik.

Better Hum and Noise Reduction Controls

We now offer three parameters to control the combination of our Multitrack Noise and Hum Reduction Algorithms for each input track:
Noise Reduction Amount:
Maximum noise and hum reduction amount in dB, higher values remove more noise.
In Auto mode, a classifier decides if and how much noise reduction is necessary (to avoid artifacts). Set to a custom (non-Auto) value if you prefer more noise reduction or want to bypass our classifier.
Hum Base Frequency:
Set the hum base frequency to 50Hz or 60Hz (if you know it), or use Auto to automatically detect the hum base frequency in each speech region.
Hum Reduction Amount:
Maximum hum reduction amount in dB, higher values remove more noise.
In Auto mode, a classifier decides how much hum reduction is necessary in each speech region. Set it to a custom value (> 0), if you prefer more hum reduction or want to bypass our classifier. Use Disable Dehum to disable hum reduction and use our noise reduction algorithms only.

Behavior of noise and hum reduction parameter combinations:

Noise Reduction Amount | Hum Base Frequency | Hum Reduction Amount | Behavior
Auto                   | Auto               | Auto                 | Automatic hum and noise reduction
Auto or > 0            | *                  | Disabled             | No hum reduction, only denoise
Disabled               | 50Hz               | Auto or > 0          | Force 50Hz hum reduction, no denoise
Disabled               | Auto               | Auto or > 0          | Automatic dehum, no denoise
12dB                   | 60Hz               | Auto or > 0          | Always do dehum (60Hz) and denoise (12dB)

Maximum True Peak Level

In the Master Algorithm Settings of your multitrack production, you can set the maximum allowed true peak level of the processed output file, which is controlled by the True Peak Limiter after our Loudness Normalization algorithms.

If set to Auto (which is the current default), a reasonable value according to the selected loudness target is used: -1dBTP for -23 LUFS (EBU R128) and higher, -2dBTP for -24 LUFS (ATSC A/85) and lower loudness targets.

Full API Support

All advanced algorithm parameters, for Singletrack and Multitrack Productions, are available in our API as well, which allows you to integrate them into your scripts, external workflows and third-party applications.

Singletrack API:
Documentation on how to use the advanced algorithm parameters in our singletrack production API: Advanced Algorithm Parameters
Multitrack API:
Documentation of advanced settings for each track of a multitrack production:
Multitrack Advanced Audio Algorithm Settings

Join the Beta and Send Feedback

Please join our beta and let us know your case studies, if you need any other algorithm parameters or if you have any questions!

Here are some private beta invitation codes:

8tZPc3T9pH VAvO8VsDg9 0TwKXBW4Ni kjXJMivtZ1 J9APmAAYjT Zwm6HabuFw HNK5gF8FR5 Do1MPHUyPW CTk45VbV4t xYOzDkEnWP
9XE4dZ0FxD 0Sl3PxDRho uSoRQxmKPx TCI62OjEYu 6EQaPYs7v4 reIJVOwIr8 7hPJqZmWfw kti3m5KbNE GoM2nF0AcN xHCbDC37O5
6PabLBRm9P j2SoI8peiY olQ2vsmnfV fqfxX4mWLO OozsiA8DWo weJw0PXDky VTnOfOiL6l B6HRr6gil0 so0AvM1Ryy NpPYsInFqm
oFeQPLwG0k HmCOkyaX9R G7DR5Sc9Kv MeQLSUCkge xCSvPTrTgl jyQKG3BWWA HCzWRxSrgW xP15hYKEDl 241gK62TrO Q56DHjT3r4
9TqWVZHZLE aWFMSWcuX8 x6FR5OTL43 Xf6tRpyP4S tDGbOUngU0 5BkOF2I264 cccHS0KveO dT29cF75gG 2ySWlYp1kp iJWPhpAimF
We are happy to send further invitation codes to all interested users - please do not hesitate to contact us!

If you have an invitation code, you can enter it here to activate the Multitrack Advanced Audio Algorithm Parameters:
Auphonic Algorithm Parameters Private Beta Activation







audi

Dynamic Range Processing in Audio Post Production

If listeners find themselves using the volume up and down buttons a lot, level differences within your podcast or audio file are too big.
In this article, we discuss why audio dynamic range processing (or leveling) is more important than loudness normalization, why it depends on factors like the listening environment and the individual character of the content, and why the loudness range descriptor (LRA) is only reliable for speech programs.

Photo by Alexey Ruban.

Why loudness normalization is not enough

Everybody who has lived in an apartment building knows the problem: you want to enjoy a movie late at night, but you're constantly on the edge - not only because of the thrilling story, but because your index finger is hovering over the volume down button of your remote. The next loud sound effect is going to come sooner rather than later, and you want to avoid waking up your neighbors with some gunshot sounds blasting from your TV.

In our previous post, we talked about the overall loudness of a production. While that's certainly important to keep in mind, the loudness target is only an average value, ignoring how much the loudness varies within a production. The loudness target of your movie might be in the ideal range, yet the level differences between a gunshot and someone whispering can still be enormous - having you turn the volume down for the former and up for the latter.

While the average loudness might be perfect, level differences can lead to an unpleasant listening experience.

Of course, this doesn't apply to movies alone. The image above shows a podcast or radio production. The loud section is music, the very quiet section just breathing, and the remaining sections are different voices.

To be clear, we're not saying that the above example is problematic per se. There are many situations where a big difference in levels - a high dynamic range - is justified: for instance, in a movie theater, optimized for listening and without any outside noise, or in classical music.
Also, if the dynamic range is too small, listening can be tiring.

But if you watch the same movie in an outdoor screening in the summer on a beach next to the crashing waves or in the middle of a noisy city, it can be tricky to hear the softer parts.
Spoken word usually has a smaller dynamic range, and if you produce your podcast for a target audience of train or car commuters, the dynamic range should be even smaller, adjusting for the listening situation.

Therefore, hitting the loudness target has less impact on the listening experience than level differences (dynamic range) within one file!
What makes a suitable dynamic range does not only depend on the listening environment, but also on the nature of the content itself. If the dynamic range is too small, the audio can be tiring to listen to, whereas more variability in levels can make a program more interesting, but might not work in all environments, such as a noisy car.

Dynamic range experiment in a car

Wolfgang Rein, audio technician at SWR, a public broadcaster in Germany, did an experiment to test how drivers react to programs with different dynamic ranges. They monitored to what level drivers set the car stereo depending on speed (thus noise level) and audio dynamic range.
While the results are preliminary, it seems like drivers set the volume as low as possible so that they can still understand the content, but don't get distracted by loud sounds.

As drivers adjust the volume to the loudest voice in a program, they won't understand quieter speakers in content with a high dynamic range anymore. To some degree and for short periods of time, they can compensate by focusing more on the radio program, but over time that's tiring. Therefore, if the loudness varies too much, drivers tend to switch to another program rather than adjusting the volume.
Similar results have been found in a study conducted by NPR Labs and Towson University.

On the other hand, the perception was different in pure music programs. When drivers set the volume according to louder parts, they weren't able to hear softer segments or the beginning of a song very well. But that did not matter to them as much and didn't make them want to turn up the volume or switch the program.

Listener's reaction in response to frequent loudness changes. (from John Kean, Eli Johnson, Dr. Ellyn Sheffield: Study of Audio Loudness Range for Consumers in Various Listening Modes and Ambient Noise Levels)

Loudness comfort zone

The reaction of drivers to variable loudness hints at something that BBC sound engineer Mike Thornton calls the loudness comfort zone.

Tests (...) have shown that if the short-term loudness stays within the "comfort zone" then the consumer doesn’t feel the need to reach for the remote control to adjust the volume.
In a blog post, he highlights how the series Blue Planet 2 and Planet Earth 2 might not always have been the easiest to listen to. The graph below shows an excerpt with very loud music, followed by commentary just at the bottom of the green comfort zone. Thornton writes: "with the volume set at a level that was comfortable when the music was playing we couldn’t always hear the excellent commentary from Sir David Attenborough and had to resort to turning on the subtitles to be sure we knew what Sir David was saying!"

Planet Earth 2 Loudness Plot Excerpt. Colored green: comfort zone of +3 to -5LU around the loudness target. (from Mike Thornton: BBC Blue Planet 2 Latest Show In Firing Line For Sound Issues - Are They Right?)

As already mentioned above, a good mix considers the maximum and minimum possible loudness in the target listening environment.
In a movie theater the loudness comfort zone is big (loudness can vary a lot), and loud music is part of the fun, while quiet scenes work just as well. The opposite was true in the aforementioned experiment with drivers, where the loudness comfort zone is much smaller and quiet voices are difficult to understand.

Hence, the loudness comfort zone determines how much dynamic range an audio signal can use in a specific listening environment.
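To make this more concrete, here is a small Python sketch that checks how much of a program stays inside a comfort zone. It assumes you already have a list of short-term loudness values (in LUFS) from a loudness meter; the function name is hypothetical, and the default zone of +3 to -5 LU around the loudness target is taken from the plot caption above.

```python
def comfort_zone_share(short_term_lufs, target_lufs, upper_lu=3.0, lower_lu=5.0):
    """Fraction of short-term loudness values inside the comfort zone."""
    low, high = target_lufs - lower_lu, target_lufs + upper_lu
    inside = sum(1 for value in short_term_lufs if low <= value <= high)
    # Note: raises ZeroDivisionError for an empty list of measurements.
    return inside / len(short_term_lufs)


# Made-up measurements around a loudness target of -23 LUFS.
measurements = [-23.0, -20.0, -28.0, -30.0, -19.0]
print(comfort_zone_share(measurements, target_lufs=-23.0))
```

A smaller comfort zone (for example, for in-car listening) could be modeled by lowering `upper_lu` and `lower_lu`; suitable values for that case are not given here.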

How to measure dynamic range: LRA

When producing audio for various environments, it would be great to have a target value for dynamic range (the difference between the smallest and largest signal values of an audio signal) as well. Then you could just set a dynamic range target, similar to a loudness target.

Theoretically, the maximum possible dynamic range of a production is defined by the bit-depth of the audio format. A 16-bit recording can have a dynamic range of 96 dB; for 24-bit, it's 144 dB - which is well above the approx. 120 dB the human ear can handle. However, most of those bits are typically being used to get to a reasonable base volume. Picture a glass of water: you want it to be almost full, with some headroom so that it doesn't spill when there's a sudden movement, i.e. a bigger amplitude wave at the top.
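The 96 dB and 144 dB figures follow from the ratio between the largest and smallest amplitude that a given bit-depth can represent (about 6 dB per bit). A quick check in Python, with a hypothetical helper name:

```python
import math


def theoretical_dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM audio with the given bit-depth."""
    # The ratio between the largest and smallest representable amplitude is 2**bits.
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(bits, "bit:", round(theoretical_dynamic_range_db(bits), 1), "dB")
```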

Determining the dynamic range of a production is easier said than done, though. It depends on which signals are included in the measurement: for example, if something like background music or breathing should be considered at all.
The currently preferred method for broadcasting is called Loudness Range, LRA. It is measured in Loudness Units (LU), and takes into account everything between the 10th and the 95th percentile of a loudness distribution, after an additional gating method. In other words, the loudest 5% and quietest 10% of the audio signal are being ignored. This way, quiet breathing or an occasional loud sound effect won't affect the measurement.
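The measurement can be sketched in Python as follows. This is a simplified illustration, not a compliant meter: it assumes that the short-term loudness values (in LUFS) have already been measured, and the gating thresholds (-70 LUFS absolute, -20 LU relative) reflect our reading of EBU Tech 3342 rather than anything stated in this article. The function name is hypothetical.

```python
import math


def loudness_range(short_term_lufs, abs_gate=-70.0, rel_gate=-20.0):
    """Estimate the LRA (in LU) from short-term loudness values in LUFS."""
    # Absolute gate: drop near-silent blocks.
    gated = [v for v in short_term_lufs if v >= abs_gate]
    if not gated:
        return 0.0
    # Relative gate: drop blocks far below the energy-averaged overall level.
    mean_power = sum(10 ** (v / 10) for v in gated) / len(gated)
    threshold = 10 * math.log10(mean_power) + rel_gate
    gated = sorted(v for v in gated if v >= threshold)
    # The LRA is the spread between the 10th and the 95th percentile.
    n = len(gated)
    low = gated[int((n - 1) * 0.10 + 0.5)]
    high = gated[int((n - 1) * 0.95 + 0.5)]
    return high - low


# Made-up data: silence, quiet breathing, and a program between -30 and -20 LUFS.
values = [-80.0, -60.0] + [-30.0 + 0.1 * i for i in range(101)]
print(loudness_range(values))
```

In this example, the silence is removed by the absolute gate and the breathing by the relative gate, so only the actual program material contributes to the result.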

Loudness distribution and LRA for the film 'The Matrix'. Figure from EBU Tech Doc 3343 (p.13).

However, the main difficulty is which signals should be included in the loudness range measurement and which ones should be gated. This is unfortunately often very subjective and difficult to define with a purely statistical method like LRA.

Where LRA falls short

Therefore, only pure speech programs give reliable LRA values that are comparable!
For instance, a typical LRA for news programs is 3 LU; for talks and discussions 5 LU is common. LRA values for features, radio dramas, movies or music very much depend on the individual character and might be in the range between 5 and 25 LU.

To further illustrate this, here are some typical LRA values, according to a paper by Thomas Lund (table 2):

Program                                           | Loudness Range (LU)
Matrix, full movie                                | 25.0
NBC Interstitials, Jan. 2008, all together (3:30) | 9.4
Friends Episode 16                                | 6.6
Speak Ref., Male, German, SQUAM Trk 54            | 6.2
Speak Ref., Female, French, SQUAM Trk 51          | 4.8
Speak Ref., Male, English, Sound Check            | 3.3
Wish You Were Here, Pink Floyd                    | 22.1
Gilgamesh, Battle of Titans, Osaka Symph.         | 19.7
Don’t Cry For Me Arg., Sinead O’Conner            | 13.7
Beethoven Son in F, Op17, Kliegel & Tichman       | 12.0
Rock’n Roll Train, AC/DC                          | 6.0
I.G.Y., Donald Fagen                              | 3.6

LRA values of music are very unpredictable as well.
For instance, Tom Frampton measured the LRA of songs in multiple genres, and the differences within each genre are quite big. The ten pop songs that he analyzed varied in LRA between 3.7 and 12 LU, country songs between 3.6 and 14.9 LU. In the Electronic genre the individual LRAs were between 3.7 and 15.2 LU. Please see the tables at the bottom of his blog post for more details.

We at Auphonic also tried to base our Adaptive Leveler parameters on the LRA descriptor. Although it worked, it turned out that it is very difficult to set a loudness range target for diverse audio content, which does include speech, background sounds, music parts, etc. The results were not predictable and it was hard to find good target values. Therefore we developed our own algorithm to measure the dynamic range of audio signals.

In conclusion, LRA comparisons are only useful for productions with spoken word only and the LRA value is therefore not applicable as a general dynamic range target value. The more complex a production gets, the more difficult it is to make any judgment based on the LRA.
This is because the definition of LRA is purely statistical. There's no smart measurement using classifiers that distinguish between music, speech, quiet breathing, background noises and other types of audio. One would need a more intelligent algorithm (as we use in our Adaptive Leveler) that knows which audio segments should be included in and excluded from the measurement.

From theory to application: tools

Loudness and dynamic range are clearly complicated topics. Luckily, there are tools that can help. To keep short-term loudness in range, a compressor can help control sudden changes in loudness - such as p-pops or consonants like t or k. To achieve a good mid-term loudness, i.e. a signal that doesn't go outside the comfort zone too much, a leveler is a good option. Or, just use a fader or manually adjust volume curves. And to make sure that separate productions sound consistent, loudness normalization is the way to go. We have covered all of this in-depth before.

Looking at the audio from above again, with an adaptive leveler applied it looks like this:

Leveler example. Output at the top, input with leveler envelope at the bottom.

Now, the voices are evened out and the music is at a comfortable level, while the breathing has not been touched at all.
We recently extended Auphonic's adaptive leveler, so that it is now possible to customize the dynamic range - please see adaptive leveler customization and advanced multitrack audio algorithms.
If you wanted to increase the loudness comfort zone (or dynamic range) of the standard preset by 10 dB (or LU), for example, the envelope would look like this:

Leveler with higher dynamic range, only touching sections with extremely low or extremely high loudness to fit into a specific loudness comfort zone.

When a production is done, our adaptive leveler uses classifiers to also calculate the integrated loudness and loudness range of dialog and music sections separately. This way it is possible to just compare the dialog LRA and loudness of complex productions.

Assessing the LRA and loudness of dialog and music separately.

Conclusion

Getting audio dynamics right is not easy. Yet, it is an important thing to keep in mind, because focusing on loudness normalization alone is not enough. In fact, hitting the loudness target often has less impact on the listening experience than level differences, i.e. audio dynamics.

If the dynamic range is too small, the audio can be tiring to listen to, whereas a bigger dynamic range can make a program more interesting, but might not work in loud environments, such as a noisy train.
Therefore, a good mix adapts the audio dynamic range according to the target listening environment (different loudness comfort zones in cinema, at home, in a car) and according to the nature of the content (radio feature, movie, podcast, music, etc.).

Furthermore, because the definition of the loudness range / LRA is purely statistical, only speech programs give reliable LRA values that are comparable.
More "intelligent" algorithms are in development, which use classifiers to decide which signals should be included and excluded from the dynamic range measurement.

If you understand German, take a look at our presentation about audio dynamic processing in podcasts for further information.







audi

Remix and make music with audio from the Library of Congress

Brian Foo is the current Innovator-in-Residence at the Library of Congress. His latest…





audi

The State – Sort of – of HTML5 Audio

Scott Schiller discusses the high level of hype around HTML5 and CSS3. The two specs render “many years of feature hacks redundant by replacing them with native features,” he writes in an insightful blog post. Blogging, he says: CSS3’s border-radius, box-shadow, text-shadow and gradients, and HTML5’s <canvas>, ...




audi

Hooked: How to engage your website audience in one second or less

You have less than one second to make the right impression. Almost immediately after landing on your website users will make an uninformed, mostly subconscious judgment about what type of organization they’re interacting with. This initial judgment will largely be influenced by layout, design, and visual tone. It will not only influence the rest of […]





audi

Freebie: 264 Vector Audio DJ Pack Icons

Icon packs are among the most desirable freebies around. There are several out there, covering a wide array of topics from user interfaces to personal finance. But sometimes you can find some rather unusual but clever additions to the icons universe. This Vector Audio DJ Pack is a nice example, brought to you exclusively …





audi

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. (arXiv:2005.03271v1 [eess.AS])

In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched domains: e.g., end-to-end models trained on short segments perform poorly when evaluated on longer utterances. In this work, we analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models in order to identify model components that negatively affect generalization performance. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference. On a long-form YouTube test set, when the non-streaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%. Finally, when trained on Librispeech, we find that dynamic overlapping inference improves WER on YouTube from 99.8% to 33.0%.




audi

Email administration for rendering email on a digital audio player

Methods, systems, and computer program products are provided for email administration for rendering email on a digital audio player. Embodiments include retrieving an email message; extracting text from the email message; creating a media file; and storing the extracted text of the email message as metadata associated with the media file. Embodiments may also include storing the media file on a digital audio player and displaying the metadata describing the media file, the metadata containing the extracted text of the email message.
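
The steps listed in the abstract (retrieve, extract text, create a media file, store the text as metadata) can be sketched with Python's standard email package. This is only an illustration under assumptions: the MediaFile class, the metadata keys "title" and "comment", and the function name are hypothetical and do not come from the patent, and no real audio container is written.

```python
from dataclasses import dataclass, field
from email import message_from_string


@dataclass
class MediaFile:
    """Hypothetical stand-in for a media file with a metadata table."""
    filename: str
    metadata: dict = field(default_factory=dict)


def email_to_media_file(raw_email: str, filename: str) -> MediaFile:
    # "Retrieving" is assumed to have produced the raw message text.
    msg = message_from_string(raw_email)

    # Extract text: use the first text/plain part of a multipart message,
    # or the payload itself for a simple message.
    if msg.is_multipart():
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        text = parts[0].get_payload() if parts else ""
    else:
        text = msg.get_payload()

    # Create the media file and store the extracted text as metadata,
    # so a player that displays metadata would show the email body.
    media = MediaFile(filename)
    media.metadata["title"] = msg.get("Subject", "")
    media.metadata["comment"] = text.strip()
    return media
```

A player that only displays tag fields could then show the subject and body without any email client on the device.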




audi

Method for classifying audio signal into fast signal or slow signal

Low bit rate audio coding, such as a BWE algorithm, often encounters the conflicting goals of achieving high time resolution and high frequency resolution at the same time. To achieve the best possible quality, the input signal can first be classified as a fast signal or a slow signal. This invention focuses on classifying a signal as fast or slow, based on at least one of the following parameters or a combination of them: spectral sharpness, temporal sharpness, pitch correlation (pitch gain), and/or spectral envelope variation. This classification information can help to choose different BWE algorithms, different coding algorithms, and different postprocessing algorithms for fast and slow signals, respectively.
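
As a rough illustration of classifying on one of the listed parameters, the sketch below uses temporal sharpness only. The abstract does not define the parameter, so the definition here (peak sub-frame energy divided by mean sub-frame energy), the sub-frame count, and the threshold of 3.0 are all assumptions made for the example.

```python
def temporal_sharpness(frame, num_subframes=8):
    """Assumed definition: peak sub-frame energy over mean sub-frame energy."""
    n = len(frame) // num_subframes
    energies = [sum(x * x for x in frame[i * n:(i + 1) * n])
                for i in range(num_subframes)]
    mean_energy = sum(energies) / num_subframes
    if mean_energy == 0.0:
        return 1.0  # silent frame: treat the energy as perfectly flat
    return max(energies) / mean_energy


def classify_frame(frame, threshold=3.0):
    """Label a frame 'fast' when its energy is concentrated in time."""
    return "fast" if temporal_sharpness(frame) > threshold else "slow"
```

A frame containing a single click concentrates its energy in one sub-frame and is labelled fast, while a steady tone spreads energy evenly and is labelled slow.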




audi

Apparatus for processing an audio signal and method thereof

An apparatus for processing an audio signal and method thereof are disclosed. The present invention includes receiving a downmix signal and side information; extracting control restriction information from the side information; receiving control information for controlling gain or panning of at least one object signal; generating at least one of first multi-channel information and first downmix processing information based on the control information and object information, without using the control restriction information; and generating an output signal by applying the at least one of the first multi-channel information and the first downmix processing information to the downmix signal, wherein the control restriction information relates to a parameter indicating a limiting degree of the control information.




audi

Sparse audio

A method comprising: sampling received audio at a first rate to produce a first audio signal; transforming the first audio signal into a sparse domain to produce a sparse audio signal; re-sampling the sparse audio signal to produce a re-sampled sparse audio signal; and providing the re-sampled sparse audio signal, wherein bandwidth required for accurate audio reproduction is removed but bandwidth required for spatial audio encoding is retained; and/or a method comprising: receiving a first sparse audio signal for a first channel; receiving a second sparse audio signal for a second channel; and processing the first sparse audio signal and the second sparse audio signal to produce one or more inter-channel spatial audio parameters.




audi

Audio controlling apparatus, audio correction apparatus, and audio correction method

According to one embodiment, an audio controlling apparatus includes a first receiver configured to receive an audio signal, a second receiver configured to receive environmental sound, a temporary gain calculator configured to calculate a temporary gain based on the environmental sound received by the second receiver, a sound type determination module configured to determine the sound type of the main component of the audio signal received by the first receiver, and a gain controller configured to stabilize the temporary gain calculated by the temporary gain calculator and set the gain when it is determined that the sound type of the main component of the audio signal received by the first receiver is music.
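
The control flow described above can be sketched as follows. The abstract does not say how the temporary gain is computed or how it is stabilized, so the linear noise-to-gain mapping, the exponential smoothing, and every constant below are assumptions for illustration only.

```python
def temporary_gain_db(env_level_db, reference_db=50.0, slope=0.5, max_gain_db=12.0):
    """Assumed mapping: louder surroundings ask for more gain, within limits."""
    return min(max((env_level_db - reference_db) * slope, 0.0), max_gain_db)


class GainController:
    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # closer to 1.0 means slower gain changes
        self.gain_db = 0.0

    def update(self, env_level_db, sound_type):
        target = temporary_gain_db(env_level_db)
        if sound_type == "music":
            # Stabilize: move only a fraction of the way toward the
            # temporary gain so that music does not audibly pump.
            self.gain_db = (self.smoothing * self.gain_db
                            + (1.0 - self.smoothing) * target)
        else:
            # Other content follows the temporary gain directly.
            self.gain_db = target
        return self.gain_db
```

With these assumed constants, a jump in environmental noise changes the gain immediately for speech but only gradually for music.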




audi

Method and apparatus for processing audio frames to transition between different codecs

A method (700, 800) and apparatus (100, 200) process audio frames to transition between different codecs. The method can include producing (720), using a first coding method, a first frame of coded output audio samples by coding a first audio frame in a sequence of frames. The method can include forming (730) an overlap-add portion of the first frame using the first coding method. The method can include generating (740) a combination first frame of coded audio samples based on combining the first frame of coded output audio samples with the overlap-add portion of the first frame. The method can include initializing (760) a state of a second coding method based on the combination first frame of coded audio samples. The method can include constructing (770) an output signal based on the initialized state of the second coding method.




audi

Audio encoder, audio decoder, methods for encoding and decoding an audio signal, and a computer program

An encoder for providing an audio stream on the basis of a transform-domain representation of an input audio signal includes a quantization error calculator configured to determine a multi-band quantization error over a plurality of frequency bands of the input audio signal for which separate band gain information is available. The encoder also includes an audio stream provider for providing the audio stream such that the audio stream includes information describing an audio content of the frequency bands and information describing the multi-band quantization error. A decoder for providing a decoded representation of an audio signal on the basis of an encoded audio stream representing spectral components of frequency bands of the audio signal includes a noise filler for introducing noise into spectral components of a plurality of frequency bands to which separate frequency band gain information is associated on the basis of a common multi-band noise intensity value.




audi

Multi-resolution switched audio encoding/decoding scheme

An audio encoder for encoding an audio signal has a first coding branch, the first coding branch comprising a first converter for converting a signal from a time domain into a frequency domain. Furthermore, the audio encoder has a second coding branch comprising a second time/frequency converter. Additionally, a signal analyzer for analyzing the audio signal is provided. The signal analyzer, on the one hand, determines whether an audio portion is effective in the encoder output signal as a first encoded signal from the first encoding branch or as a second encoded signal from a second encoding branch. On the other hand, the signal analyzer determines a time/frequency resolution to be applied by the converters when generating the encoded signals. An output interface includes, in addition to the first encoded signal and the second encoded signal, resolution information identifying the resolution used by the first time/frequency converter and by the second time/frequency converter.




audi

Audio signal decoder, time warp contour data provider, method and computer program

An audio signal decoder has a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is configured to generate time warp contour data repeatedly restarting from a predetermined time warp contour start value, based on time warp contour evolution information describing a temporal evolution of the time warp contour. The time warp contour data rescaler is configured to rescale at least a portion of the time warp contour data such that a discontinuity at a restart is avoided, reduced or eliminated in a rescaled version of the time warp contour. The warp decoder is configured to provide the decoded audio signal representation, based on an encoded audio signal representation and using the rescaled version of the time warp contour.
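
One way to picture the rescaling step is sketched below. It assumes the evolution information is a list of relative ratios applied multiplicatively from the start value, and that the previous contour portion is scaled by a single factor so that its last value equals the start value of the restarted portion. Both assumptions, and the function names, are illustrative and are not taken from the patent.

```python
def contour_segment(ratios, start=1.0):
    """Build one contour portion from relative changes, restarting at `start`."""
    values = [start]
    for ratio in ratios:
        values.append(values[-1] * ratio)
    return values


def rescale_to_join(previous, start=1.0):
    """Scale the previous portion so that it ends exactly at `start`.

    The shape (all ratios between neighbouring values) is preserved, and
    the next portion, which restarts at `start`, joins without a jump.
    """
    scale = start / previous[-1]
    return [value * scale for value in previous]
```

For example, a portion that rose from 1.0 to 1.21 is scaled down so that it ends at 1.0, where the restarted portion begins.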




audi

Apparatus and method for encoding and decoding an audio signal using an aligned look-ahead portion

An apparatus for encoding an audio signal having a stream of audio samples has: a windower for applying a prediction coding analysis window to the stream of audio samples to obtain windowed data for a prediction analysis and for applying a transform coding analysis window to the stream of audio samples to obtain windowed data for a transform analysis, wherein the transform coding analysis window is associated with audio samples within a current frame of audio samples and with audio samples of a predefined portion of a future frame of audio samples being a transform-coding look-ahead portion, wherein the prediction coding analysis window is associated with at least the portion of the audio samples of the current frame and with audio samples of a predefined portion of the future frame being a prediction coding look-ahead portion, wherein the transform coding look-ahead portion and the prediction coding look-ahead portion are identical to each other or are different from each other by less than 20%; and an encoding processor for generating prediction coded data or for generating transform coded data.