Our Technology & Engineering Emmy® Award N° 2: Using AI to Optimize Video Compression

Earlier this year, ATEME made history by winning three Technology & Engineering Emmy® awards from The National Academy of Television Arts & Sciences (NATAS) for its achievements in video compression and encoding for media delivery and distribution. In this blog series, I provide an overview of the technologies that were honored with these awards and explain how they benefit our customers. Last week we looked at the award for the development of perceptual metrics for video compression. This week, I’ll talk about the award for the use of artificial intelligence (AI) to optimize video compression.

Traditionally, video encoders were configured in constant bitrate (CBR) mode: they were expected to produce a constant bitrate output, regardless of the scene-to-scene variations in the content. This was especially true for Over-The-Top (OTT) streaming, where an adaptive bitrate (ABR) ladder was required to stream the content over a non-managed network. With ABR, a given piece of content is encoded at multiple bitrates and resolutions. A player device can then dynamically switch between these variants depending on available bandwidth.
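To make the switching logic concrete, here is a minimal Python sketch of how a player might choose a variant. The ladder values, the names `Variant` and `pick_variant`, and the 0.8 safety margin are all assumptions for illustration; real players typically also weigh buffer level and throughput history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    width: int
    height: int
    bitrate_kbps: int


# Hypothetical ABR ladder: the same content at three resolutions and bitrates.
LADDER = [
    Variant(640, 360, 800),
    Variant(1280, 720, 2500),
    Variant(1920, 1080, 5000),
]


def pick_variant(ladder, bandwidth_kbps, safety=0.8):
    """Return the highest-bitrate variant that fits the measured bandwidth.

    Only a fraction (`safety`) of the measured bandwidth is used as budget,
    which is an assumed heuristic. If nothing fits, fall back to the
    lowest-bitrate variant.
    """
    budget = bandwidth_kbps * safety
    fitting = [v for v in ladder if v.bitrate_kbps <= budget]
    if not fitting:
        return min(ladder, key=lambda v: v.bitrate_kbps)
    return max(fitting, key=lambda v: v.bitrate_kbps)


if __name__ == "__main__":
    for bandwidth in (500, 4000, 10000):
        chosen = pick_variant(LADDER, bandwidth)
        print(bandwidth, "kbps ->", chosen.height, "p at", chosen.bitrate_kbps, "kbps")
```

The player would re-run this selection as its bandwidth estimate changes, which is what makes the streaming "adaptive".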

The problem with CBR encoding is that it leads to variable video quality! Video compression techniques are designed to signal, with as few bits as possible, the amount of new information from one frame to another. Depending on the complexity of a scene (think about a fast-action sports game vs. a talk show), that information content can be highly variable. So, by nature, video compression results in a variable bitrate if you want to maintain a constant quality.
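The trade-off can be illustrated with a toy rate-quality model in Python. The formula, the complexity values, and the function names are assumptions made up for this sketch; they are not a real codec model and not how any particular encoder measures quality.

```python
def toy_quality(bitrate_kbps, complexity):
    """Toy model (assumption): quality rises with bitrate, falls with complexity.

    Returns a score between 0 and 100.
    """
    return 100.0 * bitrate_kbps / (bitrate_kbps + complexity)


def bitrate_for_quality(quality, complexity):
    """Inverse of toy_quality: bitrate needed to reach a target score below 100."""
    return complexity * quality / (100.0 - quality)


# Hypothetical scene complexities: a talk show is much simpler than sports.
scenes = {"talk show": 500.0, "sports": 4000.0}

# Constant bitrate: same 3000 kbps for every scene, so quality varies.
cbr_quality = {name: toy_quality(3000.0, c) for name, c in scenes.items()}

# Constant quality: same target score of 80, so bitrate varies.
cq_bitrate = {name: bitrate_for_quality(80.0, c) for name, c in scenes.items()}

if __name__ == "__main__":
    print("CBR quality per scene:", cbr_quality)
    print("Constant-quality bitrate per scene:", cq_bitrate)
```

Under this toy model, the fixed bitrate gives the sports scene a much lower score than the talk show, while holding the score fixed requires several times more bits for sports.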

Which brings us to how you measure quality, and hence to the AQI, the perceptual quality metric. With AQI, ABR ladders can be constructed more efficiently, particularly when they are combined with ATEME’s intelligent scene change detection algorithms, which align the fragment boundaries of ABR encodings with scene change boundaries. This has the added benefit of minimizing the use of I-frames. These techniques result in fewer variants and/or lower bitrates for each variant. That means up to 50% less storage space and less CDN bandwidth usage, as well as better use of CDN edge caches, resulting in fewer unwanted pauses on the player.
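Why does aligning fragment boundaries with scene changes save I-frames? The following Python sketch counts I-frames under a simplifying assumption: the encoder places one I-frame at every fragment start and one at every scene change. The function names, frame numbers, and fragment length are hypothetical, and this is not ATEME's algorithm, only an illustration of the idea.

```python
def fixed_boundaries(n_frames, frag_len):
    """Fragment starts on a fixed grid, ignoring the content."""
    return list(range(0, n_frames, frag_len))


def aligned_boundaries(n_frames, scene_changes, max_len):
    """Fragment starts at scene changes; long scenes are split every max_len frames."""
    cuts = sorted({0, *(s for s in scene_changes if 0 < s < n_frames)})
    boundaries = []
    for start, end in zip(cuts, cuts[1:] + [n_frames]):
        boundaries.extend(range(start, end, max_len))
    return boundaries


def count_i_frames(boundaries, scene_changes):
    """Assumption: one I-frame per fragment start and one per scene change."""
    return len(set(boundaries) | set(scene_changes) | {0})


# Hypothetical 300-frame clip with three scene changes.
N_FRAMES = 300
SCENE_CHANGES = [70, 130, 250]

fixed = fixed_boundaries(N_FRAMES, 60)
aligned = aligned_boundaries(N_FRAMES, SCENE_CHANGES, 60)

if __name__ == "__main__":
    print("fixed grid:", count_i_frames(fixed, SCENE_CHANGES), "I-frames")
    print("scene-aligned:", count_i_frames(aligned, SCENE_CHANGES), "I-frames")
```

With a fixed grid, scene-change I-frames come on top of the fragment-start I-frames. When the boundaries sit on the scene changes, the same I-frame serves both purposes.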

But the use of the AQI is not limited to OTT distribution. The AQI is particularly attractive for use in statistical multiplexers. Multiple channels, using different codecs and encoded at different resolutions, can be statistically multiplexed together while being encoded at very similar visual quality, thanks to the AQI being incorporated directly into the encoder design. Each scene for each input is analyzed to derive an AQI measure. An optimization algorithm then allocates bits to each input source so that visual quality, as measured by the AQI, is similar across all channels, while remaining within the bit budget of the output channel.
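The allocation step can be sketched in Python as a search for the highest common quality level that fits the bit budget. Everything here is an assumption for illustration: the toy rate-quality formula stands in for the AQI (whose actual definition is not given in this article), and the function names and numbers are hypothetical.

```python
def rate_for_quality(quality, complexity):
    """Toy model (assumption): bitrate needed to reach a score between 0 and 100."""
    return complexity * quality / (100.0 - quality)


def allocate(complexities, budget_kbps, iterations=60):
    """Bisect on a shared quality target so the total bitrate fits the budget.

    Returns the common quality score and the per-channel bitrates.
    """
    lo, hi = 0.0, 100.0  # lo always fits the budget; hi never does
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        total = sum(rate_for_quality(mid, c) for c in complexities)
        if total > budget_kbps:
            hi = mid
        else:
            lo = mid
    return lo, [rate_for_quality(lo, c) for c in complexities]


if __name__ == "__main__":
    # Hypothetical per-channel scene complexities and a 9000 kbps output channel.
    quality, rates = allocate([1000.0, 3000.0, 500.0], 9000.0)
    print("common quality:", round(quality, 2))
    print("rates (kbps):", [round(r) for r in rates])
```

Bisection only relies on the needed bitrate increasing with the quality target, so the toy formula could be swapped for any measured rate-quality curve. A real multiplexer would rerun the allocation as the scenes on each input change.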

Another use of the AQI perceptual quality metric relates to the third Emmy award, for the use of massive processing resources to optimize video compression, which we will look into next week in the final article in this series. Stay tuned!