NAB Panel Looks at Next Generation Codecs for 8K
At the NAB 2023 event, the 8K Association arranged a panel with the title ‘Next-Generation Video Codecs for 8K’. The panel was moderated by Ravindra (‘Ravi’) Velhal of Intel Research. Intel has been very involved, of course, as the supplier of much of the processor power for the codecs. If you’d like to watch the full panel discussion, check this video on the 8K Association YouTube channel.
The panel was made up of
- Thomas Burnichon – VP Innovation Strategy – Ateme
- Thomas Kramer – VP Strategy and Business Development – MainConcept
- Mauricio Alvarez-Mesa – CEO & Co-founder at Spin Digital – Spin Digital and
- Ali Jerbi – Sr. Director Product Management – SSIMWAVE
Velhal opened the session by highlighting that 100 years ago NAB was about radio, not video. Operations, royalties and spectrum were the main topics then, and they are all still topics of discussion today.
A Special Effort for the Olympics
A special effort was made to broadcast the 2020 Tokyo Olympics, held in 2021, in 8K at 60 frames per second with HDR, using the HEVC codec. The stream was delivered to selected locations around the world. The ‘glass-to-glass’ latency from the camera to the TV set was less than two seconds where the RTP protocol was used, and 14 seconds using HLS. For more on this topic see our previous article.
There are four parts to the media workflow: Capture, Compress, Distribute and Display. The panel focused on just the Compress stage. After capture, there was some compression for production, followed by more compression for distribution. The 8K content was so clear, Velhal said, that you could not only see when one of the judo competitors dropped a contact lens – you could see the markings on the lens! There was a true sense of ‘real life immersion’: you get even more than the stadium experience, but at home.
Burnichon introduced Ateme, a leading provider of compression services for broadcast. The company started in 8K in 2007 when, working with NHK, it daisy-chained 16 AVC encoders, which produced really good results in 8K. However, that kind of solution was not scalable. In 2012, the firm started the design of an in-house compression stack, so that when HEVC came out in 2013 the group was ready to ‘crank it up to 8K’.
8K Display Arrival Took Time
Of course, it took time for 8K displays to arrive in the home, so Ateme ‘took a small detour’ into 4K and learned all about HDR and ensuring the ecosystem was ready for even more resolution. By 2019, with the arrival of TVs and the founding of the 8K Association, the firm was showing 8K demos at NAB. The firm was then ready to deliver 8K HEVC compression at scale. The arrival of VVC in 2020 moved ‘one notch higher’ with the potential of 50% further bitrate reductions. Ateme worked to launch the first OTT channel with The Explorers.
The company has since looked at other codecs, but it also discovered that film-based content from 30 or 40 years ago can still shine on a consumer TV when re-scanned properly and well encoded. While 8K content is not yet widely distributed, Burnichon said, this use of scanned film means that lots of 8K content is available.
Thomas Kramer introduced MainConcept which has technology for codecs in all parts of the workflow. Like Ateme, MainConcept started with 8K when HEVC arrived as it was the first codec with enough quality and efficiency for 8K. The firm has created offline and streamed content with HEVC in 8K. The company started with VVC last year and is now bringing its VVC technology to market. VVC has been deployed as an 8K native codec and at NAB MainConcept was demonstrating live 8K 60P encoding. The adoption will take time, as always, but ‘that, for sure, will happen’.
“That (8K) will, for sure, happen” – Thomas Kramer – MainConcept
Ali Jerbi of SSIMWAVE explained the firm’s perceptual quality metric and how it can be used to QC VOD content. SSIMWAVE can also supply technology to track quality through the workflow. The firm was acquired by IMAX, which wanted to help streaming service providers understand the quality they are actually delivering to the subscriber; the aim of IMAX is to bring a cinematic experience to the home.
Mauricio Alvarez-Mesa’s Spin Digital firm specializes in very high quality encoding and in particular has worked on live production of high quality content. The firm now believes that good quality can be delivered at 30-50 or 60 Mbps, which is equivalent to a compression ratio of around 1,000:1. You have to compress without losing the ‘8K effect’, which depends on resolution, HDR and the wide field of view. That requires even more technology, including content-aware encoding, perceptual optimization and advanced rate control. With these technologies you can get broadcast quality at low bit rates.
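As a sanity check on that 1,000:1 figure, the arithmetic below assumes an uncompressed 8K source at 60 fps with 30 bits per pixel (10-bit, 4:4:4) – format assumptions that are mine, not the panel’s. At a 60 Mbps delivery rate, the ratio comes out very close to 1,000:1:

```python
# Sanity check of the ~1000:1 compression ratio quoted for 8K delivery.
# Assumed source format: 8K (7680x4320), 60 fps, 10-bit 4:4:4 (30 bits/pixel).
width, height, fps, bits_per_pixel = 7680, 4320, 60, 30

raw_bps = width * height * fps * bits_per_pixel  # uncompressed bitrate, bits/s
encoded_bps = 60e6                               # 60 Mbps delivery bitrate

ratio = raw_bps / encoded_bps
print(f"Raw: {raw_bps / 1e9:.1f} Gbit/s, ratio = {ratio:.0f}:1")  # → 59.7 Gbit/s, 995:1
```

A 4:2:2 production source (20 bits/pixel) would give a smaller ratio of roughly 660:1, so the quoted figure implies full-resolution color.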
The most recent step has been to bring all these techniques to real-time encoding. That’s where Spin Digital is now concentrating its attention. (See this article for a deeper dive into Spin Digital’s use of VVC.)
Consumers Have More Streaming Bandwidth
Velhal highlighted that consumers increasingly have a Gigabit or more of bandwidth available for streaming, so live very high quality streaming is quite achievable – it’s not as though this kind of quality is only possible in laboratory conditions. He asked the panel about the impact of reducing bit rates – what is the threshold? Burnichon responded that, more than resolution, bit rate is the limiting factor. If you just watch 8K on YouTube, what you see will at some point not be ‘true’ 8K. If the bit rate is limited, simply ‘throwing more pixels’ at the viewer won’t help. If you do have enough bandwidth available, it does make sense to ramp up the resolution.
Alvarez-Mesa pointed out that ‘broadcast quality’ is always the benchmark and that 8K and low quality should not be combined. Either you stay at high quality or you drop to a lower resolution, in his view. Like Velhal, he sees the metrics from companies such as SSIMWAVE as critical to the process, especially after perceptual optimization, where other metrics fail.
Never Forget the ‘Wow’ Effect of 8K
Kramer agreed that unless there’s a ‘Wow effect’ there’s no sense in having a new codec. Velhal reinforced this point. If you can maintain the quality and frame rate seen in 4K when you boost the resolution, it has a real value. Jerbi explained that much depends on the complexity of the scene being captured. SSIMWAVE is working on being able to use its metric to support optimization and was demonstrating that at the NAB show. At the moment that optimization is for file-based encoding but the hope is to get to live encoding.
Velhal asked the panel about perceptual quality during encoding, and Alvarez-Mesa clarified that the challenge is that, when encoding live, you have very little time to make decisions about how to proceed. You have to look at the complexity of the scene (it could be live action sports, for example) and set the target bitrate suitably to maintain the desired quality. You can use a model of the human visual system to make better decisions about which information to drop – there is no point in sending bits that the viewer can’t see. The process can look at the scene and decide which parts are most likely to generate visible artifacts.
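The scene-dependent bit allocation Alvarez-Mesa describes can be sketched as a toy function. This is purely illustrative – the function name, clamp range and complexity scores are invented here, not Spin Digital’s algorithm: complex segments get proportionally more bits, clamped to a sensible range, with the average renormalized back onto the target bitrate.

```python
# Illustrative sketch (not any vendor's actual algorithm): content-aware
# rate control gives complex segments more bits while keeping the
# average bitrate on target.
def allocate_bitrates(complexities, target_mbps, floor=0.5, ceil=2.0):
    """Scale each segment's bitrate by its relative complexity, clamped
    to [floor, ceil] x target, then renormalize so the mean stays on target."""
    mean_c = sum(complexities) / len(complexities)
    raw = [min(max(c / mean_c, floor), ceil) * target_mbps for c in complexities]
    scale = target_mbps * len(raw) / sum(raw)  # keep the average on target
    return [r * scale for r in raw]

# Example: a calm interview shot, a mid-action shot, fast sports action.
print([round(b, 1) for b in allocate_bitrates([0.2, 1.0, 2.5], 40.0)])
# → [18.1, 29.4, 72.5]  (average stays at 40 Mbps)
```

A real encoder would derive the complexity scores from the video itself (motion, texture, a visual-system model) rather than take them as inputs.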
Ateme believes that the same kind of decision is important for VOD as well and it’s critical to have a measure of the final image quality. Although you still need to be efficient, Burnichon said that you could use the extra time to make better decisions. Having the time to analyze the complete asset is really important in optimizing the image. While the aim may be to get to broadcast quality, it is up to the customer to choose if content will be delivered as VOD or ‘over-the-top’ (OTT). Either way, you need pristine quality that is true to the artistic intent.
Sports & Concerts are Showpieces
Kramer said that sports events and concerts were often chosen to showcase new technologies and that often means that encoding has to be in real time. There are often discussions with codec engineers to clarify whether measurable improvements in metrics actually equate to visible differences for the viewer. Responding to a comment from Jerbi about the desire to bring IMAX quality to the home, Velhal emphasized the difference between the cinematic world and live broadcasting. In live broadcasting, on the technology side ‘there is very little forgiveness’ if something doesn’t look right. On the other hand, if you offer a better service such as 8K and the consumer can’t see much difference, it can be hard to change that impression later. Velhal summarized the panel as agreeing that ‘quality should not be compromised at any cost’.
Speed vs Efficiency
Velhal asked about the trade-offs between the speed and efficiency of encoding. Kramer said that with ‘two pass’ encoding you could do a better job and tune the compression to a ‘super-high level’. Live encoding pushes every cycle of the processor to the maximum and it’s good that there are always new processors coming along.
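The advantage of two-pass encoding that Kramer mentions can be shown with a toy comparison (invented numbers, not a real encoder): a live one-pass encoder with no lookahead has to spread its budget evenly, while a two-pass encoder uses the first pass’s complexity measurements to concentrate bits where the content is hardest.

```python
# Toy illustration of why two-pass encoding can outperform one-pass.
# The complexity values and budget are hypothetical.
scene_complexity = [1, 1, 5, 1, 1]  # per-segment "difficulty" from pass 1

budget = 50  # total Mbit available for the whole clip

# One-pass (no lookahead): spend the budget evenly.
one_pass = [budget / len(scene_complexity)] * len(scene_complexity)

# Two-pass: the second pass spends the same budget in proportion
# to the measured complexity.
total_c = sum(scene_complexity)
two_pass = [budget * c / total_c for c in scene_complexity]

print("one-pass:", one_pass)  # every segment gets 10 Mbit
print("two-pass:", two_pass)  # the hard segment gets ~27.8 Mbit
```

The total spend is identical; only the distribution changes, which is why two-pass can hit a ‘super-high level’ of tuning that live encoding cannot.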
Velhal said that there are three approaches to encoding – scalar, vector and metrics and you can use a CPU or a GPU. He then asked the panel to look forward to new developments.
Jerbi said that with the move from full HD to 4K, the focus was all on resolution, and that proved not to be the best course; a simultaneous move from SDR to HDR was also needed. 8K has to deliver a more immersive experience than 4K HDR. Alvarez-Mesa agreed and said that just boosting the spatial resolution on its own is not enough. You need HDR, wide color gamut (WCG) and high frame rate (HFR) to deliver an overall more compelling experience. If you miss out on one of these, for example with just 30fps, you cannot have the impact you need. Codecs such as VVC are so complex that it is hard to deliver the full 50% bitrate reduction that is hoped for in live encoding.
Burnichon agreed that the adoption of VVC will be gradual as tool sets are developed. He also believes that 8K will be important for new methods of content consumption which may not have been developed yet, but which will move towards more immersion.
The first audience question was about codecs in content capture, as the panel had concentrated on codecs for distribution. Kramer replied that MainConcept is working on that part of the chain and has the encoders to ingest 8K content for production and editing, where multiple streams are likely to be used. There are already cameras with 8K RAW capture and 8K compressed with HEVC, and the firm is working with the camera makers.
In closing comments, Burnichon said that there is a growing amount of content available, and the delivery mechanisms and displays are becoming available, so don’t hesitate to try 8K production. Alvarez-Mesa said that in the codec business, you only really notice the codec when it goes wrong; when the job is done well, it’s invisible. Kramer said that MainConcept is also running its live encoding in the cloud – and that could have been a topic for another panel session! Jerbi said that his firm was also demonstrating its metrics.
Velhal concluded by saying that during the Tokyo Olympics more than 4.7 petabytes of data were captured and stored to create 300 terabytes of video for display. The amount of data involved was massive.
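A quick check of those figures shows the scale of the capture-to-programme reduction, before any distribution encoding is even applied:

```python
# Quick check of the Tokyo Olympics figures quoted above.
captured_bytes = 4.7e15    # 4.7 petabytes captured and stored
delivered_bytes = 300e12   # 300 terabytes of finished video for display
ratio = captured_bytes / delivered_bytes
print(f"Capture-to-programme reduction: {ratio:.1f}:1")  # → 15.7:1
```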
Thanks to Geoff Gordon of MainConcept for the photos.