Spin Digital Fills in on Live VVC Encoding
Last month, we reported on the release of Spin Digital’s live encoding technology and looked at issues of codecs in general. However, VVC is a significantly different codec from previous generations, so we were pleased to get the opportunity to speak to Mauricio Alvarez Mesa, CEO of the firm, to understand more. The firm has published a substantial white paper that really digs into the weeds, but we wanted to get a higher-level view.
We started by asking about the improvements in the bit rates needed using VVC compared to HEVC. The press release highlighted savings of 26% for 8K, but just 17.5% on 4K content. The big announcements about new codecs always talk about 50% reductions in bit rate, so were these numbers disappointing?
Alvarez Mesa explained that this is a function of the live encoding and of the significantly increased complexity of the codec generations. For example, H.264 used just 16 x 16 pixel blocks for partitioning an image. VVC on the other hand has many ways to partition the image with areas up to 128 pixels square and with different shapes. The codec has to identify the best way to partition and that is complex and also dependent on the content. Further, there are more than 100 different coding tools that could be used to compress the image and many of them are very complex and only yield a small saving. However, if you use all of them, you can get to the 50% target.
He also pointed out that the ‘search space’, the process of deciding which tools to use for a particular piece of video, can also be very complex. As a result, up to 50X the computation power is needed to get a 50% reduction compared to HEVC/H.265. In a follow-up question, he clarified that his technology uses machine learning (statistical analysis) to help identify the optimum tools to use, based on the content of the video. VVC is simply too complex for a ‘brute force’ approach, which was possible with earlier technologies.
While you can get improvements with earlier codecs by ‘throwing more CPU power’ at the problem, there is a plateauing effect. As you can see from the chart below, taken from the full white paper, HEVC compression doesn’t improve much with more processor power, while VVC and the less efficient AV1 can continue to improve. The chart shows that with 50X the processing power used for real-time encoding of HEVC (the baseline), AV1 only achieves a 37.5% reduction in bit rate while VVC achieves 50%.
In the chart, Spin Digital identifies that the area for real time performance is really from the baseline of HEVC to around 3X the processor power needed. The firm came up with that range based on detailed discussions with processor makers, Intel and AMD. In particular, new fourth generation Xeon processors from Intel, announced in January, give a significant boost with more cores as well as higher clock speeds and also with new multimedia instructions that can help boost performance. At the moment, the firm is waiting for test systems to get through the supply chain, but Alvarez Mesa said that Intel had been helpful in working ahead with the firm to allow it to anticipate the new hardware in its software. The firm also works with AMD, but typically AMD implements new multimedia instructions after their introduction by Intel, so it tends to lag the leading edge.
Having established what level of computation was going to be available, the development team optimised the choice of coding tools to fit into the ‘performance budget’.
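To make the idea of a ‘performance budget’ concrete, here is a minimal sketch of choosing tools greedily by saving per unit of compute until the budget is used up. This is our own illustration, not Spin Digital’s method: the tool names, costs and savings are hypothetical, and the budget of 2X extra compute simply reflects the article’s range from the HEVC baseline to around 3X. It also assumes savings from different tools can be compared independently, which real encoders cannot take for granted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingTool:
    name: str      # hypothetical label, not a real VVC tool name
    cost: float    # extra compute, as a multiple of the baseline encoder's cost
    saving: float  # assumed bit-rate saving in percentage points


def pick_tools(tools, budget):
    """Greedy selection: best saving per unit of compute first, while it fits."""
    chosen = []
    spent = 0.0
    ranked = sorted(tools, key=lambda t: t.saving / t.cost, reverse=True)
    for tool in ranked:
        if spent + tool.cost <= budget:
            chosen.append(tool)
            spent += tool.cost
    return chosen, spent


# Purely illustrative numbers (assumptions, not measurements).
TOOLS = [
    CodingTool("tool_a", cost=0.5, saving=6.0),
    CodingTool("tool_b", cost=1.0, saving=8.0),
    CodingTool("tool_c", cost=4.0, saving=5.0),
    CodingTool("tool_d", cost=0.25, saving=1.5),
    CodingTool("tool_e", cost=10.0, saving=3.0),
]

# Baseline is 1X; a 3X ceiling leaves 2X of extra compute to spend on tools.
chosen, spent = pick_tools(TOOLS, budget=2.0)
print([t.name for t in chosen], spent)
```

With these made-up figures, the expensive tools that yield only small savings are the ones left out, which is the trade-off Alvarez Mesa described.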
Alvarez Mesa explained that this is a different approach than is often used for encoder development. In most cases, encoders optimized for offline/VoD applications, such as VVenC from the Fraunhofer HHI (https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h266-vvc/fraunhofer-versatile-video-encoder-vvenc.html), are used to perform complex compression, then features are gradually disabled to get the computation requirements down to practical levels.
You Need Some Headroom
The new fourth-generation Xeon processors are expected to boost live encoding by enough to allow 8K 60fps live encoding (at the moment, 8K 30fps is the limit). With the last generation of processors, the firm has got to around 50fps, but you need some headroom in raw power to allow for more complex content, with a benchmark level of 75fps in test material needed to allow a reasonable performance at 60fps. The 50fps performance with 8K is highlighted in the figure below. This also highlights Spin Digital’s claim of better performance than other HEVC or AV1 encoders.
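The headroom arithmetic can be sketched in a few lines. The 75fps and 60fps figures come from the interview; the assumption that the same 1.25X ratio applies at other speeds is ours, and the function name is hypothetical.

```python
def sustained_fps(benchmark_fps, target_fps=60.0, required_benchmark_fps=75.0):
    """Estimate the live frame rate a benchmark result can safely support.

    Assumes (our simplification) that the headroom ratio quoted for 60fps,
    75 / 60 = 1.25, scales linearly to other benchmark speeds.
    """
    headroom = required_benchmark_fps / target_fps
    return benchmark_fps / headroom


print(sustained_fps(75.0))  # benchmark level quoted for 60fps live encoding
print(sustained_fps(50.0))  # the roughly 50fps reached on the previous generation
```

On that assumption, a 50fps benchmark result corresponds to roughly 40fps of dependable live performance, comfortably above 30fps but short of 60fps.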
The Challenge of Computational Complexity
Of course, even if you are performing offline encoding, the speed of the encode is important. Apart from the time delays if you are having to wait hours for a program to encode, there can be a significant cost to the encoding time. Content providers such as Amazon and Netflix have to balance the savings in bitstream cost against the slower production time and higher encoding cost. For a highly anticipated show that will be watched by many millions, a long encoding process may be worthwhile, but often it will not be. There, it may be better to spend less time, money and encoding power and have less compression.
We also discussed that the bit rate requirements are often non-linear, but ‘shelved’. That is to say that while ‘fewer bits’ is always better, sometimes you have to meet a particular level for the broadcast technology. NHK’s first generation 8K encoder needed 85Mbps for 8K. Later generations of codec have got that down to 50Mbps with HEVC and with VVC down to 40Mbps, so far. However, for a terrestrial broadcast via a single Mux, you need to get to 32Mbps*.
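As a quick check on those ‘shelves’, the snippet below works out the percentage reductions implied by the bit rates quoted above. Only the Mbps figures come from the article; the function name is ours.

```python
def reduction_percent(old_mbps, new_mbps):
    """Percentage bit-rate reduction when going from old_mbps to new_mbps."""
    return 100.0 * (old_mbps - new_mbps) / old_mbps


FIRST_GEN, HEVC, VVC, MUX_TARGET = 85.0, 50.0, 40.0, 32.0  # Mbps, from the article

print(f"First generation to HEVC: {reduction_percent(FIRST_GEN, HEVC):.1f}%")
print(f"HEVC to VVC so far:       {reduction_percent(HEVC, VVC):.1f}%")
print(f"VVC today to Mux target:  {reduction_percent(VVC, MUX_TARGET):.1f}%")
print(f"HEVC to Mux target:       {reduction_percent(HEVC, MUX_TARGET):.1f}%")
```

In other words, VVC at 40Mbps is a 20% saving over the 50Mbps HEVC figure, and a further 20% is needed to fit a single Mux, or 36% in total relative to HEVC.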
Spin Digital, which is a member of the 8K Association, will be sharing a booth with the Association in the Futures Park of this year’s NAB event. Live VVC encoding will be a feature of what the firm shows at the event.
For more on the firm’s VVC development, see the full White Paper, available here.