One year ago, almost to the day, we published a video entitled Why Are HDR Shows So Dark?, which turned out to be one of our most successful rants on YouTube. We’ve since identified no fewer than half a dozen contributing factors, including: most shows continue to be lit in an SDR environment using the same lighting ratios that have been in use for the past century; fewer than 1% of productions are even monitored in HDR, and not infrequently the very first time anyone sees the picture in HDR is in the colorist’s suite; by then, it’s already a foregone conclusion that the HDR version of the show won’t depart radically from the SDR version, which is still considered the most important delivery format (stakeholders also fear confusing the consumer); moreover, colorists have been known to compromise the grade in order to mitigate motion artifacts like judder; and lastly, it’s no secret that not a few filmmakers and colorists are either ambivalent or openly hostile toward HDR, preferring the low-contrast look of traditional film instead. Apart from the absence of the bright specular highlights that are one of the signature characteristics of high dynamic range video, none of these assaults on HDR would by itself necessarily make the image any darker than SDR. The catch is that SDR’s relative approach to gamma means the picture can always be made brighter in a room with high ambient light levels, whereas PQ-based ST 2084 HDR is an absolute standard in which neither the EOTF nor the peak brightness can be increased, with the result that the SDR version could very well be brighter than the HDR one. Enter Michael Zink, Vice President of Emerging & Creative Technologies at WarnerMedia, who, in the course of a highly illuminating interview on The Display Show, proposes several more reasons why HDR shows might be too dark:
So, content metadata is this concept of describing technical parameters of the content itself so that that information can be provided to a display and the display can make better choices, especially when it comes to things like tone mapping. Now, you might recall that in the HDR10 format for instance, most people focus on the fact that it is using SMPTE 2084, the PQ curve, in terms of the encoding, but it also includes SMPTE 2086, which is mastering display metadata. Now, that metadata describes what mastering monitor I used but that doesn’t really say anything about the content. So, you can have the most sophisticated mastering monitor and maybe I’m creating a piece of content that is in black and white. So describing the monitor, while helpful, doesn’t really tell me the full story. So, what we ended up doing was to come up with additional metadata that can go along as part of the HDR10 format and at the time it was really this notion of let’s at least create some sort of what we call static metadata that at least describes the two terms MaxCLL and MaxFALL. MaxCLL is maximum content light level, which essentially describes the brightest pixel in the entire film, and MaxFALL is maximum frame average light level. You can equate that to an APL essentially: what is the brightest overall frame in the entire movie. Now, the reason we invented that, as I mentioned earlier, is that as part of the discussions inside the Blu-ray Disc Association at the time, it was this distinction between how bright you need content to be for speculars versus for the entire frame, so these two parameters were kind of developed and Mike Smith, one of the collaborators of mine here at Warner, kind of came up with that at the time to really help describe those content parameters.
Now, where they become really useful for instance, let me give you a real-time example, is that let’s say you have a display that is capable of 750 nits yet you’ve mastered on a Pulsar, which is capable of mastering at 4000 nits. But if you, for instance, only have a piece of content that isn’t very bright – maybe the MaxCLL is only 500 nits – the display actually doesn’t need to do any sort of tone mapping. Yet if you don’t use that metadata and you don’t look at the content metadata itself, and instead you just look at what mastering monitor was used, you would take whatever is in the master, assume it’s mastered to 4000 nits, map it all the way down to 750 nits, which means the actual content, the brightest pixel in that piece of content – 500 nits – will now be displayed much lower than that and content will end up looking very dark. And I think we’ve seen a lot of complaints early on when it came to HDR from consumers saying HDR looks too dark; and I think a lot of those instances were caused by those types of bad judgments – probably the wrong term – but certainly by using the wrong type of information. So I think it was always helpful using as much information as possible, and I think it would be great for display manufacturers to really pay attention to the different types of metadata that are available, and we wanted to make sure that we have or are providing information about what content metadata is there. Now, as I said, static metadata is just that, it’s static: it just describes one snapshot over the entire feature film. There’s obviously a lot richer form of metadata, dynamic metadata, that describes that frame by frame; and for a lot of content, that metadata is available as well; and I think manufacturers should choose to use that one simply because it gives them more information. I think from a display perspective more information is typically better if you want to maintain the creative intent.
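To make Zink’s 750-nit example concrete, here is a minimal Python sketch. It rests on several assumptions of ours rather than anything stated in the interview: each pixel is modeled as a single light level in nits (real MaxCLL and MaxFALL calculations work on the color components of each pixel), the `frames` data is invented, and `tone_map` is a hypothetical toy knee curve, not the algorithm of any actual television.

```python
def max_cll(frames):
    """Brightest single pixel, in nits, across all frames."""
    return max(max(frame) for frame in frames)


def max_fall(frames):
    """Highest per-frame average light level, in nits."""
    return max(sum(frame) / len(frame) for frame in frames)


def tone_map(nits, source_peak, display_peak, knee_fraction=0.5):
    """Toy curve (an assumption): pass through below a knee, compress linearly above it."""
    if source_peak <= display_peak:
        # The assumed source range fits on the display, so nothing is compressed.
        return min(nits, display_peak)
    knee = knee_fraction * display_peak
    if nits <= knee:
        return nits
    # Squeeze the range [knee, source_peak] into [knee, display_peak].
    scale = (display_peak - knee) / (source_peak - knee)
    return knee + (min(nits, source_peak) - knee) * scale


# Hypothetical content: each frame is a flat list of pixel light levels in nits.
frames = [
    [5.0, 20.0, 80.0, 500.0],    # contains the brightest pixel of the "film"
    [10.0, 40.0, 120.0, 300.0],
]

DISPLAY_PEAK = 750.0     # display capability in the example
MASTERING_PEAK = 4000.0  # mastering monitor peak in the example

content_peak = max_cll(frames)
using_content = tone_map(500.0, content_peak, DISPLAY_PEAK)
using_mastering = tone_map(500.0, MASTERING_PEAK, DISPLAY_PEAK)

print(f"MaxCLL={content_peak:.0f} MaxFALL={max_fall(frames):.2f}")
print(f"500-nit pixel, display trusts MaxCLL: {using_content:.1f} nits")
print(f"500-nit pixel, display trusts mastering peak: {using_mastering:.1f} nits")
```

With this toy curve, the 500-nit highlight is left at 500 nits when the display consults MaxCLL, but drops to roughly 388 nits when the display assumes the content reaches the 4000-nit mastering peak. A real display’s curve will produce different numbers; the direction of the error is the point.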
Michael Zink then goes on to explain how so-called outlier pixels – very bright pixels that are unintended – can skew MaxCLL metadata, in turn distorting tone mapping; and how some television manufacturers are simply ignoring metadata altogether.
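The outlier problem is easy to reproduce. The following Python sketch uses an invented 10,000-pixel frame and, as a simplifying assumption, treats each pixel as a single light level in nits; it is only meant to show how one stray pixel changes the two static values.

```python
def max_cll(frames):
    """Brightest single pixel, in nits, across all frames."""
    return max(max(frame) for frame in frames)


def max_fall(frames):
    """Highest per-frame average light level, in nits."""
    return max(sum(frame) / len(frame) for frame in frames)


# Hypothetical frame: 9,900 pixels at 100 nits and 100 highlights at 500 nits.
clean_frame = [100.0] * 9900 + [500.0] * 100
# Same frame, but one highlight is replaced by an unintended 4000-nit pixel.
outlier_frame = [100.0] * 9900 + [500.0] * 99 + [4000.0]

for name, frame in (("clean", clean_frame), ("with outlier", outlier_frame)):
    print(f"{name}: MaxCLL={max_cll([frame]):.0f} MaxFALL={max_fall([frame]):.2f}")
```

A single stray pixel moves MaxCLL from 500 to 4000 nits while MaxFALL barely changes (104.00 versus 104.35), so a display that sizes its tone mapping from MaxCLL would treat the whole title as far brighter than it really is.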