Monday, August 22, 2011

Azaad (Free)

When I see people's laughing faces,
I remember their weeping faces.

Slipping out of the heart, it has hidden somewhere in the eyes;
there is an ocean in the chest, yet no waves rise.

Sitting on the shore, even the night drips away;
perhaps that person has gone down somewhere deep into the water.

Those who were imprisoned all their lives are now free;
the watch that was kept over the living has been lifted.

Wednesday, August 10, 2011




Red Facts
straight talk on the technical realities of the Red camera


This is the second draft of this article. As we were amending the first draft to address new arguments, we found it was becoming mired in what could be an endless back-and-forth over technical definitions and minutiae. So we're going to zoom back, take a breath, and refocus this piece on its original point:


There is a lot of hype right now about the Red camera outperforming high-end HD cameras and even 35mm film cameras. We're going to tell you why we think it's just that -- hype.

We'd like to stress that we're not saying that Red is a bad camera. What we will do in this article, though, is take a sober look at the technical realities of digital imaging for cinema and show why the Red One camera does not, in our opinion, top high-end HD cameras like F23, F35, and Genesis.


The Red camera proselytizers have taken advantage of rampant confusion and misinformation in a nascent digital world. Even for engineers it's difficult to keep track of today's specs and formats, and for producers and directors, it's truly daunting. We tend to seek out something easy to explain it all, and the fallback has become the size of the number in front of the K.

"4K is bigger than 2K, so 4K must be better." It only stands to reason.

But in truth it's much more complicated than that, and the number before the K is almost meaningless without a lot of other requisite information.

The Red camera takes advantage of this Big-K phenomenon by severely sacrificing the overall quality of the image just to increase the number before the K. The Red camp then proclaims that they've beaten the other cameras.

Again, let us reiterate that we are not saying Red is a bad camera. Also, let us be clear that we are not saying that misleading use of specs is necessarily coming from the official Red representatives as much as from its proselytes.

We think that the Red One camera is very impressive, especially for the price, and certainly beats many or most HD cameras out there by a large margin in all sorts of ways. We'd also like to separate what we're talking about here from subjective taste. Red's literature says "just look at the results," and we agree with them. If you like the look of it, that's all that matters -- we're not refuting your taste. But after testing cameras and viewing uncompressed images on professional equipment, we preferred some other high-end HD (specifically F23 and Genesis) and 35mm film to Red. That's just our taste.

But, in this article, we'll put those subjective analyses away and tell you why we think that the TECHNICAL SPECS alone do not assert, as some would claim, that Red beats 35mm film or cameras like Genesis and F23/F35. In fact, these specs assert quite the opposite. This article is about mastering images for THEATRICAL CINEMA, not for TV or internet or any other such medium.

So we've agreed with the Red camp: you can't go by the specs, you have to look at the footage. But, there ARE people out there touting the Red camera's supremacy based on specs alone, and that's what we're addressing. So let's see what you CAN know about a camera just by the specs.

**********************

This article is not meant to show that high-end HD cameras and film have more RESOLUTION than Red One cameras -- it's meant to show that there is some overstated hype out there about the Red camera, and that some digital cameras that we like (F23 and Genesis) see and record more IMAGE INFORMATION THAT IS IMPORTANT FOR CINEMA than Red One does.


PRELIMINARIES


There is a glut of aggressive and cacophonous information and misinformation out there right now about the incredibly complicated world of digital imaging. Many companies and individuals have a stake in dumbing things down so that there is a single word that they can yell over the din, to get your attention. We do not think that the merits of a camera or format can be summed up in one utterance, like "4K."

You can't tell that a camera is better or worse by one number (just as you can't tell that Red is better or worse from "4K") and you need to know a lot more about a camera's imaging to evaluate it. For every big number like "4K" that the Red camp throws out, we could throw back numbers like these about some other high-end HD digital cinema cameras (again, F23/F35 and Genesis):

1.
Red records a maximum of 1.2 to 1.5 megabytes per frame. F23, F35 and Genesis, in what until recently has been their usual configuration, record 1.9 megabytes per frame in SQ mode or 3.8 megabytes per frame in HQ mode, and, if fitted with a hard drive or solid state recorder, can record up to 7.9 megabytes per frame of real data about the image. So, high-end HD cameras can record more overall data about the image than Red by at least 3.8-to-1.5 in their usual configuration. And, as on-board solid-state recorders have become more and more common, that ratio is more properly a minimum of 7.9-to-1.5.


2.
Red's image sensor has one 12-bit photosite per image pixel. F23/F35 and Genesis sensors have three 14-bit photosites per image pixel.


We agree that these two specs don't necessarily mean anything on their own (believe us, we can recite all the reasons that Red camp will use to jump on these two boasts -- and they're right), but we're just showing you that a big number doesn't prove a camera is better, because each camera can have its own big number. So, let's drop the battle for big numbers and get into some real information.


PERCEPTUAL SHARPNESS


Even though this article is mostly about how resolution count is not the way to determine which camera gathers a technically better picture for cinema, we do need to touch on resolution and sharpness.

Preliminarily, we would like to point out that the Red camp's definitions of the terms used in their specs seem to us to shift elusively (for example, "resolution" refers in one moment to the sheer count of photosites on the image sensor, and in the next to "effective resolution" in real-world tests). It's hard to pin them down, but, without going into details here (see their literature), the Red techies themselves, upon being confronted with hard numbers, back down from 4K and say that the camera is "effectively 3.2K." High-end HD cameras, on the other hand, don't have a confusing spec: their "effective" resolution matches their published resolution. The high-end HD cameras that we will compare the Red to (like Genesis, F23, and F35) are 1920x1080, which is 1.9K.

Just about every feature film you see projected in the cinema today (even if it was shot on film) was mastered in 1.9K or 2K digital files. 1.9K and 2K are effectively interchangeable, firstly because the numbers are simply so close as to not matter, and secondly because most "2K" films are scanned for full academy aperture and only use 1828 pixels across for the usable image, whereas 1.9K projects use the whole aperture for image (so, 1.9K can be more than 2K when both are recorded to film). So, whether it's 1.8K or 1.9K or 2K doesn't really matter -- they're all about the same. The bottom line is, even films shot on film and displayed on film look perfectly sharp on a giant movie screen even at 1.8K (less than 1.9K).

PERCEPTUAL SHARPNESS and RESOLUTION are not the same thing at all. For an in-depth explanation of this, look up "Modulation Transfer Function" on Google or search Panavision's web site for a video we enjoyed of John Galt explaining it. What matters for us here is that extremely high "resolution" in the strict sense of the word is only useful for things like spy satellites and microfiche. That kind of resolution is for when you want to look very closely at (or magnify) a small still image, not when you want to sit far away from a big moving image, like you do in a movie theater, or even when watching a TV. For the purposes of cinema (and not spy satellites), once RESOLUTION has exceeded a reasonable threshold, then PERCEPTUAL SHARPNESS does not effectively increase with increased RESOLUTION.

We feel (and you may disagree) that 35mm film and high-end HD have already well exceeded that resolution threshold -- after all, you can't see pixels in the cinema, and the images look sharp and crisp. We also feel that cinema is about richness of an image, and we would prefer, once the resolution threshold has been exceeded, to capture an image with more richness and depth over an image with more superfluous resolution but less color information.


COLOR DEPTH


Resolution, whether measured in "effective resolution" (like MTFs) or in nominal resolution (sheer count of pixels or photosites) is only a measure of one aspect of image information. If you have resolution but no color or luminance information, you don't have an image. You could build a 1-bit camera that has one-bit per pixel with fantastic resolution, but it wouldn't perceptually look like a proper image. It would do great in resolution tests (shooting test patterns) and in pixel-count comparisons, but it would look terrible as an image. Every pixel would be black or white. Not only would there be no color in our imaginary 1-bit camera, there'd be no gray either -- just black or white. Any good imaging system needs to record color and luminance information to have an image at all. But how much?

Luminance information and color information are effectively the same thing for our purposes, because a color image is created by the luminance of several component colors. That means that COLOR is the relative value and LUMINANCE is the absolute value of these components. (Even formats that record more information about luminance than they do about color do the same thing, because, even in those formats, luminance is an absolute value and color is a relative value.)

How much of this information do you need to make an image of cinema quality? How much resolution and how much color information? Motion picture imaging, which has traditionally been done on 35mm, has a history of recording images of incredible breadth of color and luminance information. This breadth is necessary for the image to look rich and beautiful on the screen, and it's also necessary for the image to be adjustable without degradation in normal color grading (color correction). When light comes in through the taking lens, how much digital data needs to be recorded about it to store a 35mm-quality image? Well, a standard has been developed for this -- a standard that digitally represents what 35mm film can do.

There is only one standard for file formats that is used industry-wide for theatrical cinema mastering -- it's the one kind of file that is ingested for color-grading, the one sort of file that is used for film scans, the file type that visual effects houses work with, and the one file type that is sent to film recorders. It's been the standard since at least the early 90's, it hasn't changed, and it isn't in the process of changing -- Kodak invented the standard to be the digital equivalent of film, and we think they did a great job. It's called a DPX file (for our purposes it's interchangeable with a Cineon file). These files, we agree with Kodak, are just like film. Film has 3 layers: one for red, one for green, one for blue. And each pixel in a DPX file has three pieces of information ("channels") in it: one for red, one for green, one for blue. How much information about these channels? 10 bits per channel; that's 30 bits per pixel. Kodak determined that that's how much information you need to get the full breadth and depth of film data, and it's what's been used and continues to be used as the standard for cinema.

Also, although DPX files can be any pixel dimension, the standard has been and continues to be 2048 across (again, of the 2048, 1828 are usually used for the final image area).

To get all the information of film, you need 1.8K pixels, you need 3 color channels, and you need 10-bits of data per channel.

Note that Kodak could have made any specs they felt were required here to fully achieve the quality of 35mm film: for example, they could have prescribed a file that has more resolution and less color/luminance information. Here's one they could have used: 4K pixels and 7.5-bits-per-pixel -- that would have actually been the exact same file size as the spec they did choose, and they even could have said the word, "4K." But the standard that Kodak determined necessary and that is still today the full-quality standard of professional cinema imaging is 2K and 30-bits-per-pixel. You need resolution (lots of pixels) and you also need color depth (lots of bits-per-pixel) to get a full 35mm film quality image. Additionally, the consensus amongst imaging professionals is that it's so important to protect the full 10-bits-per-channel of color data, that most professional color correction engines do 16-bit calculations on 10-bit files just to ensure that there will be no rounding errors.

So, as you can see, color depth is information that is just as important and just as coveted as resolution for cinema imaging.
Let us also mention here that DPX file size is very simple to calculate and very meaningful. DPX files simply store all of the prescribed data just as it is, so the number of bits in each file is very simply:

(number of pixels) x (10-bits per channel) x (3 channels)

Plus there is a tiny bit extra for metadata. So the data size of a DPX file has basically nothing missing and nothing extra; its size faithfully represents the amount of data necessary for 35mm film-style digital image data.
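As a rough illustration (this little sketch is ours, not Kodak's), here's that formula in Python. Real DPX files also pack the three 10-bit channels into one 32-bit word per pixel (two padding bits per pixel) plus a small header, which is why a 1920x1080 frame lands at roughly the 7.9 megabytes we quote later rather than at the bare payload figure.

```python
def dpx_payload_bytes(width, height, bits_per_channel=10, channels=3):
    """Bare image payload of an uncompressed DPX frame:
    (number of pixels) x (bits per channel) x (channels), in bytes."""
    return width * height * bits_per_channel * channels / 8

def dpx_packed_bytes(width, height):
    """Same frame with the common DPX layout: three 10-bit channels packed
    into one 32-bit word per pixel (2 padding bits), header ignored."""
    return width * height * 4

# 2048x1556 is the usual full-aperture 2K scan size (our assumption, for illustration only).
for w, h in [(2048, 1556), (1920, 1080)]:
    print(f"{w}x{h}: {dpx_payload_bytes(w, h)/2**20:.1f} MiB payload, "
          f"{dpx_packed_bytes(w, h)/2**20:.1f} MiB packed")
```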



DENSITY CHARACTERISTICS



When Kodak was figuring out how much digital information you need to truly represent 35mm film negative, they determined that 30-bits per pixel actually wasn't enough if the file was a straight linear sample from the camera or film scanner. 

If you were going to use the linear sample straight from the camera or scanner, it had to be much higher than 10-bits per channel and 30-bits per pixel.  So, in the interest of making manageable file sizes, they made 10-bit logarithmic files instead of, say 12- or 14- or 16-bit linear files.  This means that the camera samples at a depth much higher than 30-bits per pixel and then that image data is made logarithmic before being re-quantized to 30-bits.  Even 30-bits is enough only if you have logarithmic density characteristics that are made from a higher-depth linear sample.  But if the density in your file is linear (like the camera sensor), then you need more than 30-bits. 

This is going to be important later, so we'd just like to clarify a little. 

Another way to think of this is that the blacks need more quantization steps than the highlights do in order to fit the perceptual information of 35mm film into a 30-bit file.  30 TOTAL bits are enough to achieve this film quality if and only if the quantization steps are unevenly distributed in a logarithmic way.  If the steps ARE even -- linear distribution rather than logarithmic, which is how an image sensor works -- then the image is going to need a total of many more bits than 30-per-pixel for there to be enough quantization steps at the bottom end.  So, Kodak's answer: sample at significantly higher than 30 so that the bottom end of the image has enough samples, then do a logarithmic density transform and finally requantize to fit the image into a 30-bit file.  If the requantized file is linear, too much information is lost. 

This all means that the initial sample at the camera or scanner must always be significantly greater than 30 bits to get film quality; the subsequent storage file can then be requantized to 30 bits and still maintain that quality if and only if the image is made logarithmic before being re-quantized.
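Here is a tiny numerical sketch of that idea (ours, and deliberately simplified: a plain log2 curve stands in for the real Cineon transfer function). It oversamples linearly at 14 bits, maps through the log curve, requantizes to 10 bits, and then counts how many distinct code values end up describing the deep shadows compared with a straight 10-bit linear requantization.

```python
import numpy as np

# 14-bit linear sensor values (skip 0 so the log is defined).
linear = np.arange(1, 2**14)

# "Oversample, make logarithmic, then requantize": a simplistic log2 curve
# stands in for the real Cineon curve, then we requantize to 10 bits.
log_codes = np.round(np.log2(linear) / np.log2(linear.max()) * 1023).astype(int)

# Straight 10-bit LINEAR requantization of the same signal, for comparison.
lin_codes = np.round(linear / linear.max() * 1023).astype(int)

# How many distinct 10-bit code values describe the deep shadows
# (here: the bottom 1/64th of the linear range)?
shadows = linear <= linear.max() // 64
print("log shadow codes:   ", len(np.unique(log_codes[shadows])))
print("linear shadow codes:", len(np.unique(lin_codes[shadows])))
```

The point of the toy example is only this: for the same 10 bits, the logarithmic mapping spends far more of its code values where film-style images need them, down in the blacks.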



THE DIGITAL CAMERA


One problem for people who are trying to get all that information from a digital camera instead of from a film scanner is that it's too much data for most real-world devices to gather and store at 24 frames per second (and faster). It's a real challenge to build an image sensor that can see that kind of resolution and depth, to build processors that can handle that kind of throughput, and to concoct a data storage medium that can record that fast and hold enough data. After all, a 1920x1080 DPX file is 7.9 megabytes per FRAME.

It's a challenge, but it's happening. And in a world of quickly changing technology, camera manufacturers are racing to do their best to overcome the herculean demands of that kind of imaging quality.

There are all kinds of work-arounds -- and these are good work-arounds; we're not knocking them. For example, many camera sensors don't have three photosites per pixel -- they only have 1 or 2 photosites per pixel, and then they use an algorithm where each pixel borrows information from some of its neighbors. Most low- and medium-end tape and file formats have a scheme of lopsided bit depths for the 3 channels to favor luminance information over chroma information. Another way to work around the data size is to lower the bit depth. Many cameras and file formats only store 8 bits of data per channel instead of 10 (that's one quarter as many quantization levels per channel). All of these work-arounds are fine, especially for projects that are not going to theatrical cinema.

And, of course, there's one more work-around: digital compression. Many file formats will, for example, take an image whose specs demand 10 megabytes and just store it in one megabyte without changing the spec. How do they do this? With a clever compression scheme. A compression scheme means that the file doesn't really contain all the data demanded by its target spec. When done well, compression can be a great way to overcome some of the data problems we're talking about. Compression uses a very clever algorithm so that (hopefully) there is no perceptual evidence that information is missing, even though it IS missing: the scheme tries to only throw out bits and bytes that you won't notice are gone. Of course, it doesn't always work -- sometimes you do notice. Compression schemes are proprietary and are getting better every year, and you can't tell how good a compression scheme is just by its numbers, because it's perceptual, not just mathematical -- a state-of-the-art compression scheme may be able to compress a 10 megabyte image to 1 megabyte and make it look visually better than an older compression scheme can do in 5MB.

Of course, though, at some point you can't hide it any more no matter how good the scheme is. Obviously, you can't take an image file whose specs call for 10 megabytes and store it in 1-bit. For things like internet videos, large amounts of compression (even compression with very obvious visual artifacts) is fine and is the norm. But, for mastering of cinema images, it is critical to neutralize any evidence of compression (that's why DPX files -- still the standard for theatrical cinema -- are completely uncompressed). Uncompressed files are the only way to have ALL the prescribed information.

Sometimes compression can be visible in one mode of display, but not another. For example, one compression scheme may work fairly well if you don't adjust the image color (as is normally done to every film in post production color-grading) but reveals artifacts as soon as you make an adjustment, or another scheme may hide compression artifacts on a CRT monitor but not on an LCD monitor. So compression, like the other work-arounds mentioned above, can severely limit color information -- after all, there is a lot of information missing from a compressed file, even if it's hard to see that it's missing in some display methods.

Before saying why we think F23/F35 and Genesis technically beat Red in this arena of color depth, we'd like to again point out that we do believe that Red is a good product that certainly has many innovations in this race to capture image information digitally. For example, they use a Bayer pattern in their image sensor, which is great -- it's the best way to subsample (to have fewer than 3 photosites per pixel). They have a great compression scheme -- it looks amazing considering the data rates. And they have innovative image processing to perceptually smooth over what's been lost to compression and subsampling.

The thing is, despite how well Red has done in making up for subsampling and compression, the other cameras have beaten them. We think that the F23/F35 and Genesis are farther along in the race: firstly, we believe that their work-arounds are better than Red's; secondly, and more importantly, they don't need as many work-arounds, because they've actually hit the target spec.


IMAGE GATHERING


In this section, we are going to talk about the image flowing out of the camera, before it gets to the recording device.

F23, F35 and Genesis are not subsampled at all.  They have actually managed to make sensors that have pixel-counts above the theatrical resolution threshold (these cameras are 1.9K, when we know that 1.8K is already above the threshold) AND they have three photosites per pixel (one per channel) AND they've met the 10-bit-per channel requisite: the camera gathers 14-bits of (oversampled) data which is converted to 10-bit logarithmic data.  So they've done it -- they've met all the specs for a full film quality image: resolution, bit-depth, full color sampling, no compression.  (Remember, we're just talking about the image flowing out of the camera here -- before it goes into the record-device.)

The Red, on the other hand (still just talking about the image before it's recorded), is so busy trying to outperform the other cameras in just the one area of superfluous resolution that it hasn't got to the full cinema quality of the other requisite aspects.  They're using up all their photosites on resolution (so they can say 3.2K or 4K is a bigger number than 1.9K) and are truncating color information.  The Red sensor only has one sample per pixel instead of three, which means it is 3-times color subsampled.  The Red camera samples at 12-bits (as opposed to the other cameras' 42-bits-per-pixel sample which is prepared for a 30-bit file before being captured) and passes it straight through as 12-bits of linear data.  That means Red has 12-bits of linear color/luminance information per pixel compared to Genesis/F23/F35's 30-bits of true logarithmic data.  As we mentioned earlier, 30-bits is required for full 35mm film color depth, and even 30-bits is only enough if it's sampled at greater than 30-bits and made logarithmic BEFORE being re-quantized.  Genesis/F23/F35 achieve this filmic color depth: 3 channels of 14-bit sample, for a total of a 42-bit sample-per-pixel, which is made logarithmic before re-quantized to 30-bits-per pixel.  Red gets only one 12-bit sample per pixel and can't take advantage of the efficiency of logarithmic density characteristics because it doesn't oversample at the sensor, so the resulting file is 12-bits of linear data per pixel.

To give an actual rather than theoretical analogy of the importance of bit depth, think of the ArriScan film scanner.   That's the device made by Arri to scan film for digital intermediate color grading and ultimately theatrical and home-video release.  The ArriScan is basically a digital camera (that shoots already-exposed film) that is a true workhorse for actually released theatrical films.  The ArriScan has EXACTLY THE SAME specs for color depth as the Genesis/F23/F35.  It gathers 14-bits of linear data per channel (42 per pixel) at the sensor, makes it logarithmic (while still 14-bit), and then requantizes it to 10-bit before sending it to the storage file.  Again, Red gets 12-bits instead of 42 per pixel and can't take advantage of logarithmic density.

Now, again, we are not saying Red is a bad camera. The Red techs tout their use of a Bayer pattern sensor to overcome the subsampling and they tout the revolutionary quality of their compression scheme ("RedCode") and their image processing -- and they're right. They can be proud. Those are beautiful technological advances -- very impressive; they've done great in these methods of compensating for subsampling and compression. It's just that they got bested by the other cameras that don't need so many work arounds.

On top of all of this, it is important to note that fewer total photosites (which F23/F35 and Genesis have compared to Red) can sometimes be an ADVANTAGE in reducing noise and increasing dynamic range and sensitivity to light. We're not claiming that such an advantage is certain in this particular comparison, because we're only looking at the specs and the results of the specs (not the proprietary inner workings of how the hardware achieves those specs). But, whether that phenomenon is exemplified here or not, our own tests showed that Genesis and F23/F35 have an increased dynamic range over Red's published specs and over its measured specs.

With respect to all of these matters on image gathering, the Red camp is sure to accuse us of not taking into account the fact that their image is "raw." Well, we actually have taken that into account. In a later section we will show why Red's concept of "raw" does not belong in this section or in the next section on image recording.


IMAGE RECORDING


Of course, the image flowing out of the camera has to be recorded and stored. Now, as we've seen, the image flowing out of F23/F35 and Genesis is uncompressed 1920x1080 10-bit-per-channel RGB. These cameras, in normal configuration, are paired up with a tape deck or solid state recorder that easily snaps on as a standard accessory and is the normal method of recording their images. That tape deck takes HDCAMSR cassettes and can record to the tape in two modes: 440 megabits/sec and 880 Mb/sec. The solid state recorders are fully uncompressed at 189.6 megabytes/sec. The first of the tape deck's data rates translates to 1.9 megabytes per frame, the second to 3.8 megabytes per frame. And uncompressed translates to 7.9 megabytes per frame. In SQ mode, the tape image is compressed with a ratio of 4.2-to-1. In HQ mode, with a ratio of 2.1-to-1. And the solid state recorder has no compression at all (1-to-1 ratio).
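A quick back-of-the-envelope check of those figures (our own arithmetic at 24 frames per second, using the per-frame sizes quoted above):

```python
# Per-frame sizes quoted above, in megabytes.
uncompressed_mb = 7.9        # 1920x1080, 3 channels, 10 bits/channel (packed DPX-style)
sq_mb, hq_mb = 1.9, 3.8      # HDCAMSR SQ and HQ modes
fps = 24

print(f"uncompressed data rate: {uncompressed_mb * fps:.1f} MB/sec")   # ~189.6
print(f"SQ compression ratio:   {uncompressed_mb / sq_mb:.1f}-to-1")   # ~4.2
print(f"HQ compression ratio:   {uncompressed_mb / hq_mb:.1f}-to-1")   # ~2.1
```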

Red records its information to drives or chips, and also has 2 modes. By our own calculations and measurements, it shoots 1 megabyte per frame in one mode and 1.2 megabytes per frame in the other. Some of the folks on the Red board say it's 1.5 megabytes per frame, which is a bit more. We think they did their math wrong, but we'll use their higher spec to give them the benefit of the doubt.

Now, that means that they're trying to store a "4K" image in 1MB or 1.5MB, whereas the F23/F35 and Genesis are trying to store a 1.9K image in 1.9MB or 3.8MB or 7.9MB. The compression is obviously very much higher in the Red camera.

It is difficult to assign a compression ratio to Red, because there is disagreement on whether compression ratios should be calculated by comparing the camera files with the image sensor's data or with the finished file's data. We think you should compare it to the finished file, because we think a compression ratio is meant to express the ratio between how much data your spec demands and how much data you actually have (in other words, how much the decompression scheme has to inflate the image), but ratios calculated on the Red message board compare the camera data rate to the amount of data that the sensor originally gathered (of course, you don't have this dilemma with F23 and Genesis, because the amount of data that the sensor gathers is fully equal to the spec for the final DPX file). A DPX file of the Red camera's pixel dimensions (4096x2304) is 36 megabytes per frame. If the Red shoots 1.5MB per frame, then that is one twenty-fourth of 36MB. We'd say that the camera has a compression ratio of 24-to-1. The Red camp seems to calculate its compression at somewhere around 12-to-1 by comparing the camera data rate to their subsampled sensor instead of to the final file that the processing software creates. Either way, you can see that Red is very much more compressed than the F23/F35/Genesis ratios of 4.2-to-1, 2.1-to-1 and 1-to-1 (uncompressed).
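And the same back-of-the-envelope arithmetic for the Red side (again ours; the 36 megabyte figure assumes the packed 32-bit-per-pixel DPX layout mentioned earlier):

```python
# Our rough check of the Red compression-ratio math above.
width, height = 4096, 2304
dpx_mb = width * height * 4 / 2**20   # packed 10-bit RGB DPX, 4 bytes per pixel -> ~36 MB
red_mb = 1.5                          # the generous per-frame figure we granted above

print(f"DPX frame at Red's pixel dimensions: {dpx_mb:.0f} MB")
print(f"ratio versus the finished file:      {dpx_mb / red_mb:.0f}-to-1")   # 24-to-1
```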

At this point, we must reiterate, as mentioned above, that the mere math of compression ratios does not prove that one image looks better than another -- because compression schemes are proprietary and they're perceptual (not just mathematical). Now, Red's literature would have you believe that all competing compression schemes are in the stone ages (they often use the word "wavelet," which is supposed to prove that their compression, not just their vocabulary, is better), but we don't think so -- we think that other state-of-the-art cameras are also state-of-the-art. We think (admittedly subjectively) that HDCAMSR compression is as good as it gets -- we think it's indistinguishable from uncompressed, whereas Red's scheme shows artifacts. You don't have to agree with us on which compression is visually better, but two things are for sure here: it is possible to shoot F23/F35 and Genesis uncompressed using the solid state recorders if you're worried about compression, and RedCode would have to be a LOT better than HDCAMSR just to equal it visually.

Bottom line: in recording the image data from the sensor, the Red records considerably less data (actual bytes) about the image than F23/F35 and Genesis do, whether measured by absolute amount or by ratio.


"RAW" IS A RED HERRING


We should be able to leave the comparison here, but we know that if we do, the Red proselytizers will refute our comparison by saying that we're ignoring the fact that the Red camera shoots "raw." "Raw" is another attempt to take advantage of the Big-K effect. It's a word to yell over the din.

"Raw" is a buzzword that comes from the world of digital-SLRs (digital still cameras). In that world, "raw" is indeed a fantastic advantage. In stills, "raw" is the name a file format that stores uncompressed image-sensor data, and, usually, the only option besides "raw" for getting data out of a digital-SLR is JPEG, which is a file format that is extremely compressed. So, in stills, "raw" is a far superior to the alternatives because it's uncompressed. In motion imaging, we believe that Red's version of "raw" is inferior to the HDCAMSR alternative because Red "raw" is MORE compressed than HDCAMSR, not less.

Let's forget the digital stills use of the word "raw" and look at what "raw" means in our comparison of motion-picture imaging cameras.
The Red camp tries to draw a distinction between Red and other cameras by saying that Red shoots "raw." This means that raw data from the sensor is not processed into an image in the camera; instead it is processed into an image later by a computer running Red's proprietary software.

"Raw" is just a distraction. It is not a HOW of imaging; it's just a WHEN and WHERE.

"Raw" is not a distinction of any substance. F23/F35, Genesis, and Red all do the same thing: they take "raw" information from the sensor -- this is "raw" data, not yet an image -- and then they run that data through proprietary software that turns it into an image. The only difference is that the F23/F35 and Genesis image processing is in the camera, and Red's is in a separate computer.

If the equipment is configured correctly (the cameras and/or Red's software) then the step of turning raw sensor data into an image is in no way a degradation or a truncation -- it's just a transform.

If anything, this is an advantage for F23, F35 and Genesis, not a disadvantage, because the image processing software for F23, F35 and Genesis has access to ALL the information from the sensor (before it's been compressed for recording) and because the processing is done in real time -- you don't have to wait for a computer to render it.  That's pretty handy.  Additionally, the fact that the processing is in the camera means that the image can be made logarithmic BEFORE being re-quantized to 10-bits-per-channel -- which is essential.  This means that F23, F35, and Genesis can take advantage of the fact that logarithmic density characteristics get more perceptually important information out of 30-bits than linear density characteristics do.  The fact that Red is "raw" means it's using linear density characteristics in writing its files, which are more perceptually inefficient than logarithmic characteristics -- Red records fewer bits-per-pixel and is forced to use them less efficiently because it is "raw."

It's been stated on message boards and elsewhere that the fact that Red is "raw" makes it higher quality than other cameras because it doesn't throw away sensor information. But this is deceptive doublespeak -- it is a misleading way to make it sound like oversampling at the sensor is a bad thing, which it obviously isn't. The actual files from F23/F35/Genesis are 30-bits-per-pixel compared with Red's 12-bits-per-pixel -- 30 bits is more information about the original sensor data than 12 bits, not less. The fact that F23/F35/Genesis oversample at 42-bits at the sensor is only a huge advantage for them, not a disadvantage. F23/F35/Genesis get more information from the sensor than Red (by 42-to-12) and more information in the resulting capture file than Red (by 30-to-12). The fact that 30 is less than 42 does not also make it less than 12. Again, oversampling is only an advantage, because the image can be made logarithmic before re-quantizing, thereby using the 30 bits in a much more useful way than if they were a simple 30-bit linear sample. F23/F35/Genesis meet the target spec of 30-bits-per-pixel, and can use those 30 bits more efficiently by taking a luxurious 42-bit sample at the sensor. Red, on the other hand, doesn't even meet (let alone exceed) the target spec at the sensor, and subsequently cannot use logarithmic density to make the capture file more efficient -- it's stuck in 12-bit linear.

30-bits-per-pixel is the minimum for film-quality motion imaging (it's the target spec), and even 30 bits is enough only if the sensor samples at greater than 30 bits and the image is mapped to logarithmic density characteristics before being re-quantized. F23/F35/Genesis actually achieve this, by sampling at 42-bit, then going to logarithmic density characteristics, then re-quantizing to 30 bits. Red only ever gets 12 bits to begin with, which is already BELOW the spec, so if they requantized like the other cameras do, they'd just be even farther below the spec. Saying that "raw" is higher quality in this case mischaracterizes the issue: re-quantizing is an ADVANTAGE if first you oversample (compared to the target spec), then you make it logarithmic, then you re-sample to the target spec -- you get better quality out of your target spec. Of course, with the Red camera, re-quantizing would NOT be an advantage, because it's already below the target spec, so re-quantizing would make it even worse. Red's version of "raw" is an advantage over an imaginary camera that, say, samples at 12 bits and then re-quantizes to 8 bits. But it's not an advantage over real-life cameras that simply get a lot more data than Red does: Red samples at 12 bits and sends 12-bit linear files to the recorder; F23/F35/Genesis sample at 42-bit and send 30-bit logarithmic files to the recorder.

Some people claim that the processing of F23, F35 and Genesis "bakes in" a look that you can't undo in post-production and therefore Red has more information about the image available in post, but this "baking in" problem is only true if you have bad settings in the camera (like, if you don't use the usual logarithmic settings in the camera that most people use and you crush blacks down to zero in the settings).  Actually, when the equipment is operated in the usual and correct manner, you have MORE information about the image in post from F23, F35 and Genesis -- which is our whole point through this article -- that F23, F35 and Genesis get more color and luminance information.  F23, F35 and Genesis get more breadth and depth, more range for color-grading from the richer HDCAMSR or uncompressed file.  Also, the exact same "baking in" problem applies to Red's own processing software just as it does to F23, F35 and Genesis -- if you have bad settings, you'll truncate data.  So, if trained professionals are handling the equipment (whether it's F23, F35, Genesis, Red Camera, or Red Software) in its usual configuration, then there will be no "baked in" data truncation.

We believe that we've compared the cameras fairly by not including Red's definition of "raw" in the previous sections on image gathering and recording, because the gathering of a raw image from the sensor and its subsequent processing has been examined equally and fairly for all cameras -- we just skipped over the WHEN and WHERE while speaking about it above, since the article is about technical quality of motion imaging, not time and place of motion imaging.

Some statements from the Red message board imply that by doing the image processing later you somehow get increased dynamic range, or that you don't need as many bytes to store the same amount of data, or other such things. This is self-evidently absurd -- one bit is one bit, and the number of bits you captured is the amount of information actually stored about the image. If anything, the only trick to make bits truly more efficient (more real information about the image per bit, not just throwing information away using compression) is to use logarithmic density characteristics. This is because one bit is always one bit, but logarithmic density characteristics allow the most perceptually important 30 bits to be chosen from the 42 sampled bits. One bit is always one bit, but there's no rule restricting you from choosing more perceptually important bits instead of just linearly spaced bits. F23/F35/Genesis do this, but Red can't, because it's "raw" and linear and not oversampled, so it's stuck. The number of bits in the capture file is the real image information captured, and you can't squeeze more out later (even if you perceptually cover up the lack later), but what you CAN do is choose at the time of capture the most important bits to pack, using oversampling and logarithmic density characteristics. But Red doesn't oversample or have the ability to use logarithmic characteristics; artificially inflating the data later, as Red does, does not mean you captured more real data per bit at the camera -- it just means you're trying to cover up that you didn't.

A reality that's being misrepresented as an advantage for Red -- when it's really a disadvantage -- is that IF the Red camera stored all the data from the image sensor without compressing it (which it doesn't do), then the camera file would STILL be smaller than the final DPX file, but that's just because Red is subsampled (one photosite per pixel instead of 3). One byte is still one byte. The "raw" file is not higher quality just because it's subsampled. Putting off the image processing till later does not magically recover data that wasn't captured. F23/F35 and Genesis SEE all the information and STORE all the information of the final DPX file -- they are not subsampled -- so there doesn't have to be any confusion with those cameras about WHEN subsampling gets inflated to the target spec.

Sony has been able to build image processing software and native hardware that can fit inside the camera and do full-quality lossless processing in real time. Red didn't build it on-board and they can't do it in real time. That's not an advantage for Red. Red's image processing software has to work much harder than Sony's because the camera only gathers one piece of data per pixel instead of three, and the processing software has to work hard to turn that subsampled data into an intelligible color image.

It is absurd to say that Red's image processing software is better than Sony's just because it's loaded into a personal computer instead of into a dedicated processing board. If you want to compare the innards of the processing software (which we haven't done here -- we just compared the results of the software), then you'd have to actually get inside Red's proprietary software and inside Sony's -- you can't say which is better just by saying WHERE the software is housed. We're just comparing the RESULTS of the software, not the inner workings of the software.

And, as we've shown, you get all the data from F23/F35 and Genesis -- every last bit and byte of information about the full breadth and range of the image -- nothing from its specs as we've described them is thrown away or compromised or subsampled (except for the mild HDCAMSR compression, as we discussed, which is much milder than Red's compression). Of course, if you operate the camera incorrectly you may lose information, but the same goes for Red camera AND for its image-processing software.


APPLES TO APPLES


A lot of hype that we've heard out there and seen on the Red message board recently confounds some issues by switching between definitions when convenient. Specifically here, we would like to address the issue of comparing Red's CMOS sensor to digital-still cameras.

Red has a CMOS sensor (same technology as digital still cameras) whereas F23/F35 and Genesis have CCD sensors. These are just two different technologies -- not inherently better or worse. Now, the still imaging world and the motion imaging world have VERY different naming standards that seem to get conflated with one another.

Still cameras advertise a "megapixel" count. That's the number of photosites on the image sensor: it's not a bit depth or a file format or an indication of color-subsampling, or anything -- it's just a raw count of photosites on the sensor. Now, in the still world that's fair -- that's how everyone labels the cameras for comparison, so it's apples to apples. It's an agreed-upon naming scheme.

In the motion imaging world in which Red is gaining techie supporters and touting specs (and we're talking specifically about theatrical cinema, not broadcast TV or web or anything like that), there is an industry standard measure of file size: "2K," "4K" and so forth. But, as we've discussed, these standards come from post-production file types, NOT cameras and image sensors. The reality is that "2K" and "4K" files -- as they are used in the real world to master theatrical cinema -- have 3-channels and 10-bits per channel and are uncompressed.

As we have shown, Red is much farther from this benchmark than F23/F35 and Genesis are -- in both gathering and storage. There is talk on the Red message board saying things like, "a Canon still camera with 12 million photosites is 12-megapixel, and, likewise, 4K Red camera is 4K." But that's apples to oranges. Number of megapixels (meaning sheer photosite count) is a standardized still-camera industry rating of a camera's sensor, whereas "4K" is a motion-imaging term for a post-production file type. We agree that Red is "4K" by certain definitions, and that Red has more "resolution" than F23/F35 and Genesis -- Red is a very impressive camera. But we also think that Red is much farther from a 4K or even a 2K DPX file than F23/F35 and Genesis are in the gathering and storage of INFORMATION THAT IS IMPORTANT FOR THEATRICAL CINEMA.


(This article is published on http://www.rcjohnso.com)

Tuesday, July 19, 2011

Achieving Film-Look on Video


THE BIGGER PICTURE

There are no buttons on a video camera that are going to give you the look of film. The reality is that the mechanics of the camera are just one small part of a much larger process. Think of it like a machine that is dependent on other components to function. If one of these components fails or is omitted, the larger machine will either stop working or produce unpredictable results. And so it is with the various "components" that make up the "film look". Let's examine each of these components a little closer.
Before I continue, I will acknowledge that there is an argument that some of these components can be omitted and the film look can still be achieved. This is probably a fair argument, but you can make the decision as to what is important and what is not once you have a broad understanding of the bigger picture. It's like the old saying "In order to break the rules, you must first know them" or something to that effect.

CAMERA MOVEMENT

If an audience sees a lot of shaking in your footage, it triggers a chemical reaction in the recesses of their brains called videopsychosis. This signal lets the brain know that what they are seeing is footage shot with a relatively small video camera. If you are trying to fool your audience into thinking they are watching a film-originated movie, this is an undesirable result. To avoid this phenomenon, the camera operator must adjust his/her thinking and become a cinematographer, not a videographer. The “cinema” in that title is there for a reason.
So what can we do with the camera to give the illusion that your film was shot on film? Let's add some virtual weight to it. Slow everything down. Imagine that you are operating a big 35mm camera and you will begin to see the world a little differently. Study the slow camera movements of your favorite films. With some exceptions, most camera movements are slow and considered, often revealing details in a scene in a slow, deliberate way. Adding some actual weight to your camera can really help until you get the hang of doing it with a lighter setup. When you pan, turn your body, not just your arm. As you become more fluent with your camera's movement, your scenes will become more organic and much more pleasing to your audience.
Also consider the fact that anything that moves the camera from its fixed position (e.g., a dolly or crane) helps to further the film look. It takes the viewer away from the patented two-dimensional look of video and creates a world that you feel you could literally step into.
There are plenty of techniques for creating slow cinematic camera movements using your video camera and most of them will come from your imagination, not a book, so start experimenting.
I’ll give you one example to get you started. I like to tilt my entire tripod on two legs until my camera almost reaches the ground. I then slowly pull it back to its upright position. This creates a very effective crane movement and you can get some really smooth moves with practice.
So, don’t just watch films, study them. Imagine you are the cinematographer and really observe how the camera is moving and how it reveals or obstructs certain things and how it ultimately motivates the story or scene. You will find that there is a very specific language of movement and once you get a feel for that, you will begin to see things very differently.

LIGHTING

Everything in a scene should be there for a reason unless you are just gripping and ripping or doing documentary work. Lighting helps to selectively draw the viewer’s attention to a person or thing. In a way, it almost works in the same way as selective focusing. If something has a dominant light in a scene, it will receive your attention over everything else.
Lighting also creates the illusion of a third dimension in two-dimensional space. When we look at a movie, whether it is on a TV or on a theater screen, it is flat, having only dimensions of height and width so the challenge is to convince your audience that space exists behind the screen.
Classic three-point lighting, for instance, has a number of functions that help us in our quest for that extra dimension. We have a key light, which creates the focal point for the viewer and extrudes the subject from the rest of the frame. Then we have the fill light, which eliminates nasty shadows but also has a sculptural function by helping to reveal the shape of the face and head. Finally, we have the backlight or rim light which separates the subject from the background, giving a sense of depth.
Expounding on lighting techniques in general is beyond the scope of this article and there are many resources on the Web for your reading pleasure. I just want you to think about the concept of light of varying strengths in your scene, each having a very specific role, all conspiring to create a feeling of depth and ultimately helping the viewer focus on the message you are trying to convey.

COLOR/GAMMA

These days, there is no particular color palette that can be specifically associated with film. There are all kinds of looks out there ranging from monotone to highly saturated.
When shooting video, no matter how black you make your blacks in-camera, the picture still ends up looking a little “milky”. This is because the full depth of the blacks has not been realized and this greatly influences the general richness of the picture in terms of color. A simple levels adjustment in your NLE of choice will really bring out some vivid colors.
In the levels control, move the black control slider to the right until it crushes your blacks just to the point of losing some shadow detail. Some people like to push it further to achieve a high contrast look…they also push the whites by sliding the white control slider to the left. I usually stay away from the white end because it looks more like overexposed footage rather than a stylistic look. If I need to brighten or darken the picture beyond that, I adjust the mid-tones. That way, information in your picture will not suffer. Aside from some basic color correction, that’s about all I do to my raw footage to get that rich look.
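If you'd rather do this programmatically than in an NLE, here is a minimal sketch of that kind of levels adjustment (my illustration only, with made-up numbers, assuming 8-bit footage loaded as a NumPy array): raise the black point, leave the white point alone, and bend the mid-tones with a gamma if needed.

```python
import numpy as np

def levels(img, black=0, white=255, gamma=1.0):
    """Simple levels adjustment for an 8-bit image.
    black/white are the input points mapped to 0/255; gamma bends the mid-tones."""
    x = img.astype(np.float32)
    x = np.clip((x - black) / max(white - black, 1), 0.0, 1.0)  # crush blacks, stretch range
    x = x ** (1.0 / gamma)                                      # gamma > 1.0 brightens mid-tones
    return (x * 255).astype(np.uint8)

# Example: crush the blacks a touch without touching the whites.
frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)  # stand-in for real footage
richer = levels(frame, black=20)
```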
I do not white balance if I’m taking nature shots or shots that have no connection to each other. I prefer to color balance in post production. For instance, if I was to white balance during the sunset or sunrise, all of the golden light would be gone. If I was shooting a film or music video, on the other hand, I would white balance before every shot because continuity is crucial in these situations.
Film has much greater latitude than video meaning it’s able to capture more tonal information. You will notice with film-originated footage, overexposed areas transition more gracefully than in video. In the latter, your picture will blow out to white pretty quickly, leaving some pretty ugly transitions in your picture. This is a dead giveaway that you are shooting video. Some video cameras have a “knee” adjustment that minimizes this problem. If you set your knee to “low”, the transition from exposed to overexposed will be more gradual.
Some cameras also have a “film gamma” adjustment that helps give the illusion of greater latitude. I won’t recommend any of these settings only that you experiment and make up your own mind.

FRAME RATE

Motion cadence has a large impact on the film look. Before 24p became available to the general consumer, there were programs like Magic Bullet, Cinelook and DVFilm that would strip your 50i footage and convert it to 24p. While they did a fairly good job, there is just no substitute for the real thing.
35mm film cameras run at 24fps and give a distinctly different look than video which is essentially capturing images 50 times a second…(video actually captures two fields to make up every “frame” so it’s really 25fps). Video looks “hyper-real” while film has a slightly surreal feel to its motion. This makes it ideal for storytelling because it can pull the viewer into a world that contributes to suspension of disbelief. Video, because of its hyper-real motion, can sometimes feel more like a documentary and have that classic “soap opera” look. At least in my mind, this is less than ideal for narrative storytelling.
There are many pros and cons for using 24p, 25p or 50i, and each of them has specific applications, even if it's just an aesthetic choice. The bottom line is that if you want your video to closely approximate the feel of 35mm film, shoot at 24fps.
Committing to that frame rate means you also need to be aware of its limitations. Fast motion can lead to stuttering or strobing which can be distracting to the viewer. Again, overcoming these limitations is another discussion for another day but know that they are exactly the same as what you would encounter shooting on film.

ASPECT RATIO

The term Aspect Ratio refers to the relationship between the width and the height of a picture. 16x9 means that the screen is 16 units wide and 9 units high. HD video and some SD video is shot with a 16x9 aspect ratio. There are many, including myself, who feel this widescreen aspect also makes footage look more filmic. I go one better and matte my footage to a Cinemascope ratio, which is the same ratio used on many blockbuster films today. It has a much wider aspect than 16x9 and gives the frame more of an epic feel. This is particularly effective in HD, where there is a lot of resolution that is absent from standard definition video, especially in wide angle shots.
16x9 is the standard aspect ratio in today’s world of TV and video.
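If you matte in post rather than in camera, the arithmetic is simple. A quick sketch with my own example values (a 1920x1080 frame matted to roughly 2.39:1; pick whatever target ratio suits the project):

```python
def scope_matte(width=1920, height=1080, target_ratio=2.39):
    """Height of the visible image and of each black bar when matting
    a 16x9 frame to a wider 'Scope-style ratio."""
    visible = round(width / target_ratio)   # 1920 / 2.39 -> about 803 lines
    bar = (height - visible) // 2           # about 138 lines top and bottom
    return visible, bar

print(scope_matte())   # (803, 138)
```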

35MM ADAPTERS

Another staple of the film look is having control over depth of field. Because of their small CCDs, consumer- and prosumer-level video cameras have a large depth of field, meaning nearly everything is in focus. This can be a challenge when you are trying to focus the viewer's eye on a particular thing in a busy frame.
35mm adapters allow you to use all the advantages of film lenses right on your video camera. The light travels through the 35mm lens and projects an image onto a small vibrating or rotating screen. The camera then records this tiny screen in macro mode.
Not only do you get control over depth of field, but there is also an organic quality introduced to the picture. Gone are the trademark sharp edges of video (particularly HD) and what's left is a pleasing picture that's less hard-edged and more like film.

KNOW YOUR LIMITS

Your video camera is just that. It is not a  film camera. Know the differences and you are halfway there. Try to avoid high contrast scenes and you will get a better looking picture. If you can’t avoid these kinds of situations, invest in a graduated ND filter or a polarizer. Do research on the Web and find out what you can do to minimize problems when shooting in uncontrollable environments. When you can, use lighting to your advantage. Good lighting can bring out the very best footage possible from your camera and there will be times when you fool yourself into thinking it was shot on film. To shoot well is to know the limitations of your tools and find creative solutions.
There are many other elements that go into creating a film look including art direction, composition/framing and becoming intimate with the language of film. What I have listed above are just some fundamentals to get you started.

Thursday, July 7, 2011

video compression




Compression is not the resolution of the video, but video resolution has a lot to do with how much information will need to be compressed. Common video formats are 720p, 1080i, or 1080p. The 720/1080 part is pretty straightforward: it simply refers to the number of pixels on the vertical scale of the image. 720 is a 1280×720 pixel image; 1080 is a 1920×1080 pixel image for each frame of video. That's a big difference in the amount of information: a 720 frame has 921,600 pixels, while a 1080 frame has just over 2 million pixels.
The i and p parts refer to whether the frame is interlaced or progressively scanned. To be (very, very) brief, progressively scanned is better, especially when there is a lot of motion in the image, but doesn't make as much difference when objects in the image are fairly static. 1080p is the best of both worlds, but takes a lot of bandwidth to do (it's overkill for web video or most television, for example). 720p and 1080i actually both take up roughly the same amount of bandwidth and are what are used for HD television, for example (some networks use 720p, some 1080i). 1080p generates significantly more data than either of the other two formats. That's why many lower-end cameras, storage devices, etc. can handle either 720p or 1080i, but not 1080p.
The other variable that is not compression, but that does influence how much data must be compressed, is the FPS (frames per second) the camera records. Film cameras shoot at 24 FPS. Many, but not all, video cameras can shoot at 24 FPS for a 'film-like' look, but standard video is usually shot at 30 FPS or 25 FPS. (Note: these standards refer to the US, the UK, and a few other nations. There are other standards worldwide.) Some lower-end cameras shoot at lower frame rates than these, and some high-end cameras can shoot at much higher frame rates.
Obviously a 1080p image shot at 60 frames per second is going to generate a lot more data than a 720p image shot at 24 FPS. The bottom line, though, is that video generates a lot of information: 1 to 2 million pixels recorded 24 to 30 times a second, at 8 to 16 bits per pixel, plus audio, is a lot of data to record. And, practically speaking, it has to be compressed somehow to make the file size manageable.
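To get a feel for the numbers, here is a minimal sketch (in Python, assuming 24 bits per pixel and ignoring audio) of the uncompressed data rate for a given frame size and frame rate:

```python
def raw_data_rate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed data rate in megabits per second (no audio, no compression)."""
    return width * height * fps * bits_per_pixel / 1_000_000

print(raw_data_rate_mbps(1280, 720, 24))   # 720p at 24 FPS: roughly 531 Mb/s
print(raw_data_rate_mbps(1920, 1080, 60))  # 1080p at 60 FPS: roughly 2,986 Mb/s
```

Compare those numbers with the 17–50 Mb/s recording bit rates discussed below and it’s clear why compression is unavoidable.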

Compression and Bit rate

Simply put, bit rate (usually expressed as megabits per second, or Mb/s) is the amount of data recorded each second. After the camera (or computer) has done its compression thing, the file size will equal bit rate × seconds of video. If you dig around, you can often find the maximum bit rate that a camera, storage device, or processor handles. In theory, a higher bit rate means more data is stored, which (assuming everything else is equal) means higher-quality compression. But there are a lot of other variables.
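As a quick sanity check, here’s a minimal sketch of that file-size arithmetic (the 35 Mb/s figure is just an example, roughly what an XDCAM camera records at):

```python
def file_size_gb(bit_rate_mbps, seconds):
    """Estimated file size in gigabytes: bit rate (Mb/s) x duration, converted to bytes."""
    bits = bit_rate_mbps * 1_000_000 * seconds
    return bits / 8 / 1_000_000_000

# 10 minutes of 35 Mb/s footage
print(f"{file_size_gb(35, 10 * 60):.2f} GB")  # about 2.6 GB
```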
Different cameras use different codecs (COmpressor-DECompressor algorithms) to compress the data. Your camera choice sets your choice of compression algorithm (more on this later), since different manufacturers have chosen different codecs. In general, compression algorithms are sorted into two general categories (this applies to audio and other data too, not just video): lossy and lossless. Lossless means that after decompression each pixel is exactly the same as the original; no data is lost. There are no video cameras (other than a few amazingly expensive professional cameras) that record losslessly.
Lossy compression isn’t an exact pixel-for-pixel match when uncompressed, but it offers much higher compression ratios than lossless compression. (The compression ratio compares the size of the original video to the size of the compressed video. Uncompressed video is 1:1. Lossy ratios can get very high; 200:1 isn’t unheard of for some heavily compressed video formats. The codecs used in video cameras offer better quality but less dramatic compression ratios, more on the order of 50:1.) There are lots of different codecs in use today. The better codecs are usually newer and offer a higher compression ratio with similar image quality. For example, MPEG-4 gives a higher-quality image than MPEG-2 at the same bit rate. Some high-end cameras, though, use less aggressive codecs with less compression to maintain the best possible image quality. In exchange for that, they require significantly higher bit rates to record their data.
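For example, here is a rough back-of-the-envelope calculation (assuming 8-bit, 4:4:4 1080p at 30 FPS measured against a 35 Mb/s recording) of the kind of ratio an in-camera codec works at:

```python
# Uncompressed bytes per second for 8-bit 1080p at 30 FPS (3 bytes per pixel)
original = 1920 * 1080 * 3 * 30        # about 186.6 million bytes per second
# Bytes per second actually written at a 35 Mb/s recording bit rate
compressed = 35_000_000 / 8            # about 4.4 million bytes per second

print(f"{original / compressed:.0f}:1")  # roughly 43:1
```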

How video compression works

There are two ways to compress the data in a video clip: intraframe and interframe. Intraframe compression takes each frame of the video and compresses it just like you would use JPEG to compress a still image (in fact one format, Motion JPEG, does exactly that). With intraframe compression every frame of the video is complete, although slightly compressed. This can be important if your video has lost a frame – since the frames before and after the lost frame are complete, not much damage is done. It’s also important when you cut and paste video clips – the video editing software needs a complete frame at the beginning and end of each transition. Intraframe compression, though, doesn’t make the file size all that much smaller. Compression ratios of about 20:1 are about as good as it can do.
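A minimal Motion-JPEG-style sketch of the idea, assuming Pillow and NumPy are available and using a synthetic gradient frame (a smooth image that JPEG handles well):

```python
import io
import numpy as np
from PIL import Image

# One synthetic 1280x720 RGB frame: a smooth horizontal gradient.
row = np.linspace(0, 255, 1280).astype(np.uint8)
frame = np.dstack([np.tile(row, (720, 1))] * 3)

raw_size = frame.nbytes  # 2,764,800 bytes uncompressed

# In a Motion JPEG stream every frame is encoded this way, independently of its neighbours.
buf = io.BytesIO()
Image.fromarray(frame).save(buf, format="JPEG", quality=75)
jpeg_size = buf.getbuffer().nbytes

print(f"per-frame ratio: {raw_size / jpeg_size:.0f}:1")
```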
To get more significant compression, video codecs also use interframe compression. The basic idea is simple. Video consists of multiple still frames (typically anywhere from 24 to 60 per second). Interframe compression looks at each frame, compares it to the previous one, then stores only the data that has changed. Usually it doesn’t look at individual pixels, but rather at square blocks of pixels (which is less time-consuming and resource-intensive). Each frame in an interframe-compressed video therefore contains only the changed parts of the image.
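A minimal sketch of that block-comparison idea (hypothetical 16×16 blocks, grayscale frames as NumPy arrays):

```python
import numpy as np

BLOCK = 16  # compare 16x16 pixel blocks rather than individual pixels

def changed_blocks(prev, curr):
    """Return the (row, col, pixels) of every block that differs from the previous frame."""
    blocks = []
    for by in range(0, curr.shape[0], BLOCK):
        for bx in range(0, curr.shape[1], BLOCK):
            p = prev[by:by + BLOCK, bx:bx + BLOCK]
            c = curr[by:by + BLOCK, bx:bx + BLOCK]
            if not np.array_equal(p, c):
                blocks.append((by, bx, c.copy()))
    return blocks

# Two nearly identical 720p grayscale frames: only a small object appears.
prev = np.zeros((720, 1280), dtype=np.uint8)
curr = prev.copy()
curr[100:132, 200:232] = 255

delta = changed_blocks(prev, curr)
print(len(delta), "of", (720 // BLOCK) * (1280 // BLOCK), "blocks need to be stored")  # 9 of 3600
```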
But interframe compression brings a new problem: what happens when you’re sending this video somewhere (or importing it) and it skips a frame? If each frame references only the previous frame, you’re in trouble until the entire picture has changed; in a 3-minute clip of the same scenery, the image could stay broken for a very long time. The same problem occurs if you want to cut the scene halfway through that 3-minute clip: the frame at your transition wouldn’t be a complete frame, just a record of the changes from the previous frame. And so on. The solution all interframe compression formats use is the key frame.

Key frames and long-GOP compression

Interframe compression codecs record a key frame every so often: a frame that contains the entire image data set, whether the scene has changed or not. The key frame is recorded every x number of frames (usually 15) and contains a complete image. The next group of frames (until the next key frame) is heavily compressed, containing only the changes from the previous key frame. Using this method, if you skip a frame, you lose (at most) 15 frames before you’re good to go again (or until your next editing point). It’s still a relatively long time, but it allows for a much smaller file size than intraframe compression alone. This key frame, followed by the compressed frames that run until the next key frame, is called a GOP (group of pictures). Since a fairly long group of images is associated with each key frame, this is often referred to as long-GOP compression.
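A minimal sketch of what that GOP structure looks like, assuming the usual 15-frame interval:

```python
KEYFRAME_INTERVAL = 15  # a complete key frame every 15 frames

def frame_types(num_frames, interval=KEYFRAME_INTERVAL):
    """Label each frame 'I' (complete key frame) or 'P' (changes-only frame)."""
    return ''.join('I' if i % interval == 0 else 'P' for i in range(num_frames))

print(frame_types(45))
# prints three GOPs: an 'I' followed by fourteen 'P's, repeated

# If a frame is dropped, at most one GOP's worth of frames is damaged
# before the next key frame restores a complete picture.
print("worst case:", KEYFRAME_INTERVAL - 1, "unusable frames")
```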
A final note about frame skipping: it’s rare. In fact, it almost never happens when using quality equipment. Because of this, long-GOP encoding is usable and safe. Intraframe-only compression does protect against frame skips, but requires a lot more disk space (and a higher bit rate for the same quality image). Since video editing software can only cut at a key frame, some high quality video recording devices (like the nanoFlash that started this discussion) will record video with only intraframe compression (a half-second until the next key frame can be an eternity to a video editor), but the resulting files can get very, very large.

Luminance and Color Compression

Since the days when video was analog, luma (the black and white values) and chroma (the color) have been stored separately. Y’CbCr is how video is stored today, typically using a process known as chroma subsampling. Y’ (sometimes simplified to just Y) is the luma (grayscale) information. Cb and Cr each store a portion of the color information (like LAB color space in Photoshop).
We are very sensitive to the grayscale values of an image but less sensitive to color, so video cameras today discard some of the color information to further compress the video data. The proportions are usually shown as a ratio, with 4 indicating no subsampling. Recording video at 4:4:4 would be ideal, but it takes up an enormous amount of space and isn’t feasible in most situations with today’s equipment. Top-quality video formats, like XDCAM422 and DVCPRO HD, keep twice as much luma data as either color channel (Cb and Cr), in a ratio of 4:2:2. This reduces the bit rate by a third with very little image compromise. Other video formats such as HDV, AVCHD, MJPEG, and MPEG-2 (DVD quality) use even less chroma data, storing video at a ratio of 4:2:0. This may sound extreme: DVD video is recorded at 4:2:0, so it is intentionally missing 3/4 of the color information originally present. Don’t we all think DVD is pretty high quality? Even Blu-ray stores its video at 4:2:0.
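A quick sketch of the relative data sizes (counting samples per 2×2 block of pixels, before any other compression is applied):

```python
# Samples per 2x2 pixel block: 4 luma (Y') samples plus however many chroma samples survive.
samples_per_2x2 = {
    "4:4:4": 4 + 4 + 4,  # full chroma
    "4:2:2": 4 + 2 + 2,  # chroma halved horizontally
    "4:2:0": 4 + 1 + 1,  # chroma halved horizontally and vertically
}

full = samples_per_2x2["4:4:4"]
for scheme, samples in samples_per_2x2.items():
    print(f"{scheme}: {samples / full:.0%} of the 4:4:4 data")
# 4:2:2 keeps 67% (the one-third saving mentioned above); 4:2:0 keeps 50%.
```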
Many professional video cameras use a 4:2:0 ratio to keep the bit rate manageable when recording in camera. When absolute image quality is critical, however, cameras that record 4:2:0 internally (the Sony EX1 and EX3, for example) often have an HD-SDI output that can deliver a higher-quality 4:2:2 signal, but this requires an external recording device (like the nanoFlash).

Recorded Bitrates

The simplest approach is to record at a constant bit rate: the same amount of data is stored for every second of video, regardless of how much the frames change over that second. With a set bit rate you know exactly how large a 5- or 10-minute video file will be, since the bit rate is fixed. When video was recorded to tape (MiniDV, for example), the bit rate had to be constant, because the tape moved at a constant speed. DV footage records to tape at a fixed rate of 25 Mbps. HDV, a descendant of DV that uses MPEG-2 compression, also records at a fixed rate of about 25 Mbps.
Most cameras and codecs today, however, record using variable bit rates because it is more efficient. The recording bit rate changes based on how much the information changes from frame to frame. If a frame is almost identical to the previous one, very little data is encoded; if a large part of the frame is changing, there is much more data, and a higher bit rate is recorded. The takeaway message, though, is that every recording device, whether in-camera or external, has a maximum bit rate it can handle. The various compression codecs have to deliver their data at a bit rate the recording device can accept, or bad things will happen: missed frames, jumping, etc.
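A toy comparison (using hypothetical per-second bit rates for a mostly static clip with a brief burst of motion) of why variable bit rate saves space:

```python
# Hypothetical per-second bit rates (Mb/s) for an 8-second clip: mostly static,
# with a short burst of motion in the middle.
per_second_mbps = [4, 4, 4, 22, 24, 24, 6, 4]

cbr_total = 24 * len(per_second_mbps)  # constant bit rate: every second costs the peak rate
vbr_total = sum(per_second_mbps)       # variable bit rate: quiet seconds cost very little

print(cbr_total, "Mb at CBR vs", vbr_total, "Mb at VBR")  # 192 Mb vs 92 Mb
```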

HDV, XDCAM, AVCHD, and every other codec

The terminology involved in the various codecs is beyond chaotic. To a video outsider it’s incomprehensible, but we can try to clarify things a bit. Like most simplifications, what follows is a bit generalized in the interest of keeping it easy to follow. We’ve left out and ignored some arguable points that would easily lead to four more pages of clarification in an effort to make it readable. As a general overview, however, this is a pretty reasonable summary. First, we need to separate containers (sometimes called formats) from the underlying codecs. A container is a format that can use (or be used by) many different (but not all) codecs. AVI, QuickTime, RealMedia, DivX, and many other containers exist, but they are (with a few exceptions) not actual codecs.
There are several codecs in common use today, each following a set of standards developed by the Moving Picture Experts Group (MPEG), the ITU-T Video Coding Experts Group (VCEG), or the Joint Video Team (JVT) formed from both groups. These standards provide a lot of customizable options to the various camcorder and software manufacturers. Some cameras let you choose between two codecs, but most offer only one. The reason? Different codecs require different processing algorithms to encode video. The processor in the camera (yes, cameras have processors very similar to computers) is designed to handle the encoding for that specific codec, the memory used to store the video is designed to handle the bit rate needed for decent quality from that codec, and so on.
The most current families of standard codecs from MPEG and VCEG are combined in the H.264/AVC/MPEG-4 standards. H.264/MPEG-4 (also referred to as MP4 at times) allows for a much lower bit rate than previous codecs while still achieving excellent quality. It is used not only for video compression during recording, but also for compression after editing. YouTube, Blu-ray, and the iTunes Store all use H.264 for encoding video. AVCHD codecs are H.264-based codecs used in newer high-end Sony and Panasonic cameras, and many other newer camcorders use H.264-based codecs as well.
Several other codecs remain in common use. Motion JPEG is used in many point-and-shoot video cameras and Nikon video SLRs. It doesn’t compress nearly as much as H.264 codecs, but it requires a lot less processing power and is particularly suitable for non-HD video and lower resolutions. The HDV and XDCAM families of codecs largely use MPEG-2 compression (DVCAM, by contrast, uses the older intraframe DV codec). These files aren’t usually as tightly compressed as H.264, although MPEG-2 Long-GOP comes close. These codecs are often found on high-end digital video cameras. Why don’t they use H.264? Depending on the source you read, it’s because the files are easier to edit, or because the manufacturer had lots of chips made for these codecs and was going to use them. I suspect both reasons are true.
So basically each manufacturer chooses which of the standard codecs to implement in their camcorder. Well and good. However, they then modify it a bit, build the chip they’ll install in the camera to run their version, and identify it with a cryptic set of initials in an apparent attempt to prevent anyone from understanding that their codec has anything in common with anyone else’s. Let’s look at one example. Sony and Panasonic jointly developed AVCHD (Advanced Video Codec High Definition) for their consumer camcorders, and it is also used by Canon. AVCHD is MPEG-4 AVC/H.264 compliant, so it can also get tagged with those initials. Panasonic tweaks AVCHD with some higher bit rates and markets that codec as AVCCAM in their professional cameras, or downgrades it to 720p recording only and calls that version AVCHD Lite. Sony calls their version NXCAM in their newest professional cameras (as opposed to XDCAM, a different codec used in many of their current high-end cameras). Canon and Panasonic use a High Profile Level 4.1 variant of the AVCHD codec in some cameras, which allows a maximum capture bit rate of 24 Mbits/sec, while most camcorders using AVCHD capture a maximum of 17 Mbits/sec. On the editing side, Adobe Premiere required a third-party plug-in to convert certain versions of AVCHD but does fine with others, Final Cut Pro converted this format to the Apple Intermediate Codec before editing was possible, and Vegas had no problems with the format at all.
Pretty confusing, huh? The takeaway message, with a lot of caveats, is that most codecs in higher-level cameras are MPEG-4/H.264 compliant and fairly similar in how effectively they compress video while maintaining quality. They may differ in offering 1080p (some don’t), in how high a bit rate they can record (which, given similar codecs, is a fair estimate of image quality), in how often they record a key frame (which may be user-adjustable in-camera), and in how easily your editing program can convert them into an editable format. There are a few common codecs that you’ll run across regularly, and they fall into several groups:
  • DV/DVC/DVCPRO/DVCAM – largely legacy technology, but many HD/HDV systems are backward compatible with DV/DVC, and it is used in some high-end video and broadcast cameras.
  • HD/HDV – Used by Sony, Canon, JVC, and Sharp, originally designed for recording to tape. It uses MPEG-2 compression, 4:2:0 chroma sub-sampling, and writes with a constant bit rate. Used in many tape-based camcorders, but also some digital recorders.
  • XDCAM – Designed by Sony, but also used by JVC, originally designed for recording to disc. (In some ways a container {see above} rather than just a codec, as most cameras using XDCAM can also record in DVCAM or MPEG-2 variants.) Uses an MPEG-2 or MPEG-2 Long-GOP codec, 4:2:0 chroma subsampling, and writes at a variable bit rate of up to 35 Mbits/sec. However, the XDCAM HD422 version uses a 4:2:2 chroma subsampling profile and writes at up to 50 Mbps.
  • AVCHD – Sony (NXCAM), Panasonic (AVCCAM). Uses MPEG-4/H.264 compression, 4:2:0 chroma subsampling, and writes at a variable bit rate of up to 24 Mbps. Note: some video editors need a third-party plug-in to convert certain AVCHD files to a usable format.
  • Motion JPEG – intraframe-only compression, usually used in point-and-shoot video cameras, but also the Nikon D90 and Pentax K7. It is less efficient than other codecs, so image size or frame rate is usually limited.
Or if you’d rather see what some common camcorders and video SLR cameras use (camera – codec – maximum bit rate):
  • Panasonic HVX200 – DVCPRO HD – 100Mbps
  • Sony EX1, EX3, JVC HM100 – XDCAM – 35Mbps
  • Sony Z7U – HDV – 25Mbps
  • Canon HV30, HV40 – HDV – 25Mbps
  • Canon HG21 – AVCHD – 24Mbps
  • Canon 5D MkII, 7D – H.264 – 40Mbps
  • Nikon D90/D300s/D3s – Motion JPEG – bit rate unknown
  • Panasonic GH-1 – AVCHD – 40Mbps
Compare those bit rates to what an external recorder like the nanoFlash can record: 230Mbps.

Conclusion

What does all of this mean? In general, you want the highest bit rate, using the most efficient compression algorithm possible. MPEG-4/H.264 codecs probably produce the best quality-to-compression ratio. However, top-end professional editing may require a less lossy format, such as an MPEG-2-based codec with resulting larger file sizes, to get the absolute best image quality. Some high-end cameras will let you take the video feed directly out to an external device and record it at an even higher bit rate with less compression for critical footage. Hence an external recorder like the nanoFlash provides higher bit rates (up to the 230Mbps mentioned above) and less compression than is possible in-camera. (A reality check: the 230Mbps bit rate of the nanoFlash is excessive for your $300 handycam or even the Canon HV40. Your image isn’t going to improve beyond the quality of the camera.)
What you intend to do with the footage after recording is also important. Some of the more heavily compressed codecs can be difficult to work with in a non-linear editor and require transcoding to an intermediate format (read: lots of processor power and hard drive space) for editing. Some of the simplest formats, like Motion JPEG, can be drag-and-drop edited in even the simplest programs. And less compressed but larger files (or even uncompressed files from certain high-end devices) can be a dream to edit and provide the absolute best quality after processing.