Sunday, September 2, 2007

Actual resolution of Bayer sensors - You get only half of what they tell you



Almost all current DSLR-type cameras are based on a Bayer-filter array sensor. This means that each pixel on a CCD or CMOS chip has its own color filter. The array is usually arranged in a square fashion where each square of four pixels has one pixel with a red filter, one pixel with a blue filter and two with green filters (on one of the two diagonals, for obvious reasons). There are two green ones because the human eye is most sensitive to green light and therefore to noise in the green channel. So a 10MP camera has 2.5 million red pixels, 2.5 million blue ones and 5 million green ones. The job of the RAW processor, in the camera or in the computer software if you shoot RAW, is to interpolate between the single colors to generate the missing color values at all pixel locations. Most RAW processors do a pretty good job at this, but there is a physical limitation imposed by the arrangement: the actual resolution of these sensors will never be as large as what is claimed. If you removed all the filters, you would get the claimed resolution in black and white.

An alternative is the Foveon sensor, such as the ones used by Sigma, which has three color pixels at the same site, arranged in layers. This gives a resolution exactly equal to the number of photosites. Unfortunately, Sigma chooses to simply count the number of photodetectors, which artificially inflates the resolution number by a factor of 3. For example, the SD14 is marketed as a 14.1 megapixel camera. This is extremely misleading, as the actual resolution is only 4.6 megapixels. The camera generates JPEGs at 14.1 megapixels, but the pixels in between the actual photosites are simply interpolated and do not add any extra information. Silly marketing!

Anyway, cameras with Bayer-array sensors suffer a similar problem, as I explained above. The excellent site dpreview.com actually tests the resolution of all the cameras they can get their hands on. They use high-resolution primes to make the comparison fair.
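The channel bookkeeping above is simple enough to sketch in a few lines of Python (a minimal illustration; the function name is my own):

```python
# Minimal sketch: photosite counts per color channel in a Bayer array.
# Each 2x2 block holds 1 red-, 2 green- and 1 blue-filtered pixel.
def bayer_channel_counts(total_mp):
    """Return (red, green, blue) photosite counts in megapixels."""
    return total_mp / 4, total_mp / 2, total_mp / 4

# A "10 MP" camera:
print(bayer_channel_counts(10.0))  # (2.5, 5.0, 2.5)
```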
Of course, comparing between cameras this way is the hallmark of a measurebator, as Ken Rockwell likes to say, so I am not going to compare cameras, just see if we can draw some conclusions about the Bayer technology. Dpreview gives numbers for the actual resolutions of the cameras they test in lines per picture height (LPH). For example, for the Nikon D200 the horizontal LPH is 2100 and the vertical 1700. Since this is a 6x4 sensor, the actual resolution of the sensor is 6/4 × 2100 × 1700 = 5.36 megapixels, about half the number of photosites on the sensor. Dpreview also gives an extinction resolution, beyond which all detail disappears but moiré artefacts are still visible. For the D200, this occurs at 7.4 megapixels. To see if we can learn some more about these Bayer-array sensors, I've taken the MTF data from dpreview.com for a large array of Nikon and Canon DSLR cameras and plotted the actual measured resolution against the number of photosites on the sensor (on the right), together with the "extinction resolution". The green line shows where the resolution would equal the camera's megapixel count. A clear trend is visible in both the real resolution and the extinction resolution. For all array densities, the actual resolution obeys:
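The D200 arithmetic above can be written out as a small helper (a sketch; the LPH figures are the ones quoted in the text):

```python
# Sketch: convert dpreview's LPH (lines per picture height) figures into
# effective megapixels for a 3:2 (6x4) sensor.
def effective_mp(horizontal_lph, vertical_lph, aspect=6 / 4):
    # Horizontal LPH is quoted per picture height, so multiply by the
    # aspect ratio to get the line count across the picture width.
    return aspect * horizontal_lph * vertical_lph / 1e6

# Nikon D200: horizontal LPH 2100, vertical LPH 1700
print(effective_mp(2100, 1700))  # 5.355, about half of the 10.2 MP claimed
```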

actual resolution = 0.58*MP

and the extinction resolution:

ext. res. = 0.8*MP.

So the actual resolution of your camera, if you have a Bayer sensor, is only about half of what is claimed! Extraordinarily good RAW processors might be able to get slightly more out of it, but never more than 0.8*MP. With Foveon sensors it is even worse: you get only 1/3 of the number the camera manufacturer claims, because they quote the number of photodetectors, while the only thing that matters is the number of pixel locations, which is a third of that.
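As a recap, the two fitted factors can be applied to any advertised Bayer megapixel count (a sketch using the numbers from the plot above; the constant names are my own):

```python
# Sketch: estimate real and extinction resolution from the advertised
# megapixel count of a Bayer-array camera, using the fitted factors.
ACTUAL_FACTOR = 0.58      # actual resolution / advertised MP
EXTINCTION_FACTOR = 0.8   # extinction resolution / advertised MP

def bayer_resolution(advertised_mp):
    return ACTUAL_FACTOR * advertised_mp, EXTINCTION_FACTOR * advertised_mp

actual, extinction = bayer_resolution(10.2)  # e.g. a D200-class 10.2 MP body
print(f"actual ~{actual:.1f} MP, extinction ~{extinction:.1f} MP")
```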

4 comments:

  1. It is easy to understand Sigma's problem in advertising the number of pixels; marketing a 4.6MP camera today would be very difficult indeed!

    They do state in the camera specs that 14mp is 2652x1768 x 3 layers, so they are as open as they can be while taking care of a sadly megapixel focused market.
    http://sigma-dp1.com/

    I for my part am happy with few megapixels, and their 13MB raw files will be wonderfully small to work with!

  2. nice one Jao.

    I've been using 2/3 which falls between 0.58 and 0.8.

    currently all the cameras are resolution challenged and Foveon doesn't provide higher QE either so we are going to have Bayer for a while, at least until 36MP becomes the lowest resolution on the market.

  3. Actually, even a monochrome sensor cannot deliver an image that genuinely has the same native resolution as the sensor. Pixels have to be interpolated, or at least estimated from the combined input from several pixels, to create an image with as many pixels as the sensor has photosensitive elements. This is because the maximum resolvable spatial frequency is one half of the sampling frequency. That is to say (for example), if a lens delivers the equivalent of 2000 discernible pixels across the frame, then it takes 4000 photosensitive elements to capture them accurately.

    Suppose we have a 16Mp sensor, and to make the maths easy let us assume that it is square, i.e. 4000 by 4000 pixels. Then the most detail that it can resolve linearly is 2000 pixels, and the true information content of the whole image is 4 genuine megapixels.
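    The arithmetic in this comment can be sketched as follows (an illustration of the commenter's Nyquist argument only; the function name is my own):

    ```python
    # Sketch of the Nyquist argument: the maximum resolvable spatial
    # frequency is half the sampling frequency, so n photosites per side
    # resolve at most n/2 lines per side.
    def genuine_megapixels(photosites_per_side):
        resolvable_lines = photosites_per_side // 2  # Nyquist limit per axis
        return (resolvable_lines ** 2) / 1e6

    # A square "16 MP" sensor: 4000 x 4000 photosites
    print(genuine_megapixels(4000))  # 4.0 genuine megapixels
    ```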

    Obviously the situation is worse when we superimpose a Bayer Matrix on a single layer sensor. In the example above it reduces the amount of Green detail that is captured below the 2000 pixel theoretical limit, and reduces Blue and Red even further, so a huge amount of guesstimation goes into constructing the missing pixels.

    The interpolated pictures generated from a sensor and Bayer matrix may look nice, and many of the interpolated pixels will be close to correct, but there is no way that a 16Mp image file created from a 16Mp sensor can be a completely accurate representation of the scene. The heuristics that are used in interpolation are chosen so that a typical range of pictures ends up looking good.

    A Foveon X3 Merrill sensor captures data from 45 million photosites from which it constructs a 15Mp image file. Is this as "good" as an image from a Bayer sensor of 30 million photosites (+/- a few Mp) delivering a 30+ Mp image? At first sight the Foveon should give the more faithful representation, as it captures more data in the first place.

    There is no simple way to answer this from theory. Signal to Noise ratios affect things, but most important are the so-called "demosaicing" algorithms that construct a 30Mp image from the available information in a Bayer sensor, and the way the data from the Foveon is interpolated up to 15 Mp.

    What really matters is real life images, and comparing Foveon and Bayer sensors shows that it is a close call between the 15Mp from the X3 Merrill (e.g. SD1) and 30+ Mp from a Bayer sensor (e.g. Nikon D800).

    When I compare 15Mp Foveon files (from the 45 Mega-site Merrill chip) with the 36Mp files from a Nikon D800 I see that the Nikon seems to have slightly more detail overall, but that in blue and red areas (especially red) the Foveon retains more of the variation in tone (and hence the detail) that was in the original.

    No company that is trying to sell a camera seems prepared to be honest about the true relationship between the number of photosensitive sites, their arrangement, and what it means for the final image file. Most of the confusion could be avoided if we distinguished between the number of photosites on the sensor (whether in a single layer with a Bayer Matrix or in three layers) and the number of pixels in the derived image.

    1. There seems to be no way, on this site, to delete or edit the nonsense that we write before we have a full understanding. (I seem to suffer as much from the Dunning-Kruger effect as anyone).

      I still don't have a full understanding, but at least now I know what I know, and realise that there is a lot that I don't, or that I don't properly understand. So I want to correct the comment above.

      The maximum resolution that a monochrome sensor could theoretically attain does indeed match the pixel resolution of the sensor. The theory says that the limit is when the sampling frequency is twice the frequency being sampled. In photography that means we need to take two samples over the distance that the scene goes from light to dark and back to light again. So we need two pixels to capture adjacent dark and light lines that are each of a pixel width. But that can happen only if the peaks and troughs of light intensity in the scene line up perfectly with the sensor elements. This is not going to happen in any real-world scene.

      It turns out, when you do some real world measurement rather than making theoretical arguments, that you have to sample at three times the limiting case to resolve a dark/light line pair in real life (http://www.clarkvision.com/articles/sampling1/).

      In other words it takes 6 pixels rather than two to be sure of recording a transition from light to dark and back to light again. The amount of true detail that a sensor (even a monochrome or Foveon sensor) can capture is obviously a lot less than the number of pixels!
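      The practical factor quoted above can be put into numbers (my own illustration of this comment's point; the factor of 3 samples per line comes from the Clark article linked above):

      ```python
      # Sketch: in practice (per the Clark article) you need about 3 samples
      # per line, i.e. 6 pixels per light/dark line pair, rather than the
      # 2 pixels per line pair that the Nyquist limit suggests.
      def resolvable_line_pairs(pixels_per_side, pixels_per_line_pair=6):
          return pixels_per_side // pixels_per_line_pair

      side = 4000  # the square "16 MP" sensor from the earlier comment
      print(resolvable_line_pairs(side, pixels_per_line_pair=2))  # Nyquist ideal: 2000
      print(resolvable_line_pairs(side))                          # real world: 666
      ```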

      So what is going on when an X Mp sensor returns an X Mp image that appears to be accurate down to the pixel level? It is very difficult to figure out theoretically because there are so many complicating factors:

      - What level of contrast we deem to be sufficient to distinguish light from dark

      - In the case of a sensor using a Colour Filter Array (such as Bayer) how much this further reduces the amount of detail that can be captured.

      This second problem is made even more difficult because the amount of true detail available to a Bayer-array sensor will depend on the distribution of colours. Less true detail is available in a pure red or pure blue image than in a pure green one.

      - The success rate of the algorithms by which the missing colour content of each pixel is estimated is also a factor. The demosaicing algorithms that have been developed to date seem to do a tremendously good job most of the time.

      And finally, no sensor can capture more true detail than is returned by the lens!

      It is a nightmare.

      Fortunately, for the practical photographer, at least for the photo-artist, none of this matters. We can LOOK at the results and decide whether or not they are what we wanted. As Debussy, scornful of theoretical arguments said of music, "The ears have it". In photography the eyes have it.

      It may be more of a problem for a forensic photographer, or a scientist wanting an accurate representation of some complex structure.
