A raster display shows images as rectangular arrays of pixels. Because raster devices are so prevalent, raster images are also the most common way to store and process images. A raster image is simply a 2D array that stores the pixel value for each pixel. Sending an image's pixels one-to-one to the display's pixels is often not what we want, though: the image and the display rarely have the same number of pixels, and we sometimes want to change the image's orientation or size.

A vector image is described by storing descriptions of shapes (lines or curves) with no reference to any particular pixel grid. It stores instructions for displaying the image rather than the pixels needed to display it, so it is resolution independent. Vector images are often used for text, diagrams, mechanical drawings, …

Raster Devices

  • Output
    • Display
      • Transmissive: liquid crystal display (LCD)
      • Emissive: light-emitting diode (LED) display
    • Hardcopy
      • Binary: ink-jet printer
      • Continuous tone: dye sublimation printer
  • Input
    • 2D array sensor: digital camera
    • 1D array sensor: flatbed scanner

Output Devices

Emissive displays: directly emit controllable amounts of light, e.g. LED. Each pixel is divided into 3 subpixels (red, green, blue).

Transmissive displays: vary the amount of light allowed to pass through, e.g. LCD. Each pixel sits between two layers of polarizing film: one behind it (horizontal) and one in front of it (vertical). If the voltage is set so that the liquid crystal does not change the polarization, all light is blocked and the pixel is “off”. If the voltage is set so that the liquid crystal rotates the polarization by 90 degrees, then all the light that entered through the back of the pixel escapes through the front, and the pixel is “on”. Transmissive pixels have subpixels too. A transmissive display requires a separate light source to illuminate it (e.g. the lamp in a projector), whereas an emissive display is its own light source.

Binary images: pigment is either deposited or not; there are no intermediate amounts. For an ink-jet printer, the resolution is determined by the size of the smallest drop.

The thermal dye transfer process is an example of a continuous tone printing process. A print head contains a linear array of heating elements, one for each column of pixels in the image. As the paper and ribbon move past the head, the heating elements switch on and off to heat the ribbon in areas where dye is desired. The process is repeated for each of the dye colors. The resolution along the page is determined by the rate of heating and cooling compared to the speed of the paper. Resolution is described as pixel density: a printer with 300 heating elements per inch across its print head has a resolution of 300 pixels per inch (ppi) across the page.

Input Devices

Input devices make a light measurement for each pixel, usually based on arrays of sensors.

e.g. digital camera.

  • CCD (charge-coupled devices)
  • CMOS (complementary metal-oxide-semiconductor)

The lens projects an image of the scene onto the sensor, and each pixel then measures the light energy falling on it.

Color-filter array / mosaic: allows each pixel to see only red, green, or blue light. Image processing then fills in the missing values in a process known as demosaicking.
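As a rough illustration of the mosaic and demosaicking steps, the sketch below assumes an RGGB layout (red top-left, blue bottom-right, green on the other two sites of each 2×2 block) and an image with even width and height. Both the layout and the very crude block-wise fill are assumptions made for this example; real cameras use more careful interpolation. The names `mosaic_rggb` and `demosaic_blocks` are hypothetical.

```python
def mosaic_rggb(rgb):
    """Keep only one channel per pixel, following an assumed RGGB layout."""
    height, width = len(rgb), len(rgb[0])
    raw = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            r, g, b = rgb[y][x]
            if y % 2 == 0 and x % 2 == 0:
                raw[y][x] = r          # red site
            elif y % 2 == 1 and x % 2 == 1:
                raw[y][x] = b          # blue site
            else:
                raw[y][x] = g          # green site
    return raw


def demosaic_blocks(raw):
    """Crude demosaic: each 2x2 block shares its R, mean G, and B values."""
    height, width = len(raw), len(raw[0])
    out = [[None] * width for _ in range(height)]
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2
            b = raw[y + 1][x + 1]
            for dy in range(2):
                for dx in range(2):
                    out[y + dy][x + dx] = (r, g, b)
    return out


# A uniformly colored 2x2 image survives the round trip unchanged.
flat = [[(0.2, 0.4, 0.6)] * 2 for _ in range(2)]
restored = demosaic_blocks(mosaic_rggb(flat))
```

For images with fine detail, this block-wise fill loses resolution and produces color fringes, which is why practical demosaicking interpolates between neighboring sites instead.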

e.g. a flatbed scanner uses a 1D array that sweeps across the page being scanned. A color scanner has a $3 \times n_{x}$ array, where $n_{x}$ is the number of pixels across the page.

Images, Pixels, and Geometry

We can abstract an image as a function: $$ I(x, y): R \rightarrow V $$ where $R$ is a rectangular area and $V$ is the set of possible pixel values.

A pixel from a camera or scanner is a measurement of the average color of the image over some small area around the pixel. In other words, the pixel value is a local average of the image color, called a point sample. A pixel with value x means “the value of the image in the vicinity of this grid point is x”.
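The "local average" idea can be sketched by treating a fine grid of values as the underlying image and averaging square blocks of it to form each pixel. This is only an illustrative model (a box average over a grayscale image stored as a list of rows); `box_downsample` is a hypothetical helper, not something from the text.

```python
def box_downsample(image, factor):
    """Average each factor x factor block of a grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - factor + 1, factor):
        row = []
        for x in range(0, width - factor + 1, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Left block: a tiny checkerboard. Right block: uniform mid-grey.
fine = [
    [0.0, 1.0, 0.5, 0.5],
    [1.0, 0.0, 0.5, 0.5],
]
coarse = box_downsample(fine, 2)  # one pixel per 2x2 block
```

Both output pixels come out as 0.5: once averaged, the checkerboard block and the uniform grey block are indistinguishable, which is exactly what "the value of the image in the vicinity of this grid point" implies.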

Pixel Value

Treating pixel values as real numbers suggests that images should be arrays of floating-point numbers, with either one (grayscale) or three (RGB) 32-bit floating-point numbers stored per pixel. High dynamic range (HDR): images stored with floating-point numbers, which allow a wide range of values. Low dynamic range (LDR): images stored with fixed-range integers.

Some pixel formats with typical applications:

  • 1-bit grayscale: text and other images where intermediate grays are not desired.
  • 8-bit RGB fixed-range color: web, email, consumer photographs.
  • 8- or 10-bit fixed-range RGB: digital interfaces to computer displays.
  • 12- to 14-bit fixed-range RGB: raw camera images
  • 16-bit fixed-range RGB: professional photography and printing; intermediate format for image processing of fixed-range images.
  • 16-bit fixed-range grayscale: radiology and medical imaging
  • 16-bit “half-precision” floating-point RGB: HDR images; intermediate format for real-time rendering.
  • 32-bit floating-point RGB: general purpose intermediate format for software rendering and processing of HDR images.
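To get a feel for what these formats cost in memory, here is a rough size calculation. It assumes tightly packed pixels with no padding, header, or compression; the helper `image_bytes` and the 1920×1080 frame size are illustrative choices, not from the text.

```python
def image_bytes(width, height, channels, bits_per_channel):
    """Bytes needed for a tightly packed, uncompressed image (rounded up)."""
    total_bits = width * height * channels * bits_per_channel
    return (total_bits + 7) // 8


width, height = 1920, 1080  # assumed example resolution
sizes = {
    "1-bit grayscale": image_bytes(width, height, 1, 1),
    "8-bit RGB": image_bytes(width, height, 3, 8),
    "16-bit half-float RGB": image_bytes(width, height, 3, 16),
    "32-bit float RGB": image_bytes(width, height, 3, 32),
}

for name, size in sizes.items():
    print(f"{name}: {size / 1e6:.2f} MB")
```

Under these assumptions a 32-bit floating-point RGB frame takes four times the memory of an 8-bit RGB frame, which is one reason lower-precision formats are used wherever their range and precision suffice.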

Reducing the number of bits used to store each pixel leads to 2 distinctive types of artifacts.

  • clipping: when a pixel value should be greater (brighter) than the largest representable value, it is set to the maximum value
  • quantization/banding: rounding pixel values to the nearest representable value introduces visible jumps in intensity or color. Very visible in animation.
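Both artifacts can be reproduced with a few lines. The sketch below assumes pixel values nominally in [0, 1] and a fixed-range format with `2**bits` evenly spaced levels; `quantize` is a hypothetical helper written for this example.

```python
def quantize(value, bits):
    """Clip value to [0, 1], then round to the nearest of 2**bits levels."""
    max_level = (1 << bits) - 1
    clipped = min(max(value, 0.0), 1.0)        # clipping
    return round(clipped * max_level) / max_level  # quantization


# A smooth 9-step ramp stored with only 2 bits collapses to 4 levels (banding).
ramp = [i / 8 for i in range(9)]
banded = [quantize(v, 2) for v in ramp]

# A value brighter than the format allows is clipped to the maximum.
too_bright = quantize(1.7, 8)
```

Printing `banded` shows runs of repeated values where the original ramp increased steadily; in an image those runs appear as visible bands.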

Monitor Intensities and Gamma

Real monitors have some non-zero intensity when they are off because the screen reflects some light. We assume a numeric description of pixel color that ranges from zero to one: black is zero, white is one. Monitors don’t map input values to displayed intensity linearly; for example, they might display [0, 0.5, 1.0] as [0, 0.25, 1.0]. This nonlinearity is characterized by a value called gamma ($\gamma$): $$ \text{displayed intensity} = (\text{maximum intensity}) a ^{\gamma} $$ where $a$ is the input pixel value between zero and one.
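The display model is simple enough to check numerically. The sketch below assumes $\gamma = 2$, which is the value that reproduces the [0, 0.25, 1.0] example; `displayed_intensity` is a hypothetical name for the formula above.

```python
def displayed_intensity(a, gamma, max_intensity=1.0):
    """Monitor response: max_intensity * a**gamma, for input a in [0, 1]."""
    return max_intensity * a ** gamma


inputs = [0.0, 0.5, 1.0]
shown = [displayed_intensity(a, gamma=2.0) for a in inputs]
print(shown)  # [0.0, 0.25, 1.0]
```

Note that the endpoints are unaffected by gamma; only the intermediate values are darkened (for $\gamma > 1$).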

A convenient way to find gamma is to find the input value $a$ that displays as the halfway grey between 0 and 1.

Recall the definition of logarithms: $$ \begin{gather} y = \log_a{x} \leftrightarrow a^{y} = x \\ \ln{x} \equiv \log_e{x} \end{gather} $$ For the “halfway” grey, whose displayed intensity is half the maximum, we can deduce: $$ \begin{gather} 0.5 = a^{\gamma} \\ \log_a{0.5} = \gamma \\ \frac{\ln{0.5}}{\ln{a}} = \gamma \end{gather} $$ To find the value $a$, we can use the checkerboard technique:

(Figure: checkerboard)

Adjust the value $a$ of the grey pixels until the grey square and the checkerboard appear equally bright from a distance. The blurred checkerboard mixes equal numbers of black and white pixels, so its overall appearance is a uniform halfway grey. Conversely, if $\gamma$ is known, the same relation gives the input value that displays as halfway grey: $$ \begin{gather} \ln{a} = \frac{\ln{0.5}}{\gamma} \\ e^{\ln{a}} = e^{\ln{0.5^{\frac{1}{\gamma}}}} \\ a = 0.5^{\frac{1}{\gamma}} \end{gather} $$
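The two directions of this relation can be written as a pair of small functions. This is a minimal sketch; the names `gamma_from_match` and `match_from_gamma` are made up for illustration, and the matched value 0.73 is an arbitrary example input rather than a measurement.

```python
import math


def gamma_from_match(a):
    """Gamma implied by the grey input a that matches the checkerboard."""
    if not 0.0 < a < 1.0:
        raise ValueError("matched grey value must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(a)


def match_from_gamma(gamma):
    """Input value that displays as halfway grey for a given gamma."""
    return 0.5 ** (1.0 / gamma)


# Suppose the grey square matched the checkerboard at a = 0.73 (example value).
estimated_gamma = gamma_from_match(0.73)
print(round(estimated_gamma, 2))
```

A matched value of exactly 0.5 would mean the display is linear ($\gamma = 1$); the brighter the matching grey has to be, the larger the gamma.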

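Once we know $\gamma$, we can gamma correct the input so that a value of $a$ is displayed with an intensity proportional to $a$: $$ a^{\prime} = a^{\frac{1}{\gamma}} $$

Plugging $a^{\prime}$ into the displayed intensity equation, we have: $$ \text{displayed intensity} = (a^{\prime})^{\gamma} (\text{maximum intensity}) = (a^{\frac{1}{\gamma}})^{\gamma} (\text{maximum intensity}) = a(\text{maximum intensity}) $$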
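A quick numerical check of this cancellation, as a sketch: `gamma_correct` and `display` are hypothetical names for the two formulas, and $\gamma = 2.2$ is just an assumed example value.

```python
def gamma_correct(a, gamma):
    """Pre-distort the input so the monitor's response cancels out."""
    return a ** (1.0 / gamma)


def display(a, gamma, max_intensity=1.0):
    """Monitor response model: max_intensity * a**gamma."""
    return max_intensity * a ** gamma


gamma = 2.2  # assumed example value
for a in (0.1, 0.5, 0.9):
    uncorrected = display(a, gamma)
    corrected = display(gamma_correct(a, gamma), gamma)
    print(f"a={a}: uncorrected={uncorrected:.3f}, corrected={corrected:.3f}")
```

The corrected column tracks the input (up to floating-point rounding), while the uncorrected column is darker than requested for every intermediate value.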

RGB Color

(Figure: RGB colors)

There are 3 primary lights: red, green, and blue. The lights mix in an additive manner, in contrast to the RYB color system, which uses subtractive color mixing. If each primary light ranges from fully off to fully on (0 to 1), the combinations create all the colors that can be displayed on an RGB monitor. When we plot them in 3D coordinates:

(Figure: RGB cube)

yellow = (1, 1, 0); magenta = (1, 0, 1); cyan = (0, 1, 1)

Each component is usually specified with an integer, most commonly 1 byte (0 to 255). 3 components make 24 bits. Thus a system that has “24-bit color” has 256 possible levels for each of the three primary colors.
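One common way to hold a 24-bit color is to pack the three bytes into a single integer. The sketch below puts red in the highest byte, which is an assumption for this example (byte order varies between systems); `pack_rgb` and `unpack_rgb` are hypothetical helpers.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit components into one 24-bit integer (red highest)."""
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in 0..255")
    return (r << 16) | (g << 8) | b


def unpack_rgb(packed):
    """Recover the (r, g, b) bytes from a packed 24-bit integer."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF


yellow = pack_rgb(255, 255, 0)   # (1, 1, 0) scaled to 0..255
print(hex(yellow))               # 0xffff00
total_colors = 256 ** 3          # every combination of three 8-bit levels
```

With 256 levels per component, the packed value can take $256^3 = 2^{24}$ distinct values.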

Alpha Compositing

Compositing: a foreground layer is placed over a background. An opaque foreground pixel replaces the background pixel; a fully transparent one leaves the background unchanged; a partially transparent pixel needs some care. Partial transparency most often arises when blending at the edge of a foreground object or where there are sub-pixel holes.

Pixel coverage tells the fraction of the pixel covered by the foreground layer. With foreground color $c_{f}$, background color $c_{b}$, and coverage $\alpha$, the composited color is $$ c = \alpha c_{f} + (1 - \alpha) c_{b} $$ Opaque foreground layer: $\alpha$ is the fraction of the pixel’s area covered by the foreground object. Transparent foreground layer: the foreground lets a fraction $(1-\alpha)$ of the light from the background through and contributes a fraction $\alpha$ of its own color.
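The compositing equation applies per color channel. A minimal sketch, assuming colors are RGB tuples with components in [0, 1]; the function name `over` is a label chosen for this example.

```python
def over(foreground, background, alpha):
    """Blend per channel: c = alpha * c_f + (1 - alpha) * c_b."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

half = over(red, blue, 0.5)     # half-covered pixel
opaque = over(red, blue, 1.0)   # foreground replaces background
clear = over(red, blue, 0.0)    # background unchanged
print(half)                     # (0.5, 0.0, 0.5)
```

The two extreme cases reproduce the opaque and fully transparent behavior, and everything in between is a linear blend.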

Alpha mask / transparency mask: a separate grayscale image that stores the image’s alpha values. Alternatively, the alpha value is stored as a fourth channel in RGBA format, which takes up 32 bits per pixel when each channel uses 8 bits.

Image Storage

An image format is either lossless or lossy.

jpeg: a lossy format that compresses image blocks based on thresholds in the human visual system.

tiff: commonly used to hold binary images or losslessly compressed 8- or 16-bit RGB, although many other options exist.

ppm: an uncompressed format for 8-bit RGB images.

png: a lossless format with a good set of open source management tools.
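Because ppm is uncompressed, it is easy to write by hand. The sketch below encodes the binary "P6" variant with a maximum value of 255: a short ASCII header followed by raw RGB bytes. `encode_ppm` is a hypothetical helper, and the output filename in the usage comment is arbitrary.

```python
def encode_ppm(pixels):
    """Encode rows of (r, g, b) byte triples as a binary PPM (P6) image."""
    height = len(pixels)
    width = len(pixels[0])
    if any(len(row) != width for row in pixels):
        raise ValueError("all rows must have the same width")
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(c for row in pixels for pixel in row for c in pixel)
    return header + body


# A 2x1 image: one red pixel followed by one green pixel.
data = encode_ppm([[(255, 0, 0), (0, 255, 0)]])

# To save it to disk:
# with open("tiny.ppm", "wb") as f:
#     f.write(data)
```

The file size is just the header plus three bytes per pixel, which makes the cost of "uncompressed" easy to see compared with jpeg or png.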

Q: why use gamma?

A: human perception of intensity is itself nonlinear. Gamma between 1.5 and 3 makes the intensities approximately uniform in a subjective sense.