Layers and Alpha Maps as Visual Representations

Abstract

The basic vocabulary of computer graphics and digital illustration includes ‘layers’ and ‘alpha maps’. A layer is a set of registered 2-D arrays (maps) that hold information about colour (the RGB channels) and transparency (the alpha channel) at each point. Layers are ordered in depth, but need not have metric depth. Metelli used a similar vocabulary to analyse perceptual transparency, and related descriptions have been used to understand illusory contours, brightness illusions, etc. We believe that layer concepts can be extended to have broad utility in human vision and computer vision. By adding a velocity map to a layer, and imposing smoothing within the layer (but not across layers), we have found that many classically difficult problems in motion analysis become more tractable. The problems that usually arise with segmentation and motion boundaries are alleviated, because the discontinuities are assigned to the alpha maps rather than to the velocity maps. Other challenging phenomena, such as motion blur, motion transparency, and moving shadows, also become more tractable within this framework. Decomposition of a scene into layers is a useful stage of processing even when it does not represent the final output; later stages that seek 3-D representations can benefit from an initial stage of layered processing. We suggest that layers with alpha maps may be a basic representation in many aspects of human vision.
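
As an illustration of the vocabulary used above, the following is a minimal sketch of how depth-ordered layers with alpha maps might be combined into an image. It assumes the standard back-to-front ‘over’ rule from computer graphics; the abstract does not specify a compositing rule, and the names `Layer` and `composite` are hypothetical. For brevity each layer carries a single intensity map rather than three RGB channels, and the velocity map is omitted.

```python
from dataclasses import dataclass
from typing import List

Map = List[List[float]]  # a 2-D array indexed as [row][col]


@dataclass
class Layer:
    colour: Map  # one intensity channel for brevity (assumption); RGB would use three maps
    alpha: Map   # opacity in [0, 1] at each point, registered with the colour map


def composite(layers: List[Layer], background: float = 0.0) -> Map:
    """Combine layers back to front; layers[0] is the farthest in depth.

    Only the ordering of the layers matters here, not any metric depth.
    """
    rows = len(layers[0].alpha)
    cols = len(layers[0].alpha[0])
    image = [[background] * cols for _ in range(rows)]
    for layer in layers:
        for r in range(rows):
            for c in range(cols):
                a = layer.alpha[r][c]
                # 'Over' rule: the nearer layer covers what lies behind it
                # in proportion to its opacity at this point.
                image[r][c] = a * layer.colour[r][c] + (1.0 - a) * image[r][c]
    return image


# A fully opaque far layer, and a near layer that is half transparent at the
# left pixel and absent (alpha 0) at the right pixel.
far = Layer(colour=[[0.2, 0.2]], alpha=[[1.0, 1.0]])
near = Layer(colour=[[1.0, 1.0]], alpha=[[0.5, 0.0]])
image = composite([far, near])
print(image)  # left pixel is about 0.6, right pixel is about 0.2
```

In this sketch the boundary of the near layer lives entirely in its alpha map, so any per-layer quantity such as a velocity map could remain smooth within the layer, which is the property the abstract emphasises.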
