Essential Image Processing and GIS for Remote Sensing is an accessible overview of the subject and successfully draws together these three key areas in a balanced and comprehensive manner. The book provides an overview of essential techniques and a selection of key case studies in a variety of application areas.
Key concepts and ideas are introduced in a clear and logical manner and illustrated with numerous relevant conceptual diagrams. Mathematical detail is kept to a minimum and referred to only where necessary for understanding: image processing and GIS techniques are explained in common-sense terms rather than in rigorous mathematical detail, enabling students to grasp the essentials of a notoriously challenging subject area.
The book is clearly divided into three parts, with the first part introducing essential image processing techniques for remote sensing. The second part looks at GIS and begins with an overview of the concepts, structures and mechanisms by which GIS operates. Finally, the third part introduces remote sensing applications. Throughout the book the relationships between GIS, image processing and remote sensing are clearly identified to ensure that students are able to apply the various techniques appropriately. The latter chapters use numerous relevant case studies to illustrate remote sensing, image processing and GIS applications in practice.
Contents
Overview of the Book
Part One Image Processing
1 Digital Image and Display
1.1 What is a digital image?
1.2 Digital image display
1.3 Some key points
Questions
2 Point Operations (Contrast Enhancement)
2.1 Histogram modification and lookup table
2.2 Linear contrast enhancement
2.3 Logarithmic and exponential contrast enhancement
2.4 Histogram equalization
2.5 Histogram matching and Gaussian stretch
2.6 Balance contrast enhancement technique
2.7 Clipping in contrast enhancement
2.8 Tips for interactive contrast enhancement
Questions
3 Algebraic Operations (Multi-image Point Operations)
3.1 Image addition
3.2 Image subtraction (differencing)
3.3 Image multiplication
3.4 Image division (ratio)
3.5 Index derivation and supervised enhancement
3.6 Standardization and logarithmic residual
3.7 Simulated reflectance
3.8 Summary
Questions
4 Filtering and Neighbourhood Processing
4.1 Fourier transform: understanding filtering in image frequency
4.2 Concepts of convolution for image filtering
4.3 Low-pass filters (smoothing)
4.4 High-pass filters (edge enhancement)
4.5 Local contrast enhancement
4.6 *FFT selective and adaptive filtering
4.7 Summary
Questions
5 RGB-IHS Transformation
5.1 Colour coordinate transformation
5.2 IHS decorrelation stretch
5.3 Direct decorrelation stretch technique
5.4 Hue RGB colour composites
5.5 *Derivation of RGB–IHS and IHS–RGB transformations based on 3D geometry of the RGB colour cube
5.6 *Mathematical proof of DDS and its properties
5.7 Summary
Questions
6 Image Fusion Techniques
6.1 RGB–IHS transformation as a tool for data fusion
6.2 Brovey transform (intensity modulation)
6.3 Smoothing-filter-based intensity modulation
6.4 Summary
Questions
7 Principal Component Analysis
7.1 Principle of PCA
7.2 Principal component images and colour composition
7.3 Selective PCA for PC colour composition
7.4 Decorrelation stretch
7.5 Physical-property-orientated coordinate transformation and tasselled cap transformation
7.6 Statistic methods for band selection
7.7 Remarks
Questions
8 Image Classification
8.1 Approaches of statistical classification
8.2 Unsupervised classification (iterative clustering)
8.3 Supervised classification
8.4 Decision rules: dissimilarity functions
8.5 Post-classification processing: smoothing and accuracy assessment
8.6 Summary
Questions
9 Image Geometric Operations
9.1 Image geometric deformation
9.2 Polynomial deformation model and image warping co-registration
9.3 GCP selection and automation
9.4 *Optical flow image co-registration to sub-pixel accuracy
9.5 Summary
Questions
10 *Introduction to Interferometric Synthetic Aperture Radar Techniques
10.1 The principle of a radar interferometer
10.2 Radar interferogram and DEM
10.3 Differential InSAR and deformation measurement
10.4 Multi-temporal coherence image and random change detection
10.5 Spatial decorrelation and ratio coherence technique
10.6 Fringe smoothing filter
10.7 Summary
Questions
Part Two Geographical Information Systems
11 Geographical Information Systems
11.1 Introduction
11.2 Software tools
11.3 GIS, cartography and thematic mapping
11.4 Standards, interoperability and metadata
11.5 GIS and the Internet
12 Data Models and Structures
12.1 Introducing spatial data in representing geographic features
12.2 How are spatial data different from other digital data?
12.3 Attributes and measurement scales
12.4 Fundamental data structures
12.5 Raster data
12.6 Vector data
12.7 Conversion between data models and structures
12.8 Summary
Questions
13 Defining a Coordinate Space
13.1 Introduction
13.2 Datums and projections
13.3 How coordinate information is stored and accessed
13.4 Selecting appropriate coordinate systems
Questions
14 Operations
14.1 Introducing operations on spatial data
14.2 Map algebra concepts
14.3 Local operations
14.4 Neighbourhood operations
14.5 Vector equivalents to raster map algebra
14.6 Summary
Questions
15 Extracting Information from Point Data: Geostatistics
15.1 Introduction
15.2 Understanding the data
15.3 Interpolation
15.4 Summary
Questions
16 Representing and Exploiting Surfaces
16.1 Introduction
16.2 Sources and uses of surface data
16.3 Visualizing surfaces
16.4 Extracting surface parameters
16.5 Summary
Questions
17 Decision Support and Uncertainty
17.1 Introduction
17.2 Decision support
17.3 Uncertainty
17.4 Risk and hazard
17.5 Dealing with uncertainty in spatial analysis
17.6 Summary
Questions
18 Complex Problems and Multi-Criteria Evaluation
18.1 Introduction
18.2 Different approaches and models
18.3 Evaluation criteria
18.4 Deriving weighting coefficients
18.5 Multi-criteria combination methods
18.6 Summary
Questions
Part Three Remote Sensing Applications
19 Image Processing and GIS Operation Strategy
19.1 General image processing strategy
19.2 Remote-sensing-based GIS projects: from images to thematic mapping
19.3 An example of thematic mapping based on optimal visualization and interpretation of multi-spectral satellite imagery
19.4 Summary
Questions
20 Thematic Teaching Case Studies in SE Spain
20.1 Thematic information extraction (1): gypsum natural outcrop mapping and quarry change assessment
20.2 Thematic information extraction (2): spectral enhancement and mineral mapping of epithermal gold alteration, and iron ore deposits in ferroan dolomite
20.3 Remote sensing and GIS: evaluating vegetation and land-use change in the Nijar Basin, SE Spain
20.4 Applied remote sensing and GIS: a combined interpretive tool for regional tectonics, drainage and water resources
Questions
References
21 Research Case Studies
21.1 Vegetation change in the three parallel rivers region, Yunnan province, China
21.2 Landslide hazard assessment in the three gorges area of the Yangtze river using ASTER imagery: Wushan–Badong–Zogui
21.3 Predicting landslides using fuzzy geohazard mapping; an example from Piemonte, North-west Italy
21.4 Land surface change detection in a desert area in Algeria using multi-temporal ERS SAR coherence images
Questions
References
22 Industrial Case Studies
22.1 Multi-criteria assessment of mineral prospectivity, in SE Greenland
Acknowledgements
22.2 Water resource exploration in Somalia
Questions
References
Part Four Summary
23 Concluding Remarks
23.1 Image processing
23.2 Geographical information systems
23.3 Final remarks
Appendix A: Imaging Sensor Systems and Remote Sensing Satellites
A.1 Multi-spectral sensing
A.2 Broadband multi-spectral sensors
A.3 Thermal sensing and thermal infrared sensors
A.4 Hyperspectral sensors (imaging spectrometers)
A.5 Passive microwave sensors
A.6 Active sensing: SAR imaging systems
Appendix B: Online Resources for Information, Software and Data
B.1 Software – proprietary, low cost and free (shareware)
B.2 Information and technical information on standards, best practice, formats, techniques and various publications
B.3 Data sources including online satellite imagery from major suppliers, DEM data plus GIS maps and data of all kinds
References
General references
Index
This edition first published 2009, © 2009 by John Wiley & Sons Ltd.
Wiley-Blackwell is an imprint of John Wiley & Sons, formed by the merger of Wiley’s global Scientific, Technical and Medical business with Blackwell Publishing.
Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Other Editorial Offices:
9600 Garsington Road, Oxford, OX4 2DQ, UK
111 River Street, Hoboken, NJ 07030-5774, USA
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloguing-in-Publication Data
Liu, Jian-Guo.
Essential image processing and GIS for remote sensing / Jian Guo Liu,
Philippa J. Mason.
p. cm.
Includes index.
ISBN 978-0-470-51032-2 (HB) – ISBN 978-0-470-51031-5 (PB)
1. Remote sensing. 2. Geographic information systems. 3. Image processing.
4. Earth–Surface–Remote sensing. I. Mason, Philippa J. II. Title.
G70.4.L583 2009
621.36’78–dc22
2009007663
ISBN: 978-0-470-51032-2 (HB)
978-0-470-51031-5 (PB)
A catalogue record for this book is available from the British Library.
First Impression 2009
Overview of the Book
From an applied viewpoint, and mainly for Earth observation, remote sensing is a tool for collecting raster data or images. Remotely sensed images represent an objective record of the spectrum relating to the physical properties and chemical composition of the Earth surface materials. Extracting information from images is, on the other hand, a subjective process. People with differing application foci will derive very different thematic information from the same source image. Image processing thus becomes a vital tool for the extraction of thematic and/or quantitative information from raw image data. For more comprehensive analysis, the images need to be analysed in conjunction with other complementary data, such as existing thematic maps of topography, geomorphology, geology and land use, or with geochemical and geophysical survey data, or ‘ground truth’ data, logistical and infrastructure information, which is where the geographical information system (GIS) comes into play. GIS contains highly sophisticated tools for the management, display and analysis of all kinds of spatially referenced information.
Remote sensing, image processing and GIS are all extremely broad subjects in their own right and are far too broad to be covered in one book. As illustrated in Figure 1, this book aims to pinpoint the overlap between the three subjects, providing an overview of essential techniques and a selection of case studies in a variety of application areas. The application cases are biased towards the earth sciences but the image processing and GIS techniques are generic and therefore transferable skills suited to all applications.
In this book, we have presented a unique combination of tools, techniques and applications which we hope will be of use to a wide community of ‘geoscientists’ and ‘remote sensors’. The book begins in Part One with the fundamentals of the core image processing tools used in remote sensing and GIS with adequate mathematical details. It then becomes slightly more applied and less mathematical in Part Two to cover the wide scope of GIS where many of those core image processing tools are used in different contexts. Part Three contains the entirely applied part of the book where we describe a selection of cases where image processing and GIS have been used, by us, in teaching, research and industrial projects in which there is a dominant remote sensing component. The book has been written with university students and lecturers in mind as a principal textbook. For students’ needs in particular, we have tried to convey knowledge in simple words, with clear explanations and with conceptual illustrations. For image processing and GIS, mathematics is unavoidable, but we understand that this may be offputting for some. To minimize such effects, we try to emphasize the concepts, explaining in common-sense terms rather than in too much mathematical detail. The result is intended to be a comprehensive yet ‘easy learning’ solution to a fairly challenging topic.
Figure 1. Schematic illustration of the scope of this book
On the other hand, the book indeed presents in depth some novel image processing techniques and GIS approaches. There are sections providing extended coverage of necessary mathematics and advanced materials for use by course tutors and lecturers; these sections will be marked by an asterisk. Hence the book is for both students and teachers. With many of our developed techniques and most recent research case studies, it is also an excellent reference book for higher level readers including researchers and professionals in remote sensing application sectors.
This part covers the most essential image processing techniques for image visualization, quantitative analysis and thematic information extraction for remote sensing applications. A series of chapters introduce topics with increasing complexity from basic visualization algorithms, which can be easily used to improve digital camera pictures, to much more complicated multi-dimensional transform-based techniques.
Digital image processing can improve image visual quality, selectively enhance and highlight particular image features, and classify, identify and extract spectral and spatial patterns representing different phenomena from images. It can also arbitrarily change image geometry and illumination conditions to give different views of the same image. Importantly, image processing cannot add any information beyond that in the original image data, although it can optimize the visualization so that we see more of that information in the enhanced images than in the original.
For real applications our considered opinion, based on years of experience, is that simplicity is beautiful. Image processing does not follow the well-established physical law of energy conservation. As shown in Figure P.1, often the results produced using very simple processing techniques in the first 10 minutes of your project may actually represent 90% of the job done! This should not encourage you to abandon this book after the first three chapters, since it is the remaining 10% that you achieve during the 90% of your time that will serve the highest level objectives of your project. The key point is that thematic image processing should be application driven whereas our learning is usually technique driven.
Figure P.1 This simple diagram is to illustrate that the image processing result is not necessarily proportional to the time/effort spent. On the one hand, you may spend little time in achieving the most useful results and with simple techniques; on the other hand, you may spend a lot of time achieving very little using complicated techniques
An image is a picture, photograph or any other form of two-dimensional representation of objects or a scene. The information in an image is presented in tones or colours. A digital image is a two-dimensional array of numbers. Each cell of a digital image is called a pixel and the number representing the brightness of the pixel is called a digital number (DN) (Figure 1.1). As a two-dimensional (2D) array, a digital image is composed of data in lines and columns, and the position of a pixel is defined by the line and column of its DN. Such regularly arranged data, without explicit x and y coordinates, are usually called raster data. As digital images are nothing more than data arrays, mathematical operations can be readily performed on the digital numbers of images. Mathematical operations on digital images are called digital image processing.
Digital image data can also have a third dimension: layers (Figure 1.1). Layers are images of the same scene that contain different information. In multi-spectral images, the layers are images of different spectral ranges called bands or channels. For instance, a colour picture taken by a digital camera is composed of three bands containing red, green and blue spectral information respectively. The term ‘band’ is more often used than ‘layer’ to refer to multi-spectral images. Generally speaking, geometrically registered multi-dimensional datasets of the same scene can be considered as layers of an image. For example, we can digitize a geological map and then co-register the digital map with a Landsat thematic mapper (TM) image; the digital map then becomes an extra layer of the scene besides the seven TM spectral bands. Similarly, if we have a digital elevation model (DEM) dataset to which a SPOT image is rectified, then the DEM can be considered as a layer of the SPOT image besides its four spectral bands. In this sense, we can consider a set of co-registered digital images as a three-dimensional (3D) dataset, with the ‘third’ dimension providing the link between image processing and GIS.
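As a minimal sketch of this idea (with hypothetical array sizes and simulated DNs standing in for real files), co-registered bands and layers can be handled as a single 3D array of lines, columns and layers:

```python
# A minimal sketch: co-registered bands and layers as one 3D array.
import numpy as np

rows, cols = 512, 512

# Three co-registered 8 bit spectral bands, simulated with random DNs.
red   = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)
green = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)
blue  = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)

# A co-registered non-spectral layer, e.g. a DEM resampled to the same grid.
dem = np.random.uniform(0.0, 2000.0, (rows, cols)).astype(np.float32)

# Stack along a third ("layer") axis: shape (rows, cols, layers).
dataset = np.dstack([red, green, blue, dem])

print(dataset.shape)         # (512, 512, 4)
print(dataset[99, 199, 0])   # DN of the first band at line 100, column 200
```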
A digital image can be stored as a file in a computer data store on a variety of media, such as a hard disk, CD, DVD or tape. It can be displayed in black and white or in colour on a computer monitor, as well as in hard copy output such as film or print. It may also be output as a simple array of numbers for numerical analysis. The advantages of digital images include:
The images do not change with environmental factors as hard copy pictures and photographs do.
The images can be identically duplicated without any change or loss of information.
The images can be mathematically processed to generate new images without altering the original images.
The images can be electronically transmitted from or to remote locations without loss of information.
Figure 1.1 A digital image and its elements
Remotely sensed images are acquired by sensor systems onboard aircraft or spacecraft, such as Earth observation satellites. Sensor systems fall into two major branches: passive sensors and active sensors. Multi-spectral optical systems are passive sensors that use solar radiation as the principal source of illumination for imaging; typical examples include across-track and push-broom multi-spectral scanners, and digital cameras. An active sensor system provides its own means of illumination for imaging, such as synthetic aperture radar (SAR). Details of major remote sensing satellites and their sensor systems are beyond the scope of this book, but we provide a summary in Appendix A for reference.
We live in a world of colour. The colours of objects are the result of selective absorption and reflection of electromagnetic radiation from illumination sources. Perception by the human eye is limited to the spectral range of 0.38–0.75 μm, which is only a very small part of the solar spectral range. The world is actually far more colourful than we can see. Remote sensing technology can record over a much wider spectral range than human vision, and the resultant digital images can be displayed as either black and white or colour images using an electronic device such as a computer monitor. In digital image display, the tones or colours are visual representations of the image information recorded as digital image DNs, but they do not necessarily convey the physical meanings of these DNs. We will explain this further in our discussion on false colour composites later.
The wavelengths of major spectral regions used for remote sensing are listed below:
Visible light (VIS): 0.4–0.7 μm
  Blue (B): 0.4–0.5 μm
  Green (G): 0.5–0.6 μm
  Red (R): 0.6–0.7 μm
Visible–photographic infrared: 0.5–0.9 μm
Reflective infrared (IR): 0.7–3.0 μm
  Near infrared (NIR): 0.7–1.3 μm
  Short-wave infrared (SWIR): 1.3–3.0 μm
Thermal infrared (TIR): 3–5 μm, 8–14 μm
Microwave: 0.1–100 cm
Commonly used abbreviations of the spectral ranges are given by the letters in brackets in the list above. The spectral range covering visible light and near infrared is the most popular for broadband multi-spectral sensor systems and is usually denoted VNIR.
Any image, either a panchromatic image or a single spectral band of a multi-spectral image, can be displayed as a black and white (B/W) image on a monochromatic display. The display is implemented by converting DNs to electronic signals in a series of energy levels that generate different grey tones (brightness) from black to white, and thus form a B/W image display. Most image processing systems support an 8 bit graphical display, which corresponds to 256 grey levels, displaying DNs from 0 (black) to 255 (white). This display range is wide enough for human visual capability. It is also sufficient for some of the more commonly used remotely sensed images, such as Landsat TM/ETM+, SPOT HRV and Terra-1 ASTER VNIR–SWIR (see Appendix A), whose DN ranges are no wider than 0–255. On the other hand, many remotely sensed images have DN ranges much wider than 8 bits, such as those from Ikonos and Quickbird, whose images have an 11 bit DN range (0–2047). In this case, the images can still be visualized on an 8 bit display device in various ways, such as by compressing the DN range into 8 bits or by displaying the image in several 8 bit intervals of the whole DN range. Many sensor systems offer wide dynamic ranges to ensure that the sensors can record across all levels of radiation energy without localized sensor adjustment. Since the received solar radiation does not normally vary significantly within an image scene of limited size, the actual DN range of the scene is usually much narrower than the full dynamic range of the sensor and can thus be well adapted to an 8 bit DN range for display.
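As a minimal sketch of the first approach (simulated 11 bit data standing in for a real Ikonos/Quickbird band), the actual DN range of a scene can be linearly compressed into the 0–255 range of an 8 bit display:

```python
# A minimal sketch: compressing a wide DN range (11 bit, 0-2047) into 8 bits.
import numpy as np

img11 = np.random.randint(0, 2048, (400, 400), dtype=np.uint16)

# Scale the actual DN range of the scene, not the full sensor range, to 0-255.
lo, hi = img11.min(), img11.max()
img8 = ((img11.astype(np.float64) - lo) / (hi - lo) * 255.0).astype(np.uint8)

print(img8.min(), img8.max())   # 0 255
```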
In a monochromatic display of a spectral band image, the brightness (grey level) of a pixel is proportional to the reflected energy in this band from the corresponding ground area. For instance, in a B/W display of a red band image, light red appears brighter than dark red. This is also true for invisible bands (e.g. infrared bands), though the ‘colours’ cannot be seen. After all, any digital image is composed of DNs; the physical meaning of DNs depends on the source of the image. A monochromatic display visualizes DNs in grey tones from black to white, while ignoring the physical relevance.
If you understand the structure and principle of a colour TV tube, you must know that the tube is composed of three colour guns of red, green and blue. These three colours are known as primary colours. The mixture of the light from these three primary colours can produce any colour on a TV. This property of the human perception of colour can be explained by the tristimulus colour theory. The human retina has three types of cones and the response by each type of cone is a function of the wavelength of the incident light; it peaks at 440 nm (blue), 545 nm (green) and 680 nm (red). In other words, each type of cone is primarily sensitive to one of the primary colours: blue, green or red. A colour perceived by a person depends on the proportion of each of these three types of cones being stimulated and thus can be expressed as a triplet of numbers (r, g, b) even though visible light is electromagnetic radiation in a continuous spectrum of 380–750 nm. A light of non-primary colour C will stimulate different portions of each cone type to form the perception of this colour:
C = rR + gG + bB     (1.1)
Digital image colour display is based entirely on the tristimulus colour theory. A colour monitor, like a colour TV, is composed of three precisely registered colour guns, namely red, green and blue. In the red gun, pixels of an image are displayed in reds of different intensity (i.e. dark red, light red, etc.) depending on their DNs. The same is true of the green and blue guns. Thus if the red, green and blue bands of a multi-spectral image are displayed in red, green and blue simultaneously, a colour image is generated (Figure 1.3) in which the colour of a pixel is decided by the DNs of the red, green and blue bands (r, g, b). For instance, if a pixel has red and green DNs of 255 and a blue DN of 0, it will appear in pure yellow on display. This kind of colour display system is called an additive RGB colour composite system. In this system, different colours are generated by additive combinations of red, green and blue components.
Figure 1.2 The relation of the primary colours to their complementary colours
Figure 1.3 Illustration of RGB additive colour image display
Figure 1.4 The RGB colour cube
As mentioned before, although colours lie in the visible spectral range of 380–750 nm, they are used as a tool for information visualization in the colour display of all digital images. Thus, for digital image display, the assignment of a primary colour to a spectral band or layer can be arbitrary, depending on the requirements of the application, and need not correspond to the actual colour of the spectral range of the band. If we display three image bands in the red, green and blue spectral ranges in RGB, then a true colour composite (TCC) image is generated (Figure 1.5, bottom left). Otherwise, if the image bands displayed in red, green and blue do not match the spectra of these three primary colours, a false colour composite (FCC) image is produced. A typical example is the so-called standard false colour composite (SFCC), in which the near-infrared band is displayed in red, the red band in green and the green band in blue (Figure 1.5, bottom right). The SFCC effectively highlights any vegetation distinctively in red. Obviously, we could display various image layers that have no spectral relevance as a false colour composite. The false colour composite is the general case of an RGB colour display, while the true colour composite is only a special case of it.
Figure 1.5 True colour and false colour composites of blue, green, red and near-infrared bands of a Landsat-7 ETM+ image. If we display the blue band in blue, green band in green and red band in red, then a true colour composite is produced as shown at the bottom left. If we display the green band in blue, red band in green and near-infrared band in red, then a so-called standard false colour composite is produced as shown at the bottom right
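As a minimal sketch (simulated 8 bit bands in place of real Landsat data), a true colour composite and a standard false colour composite differ only in which band is assigned to each primary colour:

```python
# A minimal sketch: TCC and SFCC band-to-colour assignments.
import numpy as np

shape = (300, 300)
blue, green, red, nir = (np.random.randint(0, 256, shape, dtype=np.uint8)
                         for _ in range(4))

# True colour composite (TCC): red band in R, green band in G, blue band in B.
tcc = np.dstack([red, green, blue])

# Standard false colour composite (SFCC): NIR in R, red in G, green in B,
# which shows vegetation distinctively in red.
sfcc = np.dstack([nir, red, green])

print(tcc.shape, sfcc.shape)   # (300, 300, 3) (300, 300, 3)
```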
The human eye can recognize far more colours than it can grey levels, so colour can be used very effectively to enhance small grey-level differences in a B/W image. The technique to display a monochrome image as a colour image is called pseudo colour display. A pseudo colour image is generated by assigning each grey level to a unique colour (Figure 1.6). This can be done by interactive colour editing or by automatic transformation based on certain logic. A common approach is to assign a sequence of grey levels to colours of increasing spectral wavelength and intensity.
The advantage of pseudo colour display is also its disadvantage. When a digital image is displayed in grey scale, using its DNs in a monochromatic display, the sequential numerical relationship between different DNs is effectively presented. This crucial information is lost in a pseudo colour display because the colours assigned to the various grey levels are not quantitatively related in a numeric sequence. Indeed, the image in a pseudo colour display is an image of symbols; it is no longer a digital image! We can regard the grey-scale B/W display as a special case of pseudo colour display in which a sequential grey scale based on DN levels is used instead of a colour scheme. Often, we can use a combination of B/W and pseudo colour display to highlight important information in particular DN ranges in colours over a grey-scale background, as shown in Figure 1.6c.
Figure 1.6 (a) An image in grey-scale (B/W) display; (b) the same image in a pseudo colour display; and (c) the brightest DNs highlighted in red on a grey-scale background
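A minimal sketch of both ideas (the colour scheme and the threshold of 240 below are illustrative assumptions): a pseudo colour display maps each grey level to an (R, G, B) triplet through a 256-entry colour lookup table, and the combined display of Figure 1.6c colours only the brightest DNs over a grey-scale background:

```python
# A minimal sketch: pseudo colour via a colour LUT, plus a combined display.
import numpy as np

img = np.random.randint(0, 256, (200, 200), dtype=np.uint8)

# Colour LUT of shape (256, 3): blue for low DNs, green mid-range, red high.
levels = np.arange(256)
lut = np.stack([levels,                          # red rises with DN
                255 - np.abs(levels - 128) * 2,  # green peaks mid-range
                255 - levels],                   # blue falls with DN
               axis=1).clip(0, 255).astype(np.uint8)

pseudo = lut[img]                                # pseudo colour image (200, 200, 3)

# Grey-scale background with only the brightest DNs highlighted in red.
combined = np.dstack([img, img, img])
combined[img >= 240] = (255, 0, 0)

print(pseudo.shape, combined.shape)
```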
In this chapter, we learnt what a digital image is, the elements that comprise it, and how digital images are displayed in B/W and in colour. It is important to remember these key points:
A digital image is a raster dataset or a 2D array of numbers.
Our perception of colours is based on the tristimulus theory of human vision. Any colour is composed of three primary colours: red, green and blue.
Using an RGB colour cube, a colour can be expressed as a vector of the weighted summation of red, green and blue components.
In image processing, colours are used as a tool for image information visualization. From this viewpoint, the true colour display is a special case of the general false colour display.
Pseudo colour display results in the loss of the numerical sequential relationship of the image DNs. It is therefore no longer a digital image; it is an image of symbols.
Contrast enhancement, sometimes called radiometric enhancement or histogram modification, is the most basic but also the most effective technique for optimizing the image contrast and brightness for visualization or for highlighting information in particular DN ranges.
Let X represent a digital image and xij be the DN of any pixel in the image at line i and column j. Let Y represent the image derived from X by a function f and yij be the output value corresponding to xij. Then a contrast enhancement can be expressed in the general form
yij = f(xij)     (2.1)
This processing transforms a single input image X to a single output image Y, through a function f, in such a way that the DN of an output pixel yij depends on and only on the DN of the corresponding input pixel xij. This type of processing is called a point operation. Contrast enhancement is a point operation that modifies the image brightness and contrast but does not alter the image size.
Let x represent a DN level of an image X; the number of pixels at each DN level, hi(x), is called the histogram of the image X. hi(x) can also be expressed as the percentage of pixels at DN level x relative to the total number of pixels in the image X; in this case, in statistical terms, hi(x) is a probability density function.
A histogram is a good presentation of the contrast, brightness and data distribution of an image. Every image has a unique histogram but the reverse is not necessarily true because a histogram does not contain any spatial information. As a simple example, imagine how many different patterns you can form on a 10 × 10 grid chessboard using 50 white pieces and 50 black pieces. All these patterns have the same histogram!
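A minimal sketch of this point (the 10 × 10 chessboard is simulated here): the histogram is simply the count of pixels at each DN level, and two very different spatial arrangements of the same pixel values share exactly the same histogram:

```python
# A minimal sketch: the histogram h(x) contains no spatial information.
import numpy as np

# A 10 x 10 chessboard of 50 black (0) and 50 white (255) pixels ...
chess = (np.indices((10, 10)).sum(axis=0) % 2) * 255

# ... and a completely different arrangement of the same 100 pixel values.
shuffled = np.random.permutation(chess.ravel()).reshape(10, 10)

def histogram(image, levels=256):
    # Number of pixels at each DN level; divide by image.size for the
    # probability density form.
    return np.bincount(image.ravel().astype(np.int64), minlength=levels)

print(np.array_equal(histogram(chess), histogram(shuffled)))   # True
```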
(2.2)
As shown in Figure 2.1, suppose hi(x) is a continuous function; as a point operation does not change the image size, the number of pixels in the DN range δx in the input image X should be equal to the number of pixels in the DN range δy in the output image Y. Thus we have
ho(y)δy = hi(x)δx     (2.3)
Let δx → 0; then δy → 0 and
ho(y) = hi(x)·dx/dy     (2.4)
Therefore,
ho(y) = hi(x)/f′(x)     (2.5)
We can also write (2.5) as
f′(x) = hi(x)/ho(y)
The formula (2.5) shows that the histogram of the output image can be derived from the histogram of the input image divided by the first derivative of the point operation function.
Figure 2.1 The principles of the point operation by histogram modification
This linear function will produce an output image with a flattened histogram twice as wide and half as high as that of the input image and with all the DNs shifted to the left by three DN levels. This linear function stretches the image DN range to increase its contrast.
As f′(x) is the gradient of the point operation function f(x), formula (2.5) thus indicates that where the gradient of f(x) is greater than 1 the corresponding DN range is stretched (and the histogram lowered and widened), while where the gradient is less than 1 the DN range is compressed. For a nonlinear point operation function, different sections of DN levels are therefore stretched or compressed depending on its gradient at those DN levels, as shown later in the discussion of logarithmic and exponential point operation functions.
(2.6)
Figure 2.2 Histograms before (a) and after (b) linear stretch of integer image data. Although the histogram bars of the stretched image on the right are the same height as those of the original histogram on the left, the equivalent histogram drawn as a curve is wider and flatter because of the wider interval between the histogram bars
x     y
3     0
4     2
5     4
6     6
7     8
8     10
…     …
130   254
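The table values (3→0, 4→2, …, 130→254) imply the linear function y = 2(x − 3); a minimal sketch of implementing it as a lookup table (with simulated input DNs) follows:

```python
# A minimal sketch: the linear stretch y = 2(x - 3) applied through a LUT,
# clipped to the 0-255 display range.
import numpy as np

x_levels = np.arange(256)
lut = np.clip(2 * (x_levels - 3), 0, 255).astype(np.uint8)

img = np.random.randint(3, 131, (100, 100)).astype(np.uint8)   # DNs 3-130
stretched = lut[img]                                           # apply the LUT

print(lut[3], lut[4], lut[130])   # 0 2 254
```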
As most display systems can only display 8 bit integers in 0–255 grey levels, it is important to configure a point operation function in such a way that the value range of an output image Y is within 0–255.
The point operation function for linear contrast enhancement (LCE) is defined as
y = ax + b     (2.7)
There are several popular LCE algorithms available in most image processing software packages:
y = 255(x − l)/(h − l)     (2.8)
where l and h are the minimum and maximum DNs of the input image X.
In many modern image processing software packages, this function is largely redundant as the operation specified in (2.8) can be easily done using an interactive PLS. However, formula (2.8) helps us to understand the principle.
y = (SDo/SDi)·(x − Ei) + Eo     (2.9)
where Ei and SDi are the mean and standard deviation of the input image X, and Eo and SDo are the desired mean and standard deviation of the output image Y.
Figure 2.3 Interactive PLS function for contrast enhancement and thresholding: (a) the original image; (b) the PLS function for contrast enhancement; (c) the enhanced image; (d) the PLS function for thresholding; and (e) the binary image produced by thresholding
These last two linear stretch functions are often used for automatic processing while, for interactive processing, PLS is the obvious choice.
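A minimal sketch of these two automatic linear stretches, assuming their common forms (a minimum–maximum stretch to 0–255, and an adjustment to a chosen output mean and standard deviation):

```python
# A minimal sketch: min-max linear stretch and mean/SD adjustment.
import numpy as np

def minmax_stretch(x):
    x = x.astype(np.float64)
    return (x - x.min()) / (x.max() - x.min()) * 255.0

def mean_sd_adjust(x, Eo=127.0, SDo=60.0):
    x = x.astype(np.float64)
    Ei, SDi = x.mean(), x.std()
    return (SDo / SDi) * (x - Ei) + Eo

img = np.random.normal(90.0, 20.0, (256, 256))        # simulated input image

a = np.clip(minmax_stretch(img), 0, 255).astype(np.uint8)
b = np.clip(mean_sd_adjust(img), 0, 255).astype(np.uint8)
print(round(a.mean()), round(b.mean()), round(b.std()))
```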
Figure 2.4 Derivation of a linear function from two points of input image X and output image Y
Similarly, linear functions for mean and standard deviation adjustment defined in (2.9) can be derived from either
or
Logarithmic and exponential functions are inverse operations of one another. For contrast enhancement, the two functions modify the image histograms in opposite ways. Both logarithmic and exponential functions change the shapes of image histograms and distort the information in the original images.
The general form of the logarithmic function used for image processing is defined as
y = b·ln(ax + 1)     (2.10)
Here a (> 0) controls the curvature of the logarithmic function while b is a scaling factor to make the output DNs fall within a given value range, and the shift 1 is to avoid the zero value at which the logarithmic function loses its meaning. As shown in Figure 2.5, the gradient of the function is greater than 1 in the low DN range, thus it spreads out low DN values, while in the high DN range the gradient of the function is less than 1 and so compresses high DN values. As a result, logarithmic contrast enhancement shifts the peak of the image histogram to the right and highlights the details in dark areas in an input image. Many images have histograms similar in form to logarithmic normal distributions. In such cases, a logarithmic function will effectively modify the histogram to the shape of a normal distribution.
We can slightly modify formula (2.10) to introduce a shift constant c:
y = b·ln(ax + 1) + c     (2.11)
Figure 2.5 Logarithmic contrast enhancement function
This function allows the histogram of the output image to shift by c.
The general form of the exponential function used for image processing is defined as
y = b·(e^(ax) − 1)     (2.12)
Here again, a (> 0) controls the curvature of the exponential function while b is a scaling factor to make the output DNs fall within a given value range, and the shift of 1 is to avoid the zero value, because e^0 ≡ 1. As the inverse of the logarithmic function, exponential contrast enhancement shifts the image histogram to the left by spreading out high DN values and compressing low DN values, enhancing detail in light areas at the cost of suppressing tonal variation in dark areas (Figure 2.6). Again, we can introduce a shift parameter c to modify the exponential contrast enhancement function as below:
y = b·(e^(ax) − 1) + c     (2.13)
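A minimal sketch of both stretches, assuming the general forms y = b·ln(ax + 1) and y = b·(e^(ax) − 1) with illustrative parameter values (the scaling b is chosen here simply to fill the 0–255 display range):

```python
# A minimal sketch: logarithmic and exponential contrast enhancement.
import numpy as np

def log_stretch(x, a=0.05):
    y = np.log(a * x.astype(np.float64) + 1.0)
    return y / y.max() * 255.0      # spreads low DNs, compresses high DNs

def exp_stretch(x, a=0.01):
    y = np.exp(a * x.astype(np.float64)) - 1.0
    return y / y.max() * 255.0      # spreads high DNs, compresses low DNs

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
dark_detail  = log_stretch(img)     # highlights detail in dark areas
light_detail = exp_stretch(img)     # highlights detail in light areas
```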
(2.14)
According to (2.4)
(2.15)
Thus, the HE function is
(2.16)
Figure 2.6 Exponential contrast enhancement function
As the histogram hi(x) is essentially the probability density function of X, the Hi(x) is the cumulative distribution function of X. The calculation of Hi(x) is simple for a discrete function in the case of digital images. For a given DN level x, Hi(x) is equal to the total number of those pixels with DN values no greater than x:
Hi(x) = hi(0) + hi(1) + … + hi(x)     (2.17)
Theoretically, HE can be achieved if Hi(x) is a continuous function. However, as Hi(x) is a discrete function for an integer digital image, HE can only produce a relatively flat histogram mathematically equivalent to an equalized histogram, in which the distance between histogram bars is proportional to their heights (Figure 2.7).
Figure 2.7 Histogram of histogram equalization
The idea behind HE contrast enhancement is that the data presentation of an image should be evenly distributed across the whole value range. In reality, however, HE often produces images with too high a contrast. This is because natural scenes are more likely to follow normal (Gaussian) distributions and, consequently, the human eye is adapted to be more sensitive to subtle grey-level changes at intermediate brightness than at very high or very low brightness.
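A minimal sketch of HE for an 8 bit image, using the cumulative histogram Hi(x) scaled onto the output DN range as the mapping function (the input image is simulated):

```python
# A minimal sketch: histogram equalization via the cumulative histogram.
import numpy as np

def equalize(img, levels=256):
    hist = np.bincount(img.ravel(), minlength=levels)   # hi(x)
    cum = np.cumsum(hist)                                # Hi(x)
    lut = np.round(cum / img.size * (levels - 1))        # scale Hi(x) onto 0-255
    return lut.astype(np.uint8)[img]

img = np.random.randint(30, 180, (256, 256)).astype(np.uint8)
eq = equalize(img)
print(img.min(), img.max(), eq.min(), eq.max())   # output fills roughly 0-255
```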
Histogram matching (HM) is a point operation that transforms an input image to make its histogram match a given shape defined by either a mathematical function or the histogram of another image. It is particularly useful for image comparison and differencing. If the two images in question are modified to have similar histograms, the comparison will be on a fair basis.
Thus
y = fr⁻¹(fx(x))     (2.18)
where fx is the histogram equalization function of the input image X and fr is the histogram equalization function derived from the reference histogram; histogram equalization acts as the bridge between the two (Figure 2.8).
Figure 2.8 Histogram equalization acts as a bridge for histogram matching
Table 2.2 An example LUT for histogram matching
x     z     y
5     3     0
6     4     2
7     5     4
8     6     5
…     …     …
If the reference histogram ho(y) is defined by a Gaussian distribution function
ho(y) = [1/(σ√(2π))]·exp[−(y − μ)²/(2σ²)]     (2.19)
where σ and μ are the standard deviation and the mean of image X, the HM transformation is then called a Gaussian stretch, since the resultant image has a histogram in the shape of a Gaussian distribution.
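A minimal sketch of a Gaussian stretch implemented as histogram matching: the cumulative distribution of the input image is matched to that of a Gaussian reference histogram (the mean and standard deviation used here are illustrative choices):

```python
# A minimal sketch: Gaussian stretch by matching cumulative distributions.
import numpy as np

def gaussian_stretch(img, mean=127.0, sd=40.0, levels=256):
    y = np.arange(levels, dtype=np.float64)
    # Cumulative distribution of the input image, Hi(x).
    cdf_in = np.cumsum(np.bincount(img.ravel(), minlength=levels)) / img.size
    # Cumulative distribution of the Gaussian reference histogram ho(y).
    ref = np.exp(-((y - mean) ** 2) / (2.0 * sd ** 2))
    cdf_ref = np.cumsum(ref / ref.sum())
    # For each input level, pick the output level with the closest reference CDF.
    lut = np.searchsorted(cdf_ref, cdf_in).clip(0, levels - 1).astype(np.uint8)
    return lut[img]

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
out = gaussian_stretch(img)
print(round(out.mean()), round(out.std()))   # roughly 127 and 40
```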
Colour bias is one of the main causes of poor colour composite images. For RGB colour composition, if the average brightness of one image band is significantly higher or lower than the other two, the composite image will show obvious colour bias. To eliminate this, the three bands used for colour composition must have an equal value range and mean. The balance contrast enhancement technique (BCET) is a simple solution to this problem. Using a parabolic function derived from an input image, BCET can stretch (or compress) the image to a given value range and mean without changing the basic shape of the image histogram. Thus three image bands for colour composition can be adjusted to the same value range and mean to achieve a balanced colour composite.
The BCET based on a parabolic function is
y = a(x − b)² + c     (2.20)
This general form of parabolic function is defined by three coefficients: a, b and c. It is therefore capable of adjusting three image parameters: minimum, maximum and mean. The coefficients a, b and c can be derived based on the minimum, maximum and mean (l, h and e) of the input image X and the given minimum, maximum and mean (L, H and E) for the output image Y as follows:
b = [h²(E − L) − s(H − L) + l²(H − E)] / {2[h(E − L) − e(H − L) + l(H − E)]}
a = (H − L) / [(h − l)(h + l − 2b)]
c = L − a(l − b)²     (2.21)
where s is the mean square sum of input image X, s = (1/N)·Σxi², and N is the number of pixels.
Figure 2.9 illustrates a comparison between RGB colour composites using the original bands 5, 4 and 1 of an ETM+ sub-scene and the same bands after a BCET stretch. The colour composite of the original bands (Figure 2.9a) shows a strong colour bias towards magenta as the result of the much lower brightness of band 4, displayed in green. This colour bias is completely removed by BCET, which stretches all the bands to the same value range of 0–255 and mean of 110 (Figure 2.9b). The BCET colour composite in Figure 2.9b presents the various terrain materials (rock types, vegetation, etc.) in much more distinctive colours than the colour composite of the original image bands in Figure 2.9a. An interactive PLS may achieve similar results but without quantitative control.
Let xi represent any pixel of an input image X with N pixels. Then the minimum, maximum and mean of X are l = min(xi), h = max(xi) and e = (1/N)·Σxi.
Suppose L, H and E are the desired minimum, maximum and mean for the output image Y. Then we can establish the following equations:
L = a(l − b)² + c
H = a(h − b)² + c
E = a(s − 2be + b²) + c     (2.22)
Solving for b from (2.22),
b = [h²(E − L) − s(H − L) + l²(H − E)] / {2[h(E − L) − e(H − L) + l(H − E)]}     (2.23)
where s = (1/N)·Σxi² is the mean square sum of the input image X.
With b known, a and c can then be resolved from (2.22) as
a = (H − L) / [(h − l)(h + l − 2b)]     (2.24)
c = L − a(l − b)²     (2.25)
Figure 2.9 Colour composites of ETM+ bands 5, 4 and 1 in red, green and blue: (a) colour composite of the original bands showing magenta cast as the result of colour bias; and (b) BCET colour composite stretching all the bands to an equal value range of 0–255 and mean of 110
The parabolic function is an even function (Figure 2.10a). Coefficients b and c are the coordinates of the turning point of the parabola which determine the section of the parabola to be utilized by the BCET function. In order to perform BCET, the turning point and its nearby section of the parabola should be avoided, so that only the section of the monotonically increasing branch of the curve is used. This is possible for most cases of image contrast enhancement.
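A minimal sketch of BCET using the parabolic function and the coefficient solutions above (the simulated band and the target values L = 0, H = 255, E = 110 are illustrative; for real data, outlier clipping as discussed below would normally be applied first):

```python
# A minimal sketch: BCET with y = a(x - b)^2 + c and the coefficient solutions.
import numpy as np

def bcet(x, L=0.0, H=255.0, E=110.0):
    x = x.astype(np.float64)
    l, h, e = x.min(), x.max(), x.mean()
    s = np.mean(x ** 2)                    # mean square sum of the input image
    b = (h * h * (E - L) - s * (H - L) + l * l * (H - E)) / \
        (2.0 * (h * (E - L) - e * (H - L) + l * (H - E)))
    a = (H - L) / ((h - l) * (h + l - 2.0 * b))
    c = L - a * (l - b) ** 2
    return np.clip(a * (x - b) ** 2 + c, L, H)

# A low-brightness band with a narrow DN range, simulated here.
band = np.random.normal(60.0, 10.0, (256, 256))
out = bcet(band)
print(round(out.min()), round(out.max()), round(out.mean()))   # ~0, ~255, ~110
```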
From the solutions of a, b and c in Equations (2.23)–(2.25), we can make the following observations:
This simple treatment often improves image display quality significantly, especially when the image looks hazy because of atmospheric scattering. When using BCET, the input minimum (l) and maximum (h) should be determined based on appropriate cut-off levels of xl and xh.
The general purpose of contrast enhancement is to optimize visualization. Often after quite complicated image processing, you will need to apply interactive contrast enhancement to view the results properly. After all, you need to be able to see the image! Visual observation is always the most effective way to judge image quality. This does not sound technical enough for digital image processing but this golden rule is quite true! On the other hand, the histogram gives you a quantitative description of image data distribution and so can also effectively guide you to improve the image visual quality. As mentioned earlier, the business of contrast enhancement is histogram modification and so you should find the following guidelines useful:
A common approach in the PLS is therefore to use functions with slope >45° to spread the peak section and those with slope <45° to compress the tails at both ends of the histogram.
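A minimal sketch of such a piecewise linear stretch (the breakpoints below are illustrative assumptions): segments with slope greater than 1 (steeper than 45°) spread the histogram peak, while segments with slope less than 1 compress the tails:

```python
# A minimal sketch: an interactive-style piecewise linear stretch (PLS).
import numpy as np

def piecewise_linear_stretch(img, in_points, out_points):
    # Map input DN breakpoints to output DN breakpoints, interpolating linearly.
    return np.interp(img.astype(np.float64), in_points, out_points).astype(np.uint8)

img = np.random.normal(100.0, 15.0, (256, 256)).clip(0, 255).astype(np.uint8)

# Spread the peak (70-130) and compress both tails (0-70 and 130-255).
out = piecewise_linear_stretch(img, [0, 70, 130, 255], [0, 20, 235, 255])
```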
Figure 2.11 Derivation of the linear stretch function and mean/standard deviation adjustment function
