Rough-Fuzzy Pattern Recognition - Pradipta Maji - E-Book

Rough-Fuzzy Pattern Recognition E-Book

Pradipta Maji

Description

Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing

Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems development, making it easier to master such tasks as classification, clustering, and feature selection.

Rough-Fuzzy Pattern Recognition examines the important underlying theory as well as algorithms and applications, helping readers see the connections between theory and practice. The first chapter provides an introduction to pattern recognition and data mining, including the key challenges of working with high-dimensional, real-life data sets. Next, the authors explore such topics and issues as:

  • Soft computing in pattern recognition and data mining
  • A mathematical framework for generalized rough sets, incorporating the concept of fuzziness in defining the granules as well as the set
  • Selection of non-redundant and relevant features of real-valued data sets
  • Selection of the minimum set of basis strings with maximum information for amino acid sequence analysis
  • Segmentation of brain MR images for visualization of human tissues

Numerous examples and case studies help readers better understand how pattern recognition models are developed and used in practice. This text—covering the latest findings as well as directions for future research—is recommended for both students and practitioners working in systems design, pattern recognition, image analysis, data mining, bioinformatics, soft computing, and computational intelligence.


Page count: 440

Publication year: 2011


Table of Contents

Series Page

Title Page

Copyright

Dedication Page

Foreword

Preface

About the Authors

Chapter 1: Introduction to Pattern Recognition and Data Mining

1.1 Introduction

1.2 Pattern Recognition

1.3 Data Mining

1.4 Relevance of Soft Computing

1.5 Scope and Organization of the Book

References

Chapter 2: Rough-Fuzzy Hybridization and Granular Computing

2.1 Introduction

2.2 Fuzzy Sets

2.3 Rough Sets

2.4 Emergence of Rough-Fuzzy Computing

2.5 Generalized Rough Sets

2.6 Entropy Measures

2.7 Conclusion and Discussion

References

Chapter 3: Rough-Fuzzy Clustering: Generalized c-Means Algorithm

3.1 Introduction

3.2 Existing c-Means Algorithms

3.3 Rough-Fuzzy-Possibilistic c-Means

3.4 Generalization of Existing c-Means Algorithms

3.5 Quantitative Indices for Rough-Fuzzy Clustering

3.6 Performance Analysis

3.7 Conclusion and Discussion

References

Chapter 4: Rough-Fuzzy Granulation and Pattern Classification

4.1 Introduction

4.2 Pattern Classification Model

4.3 Quantitative Measures

4.4 Description of Data Sets

4.5 Experimental Results

4.6 Conclusion and Discussion

References

Chapter 5: Fuzzy-Rough Feature Selection using f-Information Measures

5.1 Introduction

5.2 Fuzzy-Rough Sets

5.3 Information Measure on Fuzzy Approximation Spaces

5.4 f-Information and Fuzzy Approximation Spaces

5.5 f-Information for Feature Selection

5.6 Quantitative Measures

5.7 Experimental Results

5.8 Conclusion and Discussion

References

Chapter 6: Rough Fuzzy c-Medoids and Amino Acid Sequence Analysis

6.1 Introduction

6.2 Bio-Basis Function and String Selection Methods

6.3 Fuzzy-Possibilistic c-Medoids Algorithm

6.4 Rough-Fuzzy c-Medoids Algorithm

6.5 Relational Clustering for Bio-Basis String Selection

6.6 Quantitative Measures

6.7 Experimental Results

6.8 Conclusion and Discussion

References

Chapter 7: Clustering Functionally Similar Genes from Microarray Data

7.1 Introduction

7.2 Clustering Gene Expression Data

7.3 Quantitative and Qualitative Analysis

7.4 Description of Data Sets

7.5 Experimental Results

7.6 Conclusion and Discussion

References

Chapter 8: Selection of Discriminative Genes from Microarray Data

8.1 Introduction

8.2 Evaluation Criteria for Gene Selection

8.3 Approximation of Density Function

8.4 Gene Selection using Information Measures

8.5 Experimental Results

8.6 Conclusion and Discussion

References

Chapter 9: Segmentation of Brain Magnetic Resonance Images

9.1 Introduction

9.2 Pixel Classification of Brain MR Images

9.3 Segmentation of Brain MR Images

9.4 Experimental Results

9.5 Conclusion and Discussion

References

Index

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Maji, Pradipta, 1976—

Rough-fuzzy pattern recognition : applications in bioinformatics and medical imaging / Pradipta Maji, Sankar K. Pal.

p. cm. — (Wiley series in bioinformatics ; 3)

ISBN 978-1-118-00440-1 (hardback)

1. Fuzzy systems in medicine. 2. Pattern recognition systems. 3. Bioinformatics. 4. Diagnostic imaging—Data processing. I. Pal, Sankar K. II. Title.

Includes index.

R859.7.F89M35 2011

610.285—dc23

2011013787

To our parents

Foreword

It is my great pleasure to welcome the new book Rough-Fuzzy Pattern Recognition: Applications in Bioinformatics and Medical Imaging by the prominent scientists Professor Sankar K. Pal and Professor Pradipta Maji.

Soft computing methods allow us to achieve high-quality solutions for many real-life applications. The characteristic features of these methods are tractability, robustness, low-cost solution, and close resemblance with human-like decision making. They make it possible to use imprecision, uncertainty, approximate reasoning, and partial truth in searching for solutions. The main research directions in soft computing are related to fuzzy sets, neurocomputing, genetic algorithms, probabilistic reasoning, and rough sets. By integration or combination of the different soft computing methods, one may improve the performance of these methods. Among the various integrations realized so far, neuro-fuzzy computing (combining fuzzy sets and neural networks) is the most visible one because of its several real-life applications.

Fuzzy set theory and rough set theory represent two different approaches to analyzing vagueness. Fuzzy set theory addresses gradualness of knowledge, expressed by the fuzzy membership, whereas rough set theory addresses the granularity of knowledge, expressed by the indiscernibility relation. In 1999, together with Professor Sankar K. Pal, I edited the book Rough-Fuzzy Hybridization published by Springer. Since then, great progress has been made in the development of methods based on a combination of these approaches, both on foundations and applications. It has been shown that combining the rough-set and fuzzy-set approaches can significantly improve the performance of methods. The two approaches are complementary to each other rather than competitive.

The book is based on a unified framework describing how rough-fuzzy computing techniques can be formulated for building efficient information granules, especially pattern recognition models. These granules are induced from some elementary granules by (hierarchical) fusion. The elementary granules can be induced using rough-set-based methods and/or fuzzy-set-based methods, and also the aggregation of granules can be based on a combination of such methods. In this way, one can consider different approaches such as rough-fuzzy, fuzzy-rough or fuzzy rough-fuzzy. For example, using rough-set-based methods one can efficiently induce some crisp patterns, which can be next fuzzified for making them soft. This can help, for example, in searching for the relevant fusion of granules. Analogously, the discovered fuzzy patterns may contain too many details and then, by using rough-set-based methods, one can obtain simpler patterns with satisfactory quality. Such patterns can be next used efficiently in approximate reasoning, for example, in the discovery of more complex patterns from the existing ones. In all these methods, the rough-set approach and the fuzzy-set approach work synergistically in efficient searching under uncertainty and imprecision for the target granules (e.g., classifiers for complex concepts) with a high quality. Note that in the fusion of granules, inclusion measures also play an important role because they allow us to estimate the quality of the granules constructed.

The book gives the reader an excellent introduction to the fascinating and successful cooperative game between the rough-set approach and fuzzy-set approach. In this book, the reader will find a nice introduction to the rough-fuzzy approach and fuzzy-set approach. The discussed methods and algorithms cover all major phases of a pattern recognition system (e.g., classification, clustering, and feature selection). The book covers existing results and also presents new results. It has been shown experimentally that the developed algorithms based on a combination of approaches perform much better than algorithms based on either approach taken separately. This is shown in the book for several tasks such as feature selection of real-valued data, case selection, image processing and analysis, data mining and knowledge discovery, selection of vocabulary for information retrieval, and decision rule extraction. The high performance of the developed methods is especially emphasized for real-life applications in bioinformatics and medical image processing. The balance of theory, algorithms, and applications will make this book attractive to many readers. The reader will also find in the book a discussion on various challenging issues relevant to application domains and possible ways of handling them with rough-fuzzy approaches.

This book, unique in its character, will be very useful to graduate students, researchers, and practitioners. The authors deserve the highest appreciation for their outstanding work. This is a work whose importance is hard to exaggerate.

Andrzej Skowron

Institute of Mathematics

Warsaw University, Poland

December, 2011

Preface

Soft computing is a collection of methodologies that work synergistically, not competitively, and that, in one form or another, reflect its guiding principle: exploit the tolerance for imprecision, uncertainty, approximate reasoning, and partial truth to achieve tractability, robustness, low-cost solutions, and close resemblance to human-like decision making. It provides a flexible information processing capability for representation and evaluation of various real-life, ambiguous and uncertain situations and therefore forms the foundation for the conception and design of high machine intelligence quotient systems. At this juncture, the principal constituents of soft computing are fuzzy sets, neurocomputing, genetic algorithms, probabilistic reasoning, and rough sets.

One of the challenges of basic soft computing research is how to integrate its different constituting tools synergistically to achieve both generic and application-specific merits. Application-specific merits point to the advantages of integrated systems not achievable using the constituting tools singly.

Rough set theory, which is considered to be a newer soft computing tool compared with others, deals with uncertainty, vagueness, and incompleteness arising from the indiscernibility of objects in the universe. The main goal of rough set theoretic analysis is to synthesize or construct approximations, in terms of upper and lower bounds of concepts, properties, or relations from the acquired data. The key notions here are those of information granules and reducts. Information granules formalize the concept of finite precision representation of objects in real-life situations, and the reducts represent the core of an information system, both in terms of objects and features, in a granular universe. Its integration with fuzzy set theory, called rough-fuzzy computing, has motivated researchers to design a much stronger paradigm of reasoning and handling uncertainties associated with the data compared with those of the individual ones. The generalized theories of rough-fuzzy sets (when a fuzzy set is defined over crisp granules), fuzzy-rough sets (when a crisp set is defined over fuzzy granules), and fuzzy rough-fuzzy sets (when a fuzzy set is defined over fuzzy granules) have been applied successfully to problems such as classification, clustering, feature selection of real valued data, case selection, image processing and analysis, data mining and knowledge discovery, selecting vocabulary for information retrieval, and decision rule extraction.
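The lower and upper approximations mentioned above can be illustrated with a minimal sketch. It assumes the classical (crisp) rough set definitions: objects with identical values on the chosen attributes form one information granule, the lower approximation collects granules lying entirely inside the concept, and the upper approximation collects granules that overlap it. The function name `approximations` and the toy patient table are hypothetical and are not taken from the book.

```python
from collections import defaultdict

def approximations(objects, attributes, concept):
    """Return (lower, upper) approximations of `concept`.

    objects:    dict mapping object id -> dict of attribute values
    attributes: attribute names that define indiscernibility
    concept:    set of object ids to approximate
    """
    # Objects with identical values on `attributes` are indiscernible
    # and fall into the same granule (equivalence class).
    granules = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        granules[key].add(obj_id)

    lower, upper = set(), set()
    for granule in granules.values():
        if granule <= concept:   # granule lies entirely inside the concept
            lower |= granule
        if granule & concept:    # granule overlaps the concept
            upper |= granule
    return lower, upper

# Hypothetical toy information system: four patients, two symptoms.
patients = {
    "p1": {"fever": "yes", "cough": "yes"},
    "p2": {"fever": "yes", "cough": "yes"},
    "p3": {"fever": "no", "cough": "yes"},
    "p4": {"fever": "no", "cough": "no"},
}
flu = {"p1", "p3"}  # the concept to approximate

lower, upper = approximations(patients, ["fever", "cough"], flu)
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
```

Here p1 and p2 are indiscernible, yet only p1 belongs to the concept, so their granule ends up in the boundary region: it is possibly, but not certainly, part of the concept.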

In rough-fuzzy pattern recognition, fuzzy sets are used for handling uncertainties arising from ill-defined or overlapping nature of concepts, classes or regions, in terms of membership values, while rough sets are used for granular computing and handling uncertainties, due to granulation or indiscernibility in the feature space, in terms of lower and upper approximations of concepts or regions. On the one hand, rough information granules are used, for example, in defining class exactness, and encoding domain knowledge. On the other hand, granular computing, which deals with clumps of indiscernible data together rather than individual data points, leads to computation gain, thereby making it suitable for mining large data sets. Furthermore, depending on the problems, granules can be class dependent or independent. Several clustering algorithms have been formulated where the incorporation of rough sets resulted in a balanced mixture between restrictive or hard clustering and descriptive or fuzzy clustering. Judicious integration of the concept of rough sets with the existing fuzzy clustering algorithms has made the latter work faster with improved performance. Various real-life applications of these models, including those in bioinformatics, have also been reported during the last five to seven years. These are available in different journals, conference proceedings, and edited volumes. This scattered information causes inconvenience to readers, students, and researchers.

The current volume is aimed at providing a treatise in a unified framework describing how rough-fuzzy computing techniques can be judiciously formulated and used in building efficient pattern recognition models. On the basis of the existing as well as new results, the book is structured according to the major phases of a pattern recognition system (for example, clustering, classification, and feature selection) with a balanced mixture of theory, algorithm, and applications. Special emphasis is given to applications in bioinformatics and medical image processing.

The book consists of nine chapters. Chapter 1 provides an introduction to pattern recognition and data mining, along with different research issues and challenges related to high dimensional real-life data sets. The significance of soft computing in pattern recognition and data mining is also presented in Chapter 1. Chapter 2 presents the basic notions and characteristics of two soft computing tools, namely, fuzzy sets and rough sets. These are followed by the concept of information granules, the emergence of a rough-fuzzy computing paradigm and their relevance to pattern recognition. It also provides a mathematical framework for generalized rough sets incorporating the concept of fuzziness in defining the granules as well as the set. Various roughness and uncertainty measures with properties are reported. Different research issues related to rough granules are stated.

Chapter 3 mainly centers on a generalized unsupervised learning (clustering) algorithm, termed rough-fuzzy-possibilistic c-means. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in class definition, the membership function of fuzzy sets enables efficient handling of overlapping partitions. The algorithm incorporates both probabilistic and possibilistic memberships simultaneously to avoid the problems of noise sensitivity of fuzzy c-means and the coincident clusters of possibilistic c-means. The concept of crisp lower bound and fuzzy boundary of a class enables efficient selection of cluster prototypes. The algorithm is generalized in the sense that all the existing variants of c-means algorithms can be derived from it as special cases. Superiority in terms of computation time and performance is demonstrated. Several quantitative indices are reported for evaluating the performance on various real-life data sets.
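The noise sensitivity of fuzzy c-means referred to above can be seen in a small sketch. It uses the standard textbook membership formula of fuzzy c-means and the standard typicality formula of possibilistic c-means, not the book's rough-fuzzy-possibilistic algorithm. The function names, the fuzzifier `m = 2`, the scale parameter `eta = 1`, and the sample points are all assumptions chosen for illustration.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Probabilistic (fuzzy c-means) memberships; they sum to 1."""
    dists = [math.dist(point, c) for c in centroids]
    # A point that coincides with a centroid belongs fully to it.
    if 0.0 in dists:
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]

def pcm_typicalities(point, centroids, eta=1.0, m=2.0):
    """Possibilistic typicalities; each depends on one centroid only."""
    exponent = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (math.dist(point, c) ** 2 / eta) ** exponent)
            for c in centroids]

centroids = [(-1.0, 0.0), (1.0, 0.0)]  # hypothetical cluster centers
outlier = (0.0, 100.0)                 # far away, equidistant from both

u = fcm_memberships(outlier, centroids)
t = pcm_typicalities(outlier, centroids)
print(u)  # equal memberships, forced to sum to 1
print(t)  # both typicalities are close to 0
```

Because fuzzy c-means memberships must sum to one, the distant outlier receives a membership of one half in each cluster and pulls both centroids toward it, whereas the possibilistic typicalities correctly mark it as atypical of both. Combining the two kinds of membership is the motivation the paragraph above describes.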

Chapter 4 provides various supervised classification methods based on class-dependent f-granulation and rough set theoretic feature selection. The significance of granulation for better class discriminatory information and neighborhood rough sets for better feature selection is demonstrated. Extensive experimental results with quantitative indices are provided on both fully and partially labeled data sets. Future directions on the use of this concept in other computing paradigms are also provided.

Selection of nonredundant and relevant features of real valued data sets is a highly challenging problem. Chapter 5 addresses this issue. Methods described here are based on fuzzy-rough sets by maximizing the relevance and minimizing the redundancy of the selected features. Various new concepts such as fuzzy equivalence partition matrix, representation of Shannon's entropy for fuzzy approximation spaces, f-information measures to compute both relevance and redundancy of features, and feature evaluation indices are stated along with experimental results.

While several experimental results on both artificial and real-life data sets, including speech and remotely sensed multi-spectral image data, are provided in Chapters 3, 4, and 5 to demonstrate the effectiveness of the respective rough-fuzzy methodologies, the next four chapters are concerned only with certain specific applications, namely, in bioinformatics and medical imaging. Problems considered in bioinformatics include selection of a minimum set of bio-basis strings with maximum information for amino acid sequence analysis (Chapter 6), grouping functionally similar genes from microarray gene expression data through clustering (Chapter 7), and selection of relevant genes from high dimensional microarray data (Chapter 8). Selection of bio-basis strings is done by devising a relational clustering algorithm, called rough-fuzzy c-medoids. It judiciously integrates rough sets, fuzzy sets, and amino acid mutation matrices with hard c-medoids algorithms. The selected bio-basis strings are evaluated with newly defined indices in terms of a homology alignment score. Gene clusters thus identified may contribute to revealing underlying class structures, providing a useful tool for the exploratory analysis of biological data. The concept of fuzzy equivalence partition matrix, based on the theory of fuzzy-rough sets, is shown to be effective for selecting relevant and nonredundant continuous valued genes from high dimensional microarray gene expression data.

The problem of segmentation of brain MR images for visualization of human tissues during clinical analysis is addressed in Chapter 9 using rough-fuzzy clustering with new indices for feature extraction. The concept of discriminant analysis, based on the maximization of class separability, is used to circumvent the problems of initialization and local minima of rough-fuzzy clustering. Different challenging issues in the respective application domains and the ways of handling them with rough-fuzzy approaches are also discussed.

The relevant existing conventional/traditional approaches or techniques are also included wherever necessary. Directions for future research in the concerned topic are provided in each chapter. Most of the materials presented in the book are from our published works. For the convenience of readers, a comprehensive bibliography on the subject is also appended in each chapter. Some works in the related areas might have been omitted because of oversight or ignorance.

This book, which is unique in its character, will be useful to graduate students and researchers in computer science, electrical engineering, system science, medical science, bioinformatics, and information technology, both as a textbook and a reference book for some parts of the curriculum. Researchers and practitioners in industry and R&D laboratories working in the fields of system design, pattern recognition, machine learning, image analysis, vision, data mining, bioinformatics, and soft computing or computational intelligence will also benefit.

Finally, the authors take this opportunity to thank Mr. Michael Christian and Dr. Simone Taylor of John Wiley & Sons, Inc., Hoboken, New Jersey, for their initiative and encouragement. The authors also gratefully acknowledge the support provided by Prof. Malay K. Kundu, Dr. Saroj K. Meher, Dr. Debashis Sen, Ms. Sushmita Paul, and Mr. Indranil Dutta of Indian Statistical Institute in the preparation and proofreading of a few chapters of the manuscript. The book was written when one of the authors, Prof. S. K. Pal, held a J. C. Bose National Fellowship of the Government of India.

Pradipta Maji

Sankar K. Pal

Kolkata, India

About the Authors

Pradipta Maji received his BSc degree in Physics, MSc degree in Electronics Science, and PhD degree in the area of Computer Science from Jadavpur University, India, in 1998, 2000, and 2005, respectively.

Currently, he is an assistant professor in the Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India. He was associated with the Center for Soft Computing Research: A National Facility, Indian Statistical Institute, Kolkata, India, from 2005 to 2009. Before joining the Indian Statistical Institute, he was a lecturer in the Department of Computer Science and Engineering, Netaji Subhash Engineering College, Kolkata, India, from 2004 to 2005. In 2004, he visited the Laboratory of Information Security & Internet Applications (LISIA), Division of Electronics, Computer and Telecommunication Engineering, Pukyong National University, Pusan, Korea. During the period of September 2000 to April 2004, he was a research scholar at the Department of Computer Science and Technology, Bengal Engineering College (DU) (currently known as Bengal Engineering and Science University), Shibpur, Howrah, India. From 2002 to 2004, he also served as a Research and Development Consultant of Cellular Automata Research Laboratory (CARL), Kolkata, India. His research interests include pattern recognition, computational biology and bioinformatics, medical image processing, cellular automata, and soft computing. He has published more than 60 papers in international journals and conferences and is a reviewer for many international journals.

Dr. Maji received the 2006 Best Paper Award of the International Conference on Visual Information Engineering from The Institution of Engineering and Technology, UK, the 2008 Microsoft Young Faculty Award from Microsoft Research Laboratory India Pvt., the 2009 Young Scientist Award from the National Academy of Sciences, India, and the 2011 Young Scientist Award from the Indian National Science Academy, and was selected as the 2009 Associate of the Indian Academy of Sciences.

Sankar K. Pal is a distinguished scientist of the Indian Statistical Institute and a former Director. He is also a J.C. Bose Fellow of the Government of India. He founded the Machine Intelligence Unit and the Center for Soft Computing Research, a national facility in the institute in Calcutta. He received his PhD in Radio Physics and Electronics from the University of Calcutta in 1979, and another PhD in Electrical Engineering along with DIC from Imperial College, University of London, in 1982. He joined his institute in 1975 as a CSIR Senior Research Fellow and later became a Full Professor in 1987, Distinguished Scientist in 1998, and the Director for the term 2005–2010.

He worked at the University of California, Berkeley, and the University of Maryland, College Park, in 1986–1987; the NASA Johnson Space Center, Houston, Texas, in 1990–1992 and 1994; and the US Naval Research Laboratory, Washington, DC, in 2004. Since 1997 he has served as a distinguished visitor of the IEEE Computer Society (USA) for the Asia-Pacific region, and held several visiting positions at universities in Italy, Poland, Hong Kong, and Australia.

Prof. Pal is a Fellow of the IEEE, USA, the Academy of Sciences for the Developing World (TWAS), Italy, International Association for Pattern Recognition, USA, International Association of Fuzzy Systems, USA, and all the four National Academies for Science/Engineering in India. He is a coauthor of 17 books and more than 400 research publications in the areas of pattern recognition and machine learning, image processing, data mining and web intelligence, soft computing, neural nets, genetic algorithms, fuzzy sets, rough sets, and bioinformatics.

He received the 1990 S.S. Bhatnagar Prize (which is the most coveted award for a scientist in India), and many prestigious awards in India and abroad including the 1999 G.D. Birla Award, 1998 Om Bhasin Award, 1993 Jawaharlal Nehru Fellowship, 2000 Khwarizmi International Award from the Islamic Republic of Iran, 2000–2001 FICCI Award, 1993 Vikram Sarabhai Research Award, 1993 NASA Tech Brief Award (USA), 1994 IEEE Transactions Neural Networks Outstanding Paper Award (USA), 1995 NASA Patent Application Award (USA), 1997 IETE-R.L. Wadhwa Gold Medal, the 2001 INSA-S.H. Zaheer Medal, 2005–2006 Indian Science Congress-P.C. Mahalanobis Birth Centenary Award (Gold Medal) for Lifetime Achievement, 2007 J.C. Bose Fellowship of the Government of India, and 2008 Vigyan Ratna Award from Science & Culture Organization, West Bengal.

Prof. Pal is currently or has in the past been an Associate Editor of IEEE Trans. Pattern Analysis and Machine Intelligence (2002–2006), IEEE Trans. Neural Networks (1994–1998 and 2003–2006), Neurocomputing (1995–2005), Pattern Recognition Letters, Int. J. Pattern Recognition & Artificial Intelligence, Applied Intelligence, Information Sciences, Fuzzy Sets and Systems, Fundamenta Informaticae, LNCS Trans. on Rough Sets, Int. J. Computational Intelligence and Applications, IET Image Processing, J. Intelligent Information Systems, and Proc. INSA-A; Editor-in-Chief, Int. J. Signal Processing, Image Processing and Pattern Recognition; Series Editor, Frontiers in Artificial Intelligence and Applications, IOS Press, and Statistical Science and Interdisciplinary Research, World Scientific; a Member, Executive Advisory Editorial Board, IEEE Trans. Fuzzy Systems, Int. Journal on Image and Graphics, and Int. Journal of Approximate Reasoning; and a Guest Editor of IEEE Computer.

Chapter 1

Introduction to Pattern Recognition and Data Mining

1.1 Introduction

Pattern recognition is an activity that human beings normally excel in. The task of pattern recognition is encountered in a wide range of human activity. In a broader perspective, the term could cover any context in which some decision or forecast is made on the basis of currently available information. Mathematically, the problem of pattern recognition deals with the construction of a procedure to be applied to a set of inputs; the procedure assigns each new input to one of a set of classes on the basis of observed features. The construction of such a procedure on an input data set is defined as pattern recognition.

A pattern typically comprises some features or essential information specific to a pattern or a class of patterns. Pattern recognition, as per the convention, is the study of how machines can observe the environment, learn to distinguish patterns of interest from their background, and make sound and reasonable decisions about the categories of the patterns. In other words, the discipline of pattern recognition essentially deals with the problem of developing algorithms and methodologies that can enable the computer implementation of many recognition tasks that humans normally perform. The objective is to perform these tasks more accurately, faster, and perhaps more economically than humans and, in many cases, to release them from drudgery resulting from performing routine recognition tasks repetitively and mechanically. The scope of pattern recognition also encompasses tasks at which humans are not good, such as reading bar codes. Hence, the goal of pattern recognition research is to devise ways and means of automating certain decision-making processes that lead to classification and recognition.

Pattern recognition can be viewed as a twofold task, consisting of learning the invariant and common properties of a set of samples characterizing a class, and of deciding that a new sample is a possible member of the class by noting that it has properties common to those of the set of samples. The task of pattern recognition can be described as a transformation from the measurement space $M$ to the feature space $F$ and finally to the decision space $D$; that is, $M \rightarrow F \rightarrow D$.
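The twofold task and the chain of transformations described above can be sketched in a few lines. This is only an illustrative sketch: the feature choice (mean and spread of a raw signal), the nearest-prototype decision rule, the function names, and the toy data are all assumptions and are not taken from the book.

```python
import math

def extract_features(measurement):
    """Measurement space -> feature space: summarize a raw signal."""
    mean = sum(measurement) / len(measurement)
    spread = max(measurement) - min(measurement)
    return (mean, spread)

def learn_prototypes(labeled_measurements):
    """Learning step: average the feature vectors of each class."""
    sums = {}
    for measurement, label in labeled_measurements:
        f = extract_features(measurement)
        total, count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((total[0] + f[0], total[1] + f[1]), count + 1)
    return {label: (t[0] / n, t[1] / n) for label, (t, n) in sums.items()}

def decide(measurement, prototypes):
    """Feature space -> decision space: pick the nearest class prototype."""
    f = extract_features(measurement)
    return min(prototypes, key=lambda label: math.dist(f, prototypes[label]))

# Hypothetical labeled samples: short raw signals from two classes.
training = [
    ([1.0, 1.2, 0.9], "low"),
    ([1.1, 0.8, 1.0], "low"),
    ([5.0, 5.5, 4.8], "high"),
    ([5.2, 4.9, 5.1], "high"),
]
prototypes = learn_prototypes(training)
label = decide([4.7, 5.3, 5.0], prototypes)
print(label)
```

`learn_prototypes` corresponds to learning the common properties of the samples of each class, and `decide` corresponds to judging whether a new sample shares those properties.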
