Fuzzy C-mean Clustering using Data Mining - VIGNESH RAMAMOORTHY H - E-Book


Description

The goal of traditional clustering is to assign each data point to one and only one cluster. In contrast, fuzzy clustering assigns different degrees of membership to each point, so the membership of a point is shared among several clusters. This creates the concept of fuzzy boundaries, which differs from the traditional concept of well-defined boundaries. In hard clustering, data is divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering (also referred to as soft clustering), data elements can belong to more than one cluster, and associated with each element is a set of membership levels. These indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is the process of assigning these membership levels and then using them to assign data elements to one or more clusters. This work uses the traditional FCM algorithm to locate the centers of clusters for a bulk of data points; the potential of all data points is calculated with respect to the specified centers. Dividing the data set into a large number of clusters slows processing and requires more memory, so traditional clustering divides the data into four clusters, and each data point is located in one specified cluster. Imprecision in data and information gathered from and about our environment is either statistical (e.g., the outcome of a coin toss is a matter of chance) or nonstatistical (e.g., "apply the brakes pretty soon"). Many algorithms can be implemented to cluster data sets; fuzzy C-means clustering (FCM) is an efficient and common one. We tune this algorithm to handle the remaining data points that were omitted because of their distance from all clusters, and to develop a high-performance algorithm that sorts and groups a data set into a variable number of clusters, so this data can be used in controlling and managing those clusters.
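The membership idea described above can be sketched in a few lines of code. The following is a minimal, self-contained fuzzy C-means implementation (the function name, parameters, and the toy two-blob data set are illustrative assumptions, not taken from the book); it returns the cluster centers and a membership matrix whose rows sum to one, so each point's association is shared across clusters rather than assigned to exactly one.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal FCM sketch: returns cluster centers and the membership
    matrix U (n points x c clusters, each row sums to 1)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # random initial memberships, normalized so each row sums to 1
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # centers are membership-weighted means of the data
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # distance from every point to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-10)  # guard against division by zero
        # standard FCM membership update
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.linalg.norm(U_new - U) < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# two well-separated toy blobs; points get high membership in the near cluster
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, c=2)
```

On this toy data the two recovered centers sit near the two blobs, and the membership rows for points between clusters would be split between them, which is exactly the fuzzy-boundary behavior the description contrasts with hard clustering.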




VIGNESH RAMAMOORTHY H

Fuzzy C-mean Clustering using Data Mining

BookRix GmbH & Co. KG, 80331 Munich

TABLE OF CONTENTS

 

 

 

CHAPTER I - INTRODUCTION

     1.1 Background Study

     1.2 Data, Information and Knowledge

1.2.1 Data

1.2.2 Information

1.2.3 Knowledge

1.2.4  Data warehouses

     1.3 Data Mining

           1.3.1 Trends in Data Mining

     1.4 Process of Data Mining

          1.4.1 Classes

          1.4.2 Cluster

          1.4.3 Association

          1.4.4 Sequential Patterns

          1.4.5 Data Mining Consists Of Five Major Elements

     1.5 Clustering

          1.5.1 General Types of Clusters

     1.6 Cluster Analysis

     1.7 Classification of Clustering

     1.8 Fuzzy C- Means Clustering

     1.9 Chapter Layout

 

CHAPTER II - LITERATURE SURVEY

     2.1 Applications of Fuzzy C-Means Algorithm

     2.2 Summary

 

CHAPTER III - METHODOLOGY

     3.1 Existing System

           3.1.1 Disadvantages

     3.2 Problem Description

          3.2.1 Results

          3.2.2 Drawbacks

     3.3 Proposed System

          3.3.1  Fuzzy Sets and Membership Functions

          3.3.2  Fuzziness and Probability

          3.3.3  Clustering

          3.3.4  Difficulties with Fuzzy Clustering

          3.3.5   Objectives and Challenges

3.4 Proposed Performance Measures

         3.4.1   FCM Clustering with Varying Density

 

CHAPTER IV- IMPLEMENTATION

     4.1 Simulation Environment

     4.2 Experimental Results

 

CHAPTER V- CONCLUSION AND FUTURE ENHANCEMENT    

     5.1 Conclusion

     5.2 Future Enhancement

 

 

REFERENCES

1. INTRODUCTION

 

1.1 Background Study

          Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

 

            Mining of multimedia data is more involved than that of traditional business data because multimedia data are unstructured by nature. There are no well-defined fields of data with precise and unambiguous meaning, and the data must be processed to arrive at fields that can provide content information about it. Such processing often leads to non-unique results with several possible interpretations. In fact, multimedia data are often subject to varied interpretations even by human beings. Another difficulty in mining multimedia data is its heterogeneous nature. The data are often the outputs of various kinds of sensor modalities, with each modality needing sophisticated preprocessing, synchronization and transformation procedures.

 

            The objective of this work is to discover accurate sequence patterns. Because PTM event data captures the start time, end time, event label and associated probability for sequence pattern discovery, the PTM representation is more realistic, and more useful and accurate knowledge can be discovered from this data.

 

 

1.2 Data, Information and Knowledge

1.2.1. Data

 Data are any facts, numbers or text that can be processed by a computer. Today, organizations are accumulating vast and growing amounts of data in different formats and different databases. This includes operational or transactional data, such as sales, cost, inventory, payroll and accounting; non-operational data, such as industry sales, forecast data and macroeconomic data; and metadata, which is data about the data itself, such as a logical database design or data dictionary definitions.

1.2.2. Information

            The patterns, associations, or relationships among all this data can provide information. For example, analysis of retail point-of-sale transaction data can yield information on which products are selling and when.

1.2.3. Knowledge

            Information can be converted into knowledge about historical patterns and future trends. For example, summary information on retail supermarket sales can be analyzed in light of promotional efforts to provide knowledge of consumer buying behavior. Thus, a manufacturer or retailer could determine which items are most susceptible to promotional efforts.

1.2.4. Data warehouses

            Dramatic advances in data capture, processing power, data transmission and storage capabilities are enabling organizations to integrate their various databases into a data warehouse. Data warehousing is defined as a process of centralized data management and retrieval. Data warehousing, like data mining, is a relatively new term, although the concept itself has been around for years. Data warehousing represents an ideal vision of maintaining a central repository of all organizational data.

 

A data warehouse is a database used for reporting and analysis. The data stored in the warehouse are uploaded from operational systems (such as marketing and sales). The data may pass through an operational data store for additional operations before they are used in the data warehouse (DW) for reporting. The typical ETL-based data warehouse uses staging, integration, and access layers to house its key functions. The staging layer, or staging database, stores raw data extracted from each of the disparate source data systems. The integration layer integrates the disparate data sets by transforming the data from the staging layer, often storing this transformed data in an operational data store (ODS) database. The integrated data are then moved to yet another database, often called the data warehouse database, where the data are arranged into hierarchical groups, often called dimensions, and into facts and aggregate facts. The combination of facts and dimensions is sometimes called a star schema. The access layer helps users retrieve data.
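The staging, integration and access layers described above can be illustrated with a toy in-memory sketch (all table names, fields and values here are hypothetical examples, not part of any real warehouse): raw rows land in a staging area, are cleaned into an operational data store, and are finally arranged into a small star schema of one dimension and one fact table.

```python
# staging layer: raw extracts from a source system, kept as-is
staging = [
    {"order_id": 1, "product": "tea",    "qty": "3", "city": "Munich"},
    {"order_id": 2, "product": "coffee", "qty": "5", "city": "Munich"},
    {"order_id": 3, "product": "tea",    "qty": "2", "city": "Berlin"},
]

# integration layer (ODS): transformed rows with proper types
ods = [{**row, "qty": int(row["qty"])} for row in staging]

# warehouse layer: a star schema with a product dimension and a fact table
dim_product = {p: key for key, p in
               enumerate(sorted({row["product"] for row in ods}))}
fact_sales = [{"product_key": dim_product[row["product"]],
               "qty": row["qty"]} for row in ods]

# access layer: users aggregate facts by dimension key
total_tea = sum(f["qty"] for f in fact_sales
                if f["product_key"] == dim_product["tea"])
print(total_tea)  # total tea quantity: 3 + 2 = 5
```

A production warehouse would of course use a relational database and an ETL tool rather than Python dictionaries, but the flow of data through the three layers is the same.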

 

1.3 Data Mining

            Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time-consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.

 

Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high-performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my next promotional mailing, and why?" This work provides an introduction to the basic technologies of data mining. Examples of profitable applications illustrate its relevance to today's business environment, along with a basic description of how data warehouse architectures can evolve to deliver the value of data mining to end users.

 

We are in an age often referred to as the information age. In this information age, because we believe that information leads to power and success, and thanks to sophisticated technologies such as computers and satellites, we have been collecting tremendous amounts of information. Initially, with the advent of computers and means for mass digital storage, we started collecting and storing all sorts of data, counting on the power of computers to help sort through this amalgam of information. Unfortunately, these massive collections of data stored on disparate structures very rapidly became overwhelming. This initial chaos led to the creation of structured databases and database management systems (DBMS). Efficient database management systems have been very important assets for managing a large corpus of data, and especially for effective and efficient retrieval of particular information from a large collection whenever needed. The proliferation of database management systems has also contributed to the recent massive gathering of all sorts of information. Today, we have far more information than we can handle: from business transactions and scientific data, to satellite pictures, text reports and military intelligence. Information retrieval is simply not enough anymore for decision-making. Confronted with huge collections of data, we have created new needs to help us make better managerial choices: automatic summarization of data, extraction of the "essence" of the information stored, and the discovery of patterns in raw data.