What is data mining?

CHAPTER 1 INTRODUCTION

The explosive growth in the amount of data, and the challenge of finding interesting patterns in it, led to the birth of data mining. Data mining is the process of extracting interesting (valid, novel, useful, and understandable) patterns from massive data; these patterns are actionable and can be used for enterprise decision making. It is one of the fundamental steps in discovering knowledge in databases. The basic types of data mining techniques are association rules, classification and clustering, web mining, and sequential pattern mining.

Association rule mining is one of the basic and most important data mining techniques. It extracts interesting correlations, frequent patterns, and associations between sets of items that can be used in decision making. In a grocery store, for example, an association rule can describe the items that customers buy together: “30% of people who buy noodles also buy ketchup.” Such a pattern can be useful for developing marketing strategies and advertising plans. Association rules are useful in areas such as market and risk management, customer segmentation, finance, telecommunications networks, intrusion detection, web usage mining, and bioinformatics.

Today, businesses store large amounts of data from their daily operations, primarily in transaction databases. Finding all interesting association rules in a large database is quite challenging. Most current approaches require multiple database scans and are very expensive. The goal is to create an efficient approach that requires less space and has lower computational overhead.

CHAPTER 2 PROBLEM STATEMENT

Recommendations ...... half of the paper ...... given itemsets that are not expected to be large, thus avoiding unnecessary effort spent counting these itemsets. The AIS algorithm requires more space and spends more effort generating candidate itemsets that are later discarded.
In addition to this major drawback, the algorithm requires too many passes over the database.

3.1.2 Apriori Algorithm:

The Apriori algorithm, proposed by Agrawal et al. [2], improves on AIS. The FP-growth algorithm initially scans the transaction database to obtain item frequencies (the support of each single item). Items whose frequency is lower than the given minimum support are discarded from the transactions. Furthermore, the items in each transaction are sorted in descending order of their frequency in the database. Descending order results in a shorter execution time than ascending or random order.
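
The initial FP-growth scan described above can be illustrated with a minimal Python sketch. It covers only the preprocessing step (counting, filtering, and sorting) and does not build the FP-tree. The function name `prepare_transactions`, the treatment of `min_support` as an absolute transaction count, and the alphabetical tie-breaking between equally frequent items are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def prepare_transactions(transactions, min_support):
    """Sketch of the first FP-growth pass (hypothetical helper).

    Counts single-item support, discards infrequent items, and sorts the
    remaining items of each transaction by descending frequency.
    min_support is assumed to be an absolute number of transactions.
    """
    # Support of an item = number of transactions that contain it.
    counts = Counter(item for t in transactions for item in set(t))

    # Discard items whose frequency is lower than the minimum support.
    frequent = {item: c for item, c in counts.items() if c >= min_support}

    prepared = []
    for t in transactions:
        kept = [item for item in set(t) if item in frequent]
        # Descending frequency; ties broken alphabetically (an assumption)
        # so that the output is deterministic.
        kept.sort(key=lambda item: (-frequent[item], item))
        if kept:
            prepared.append(kept)
    return frequent, prepared


# Toy grocery data, invented for this example.
baskets = [
    ["noodles", "ketchup", "bread"],
    ["noodles", "ketchup"],
    ["noodles", "milk"],
    ["bread", "ketchup"],
]
frequent_items, ordered_baskets = prepare_transactions(baskets, min_support=2)
print(frequent_items)
print(ordered_baskets)
```

With a minimum support of 2, `milk` appears in only one basket and is dropped, and each remaining basket lists its most frequent items first. A full FP-growth implementation would then insert these ordered transactions into a prefix tree, where putting frequent items first tends to increase prefix sharing.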