Details
Name

Implementation of parallel multi node data mining algorithm based on

Author
Tutor: WenJun
School: University of Electronic Science and Technology
Course: Computer System Architecture
Keywords: Cloud computing, Data mining, MapReduce, Hadoop
CLC: TP311.13
Type: Master's thesis
Year:  2012

Abstract:
With the development of the information age, the rapid growth of data has become a serious problem. For this reason, data-mining methods are needed to handle vast amounts of data. Data mining can discover unknown, hidden, and potentially valuable knowledge to support decisions, and that knowledge can then be applied to practical problems.

The traditional Apriori algorithm runs on a single node and does not adapt well to massive data processing. To meet the demands of processing massive data, there is an urgent need to implement the mining algorithm on multiple nodes, executing it across the nodes in parallel so that it achieves a high degree of parallelism.

This paper presents an in-depth study of MapReduce. Based on simple tests and an analysis of Hadoop's MapReduce framework model, we propose an innovative task-scheduling algorithm (DWSA). The algorithm handles load balancing adaptively by dynamically monitoring the number of tasks in the system and by using a priority-based approach to serve tasks. We also improve the basket storage model: a Boolean matrix is used to store the data, and a new association-rule mining algorithm is proposed alongside it. The new algorithm uses vector operations to carry out the mining. The new model and algorithm are then run on the optimized platform to improve the efficiency of data mining.
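The abstract does not give the details of the Boolean-matrix storage model or of the new association-rule algorithm, so the following is only a minimal single-node sketch of the general idea, not the thesis's method. It assumes each item is stored as one column of the Boolean matrix, packed into a Python integer used as a bit vector, and that the support of an itemset is obtained by AND-ing the columns of its items and counting the set bits. The function names `build_boolean_matrix`, `support`, and `frequent_itemsets`, and the `max_size` parameter, are hypothetical; the MapReduce distribution and the DWSA scheduler are not modeled.

```python
from itertools import combinations


def build_boolean_matrix(transactions):
    """Build one bit-vector column per item.

    Bit t of columns[item] is 1 when transaction t contains the item.
    (Assumed layout for illustration; the thesis may store it differently.)
    """
    items = sorted({item for basket in transactions for item in basket})
    columns = {item: 0 for item in items}
    for row, basket in enumerate(transactions):
        for item in basket:
            columns[item] |= 1 << row
    return items, columns


def support(itemset, columns):
    """Count transactions containing every item: AND the columns, count 1-bits."""
    members = iter(itemset)
    vector = columns[next(members)]
    for item in members:
        vector &= columns[item]
    return bin(vector).count("1")


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset_tuple: support} for itemsets up to max_size items."""
    items, columns = build_boolean_matrix(transactions)
    # Only items that are frequent on their own can appear in larger itemsets.
    candidates = [i for i in items if support((i,), columns) >= min_support]
    result = {}
    for size in range(1, max_size + 1):
        found_any = False
        for combo in combinations(candidates, size):
            count = support(combo, columns)
            if count >= min_support:
                result[combo] = count
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return result


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "beer"},
        {"bread", "milk", "beer"},
        {"milk"},
    ]
    for itemset, count in frequent_itemsets(baskets, min_support=2).items():
        print(itemset, count)
```

Because every support count reduces to bitwise AND operations on the stored columns, the transaction data is scanned only once, when the matrix is built. That is one plausible reason a Boolean-matrix layout suits repeated candidate counting better than rescanning raw baskets as classic Apriori does.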