When processing massive data, current Web mining systems built on a single server hit a computational bottleneck. To solve this problem, we propose a Web mining method based on cloud-computing technology: large data sets and mining tasks are decomposed across multiple computers and processed in parallel. We use the open-source project Hadoop to build a parallel Web mining platform, and we put forward an improved MapReduce model, MapReduce-LP. The effectiveness of the system and the efficiency of the new model are verified with a Web log mining job in an electronic commerce system. Experimental results show that using cloud-computing technology to process large data on a cluster can significantly improve the efficiency of Web mining.
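As a rough illustration of the decomposition the abstract describes, the sketch below counts page hits in Web log splits using the plain map, shuffle, and reduce pattern. It is not the MapReduce-LP model, whose details the abstract does not give, and it runs in a single process rather than on Hadoop. The log layout (`<ip> <timestamp> <url> <status>`), the sample data, and all function names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(log_line):
    """Emit (url, 1) for one successful request; skip everything else."""
    parts = log_line.split()
    # Assumed (hypothetical) log layout: "<ip> <timestamp> <url> <status>"
    if len(parts) != 4:
        return []  # malformed line: emit nothing
    _ip, _timestamp, url, status = parts
    return [(url, 1)] if status == "200" else []


def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the partial counts for one URL."""
    return key, sum(values)


def count_page_hits(splits):
    """Run map over every line of every split, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(line) for split in splits for line in split
    )
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Each inner list stands in for one input split handled by one worker.
SPLITS = [
    ["10.0.0.1 t1 /cart 200", "10.0.0.2 t2 /item/7 200"],
    ["10.0.0.3 t3 /cart 200", "10.0.0.1 t4 /cart 404", "garbage"],
]
HITS = count_page_hits(SPLITS)
print(HITS)
```

On a real cluster, each split would be mapped on a different node and the shuffle would move data across the network; here the splits are only nested lists, which keeps the data flow visible without any Hadoop dependency.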
Cite this article
YING Yi, REN Kai, CAO Yang. Web Mining Based on Improved MapReduce Model[J]. Science Technology and Engineering, 2013, 13(5).