This paper presents a new method of focused crawling. A diagram of topic levels is built from a concept tree. Using this diagram, each candidate URL is annotated with semantic information about its topic level, and URLs are selected for crawling according to their semantic relevance and importance. The crawler therefore searches only the important subset of the WWW that is semantically relevant to a specific topic.
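The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept tree contents, the rule that concepts nearer the root score higher, the linear weighting `alpha`, and the names `CONCEPT_LEVELS`, `semantic_relevance`, and `Frontier` are all assumptions made for illustration, since the abstract does not specify them.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical concept tree, flattened to term -> topic level (depth below
# the root topic). The terms and levels are illustrative, not from the paper.
CONCEPT_LEVELS = {
    "crawler": 0,
    "focused crawling": 1,
    "link analysis": 1,
    "concept tree": 2,
}


def semantic_relevance(anchor_text: str) -> float:
    """Score text by its shallowest matching concept.

    Assumption: a concept closer to the root is more relevant, so the
    score is 1 / (1 + level). Text matching no concept scores 0.0.
    """
    text = anchor_text.lower()
    levels = [lvl for term, lvl in CONCEPT_LEVELS.items() if term in text]
    if not levels:
        return 0.0
    return 1.0 / (1 + min(levels))


@dataclass(order=True)
class Candidate:
    priority: float                 # negated score, because heapq is a min-heap
    url: str = field(compare=False)


class Frontier:
    """URL queue ordered by a weighted mix of relevance and importance."""

    def __init__(self, alpha: float = 0.7):
        self.alpha = alpha          # assumed weight given to semantic relevance
        self._heap = []
        self._seen = set()

    def push(self, url: str, anchor_text: str, importance: float) -> None:
        if url in self._seen:
            return                  # each URL is considered only once
        self._seen.add(url)
        relevance = semantic_relevance(anchor_text)
        if relevance == 0.0:
            return                  # off-topic URLs are never queued
        score = self.alpha * relevance + (1 - self.alpha) * importance
        heapq.heappush(self._heap, Candidate(-score, url))

    def pop(self):
        """Return the best remaining URL, or None when the queue is empty."""
        return heapq.heappop(self._heap).url if self._heap else None
```

In this sketch `importance` is supplied by the caller as a number in [0, 1]; a real crawler might derive it from link structure. A URL whose anchor text matches no concept is dropped regardless of its importance, which mirrors the idea of crawling only the topic-relevant subset of the web.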
Cite this article
ZENG Yicong, YANG Guanzhong, LIU Ke. Research on Focused Crawling Technology Based on the Concept Tree[J]. Science Technology and Engineering, 2005, (12): 785-790796. (Original: 曾义聪, 杨贯中, 刘柯. 基于概念树的主题爬取技术研究[J]. 科学技术与工程.)