Academic Journal

Hybrid focused crawling on the Surface and the Dark Web

Bibliographic Details
Title: Hybrid focused crawling on the Surface and the Dark Web
Authors: Christos Iliou, George Kalpakis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris
Source: EURASIP Journal on Information Security, Vol 2017, Iss 1, Pp 1-13 (2017)
Publication Information: SpringerOpen, 2017.
Publication Year: 2017
Collection: LCC:Computer engineering. Computer hardware
LCC:Electronic computers. Computer science
Subject Terms: Focused crawling, Dark web, Darknets, Tor, I2P, Freenet, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
Description: Abstract: Focused crawlers enable the automatic discovery of Web resources about a given topic by navigating the Web link structure and selecting which hyperlinks to follow based on their estimated relevance to the topic of interest. This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed crawler can seamlessly navigate the Surface Web and several darknets of the Dark Web (i.e., Tor, I2P, and Freenet) during a single crawl by automatically adapting its crawling behavior and its classifier-guided hyperlink selection strategy to the destination network type and to the strength of the local evidence in the vicinity of a hyperlink. The work investigates 11 hyperlink selection methods, among them a novel strategy based on the dynamic linear combination of a link-based classifier and a parent Web page classifier. This hybrid focused crawler is demonstrated on the discovery of Web resources containing recipes for producing homemade explosives. The evaluation experiments indicate the effectiveness of the proposed focused crawler on both the Surface and the Dark Web.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2510-523X
Relation: http://link.springer.com/article/10.1186/s13635-017-0064-5; https://doaj.org/toc/2510-523X
DOI: 10.1186/s13635-017-0064-5
Access URL: https://doaj.org/article/b9e9be0ac972459a99d8a67cbda7d5b2
Accession Number: edsdoj.b9e9be0ac972459a99d8a67cbda7d5b2
Database: Directory of Open Access Journals