Web Crawling Algorithms

Nigam, Aviral. "Web Crawling Algorithms." International Journal of Computer Science and Artificial Intelligence 4 (2014): 63-67.


As the Internet grows rapidly, it has become important to make searching for content both faster and more accurate. Without efficient search engines, it would be impossible to get accurate results. To address this problem, software called a "Web Crawler" is used, which applies various algorithms to reach that goal. These algorithms rely on different heuristic functions to increase the efficiency of the crawler. A* and Adaptive A* Search are among the best path-finding algorithms. A* uses best-first search to find a least-cost path from a given initial node to a goal node. In this work, some of the existing Web Crawler algorithms are studied, and the A* / Adaptive A* method is modified for use in this domain. Being heuristic approaches, A* and Adaptive A* can be used to find desired results in web-like weighted environments. We create a virtual web environment using graphs and, across various web crawling algorithms, compare the time taken to find a desired node starting from a random node.
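To illustrate the best-first, least-cost search that the abstract attributes to A*, here is a minimal sketch of textbook A* on a small weighted graph. It is not the modified A* / Adaptive A* method from the paper. The function name `a_star`, the toy graph `web`, its edge weights, and the heuristic table `h` are all hypothetical and chosen only for illustration.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Return (cost, path) for a least-cost path, or (None, []) if unreachable.

    graph maps each node to a dict {neighbor: edge_weight}.
    heuristic(node) estimates the remaining cost from node to the goal.
    """
    # Frontier entries are (f, g, node, path), where f = g + heuristic(node).
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost from start to each node

    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, weight in graph.get(node, {}).items():
            new_g = g + weight
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(neighbor), new_g, neighbor, path + [neighbor]),
                )
    return None, []


# Hypothetical "virtual web": pages are nodes, links are weighted edges.
web = {
    "home": {"news": 1, "blog": 4},
    "news": {"blog": 2, "article": 5},
    "blog": {"article": 1},
    "article": {},
}

# Hypothetical heuristic: an estimate of the remaining cost to "article"
# that never overestimates the true cost.
h = {"home": 3, "news": 2, "blog": 1, "article": 0}

cost, path = a_star(web, "home", "article", h.get)
print(cost, path)
```

With these assumed weights the search returns the route through `news` and `blog` rather than the direct `home` to `blog` link, because the heuristic never overestimates the remaining cost and so the first time the goal is popped its path is the cheapest one.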
