S. M. Beitzel, E. C. Jensen, A. Chowdhury, D. Grossman, and O. Frieder. Hourly analysis of a very large topically categorized web query log. In SIGIR'04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 321-328, New York, NY, USA, 2004. ACM Press.
V. Gupta and R. H. Campbell. Internet search engine freshness by web server help. In Proceedings of the Symposium on Internet Applications (SAINT), pages 113-119, San Diego, California, USA, 2001.
R. Lempel and S. Moran. Predictive caching and prefetching of query results in search engines. In WWW'03: Proceedings of the 12th international conference on World Wide Web, pages 19-28, New York, NY, USA, 2003. ACM Press.
X. Liu and W. B. Croft. Cluster-based retrieval using language models. In SIGIR'04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 186-193, New York, NY, USA, 2004. ACM Press.
A. Spink, B. J. Jansen, D. Wolfram, and T. Saracevic. From e-sex to e-commerce: Web search changes. Computer, 35(3):107-109, 2002.
Soumen Chakrabarti, Byron Dom, Rakesh Agrawal, Prabhakar Raghavan. "Scalable Feature Selection, Classification and Signature Generation for Organizing Large Text Databases into Hierarchical Topic Taxonomies". VLDB Journal, 1998.
Soumen Chakrabarti, M. van den Berg and B. Dom. "Focused crawling: A new approach to topic-specific Web resource discovery". In Proc. of WWW8, Toronto, May 1999.
Daniel Boley, Maria Gini, Robert Gross, Eui-Hong (Sam) Han, Kyle Hastings, George Karypis, Vipin Kumar, Bamshad Mobasher, and Jerome Moore. "Document Categorization and Query Generation on the World Wide Web Using WebACE". AI Review, 1999.
O. R. Zaiane, M. Xin, J. Han. "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs". Proc. Advances in Digital Libraries Conf. (ADL'98), Santa Barbara, CA, April 1998, pp. 19-29.
D.W. Cheung, B. Kao, and J.W. Lee. "Discovering User Access Patterns on the World-Wide-Web". Proc. First Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-97), Singapore, February, 1997.
R. Cooley, B. Mobasher, J. Srivastava. "Web Mining: Information and Pattern Discovery on the World Wide Web (A Survey Paper)". In Proceedings of the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), November 1997.
Soumen Chakrabarti, Byron E. Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, David Gibson, and Jon Kleinberg. "Mining the Web's Link Structure". IEEE Computer, Vol. 32, No. 8, August 1999.
Masaru Kitsuregawa, Masashi Toyoda, Iko Pramudiono. "WEB community mining and WEB log mining: Commodity Cluster based Execution". Thirteenth Australasian Database Conference (ADC2002).
Flake, G. W., Lawrence, S., and Giles, C. L. 2000. Efficient identification of Web communities. In Proceedings of the Sixth ACM SIGKDD international Conference on Knowledge Discovery and Data Mining (Boston, Massachusetts, United States, August 20 - 23, 2000). KDD '00. ACM Press, New York, NY, 150-160.
Ino, H., Kudo, M., and Nakamura, A. 2005. Partitioning of Web graphs by community topology. In Proceedings of the 14th international Conference on World Wide Web (Chiba, Japan, May 10 - 14, 2005). WWW '05. ACM Press, New York, NY, 661-669.
Imafuji, N. and Kitsuregawa, M. 2002. Effects of maximum flow algorithm on identifying web community. In Proceedings of the 4th international Workshop on Web information and Data Management (McLean, Virginia, USA, November 08 - 08, 2002). WIDM '02. ACM Press, New York, NY, 43-48.
Ipeirotis, P. G., Ntoulas, A., Cho, J., and Gravano, L. 2005. Modeling and Managing Content Changes in Text Databases. In Proceedings of the 21st international Conference on Data Engineering (ICDE'05) - Volume 00 (April 05 - 08, 2005). ICDE. IEEE Computer Society, Washington, DC, 606-617.
Najork, M. and Wiener, J. L. 2001. Breadth-first crawling yields high-quality pages. In Proceedings of the 10th international Conference on World Wide Web (Hong Kong, Hong Kong, May 01 - 05, 2001). WWW '01. ACM Press, New York, NY, 114-118.
Boldi, P., Codenotti, B., Santini, M., and Vigna, S. 2004. UbiCrawler: a scalable fully distributed web crawler. Softw. Pract. Exper. 34, 8 (Jul. 2004), 711-726.
Fetterly, D., Manasse, M., and Najork, M. 2005. Detecting phrase-level duplication on the world wide web. In Proceedings of the 28th Annual international ACM SIGIR Conference on Research and Development in information Retrieval (Salvador, Brazil, August 15 - 19, 2005). SIGIR '05. ACM Press, New York, NY, 170-177.
Wu, B. and Davison, B. D. 2005. Identifying link farm spam pages. In Special interest Tracks and Posters of the 14th international Conference on World Wide Web (Chiba, Japan, May 10 - 14, 2005). WWW '05. ACM Press, New York, NY, 820-829.
Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen. 2004. Combating web spam with TrustRank. In Proceedings of the 30th VLDB Conference, Sept. 2004.
Z. Gyöngyi and H. Garcia-Molina. 2004. Web spam taxonomy. Technical report, Stanford Digital Library Technologies Project, Mar. 2004.
B. D. Davison. Topical locality in the web. In Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval, pages 272-279, Athens, Greece, 2000. ACM Press.
N. Eiron, K. S. McCurley, and J. A. Tomlin. Ranking the web frontier. In Proceedings of the 13th international conference on World Wide Web, pages 309-318, New York, NY, USA, 2004. ACM Press.
Z. Gyöngyi and H. Garcia-Molina. Link spam alliances. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), 2005.
A. Ntoulas, M. Najork, M. Manasse, and D. Fetterly. Detecting spam web pages through content analysis. In Proceedings of the World Wide Web conference, pages 83-92, Edinburgh, Scotland, May 2006.