Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze (ISBN 9780521865715)

This is the first book that gives you a complete picture of the complications that arise in building a modern web-scale search engine. You'll discover the seedy underworld of spam, cloaking, and doorway pages. You'll see how MapReduce and other approaches to parallelism allow us to go beyond megabytes and to efficiently manage petabytes.