Big data: are we making a big mistake?


Five years ago, a team of researchers from Google announced a remarkable achievement in one of the world's top scientific journals, Nature. Without needing the results of a single medical check-up, they were nevertheless able to track the spread of influenza across the US. What's more, they could do it more quickly than the Centers for Disease Control and Prevention (CDC). Google's tracking had only a day's delay, compared with the week or more it took for the CDC to assemble a picture based on reports from doctors' surgeries. Google was faster because it was tracking the outbreak by finding a correlation between what people searched for online and whether they had flu symptoms. Not only was "Google Flu Trends" quick, accurate and cheap, it was theory-free. Google's engineers didn't bother to develop a hypothesis about what search terms – "flu symptoms" or "pharmacies near me" – might be correlated with the spread of the disease itself.
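The "theory-free" idea described above can be sketched in a few lines: score every candidate search term by how closely its weekly volume tracks reported flu cases, then keep the best-scoring terms. This is an illustration only, not Google's actual pipeline. The weekly figures, the extra term "holiday recipes" and the helper names `pearson` and `rank_terms` are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_terms(search_volume, flu_cases):
    """Rank search terms by correlation with flu cases, highest first.

    No hypothesis is made about why a term might track the disease;
    the ranking relies on the correlation alone.
    """
    scores = {term: pearson(series, flu_cases)
              for term, series in search_volume.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up weekly figures, for illustration only.
flu_cases = [10, 20, 40, 80, 40, 20]
search_volume = {
    "flu symptoms": [12, 22, 38, 85, 41, 18],
    "pharmacies near me": [30, 33, 36, 44, 35, 31],
    "holiday recipes": [50, 40, 30, 20, 30, 45],
}

ranked = rank_terms(search_volume, flu_cases)
for term, r in ranked:
    print(f"{term}: r = {r:.2f}")
```

A term that merely happens to rise and fall in the same season as flu would score well here too, which is exactly the weakness a theory-free approach carries.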

Fertility and Its Meaning: Evidence from Search Behavior

AAAI Conferences

Fertility choices are linked to the different preferences and constraints of individuals and couples, and vary importantly by socio-economic status, as well as by cultural and institutional context. The meaning of childbearing and childrearing, therefore, differs between individuals and across groups. In this paper, we combine data from Google Correlate and Google Trends for the U.S. with ground truth data from the American Community Survey to derive new insights into fertility and its meaning. First, we show that Google Correlate can be used to illustrate socio-economic differences in the circumstances around pregnancy and birth: for example, searches for “flying while pregnant” are linked to high income fertility, and searches for “paternity test” are linked to non-marital fertility. Second, we combine several search queries to build predictive models of regional variation in fertility, explaining about 75% of the variance. Third, we explore if aggregated web search data can also be used to model fertility trends.
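To make "explaining about 75% of the variance" concrete, the sketch below fits an ordinary least-squares model that predicts a regional fertility rate from several search-query indices and reports the in-sample R², the share of variance the model accounts for. The excerpt does not say which model the authors used, so ordinary least squares is an assumption here. The regional figures and the helper names `solve`, `fit_ols` and `r_squared` are likewise invented for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, y, coef):
    """Share of the variance in y explained by the fitted model (in-sample)."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up data: one row per region, one column per search-query index.
query_index = [(1.0, 0.5), (2.0, 0.3), (3.0, 0.9),
               (4.0, 0.4), (5.0, 0.8), (6.0, 0.2)]
fertility_rate = [55.0, 58.0, 66.0, 63.0, 70.0, 68.0]

coef = fit_ols(query_index, fertility_rate)
r2 = r_squared(query_index, fertility_rate, coef)
print(f"R^2 = {r2:.2f}")
```

An R² of 0.75 would mean that three quarters of the region-to-region variation in fertility is accounted for by the search indices. A figure computed on the same data used for fitting, as here, tends to overstate how well the model would predict unseen regions.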

Google's CEO Says Tests of Censored Chinese Search Engine Turned Out Great


Google's internal tests developing a censored search engine in China have been very promising, CEO Sundar Pichai said on stage on Monday as part of the WIRED 25 Summit. "It turns out we'll be able to serve well over 99 percent of the queries" that users request, he said. What's more, "There are many, many areas where we would provide information better than what's available," such as searching for cancer treatments, Pichai said. "Today people either get fake cancer treatments or they actually get useful information." While onstage at the event, Pichai did not back away from Google's controversial decision to build a censored search engine in China.

How a student's death highlighted our reliance on companies for health advice

The Guardian

China's equivalent of Google is under fire. Search engine Baidu has been criticised following the death of 21-year-old student Wei Zexi, who used the search engine to research esoteric treatments for his cancer. After Wei Zexi's death, the state-run People's Daily attacked Baidu, claiming it was ranking search results in exchange for money. "There have been hospitals making profits at the cost of killing patients who were directed by false advertisements paid at a higher rank in search results," the article claimed, adding, "profit considerations shall not be placed over social responsibility". The Chinese party newspaper may have its own reasons for wanting to control Baidu; a powerful search engine is a gateway to the outside world and a challenge to any repressive state.

Man and Machine


Engineers at Pinterest constantly create new artificial-intelligence algorithms to help its users find what they're looking for among billions of pictures of food, products, houses, and other items. Matching search queries with relevant images is crucial to keeping users coming back. But until last year, it could take days to test the effectiveness of each new algorithm. To fine-tune its machine learning and provide better search results faster, Pinterest turned to an unexpected source: human intelligence. It hired crowdsourcing companies such as CrowdFlower to marshal people to quickly do "micro-tasks" such as labeling photos and assessing the quality of search results.