The World Wide Web (WWW) abounds with ever-increasing information on many topics. However, since every user has specific information needs and interests, only a tiny part of the WWW is useful to them. For example, in a family, a mother may wish to "find recipes with salmon as the main ingredient", the father may be interested in "what movie to watch tonight?", and the teenage daughter may be wondering "what is artificial intelligence?". In order for humans to quickly ‘retrieve’ relevant information of interest, they usually search the Web using a search engine such as Google.
Although it sounds simple, information retrieval is a complex field involving many sub-tasks and applications. According to "the father of information retrieval", Gerard Salton, information retrieval is the field concerned with the structure, analysis, organization, storage, searching, and retrieval of information. Applications include, but are not limited to: web search (i.e., searching the WWW), which is the most common type; vertical search, where the search is restricted to a specific topic (e.g., searching for shoes within the football topic implies someone looking for football shoes); enterprise search, which involves searching for documents in a corporate intranet; image search, which is searching for images similar to a given image; product search, which involves searching for products similar to a given product; desktop search, which is searching for relevant files on a personal computer; and mobile search, which typically takes location and time into account. Users can be searching for different kinds of items, such as webpages, emails, scholarly papers, books, news stories, or even social profiles. Furthermore, with the advent of new technologies and modalities like virtual reality, it is likely that the scope of information retrieval will only increase with time.
Regardless of the type of search and the type of the returned item, the goal of every information retrieval algorithm is to take a search query as input, and to quickly find and output a ranked list of relevant items, i.e., items that contain information that the user was looking for. For example, in our family example, the mother may submit a query of the form "find recipes with salmon" and the expected result is an ordered (ranked) list of recipes containing salmon, ordered by how relevant each recipe is to the query. Although a straightforward approach would be for a retrieval algorithm to simply compare the query text with the recipe text, this approach will not always work due to language ambiguity. For example, when someone submits a query containing the single word "jaguar" it is very difficult for any algorithm to determine whether the user is looking for documents about jaguar the animal or jaguar the vehicle brand. To be effective, an information retrieval system needs to pay special attention to the meaning of queries rather than the actual words used in them.
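The "straightforward approach" of comparing query text with document text can be illustrated with a minimal sketch. It is not how any particular search engine works: the TF-IDF-style scoring, the function names (`tokenize`, `rank`), and the five toy documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank(query, documents):
    """Return (doc_id, score) pairs for matching documents, best first.

    Every query word found in a document adds tf * idf to that document's
    score: tf is how often the word occurs in the document, and idf gives
    less weight to words that occur in many documents.
    """
    counts_by_doc = {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}
    n_docs = len(documents)
    scores = {}
    for term in set(tokenize(query)):
        df = sum(1 for counts in counts_by_doc.values() if term in counts)
        if df == 0:
            continue  # the word occurs in no document, so it cannot match
        # The +1 keeps the weight positive even for a word found in every document.
        idf = math.log((n_docs + 1) / df)
        for doc_id, counts in counts_by_doc.items():
            if term in counts:
                scores[doc_id] = scores.get(doc_id, 0.0) + counts[term] * idf
    # Highest score first; ties are broken alphabetically by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy collection, invented for this example.
docs = {
    "recipe_a": "Grilled salmon with lemon and dill",
    "recipe_b": "Pasta with tomato sauce",
    "recipe_c": "Smoked salmon salad with salmon roe and capers",
    "cars": "The Jaguar is a British luxury car",
    "animals": "The jaguar is a large cat native to the Americas",
}

salmon_ranking = rank("find recipes with salmon", docs)
jaguar_ranking = rank("jaguar", docs)
```

On this toy collection, the salmon query ranks `recipe_c` first, then `recipe_a`, then `recipe_b`. The pasta recipe appears only because it shares the word "with", which hints at how crude pure word matching is. For the query "jaguar", the car document and the animal document receive exactly the same score: the words alone give the algorithm no way to tell which meaning the user intended.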
Along with ambiguity, information retrieval faces a number of other important challenges, e.g., dealing with unstructured information, taking each user's context and expectations into account when returning results, and dealing with scalability (search engines typically index billions of items and search them almost instantly to answer each query, while answering more than a trillion queries per year). Researchers are continuing to address these challenges.
- Pigi Kouki
In the real world, the search space is so large that a search engine cannot enumerate all of it. The digital world is so dynamic and disorganized that it is difficult to find an effective solution to ambiguous queries. Retrieval is affected by the ambiguous queries that average users type into search engines. As a result, search engines return too many results, and those results can be manipulated by black hat hackers who target search engines. Problem: how can a user find an appropriate answer that is relevant to his or her query?
More information is leaking out about just how Google is planning to re-enter the Chinese market with a mobile search engine application that complies with the country's censorship laws. The Intercept first broke this story when a whistleblower provided it with documentation detailing the secret censored search project (codenamed Dragonfly). According to The Intercept, an overlooked Google acquisition from 2008 -- 265.com -- has been quietly laying down the foundation for the endeavor. Back in June of 2008, Google acquired the Chinese website 265.com, which Chinese internet entrepreneur Cai Wensheng, known as "King of the Webmasters," founded in 2003. Cai is the current chairman and founder of the company behind the popular selfie app Meitu.
If Google is planning a search engine for China, how is it planning to obey the country's strict censorship laws from day one? The Intercept has obtained documents reportedly showing that Google has been using 265.com, a hybrid information and search portal it acquired in 2008, as a "honeypot" that would help it develop a blacklist for search terms in China. Google has supposedly been collecting info about search queries, which technically redirect to Baidu, to see if they would be censored. According to the report, Google has been using a tool nicknamed BeaconTower to see if the final destinations of these searches would survive the Great Firewall. If they didn't, Google would exclude them from the first page of results in its prototype Chinese search engine.
The Great Firewall of China is the largest-scale internet censorship operation in the world. The Chinese state says the firewall is there to promote societal harmony within a growing population of more than a billion people. It considers the internet in China as part of its sovereign territory. Eight years ago, Google withdrew from China, pulling its search and other services out because of the country's limits on freedom of speech. But it is now planning to relaunch a heavily censored version of its services in China, according to a whistleblower who spoke to online news website The Intercept.
After an eight-year-long absence from the most populous country in the world, Google search is going to make a dramatic comeback in China. Google is reportedly planning to launch a censored version of its search engine in China that will blacklist certain websites and search terms to comply with the Chinese government's attempts to censor the Internet, a whistleblower revealed. According to leaked documents obtained by The Intercept, CEO Sundar Pichai met with a Chinese government official in December 2017 to re-enter the world's largest market for internet users.
Project Dragonfly -- Censored Google Search Engine
Since spring last year, Google engineers have been secretly working on a project, dubbed "Dragonfly," which currently includes two Android mobile apps named Maotai and Longfei, one of which will be launched by the end of this year after Chinese officials approve it. The censored version of the Google search engine, in the form of a mobile app, reportedly aims to "blacklist sensitive queries" and filter out all websites (news, human rights, democracy, religion) blocked by the Chinese government, including Wikipedia, BBC News, Instagram, Facebook, and Twitter.
Google refused to confirm if it's truly been developing a censored search engine for China after reports about the project's existence came out, but it might soon have no choice but to come clean. A group of six Democratic and Republican Senators led by Marco Rubio has penned a letter addressed to Google chief Sundar Pichai demanding concrete answers. They want to know once and for all whether the tech giant is conjuring up a version of its search engine that'll work behind the Great Firewall. The Senators called the move "deeply troubling" if true, pointing out that it "risks making Google complicit in human rights abuses related to China's rigorous censorship regime." While the company has yet to confirm the project (reportedly called "Dragonfly"), it's easy to see why the Senators would be concerned.
Google may be developing a special version of its search engine that would allow certain information to disappear in China. It would not be the first time a company has done so – and it would not even be the first time Google has done so. The decision has turned fresh attention to China's internet censorship regime, so wide-ranging that it has been labelled the "Great Firewall" and so powerful that it has allowed the government to ban people from using a single letter. The suggestion that Google could be partnering with the regime has been attacked by human rights campaigners and others, who suggested that Google relenting on censorship would be a "dark day", in the words of Amnesty International. It would not be the first of such days, as Google would follow a whole host of companies, including LinkedIn, that already censor their users as the price of operating in China.
Google has been criticised over reports that it is working on a version of its search engine meant to hide information requested by the Chinese government. The new tool would be a way for Google to get around the vast rules that are placed on what can be shared online in the country. Google withdrew from China in 2010, amid uproar about the fact it was being forced to censor certain information. But the new plan appears to be an attempt to try and get the company back in, by making sure that censoring is happening in an agreeable way.
The Google.cn for China website is seen on a computer screen in this photo illustration. Google is reportedly working on a censored version of its search engine to comply with China's government's demands. Google is testing a mobile version of its search engine that will adhere to the Chinese government's censorship demands, including the blocking of certain websites and search terms, according to multiple reports.
Google is developing a version of its search engine that will conform to China's censorship laws, reports say. The company shut down the engine in 2010, complaining that free speech was being limited. But online news site The Intercept says Google has been working on a project code-named Dragonfly that will block terms like human rights and religion, a move sure to anger activists. One state-owned newspaper in China, Securities Daily, dismissed the report. "We provide a number of mobile apps in China, such as Google Translate and Files Go, help Chinese developers, and have made significant investments in Chinese companies like JD.com," Google said.