AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Computerized databases have aided searches of traditional print indexes, but a search usually still means working through one database after another and then turning to other methods for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age when the information being sought comes in widely different formats, and when the universe of knowledge is too big and growing too rapidly for searching to succeed at human speed.

Google's prototype Chinese search engine links users' activity to their phone numbers, report claims

Daily Mail - Science & tech | Sep-15-2018, 15:25:44 GMT

Google's secretive plans in China are attracting renewed scrutiny from privacy advocates. The tech giant is said to be building a prototype version of a censored Chinese search engine that links users' activity to their personal phone number, according to the Intercept. In doing so, it would be able to comply with the Chinese government's censorship requirements, increasing the chances that such a product would launch there in the future. A bipartisan group of 16 US lawmakers asked Google if it would comply with China's internet censorship and surveillance policies should it re-enter the search engine market there. While China is home to the world's largest number of internet users, a 2015 report by US think tank Freedom House found that the country had the most restrictive online use policies of 65 nations it studied, ranking below Iran and Syria. But China has maintained that its various forms of web censorship are necessary for protecting its national security.

artificial intelligence, information retrieval, natural language, (13 more...)

Daily Mail - Science & tech

Country:

North America > United States (1.00)
Asia > Middle East > Syria (0.25)
Asia > Middle East > Iran (0.25)
Asia > China > Beijing > Beijing (0.06)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications (1.00)
Information Technology > Security & Privacy (0.91)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.85)

Answering Science Exam Questions Using Query Rewriting with Background Knowledge

Musa, Ryan; Wang, Xiaoyan; Fokoue, Achille; Mattei, Nicholas; Chang, Maria; Kapanipathi, Pavan; Makni, Bassem; Talamadupula, Kartik; Witbrock, Michael

arXiv.org Artificial Intelligence | Sep-15-2018

Open-domain question answering (QA) is an important problem in AI and NLP that is emerging as a bellwether for progress on the generalizability of AI methods and techniques. Much of the progress in open-domain QA systems has been realized through advances in information retrieval methods and corpus construction. In this paper, we focus on the recently introduced ARC Challenge dataset, which contains 2,590 multiple-choice questions authored for grade-school science exams. These questions are selected to be the most challenging for current QA systems, and current state-of-the-art performance is only slightly better than random chance. We present a system that rewrites a given question into queries that are used to retrieve supporting text from a large corpus of science-related text. Our rewriter is able to incorporate background knowledge from ConceptNet and -- in tandem with a generic textual entailment system trained on SciTail that identifies support in the retrieved results -- outperforms several strong baselines on the end-to-end QA task despite only being trained to identify essential terms in the original source question. We use a generalizable decision methodology over the retrieved evidence and answer candidates to select the best answer. By combining query rewriting, background knowledge, and textual entailment, our system is able to outperform several strong baselines on the ARC dataset.
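
The query-rewriting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the function name `rewrite_query`, the hand-picked essential terms, and the small `background` dictionary (standing in for a ConceptNet lookup) are all assumptions made for illustration. In the paper the essential terms are identified by a trained model.

```python
def rewrite_query(question, essential_terms, related_terms, max_expansions=2):
    """Keep only the essential terms of a question, then append related terms.

    essential_terms: set of terms judged essential (hypothetical, hand-picked here).
    related_terms:   dict mapping a term to related terms (a stand-in for
                     background knowledge such as ConceptNet neighbours).
    """
    tokens = [t.strip(".,?!").lower() for t in question.split()]
    kept = [t for t in tokens if t in essential_terms]
    expanded = list(kept)
    for term in kept:
        # Add at most max_expansions related terms per essential term.
        for extra in related_terms.get(term, [])[:max_expansions]:
            if extra not in expanded:
                expanded.append(extra)
    return " ".join(expanded)


question = "Which gas do plants absorb from the air during photosynthesis?"
essential = {"gas", "plants", "absorb", "photosynthesis"}
background = {
    "photosynthesis": ["sunlight", "chlorophyll"],
    "gas": ["carbon dioxide"],
}
query = rewrite_query(question, essential, background)
print(query)
```

The resulting string would then be sent to a retrieval engine, and the retrieved passages passed to an entailment model; neither of those stages is sketched here.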

information retrieval, machine learning, question answering, (21 more...)

arXiv.org Artificial Intelligence

1809.05726

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Visual search: The natural evolution in how we search for information

#artificialintelligence | Sep-11-2018, 07:27:10 GMT

Imagine you're on the Tube and the person in front of you is wearing a really nice pair of trainers. To find them, you could search for "black suede trainers with off-white soles" and leaf through hundreds of possible results. Or, in a world of perfectly accurate visual search, you could find and buy the exact pair instantly from a picture. Three-quarters (74%) of consumers agree that text-based keyword searches are inefficient in helping to find the right product online. This opportunity gap will be explored at Dmexco this week in a number of sessions dedicated to smarter search, and it emphasises that brands need to prepare themselves for visual search.

artificial intelligence, information retrieval, natural language, (15 more...)

#artificialintelligence

Industry: Retail (0.31)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)

Exploiting local and global performance of candidate systems for aggregation of summarization techniques

Mehta, Parth; Majumder, Prasenjit

arXiv.org Artificial Intelligence | Sep-7-2018

With an ever-growing number of extractive summarization techniques being proposed, there is less clarity than ever about how good each system is compared to the rest. Several studies highlight the variance in performance of these systems with change in datasets or even across documents within the same corpus. An effective way to counter this variance and to make the systems more robust could be to use inputs from multiple systems when generating a summary. In the present work, we define a novel way of creating such an ensemble by exploiting similarity between the content of candidate summaries to estimate their reliability. We define GlobalRank, which captures the performance of a candidate system on an overall corpus, and LocalRank, which estimates its performance on a given document cluster. We then use these two scores to assign a weight to each individual system, which is then used to generate the new aggregate ranking. Experiments on the DUC 2003 and DUC 2004 datasets show a significant improvement in terms of ROUGE score over existing state-of-the-art techniques.
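
A weighted aggregation of this kind might look like the sketch below. The abstract does not say how GlobalRank and LocalRank are computed or combined, so everything specific here is an assumption: the scores are supplied by hand, they are mixed by a linear interpolation with a hypothetical parameter `alpha`, and the sentence scores of each system are summed under the resulting weights.

```python
def aggregate_rankings(system_scores, global_rank, local_rank, alpha=0.5):
    """Combine per-system sentence scores into one ranking.

    system_scores: {system: {sentence_id: score}} from each candidate system.
    global_rank:   {system: corpus-level reliability} (assumed given).
    local_rank:    {system: cluster-level reliability} (assumed given).
    alpha:         hypothetical mixing weight between the two reliabilities.
    """
    weights = {
        s: alpha * global_rank[s] + (1 - alpha) * local_rank[s]
        for s in system_scores
    }
    combined = {}
    for system, sentence_scores in system_scores.items():
        for sentence, score in sentence_scores.items():
            combined[sentence] = combined.get(sentence, 0.0) + weights[system] * score
    # Highest combined score first; ties broken by sentence id for determinism.
    return sorted(combined, key=lambda s: (-combined[s], s))


scores = {
    "A": {"s1": 1.0, "s2": 0.5, "s3": 0.0},
    "B": {"s1": 0.0, "s2": 0.5, "s3": 1.0},
}
ranking = aggregate_rankings(
    scores,
    global_rank={"A": 0.8, "B": 0.4},
    local_rank={"A": 0.6, "B": 0.2},
)
print(ranking)
```

Because system A is weighted more heavily in this toy input, its preferred sentence leads the aggregate ranking.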

candidate system, information retrieval, machine learning, (21 more...)

arXiv.org Artificial Intelligence

1809.02343

Country:

Asia > India (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Perturb and Combine to Identify Influential Spreaders in Real-World Networks

Tixier, Antoine J.-P.; Rossi, Maria-Evgenia G.; Malliaros, Fragkiskos D.; Read, Jesse; Vazirgiannis, Michalis

arXiv.org Machine Learning | Sep-4-2018

Recent research has shown that graph degeneracy algorithms, which decompose a network into a hierarchy of nested subgraphs of decreasing size and increasing density, are very effective at detecting the good spreaders in a network. However, it is also known that degeneracy-based decompositions of a graph are unstable to small perturbations of the network structure. In Machine Learning, the performance of unstable classification and regression methods, such as fully-grown decision trees, can be greatly improved by using Perturb and Combine (P&C) strategies such as bagging (bootstrap aggregating). Therefore, we propose a P&C procedure for networks that (1) creates many perturbed versions of a given graph, (2) applies a node scoring function separately to each graph (such as a degeneracy-based one), and (3) combines the results. We conduct real-world experiments on the tasks of identifying influential spreaders in large social networks, and influential words (keywords) in small word co-occurrence networks. We use the k-core, generalized k-core, and PageRank algorithms as our vertex scoring functions. In each case, using the aggregated scores brings significant improvements compared to using the scores computed on the original graphs. Finally, a bias-variance analysis suggests that our P&C procedure works mainly by reducing bias, and that therefore, it should be capable of improving the performance of all vertex scoring functions, not only unstable ones.
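
The three steps listed in this abstract (perturb, score, combine) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the perturbation here only deletes edges at random with a hypothetical probability `drop_prob`, the scoring function is the plain k-core number, and the scores are combined by a simple mean.

```python
import random


def core_numbers(edges, nodes):
    """Return the k-core number of every node by peeling minimum-degree nodes."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    degree = {n: len(adj[n]) for n in nodes}
    remaining = set(nodes)
    core = {}
    k = 0
    while remaining:
        n = min(remaining, key=lambda x: (degree[x], x))
        k = max(k, degree[n])  # core numbers never decrease along the peeling order
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


def perturb(edges, drop_prob, rng):
    """Step 1: create a perturbed graph by dropping each edge independently."""
    return [e for e in edges if rng.random() >= drop_prob]


def perturb_and_combine(edges, nodes, n_rounds=50, drop_prob=0.1, seed=0):
    """Steps 2 and 3: score every perturbed graph, then average the scores."""
    rng = random.Random(seed)
    totals = {n: 0.0 for n in nodes}
    for _ in range(n_rounds):
        scores = core_numbers(perturb(edges, drop_prob, rng), nodes)
        for n in nodes:
            totals[n] += scores[n]
    return {n: totals[n] / n_rounds for n in nodes}


# Toy graph: a 4-clique on nodes 0..3 plus a pendant node 4 attached to node 0.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
original = core_numbers(edges, nodes)
averaged = perturb_and_combine(edges, nodes)
print(original, averaged)
```

The averaged scores are real-valued, so they can separate nodes that share the same integer core number in the original graph.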

information retrieval, machine learning, node, (22 more...)

arXiv.org Machine Learning

1807.09586

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology (0.49)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)

DeFactoNLP: Fact Verification using Entity Recognition, TFIDF Vector Comparison and Decomposable Attention

Reddy, Aniketh Janardhan; Rocha, Gil; Esteves, Diego

arXiv.org Artificial Intelligence | Sep-3-2018

In this paper, we describe DeFactoNLP, the system we designed for the FEVER 2018 Shared Task. The aim of this task was to conceive a system that can not only automatically assess the veracity of a claim but also retrieve evidence supporting this assessment from Wikipedia. In our approach, the Wikipedia documents whose Term Frequency-Inverse Document Frequency (TFIDF) vectors are most similar to the vector of the claim and those documents whose names are similar to those of the named entities (NEs) mentioned in the claim are identified as the documents which might contain evidence. The sentences in these documents are then supplied to a textual entailment recognition module. This module calculates the probability of each sentence supporting the claim, contradicting the claim or not providing any relevant information to assess the veracity of the claim. Various features computed using these probabilities are finally used by a Random Forest classifier to determine the overall truthfulness of the claim. The sentences which support this classification are returned as evidence. Our approach achieved a 0.4277 evidence F1-score, a 0.5136 label accuracy and a 0.3833 FEVER score.
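
The document-selection step described in this abstract, ranking documents by the similarity of their TFIDF vectors to the claim's vector, can be sketched as follows. Only that one step is covered; the named-entity matching, entailment module, and Random Forest classifier are not. The whitespace tokenizer, the smoothed IDF formula, and the toy documents are assumptions for illustration, not details taken from DeFactoNLP.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) for a list of texts."""
    tokenized = [t.lower().split() for t in texts]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF; this particular formula is an assumed, common choice.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def rank_documents(claim, documents):
    """Return document indices ordered from most to least similar to the claim."""
    vectors = tfidf_vectors([claim] + documents)
    claim_vec, doc_vecs = vectors[0], vectors[1:]
    scores = [cosine(claim_vec, d) for d in doc_vecs]
    return sorted(range(len(documents)), key=lambda i: -scores[i])


claim = "paris is the capital of france"
documents = [
    "paris is the capital and largest city of france",
    "berlin is the capital of germany",
    "bananas are rich in potassium",
]
ranking = rank_documents(claim, documents)
print(ranking)
```

In a full pipeline, the sentences of the top-ranked documents would be the candidates handed to the entailment stage.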

information retrieval, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

1809.00509

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Portugal > Porto > Porto (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > India (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)

What Should An AI-Driven Search Engine Be Able To Do?

#artificialintelligence | Aug-31-2018, 14:24:16 GMT

Search has always been a key enterprise technology going back to the days of the first enterprise content management systems. This is hardly surprising given how important finding the right data is for any of the applications used by enterprises in their business processes. Since the rise of big data and the use of big data sets, search has become even more important. If enterprise data is the real wealth of a business, then search is the tool that uncovers that wealth. But what do you do with the increasingly large amounts of data that enterprises now have access to?

artificial intelligence, information retrieval, natural language, (17 more...)

#artificialintelligence

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.05)
North America > United States > Minnesota (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
(5 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)

Leading Rights Groups Call on Google Not to Censor Its Search Engine in China

TIME - Tech | Aug-29-2018, 05:27:14 GMT

More than a dozen human rights groups have sent a letter to Google urging the company not to offer censored internet search in China, amid reports it is planning to again begin offering the service in the giant Asian market. The joint letter dated Tuesday calls on CEO Sundar Pichai to explain what Google is doing to safeguard users from the Chinese government's censorship and surveillance. It describes the censored search engine app, codenamed "Dragonfly", as representing "an alarming capitulation by Google on human rights." "The Chinese government extensively violates the rights to freedom of expression and privacy; by accommodating the Chinese authorities' repression of dissent, Google would be actively participating in those violations for millions of internet users in China," said the letter. That follows a letter earlier this month signed by more than a thousand Google employees protesting the company's secretive plan to build a search engine that would comply with Chinese ...

artificial intelligence, information retrieval, natural language, (9 more...)

TIME - Tech

Country:

Asia > China > Beijing > Beijing (0.08)
Europe > Russia (0.06)
Asia > Russia (0.06)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > Asia Government > China Government (0.79)

Technology:

Information Technology > Information Management > Search (0.86)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)

Here's what we know about Google's mysterious search engine

Washington Post - Technology News | Aug-29-2018, 00:04:44 GMT

President Trump thinks Google's search engine is "rigged." By featuring more mainstream news outlets and relatively fewer conservative sites in the results he sees, Trump tweeted Tuesday, Google is "suppressing" right-wing views on its platform. Trump escalated his attacks Tuesday afternoon in remarks from the Oval Office, warning that "Google and Twitter and Facebook, they are treading on very, very troubled territory and they have to be careful." It's easy to see how Trump arrived at this conclusion, because in many ways his experience mirrors that of millions of Americans who've awoken to the dominance of Google -- and Facebook, and Twitter -- in their everyday lives without being quite certain how it wound up there. We rely constantly on Google to find out what to buy, which restaurants to eat at and how to get from one place to another.

artificial intelligence, information retrieval, natural language, (20 more...)

Washington Post - Technology News

Country: Asia > China (0.05)

Industry: Media > News (0.90)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)

Advocacy groups criticize Google's 'alarming capitulation' over censored China search engine

Daily Mail - Science & tech | Aug-28-2018, 21:16:10 GMT

More than a dozen human rights groups and other advocacy organizations urged Google to abandon any plans to build a censored version of its search engine in China. The project, said to be referred to internally as Dragonfly, 'would represent an alarming capitulation by Google on human rights,' argued a letter signed by 14 groups including Amnesty International, Human Rights Watch and Reporters Without Borders. The letter is addressed to Google CEO Sundar Pichai and comes after weeks of internal revolt at the company, wherein employees have expressed outrage over the firm's rumored plans to launch a censored search engine in China. While China is home to the world's largest number of internet users, a 2015 report by US think tank Freedom House found that the country had the most restrictive online use policies of 65 nations it studied, ranking below Iran and Syria. But China has maintained that its various forms of web censorship are necessary for protecting its national security.

artificial intelligence, information retrieval, natural language, (16 more...)

Daily Mail - Science & tech

Country:

Asia > China (1.00)
Asia > Middle East > Syria (0.25)
Asia > Middle East > Iran (0.25)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)
