AITopics | Information Retrieval


Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where the information being sought is held in widely different formats, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.


Efficient and Accurate Top-$K$ Recovery from Choice Data

Nguyen, Duc

arXiv.org Artificial Intelligence | Jun-23-2022

The intersection of learning to rank and choice modeling is an active area of research with applications in e-commerce, information retrieval and the social sciences. In some applications, such as recommendation systems, the statistician is primarily interested in recovering the set of the top-ranked items from a large pool of items as efficiently as possible using passively collected discrete choice data, i.e., the user picks one item from a set of multiple items. Motivated by this practical consideration, we propose the choice-based Borda count algorithm as a fast and accurate ranking algorithm for top-$K$ recovery, i.e., correctly identifying all of the top $K$ items. We show that the choice-based Borda count algorithm has optimal sample complexity for top-$K$ recovery under a broad class of random utility models. We prove that in the limit, the choice-based Borda count algorithm produces the same top-$K$ estimate as the commonly used Maximum Likelihood Estimate method, but the former's speed and simplicity bring considerable advantages in practice. Experiments on both synthetic and real datasets show that the counting algorithm is competitive with commonly used ranking algorithms in terms of accuracy while being several orders of magnitude faster.
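The counting idea described in this abstract can be illustrated with a minimal sketch. It assumes one plausible reading of a choice-based Borda count: each item is scored by the fraction of menus containing it in which it was the chosen item. That scoring rule, the function name `choice_borda_top_k`, and the toy data are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict

def choice_borda_top_k(choices, k):
    """Rank items by empirical win rate (assumed scoring rule, not the paper's definition).

    choices: iterable of (menu, chosen) pairs, where menu is a tuple of items
    and chosen is the single item the user picked from that menu.
    Returns (top_k_items, scores).
    """
    wins = defaultdict(int)    # number of times each item was chosen
    shown = defaultdict(int)   # number of menus each item appeared in
    for menu, chosen in choices:
        if chosen not in menu:
            raise ValueError("chosen item must belong to its menu")
        for item in menu:
            shown[item] += 1
        wins[chosen] += 1
    scores = {item: wins[item] / shown[item] for item in shown}
    # Sort by score, highest first; break ties by name so the result is deterministic.
    ranked = sorted(scores, key=lambda item: (-scores[item], str(item)))
    return ranked[:k], scores

# Hypothetical choice data: five menus of three items each.
data = [
    (("a", "b", "c"), "a"),
    (("a", "b", "d"), "a"),
    (("b", "c", "d"), "b"),
    (("a", "c", "d"), "c"),
    (("b", "c", "d"), "b"),
]
top2, scores = choice_borda_top_k(data, 2)
print(top2, scores)
```

The sketch makes a single pass over the data and one sort, which is consistent with the speed advantage the abstract claims over likelihood-based fitting, although no timing is measured here.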

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2206.11995

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania (0.04)

Genre: Research Report (1.00)

Industry: Government > Voting & Elections (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)


HUD settlement changes the way Meta delivers housing ads – Search Engine Land

#artificialintelligence | Jun-22-2022, 02:32:02 GMT

Meta and HUD collaboration. The announcement comes after a year-long collaboration between Meta and HUD to develop processes for machine learning …

collaboration, hud settlement change, search engine land

#artificialintelligence

Industry: Media > News (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Information Management > Search (0.40)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)


Accelerating Business Growth with Natural Language Processing

#artificialintelligence | Jun-19-2022, 16:20:35 GMT

Today, NLP is broadly adopted by businesses across industries in several forms. In fact, according to recent research, the global NLP market size is expected to reach $35.1 billion by 2026. The technology's ubiquity can be attributed to the abundance of text and voice data as well as the shift from human-computer interaction to human-computer conversation. In my upcoming talk at the Open Data Science Conference (ODSC) East, I am excited to share my thoughts on how NLP is already aiding businesses, trends to keep an eye out for in the near future, and things to keep in mind when adopting NLP solutions. Outlined below is what you can expect me to discuss in detail during the presentation.

accelerating business growth, ai system, nlp, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.31)


Implementing Hearst Patterns with SpaCy

#artificialintelligence | Jun-19-2022, 07:21:31 GMT

In this article, I will concentrate mostly on Hearst patterns: their implementation and their use for hypernym extraction. However, I will use Named Entity Recognition (NER) and a dataset of patents, so I recommend checking my previous post in this series. Why do we care about patterns in the context of NLP? Because they significantly reduce and simplify the work; a pattern is, in essence, a simple model. Even in the era of Transformer neural networks, patterns can still be beneficial.
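To make the idea of a Hearst pattern concrete, here is a minimal sketch of the classic "X such as Y, Z and W" pattern. It is a deliberate simplification: it uses only the standard-library `re` module rather than SpaCy, matches single-word terms instead of noun phrases, and covers one pattern only. The names `SUCH_AS` and `extract_hypernyms` are hypothetical and do not come from the article.

```python
import re

# One Hearst pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Single words only; a fuller implementation would match whole noun phrases.
SUCH_AS = re.compile(
    r"(?P<hyper>\w+) such as "
    r"(?P<hypos>\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)

def extract_hypernyms(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        # Split the hyponym list on commas and on the final "and"/"or".
        hyponyms = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((hypo, match.group("hyper")) for hypo in hyponyms if hypo)
    return pairs

pairs = extract_hypernyms("We tested metals such as copper, zinc and iron.")
print(pairs)
```

Matching on raw words keeps the sketch dependency-free; in practice the pattern would more likely be applied to noun phrases produced by a parser, so that multi-word terms are captured whole.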

hearst pattern, hypernym relation, relation, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.56)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)


Let's Discuss About Microsites & Dips In Visitors: Ask An search engine optimisation - Channel969

#artificialintelligence | Jun-18-2022, 15:27:45 GMT

Today's Ask an SEO question comes from Kate in Louisville, who wrote: "I work for a company that builds microsites for clients. What factors do I need to pay attention to when there's a dip in organic traffic? In Q4 2021, for example, we did a rebrand and meta data was altered. Would this have a big impact on traffic going forward?" They still look at URLs, links, titles, content, and hundreds of other ranking factors, so the same SEO best practices for diagnosing a rankings drop will apply to microsites, too.

microsite, microsite & dip, search engine optimisation, (12 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)


Implementing Hearst Patterns with SpaCy

#artificialintelligence | Jun-18-2022, 06:36:14 GMT

hearst pattern, hypernym relation, relation, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.56)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)


Research Topic Flows in Co-Authorship Networks

Schäfermeier, Bastian, Hirth, Johannes, Hanika, Tom

arXiv.org Artificial Intelligence | Jun-16-2022

In scientometrics, scientific collaboration is often analyzed by means of co-authorships. An aspect that is often overlooked and more difficult to quantify is the flow of expertise between authors from different research topics, which is an important part of scientific progress. With the Topic Flow Network (TFN), we propose a graph structure for the analysis of research topic flows between scientific authors and their respective research fields. Based on a multi-graph and a topic model, our proposed network structure accounts for intratopic as well as intertopic flows. To construct a TFN, our method requires only a corpus of publications (i.e., author and abstract information). From this corpus, research topics are discovered automatically through non-negative matrix factorization. The TFN derived from them allows for the application of social network analysis techniques, such as common metrics and community detection. Most importantly, it allows for the analysis of intertopic flows on a large, macroscopic scale, i.e., between research topics, as well as on a microscopic scale, i.e., between certain sets of authors. We demonstrate the utility of TFNs by applying our method to two comprehensive corpora totaling 20 million publications and spanning more than 60 years of research in the fields of computer science and mathematics. Our results give evidence that TFNs are suitable, e.g., for the analysis of topical communities, the discovery of important authors in different fields, and, most notably, the analysis of intertopic flows, i.e., the transfer of topical expertise. Beyond that, our method opens new directions for future research, such as the investigation of influence relationships between research fields.
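The topic-discovery step mentioned in this abstract, non-negative matrix factorization (NMF), can be sketched in a few lines. The sketch below factorizes a small document-term matrix V into document-topic weights W and topic-term weights H using the standard multiplicative update rules. The toy matrix, the choice of two topics, the iteration count, and all function names are assumptions for illustration; the construction of the TFN itself is not shown and the paper's actual pipeline may differ.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def squared_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (n x m) into W (n x k) and H (k x m), all entries non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h

# Hypothetical corpus: 4 abstracts (rows) by 4 term counts (columns), two clear topics.
v = [[2, 1, 0, 0], [4, 2, 0, 0], [0, 0, 1, 3], [0, 0, 2, 6]]
w0, h0 = nmf(v, 2, iters=0)      # random initialization only
w, h = nmf(v, 2, iters=200)      # same initialization, then 200 updates
err_before = squared_error(v, w0, h0)
err_after = squared_error(v, w, h)
print(err_before, err_after)
# Each document's dominant topic is the column of W with the largest weight.
print([max(range(2), key=lambda a: row[a]) for row in w])
```

In a setting like the one described, each row of W would give an abstract's topic weights, which could then be aggregated per author before any flows between topics are computed.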

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11192-022-04529-w

2206.0798

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > District of Columbia > Washington (0.04)
(7 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.49)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Does Twitter know your political views? POLiTweets dataset and semi-automatic method for political leaning discovery

Baran, Joanna, Kajstura, Michał, Ziółkowski, Maciej, Rajda, Krzysztof

arXiv.org Artificial Intelligence | Jun-14-2022

Every day, the world is flooded by millions of messages and statements posted on Twitter or Facebook. Social media platforms try to protect users' personal data, but there is still a real risk of misuse, including election manipulation. Did you know that only 13 posts addressing topics that are important or controversial for society are enough to predict one's political affiliation with a 0.85 F1-score? To examine this phenomenon, we created a novel universal method of semi-automated political leaning discovery. It relies on a heuristic data annotation procedure, which was evaluated to achieve 0.95 agreement with human annotators (counted as an accuracy metric). We also present POLiTweets - the first publicly open Polish dataset for political affiliation discovery in a multi-party setup, consisting of over 147k tweets from almost 10k Polish-writing users annotated heuristically and almost 40k tweets from 166 users annotated manually as a test set. We used our data to study the aspects of domain shift in the context of topics and the type of content writers - ordinary citizens vs. professional politicians.
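The two metrics quoted in this abstract, agreement counted as accuracy and the F1-score, can be computed as in the sketch below. The abstract does not say how F1 is averaged across parties, so macro-averaging is an assumption here, and the party labels and predictions are invented purely for illustration.

```python
def accuracy(gold, pred):
    """Share of items where the two label sequences agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores (averaging scheme is an assumption)."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1_scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical labels for six users in a three-party setup.
gold = ["A", "A", "B", "B", "C", "C"]
pred = ["A", "A", "B", "C", "C", "C"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
print(acc, f1)
```

Accuracy treats every user equally, while macro-averaged F1 gives each party the same weight regardless of size, which matters when some parties have few users.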

information retrieval, natural language, tweet, (16 more...)

arXiv.org Artificial Intelligence

2207.07586

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Poland > Lubusz Province > Zielona Góra (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Services (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.36)


Writing for Search Engines: Optimize for Robots or People?

#artificialintelligence | Jun-13-2022, 10:11:54 GMT

Google processes more than 8.5 billion searches every day. That's roughly 100,000 searches per second, thousands of which could lead a user to a purchase. It's no wonder, then, that 60% of marketers list SEO as their number one inbound marketing priority. But generating organic traffic comes with challenges. Google has hundreds of billions of webpages in its index, competing for the top spots on search result pages.
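A quick arithmetic check of the per-second figure, taking the daily total as exactly 8.5 billion (an assumption, since the excerpt says "more than"): the rate comes out just under 100,000 per second, and the daily total would need to reach 8.64 billion for the rate to hit 100,000.

```python
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400 seconds

searches_per_day = 8.5e9                 # assumed exact daily total
per_second = searches_per_day / SECONDS_PER_DAY

# Daily total at which the rate reaches 100,000 searches per second.
breakeven_per_day = 100_000 * SECONDS_PER_DAY

print(round(per_second), breakeven_per_day)
```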

audience, bot, google, (14 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.45)


Virtual Openhouse

#artificialintelligence | Jun-12-2022, 18:14:25 GMT

Research¹ shows that including videos in web pages can effectively improve user experience, improve Search Engine Optimization (SEO), and catch readers further down the sales funnel. To help agents with their business through Compass' website, the Compass AI Content Intelligence (AI-CI) team wants to make it easy for them to generate and share videos. We leverage state-of-the-art AI technologies to create the visual and textual content for the videos, and we combine close-to-the-metal rendering algorithms with a cloud-based distributed computation system to render the videos efficiently. With our current automatic video generation feature, agents can create a video with a single click, or customize it with just a few more clicks. They can then quickly review videos that have been created for them.

convolutional layer, room type, video, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)
