AITopics | Information Retrieval


Information Retrieval

Our accustomed systems of retrieving particular bits of information no longer meet the needs of many people. Computerized databases have made it easier to search traditional indexes of print publications, but the process still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where the information being sought is held in widely different formats, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.


Important digital skill: what are the basics of SEO (search engine optimization)?

#artificialintelligence | Jul-21-2022, 03:45:24 GMT

With the growth of the internet, the world is becoming digital. Millions of websites are being created every day and digital content is being uploaded to them. Every site's purpose is to reach its potential audience as early as possible. Search engine optimization (SEO) is the technique used for this purpose: it is the method by which a website attracts organic traffic.

engine optimization, search engine, search result, (13 more...)

#artificialintelligence

Industry: Marketing (0.30)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.84)


Consistent Polyhedral Surrogates for Top-$k$ Classification and Variants

Finocchiaro, Jessie, Frongillo, Rafael, Goodwill, Emma, Thilagar, Anish

arXiv.org Artificial Intelligence | Jul-18-2022

Top-$k$ classification is a generalization of multiclass classification used widely in information retrieval, image classification, and other extreme classification settings. Several hinge-like (piecewise-linear) surrogates have been proposed for the problem, yet all are either non-convex or inconsistent. For the proposed hinge-like surrogates that are convex (i.e., polyhedral), we apply the recent embedding framework of Finocchiaro et al. (2019; 2022) to determine the prediction problem for which the surrogate is consistent. These problems can all be interpreted as variants of top-$k$ classification, which may be better aligned with some applications. We leverage this analysis to derive constraints on the conditional label distributions under which these proposed surrogates become consistent for top-$k$. It has been further suggested that every convex hinge-like surrogate must be inconsistent for top-$k$. Yet, we use the same embedding framework to give the first consistent polyhedral surrogate for this problem.
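As a minimal sketch of the target problem only (not of the surrogates studied in the paper), the top-$k$ 0-1 loss can be written as follows. The function names and the tie-breaking rule are assumptions made for illustration.

```python
from typing import Sequence, Set


def top_k_prediction(scores: Sequence[float], k: int) -> Set[int]:
    """Return the indices of the k highest scores.

    Ties are broken in favour of the lower index (an illustrative choice).
    """
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(ranked[:k])


def top_k_loss(scores: Sequence[float], label: int, k: int) -> int:
    """Top-k 0-1 loss: 0 if the true label is among the k predictions, else 1."""
    return 0 if label in top_k_prediction(scores, k) else 1


if __name__ == "__main__":
    scores = [0.1, 0.5, 0.2, 0.9]
    print(top_k_prediction(scores, 2))     # the two best-scoring classes
    print(top_k_loss(scores, label=2, k=2))
```

With k = 1 this reduces to the ordinary multiclass 0-1 loss. The loss is discrete, which is why convex surrogates such as those in the abstract are studied in its place.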

information retrieval, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2207.08873

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Maryland > Baltimore (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)


MIA 2022 Shared Task Submission: Leveraging Entity Representations, Dense-Sparse Hybrids, and Fusion-in-Decoder for Cross-Lingual Question Answering

Tu, Zhucheng, Padmanabhan, Sarguna Janani

arXiv.org Artificial Intelligence | Jul-18-2022

We describe our two-stage system for the Multilingual Information Access (MIA) 2022 Shared Task on Cross-Lingual Open-Retrieval Question Answering. The first stage consists of multilingual passage retrieval with a hybrid dense and sparse retrieval strategy. The second stage consists of a reader which outputs the answer from the top passages returned by the first stage. We show the efficacy of using a multilingual language model with entity representations in pretraining, sparse retrieval signals to help dense retrieval, and Fusion-in-Decoder. On the development set, we obtain 43.46 F1 on XOR-TyDi QA and 21.99 F1 on MKQA, for an average F1 score of 32.73. On the test set, we obtain 40.93 F1 on XOR-TyDi QA and 22.29 F1 on MKQA, for an average F1 score of 31.61. We improve over the official baseline by over 4 F1 points on both the development and test sets.
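The abstract does not say how the dense and sparse signals are combined. One common way to build such a hybrid is to normalize each retriever's scores and interpolate them linearly, which is sketched below. The passage IDs, the `alpha` weight, and the min-max normalization are all assumptions for illustration, not the authors' method.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score list maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(dense: Dict[str, float],
                sparse: Dict[str, float],
                alpha: float = 0.5) -> List[str]:
    """Rank passages by alpha * dense + (1 - alpha) * sparse (normalized).

    A passage missing from one retriever's list contributes 0 for that signal.
    Ties are broken by passage ID so the ordering is deterministic.
    """
    d, s = min_max(dense), min_max(sparse)
    docs = set(d) | set(s)
    fused = {doc: alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0)
             for doc in docs}
    return sorted(docs, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    dense_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.1}     # hypothetical values
    sparse_scores = {"p2": 12.0, "p3": 3.0, "p4": 9.0}   # hypothetical values
    print(hybrid_rank(dense_scores, sparse_scores))
```

In a two-stage system like the one described, the top passages of such a fused ranking would be handed to the reader.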

dense retrieval, fusion-in-decoder, retrieval, (15 more...)

arXiv.org Artificial Intelligence

2207.0194

Country:

North America > United States > Utah (0.04)
Europe (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Leisure & Entertainment > Sports > Soccer (0.94)
Government (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.56)


Heuristic-free Optimization of Force-Controlled Robot Search Strategies in Stochastic Environments

Alt, Benjamin, Katic, Darko, Jäkel, Rainer, Beetz, Michael

arXiv.org Artificial Intelligence | Jul-15-2022

In both industrial and service domains, a central benefit of the use of robots is their ability to quickly and reliably execute repetitive tasks. However, even relatively simple peg-in-hole tasks are typically subject to stochastic variations, requiring search motions to find relevant features such as holes. While search improves robustness, it comes at the cost of increased runtime: More exhaustive search will maximize the probability of successfully executing a given task, but will significantly delay any downstream tasks. This trade-off is typically resolved by human experts according to simple heuristics, which are rarely optimal. This paper introduces an automatic, data-driven and heuristic-free approach to optimize robot search strategies. By training a neural model of the search strategy on a large set of simulated stochastic environments, conditioning it on few real-world examples and inverting the model, we can infer search strategies which adapt to the time-variant characteristics of the underlying probability distributions, while requiring very few real-world measurements. We evaluate our approach on two different industrial robots in the context of spiral and probe search for THT electronics assembly.
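The trade-off between robustness and runtime can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the paper's neural approach: the feature sits at exactly one of several candidate locations with known probabilities, the robot visits them in a fixed order, and every visit takes the same time. The function name and parameters are hypothetical.

```python
from typing import Sequence, Tuple


def search_tradeoff(hit_probs: Sequence[float],
                    n: int,
                    step_time: float = 1.0) -> Tuple[float, float]:
    """Return (success probability, expected search time) for visiting the
    first n candidate locations.

    hit_probs[i] is the assumed probability that the feature is at location i.
    The search stops as soon as the feature is found, or after n visits.
    """
    visited = hit_probs[:n]
    p_success = sum(visited)
    # Found at location i: the robot has spent (i + 1) visits.
    time_if_found = sum(p * (i + 1) * step_time for i, p in enumerate(visited))
    # Not found: the robot has spent all visits it was allowed.
    time_if_missed = (1.0 - p_success) * len(visited) * step_time
    return p_success, time_if_found + time_if_missed


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.1, 0.1]  # hypothetical distribution over locations
    for n in range(1, len(probs) + 1):
        print(n, search_tradeoff(probs, n))
```

Each extra location raises the success probability and the expected time together, which is the trade-off that the abstract says is usually settled by hand-tuned heuristics.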

information retrieval, natural language, search strategy, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IROS47612.2022.9982093

2207.07524

Country:

Europe > Germany > Bremen > Bremen (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Ireland (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)


You.com raises $25M to fuel its AI-powered search engine – TechCrunch

#artificialintelligence | Jul-14-2022, 22:25:46 GMT

At least, that's the crux of the argument Richard Socher, the former chief scientist at Salesforce, likes to make. In 2020, Socher co-founded You, a search engine that uses AI to understand search queries, rank the results and parse the queries into different languages (including programming languages). You summarizes information from across the web and offers built-in apps, like search tools for Twitter, that allow users to complete tasks without having to leave the results page. It seems there's some truth to his words. Socher claims that You has hundreds of thousands of users, with 70% growth in sign-ups last month and 30% growth in unique searches month over month.

artificial intelligence, information retrieval, natural language, (19 more...)

#artificialintelligence

Country: North America > United States (0.05)

Industry: Information Technology > Services (0.50)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)


GrabQC: Graph based Query Contextualization for automated ICD coding

Chelladurai, Jeshuren, Santhiappan, Sudarsun, Ravindran, Balaraman

arXiv.org Artificial Intelligence | Jul-14-2022

Automated medical coding is the process of automatically mapping clinical notes to the appropriate diagnosis and procedure codes from standard taxonomies such as ICD (International Classification of Diseases) and CPT (Current Procedural Terminology). The manual coding process involves identifying entities in the clinical notes and then querying a commercial or non-commercial medical-code Information Retrieval (IR) system that follows the Centers for Medicare & Medicaid Services (CMS) guidelines. We propose to automate this manual process by automatically constructing a query for the IR system from entities auto-extracted from the clinical notes. We propose GrabQC, a Graph based Query Contextualization method that automatically extracts queries from the clinical text, contextualizes the queries using a Graph Neural Network (GNN) model, and obtains the ICD codes using an external IR system. We also propose a method for labelling the dataset for training the model. We perform experiments on two datasets of clinical text in three different setups to assert the effectiveness of our approach. The experimental results show that our proposed method is better than the compared baselines in all three settings.

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-030-75762-5_19

2207.06802

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

Tam, Weng Lam, Liu, Xiao, Ji, Kaixuan, Xue, Lilong, Zhang, Xingjian, Dong, Yuxiao, Liu, Jiahua, Hu, Maodi, Tang, Jie

arXiv.org Artificial Intelligence | Jul-14-2022

Prompt tuning attempts to update few task-specific parameters in pre-trained models. It has achieved comparable performance to fine-tuning of the full parameter set on both language understanding and generation tasks. In this work, we study the problem of prompt tuning for neural text retrievers. We introduce parameter-efficient prompt tuning for text retrieval across in-domain, cross-domain, and cross-topic settings. Through an extensive analysis, we show that the strategy can mitigate the two issues -- parameter-inefficiency and weak generalizability -- faced by fine-tuning based retrieval methods. Notably, it can significantly improve the out-of-domain zero-shot generalization of the retrieval models. By updating only 0.1% of the model parameters, the prompt tuning strategy can help retrieval models achieve better generalization performance than traditional methods in which all parameters are updated. Finally, to facilitate research on retrievers' cross-topic generalizability, we curate and release an academic retrieval dataset with 18K query-results pairs in 87 topics, making it the largest topic-specific one to date.

computational linguistic, dataset, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2207.07087

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(11 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.66)


Paid search ways you'll want to maximize ROI in a good financial system - Channel969

#artificialintelligence | Jul-13-2022, 03:52:24 GMT

As we enter economic uncertainty, many leaders are looking for ways to make the most of their marketing budgets. With consumer behavior becoming increasingly turbulent and less predictable, you are now faced with maximizing your ROI. We'll highlight case studies of clients, including Sage, who have already reaped the benefits of sharpening their paid search strategy, and how they achieved a 75% decrease in CPCs. Register today for "Paid Search Tactics You Need to Maximize ROI in a Tight Economy," presented by Adthena. Cynthia Ramsaran is director of custom content at Third Door Media, publishers of Search Engine Land and MarTech.

good financial system, information retrieval, natural language, (5 more...)

#artificialintelligence

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.73)


ParaNames: A Massively Multilingual Entity Name Corpus

Sälevä, Jonne, Lignos, Constantine

arXiv.org Artificial Intelligence | Jul-12-2022

We introduce ParaNames, a multilingual parallel name resource consisting of 118 million names spanning 400 languages. Names are provided for 13.6 million entities, which are mapped to standardized entity types (PER/LOC/ORG). Using Wikidata as a source, we create the largest resource of this type to date. We describe our approach to filtering and standardizing the data to provide the best quality possible. ParaNames is useful for multilingual language processing, both in defining tasks for name translation/transliteration and as supplementary data for tasks such as named entity recognition and linking. We demonstrate an application of ParaNames by training a multilingual model for canonical name translation to and from English. Our resource is released under a Creative Commons license (CC BY 4.0) at https://github.com/bltlab/paranames.

experiment, information, transliteration, (14 more...)

arXiv.org Artificial Intelligence

2202.14035

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York (0.04)
Asia > China (0.04)
(14 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.55)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)


Measurement and applications of position bias in a marketplace search engine

Demsyn-Jones, Richard

arXiv.org Artificial Intelligence | Jul-12-2022

Search engines intentionally influence user behavior by picking and ranking the list of results. Users engage with the highest results both because of their prominent placement and because they are typically the most relevant documents. Search engine ranking algorithms need to identify relevance while incorporating the influence of the search engine itself. This paper describes our efforts at Thumbtack to understand the impact of ranking, including the empirical results of a randomization program. In the context of a consumer marketplace we discuss practical details of model choice, experiment design, bias calculation, and machine learning model adaptation. We include a novel discussion of how ranking bias may not only affect labels, but also model features. The randomization program led to improved models, motivated internal scenario analysis, and enabled user-facing scenario tooling.
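To make the idea of a randomization program concrete, here is a minimal sketch of one textbook way to estimate position bias. It is not the method used at Thumbtack, which the abstract does not detail. It assumes results were shown in random order, so average relevance is the same at every position and any difference in click-through rate can be attributed to placement. The log format and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def estimate_position_bias(log: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Estimate relative examination propensity per position.

    `log` holds (position, was_clicked) pairs from randomized rankings.
    The propensity of the top position present in the log is fixed at 1.0.
    """
    shown: Dict[int, int] = defaultdict(int)
    clicked: Dict[int, int] = defaultdict(int)
    for position, was_clicked in log:
        shown[position] += 1
        clicked[position] += int(was_clicked)

    ctr = {p: clicked[p] / shown[p] for p in shown}
    top_ctr = ctr[min(ctr)]
    if top_ctr == 0:
        raise ValueError("no clicks at the top position; cannot normalize")
    return {p: ctr[p] / top_ctr for p in sorted(ctr)}


if __name__ == "__main__":
    # Hypothetical randomized log: 10 impressions per position.
    log = ([(1, True)] * 5 + [(1, False)] * 5 +
           [(2, True)] * 2 + [(2, False)] * 8 +
           [(3, True)] * 1 + [(3, False)] * 9)
    print(estimate_position_bias(log))
```

Estimates of this kind are often used as inverse weights on click labels when training a ranking model, so that clicks earned at low positions count for more.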

measurement and application, position bias, search result, (14 more...)

arXiv.org Artificial Intelligence

2206.1172

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
