AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age when widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

JobHam-place with smart recommend job options and candidate filtering options

Wu, Shiyao

arXiv.org Artificial Intelligence, Mar-31-2023

Due to the increasing number of graduates, many applicants struggle to find a job, and employers have difficulty filtering job applicants, which can reduce their effectiveness. However, most job-hunting websites do not integrate job recommendation or CV filtering and ranking functionality. This project therefore builds a smart job-hunting system that combines these functions: job recommendation, CV ranking, and a job dashboard for skills and job applicants. Job recommendation and CV ranking start with automatic keyword extraction and end with the job/CV ranking algorithm. Automatic keyword extraction is implemented by the Job2Skill and CV2Skill models based on BERT. Job2Skill consists of two components, a text encoder and GRU-based layers, while CV2Skill is mainly based on BERT and fine-tunes the pre-trained model on the Resume-Entity dataset. To match skills between CVs and job descriptions and to rank lists of jobs and candidates, the job/CV ranking algorithms compute the occurrence ratio of skill words based on TF-IDF scores and the match ratio of the total number of skills. In addition, some advanced features have been integrated into the website to improve the user experience, such as a calendar and the sweetalert2 plugin, along with basic features that support the job application process, such as job application tracking and interview arrangement.
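The ranking step described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the IDF smoothing, the equal 0.5/0.5 blend of the two ratios, and every function and variable name below are assumptions made for illustration, and the skill lists stand in for the output of the keyword-extraction models.

```python
import math
from collections import Counter


def idf_weights(skill_sets):
    """Smoothed inverse document frequency of each skill across all postings."""
    n = len(skill_sets)
    df = Counter(skill for skills in skill_sets for skill in set(skills))
    return {skill: math.log((1 + n) / (1 + count)) + 1.0 for skill, count in df.items()}


def match_score(cv_skills, job_skills, idf):
    """Blend an IDF-weighted overlap with a plain match ratio.

    The equal weighting of the two terms is an assumption of this sketch.
    """
    cv, job = set(cv_skills), set(job_skills)
    if not job:
        return 0.0
    shared = cv & job
    weighted = sum(idf.get(s, 1.0) for s in shared) / sum(idf.get(s, 1.0) for s in job)
    ratio = len(shared) / len(job)
    return 0.5 * weighted + 0.5 * ratio


def rank_jobs(cv_skills, jobs):
    """Return (job name, score) pairs sorted from best to worst match."""
    idf = idf_weights(list(jobs.values()))
    scored = [(name, match_score(cv_skills, skills, idf)) for name, skills in jobs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical extracted skills, for illustration only.
jobs = {
    "backend": ["python", "sql", "docker"],
    "frontend": ["javascript", "css", "html"],
    "data": ["python", "sql", "pandas"],
}
cv = ["python", "sql", "pandas"]
ranking = rank_jobs(cv, jobs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Ranking candidates for a single job works the same way with the roles swapped: score each CV's skill set against the job's skill set and sort.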

information retrieval, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2303.1793

Country:

Asia (0.14)
Europe > United Kingdom > England > Leicestershire > Leicester (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

BERTino: an Italian DistilBERT model

Muffo, Matteo, Bertino, Enrico

arXiv.org Artificial Intelligence, Mar-31-2023

The recent introduction of Transformer language representation models has enabled great improvements in many natural language processing (NLP) tasks. However, while the performance achieved by this kind of architecture is surprising, its usability is limited by the high number of parameters in the network, which results in high computational and memory demands. In this work we present BERTino, a DistilBERT model proposed as the first lightweight alternative to the BERT architecture specific to the Italian language. We evaluated BERTino on the Italian ISDT, Italian ParTUT, Italian WikiNER and multiclass classification tasks, obtaining F1 scores comparable to those obtained by a BERT-base model with a remarkable improvement in training and inference speed.

information retrieval, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2303.18121

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Europe > Greece > Attica > Athens (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.37)

Scardina: Scalable Join Cardinality Estimation by Multiple Density Estimators

Ito, Ryuichi, Sasaki, Yuya, Xiao, Chuan, Onizuka, Makoto

arXiv.org Artificial Intelligence, Mar-31-2023

In recent years, machine learning-based cardinality estimation methods have been replacing traditional methods. This change is expected to benefit one of the most important applications of cardinality estimation, the query optimizer, and so speed up query processing. However, none of the existing methods precisely estimates cardinalities when relational schemas consist of many tables with strong correlations between tables/attributes. This paper shows that multiple density estimators can be combined to effectively target the cardinality estimation of data with large and complex schemas having strong correlations. We propose Scardina, a new join cardinality estimation method using multiple partitioned models based on the schema structure.
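To see why strong correlations between attributes make cardinality estimation hard, the toy sketch below compares the true result size of a two-predicate filter with the estimate obtained by multiplying per-predicate selectivities under an independence assumption. It does not implement Scardina or any learned estimator; the table contents and all names are hypothetical.

```python
def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Hypothetical table in which city fully determines country.
rows = (
    [{"city": "Osaka", "country": "Japan"}] * 50
    + [{"city": "Turin", "country": "Italy"}] * 50
)


def is_osaka(row):
    return row["city"] == "Osaka"


def is_japan(row):
    return row["country"] == "Japan"


# True cardinality of: city = 'Osaka' AND country = 'Japan'
true_card = sum(1 for row in rows if is_osaka(row) and is_japan(row))

# Estimate that treats the two predicates as independent.
indep_est = len(rows) * selectivity(rows, is_osaka) * selectivity(rows, is_japan)

print(true_card, indep_est)
```

Because the two columns are perfectly correlated, the independence estimate is half the true count, and the gap compounds as more correlated predicates or joined tables are added.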

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2303.18042

Country: Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.30)

Reviewer Assignment Problem: A Systematic Review of the Literature

Aksoy, Meltem | Yanik, Seda (Istanbul Technical University) | Amasyali, Mehmet Fatih (Yildiz Technical University)

Journal of Artificial Intelligence Research, Mar-31-2023

Appropriate reviewer assignment significantly impacts the quality of proposal evaluation, as accurate and fair reviews are contingent on their assignment to relevant reviewers. The crucial task of assigning reviewers to submitted proposals is the starting point of the review process and is also known as the reviewer assignment problem (RAP). Due to the obvious restrictions of manual assignment, journal editors, conference organizers, and grant managers demand automatic reviewer assignment approaches. Since 1992, many studies have proposed assignment solutions in response to this demand for automated procedures. The primary objective of this survey paper is to provide scholars and practitioners with a comprehensive overview of available research on the RAP. To achieve this goal, this article presents an in-depth systematic review of 103 publications in the field of reviewer assignment published in the past three decades and available in the Web of Science, Scopus, ScienceDirect, Google Scholar, and Semantic Scholar databases. The review classifies the RAP approaches into two broad categories and numerous subcategories based on their underlying techniques, and discusses each. Furthermore, potential future research directions for each category are presented. This survey shows that the research on the RAP is becoming more significant and that more effort is required to develop new approaches and a framework.

assignment, proposal, reviewer, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.14318

AI Access Foundation

14318

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Information Technology > Services (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(4 more...)

MetaEnhance: Metadata Quality Improvement for Electronic Theses and Dissertations of University Libraries

Choudhury, Muntabir Hasan, Salsabil, Lamia, Jayanetti, Himarsha R., Wu, Jian, Ingram, William A., Fox, Edward A.

arXiv.org Artificial Intelligence, Mar-30-2023

Metadata quality is crucial for digital objects to be discovered through digital library interfaces. However, due to various reasons, the metadata of digital objects often exhibits incomplete, inconsistent, and incorrect values. We investigate methods to automatically detect, correct, and canonicalize scholarly metadata, using seven key fields of electronic theses and dissertations (ETDs) as a case study. We propose MetaEnhance, a framework that utilizes state-of-the-art artificial intelligence methods to improve the quality of these fields. To evaluate MetaEnhance, we compiled a metadata quality evaluation benchmark containing 500 ETDs, by combining subsets sampled using multiple criteria. We tested MetaEnhance on this benchmark and found that the proposed methods achieved nearly perfect F1-scores in detecting errors and F1-scores in correcting errors ranging from 0.85 to 1.00 for five of seven fields.

information retrieval, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2303.17661

Country:

North America > United States > Virginia > Norfolk City County > Norfolk (0.05)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Information Management > Metadata Management (0.95)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)

CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning

Smith, James Seale, Karlinsky, Leonid, Gutta, Vyshnavi, Cascante-Bonilla, Paola, Kim, Donghyun, Arbelle, Assaf, Panda, Rameswar, Feris, Rogerio, Kira, Zsolt

arXiv.org Artificial Intelligence, Mar-30-2023

Computer vision models suffer from a phenomenon known as catastrophic forgetting when learning novel concepts from continuously shifting training data. Typical solutions for this continual learning problem require extensive rehearsal of previously seen data, which increases memory costs and may violate data privacy. Recently, the emergence of large-scale pre-trained vision transformer models has enabled prompting approaches as an alternative to data rehearsal. These approaches rely on a key-query mechanism to generate prompts and have been found to be highly resistant to catastrophic forgetting in the well-established rehearsal-free continual learning setting. However, the key mechanism of these methods is not trained end-to-end with the task sequence. Our experiments show that this leads to a reduction in their plasticity, hence sacrificing new-task accuracy, and to an inability to benefit from expanded parameter capacity. We instead propose to learn a set of prompt components which are assembled with input-conditioned weights to produce input-conditioned prompts, resulting in a novel attention-based end-to-end key-query scheme. Our experiments show that we outperform the current SOTA method DualPrompt on established benchmarks by as much as 4.5% in average final accuracy. We also outperform the state of the art by as much as 4.4% accuracy on a continual learning benchmark which contains both class-incremental and domain-incremental task shifts, corresponding to many practical settings. Our code is available at https://github.com/GT-RIPL/CODA-Prompt

continual learning, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2211.13218

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.68)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.34)

Task Oriented Conversational Modelling With Subjective Knowledge

Kumar, Raja

arXiv.org Artificial Intelligence, Mar-30-2023

Existing conversational models are handled by database (DB) and API-based systems. However, very often users' questions require information that cannot be handled by such systems. Nonetheless, answers to these questions are available in the form of customer reviews and FAQs. DSTC-11 proposes a three-stage pipeline consisting of knowledge-seeking turn detection, knowledge selection and response generation to create a conversational model grounded on this subjective knowledge. In this paper, we focus on improving the knowledge selection module to enhance the overall system performance. In particular, we propose entity retrieval methods which result in a more accurate and faster knowledge search. Our proposed Named Entity Recognition (NER) based entity retrieval method results in a 7X faster search compared to the baseline model. Additionally, we explore a potential keyword extraction method which can improve the accuracy of knowledge selection. Preliminary results show a 4% improvement in exact match score on the knowledge selection task. The code is available at https://github.com/raja-kumar/knowledge-grounded-TODS

artificial intelligence, information retrieval, natural language, (14 more...)

arXiv.org Artificial Intelligence

2303.17695

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)

Zero-Shot Retrieval with Search Agents and Hybrid Environments

Huebscher, Michelle Chen, Buck, Christian, Ciaramita, Massimiliano, Rothe, Sascha

arXiv.org Artificial Intelligence, Mar-29-2023

Learning to search is the task of building artificial agents that learn to autonomously use a search box to find information. So far, it has been shown that current language models can learn symbolic query reformulation policies, in combination with traditional term-based retrieval, but fall short of outperforming neural retrievers. We extend the previous learning to search setup to a hybrid environment, which accepts discrete query refinement operations, after a first-pass retrieval step via a dual encoder. Experiments on the BEIR task show that search agents, trained via behavioral cloning, outperform the underlying search system based on a combined dual encoder retriever and cross encoder reranker. Furthermore, we find that simple heuristic Hybrid Retrieval Environments (HRE) can improve baseline performance by several nDCG points. The search agent based on HRE (HARE) matches state-of-the-art performance, balanced in both zero-shot and in-domain evaluations, via interpretable actions, and at twice the speed.
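The abstract reports gains in nDCG (normalized discounted cumulative gain), a standard ranking metric. The sketch below is one way to compute it. It assumes graded relevance labels listed in ranked order, a linear gain, a default cutoff of k=10, and an ideal ranking built from the same list of labels. None of these choices are taken from the paper, and the function names are illustrative.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k=10):
    """DCG divided by the DCG of the ideal ordering of the same labels."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of the returned documents, in ranked order (hypothetical).
print(ndcg([3, 2, 1]))   # already in ideal order
print(ndcg([1, 2, 3]))   # best document ranked last
```

A reranker or query-refinement step improves nDCG when it moves highly relevant documents toward the top of the list, because the logarithmic discount penalizes relevant results that appear at lower ranks.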

information retrieval, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2209.15469

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre:

Workflow (0.68)
Research Report (0.65)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)

Neural Graph Databases. A new milestone in graph data…

#artificialintelligence, Mar-28-2023, 15:21:16 GMT

Vanilla graph databases are pretty much everywhere thanks to the ever-growing graphs in production, flexible graph data models, and expressive query languages. Query engines assume that graphs in classical graph DBs are complete. Under the completeness assumption, we can build indexes, store the graphs in a variety of read/write-optimized formats, and expect the DB to return what is there. But this assumption often does not hold in practice (we'd say it fails way too often). Consider some prominent knowledge graphs (KGs): in Freebase, 93.8% of people have no place of birth, 78.5% have no nationality, and about 68% have no profession, while in Wikidata, about 50% of artists have no date of birth, and only 0.4% of known buildings have information about height.

graph, graph db, neural graph database, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.38)

AI Keyword Research: Tools, Strategies & Tips - Analytics Vidhya

#artificialintelligence, Mar-28-2023, 10:05:10 GMT

AI-driven keyword research has become indispensable for bloggers looking to grow their audience and boost their online presence. By leveraging advanced ML algorithms, AI tools provide data-driven insights into user search behavior, revealing high-potential keywords to target. This process helps you create compelling, relevant, and searchable content that attracts organic traffic and improves your blog's search engine ranking. In this article, we'll introduce you to the benefits of AI keyword research, the best tools to use in 2023, and actionable tips for harnessing their potential to grow your blog exponentially. AI keyword research is the process of using machine learning algorithms and advanced data analytics to identify high-potential keywords that can help improve a website's search engine ranking and drive traffic to the site.

ai keyword research, keyword, keyword research, (11 more...)

#artificialintelligence

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.75)
