AITopics

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Neural Information Processing Systems, Apr-6-2023, 12:52:12 GMT

Linear Submodular Bandits and their Application to Diversified Retrieval

Diversified retrieval and online learning are two core research areas in the design of modern information retrieval systems. In this paper, we propose the linear submodular bandits problem, which is an online learning setting for optimizing a general class of feature-rich submodular utility models for diversified retrieval. We present an algorithm, called LSBGREEDY, and prove that it efficiently converges to a near-optimal model. As a case study, we applied our approach to the setting of personalized news recommendation, where the system must recommend small sets of news articles selected from tens of thousands of available articles each day. In a live user study, we found that LSBGREEDY significantly outperforms existing online learning approaches.
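As a rough illustration of why a submodular utility encourages diversity, the sketch below greedily selects articles under a weighted probabilistic-coverage utility. This is not LSBGREEDY itself: the coverage model, the fixed topic weights, and the names `coverage_utility` and `greedy_select` are assumptions made for the example, and the online learning of the weights is left out entirely.

```python
from math import prod

def coverage_utility(selected, weights):
    """Assumed utility: sum over topics of w_t * (1 - prod_a (1 - p_a[t])).

    Each article is a list of per-topic coverage probabilities. The utility
    is submodular, so a second article on an already covered topic adds little.
    """
    return sum(
        w * (1.0 - prod(1.0 - article[t] for article in selected))
        for t, w in enumerate(weights)
    )

def greedy_select(articles, weights, k):
    """Pick k article indices, each time taking the largest marginal gain."""
    chosen = []
    remaining = list(range(len(articles)))
    for _ in range(min(k, len(articles))):
        base = coverage_utility([articles[i] for i in chosen], weights)

        def gain(i):
            candidate = [articles[j] for j in chosen] + [articles[i]]
            return coverage_utility(candidate, weights) - base

        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical toy data: two articles on topic 0, one on topic 1.
articles = [[0.9, 0.0], [0.8, 0.0], [0.0, 0.7]]
weights = [1.0, 1.0]
picked = greedy_select(articles, weights, k=2)
```

Ranking by individual score would return the two topic-0 articles, while the greedy rule picks one article per topic, because the second topic-0 article has a small marginal gain once the first is selected.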

application, diversified retrieval, linear submodular bandit

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.49)

Neural Information Processing Systems, Apr-6-2023, 12:42:21 GMT

Query Complexity of Derivative-Free Optimization

Derivative Free Optimization (DFO) is attractive when the objective function's derivatives are not available and evaluations are costly. Moreover, if the function evaluations are noisy, then approximating gradients by finite differences is difficult. This paper gives quantitative lower bounds on the performance of DFO with noisy function evaluations, exposing a fundamental and unavoidable gap between optimization performance based on noisy evaluations versus noisy gradients. This challenges the conventional wisdom that the method of finite differences is comparable to a stochastic gradient. However, there are situations in which DFO is unavoidable, and for such situations we propose a new DFO algorithm that is proved to be near optimal for the class of strongly convex objective functions.
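The difficulty with finite differences under noise can be seen in a small sketch. It is not the algorithm or the bounds from the paper; the Gaussian noise model and the names `noisy_eval`, `central_difference`, and `estimate_noise_std` are assumptions chosen for illustration. A central difference divides the difference of two noisy evaluations by 2h, so independent noise of standard deviation sigma in each evaluation becomes noise of standard deviation sigma / (sqrt(2) * h) in the gradient estimate, which grows as h shrinks.

```python
import random

def noisy_eval(f, x, sigma, rng):
    """Evaluate f at x with additive Gaussian noise (assumed noise model)."""
    return f(x) + rng.gauss(0.0, sigma)

def central_difference(f, x, h, sigma, rng):
    """Central finite-difference estimate of f'(x) from two noisy evaluations."""
    upper = noisy_eval(f, x + h, sigma, rng)
    lower = noisy_eval(f, x - h, sigma, rng)
    return (upper - lower) / (2.0 * h)

def estimate_noise_std(sigma, h):
    """Std. dev. of the estimate's noise: sqrt(2) * sigma / (2 * h)."""
    return sigma / (2.0 ** 0.5 * h)

def quadratic(x):
    return x * x

rng = random.Random(0)
exact = central_difference(quadratic, 1.5, 1e-3, 0.0, rng)    # noise-free
noisy = central_difference(quadratic, 1.5, 1e-3, 0.01, rng)   # noisy
```

With `sigma = 0.01` and `h = 1e-3`, the noise in the estimate has a standard deviation of about 7, which swamps the true derivative of 3 at `x = 1.5`. Increasing `h` reduces the noise but, for non-quadratic functions, increases the truncation bias.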

derivative-free optimization, evaluation, function evaluation

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.43)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.40)

Neural Information Processing Systems, Apr-6-2023, 12:12:05 GMT

Human memory search as a random walk in a semantic network

The human mind has a remarkable ability to store a vast amount of information in memory, and an even more remarkable ability to retrieve these experiences when needed. Understanding the representations and algorithms that underlie human memory search could potentially be useful in other information retrieval settings, including internet search. Psychological studies have revealed clear regularities in how people search their memory, with clusters of semantically related items tending to be retrieved together. These findings have recently been taken as evidence that human memory search is similar to animals foraging for food in patchy environments, with people making a rational decision to switch away from a cluster of related information as it becomes depleted. We demonstrate that the results that were taken as evidence for this account also emerge from a random walk on a semantic network, much like the random web surfer model used in internet search engines.
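A minimal sketch of the random-walk idea follows. The toy network, the uniform choice among neighbors, and the rule of reporting a word only on its first visit are assumptions made for this example rather than details taken from the paper; `memory_search` is a hypothetical name.

```python
import random

def memory_search(graph, start, steps, seed=0):
    """Walk the network at random and report each node the first time it is visited."""
    rng = random.Random(seed)
    node = start
    recalled = [start]
    seen = {start}
    for _ in range(steps):
        neighbors = graph[node]
        if not neighbors:
            break  # dead end: nothing further can be retrieved
        node = rng.choice(neighbors)
        if node not in seen:
            seen.add(node)
            recalled.append(node)
    return recalled

# Hypothetical semantic network: a pet cluster and a farm cluster,
# joined by a single edge between "dog" and "horse".
network = {
    "cat": ["dog", "hamster"],
    "dog": ["cat", "hamster", "horse"],
    "hamster": ["cat", "dog"],
    "horse": ["dog", "cow", "sheep"],
    "cow": ["horse", "sheep"],
    "sheep": ["horse", "cow"],
}
recalled = memory_search(network, start="cat", steps=50)
```

Because most edges stay within a cluster, the walk tends to report several items from one cluster before crossing to the other, which produces clustered retrieval without any explicit decision to switch.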

memory search, random walk, semantic network

Technology:

Information Technology > Information Management > Search (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.65)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.63)

Chen, Eason, Roche, Niall, Tseng, Yuen-Hsien, Hernandez, Walter, Shangguan, Jiangbo, Moore, Alastair

Conversion of Legal Agreements into Smart Legal Contracts using NLP

arXiv.org Artificial Intelligence, Apr-5-2023

A Smart Legal Contract (SLC) is a specialized digital agreement comprising natural language and computable components. The Accord Project provides an open-source SLC framework containing three main modules: Cicero, Concerto, and Ergo. Currently, we need lawyers, programmers, and clients to work together with great effort to create a usable SLC using the Accord Project. This paper proposes a pipeline to automate the SLC creation process with several Natural Language Processing (NLP) models to convert law contracts to the Accord Project's Concerto model. After evaluating the proposed pipeline, we discovered that our NER pipeline accurately detects CiceroMark from Accord Project template text with an accuracy of 0.8. Additionally, our Question Answering method can extract one-third of the Concerto variables from the template text. We also delve into some limitations and possible future research for the proposed pipeline. Finally, we describe a web interface enabling users to build SLCs. This interface leverages the proposed pipeline to convert text documents to Smart Legal Contracts by using NLP models.

information retrieval, natural language, template

doi: 10.1145/3543873.3587554

2210.08954

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (0.50)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.31)

arXiv.org Artificial Intelligence, Apr-5-2023

Ericson: An Interactive Open-Domain Conversational Search Agent

Wang, Zihao, Ahmadvand, Ali, Choi, Jason, Karisani, Payam, Agichtein, Eugene

Open-domain conversational search (ODCS) aims to provide valuable, up-to-date information while maintaining natural conversations to help users refine and ultimately answer information needs. However, creating an effective and robust ODCS agent is challenging. In this paper, we present a fully functional ODCS system, Ericson, which includes state-of-the-art question answering and information retrieval components, as well as intent inference and dialogue management models for proactive question refinement and recommendations. Our system was stress-tested in the Amazon Alexa Prize by engaging in live conversations with thousands of Alexa users, thus providing an empirical basis for the analysis of the ODCS system in real settings. Our interaction data analysis revealed that accurate intent classification, encouraging user engagement, and careful proactive recommendations contribute most to user satisfaction. Our study further identifies limitations of the existing search techniques, and can serve as a building block for the next generation of ODCS agents.

information, information retrieval, machine learning

2304.02233

Country: North America > United States > Oklahoma (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Mohoney, Jason, Pacaci, Anil, Chowdhury, Shihabur Rahman, Mousavi, Ali, Ilyas, Ihab F., Minhas, Umar Farooq, Pound, Jeffrey, Rekatsinas, Theodoros

High-Throughput Vector Similarity Search in Knowledge Graphs

arXiv.org Artificial Intelligence, Apr-4-2023

There is an increasing adoption of machine learning for encoding data into vectors to serve online recommendation and search use cases. As a result, recent data management systems propose augmenting query processing with online vector similarity search. In this work, we explore vector similarity search in the context of Knowledge Graphs (KGs). Motivated by the tasks of finding related KG queries and entities for past KG query workloads, we focus on hybrid vector similarity search (hybrid queries for short) where part of the query corresponds to vector similarity search and part of the query corresponds to predicates over relational attributes associated with the underlying data vectors. For example, given past KG queries for a song entity, we want to construct new queries for new song entities whose vector representations are close to the vector representation of the entity in the past KG query. But entities in a KG also have non-vector attributes such as a song associated with an artist, a genre, and a release date. Therefore, suggested entities must also satisfy query predicates over non-vector attributes beyond a vector-based similarity predicate. While these tasks are central to KGs, our contributions are generally applicable to hybrid queries. In contrast to prior works that optimize online queries, we focus on enabling efficient batch processing of past hybrid query workloads. We present our system, HQI, for high-throughput batch processing of hybrid queries. We introduce a workload-aware vector data partitioning scheme to tailor the vector index layout to the given workload and describe a multi-query optimization technique to reduce the overhead of vector similarity computations. We evaluate our methods on industrial workloads and demonstrate that HQI yields a 31x improvement in throughput for finding related KG queries compared to existing hybrid query processing approaches.
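To make the notion of a hybrid query concrete, here is a brute-force sketch that first applies a predicate over non-vector attributes and then ranks the survivors by vector similarity. It is not HQI and uses none of its partitioning or multi-query optimizations; the entity layout, the cosine similarity measure, and the names `cosine` and `hybrid_top_k` are assumptions for this example.

```python
import heapq
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def hybrid_top_k(entities, query_vec, predicate, k):
    """Keep entities that satisfy the attribute predicate, then return the
    k whose vectors are most similar to the query vector."""
    candidates = (e for e in entities if predicate(e))
    return heapq.nlargest(k, candidates, key=lambda e: cosine(e["vec"], query_vec))

# Hypothetical song entities with a relational attribute and an embedding.
songs = [
    {"id": "a", "genre": "jazz", "vec": [1.0, 0.0]},
    {"id": "b", "genre": "rock", "vec": [0.9, 0.1]},
    {"id": "c", "genre": "jazz", "vec": [0.0, 1.0]},
    {"id": "d", "genre": "jazz", "vec": [0.7, 0.7]},
]
result = hybrid_top_k(songs, [1.0, 0.1], lambda e: e["genre"] == "jazz", k=2)
result_ids = [e["id"] for e in result]
```

Song `b` is the closest vector to the query but fails the genre predicate, so it is excluded. A full scan like this costs one similarity computation per qualifying entity per query, which is the overhead that batch-oriented systems aim to reduce.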

artificial intelligence, machine learning, natural language

2304.01926

Country:

North America > United States > Washington > King County > Seattle (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Apr-4-2023

EDeR: A Dataset for Exploring Dependency Relations Between Events

Li, Ruiqi, Haslum, Patrik, Cui, Leyang

Relation extraction is a central task in natural language processing (NLP) and information retrieval (IR) research. We argue that an important type of relation not explored in NLP or IR research to date is that of an event being an argument - required or optional - of another event. We introduce the human-annotated Event Dependency Relation dataset (EDeR) which provides this dependency relation. The annotation is done on a sample of documents from the OntoNotes dataset, which has the added benefit that it integrates with existing, orthogonal, annotations of this dataset. We investigate baseline approaches for predicting the event dependency relation, the best of which achieves an accuracy of 82.61 for binary argument/non-argument classification. We show that recognizing this relation leads to more accurate event extraction (semantic role labelling) and can improve downstream tasks that depend on this, such as co-reference resolution. Furthermore, we demonstrate that predicting the three-way classification into the required argument, optional argument or non-argument is a more challenging task.

information retrieval, machine learning, natural language

2304.01612

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

arXiv.org Artificial Intelligence, Apr-3-2023

Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

Lin, Jimmy, Alfonso-Hermelo, David, Jeronymo, Vitor, Kamalloo, Ehsan, Lassance, Carlos, Nogueira, Rodrigo, Ogundepo, Odunayo, Rezagholizadeh, Mehdi, Thakur, Nandan, Yang, Jheng-Hong, Zhang, Xinyu

The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another. However, the rapid pace of progress has led to a confusing panoply of methods and reproducibility has lagged behind the state of the art. In this context, our work makes two important contributions: First, we provide a conceptual framework for organizing different approaches to cross-lingual retrieval using multi-stage architectures for mono-lingual retrieval as a scaffold. Second, we implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese. Our efforts are built on a collaboration of the two teams that submitted the most effective runs to the TREC evaluation. These contributions provide a firm foundation for future advances.

information retrieval, natural language, translation

2304.01019

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

arXiv.org Artificial Intelligence, Apr-2-2023

DeepEverest: Accelerating Declarative Top-K Queries for Deep Neural Network Interpretation

He, Dong, Daum, Maureen, Cai, Walter, Balazinska, Magdalena

We design, implement, and evaluate DeepEverest, a system for the efficient execution of interpretation by example queries over the activation values of a deep neural network. DeepEverest consists of an efficient indexing technique and a query execution algorithm with various optimizations. We prove that the proposed query execution algorithm is instance optimal.

A widely used interpretation by example query is, "find the top-k inputs that produce the highest activation values for an individual neuron or a group of neurons" [12, 14, 21, 33, 50, 57, 58, 61]. Another common query is, "for any input, find the k-nearest neighbors in the dataset using the activation values of a group of neurons based on the proximity in the latent space defined ...
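The two interpretation-by-example queries mentioned in this excerpt can be sketched as a brute-force baseline over precomputed activations. This is not DeepEverest's index or its query execution algorithm; the in-memory dictionary of activations, the Euclidean distance, and the names `top_k_by_neuron` and `nearest_neighbors` are assumptions made for illustration.

```python
import heapq
from math import sqrt

def top_k_by_neuron(activations, neuron, k):
    """Return the k input ids with the highest activation for one neuron."""
    return heapq.nlargest(k, activations, key=lambda i: activations[i][neuron])

def nearest_neighbors(activations, query_id, neurons, k):
    """Return the k input ids closest to query_id in the space spanned by
    the selected neurons (Euclidean distance is an assumption here)."""
    query = [activations[query_id][n] for n in neurons]

    def distance(input_id):
        point = [activations[input_id][n] for n in neurons]
        return sqrt(sum((p - q) ** 2 for p, q in zip(point, query)))

    others = (i for i in activations if i != query_id)
    return heapq.nsmallest(k, others, key=distance)

# Hypothetical activations of three neurons for three inputs.
activations = {
    "img0": [0.1, 0.9, 0.3],
    "img1": [0.8, 0.2, 0.4],
    "img2": [0.5, 0.7, 0.35],
}
top_inputs = top_k_by_neuron(activations, neuron=1, k=2)
neighbors = nearest_neighbors(activations, "img0", neurons=[0, 2], k=1)
```

Both functions scan every input, which requires all activations to be materialized or recomputed. Avoiding that cost is the motivation for indexing the activation values.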

artificial intelligence, machine learning, natural language

2104.02234

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.69)