AITopics

We propose a combined three pre-trained language models (XLM-R, BART, and DeBERTa-V3) as an empower of contextualized embedding for named entity recognition. Our model achieves a 92.9% F1 score on the test set and ranks 5th on the leaderboard at NL4Opt competition subtask 1.

information retrieval, machine learning, natural language, (14 more...)

2212.07219

Country:

Oceania > Australia > Victoria > Melbourne (0.05)
North America > United States > Washington > King County > Seattle (0.05)
Asia > Vietnam (0.05)
Asia > China > Hong Kong (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Xia, Tingyu, Wang, Yue, Tian, Yuan, Chang, Yi

Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully-crafted class descriptions to obtain class-specific keywords but also require substantial amount of unlabeled data and takes a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representation to retrieve class-relevant documents from external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions as it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification tasks show that the proposed approach frequently outperforms keyword-driven models in terms of classification accuracy and often enjoys orders-of-magnitude faster training speed.

classification, information retrieval, machine learning, (20 more...)

2212.05506

Country:

North America > United States > North Carolina (0.04)
Asia > China > Jilin Province (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
(2 more...)

Saha, Sourav, Majumdar, Debapriyo, Mitra, Mandar

Explainability of Text Processing and Retrieval Methods: A Critical Survey

Deep Learning and Machine Learning based models have become extremely popular in text processing and information retrieval. However, the non-linear structures present inside the networks make these models largely inscrutable. A significant body of research has focused on increasing the transparency of these models. This article provides a broad overview of research on the explainability and interpretability of natural language processing and information retrieval methods. More specifically, we survey approaches that have been applied to explain word embeddings, sequence modeling, attention modules, transformers, BERT, and document ranking. The concluding section suggests some possible directions for future research on this topic.

information retrieval, machine learning, proc, (18 more...)

2212.07126

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > India > West Bengal > Kolkata (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Augmenting Scientific Creativity with Retrieval across Knowledge Domains

Kang, Hyeonsu B., Mysore, Sheshera, Huang, Kevin, Chang, Haw-Shiuan, Prein, Thorben, McCallum, Andrew, Kittur, Aniket, Olivetti, Elsa

Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas. While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas \textit{outside} such domains. In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification. To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains. Furthermore, end-users can `zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters. Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration.

information retrieval, machine learning, natural language, (21 more...)

2206.01328

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Energy > Energy Storage (0.68)
Materials (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

arXiv.org Artificial IntelligenceDec-13-2022

Unsupervised Multi-Granularity Summarization

Zhong, Ming, Liu, Yang, Ge, Suyu, Mao, Yuning, Jiao, Yizhu, Zhang, Xingxing, Xu, Yichong, Zhu, Chenguang, Zeng, Michael, Han, Jiawei

Text summarization is a user-preference based task, i.e., for one document, users often have different priorities for summary. As a key aspect of customization in summarization, granularity is used to measure the semantic coverage between the summary and source document. However, developing systems that can generate summaries with customizable semantic coverage is still an under-explored topic. In this paper, we propose the first unsupervised multi-granularity summarization framework, GranuSum. We take events as the basic semantic units of the source documents and propose to rank these events by their salience. We also develop a model to summarize input documents with given events as anchors and hints. By inputting different numbers of events, GranuSum is capable of producing multi-granular summaries in an unsupervised manner. Meanwhile, we annotate a new benchmark GranuDUC that contains multiple summaries at different granularities for each document cluster. Experimental results confirm the substantial superiority of GranuSum on multi-granularity summarization over strong baselines. Further, by exploiting the event information, GranuSum also exhibits state-of-the-art performance under the conventional unsupervised abstractive setting. Dataset for this paper can be found at: https://github.com/maszhongming/GranuDUC

information retrieval, machine learning, natural language, (17 more...)

2201.12502

Country:

Asia > Russia (0.46)
North America > Honduras (0.05)
Europe > Russia (0.04)
(10 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Sports > Basketball (1.00)
Health & Medicine (0.94)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)

arXiv.org Artificial IntelligenceDec-13-2022

Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble

Qu, Xiaoye, Zeng, Jun, Liu, Daizong, Wang, Zhefeng, Huai, Baoxing, Zhou, Pan

Distantly-Supervised Named Entity Recognition (DS-NER) effectively alleviates the data scarcity problem in NER by automatically generating training samples. Unfortunately, the distant supervision may induce noisy labels, thus undermining the robustness of the learned models and restricting the practical application. To relieve this problem, recent works adopt self-training teacher-student frameworks to gradually refine the training labels and improve the generalization ability of NER models. However, we argue that the performance of the current self-training frameworks for DS-NER is severely underestimated by their plain designs, including both inadequate student learning and coarse-grained teacher updating. Therefore, in this paper, we make the first attempt to alleviate these issues by proposing: (1) adaptive teacher learning comprised of joint training of two teacher-student networks and considering both consistent and inconsistent predictions between two teachers, thus promoting comprehensive student learning. (2) fine-grained student ensemble that updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise. To verify the effectiveness of our proposed method, we conduct experiments on four DS-NER datasets. The experimental results demonstrate that our method significantly surpasses previous SOTA methods.

information retrieval, machine learning, natural language, (19 more...)

2212.06522

Country:

North America > United States > New York (0.05)
North America > United States > California (0.05)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)

#artificialintelligenceDec-12-2022, 16:10:07 GMT

Getting Started with Trino Query Engine

The advantage of this last launching method is that if there are some problems, I can easily identify them. For this reason, I prefer to start the server in foreground. Now the server can be queried.

artificial intelligence, machine learning, natural language, (17 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)

Rosa, Guilherme Moraes, Bonifacio, Luiz, Jeronymo, Vitor, Abonizio, Hugo, Fadaee, Marzieh, Lotufo, Roberto, Nogueira, Rodrigo

No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval

arXiv.org Artificial IntelligenceDec-12-2022

Recent work has shown that small distilled language models are strong competitors to models that are orders of magnitude larger and slower in a wide range of information retrieval tasks. This has made distilled and dense models, due to latency constraints, the go-to choice for deployment in real-world retrieval applications. In this work, we question this practice by showing that the number of parameters and early query-document interaction play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that rerankers largely outperform dense ones of similar size in several tasks. Our largest reranker reaches the state of the art in 12 of the 18 datasets of the Benchmark-IR (BEIR) and surpasses the previous state of the art by 3 average points. Finally, we confirm that in-domain effectiveness is not a good indicator of zero-shot effectiveness.

effectiveness, information retrieval, large language model, (15 more...)

2206.02873

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
Europe > Netherlands (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.89)

arXiv.org Artificial IntelligenceDec-12-2022

In Defense of Cross-Encoders for Zero-Shot Retrieval

Rosa, Guilherme, Bonifacio, Luiz, Jeronymo, Vitor, Abonizio, Hugo, Fadaee, Marzieh, Lotufo, Roberto, Nogueira, Rodrigo

Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures on a wide range of parameter count on both in-domain and out-of-domain scenarios. We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size in several tasks. In the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no gains in comparison to a simpler retriever such as BM25 on out-of-domain tasks. The code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git

arxiv preprint arxiv, large language model, machine learning, (15 more...)

2212.06121

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.68)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)

arXiv.org Artificial IntelligenceDec-11-2022

FactorJoin: A New Cardinality Estimation Framework for Join Queries

Wu, Ziniu, Negi, Parimarjan, Alizadeh, Mohammad, Kraska, Tim, Madden, Samuel

Cardinality estimation is one of the most fundamental and challenging problems in query optimization. Neither classical nor learning-based methods yield satisfactory performance when estimating the cardinality of the join queries. They either rely on simplified assumptions leading to ineffective cardinality estimates or build large models to understand the data distributions, leading to long planning times and a lack of generalizability across queries. In this paper, we propose a new framework FactorJoin for estimating join queries. FactorJoin combines the idea behind the classical join-histogram method to efficiently handle joins with the learning-based methods to accurately capture attribute correlation. Specifically, FactorJoin scans every table in a DB and builds single-table conditional distributions during an offline preparation phase. When a join query comes, FactorJoin translates it into a factor graph model over the learned distributions to effectively and efficiently estimate its cardinality. Unlike existing learning-based methods, FactorJoin does not need to de-normalize joins upfront or require executed query workloads to train the model. Since it only relies on single-table statistics, FactorJoin has small space overhead and is extremely easy to train and maintain. In our evaluation, FactorJoin can produce more effective estimates than the previous state-of-the-art learning-based methods, with 40x less estimation latency, 100x smaller model size, and 100x faster training speed at comparable or better accuracy. In addition, FactorJoin can estimate 10,000 sub-plan queries within one second to optimize the query plan, which is very close to the traditional cardinality estimators in commercial DBMS.

artificial intelligence, machine learning, natural language, (19 more...)

2212.05526

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)