AITopics | spanbert

spanbert

Towards Transparency in Coreference Resolution: A Quantum-Inspired Approach

Wazni, Hadi, Sadrzadeh, Mehrnoosh

arXiv.org Artificial Intelligence | Dec-1-2023

Guided by grammatical structure, words compose to form sentences, and guided by discourse structure, sentences compose to form dialogues and documents. The compositional aspect of sentence and discourse units is often overlooked by machine learning algorithms. A recent initiative called Quantum Natural Language Processing (QNLP) learns word meanings as points in a Hilbert space and acts on them via a translation of grammatical structure into Parametrised Quantum Circuits (PQCs). Previous work extended the QNLP translation to discourse structure using points in a closure of Hilbert spaces. In this paper, we evaluate this translation on a Winograd-style pronoun resolution task. We train a Variational Quantum Classifier (VQC) for binary classification and implement an end-to-end pronoun resolution system. The simulations executed on IBMQ software converged with an F1 score of 87.20%. The model outperformed two out of three classical coreference resolution systems and approached the state-of-the-art SpanBERT. A mixed quantum-classical model further improved these results, with an F1 score increase of around 6%.

adjective, gerund phrase, verb phrase, (15 more...)

arXiv:2312.00688

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

Are Large Language Models Robust Coreference Resolvers?

Le, Nghia T., Ritter, Alan

arXiv.org Artificial Intelligence | Nov-14-2023

Recent work on extending coreference resolution across domains and languages relies on annotated data in both the target domain and language. At the same time, pre-trained large language models (LMs) have been reported to exhibit strong zero- and few-shot learning abilities across a wide range of NLP tasks. However, prior work mostly studied this ability using artificial sentence-level datasets such as the Winograd Schema Challenge. In this paper, we assess the feasibility of prompt-based coreference resolution by evaluating instruction-tuned language models on difficult, linguistically-complex coreference benchmarks (e.g., CoNLL-2012). We show that prompting for coreference can outperform current unsupervised coreference systems, although this approach appears to be reliant on high-quality mention detectors. Further investigations reveal that instruction-tuned LMs generalize surprisingly well across domains, languages, and time periods; yet continued fine-tuning of neural models should still be preferred if small amounts of annotated examples are available.

coreference resolution, dataset, instructgpt, (13 more...)

arXiv:2305.14489

Country:

Asia > China > Hong Kong (0.07)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry:

Law (0.70)
Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Do Question Answering Modeling Improvements Hold Across Benchmarks?

Liu, Nelson F., Lee, Tony, Jia, Robin, Liang, Percy

arXiv.org Artificial Intelligence | May-30-2023

Do question answering (QA) modeling improvements (e.g., choice of architecture and training procedure) hold consistently across the diverse landscape of QA benchmarks? To study this question, we introduce the notion of concurrence -- two benchmarks have high concurrence on a set of modeling approaches if they rank the modeling approaches similarly. We measure the concurrence between 32 QA benchmarks on a set of 20 diverse modeling approaches and find that human-constructed benchmarks have high concurrence amongst themselves, even if their passage and question distributions are very different. Surprisingly, even downsampled human-constructed benchmarks (i.e., collecting less data) and programmatically-generated benchmarks (e.g., cloze-formatted examples) have high concurrence with human-constructed benchmarks. These results indicate that, despite years of intense community focus on a small number of benchmarks, the modeling improvements studied hold broadly.
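
The notion of concurrence in the abstract above can be illustrated with a rank correlation between the scores that two benchmarks assign to the same modeling approaches. The sketch below uses Kendall's tau as one plausible choice; the abstract does not say which statistic the authors use, so the measure, the function name `kendall_tau`, and all scores are assumptions made purely for illustration.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall rank correlation (tau-a) between two equal-length score lists.

    Each index is one modeling approach; each list holds that approach's
    score on one benchmark. Returns a value in [-1, 1].
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both benchmarks must score the same approaches")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two approaches to compare rankings")

    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: both benchmarks order approaches i and j the same way.
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


# Hypothetical scores for four modeling approaches on three benchmarks.
benchmark_x = [61.0, 70.5, 74.2, 80.1]
benchmark_y = [40.3, 48.9, 55.0, 59.7]  # same ordering as benchmark_x
benchmark_z = [52.0, 47.5, 58.1, 50.2]  # partly different ordering

print(kendall_tau(benchmark_x, benchmark_y))
print(kendall_tau(benchmark_x, benchmark_z))
```

Under this reading, a value near 1 would indicate high concurrence (the two benchmarks rank the approaches almost identically), while a value near 0 would indicate that an improvement on one benchmark says little about the other.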

benchmark, machine learning, question answering, (21 more...)

arXiv:2102.01065

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Texas > McLennan County > Waco (0.04)
North America > United States > Texas > Falls County (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.67)
Leisure & Entertainment > Sports > Football (0.46)
Leisure & Entertainment > Sports > Basketball (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

The NLP Task Effectiveness of Long-Range Transformers

Qin, Guanghui, Feng, Yukun, Van Durme, Benjamin

arXiv.org Artificial Intelligence | Feb-10-2023

Transformer models cannot easily scale to long sequences due to their O(N^2) time and space complexity. This has led to Transformer variants seeking to lower computational complexity, such as Longformer and Performer. While such models have theoretically greater efficiency, their effectiveness on real NLP tasks has not been well studied. We benchmark 7 variants of Transformer models on 5 difficult NLP tasks and 7 datasets. We design experiments to isolate the effect of pretraining and hyperparameter settings, to focus on their capacity for long-range attention. Moreover, we present various methods to investigate attention behaviors to illuminate model details beyond metric scores. We find that the modified attention in long-range transformers has advantages on content selection and query-guided decoding, but they come with previously unrecognized drawbacks such as insufficient attention to distant tokens and accumulated approximation error.
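
To make the O(N^2) cost mentioned above concrete, the following sketch counts how many query-key score entries full self-attention computes for a sequence of N tokens and compares that with a simple local window in which each token attends only to neighbors within a fixed distance. The windowed pattern is a simplified stand-in for the sparse attention used by long-range variants, not the actual implementation of Longformer or Performer; the function names and the `half_width` parameter are hypothetical.

```python
def full_attention_entries(n_tokens: int) -> int:
    """Number of query-key scores in full self-attention: N * N."""
    return n_tokens * n_tokens


def local_window_entries(n_tokens: int, half_width: int) -> int:
    """Number of query-key scores when token i attends only to tokens j
    with |i - j| <= half_width (window clipped at the sequence edges)."""
    total = 0
    for i in range(n_tokens):
        lo = max(0, i - half_width)
        hi = min(n_tokens - 1, i + half_width)
        total += hi - lo + 1
    return total


# Full attention grows quadratically; the local window grows roughly linearly.
for n in (512, 4096, 32768):
    full = full_attention_entries(n)
    local = local_window_entries(n, half_width=256)
    print(f"N={n:>6}  full={full:>13,}  local={local:>11,}  ratio={full / local:.1f}")
```

The growing ratio shows where the theoretical efficiency comes from; it also hints at the drawback the abstract reports, since tokens outside the window receive no direct attention in this simplified pattern.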

longformer, transformer, xlnet, (15 more...)

arXiv:2202.07856

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Multi-Task Learning Framework for Extracting Emotion Cause Span and Entailment in Conversations

Bhat, Ashwani, Modi, Ashutosh

arXiv.org Artificial Intelligence | Nov-7-2022

Predicting emotions expressed in text is a well-studied problem in the NLP community. Recently there has been active research in extracting the cause of an emotion expressed in text. Most previous work has addressed causal emotion entailment in documents. In this work, we propose neural models to extract emotion cause span and entailment in conversations. For learning such models, we use the RECCON dataset, which is annotated with cause spans at the utterance level. In particular, we propose MuTEC, an end-to-end Multi-Task learning framework for extracting emotions, emotion cause, and entailment in conversations. This is in contrast to existing baseline models that use ground truth emotions to extract the cause. MuTEC performs better than the baselines for most of the data folds provided in the dataset.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv:2211.03742

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > India > Uttar Pradesh > Kanpur (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

A Simple and Strong Baseline for End-to-End Neural RST-style Discourse Parsing

Kobayashi, Naoki, Hirao, Tsutomu, Kamigaito, Hidetaka, Okumura, Manabu, Nagata, Masaaki

arXiv.org Artificial Intelligence | Nov-1-2022

To promote and further develop RST-style discourse parsing models, we need a strong baseline that can be regarded as a reference for reporting reliable experimental results. This paper explores a strong baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models. The experimental results obtained from two benchmark datasets demonstrate that the parsing performance strongly relies on the pre-trained language models rather than the parsing strategies. In particular, the bottom-up parser achieves large performance gains compared to the current best parser when employing DeBERTa. Our analysis of intra- and multi-sentential parsing and nuclearity prediction further reveals that language models with a span-masking scheme especially boost the parsing performance.

computational linguistic, machine learning, natural language, (18 more...)

arXiv:2210.08355

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Maryland > Baltimore (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(15 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

Jiang, Yuchen Eleanor, Cotterell, Ryan, Sachan, Mrinmaya

arXiv.org Artificial Intelligence | Oct-26-2022

Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of the structure of discourse. According to the theory, local coherence of discourse arises from the manner and extent to which successive utterances make reference to the same entities. In this paper, we investigate the connection between centering theory and modern coreference resolution systems. We provide an operationalization of centering and systematically investigate whether neural coreference resolvers adhere to the rules of centering theory by defining various discourse metrics and developing a search-based methodology. Our information-theoretic analysis reveals a positive dependence between coreference and centering, but also shows that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas. Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT can provide only small gains to modern neural coreference resolvers that make use of pretrained representations. Finally, we discuss factors that contribute to coreference but are not modeled by CT, such as world knowledge and recency bias. We formulate a version of CT that also models recency and show that it captures coreference information better than vanilla CT.

coreference, machine learning, natural language, (18 more...)

arXiv:2210.14678

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > Ukraine (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
(15 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Is Sluice Resolution really just Question Answering?

Wiriyathammabhum, Peratham

arXiv.org Artificial Intelligence | May-29-2021

Sluice resolution is the task of outputting the antecedents of wh-ellipses. The antecedents are the elided content behind the wh-words, which is referred to only implicitly through context. Previous work frames sluice resolution as question answering, a setting that outperforms all preceding work by large margins. Ellipses and questions are referentially dependent expressions (anaphora), and retrieving the corresponding antecedents is like answering questions to output pieces of clarifying information. However, the task is not fully solved. We therefore investigate what makes sluice resolution differ from question answering and fill in the error gaps. We also present results using recent state-of-the-art question answering systems, which improve on the previous work (86.01 to 90.39 F1).

resolution, sluice resolution, spanbert, (12 more...)

arXiv:2105.14347

Country: North America > United States > Illinois > Cook County > Chicago (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)