Ng, Vincent
Oversampling Higher-Performing Minorities During Machine Learning Model Training Reduces Adverse Impact Slightly but Also Reduces Model Accuracy
Hickman, Louis, Kuruzovich, Jason, Ng, Vincent, Arhin, Kofi, Wilson, Danielle
Organizations are increasingly adopting machine learning (ML) for personnel assessment. However, concerns exist about fairness in designing and implementing ML assessments. Supervised ML models are trained to reproduce patterns in data, meaning their predictions tend to reflect subgroup differences in applicant attributes in the training data, regardless of the underlying cause of those differences. In this study, we systematically under- and oversampled minority (Black and Hispanic) applicants to manipulate adverse impact ratios in training data and investigated how training data adverse impact ratios affect ML model adverse impact and accuracy. We used self-reports and interview transcripts from job applicants (N = 2,501) to train 9,702 ML models to predict screening decisions. We found that training data adverse impact related linearly to ML model adverse impact. However, removing adverse impact from training data only slightly reduced ML model adverse impact and tended to reduce ML model accuracy. Effects were consistent across self-reports and interview transcripts, and whether we oversampled real observations (i.e., bootstrapping) or synthetic ones. Because our study relied on limited predictor sets from one organization, the observed effects on adverse impact may be attenuated among more accurate ML models.
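The resampling manipulation at the heart of the design can be sketched in a few lines. The sketch below is our own illustration, not the authors' code: it assumes a pandas DataFrame with a binary screening-decision column and a group column, and it bootstraps (duplicates with replacement) selected minority applicants until the training data reaches a target adverse impact ratio (minority selection rate divided by majority selection rate).

    import math
    import pandas as pd

    def oversample_to_target_air(df, group_col, label_col,
                                 minority, majority,
                                 target_air=1.0, seed=0):
        # Selection rates are P(selected) within each group; the adverse
        # impact ratio (AIR) is the minority rate over the majority rate.
        maj_rate = df.loc[df[group_col] == majority, label_col].mean()
        minority_df = df[df[group_col] == minority]
        n, n_pos = len(minority_df), int(minority_df[label_col].sum())
        t = target_air * maj_rate      # desired minority selection rate
        if t >= 1 or n_pos / n >= t:
            return df                  # already at or above the target
        # Smallest e such that (n_pos + e) / (n + e) >= t; assumes at
        # least one selected minority applicant exists to resample.
        e = math.ceil((t * n - n_pos) / (1 - t))
        extra = minority_df[minority_df[label_col] == 1].sample(
            n=e, replace=True, random_state=seed)
        return pd.concat([df, extra], ignore_index=True)

Undersampling works analogously, dropping rows instead of duplicating them.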
Multimodal Propaganda Processing
Ng, Vincent, Li, Shengjie
Propaganda campaigns have long been used to influence public opinion via disseminating biased and/or misleading information. Despite the increasing prevalence of propaganda content on the Internet, few attempts have been made by AI researchers to analyze such content. We introduce the task of multimodal propaganda processing, where the goal is to automatically analyze propaganda content. We believe that this task presents a long-term challenge to AI researchers and that successful processing of propaganda could bring machine understanding one important step closer to human understanding. We discuss the technical challenges associated with this task and outline the steps that need to be taken to address it.
CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models
Niu, Changan, Li, Chuanyi, Ng, Vincent, Luo, Bin
Despite recent advances showing that a model pre-trained on large-scale source code data gains appreciable generalization capability, such a model still requires a sizeable amount of target-task data for fine-tuning. Moreover, the effectiveness of this generalization is largely affected by the size and quality of the fine-tuning data, which is detrimental for target tasks with limited or unavailable resources. Therefore, cross-task generalization, which aims to improve a model's generalization to unseen tasks, is of strong research and application value. In this paper, we propose a large-scale benchmark that includes 216 existing code-related tasks. We annotate each task with meta information, such as a task description and an instruction, that contains detailed information about the task and a solution guide. This meta information also lets us easily create a wide variety of "training/evaluation" task splits to evaluate a model's cross-task generalization capabilities. We then perform preliminary experiments demonstrating that cross-task generalization can be largely improved by in-context learning methods such as few-shot learning and learning from task instructions, which shows the promising prospects of conducting cross-task learning research on our benchmark. We hope that the collected datasets and our benchmark will facilitate future work that is not limited to cross-task generalization.
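The in-context learning setup the benchmark enables can be pictured with a simple prompt builder: condition a model on a task's instruction (its meta information) plus a handful of solved demonstrations, then ask for the unsolved query. The prompt format below is illustrative, not the paper's exact template.

    def build_fewshot_prompt(instruction, demonstrations, query):
        # Task instruction first, then k solved input/output pairs
        # (the "few shots"), then the query left for the model to solve.
        parts = [f"Instruction: {instruction}", ""]
        for inp, out in demonstrations:
            parts += [f"Input: {inp}", f"Output: {out}", ""]
        parts += [f"Input: {query}", "Output:"]
        return "\n".join(parts)

For a code-related task such as defect detection, the instruction might read "Decide whether the given function contains a bug; answer yes or no," and each demonstration would pair a function with its label.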
End-to-End Neural Discourse Deixis Resolution in Dialogue
Li, Shengjie, Ng, Vincent
We adapt Lee et al.'s (2018) span-based entity coreference model to the task of end-to-end discourse deixis resolution in dialogue, specifically by proposing extensions to their model that exploit task-specific characteristics. The resulting model, dd-utt, achieves state-of-the-art results on the four datasets in the CODI-CRAC 2021 shared task.
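For readers unfamiliar with the underlying architecture: Lee et al.'s (2018) model assigns each candidate span a mention score and each (anaphor, antecedent) pair a pairwise score, sums them, and lets a zero-scored dummy antecedent leave a span unresolved. A minimal PyTorch sketch of that scoring follows; layer sizes are illustrative, and the task-specific extensions in dd-utt are not shown.

    import torch
    import torch.nn as nn

    class SpanPairScorer(nn.Module):
        def __init__(self, span_dim, hidden=150):
            super().__init__()
            # mention: how likely a span is a mention; pair: pairwise score.
            self.mention = nn.Sequential(
                nn.Linear(span_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            self.pair = nn.Sequential(
                nn.Linear(3 * span_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

        def forward(self, anaphor, antecedents):
            # anaphor: (span_dim,); antecedents: (k, span_dim)
            sm_i = self.mention(anaphor)                   # (1,)
            sm_j = self.mention(antecedents).squeeze(-1)   # (k,)
            pair_in = torch.cat([antecedents,
                                 anaphor.expand_as(antecedents),
                                 antecedents * anaphor], dim=-1)
            sa = self.pair(pair_in).squeeze(-1)            # (k,)
            scores = sm_i + sm_j + sa
            # Prepend score 0 for the dummy "no antecedent" option.
            return torch.cat([torch.zeros(1), scores])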
Machine Learning for Entity Coreference Resolution: A Retrospective Look at Two Decades of Research
Ng, Vincent (University of Texas at Dallas)
Though extensively investigated since the 1960s, entity coreference resolution, a core task in natural language understanding, is far from being solved. Nevertheless, significant progress has been made on learning-based coreference research since its inception two decades ago. This paper provides an overview of the major milestones made in learning-based coreference research and discusses a hard entity coreference task, the Winograd Schema Challenge, which has recently received a lot of attention in the AI community.
Joint Inference over a Lightly Supervised Information Extraction Pipeline: Towards Event Coreference Resolution for Resource-Scarce Languages
Chen, Chen (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
We address two key challenges in end-to-end event coreference resolution research: (1) the error propagation problem, where an event coreference resolver has to take as input the noisy outputs produced by its upstream components in the standard information extraction (IE) pipeline; and (2) the data annotation bottleneck, where manually annotating data for all the components in the IE pipeline is prohibitively expensive. The bottleneck is especially severe for the vast majority of the world's natural languages, for which such annotated resources are not readily available. To address these problems, we propose to perform joint inference over a lightly supervised IE pipeline in which all the models are trained using either active learning or unsupervised learning. Using our approach, only 25% of the training sentences in the Chinese portion of the ACE 2005 corpus need to be annotated with entity and event mentions for our event coreference resolver to surpass its fully supervised counterpart in performance.
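The active learning side of the pipeline can be illustrated with a standard uncertainty-sampling step: rather than annotating every sentence, annotate only those the current model is least sure about. The margin criterion below is one common choice, shown for illustration; the paper's exact selection strategy may differ.

    import numpy as np

    def select_for_annotation(probs, budget):
        # probs: (n_sentences, n_classes) model posteriors.
        # A small gap between the top two classes signals an uncertain,
        # and hence informative, sentence to send to the annotator.
        sorted_p = np.sort(probs, axis=1)
        margin = sorted_p[:, -1] - sorted_p[:, -2]
        return np.argsort(margin)[:budget]  # smallest margins first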
Chinese Common Noun Phrase Resolution: An Unsupervised Probabilistic Model Rivaling Supervised Resolvers
Chen, Chen (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
Pronoun resolution and common noun phrase resolution are the two most challenging subtasks of coreference resolution. While a lot of work has focused on pronoun resolution, common noun phrase resolution has almost always been tackled in the context of the larger coreference resolution task. In fact, to our knowledge, there has been no attempt to address Chinese common noun phrase resolution as a standalone task. In this paper, we propose a generative model for unsupervised Chinese common noun phrase resolution that not only allows easy incorporation of linguistic constraints on coreference but also performs joint resolution and anaphoricity determination. When evaluated on the Chinese portion of the OntoNotes 5.0 corpus, our model rivals its supervised counterpart in performance.
Chinese Zero Pronoun Resolution: An Unsupervised Approach Combining Ranking and Integer Linear Programming
Chen, Chen (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
State-of-the-art approaches to Chinese zero pronoun resolution are supervised, requiring training documents with manually resolved zero pronouns. To eliminate the reliance on annotated data, we propose an unsupervised approach to this task. Underlying our approach is the novel idea of employing a model trained on manually resolved overt pronouns to resolve zero pronouns. Experimental results on the OntoNotes 5.0 corpus are encouraging: our unsupervised model surpasses its supervised counterparts in performance.
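The combination of ranking and integer linear programming (ILP) can be sketched as follows: a ranking model (here, one trained on overt pronouns) supplies a score for each zero pronoun/candidate antecedent pair, and ILP inference picks exactly one antecedent per zero pronoun while honoring hard linguistic constraints. The sketch uses the PuLP library; the constraint set and data layout are our own illustration, not the paper's formulation.

    from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

    def resolve_with_ilp(scores, forbidden=()):
        # scores: {pronoun: {candidate: ranker_score}};
        # forbidden: (pronoun, candidate) pairs ruled out by constraints.
        prob = LpProblem("zp_resolution", LpMaximize)
        x = {(p, c): LpVariable(f"x_{p}_{c}", cat=LpBinary)
             for p, cands in scores.items() for c in cands}
        prob += lpSum(scores[p][c] * x[p, c] for p, c in x)  # objective
        for p, cands in scores.items():
            prob += lpSum(x[p, c] for c in cands) == 1  # one antecedent each
        for p, c in forbidden:
            if (p, c) in x:
                prob += x[p, c] == 0  # hard linguistic constraint
        prob.solve()
        return {p: max(cands, key=lambda c: x[p, c].value())
                for p, cands in scores.items()}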
Clustering Documents Along Multiple Dimensions
Dasgupta, Sajib (IBM Almaden Research Center) | Golden, Richard M. (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
Traditional clustering algorithms are designed to search for a single clustering solution despite the fact that multiple alternative solutions might exist for a particular dataset. For example, a set of news articles might be clustered by topic or by the author's gender or age. Similarly, book reviews might be clustered by sentiment or comprehensiveness. In this paper, we address the problem of identifying alternative clustering solutions by developing a Probabilistic Multi-Clustering (PMC) model that discovers multiple, maximally different clusterings of a data sample. Empirical results on six datasets representative of real-world applications show that our PMC model exhibits superior performance to comparable multi-clustering algorithms.
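The notion of "maximally different" clusterings can be made concrete with a cheap stand-in: measure how similar two clusterings are with the adjusted Rand index (ARI) and prefer candidates least similar to an existing solution. PMC discovers the multiple clusterings and their differences jointly within one probabilistic model; the restart-and-filter loop below only illustrates the objective, not the model itself.

    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    def most_different_alternative(X, base_labels, k, n_restarts=20):
        # Among several candidate clusterings, keep the one least similar
        # to the clustering we already have (lower ARI = more different).
        best, best_ari = None, float("inf")
        for seed in range(n_restarts):
            labels = KMeans(n_clusters=k, n_init=1,
                            random_state=seed).fit_predict(X)
            ari = adjusted_rand_score(base_labels, labels)
            if ari < best_ari:
                best, best_ari = labels, ari
        return best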