AITopics | Wang, Zhenhailong


Wang, Zhenhailong

Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks

Wang, Zhenhailong, Pan, Xiaoman, Yu, Dian, Yu, Dong, Chen, Jianshu, Ji, Heng

arXiv.org Artificial Intelligence, May-22-2023

Although large language models have achieved impressive zero-shot ability, the huge model size generally incurs high cost. Recently, semi-parametric language models, which augment a smaller language model with an external retriever, have demonstrated promising language modeling capabilities. However, it remains unclear whether such semi-parametric language models can perform competitively with their fully-parametric counterparts on zero-shot generalization to downstream tasks. In this work, we introduce Zemi, a zero-shot semi-parametric language model. To the best of our knowledge, this is the first semi-parametric language model that can demonstrate strong zero-shot performance on a wide range of held-out unseen tasks. We train Zemi with a novel semi-parametric multitask prompted training paradigm, which shows significant improvement compared with the parametric multitask training proposed by T0. Specifically, we augment the multitask training and zero-shot evaluation with retrieval from a large-scale task-agnostic unlabeled corpus. In order to incorporate multiple potentially noisy retrieved augmentations, we further propose a novel augmentation fusion module leveraging perceiver resampler and gated cross-attention. Notably, our proposed Zemi_LARGE outperforms T0-3B by 16% on all seven evaluation tasks while being 3.9x smaller in model size.

artificial intelligence, machine learning, natural language, (15 more...)


2210.00185

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Media > Film (0.93)
Law (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Rethinking Task Sampling for Few-shot Vision-Language Transfer Learning

Wang, Zhenhailong, Yu, Hang, Li, Manling, Zhao, Han, Ji, Heng

arXiv.org Artificial Intelligence, Jul-15-2022

Despite achieving state-of-the-art zero-shot performance, existing vision-language models still fall short of few-shot transfer ability on domain-specific problems. Classical fine-tuning often fails to prevent highly expressive models from exploiting spurious correlations. Although model-agnostic meta-learning (MAML) is a natural alternative for few-shot transfer learning, the expensive computation due to implicit second-order optimization limits its use on large-scale vision-language models such as CLIP. While much literature has been devoted to exploring alternative optimization strategies, we identify another essential aspect of effective few-shot transfer learning, task sampling, which was previously viewed only as part of data pre-processing in MAML. To show the impact of task sampling, we propose a simple algorithm, Model-Agnostic Multitask Fine-tuning (MAMF), which differs from classical fine-tuning only in uniformly sampling multiple tasks. Despite its simplicity, we show that MAMF consistently outperforms classical fine-tuning on five few-shot vision-language classification tasks. We further show that the effectiveness of the bi-level optimization in MAML is highly sensitive to the zero-shot performance of a task in the context of few-shot vision-language classification. The goal of this paper is to provide new insights into what makes few-shot learning work, and to encourage more research into better task sampling strategies.
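As a rough illustration of what "uniformly sampling multiple tasks" could look like, the sketch below draws N-way K-shot tasks uniformly at random from a labeled pool. This is not the authors' implementation: the pool layout (a dict from class label to examples), the function names `sample_task` and `sample_tasks`, and the parameters `n_way` and `k_shot` are all assumptions made for the example.

```python
import random


def sample_task(pool, n_way, k_shot, rng):
    """Draw one N-way K-shot task uniformly at random.

    `pool` maps a class label to a list of examples (hypothetical layout).
    """
    # Pick n_way distinct classes uniformly, then k_shot examples per class.
    classes = rng.sample(sorted(pool), n_way)
    return {c: rng.sample(pool[c], k_shot) for c in classes}


def sample_tasks(pool, num_tasks, n_way, k_shot, seed=0):
    """Draw several independent tasks with a seeded generator."""
    rng = random.Random(seed)
    return [sample_task(pool, n_way, k_shot, rng) for _ in range(num_tasks)]


if __name__ == "__main__":
    # Toy pool: 6 classes with 10 placeholder examples each.
    pool = {f"class_{i}": [f"img_{i}_{j}" for j in range(10)] for i in range(6)}
    for task in sample_tasks(pool, num_tasks=3, n_way=2, k_shot=4):
        print({label: len(examples) for label, examples in task.items()})
```

Under this reading, a fine-tuning loop would iterate over the sampled tasks rather than over one fixed training set; how the model is updated on each task is left out here because the abstract does not specify it.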

artificial intelligence, dataset, machine learning, (17 more...)


2203.04904

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.82)


Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification

Wang, Zhenhailong, Ji, Heng

arXiv.org Artificial Intelligence, Dec-23-2021

State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks. However, current approaches are limited to small closed vocabularies, which are far from sufficient for natural communication. In addition, most of the high-performing approaches require data from invasive devices (e.g., ECoG). In this paper, we extend the problem to open vocabulary Electroencephalography (EEG)-To-Text Sequence-To-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks. We hypothesize that the human brain functions as a special text encoder and propose a novel framework leveraging pre-trained language models (e.g., BART). Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines. Furthermore, we show that our proposed model can handle data from various subjects and sources, showing great potential for a high-performance open vocabulary brain-to-text system once sufficient data is available.

eeg feature, machine learning, natural language, (25 more...)


2112.0269

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)


NewsClaims: A New Benchmark for Claim Detection from News with Background Knowledge

Reddy, Revanth Gangi, Chinthakindi, Sai, Wang, Zhenhailong, Fung, Yi R., Conger, Kathryn S., Elsayed, Ahmed S., Palmer, Martha, Ji, Heng

arXiv.org Artificial Intelligence, Dec-15-2021

Claim detection and verification are crucial for news understanding and have emerged as promising technologies for mitigating misinformation in news. However, most existing work focuses on analysis of claim sentences while overlooking crucial background attributes, such as the claimer, claim objects, and other knowledge connected to the claim. In this work, we present NewsClaims, a new benchmark for knowledge-aware claim detection in the news domain. We re-define the claim detection problem to include extraction of additional background attributes related to the claim and release 529 claims annotated over 103 news articles. In addition, NewsClaims aims to benchmark claim detection systems in emerging scenarios, comprising unseen topics with little or no training data. Finally, we provide a comprehensive evaluation of various zero-shot and prompt-based baselines for this new benchmark.

machine learning, natural language, news article, (18 more...)


2112.08544

Country: North America > United States > Colorado (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.71)
Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.38)


Future is not One-dimensional: Graph Modeling based Complex Event Schema Induction for Event Prediction

Li, Manling, Li, Sha, Wang, Zhenhailong, Huang, Lifu, Cho, Kyunghyun, Ji, Heng, Han, Jiawei, Voss, Clare

arXiv.org Artificial Intelligence, Apr-15-2021

Event schemas encode knowledge of stereotypical structures of events and their connections. As events unfold, schemas are crucial as scaffolding. Previous work on event schema induction focuses either on atomic events or on linear temporal event sequences, ignoring the interplay between events via arguments and argument relations. We introduce the concept of Temporal Complex Event Schema: a graph-based schema representation that encompasses events, arguments, temporal connections and argument relations. Additionally, we propose a Temporal Event Graph Model that models the emergence of event instances following the temporal complex event schema. To build and evaluate such schemas, we release a new schema learning corpus containing 6,399 documents accompanied by event graphs, and manually constructed gold schemas. Intrinsic evaluation by schema matching and instance graph perplexity proves the superior quality of our probabilistic graph schema library compared to linear representations. Extrinsic evaluation on schema-guided event prediction further demonstrates the predictive power of our event graph model, significantly surpassing human schemas and baselines by more than 17.8% on HITS@1.
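For readers unfamiliar with the HITS@1 metric mentioned in the abstract above, here is a minimal sketch of the standard HITS@k definition: the fraction of queries whose gold answer appears among the top k ranked candidates. The function name `hits_at_k` and the input layout are assumptions made for illustration; the paper's own evaluation code may differ.

```python
def hits_at_k(ranked_predictions, gold, k=1):
    """Fraction of queries whose gold answer is in the top-k candidates.

    ranked_predictions: one list per query, best candidate first.
    gold: the correct answer for each query, in the same order.
    """
    if len(ranked_predictions) != len(gold):
        raise ValueError("need exactly one gold answer per query")
    if not gold:
        return 0.0
    hits = sum(
        1 for ranked, answer in zip(ranked_predictions, gold)
        if answer in ranked[:k]
    )
    return hits / len(gold)


if __name__ == "__main__":
    # Hypothetical event-type predictions for two queries.
    predictions = [["Attack", "Transport"], ["Arrest", "Injure"]]
    answers = ["Attack", "Injure"]
    print(hits_at_k(predictions, answers, k=1))
    print(hits_at_k(predictions, answers, k=2))
```

With k=1 only the single top-ranked candidate counts, which is why HITS@1 is the strictest member of the family.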

artificial intelligence, graph, neural network, (20 more...)


2104.06344

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.69)
