Lyu, Yajuan
FactGen: Faithful Text Generation by Factuality-aware Pre-training and Contrastive Ranking Fine-tuning
Lan, ZhiBin (Xiamen University), Li, Wei, Su, Jinsong, Xiao, Xinyan, Liu, Jiachen, Wu, Wenhao, Lyu, Yajuan
Conditional text generation aims to produce a fluent and coherent target text that is faithful to the source text. Although pre-trained models have achieved promising results, they still suffer from the crucial problem of factual inconsistency. To deal with this issue, we propose FactGen, a factuality-aware pre-training and fine-tuning framework that takes factuality into account at both training stages. Specifically, at the pre-training stage, we utilize a natural language inference model to construct target texts that are entailed by the source texts, resulting in a more factually consistent pre-training objective. During the fine-tuning stage, we further introduce a contrastive ranking loss that encourages the model to assign higher probability to factually consistent text. Extensive experiments on three conditional text generation tasks demonstrate the effectiveness and generality of our training framework.
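As a rough illustration of the fine-tuning objective, the sketch below shows one common form of contrastive ranking loss over candidate target texts ordered by factual consistency. The margin form, the length-normalized scoring, and all names are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def contrastive_ranking_loss(candidate_scores: torch.Tensor, margin: float = 0.01) -> torch.Tensor:
    """Pairwise margin loss over candidate target texts.

    `candidate_scores` holds length-normalized log-likelihoods the model assigns
    to candidates sorted from most to least factually consistent (an assumed
    setup); the loss encourages more consistent candidates to score higher.
    """
    loss = candidate_scores.new_zeros(())
    n = candidate_scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # candidate i is ranked as more factually consistent than candidate j,
            # so its score should exceed j's by a rank-dependent margin
            loss = loss + torch.relu(candidate_scores[j] - candidate_scores[i] + margin * (j - i))
    return loss
```

During fine-tuning, a term of this kind would typically be added to the usual negative log-likelihood loss with a weighting coefficient.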
$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference
Xu, Benfeng, Wang, Quan, Mao, Zhendong, Lyu, Yajuan, She, Qiaoqiao, Zhang, Yongdong
In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing way of using LLMs. In this paper, we first point out a practical predicament of this typical usage: it cannot scale up with training data because of the context length restriction. In addition, existing work has shown that ICL suffers from various biases and requires delicate calibration. To address both challenges, we advocate a simple and effective solution, $k$NN Prompting, which first queries the LLM with training data to obtain distributed representations, and then predicts test instances simply by referring to their nearest neighbors. We conduct comprehensive experiments to demonstrate its two-fold superiority: 1) Calibration-Free: $k$NN Prompting does not directly align the LLM output distribution with the task-specific label space; instead, it leverages that distribution to align test and training instances. It significantly outperforms state-of-the-art calibration-based methods under comparable few-shot scenarios. 2) Beyond-Context: $k$NN Prompting can further scale up effectively with as much training data as is available, continually bringing substantial improvements. The scaling trend holds across shot counts spanning 2 to 1024 (ten powers of two) as well as across LLM scales ranging from 0.8B to 30B. It successfully bridges data scaling into model scaling and opens new potential for the gradient-free paradigm of LLM deployment. Code is publicly available.
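The following sketch illustrates the core inference idea under simple assumptions: each instance is represented by the output probability distribution the LLM returns for its prompt, test instances are matched to training instances by KL divergence, and the majority label among the k nearest neighbors is predicted. The distance measure, voting rule, and all names are illustrative, not the paper's exact procedure.

```python
import numpy as np

def knn_prompting_predict(train_dists, train_labels, test_dist, k=3):
    """Predict a test instance's label from its nearest neighbors in LLM-output space.

    `train_dists` and `test_dist` are probability distributions the LLM produces
    when prompted with each instance (assumed representation); distance is KL
    divergence and the majority label among the k nearest training instances wins.
    """
    eps = 1e-12
    p = np.asarray(test_dist) + eps
    # KL(test || train) for every training instance
    distances = [np.sum(p * np.log(p / (np.asarray(q) + eps))) for q in train_dists]
    nearest = np.argsort(distances)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)
```

Because only the stored output distributions are consulted at test time, the number of usable training instances is no longer bounded by the prompt's context length.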
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Li, Wei, Xu, Xue, Xiao, Xinyan, Liu, Jiachen, Yang, Hu, Li, Guohao, Wang, Zhanpeng, Feng, Zhifan, She, Qiaoqiao, Lyu, Yajuan, Wu, Hua
Diffusion generative models have recently greatly improved the power of text-conditioned image generation. Existing image generation models mainly comprise text-conditional diffusion models and cross-modal guided diffusion models, which are good at simple-scene and complex-scene image generation, respectively. In this work, we propose a simple yet effective approach, namely UPainting, to unify simple and complex scene image generation, as shown in Figure 1. Based on architecture improvements and diverse guidance schedules, UPainting effectively integrates cross-modal guidance from a pretrained image-text matching model into a text-conditional diffusion model that utilizes a pretrained Transformer language model as the text encoder. Our key finding is that combining the power of a large-scale Transformer language model in understanding language with that of an image-text matching model in capturing cross-modal semantics and style effectively improves the sample fidelity and image-text alignment of image generation. In this way, UPainting has a more general image generation capability and can generate images of both simple and complex scenes more effectively. To comprehensively compare text-to-image models, we further create a more general benchmark, UniBench, with well-written Chinese and English prompts covering both simple and complex scenes. We compare UPainting with recent models and find that it greatly outperforms them in terms of caption similarity and image fidelity in both simple and complex scenes. The UPainting project page is at \url{https://upainting.github.io/}.
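A minimal sketch of how cross-modal guidance can be folded into a text-conditional denoising step is given below, assuming standard classifier-free guidance plus a gradient from an image-text matching score. `unet`, `matcher`, and both scale values are placeholders, not the paper's exact components or settings.

```python
import torch

def guided_noise_prediction(unet, matcher, x_t, t, text_emb,
                            cfg_scale=7.5, match_scale=0.5):
    """One denoising step's noise estimate combining two guidance signals.

    Classifier-free guidance comes from the text-conditional diffusion model;
    cross-modal guidance comes from the gradient of an image-text matching
    score (e.g. a CLIP-like similarity), following the usual classifier-
    guidance sign convention. All names are stand-ins for the real networks.
    """
    # classifier-free guidance: conditional vs. unconditional noise estimates
    eps_cond = unet(x_t, t, text_emb)
    eps_uncond = unet(x_t, t, None)
    eps = eps_uncond + cfg_scale * (eps_cond - eps_uncond)

    # cross-modal guidance: push the sample toward higher image-text similarity
    with torch.enable_grad():
        x = x_t.detach().requires_grad_(True)
        similarity = matcher(x, text_emb).sum()
        grad = torch.autograd.grad(similarity, x)[0]
    return eps - match_scale * grad
```

In practice the matching-model gradient is usually rescaled per timestep; the fixed `match_scale` here is only for readability.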
Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
Wang, Quan, Dai, Songtai, Xu, Benfeng, Lyu, Yajuan, Zhu, Yong, Wu, Hua, Wang, Haifeng
Pre-trained language models (PLMs), such as BERT and GPT, have revolutionized the field of NLP, not only in the general domain but also in the biomedical domain. Most prior efforts in building biomedical PLMs have resorted simply to domain adaptation and focused mainly on English. In this work we introduce eHealth, a Chinese biomedical PLM built with a new pre-training framework. This framework trains eHealth as a discriminator through both token-level and sequence-level discrimination. The former detects input tokens corrupted by a generator and selects their original signals from plausible candidates, while the latter further distinguishes corruptions of the same original sequence from those of other sequences. As such, eHealth can learn language semantics at both the token and sequence levels. Extensive experiments on 11 Chinese biomedical language understanding tasks of various forms verify the effectiveness and superiority of our approach. The pre-trained model is available to the public at \url{https://github.com/PaddlePaddle/Research/tree/master/KG/eHealth}, and the code will also be released later.
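The sketch below illustrates how token-level and sequence-level discrimination losses of this kind might be combined, assuming an ELECTRA-style replaced-token detection objective and an InfoNCE-style contrast between corrupted views of the same sequence. All names, shapes, and the exact loss forms are assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def multi_level_discrimination_loss(token_logits, corrupted_mask,
                                    seq_embs, seq_ids, temperature=0.1):
    """Illustrative combination of the two discrimination objectives.

    token_logits:   [B, T] per-token predictions of whether a token was replaced
                    by the generator (token-level discrimination).
    corrupted_mask: [B, T] 1 where the generator corrupted the input token.
    seq_embs:       [N, d] embeddings of corrupted views of original sequences.
    seq_ids:        [N]    id of the original sequence each view came from.
    """
    # token-level: detect which input tokens the generator corrupted
    token_loss = F.binary_cross_entropy_with_logits(token_logits, corrupted_mask.float())

    # sequence-level: views of the same original sequence attract,
    # views of different sequences repel (contrastive)
    z = F.normalize(seq_embs, dim=-1)
    sims = z @ z.T / temperature
    sims.fill_diagonal_(-1e9)                      # exclude self-similarity
    positives = (seq_ids.unsqueeze(0) == seq_ids.unsqueeze(1)).float()
    positives.fill_diagonal_(0)
    log_prob = F.log_softmax(sims, dim=-1)
    seq_loss = -(positives * log_prob).sum(-1) / positives.sum(-1).clamp(min=1)

    return token_loss + seq_loss.mean()
```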
Link Prediction on N-ary Relational Facts: A Graph-based Approach
Wang, Quan, Wang, Haifeng, Lyu, Yajuan, Zhu, Yong
Link prediction on knowledge graphs (KGs) is a key research topic. Previous work has mainly focused on binary relations, paying less attention to higher-arity relations even though they are ubiquitous in real-world KGs. This paper considers link prediction on n-ary relational facts and proposes a graph-based approach to this task. The key to our approach is to represent the n-ary structure of a fact as a small heterogeneous graph and to model this graph with edge-biased fully-connected attention. The fully-connected attention captures universal inter-vertex interactions, while edge-aware attentive biases specifically encode the graph structure and its heterogeneity. In this fashion, our approach fully models global and local dependencies in each n-ary fact and hence can more effectively capture the associations therein. Extensive evaluation verifies the effectiveness and superiority of our approach: it performs substantially and consistently better than the current state of the art across a variety of n-ary relational benchmarks. Our code is publicly available.
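The sketch below shows one way edge-biased fully-connected attention over a fact's vertex graph can be realized, with per-edge-type biases added on the key and value sides; it is an assumption-laden illustration (single head, no masking), not the paper's exact parameterization.

```python
import torch
import torch.nn.functional as F

def edge_biased_attention(q, k, v, edge_type, edge_key_bias, edge_value_bias):
    """Fully-connected attention over the vertices of one n-ary fact graph.

    q, k, v:        [n, d] vertex queries/keys/values.
    edge_type:      [n, n] integer matrix giving the (heterogeneous) edge type
                    between every vertex pair.
    edge_key_bias / edge_value_bias: nn.Embedding tables mapping edge types to
                    [d]-dimensional biases added on the key and value sides.
    """
    d = q.size(-1)
    k_bias = edge_key_bias(edge_type)    # [n, n, d]
    v_bias = edge_value_bias(edge_type)  # [n, n, d]

    # attention logits: standard dot product plus an edge-dependent term
    logits = (q @ k.T + torch.einsum('id,ijd->ij', q, k_bias)) / d ** 0.5
    attn = F.softmax(logits, dim=-1)     # every vertex attends to every other

    # values also receive an edge-dependent bias before aggregation
    return attn @ v + torch.einsum('ij,ijd->id', attn, v_bias)
```

The edge-type embeddings are what let a single fully-connected attention layer distinguish, say, the subject-relation edge from a relation-attribute edge while still letting every vertex interact with every other.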
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Xu, Benfeng, Wang, Quan, Lyu, Yajuan, Zhu, Yong, Mao, Zhendong
In this work, we formulate entity structure as distinctive dependencies between mention pairs. We then propose SSAN, which incorporates these structural dependencies within the standard self-attention mechanism and throughout the overall encoding stage. Specifically, we design two alternative transformation modules inside each self-attention building block to produce attentive biases that adaptively regularize its attention flow. Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN: it significantly outperforms competitive baselines, achieving new state-of-the-art results on three popular document-level relation extraction datasets. We further provide ablation studies and visualizations to show how the entity structure guides the model toward better relation extraction. Our code is publicly available.
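A small sketch of the attentive-bias idea follows: each token pair carries a structural dependency category (e.g. same-entity, same-sentence co-occurrence, no dependency), and a per-category transformation of the query yields a scalar bias added to that pair's attention logit. The category set, parameter sharing, and the simple linear transformation here are assumptions rather than the paper's exact modules.

```python
import torch
import torch.nn as nn

class StructuredSelfAttentionBias(nn.Module):
    """Produces entity-structure-aware biases for self-attention logits.

    For query token i and key token j with structural category structure[i, j],
    the bias is a dot product between q[i] and a learned per-category vector;
    the result is added to q·kᵀ/√d before the softmax in each attention layer.
    """
    def __init__(self, num_categories: int, head_dim: int):
        super().__init__()
        # one projection vector per structural dependency category
        self.proj = nn.Parameter(torch.zeros(num_categories, head_dim))

    def forward(self, q: torch.Tensor, structure: torch.Tensor) -> torch.Tensor:
        # q: [seq, head_dim]; structure: [seq, seq] integer category ids
        per_category = q @ self.proj.T                              # [seq, num_categories]
        idx = torch.arange(q.size(0), device=q.device).unsqueeze(1)  # [seq, 1]
        bias = per_category[idx, structure]                          # [seq, seq]
        return bias  # add to raw attention logits before softmax
```

Because the bias is recomputed from the query at every layer, the structural prior is injected "throughout" the encoder rather than only at the input.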