Wei, Zhongyu
Hierarchical Reinforcement Learning for Automatic Disease Diagnosis
Zhong, Cheng, Liao, Kangenbei, Chen, Wei, Liu, Qianlong, Peng, Baolin, Huang, Xuanjing, Peng, Jiajie, Wei, Zhongyu
Motivation: Disease-diagnosis-oriented dialogue systems model the interactive consultation procedure as a Markov Decision Process, and reinforcement learning algorithms are used to solve the problem. Existing approaches usually employ a flat policy structure that treats all symptoms and diseases equally when making actions. This strategy works well in simple scenarios where the action space is small; however, its efficiency is challenged in real environments. Inspired by the offline consultation process, we propose to integrate a two-level hierarchical policy structure into the dialogue system for policy learning. The high-level policy consists of a master model that is responsible for triggering a low-level model; the low-level policy consists of several symptom checkers and a disease classifier. The proposed policy structure is capable of dealing with diagnosis problems involving large numbers of diseases and symptoms. Results: Experimental results on three real-world datasets and a synthetic dataset demonstrate that our hierarchical framework achieves higher accuracy and symptom recall in disease diagnosis compared with existing systems. We construct a benchmark, including datasets and implementations of existing algorithms, to encourage follow-up research. Availability: The code and data are available from https://github.com/FudanDISC/DISCOpen-MedBox-DialoDiagnosis Contact: 21210980124@m.fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
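To make the two-level policy concrete, here is a minimal Python sketch, assuming a master policy that triggers one symptom-checker worker at a time; all class and method names are illustrative stand-ins, not the paper's implementation.

```python
# Minimal sketch of the two-level policy: a master policy triggers one
# low-level worker (symptom checker) at a time; names are hypothetical.

class SymptomChecker:
    """Low-level worker: inquires about symptoms within one disease group."""
    def __init__(self, symptoms):
        self.symptoms = symptoms

    def act(self, known):
        for s in self.symptoms:            # ask the first unexamined symptom
            if s not in known:
                return ("inquire", s)
        return ("pass", None)              # nothing left to ask in this group

class MasterPolicy:
    """High-level policy: decides which worker (or the classifier) to run."""
    def __init__(self, workers):
        self.workers = workers

    def act(self, known):
        # Stand-in for a learned policy: prefer the symptom group that
        # overlaps most with the symptoms confirmed so far.
        overlap = [sum(s in known for s in w.symptoms) for w in self.workers]
        return self.workers[overlap.index(max(overlap))]

master = MasterPolicy([SymptomChecker(["cough", "fever"]),
                       SymptomChecker(["rash", "itching"])])
print(master.act({"cough"}).act({"cough"}))   # ('inquire', 'fever')
```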
DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning
Chen, Wei, Wang, Qiushi, Long, Zefei, Zhang, Xianyin, Lu, Zhongtian, Li, Bingxuan, Wang, Siyuan, Xu, Jiarong, Bai, Xiang, Huang, Xuanjing, Wei, Zhongyu
The financial industry presents unique challenges and opportunities for Natural Language Processing (NLP) models (Huang et al., 2020). Traditional financial NLP models have made progress in various tasks such as news sentiment analysis (Araci, 2019), financial event extraction (Zheng et al., 2019; Yang et al., 2019), financial report generation (Chapman et al., 2022), stock price prediction (Chen et al., 2018) and financial text summarization (La Quatra and Cagliero, 2020). In this paper, we propose a comprehensive approach to build Chinese financial LLMs and present DISC-FinLLM. Our method aims to enhance general LLMs by equipping them with the skills to address typical needs for financial text generation and understanding, meaningful multi-turn conversations on financial topics, and plugin functionality to support financial modeling and knowledge-enhanced services.
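As a rough illustration of the "multiple experts" idea in the title, the sketch below routes a financial query to one of several fine-tuned experts; the expert names and the keyword router are assumptions for illustration, not DISC-FinLLM's actual components.

```python
# Hedged sketch of "multiple experts" dispatch: expert names and the keyword
# router are hypothetical, not DISC-FinLLM's actual components.

EXPERTS = {
    "consulting": lambda q: f"[consulting expert] {q}",
    "nlp_task":   lambda q: f"[financial NLP task expert] {q}",
    "computing":  lambda q: f"[tool-calling expert] {q}",
    "retrieval":  lambda q: f"[knowledge-retrieval expert] {q}",
}

def route(query: str) -> str:
    """Toy keyword router; a real system would learn this dispatch."""
    q = query.lower()
    if any(k in q for k in ("calculate", "ratio", "volatility")):
        return "computing"
    if any(k in q for k in ("extract", "classify", "summarize")):
        return "nlp_task"
    if any(k in q for k in ("regulation", "filing", "report")):
        return "retrieval"
    return "consulting"

query = "Calculate the 30-day volatility of this stock."
print(EXPERTS[route(query)](query))   # dispatched to the tool-calling expert
```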
Valley: Video Assistant with Large Language model Enhanced abilitY
Luo, Ruipu, Zhao, Ziwang, Yang, Min, Dong, Junwei, Li, Da, Lu, Pengcheng, Wang, Tao, Hu, Linmei, Qiu, Minghui, Wei, Zhongyu
Large language models (LLMs), with their remarkable conversational capabilities, have demonstrated impressive performance across various applications and have emerged as formidable AI assistants. This raises an intuitive question: can we harness the power of LLMs to build multimodal AI assistants for visual applications? Recently, several multi-modal models have been developed for this purpose. They typically pre-train an adaptation module to align the semantics of the vision encoder and the language model, followed by fine-tuning on instruction-following data. However, despite the success of this pipeline in image and language understanding, its effectiveness in joint video and language understanding has not been widely explored. In this paper, we aim to develop a novel multi-modal foundation model capable of comprehending video, image, and language within a general framework. To achieve this goal, we introduce Valley, a Video Assistant with Large Language model Enhanced abilitY. Valley consists of an LLM, a temporal modeling module, a visual encoder, and a simple projection module designed to bridge the visual and textual modalities. To empower Valley with video comprehension and instruction-following capabilities, we construct a video instruction dataset and adopt a two-stage tuning procedure to train it. Specifically, we employ ChatGPT to facilitate the construction of task-oriented conversation data covering various tasks, including multi-shot captioning, long video description, action recognition, causal relationship inference, etc. Subsequently, we adopt a pre-training-then-instruction-tuning pipeline to align the visual and textual modalities and improve the instruction-following capability of Valley. Qualitative experiments demonstrate that Valley has the potential to function as a highly effective video assistant that makes complex video understanding scenarios easy.
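The sketch below illustrates the kind of visual-to-text bridge the abstract describes, assuming frame features from a vision encoder are temporally modeled and then projected into the LLM's token-embedding space; the module names and dimensions are hypothetical.

```python
# Hypothetical sketch of the visual-to-text bridge: temporal modeling over
# frame features, then projection into the LLM embedding space.

import torch
import torch.nn as nn

class VideoToLLMBridge(nn.Module):
    def __init__(self, vis_dim=1024, llm_dim=4096):
        super().__init__()
        # Temporal modeling module: lets frame features attend to each other.
        self.temporal = nn.TransformerEncoderLayer(
            d_model=vis_dim, nhead=8, batch_first=True)
        # Simple projection module into the LLM's token-embedding space.
        self.project = nn.Linear(vis_dim, llm_dim)

    def forward(self, frame_feats):        # (batch, n_frames, vis_dim)
        return self.project(self.temporal(frame_feats))

bridge = VideoToLLMBridge()
visual_tokens = bridge(torch.randn(1, 16, 1024))  # 16 frames from the encoder
print(visual_tokens.shape)   # torch.Size([1, 16, 4096]);
                             # prepend these to the text-token embeddings
```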
DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services
Yue, Shengbin, Chen, Wei, Wang, Siyuan, Li, Bingxuan, Shen, Chenchen, Liu, Shujun, Zhou, Yuxuan, Xiao, Yao, Yun, Song, Huang, Xuanjing, Wei, Zhongyu
We propose DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services. We adopt legal syllogism prompting strategies to construct supervised fine-tuning datasets in the Chinese judicial domain and fine-tune LLMs with legal reasoning capability. We augment LLMs with a retrieval module to enhance the model's ability to access and utilize external legal knowledge. A comprehensive legal benchmark, DISC-Law-Eval, is presented to evaluate intelligent legal systems along both objective and subjective dimensions. Quantitative and qualitative results on DISC-Law-Eval demonstrate the effectiveness of our system in serving various users across diverse legal scenarios. The detailed resources are available at https://github.com/FudanDISC/DISC-LawLLM.
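A minimal sketch of the retrieval-augmentation step, assuming a toy lexical retriever and an illustrative prompt template rather than DISC-LawLLM's actual retrieval module:

```python
# Hedged sketch of retrieval augmentation: fetch legal texts relevant to a
# query and prepend them to the prompt before calling the LLM.

def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(q & set(doc.lower().split())))
    return ranked[:k]

def build_prompt(query, corpus):
    context = "\n".join(retrieve(query, corpus))
    return f"Relevant provisions:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = ["Article 1: Contracts require mutual consent.",
          "Article 2: Damages follow breach of contract.",
          "Article 3: Property transfers must be registered."]
print(build_prompt("What damages follow a breach of contract?", corpus))
```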
Breaking Down the Task: A Unit-Grained Hybrid Training Framework for Vision and Language Decision Making
Luo, Ruipu, Zhang, Jiwen, Wei, Zhongyu
Vision language decision making (VLDM) is a challenging multimodal task. The agent has to understand complex human instructions and complete compositional tasks involving environment navigation and object manipulation. However, the long action sequences involved in VLDM make the task difficult to learn. From an environment perspective, we find that task episodes can be divided into fine-grained units, each containing a navigation phase and an interaction phase. Since the environment within a unit stays unchanged, we propose a novel hybrid-training framework that enables active exploration in the environment and reduces exposure bias. The framework leverages the unit-grained configurations and is model-agnostic. Specifically, we design a Unit-Transformer (UT) with an intrinsic recurrent state that maintains a unit-scale cross-modal memory. Through extensive experiments on the TEACH benchmark, we demonstrate that our proposed framework outperforms existing state-of-the-art methods on all evaluation metrics. Overall, our work introduces a novel approach to tackling the VLDM task by breaking it down into smaller, manageable units and utilizing a hybrid-training framework. By doing so, we provide a more flexible and effective solution for multimodal decision making.
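The unit segmentation can be sketched directly, assuming each action is tagged as navigation or interaction and a unit is a navigation phase followed by an interaction phase:

```python
# Sketch of unit-grained episode segmentation under the stated assumption.

def split_into_units(episode):
    """episode: list of (action, kind) with kind in {'nav', 'interact'}."""
    units, current, seen_interact = [], [], False
    for step in episode:
        kind = step[1]
        if kind == "nav" and seen_interact:   # a new navigation phase starts
            units.append(current)             # close the finished unit
            current, seen_interact = [], False
        current.append(step)
        seen_interact |= (kind == "interact")
    if current:
        units.append(current)
    return units

demo = [("fwd", "nav"), ("left", "nav"), ("pickup", "interact"),
        ("fwd", "nav"), ("place", "interact")]
print([len(u) for u in split_into_units(demo)])   # [3, 2]: two units
```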
Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning
Wang, Siyuan, Wei, Zhongyu, Xu, Jiarong, Li, Taishan, Fan, Zhihao
Recent pre-trained language models (PLMs) equipped with foundational reasoning skills have shown remarkable performance on downstream complex tasks. However, the important skill of structure reasoning has rarely been studied; it involves modeling implicit structure information within the text and performing explicit logical reasoning over it to deduce the conclusion. This paper proposes a unified learning framework that combines explicit structure reasoning and language pre-training to endow PLMs with the structure reasoning skill. It first identifies several elementary structures within contexts to construct structured queries, then performs step-by-step reasoning along the queries to identify the answer entity. The fusion of textual semantics and structure reasoning is achieved by using contextual representations learned by PLMs to initialize the representation space of structures and performing stepwise reasoning in this semantic representation space. Experimental results on four datasets demonstrate that the proposed model achieves significant improvements on complex reasoning tasks involving diverse structures, and shows transferability to downstream tasks with limited training data as well as effectiveness for complex reasoning over the knowledge graph (KG) modality.
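As an illustrative stand-in for stepwise structure reasoning, the sketch below initializes entity embeddings (in the paper these come from PLM contextual representations) and follows a relation path by translation in the representation space, TransE-style; this is a simplified analogue, not the paper's reasoning module.

```python
# Simplified analogue of stepwise reasoning over a structured query
# (TransE-style translation stands in for the paper's reasoning module).

import numpy as np

rng = np.random.default_rng(0)
entity_emb   = {e: rng.normal(size=64) for e in ("A", "B", "C")}
relation_emb = {r: rng.normal(size=64) for r in ("r1", "r2")}

def answer(start, relation_path):
    """Follow the query one relation at a time, then return the nearest entity."""
    q = entity_emb[start].copy()
    for r in relation_path:          # one explicit reasoning step per relation
        q = q + relation_emb[r]      # translate in the representation space
    dists = {e: np.linalg.norm(q - v) for e, v in entity_emb.items()}
    return min(dists, key=dists.get)

print(answer("A", ["r1", "r2"]))     # predicted answer entity
```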
DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
Gao, Songyang, Dou, Shihan, Liu, Yan, Wang, Xiao, Zhang, Qi, Wei, Zhongyu, Ma, Jin, Shan, Ying
Adversarial training is one of the best-performing methods for improving the robustness of deep language models. However, robust models come at the cost of high time consumption, as they require multi-step gradient ascent or word substitutions to obtain adversarial samples. In addition, these generated samples are deficient in grammatical quality and semantic consistency, which impairs the effectiveness of adversarial training. To address these problems, we introduce a novel, effective procedure that instead performs adversarial training with only clean data. Our procedure, distribution shift risk minimization (DSRM), estimates the adversarial loss by perturbing the input data's probability distribution rather than their embeddings. This formulation results in a robust model that minimizes the expected global loss under adversarial attacks. Our approach requires zero adversarial samples for training and reduces time consumption by up to 70% compared to the current best-performing adversarial training methods. Experiments demonstrate that DSRM considerably improves BERT's resistance to textual adversarial attacks and achieves state-of-the-art robust accuracy on various benchmarks.
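The core idea can be sketched as distributionally robust reweighting: rather than perturbing embeddings, perturb the weight each clean example receives, using the worst-case tilt inside a KL ball (which has a closed form). The temperature below is an illustrative stand-in for the method's actual constraint handling.

```python
# Illustrative distributionally robust objective on one clean batch: the
# worst-case reweighting inside a KL ball is exponential tilting; `tau` is
# a stand-in for DSRM's actual constraint handling.

import numpy as np

def shifted_risk(per_example_losses, tau=1.0):
    """Loss under an adversarially shifted data distribution (no adv. samples)."""
    w = np.exp(per_example_losses / tau)   # tilt mass toward hard examples
    w /= w.sum()                           # keep it a valid distribution
    return float(w @ per_example_losses)

losses = np.array([0.2, 0.4, 2.5])          # per-example losses on clean data
print(shifted_risk(losses), losses.mean())  # robust objective > average loss
```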
Open Set Relation Extraction via Unknown-Aware Training
Zhao, Jun, Zhao, Xin, Zhan, Wenyu, Zhang, Qi, Gui, Tao, Wei, Zhongyu, Chen, Yunwen, Gao, Xiang, Huang, Xuanjing
The existing supervised relation extraction methods have achieved impressive performance in a closed-set setting, where the relations during both training and testing remain the same. In the more realistic open-set setting, unknown relations may appear in the test set. Due to the lack of supervision signals from unknown relations, a well-performing closed-set relation extractor can still confidently misclassify them into known relations. In this paper, we propose an unknown-aware training method that regularizes the model by dynamically synthesizing negative instances. To facilitate a compact decision boundary, "difficult" negative instances are necessary. Inspired by text adversarial attacks, we adaptively apply small but critical perturbations to original training instances, thus synthesizing negative instances that are more likely to be mistaken by the model as known relations. Experimental results show that this method achieves SOTA unknown relation detection without compromising the classification of known relations.
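A simplified sketch of synthesizing one "difficult" negative with an FGSM-style step, nudging an instance until the classifier grows more confident it belongs to a known relation it does not actually hold; epsilon, the toy classifier, and the exact attack direction are assumptions, not the paper's precise procedure.

```python
# FGSM-style synthesis of a difficult negative: make the model *more*
# confident the instance is a known relation, then train it as 'unknown'.

import torch
import torch.nn.functional as F

def synthesize_negative(emb, target_known, classifier, eps=0.05):
    """Nudge `emb` toward higher confidence on `target_known`, a known
    relation it does not actually hold, yielding a difficult negative."""
    emb = emb.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(classifier(emb), target_known)
    loss.backward()
    return (emb - eps * emb.grad.sign()).detach()  # descend loss toward target

clf = torch.nn.Linear(32, 5)                       # toy known-relation classifier
neg = synthesize_negative(torch.randn(1, 32), torch.tensor([2]), clf)
print(neg.shape)                                   # torch.Size([1, 32])
```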
RE-Matching: A Fine-Grained Semantic Matching Method for Zero-Shot Relation Extraction
Zhao, Jun, Zhan, Wenyu, Zhao, Xin, Zhang, Qi, Gui, Tao, Wei, Zhongyu, Wang, Junzhe, Peng, Minlong, Sun, Mingming
Semantic matching is a mainstream paradigm of zero-shot relation extraction, which matches a given input with a corresponding label description. The entities in the input should exactly match their hypernyms in the description, while the irrelevant contexts should be ignored during matching. However, general matching methods lack explicit modeling of this matching pattern. In this work, we propose a fine-grained semantic matching method tailored for zero-shot relation extraction. Following the above matching pattern, we decompose the sentence-level similarity score into entity and context matching scores. Due to the lack of explicit annotations for the redundant components, we design a feature distillation module to adaptively identify the relation-irrelevant features and reduce their negative impact on context matching. Experimental results show that our method achieves a higher matching F1 score and an inference speed 10 times faster than the state-of-the-art methods.
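The decomposition itself is easy to sketch: the sentence-level score splits into an entity-matching term (input entity vs. its hypernym in the description) and a context-matching term. The cosine similarities and the mixing weight below are illustrative.

```python
# Illustrative decomposed score: entity matching plus context matching.

import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_score(ent, hyp, ctx, desc_ctx, alpha=0.5):
    """alpha mixes the entity-vs-hypernym and context-vs-description scores."""
    return alpha * cos(ent, hyp) + (1 - alpha) * cos(ctx, desc_ctx)

rng = np.random.default_rng(1)
e, h, c, d = (rng.normal(size=8) for _ in range(4))
print(match_score(e, h, c, d))   # higher means the input fits the description
```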
Actively Supervised Clustering for Open Relation Extraction
Zhao, Jun, Zhang, Yongxin, Zhang, Qi, Gui, Tao, Wei, Zhongyu, Peng, Minlong, Sun, Mingming
Current clustering-based Open Relation Extraction (OpenRE) methods usually adopt a two-stage pipeline: the first stage simultaneously learns relation representations and cluster assignments; the second stage manually labels several instances and thus names the relation for each cluster. However, unsupervised objectives struggle to optimize the model to derive accurate clustering assignments, and the number of clusters has to be supplied in advance. In this paper, we present a novel setting, named actively supervised clustering for OpenRE. Our insight is that clustering learning and relation labeling can be performed alternately, providing the necessary guidance for clustering without a significant increase in human effort. The key to this setting is selecting which instances to label. Instead of using classical active labeling strategies designed for fixed known classes, we propose a new strategy that is applicable to dynamically discovering clusters of unknown relations. Experimental results show that our method is able to discover almost all relational clusters in the data and improves on SOTA methods by 10.3% and 5.2% on two datasets, respectively.
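One round of the alternating loop can be sketched as: cluster, then query a label for the instance least covered by already-labeled points. The selection rule below is a simplified stand-in for the paper's strategy.

```python
# Simplified round of actively supervised clustering: cluster, then pick the
# instance farthest from every already-labeled point for the annotator to name.

import numpy as np
from sklearn.cluster import KMeans

def active_round(X, n_clusters, labeled_idx):
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    if labeled_idx:                    # distance to the nearest labeled point
        d = np.linalg.norm(X[:, None] - X[labeled_idx][None], axis=-1).min(axis=1)
    else:
        d = np.full(len(X), np.inf)    # nothing labeled yet: any point works
    d[labeled_idx] = -np.inf           # never re-query a labeled instance
    pick = int(np.argmax(d))
    return labels, pick                # annotator names the relation of X[pick]

X = np.random.default_rng(2).normal(size=(30, 16))
labels, pick = active_round(X, n_clusters=3, labeled_idx=[0, 5])
print(pick)                            # index of the next instance to label
```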