AITopics | Liu, Quan

Collaborating Authors

Liu, Quan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

USTC-NELSLIP at SemEval-2023 Task 2: Statistical Construction and Dual Adaptation of Gazetteer for Multilingual Complex NER

Ma, Jun-Yu, Gu, Jia-Chen, Qi, Jiajun, Ling, Zhen-Hua, Liu, Quan, Zhao, Xiaoyi

arXiv.org Artificial IntelligenceMay-3-2023

This paper describes the system developed by the USTC-NELSLIP team for SemEval-2023 Task 2 Multilingual Complex Named Entity Recognition (MultiCoNER II). A method named Statistical Construction and Dual Adaptation of Gazetteer (SCDAG) is proposed for Multilingual Complex NER. The method first utilizes a statistics-based approach to construct a gazetteer. Secondly, the representations of gazetteer networks and language models are adapted by minimizing the KL divergence between them at both the sentence-level and entity-level. Finally, these two networks are then integrated for supervised named entity recognition (NER) training. The proposed method is applied to XLM-R with a gazetteer built from Wikidata, and shows great generalization ability across different tracks. Experimental results and detailed analysis verify the effectiveness of the proposed method. The official results show that our system ranked 1st on one track (Hindi) in this task.

artificial intelligence, gazetteer, natural language, (13 more...)

arXiv.org Artificial Intelligence

2305.02517

Country:

Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Semi-Supervised Contrastive Learning for Remote Sensing: Identifying Ancient Urbanization in the South Central Andes

Xu, Jiachen, Guo, Junlin, Zimmer-Dauphinee, James, Liu, Quan, Shi, Yuxuan, Asad, Zuhayr, Wilkes, D. Mitchell, VanValkenburgh, Parker, Wernke, Steven A., Huo, Yuankai

arXiv.org Artificial IntelligenceApr-15-2023

Archaeology has long faced fundamental issues of sampling and scalar representation. Traditionally, the local-to-regional-scale views of settlement patterns are produced through systematic pedestrian surveys. Recently, systematic manual survey of satellite and aerial imagery has enabled continuous distributional views of archaeological phenomena at interregional scales. However, such 'brute force' manual imagery survey methods are both time- and labor-intensive, as well as prone to inter-observer differences in sensitivity and specificity. The development of self-supervised learning methods offers a scalable learning scheme for locating archaeological features using unlabeled satellite and historical aerial images. However, archaeological features are generally only visible in a very small proportion relative to the landscape, while the modern contrastive-supervised learning approach typically yields an inferior performance on highly imbalanced datasets. In this work, we propose a framework to address this long-tail problem. As opposed to the existing contrastive learning approaches that treat the labelled and unlabeled data separately, our proposed method reforms the learning paradigm under a semi-supervised setting in order to utilize the precious annotated data (<7% in our setting). Specifically, the highly unbalanced nature of the data is employed as the prior knowledge in order to form pseudo negative pairs by ranking the similarities between unannotated image patches and annotated anchor images. In this study, we used 95,358 unlabeled images and 5,830 labelled images in order to solve the issues associated with detecting ancient buildings from a long-tailed satellite image dataset. From the results, our semi-supervised contrastive learning model achieved a promising testing balanced accuracy of 79.0%, which is a 3.8% improvement as compared to other state-of-the-art approaches.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2112.06437

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Multi-Stage Coarse-to-Fine Contrastive Learning for Conversation Intent Induction

Chu, Caiyuan, Li, Ya, Liu, Yifan, Gu, Jia-Chen, Liu, Quan, Ge, Yongxin, Hu, Guoping

arXiv.org Artificial IntelligenceMar-8-2023

Intent recognition is critical for task-oriented dialogue systems. However, for emerging domains and new services, it is difficult to accurately identify the key intent of a conversation due to time-consuming data annotation and comparatively poor model transferability. Therefore, the automatic induction of dialogue intention is very important for intelligent dialogue systems. This paper presents our solution to Track 2 of Intent Induction from Conversations for Task-Oriented Dialogue at the Eleventh Dialogue System Technology Challenge (DSTC11). The essence of intention clustering lies in distinguishing the representation of different dialogue utterances. The key to automatic intention induction is that, for any given set of new data, the sentence representation obtained by the model can be well distinguished from different labels. Therefore, we propose a multi-stage coarse-to-fine contrastive learning model training scheme including unsupervised contrastive learning pre-training, supervised contrastive learning pre-training, and fine-tuning with joint contrastive learning and clustering to obtain a better dialogue utterance representation model for the clustering task. In the released DSTC11 Track 2 evaluation results, our proposed system ranked first on both of the two subtasks of this Track.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2303.05034

Country:

Asia > China (0.29)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

WIDER & CLOSER: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition

Ma, Jun-Yu, Chen, Beiduo, Gu, Jia-Chen, Ling, Zhen-Hua, Guo, Wu, Liu, Quan, Chen, Zhigang, Liu, Cong

arXiv.org Artificial IntelligenceDec-7-2022

Zero-shot cross-lingual named entity recognition (NER) aims at transferring knowledge from annotated and rich-resource data in source languages to unlabeled and lean-resource data in target languages. Existing mainstream methods based on the teacher-student distillation framework ignore the rich and complementary information lying in the intermediate layers of pre-trained language models, and domain-invariant information is easily lost during transfer. In this study, a mixture of short-channel distillers (MSD) method is proposed to fully interact the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. Concretely, a multi-channel distillation framework is designed for sufficient information transfer by aggregating multiple distillers as a mixture. Besides, an unsupervised method adopting parallel domain adaptation is proposed to shorten the channels between the teacher and student models to preserve domain-invariant features. Experiments on four datasets across nine languages demonstrate that the proposed method achieves new state-of-the-art performance on zero-shot cross-lingual NER and shows great generalization and compatibility across languages and fields.

artificial intelligence, computational linguistic, natural language, (17 more...)

arXiv.org Artificial Intelligence

2212.03506

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Survival Prediction of Brain Cancer with Incomplete Radiology, Pathology, Genomics, and Demographic Data

Cui, Can, Liu, Han, Liu, Quan, Deng, Ruining, Asad, Zuhayr, Zhao, Yaohong WangShilin, Yang, Haichun, Landman, Bennett A., Huo, Yuankai

arXiv.org Artificial IntelligenceJul-18-2022

Integrating cross-department multi-modal data (e.g., radiological, pathological, genomic, and clinical data) is ubiquitous in brain cancer diagnosis and survival prediction. To date, such an integration is typically conducted by human physicians (and panels of experts), which can be subjective and semi-quantitative. Recent advances in multi-modal deep learning, however, have opened a door to leverage such a process to a more objective and quantitative manner. Unfortunately, the prior arts of using four modalities on brain cancer survival prediction are limited by a "complete modalities" setting (i.e., with all modalities available). Thus, there are still open questions on how to effectively predict brain cancer survival from the incomplete radiological, pathological, genomic, and demographic data (e.g., one or more modalities might not be collected for a patient). For instance, should we use both complete and incomplete data, and more importantly, how to use those data? To answer the preceding questions, we generalize the multi-modal learning on cross-department multi-modal data to a missing data setting. Our contribution is three-fold: 1) We introduce optimal multi-modal learning with missing data (MMD) pipeline with optimized hardware consumption and computational efficiency; 2) We extend multi-modal learning on radiological, pathological, genomic, and demographic data into missing data scenarios; 3) a large-scale public dataset (with 962 patients) is collected to systematically evaluate glioma tumor survival prediction using four modalities. The proposed method improved the C-index of survival prediction from 0.7624 to 0.8053.

artificial intelligence, machine learning, modality, (18 more...)

arXiv.org Artificial Intelligence

2203.04419

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Brain Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization

Cheng, Mingyuan, Liao, Xinru, Liu, Quan, Ma, Bin, Xu, Jian, Zheng, Bo

arXiv.org Machine LearningJun-2-2022

Learning individual-level treatment effect is a fundamental problem in causal inference and has received increasing attention in many areas, especially in the user growth area which concerns many internet companies. Recently, disentangled representation learning methods that decompose covariates into three latent factors, including instrumental, confounding and adjustment factors, have witnessed great success in treatment effect estimation. However, it remains an open problem how to learn the underlying disentangled factors precisely. Specifically, previous methods fail to obtain independent disentangled factors, which is a necessary condition for identifying treatment effect. In this paper, we propose Disentangled Representations for Counterfactual Regression via Mutual Information Minimization (MIM-DRCFR), which uses a multi-task learning framework to share information when learning the latent factors and incorporates MI minimization learning criteria to ensure the independence of these factors. Extensive experiments including public benchmarks and real-world industrial user growth datasets demonstrate that our method performs much better than state-of-the-art methods.

artificial intelligence, learning disentangled representation, machine learning, (2 more...)

arXiv.org Machine Learning

2206.01022

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

Add feedback

USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition

Chen, Beiduo, Ma, Jun-Yu, Qi, Jiajun, Guo, Wu, Ling, Zhen-Hua, Liu, Quan

arXiv.org Artificial IntelligenceApr-18-2022

This paper describes the system developed by the USTC-NELSLIP team for SemEval-2022 Task 11 Multilingual Complex Named Entity Recognition (MultiCoNER). We propose a gazetteer-adapted integration network (GAIN) to improve the performance of language models for recognizing complex named entities. The method first adapts the representations of gazetteer networks to those of language models by minimizing the KL divergence between them. After adaptation, these two networks are then integrated for backend supervised named entity recognition (NER) training. The proposed method is applied to several state-of-the-art Transformer-based NER models with a gazetteer built from Wikidata, and shows great generalization ability across them. The final predictions are derived from an ensemble of these trained models. Experimental results and detailed analysis verify the effectiveness of the proposed method. The official results show that our system ranked 1st on three tracks (Chinese, Code-mixed and Bangla) and 2nd on the other ten tracks in this task.

artificial intelligence, gazetteer, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2022.semeval-1.223

2203.03216

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Learning to Retrieve Entity-Aware Knowledge and Generate Responses with Copy Mechanism for Task-Oriented Dialogue Systems

Tan, Chao-Hong, Yang, Xiaoyu, Zheng, Zi'ou, Li, Tianda, Feng, Yufei, Gu, Jia-Chen, Liu, Quan, Liu, Dan, Ling, Zhen-Hua, Zhu, Xiaodan

arXiv.org Artificial IntelligenceDec-22-2020

Task-oriented conversational modeling with unstructured knowledge access, as track 1 of the 9th Dialogue System Technology Challenges (DSTC 9), requests to build a system to generate response given dialogue history and knowledge access. This challenge can be separated into three subtasks, (1) knowledge-seeking turn detection, (2) knowledge selection, and (3) knowledge-grounded response generation. We use pre-trained language models, ELECTRA and RoBERTa, as our base encoder for different subtasks. For subtask 1 and 2, the coarse-grained information like domain and entity are used to enhance knowledge usage. For subtask 3, we use a latent variable to encode dialog history and selected knowledge better and generate responses combined with copy mechanism. Meanwhile, some useful post-processing strategies are performed on the model's final output to make further knowledge usage in the generation task. As shown in released evaluation results, our proposed system ranks second under objective metrics and ranks fourth under human metrics.

artificial intelligence, knowledge, neural network, (19 more...)

arXiv.org Artificial Intelligence

2012.11937

Country:

Europe (0.46)
North America > Canada (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.70)

Add feedback

Exploring End-to-End Differentiable Natural Logic Modeling

Feng, Yufei, Zheng, Zi'ou, Liu, Quan, Greenspan, Michael, Zhu, Xiaodan

arXiv.org Artificial IntelligenceNov-8-2020

We explore end-to-end trained differentiable models that integrate natural logic with neural networks, aiming to keep the backbone of natural language reasoning based on the natural logic formalism while introducing subsymbolic vector representations and neural components. The proposed model adapts module networks to model natural logic operations, which is enhanced with a memory component to model contextual information. Experiments show that the proposed framework can effectively model monotonicity-based reasoning, compared to the baseline neural network models without built-in inductive bias for monotonicity-based reasoning. Our proposed model shows to be robust when transferred from upward to downward inference. We perform further analyses on the performance of the proposed model on aggregation, showing the effectiveness of the proposed subcomponents on helping achieve better intermediate aggregation performance.

deep learning, neural network, relation, (21 more...)

arXiv.org Artificial Intelligence

2011.04044

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.93)

Add feedback

Program Enhanced Fact Verification with Verbalization and Graph Attention Network

Yang, Xiaoyu, Nie, Feng, Feng, Yufei, Liu, Quan, Chen, Zhigang, Zhu, Xiaodan

arXiv.org Artificial IntelligenceNov-4-2020

Performing fact verification based on structured data is important for many real-life applications and is a challenging research problem, particularly when it involves both symbolic operations and informal inference based on language understanding. In this paper, we present a Program-enhanced Verbalization and Graph Attention Network (ProgVGAT) to integrate programs and execution into textual inference models. Specifically, a verbalization with program execution model is proposed to accumulate evidences that are embedded in operations over the tables. Built on that, we construct the graph attention verification networks, which are designed to fuse different sources of evidences from verbalized program execution, program structures, and the original statements and tables, to make the final verification decision. To support the above framework, we propose a program selection module optimized with a new training strategy based on margin loss, to produce more accurate programs, which is shown to be effective in enhancing the final verification results. Experimental results show that the proposed framework achieves the new state-of-the-art performance, a 74.4% accuracy, on the benchmark dataset TABFACT.

deep learning, neural network, verification, (20 more...)

arXiv.org Artificial Intelligence

2010.03084

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback