AITopics | Yang, Yan


Yang, Yan


State Value Generation with Prompt Learning and Self-Training for Low-Resource Dialogue State Tracking

Gu, Ming, Yang, Yan, Chen, Chengcai, Yu, Zhou

arXiv.org Artificial Intelligence, Jan-30-2024

Recently, low-resource dialogue state tracking (DST) has received increasing attention. Approaches that first obtain state values and then generate slot types based on those values have made great progress on this task. However, obtaining state values is still an under-studied problem. Existing extraction-based approaches cannot capture values that require an understanding of context, and they do not generalize well either. To address these issues, we propose a novel State VAlue Generation based framework (SVAG), which decomposes DST into state value generation and domain slot generation. Specifically, we propose to generate state values and use self-training to further improve state value generation. Moreover, we design an estimator that detects incomplete and incorrect generations for pseudo-labeled data selection during self-training. Experimental results on the MultiWOZ 2.1 dataset show that our method, which has fewer than 1 billion parameters, achieves state-of-the-art performance under the data ratio settings of 5%, 10%, and 25% when limited to models under 100 billion parameters. Compared to models with more than 100 billion parameters, SVAG still reaches competitive results.

large language model, machine learning, state value, (15 more...)

arXiv:2401.16862

Country:

Europe (1.00)
North America > United States (0.46)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.65)

Industry: Consumer Products & Services (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)


Debunking Free Fusion Myth: Online Multi-view Anomaly Detection with Disentangled Product-of-Experts Modeling

Wang, Hao, Cheng, Zhi-Qi, Sun, Jingdong, Yang, Xin, Wu, Xiao, Chen, Hongyang, Yang, Yan

arXiv.org Artificial Intelligence, Oct-31-2023

Multi-view or even multi-modal data is appealing yet challenging for real-world applications. Detecting anomalies in multi-view data is a prominent recent research topic. However, most of the existing methods 1) are only suitable for two views or type-specific anomalies, 2) suffer from the issue of fusion disentanglement, and 3) do not support online detection after model deployment. To address these challenges, our main ideas in this paper are three-fold: multi-view learning, disentangled representation learning, and generative modeling. To this end, we propose dPoE, a novel multi-view variational autoencoder model that involves (1) a Product-of-Experts (PoE) layer for tackling multi-view data, (2) a Total Correlation (TC) discriminator for disentangling view-common and view-specific representations, and (3) a joint loss function that ties all components together. In addition, we devise theoretical information bounds to control both view-common and view-specific representations. Extensive experiments on six real-world datasets demonstrate that the proposed dPoE markedly outperforms baselines.

artificial intelligence, disentangled product-of-expert modeling, machine learning, (2 more...)

arXiv:2310.18728

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.85)
Information Technology > Artificial Intelligence > Machine Learning (0.53)
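
As a rough illustration of the Product-of-Experts (PoE) idea named in the dPoE abstract above, the sketch below fuses per-view Gaussian posteriors by multiplying their densities, which for Gaussians gives a precision-weighted mean. It is a generic textbook construction, not the authors' implementation; the function name `poe_fuse` and the one-dimensional setting are assumptions made for brevity.

```python
from typing import List, Tuple


def poe_fuse(means: List[float], variances: List[float]) -> Tuple[float, float]:
    """Fuse 1-D Gaussian experts N(mean_i, var_i) into one Gaussian.

    The product of Gaussian densities is proportional to a Gaussian whose
    precision (1 / variance) is the sum of the expert precisions and whose
    mean is the precision-weighted average of the expert means.
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one variance per mean and at least one expert")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be positive")

    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_var = 1.0 / total_precision
    fused_mean = fused_var * sum(m * p for m, p in zip(means, precisions))
    return fused_mean, fused_var


# Two equally confident views: the fused mean sits halfway between them
# and the fused variance is halved.
mean, var = poe_fuse([0.0, 2.0], [1.0, 1.0])
print(mean, var)
```

One property of this form of fusion is that a missing view can be handled by leaving its expert out of the product, so the same function works for any number of available views.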


A Missing Value Filling Model Based on Feature Fusion Enhanced Autoencoder

Liu, Xinyao, Du, Shengdong, Li, Tianrui, Teng, Fei, Yang, Yan

arXiv.org Artificial Intelligence, Aug-3-2023

With the advent of the big data era, the data quality problem is becoming more critical. Among many factors, missing values are one primary issue, and thus developing effective imputation models is a key topic in the research community. Recently, a major research direction has been to employ neural network models such as self-organizing maps or autoencoders for filling missing values. However, these classical methods can hardly discover interrelated features and common features simultaneously among data attributes. In particular, classical autoencoders often learn invalid constant mappings, which dramatically hurts the filling performance. To solve these problems, we propose a missing-value-filling model based on a feature-fusion-enhanced autoencoder. We first incorporate into an autoencoder a hidden layer that consists of de-tracking neurons and radial basis function neurons, which can enhance the ability to learn interrelated features and common features. In addition, we develop a missing value filling strategy based on dynamic clustering that is incorporated into an iterative optimization process. This design can enhance the multi-dimensional feature fusion ability and thus improve the dynamic collaborative missing-value-filling performance. The effectiveness of the proposed model is validated by extensive experiments against a variety of baseline methods on thirteen data sets.

artificial intelligence, machine learning, neuron, (16 more...)

arXiv:2208.13495

Country:

Asia > China (0.47)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.46)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


Nearest Neighborhood-Based Deep Clustering for Source Data-absent Unsupervised Domain Adaptation

Tang, Song, Yang, Yan, Ma, Zhiyuan, Hendrich, Norman, Zeng, Fanyu, Ge, Shuzhi Sam, Zhang, Changshui, Zhang, Jianwei

arXiv.org Artificial Intelligence, Aug-3-2021

In the classic setting of unsupervised domain adaptation (UDA), the labeled source data are available in the training phase. However, in many real-world scenarios, for reasons such as privacy protection and information security, the source data are inaccessible, and only a model trained on the source domain is available. This paper proposes a novel deep clustering method for this challenging task. Aiming at dynamic clustering at the feature level, we introduce extra constraints hidden in the geometric structure between data to assist the process. Concretely, we propose a geometry-based constraint, named semantic consistency on the nearest neighborhood (SCNNH), and use it to encourage robust clustering. To reach this goal, we construct the nearest neighborhood for every target sample and take it as the fundamental clustering unit by building our objective on the geometry. Also, we develop a more SCNNH-compliant structure with an additional semantic credibility constraint, named semantic hyper-nearest neighborhood (SHNNH). After that, we extend our method to this new geometry. Extensive experiments on three challenging UDA datasets indicate that our method achieves state-of-the-art results. The proposed method yields significant improvement on all datasets (when we adopt SHNNH, the average accuracy increases by over 3.0% on the large-scale dataset). Code is available at https://github.com/tntek/N2DCX.

deep learning, domain adaptation, neural network, (22 more...)

arXiv:2107.12585

Country: Asia > China (0.68)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)


Molecular structure prediction based on graph convolutional networks

Lin, Xiaohui, Jiang, Yongquan, Yang, Yan

arXiv.org Artificial Intelligence, Jul-1-2021

Molecular structure has important applications in many fields, but determining it by experimental means or with traditional density functional theory is often time-consuming. In view of this, a new Model Structure based on Graph Convolutional Neural network (MSGCN) is proposed, which can determine the molecular structure by predicting the distance between two atoms. To verify the effect of the MSGCN model, it is compared with the method for calculating molecular three-dimensional conformations in RDKit, and it achieves better results. In addition, the distances predicted by the MSGCN model and the distances calculated from the QM9 dataset were both used to predict molecular properties, which demonstrates the effectiveness of the distances predicted by the MSGCN model.

deep learning, molecule, neural network, (20 more...)

arXiv:2107.01035

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering

Liu, Bo, Zhan, Li-Ming, Xu, Li, Ma, Lin, Yang, Yan, Wu, Xiao-Ming

arXiv.org Artificial Intelligence, Feb-18-2021

Medical visual question answering (Med-VQA) has tremendous potential in healthcare. However, the development of this technology is hindered by the lack of publicly available, high-quality labeled datasets for training and evaluation. In this paper, we present a large bilingual dataset, SLAKE, with comprehensive semantic labels annotated by experienced physicians and a new structural medical knowledge base for Med-VQA. In addition, SLAKE includes richer modalities and covers more human body parts than the currently available dataset. We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems. The dataset can be downloaded from http://www.med-vqa.com/slake.

artificial intelligence, dataset, health & medicine, (17 more...)

arXiv:2102.09542

Country:

Asia > China (0.15)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.70)
Health & Medicine > Diagnostic Medicine > Imaging (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)


Fast Calculation of Probabilistic Optimal Power Flow: A Deep Learning Approach

Yang, Yan, Yu, Juan, Yang, Zhifang, Xiang, Mingxu, Liu, Ren

arXiv.org Machine Learning, Jun-24-2019

Probabilistic optimal power flow (POPF) is an important analytical tool to ensure the secure and economic operation of power systems. POPF needs to solve an enormous number of nonlinear and nonconvex optimization problems. The huge computational burden has become the major bottleneck for practical application. This paper presents a deep learning approach to solve the POPF problem efficiently and accurately. Taking advantage of the deep structure and reconstructive strategy of stacked denoising autoencoders (SDAE), an SDAE-based optimal power flow (OPF) is developed to extract the high-level nonlinear correlations between the system operating condition and the OPF solution. A training process is designed to learn the feature of POPF. The trained SDAE network can be utilized to conveniently calculate the OPF solution of random samples generated by Monte-Carlo simulation (MCS) without the need for optimization. A modified IEEE 118-bus power system is simulated to demonstrate the effectiveness of the proposed method.

deep learning, neural network, sdae-based opf, (16 more...)

arXiv:1906.09951

Country:

North America > United States (0.46)
Asia (0.29)

Genre: Research Report (0.40)

Industry: Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Forward and Backward Knowledge Transfer for Sentiment Classification

Wang, Hao, Liu, Bing, Wang, Shuai, Ma, Nianzu, Yang, Yan

arXiv.org Artificial Intelligence, Jun-8-2019

This paper studies the problem of learning a sequence of sentiment classification tasks. The learned knowledge from each task is retained and used to help future or subsequent task learning. This learning paradigm is called Lifelong Learning (LL). However, existing LL methods either only transfer knowledge forward to help future learning and do not go back to improve the model of a previous task or require the training data of the previous task to retrain its model to exploit backward/reverse knowledge transfer. This paper studies reverse knowledge transfer of LL in the context of naive Bayesian (NB) classification. It aims to improve the model of a previous task by leveraging future knowledge without retraining using its training data. This is done by exploiting a key characteristic of the generative model of NB. That is, it is possible to improve the NB classifier for a task by improving its model parameters directly by using the retained knowledge from other tasks. Experimental results show that the proposed method markedly outperforms existing LL baselines.

artificial intelligence, knowledge, text classification, (19 more...)

arXiv:1906.03506

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.72)


Spectral Perturbation Meets Incomplete Multi-view Data

Wang, Hao, Zong, Linlin, Liu, Bing, Yang, Yan, Zhou, Wei

arXiv.org Artificial Intelligence, May-31-2019

Beyond existing multi-view clustering, this paper studies a more realistic clustering scenario, referred to as incomplete multi-view clustering, where a number of data instances are missing in certain views. To tackle this problem, we explore spectral perturbation theory. In this work, we show a strong link between perturbation risk bounds and incomplete multi-view clustering. That is, as the similarity matrix fed into spectral clustering is a quantity bounded in magnitude O(1), we transfer the missing problem from data to similarity and tailor a matrix completion method for the incomplete similarity matrix. Moreover, we show that the minimization of perturbation risk bounds among different views maximizes the final fusion result across all views. This provides a solid fusion criterion for multi-view data. We motivate and propose a Perturbation-oriented Incomplete multi-view Clustering (PIC) method. Experimental results demonstrate the effectiveness of the proposed method.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv:1906.00098

Country:

Asia > China (0.28)
North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)


Deep Air Quality Forecasting Using Hybrid Deep Learning Framework

Du, Shengdong, Li, Tianrui, Yang, Yan, Horng, Shi-Jinn

arXiv.org Machine Learning, Dec-11-2018

Air quality forecasting has been regarded as the key problem of air pollution early warning and control management. In this paper, we propose a novel deep learning model for air quality (mainly PM2.5) forecasting, which learns the spatial-temporal correlation features and interdependence of multivariate air-quality-related time series data using a hybrid deep learning architecture. Due to the nonlinear and dynamic characteristics of multivariate air quality time series data, the base modules of our model include one-dimensional Convolutional Neural Networks (CNN) and Bi-directional Long Short-term Memory networks (Bi-LSTM). The former extracts local trend features and the latter learns long temporal dependencies. We then design a joint hybrid deep learning framework based on one-dimensional CNN and Bi-LSTM for shared representation feature learning of multivariate air-quality-related time series data. The experimental results show that our model is capable of handling PM2.5 air pollution forecasting with satisfactory accuracy.

deep learning, neural network, time series data, (20 more...)

arXiv:1812.04783

Country:

Asia > China (0.15)
South America > Chile (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
