AITopics

2105.14328

Genre: Research Report (0.65)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.92)

arXiv.org Machine LearningMay-6-2021

Machine Collaboration

Liu, Qingfeng, Feng, Yang

We propose a new ensemble framework for supervised learning, named machine collaboration (MaC), based on a collection of base machines for prediction tasks. Different from bagging/stacking (a parallel & independent framework) and boosting (a sequential & top-down framework), MaC is a type of circular & interactive learning framework. The circular & interactive feature helps the base machines to transfer information circularly and update their own structures and parameters accordingly. The theoretical result on the risk bound of the estimator based on MaC shows that circular & interactive feature can help MaC reduce the risk via a parsimonious ensemble. We conduct extensive experiments on simulated data and 119 benchmark real data sets. The results of the experiments show that in most cases, MaC performs much better than several state-of-the-art methods, including CART, neural network, stacking, and boosting.

artificial intelligence, base machine, neural network, (18 more...)

2105.02569

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

arXiv.org Machine LearningFeb-7-2021

RaSE: A Variable Screening Framework via Random Subspace Ensembles

Tian, Ye, Feng, Yang

With the rapid advancement of computing power and technology, high-dimensional data become ubiquitous in many disciplines such as genomics, image analysis, and tomography. With high-dimensional data, the number of variables p could be much larger than the sample size n. What makes statistical inference possible is the sparsity assumption, which assumes only a few variables have contributions to the response. Under this sparsity assumption, there has been a rich literature on the topic of variable selection, including LASSO (Tibshirani, 1996), SCAD (Fan and Li, 2001), elastic net (Zou and Hastie, 2005), and MCP (Zhang, 2010). Despite of the success of these methods in many applications, for the ultra-high dimensional scenario where the dimension p grows exponentially with n, they may not work well due to the "curse of dimensionality" in terms of simultaneous challenges to computational expediency, statistical accuracy, and algorithmic stability (Fan and Lv, 2008).

artificial intelligence, health & medicine, screening method, (18 more...)

2102.03892

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial IntelligenceJan-19-2021

WeChat AI's Submission for DSTC9 Interactive Dialogue Evaluation Track

Li, Zekang, Li, Zongjia, Zhang, Jinchao, Feng, Yang, Zhou, Jie

We participate in the DSTC9 Interactive Dialogue Evaluation Track (Gunasekara et al. 2020) sub-task 1 (Knowledge Grounded Dialogue) and sub-task 2 (Interactive Dialogue). In sub-task 1, we employ a pre-trained language model to generate topic-related responses and propose a response ensemble method for response selection. In sub-task2, we propose a novel Dialogue Planning Model (DPM) to capture conversation flow in the interaction with humans. We also design an integrated open-domain dialogue system containing pre-process, dialogue model, scoring model, and post-process, which can generate fluent, coherent, consistent, and humanlike responses. We tie 1st on human ratings and also get the highest Meteor, and Bert-score in sub-task 1, and rank 3rd on interactive human evaluation in sub-task 2.

dstc9 interactive dialogue evaluation track, submission, wechat ai

2101.07947

Genre: Research Report (0.40)

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Communications > Social Media (0.40)

arXiv.org Machine LearningJan-6-2021

The Interplay of Demographic Variables and Social Distancing Scores in Deep Prediction of U.S. COVID-19 Cases

Tang, Francesca, Feng, Yang, Chiheb, Hamza, Fan, Jianqing

With the severity of the COVID-19 outbreak, we characterize the nature of the growth trajectories of counties in the United States using a novel combination of spectral clustering and the correlation matrix. As the U.S. and the rest of the world are experiencing a severe second wave of infections, the importance of assigning growth membership to counties and understanding the determinants of the growth are increasingly evident. Subsequently, we select the demographic features that are most statistically significant in distinguishing the communities. Lastly, we effectively predict the future growth of a given county with an LSTM using three social distancing scores. This comprehensive study captures the nature of counties' growth in cases at a very micro-level using growth communities, demographic factors, and social distancing performance to help government agencies utilize known information to make appropriate decisions regarding which potential counties to target resources Tang and Feng contribute equally to this work.

deep learning, immunology, low access, (25 more...)

2101.02113

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-22-2020

Future-Guided Incremental Transformer for Simultaneous Translation

Zhang, Shaolei, Feng, Yang, Li, Liangyou

Simultaneous translation (ST) starts translations synchronously while reading source sentences, and is used in many online scenarios. The previous wait-k policy is concise and achieved good results in ST. However, wait-k policy faces two weaknesses: low training speed caused by the recalculation of hidden states and lack of future source information to guide training. For the low training speed, we propose an incremental Transformer with an average embedding layer (AEL) to accelerate the speed of calculation of the hidden states during training. For future-guided training, we propose a conventional Transformer as the teacher of the incremental Transformer, and try to invisibly embed some future information in the model through knowledge distillation. We conducted experiments on Chinese-English and German-English simultaneous translation tasks and compared with the wait-k policy to evaluate the proposed method. Our method can effectively increase the training speed by about 28 times on average at different k and implicitly embed some predictive abilities in the model, achieving better translation quality than wait-k baseline.

artificial intelligence, machine translation, transformer, (16 more...)

2012.12465

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningDec-7-2020

Spectral clustering via adaptive layer aggregation for multi-layer networks

Huang, Sihan, Weng, Haolei, Feng, Yang

One of the fundamental problems in network analysis is detecting community structure in multi-layer networks, of which each layer represents one type of edge information among the nodes. We propose integrative spectral clustering approaches based on effective convex layer aggregations. Our aggregation methods are strongly motivated by a delicate asymptotic analysis of the spectral embedding of weighted adjacency matrices and the downstream $k$-means clustering, in a challenging regime where community detection consistency is impossible. In fact, the methods are shown to estimate the optimal convex aggregation, which minimizes the mis-clustering error under some specialized multi-layer network models. Our analysis further suggests that clustering using Gaussian mixture models is generally superior to the commonly used $k$-means in spectral clustering. Extensive numerical studies demonstrate that our adaptive aggregation techniques, together with Gaussian mixture model clustering, make the new spectral clustering remarkably competitive compared to several popularly used methods.

artificial intelligence, banking & finance, spectral, (18 more...)

2012.04646

Country: North America > United States (0.92)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Networks (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

arXiv.org Artificial IntelligenceNov-3-2020

Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation

Gu, Shuhao, Feng, Yang

Neural machine translation (NMT) models usually suffer from catastrophic forgetting during continual training where the models tend to gradually forget previously learned knowledge and swing to fit the newly added data which may have a different distribution, e.g. a different domain. Although many methods have been proposed to solve this problem, we cannot get to know what causes this phenomenon yet. Under the background of domain adaptation, we investigate the cause of catastrophic forgetting from the perspectives of modules and parameters (neurons). The investigation on the modules of the NMT model shows that some modules have tight relation with the general-domain knowledge while some other modules are more essential in the domain adaptation. And the investigation on the parameters shows that some parameters are important for both the general-domain and in-domain translation and the great change of them during continual training brings about the performance decline in general-domain. We conduct experiments across different language pairs and domains to ensure the validity and reliability of our findings.

artificial intelligence, continual training, machine translation, (17 more...)

2011.00678

Country:

Europe > Belgium (0.14)
Oceania > Australia (0.14)
North America > Puerto Rico (0.14)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

arXiv.org Machine LearningOct-23-2020

RaSE: Random Subspace Ensemble Classification

Tian, Ye, Feng, Yang

We propose a flexible ensemble classification framework, Random Subspace Ensemble (RaSE), for sparse classification. In the RaSE algorithm, we aggregate many weak learners, where each weak learner is a base classifier trained in a subspace optimally selected from a collection of random subspaces. To conduct subspace selection, we propose a new criterion, ratio information criterion (RIC), based on weighted Kullback-Leibler divergence. The theoretical analysis includes the risk and Monte-Carlo variance of RaSE classifier, establishing the screening consistency and weak consistency of RIC, and providing an upper bound for the misclassification rate of RaSE classifier. In addition, we show that in a high-dimensional framework, the number of random subspaces needs to be very large to guarantee that a subspace covering signals is selected. Therefore, we propose an iterative version of RaSE algorithm and prove that under some specific conditions, a smaller number of generated random subspaces are needed to find a desirable subspace through iteration. An array of simulations under various models and real-data applications demonstrate the effectiveness and robustness of the RaSE classifier and its iterative version in terms of low misclassifi-cation rate and accurate feature ranking. The RaSE algorithm is implemented in the R package RaSEn on CRAN.

bayesian inference, health & medicine, null, (17 more...)

2006.08855

Country: North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.81)

Industry: Health & Medicine (1.00)

arXiv.org Artificial IntelligenceOct-16-2020

Generating Diverse Translation from Model Distribution with Dropout

Wu, Xuanfu, Feng, Yang, Shao, Chenze

Despite the improvement of translation quality, neural machine translation (NMT) often suffers from the lack of diversity in its generation. In this paper, we propose to generate diverse translations by deriving a large number of possible models with Bayesian modelling and sampling models from them for inference. The possible models are obtained by applying concrete dropout to the NMT model and each of them has specific confidence for its prediction, which corresponds to a posterior model distribution under specific training data in the principle of Bayesian modeling. With variational inference, the posterior model distribution can be approximated with a variational distribution, from which the final models for inference are sampled. We conducted experiments on Chinese-English and English-German translation tasks and the results shows that our method makes a better trade-off between diversity and accuracy.

bayesian inference, machine translation, translation, (18 more...)

2010.08178

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)