Xiao, Yijun
Directional FDR Control for Sub-Gaussian Sparse GLMs
Cui, Chang, Jia, Jinzhu, Xiao, Yijun, Zhang, Huiming
High-dimensional sparse generalized linear models (GLMs) arise in settings where both the sample size and the number of variables are large, and the number of variables may even grow faster than the sample size. False discovery rate (FDR) control aims to identify the small number of statistically significant nonzero coefficients after obtaining a sparse penalized estimate of the GLM. Using the CLIME method for precision matrix estimation, we construct a debiased-Lasso estimator and prove its asymptotic normality via minimax-rate oracle inequalities for sparse GLMs. In practice, one often needs to accurately determine the sign of each regression coefficient, which indicates whether the predictor variable is positively or negatively related to the response variable conditional on the remaining variables. Using the debiased estimator, we establish multiple testing procedures. Under mild conditions, we show that the proposed debiased statistics asymptotically control the directional (sign) FDR and the number of directional false discovery variables (FDV) at a pre-specified significance level. Moreover, we show that our multiple testing procedure approximately achieves a statistical power of 1. We also extend our methods to two-sample problems and propose the corresponding two-sample test statistics. Under suitable conditions, these statistics asymptotically achieve directional FDR and directional FDV control at the specified significance level. Numerical simulations verify the FDR control of the proposed testing procedures, which sometimes outperform the classical knockoff method.
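To make the pipeline concrete, here is a minimal Python sketch of debiased-Lasso sign discovery for a logistic GLM. It is illustrative only: scikit-learn's GraphicalLasso stands in for the paper's CLIME precision-matrix estimator, a Benjamini-Hochberg step-up plays the role of the multiple testing procedure, and all tuning parameters are uncalibrated placeholders.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression
from sklearn.covariance import GraphicalLasso

def directional_discoveries(X, y, alpha=0.1):
    """Debiased-Lasso sign discoveries for a logistic GLM at target level alpha."""
    n, p = X.shape
    # 1. Sparse penalized estimate (l1-penalized logistic regression).
    beta = LogisticRegression(penalty="l1", C=1.0, solver="liblinear",
                              fit_intercept=False).fit(X, y).coef_.ravel()
    # 2. Empirical Hessian of the logistic log-likelihood (weighted Gram matrix).
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))          # fitted means
    w = mu * (1.0 - mu)                             # GLM variance weights
    Xw = np.sqrt(w)[:, None] * X
    Sigma = Xw.T @ Xw / n
    # 3. Precision-matrix estimate (GraphicalLasso as a stand-in for CLIME).
    Theta = GraphicalLasso(alpha=0.05, assume_centered=True).fit(Xw).precision_
    # 4. Debiased estimator and componentwise standard errors.
    b = beta + Theta @ (X.T @ (y - mu)) / n
    se = np.sqrt(np.diag(Theta @ Sigma @ Theta.T) / n)
    # 5. Two-sided p-values with a Benjamini-Hochberg step-up at level alpha.
    pvals = 2.0 * norm.sf(np.abs(b / se))
    order = np.argsort(pvals)
    passed = np.nonzero(pvals[order] <= alpha * np.arange(1, p + 1) / p)[0]
    rejected = order[: passed[-1] + 1] if passed.size else np.array([], dtype=int)
    return rejected, np.sign(b[rejected])           # indices and estimated signs
```

Because each rejection comes with a sign, a discovery counts as false here when the true coefficient is zero or has the opposite sign, which is what the directional FDR measures.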
Quantifying Uncertainties in Natural Language Processing Tasks
Xiao, Yijun, Wang, William Yang
Reliable uncertainty quantification is a first step towards building explainable, transparent, and accountable artificial intelligence systems. Recent progress in Bayesian deep learning has made such quantification realizable. In this paper, we propose novel methods to study the benefits of characterizing model and data uncertainties for natural language processing (NLP) tasks. With empirical experiments on sentiment analysis, named entity recognition, and language modeling using convolutional and recurrent neural network models, we show that explicitly modeling uncertainties is not only necessary for measuring output confidence levels, but also useful for enhancing model performance across various NLP tasks.
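As a hedged illustration of the model/data uncertainty split, the PyTorch sketch below pairs Monte Carlo dropout (model, or epistemic, uncertainty) with a learned per-input variance (data, or aleatoric, uncertainty). The module and function names are ours for illustration and do not reproduce the paper's exact architectures.

```python
import torch
import torch.nn as nn

class HeteroscedasticHead(nn.Module):
    """Predicts a mean and a log-variance so that data uncertainty is learned
    per input; dropout stays active at test time (MC dropout) to expose
    model uncertainty."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.drop = nn.Dropout(0.3)
        self.mean = nn.Linear(hidden_dim, 1)
        self.log_var = nn.Linear(hidden_dim, 1)

    def forward(self, h):
        h = self.drop(h)
        return self.mean(h), self.log_var(h)

def gaussian_nll(mu, log_var, y):
    # Heteroscedastic negative log-likelihood: a large predicted variance
    # down-weights the squared error but is itself penalized.
    return (0.5 * torch.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var).mean()

@torch.no_grad()
def mc_predict(head, h, T=20):
    head.train()                                  # keep dropout stochastic
    samples = [head(h) for _ in range(T)]
    mus = torch.stack([m for m, _ in samples])
    vars_ = torch.stack([lv.exp() for _, lv in samples])
    model_unc = mus.var(dim=0)                    # variance of the T means
    data_unc = vars_.mean(dim=0)                  # mean predicted variance
    return mus.mean(dim=0), model_unc, data_unc
```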
Dirichlet Variational Autoencoder for Text Modeling
Xiao, Yijun, Zhao, Tiancheng, Wang, William Yang
We introduce an improved variational autoencoder (VAE) for text modeling in which topic information is explicitly modeled as a Dirichlet latent variable. By providing the model with topic awareness, it reconstructs input texts more accurately. Furthermore, due to the inherent interactions between the newly introduced Dirichlet variable and the conventional multivariate Gaussian variable, the model is less prone to KL divergence vanishing. We derive the variational lower bound for the new model and conduct experiments on four data sets. The results show that the proposed model is superior at text reconstruction across the latent space, and that classifiers trained on its learned representations achieve higher test accuracies.
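For orientation, a variational lower bound for a VAE with both a Gaussian latent z and a Dirichlet topic latent t takes the following generic form; the notation is ours and the paper's exact factorization may differ.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z, t \mid x)}\!\left[\log p_\theta(x \mid z, t)\right]
- \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,\mathcal{N}(0, I)\right)
- \mathrm{KL}\!\left(q_\phi(t \mid x)\,\|\,\mathrm{Dir}(\alpha)\right)
```

The added Dirichlet KL term gives the decoder a second, topic-bearing channel, which helps keep the Gaussian KL term from collapsing to zero during training.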
Elastic Responding Machine for Dialog Generation with Dynamically Mechanism Selecting
Zhou, Ganbin (Institute of Computing Technology, Chinese Academy of Sciences) | Luo, Ping (Institute of Computing Technology, Chinese Academy of Sciences) | Xiao, Yijun (University of California Santa Barbara) | Lin, Fen (WeChat, Tencent) | Chen, Bo (WeChat, Tencent) | He, Qing (Institute of Computing Technology, Chinese Academy of Sciences)
Neural models aimed at generating meaningful and diverse responses have attracted increasing attention in recent years. For a given post, conventional encoder-decoder models tend to learn high-frequency but trivial responses, or have difficulty determining which speaking styles are suitable for generating a response. To address this issue, we propose the elastic responding machine (ERM), which is based on a proposed encoder-diverter-filter-decoder framework. ERM models multiple responding mechanisms, not only generating acceptable responses for a given post but also improving the diversity of responses. Here, the mechanisms can be regarded as latent variables, and different responses to a given post may be generated by different mechanisms. The experiments demonstrate the quality and diversity of the generated responses, intuitively show how the learned model controls the responding mechanism when generating, and reveal some underlying relationships between mechanism and language style.
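A minimal sketch of the diverter and filter stages of such an encoder-diverter-filter-decoder framework is given below; the names, scoring function, and sizes are illustrative assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class Diverter(nn.Module):
    """Scores K latent responding mechanisms for a given post encoding."""
    def __init__(self, hidden_dim, n_mechanisms):
        super().__init__()
        self.mechanisms = nn.Embedding(n_mechanisms, hidden_dim)

    def forward(self, post_enc):                       # post_enc: (B, H)
        scores = post_enc @ self.mechanisms.weight.T   # (B, K) dot-product scores
        return torch.softmax(scores, dim=-1)

def elastic_filter(probs, k=3):
    # "Elastic" filter: keep only the top-k mechanisms per post; each kept
    # mechanism embedding then conditions the decoder to produce one response,
    # so a single post yields up to k diverse responses.
    return probs.topk(k, dim=-1).indices               # (B, k) mechanism ids
```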
Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation
Zhou, Ganbin (Institute of Computing Technology, Chinese Academy of Sciences) | Luo, Ping (Institute of Computing Technology, Chinese Academy of Sciences) | Cao, Rongyu (Institute of Computing Technology, Chinese Academy of Sciences) | Xiao, Yijun (Department of Computer Science, University of California Santa Barbara) | Lin, Fen (WeChat Search Application Department, Tencent) | Chen, Bo (WeChat Search Application Department, Tencent) | He, Qing (Institute of Computing Technology, Chinese Academy of Sciences)
Unlike other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoders ignore this structure in human language and may generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results of linguistic analysis into the sentence generation process for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. To accelerate training, we propose a tree canonicalization method that transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2Tree framework outperforms baseline methods with an over 11.15% increase in acceptance ratio.
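To illustrate the canonicalization step, the sketch below encodes an arbitrary-arity dependency tree as an equivalent ternary tree using a first-child/next-sibling transform. This is our reading of the general idea under assumed node structures; the paper's exact canonicalization scheme may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DepNode:
    word: str
    left: List["DepNode"] = field(default_factory=list)    # left dependents
    right: List["DepNode"] = field(default_factory=list)   # right dependents

@dataclass
class TernaryNode:
    word: str
    first_left: Optional["TernaryNode"] = None
    first_right: Optional["TernaryNode"] = None
    sibling: Optional["TernaryNode"] = None

def chain(nodes: List[TernaryNode]) -> Optional[TernaryNode]:
    # Link a list of subtrees through the `sibling` pointer.
    for a, b in zip(nodes, nodes[1:]):
        a.sibling = b
    return nodes[0] if nodes else None

def canonicalize(node: DepNode) -> TernaryNode:
    """Encode an arbitrary-arity dependency tree as an equivalent ternary tree."""
    t = TernaryNode(node.word)
    t.first_left = chain([canonicalize(c) for c in node.left])
    t.first_right = chain([canonicalize(c) for c in node.right])
    return t
```

Each node then carries at most three outgoing pointers, so a fixed-arity tree decoder can generate it, and the transform is invertible: the original dependency tree, and hence the word sequence, can be recovered from the ternary form.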