AITopics

Jin, Pengzhan, Zhu, Aiqing, Karniadakis, George Em, Tang, Yifa

Symplectic networks: Intrinsic structure-preserving networks for identifying Hamiltonian systems

This work presents a framework for constructing neural networks that preserve the symplectic structure, so-called symplectic networks (SympNets). With the symplectic networks, we show numerical results on (i) solving Hamiltonian systems by learning from abundant data points over the phase space, and (ii) predicting the phase flows by learning from a series of points depending on time. All the experiments indicate that the symplectic networks perform much better than fully-connected networks without any prior information, especially in the prediction task, which conventional numerical methods are unable to perform.
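To make "preserving the symplectic structure" concrete, here is a minimal sketch in one degree of freedom, where a map on (p, q) is symplectic exactly when its Jacobian determinant equals 1. The shear maps, the `tanh` nonlinearity, and the weights in `params` are illustrative assumptions, not the architecture from the paper; the point is only that compositions of such shears stay symplectic whatever the weights are.

```python
import math

def up_shear(p, q, a):
    """Update p using only q: (p, q) -> (p + a*tanh(q), q)."""
    return p + a * math.tanh(q), q

def low_shear(p, q, b):
    """Update q using only p: (p, q) -> (p, q + b*tanh(p))."""
    return p, q + b * math.tanh(p)

def composed_map(p, q, params=((0.7, -0.3), (1.1, 0.5))):
    """Alternate the two shears; `params` holds hypothetical weights."""
    for a, b in params:
        p, q = up_shear(p, q, a)
        p, q = low_shear(p, q, b)
    return p, q

def jacobian_det(f, p, q, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (p, q)."""
    fp_hi, fp_lo = f(p + h, q), f(p - h, q)
    fq_hi, fq_lo = f(p, q + h), f(p, q - h)
    dP_dp = (fp_hi[0] - fp_lo[0]) / (2 * h)
    dQ_dp = (fp_hi[1] - fp_lo[1]) / (2 * h)
    dP_dq = (fq_hi[0] - fq_lo[0]) / (2 * h)
    dQ_dq = (fq_hi[1] - fq_lo[1]) / (2 * h)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

# The determinant stays at 1 (up to finite-difference error) at any point.
print(jacobian_det(composed_map, 0.4, -1.2))
```

Each shear has a unit-triangular Jacobian, so the property holds by construction rather than being learned, which is the sense in which such a model is structure-preserving.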

deep learning, sympnet, upstream oil & gas, (20 more...)

2001.0375

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

On Computation and Generalization of Generative Adversarial Imitation Learning

Chen, Minshuo, Wang, Yizhou, Liu, Tianyi, Yang, Zhuoran, Li, Xingguo, Wang, Zhaoran, Zhao, Tuo

Generative Adversarial Imitation Learning (GAIL) is a powerful and practical approach for learning sequential decision-making policies. Unlike Reinforcement Learning (RL), GAIL takes advantage of demonstration data from experts (e.g., humans), and learns both the policy and the reward function of the unknown environment. Despite significant empirical progress, the theory behind GAIL is still largely unknown. The major difficulty comes from the underlying temporal dependency of the demonstration data and the minimax computational formulation of GAIL, which lacks convex-concave structure. To bridge this gap between theory and practice, this paper investigates the theoretical properties of GAIL. Specifically, we show: (1) For GAIL with general reward parameterization, generalization can be guaranteed as long as the class of reward functions is properly controlled; (2) For GAIL where the reward is parameterized as a reproducing kernel function, GAIL can be efficiently solved by stochastic first-order optimization algorithms, which attain sublinear convergence to a stationary solution. To the best of our knowledge, these are the first results on statistical and computational guarantees of imitation learning with reward/policy function approximation. Numerical experiments are provided to support our analysis.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2001.02792

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Khatun, Aisha, Rahman, Anisur, Islam, Md. Saiful, Marium-E-Jannat

Authorship Attribution in Bangla literature using Character-level CNN

arXiv.org Artificial Intelligence, Jan-11-2020

Characters are the smallest unit of text from which stylometric signals can be extracted to determine the author of a text. In this paper, we investigate the effectiveness of character-level signals in Authorship Attribution of Bangla Literature and show that the results are promising but improvable. The time and memory efficiency of the proposed model is much higher than that of its word-level counterparts, but its accuracy is 2-5% lower than that of the best performing word-level models. We compare various word-based models and show that the proposed model performs increasingly better with larger datasets. We also analyze the effect of pre-training character embeddings of the diverse Bangla character set on authorship attribution. Pre-training improves performance by up to 10%. We used 2 datasets with 6 to 14 authors, balancing them before training, and compared the results.
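As a rough illustration of what a character-level input looks like before it reaches a CNN, the sketch below maps text to a fixed-length sequence of integer ids. The helper names, the padding id 0, and the unknown id 1 are assumptions made for this example; the abstract does not describe the paper's actual preprocessing.

```python
PAD, UNK = 0, 1  # hypothetical reserved ids for padding and unseen characters

def build_char_vocab(texts):
    """Assign an id (starting at 2) to every distinct character in `texts`."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}

def encode_chars(text, vocab, max_len):
    """Truncate or right-pad `text` to `max_len` character ids."""
    ids = [vocab.get(ch, UNK) for ch in text[:max_len]]
    return ids + [PAD] * (max_len - len(ids))

# Python iterates over Unicode code points, so Bangla vowel signs and other
# combining marks become separate symbols in this simple scheme.
vocab = build_char_vocab(["আমি বাংলায় গান গাই"])
print(encode_chars("আমি", vocab, 8))
```

The vocabulary here grows with the number of distinct characters rather than distinct words, which is one plausible reason for the memory advantage over word-level models that the abstract reports.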

authorship attribution, classification, dataset, (12 more...)

2001.05316

Country: Asia > Bangladesh > Sylhet Division > Sylhet District > Sylhet (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Schrills, Tim, Franke, Thomas

How to Answer Why -- Evaluating the Explanations of AI Through Mental Model Analysis

arXiv.org Artificial Intelligence, Jan-11-2020

To achieve optimal human-system integration in the context of user-AI interaction, it is important that users develop a valid representation of how AI works. In most everyday interactions with technical systems, users construct mental models (i.e., an abstraction of the anticipated mechanisms a system uses to perform a given task). If no explicit explanations are provided by a system (e.g., by a self-explaining AI) or other sources (e.g., an instructor), the mental model is typically formed based on experiences, i.e., the observations of the user during the interaction. The congruence of this mental model and the actual system's functioning is vital, as it is used for assumptions, predictions and consequently for decisions regarding system use. A key question for human-centered AI research is therefore how to validly survey users' mental models. The objective of the present research is to identify suitable elicitation methods for mental model analysis. We evaluated whether mental models are suitable as an empirical research method. Additionally, methods of cognitive tutoring are integrated. We propose an exemplary method to evaluate explainable AI approaches in a human-centered way.

explanation, knowledge element, mental model, (12 more...)

2002.02526

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
Europe > Germany > Schleswig-Holstein > Lübeck (0.05)
Asia > Middle East > Jordan (0.05)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.31)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.57)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.35)

Sousa, Rodrigo S., Santos, Priscila G. M. dos, Veras, Tiago M. L., de Oliveira, Wilson R., da Silva, Adenilton J.

Parametric Probabilistic Quantum Memory

Probabilistic Quantum Memory (PQM) is a data structure that computes the distance from a binary input to all binary patterns stored in superposition in the memory. This data structure allows the development of heuristics to speed up artificial neural network architecture selection. In this work, we propose an improved parametric version of the PQM to perform pattern classification, and we also present a PQM quantum circuit suitable for Noisy Intermediate-Scale Quantum (NISQ) computers. We present a classical evaluation of a parametric PQM network classifier on public benchmark datasets. We also perform experiments to verify the viability of PQM on a 5-qubit quantum computer.

Introduction

Quantum Computing is a computational paradigm that has been attracting increasing attention for decades now. Several quantum algorithms have time advantages over their best known classical counterparts [1, 2, 3, 4]. The current advances in quantum hardware are bringing us to the era of Noisy Intermediate-Scale Quantum (NISQ) computers [5]. The quest for quantum supremacy is the search for an efficient solution, on a quantum computer, of a task that current classical computers are not able to solve efficiently. Some authors argue that, given the current state of the art, we will achieve quantum supremacy in the next few years [6]. One of the approaches to achieve this supremacy and to expand the potential applications of quantum computers is through quantum machine learning [7]. Machine learning (ML) [8] aims at developing automated ways for computers to learn a specific task from a given set of data samples.
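The distance computation mentioned in the abstract can be mimicked classically. The sketch below assumes the readout probability usually quoted for Trugenberger-style probabilistic quantum memories, P(0) = (1/m) Σ_k cos²(π d_k / (2n)), where d_k is the Hamming distance from the query to the k-th of m stored n-bit patterns. That formula and the function names are assumptions for illustration; the parametric variant proposed in the paper is not modeled here.

```python
import math

def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def pqm_prob_zero(query, patterns):
    """Assumed probability of reading 0 on the control qubit for `query`."""
    n = len(query)
    total = sum(
        math.cos(math.pi * hamming(query, p) / (2 * n)) ** 2
        for p in patterns
    )
    return total / len(patterns)

# A query close to the stored patterns yields a higher probability than one
# far from them.
memory = ["1010", "1011", "1110"]
print(pqm_prob_zero("1010", memory), pqm_prob_zero("0101", memory))
```

A classifier in this spirit could keep one memory per class and pick the class whose memory gives the largest probability, though whether the paper does exactly this is not stated in the text above.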

algorithm, computer, retrieval algorithm, (15 more...)

2001.04798

Country:

Europe > Spain > Canary Islands > Tenerife (0.05)
South America > Brazil > Pernambuco (0.04)
North America > United States (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Zhu, Rui, Ghosal, Subhashis

Bayesian Semi-supervised learning under nonparanormality

Semi-supervised learning is a classification method which makes use of both labeled data and unlabeled data for training. In this paper, we propose a semi-supervised learning algorithm using a Bayesian semi-supervised model. We make a general assumption that, after the same unknown transformation, the observations follow two multivariate normal distributions depending on their true labels. We use B-splines to put a prior on the transformation function for each component. To use unlabeled data in a semi-supervised setting, we assume the labels are missing at random. The posterior distributions can then be described using our assumptions, and we compute them by the Gibbs sampling technique. The proposed method is then compared with several other available methods through an extensive simulation study. Finally, we apply the proposed method to real data for diagnosing breast cancer and classifying radar returns. We conclude that the proposed method has better prediction accuracy in a wide variety of cases.

assumption, semi-supervised, transformation, (14 more...)

2001.03798

Country:

North America > United States > North Carolina > Wake County > Raleigh (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Confidence Scores Make Instance-dependent Label-noise Learning Possible

Berthon, Antonin, Han, Bo, Niu, Gang, Liu, Tongliang, Sugiyama, Masashi

Learning with noisy labels has drawn a lot of attention. In this area, most recent works only consider class-conditional noise, where the label noise is independent of the input features. This noise model may not be faithful to many real-world applications. In contrast, a few pioneering works have studied instance-dependent noise, but these methods are limited by strong assumptions on the noise models. To alleviate this issue, we introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is associated with a confidence score. The confidence scores are sufficient to estimate the noise functions of each instance with minimal assumptions. Moreover, such scores can be easily and cheaply derived during the construction of the dataset through crowdsourcing or automatic annotation. To handle CSIDN, we design a benchmark algorithm termed instance-level forward correction. Empirical results on synthetic and real-world datasets demonstrate the utility of our proposed method.
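For readers unfamiliar with forward correction, the sketch below shows the generic idea with a transition matrix supplied per instance: the model's clean-label probabilities are pushed through the matrix, and the loss is taken against the observed noisy label. How the paper estimates each instance's matrix from confidence scores is not described in the abstract, so `transition` is simply an input here, and both function names are hypothetical.

```python
import math

def noisy_posterior(clean_probs, transition):
    """Map clean-label probabilities to noisy-label probabilities.

    transition[i][j] is assumed to be P(noisy label = j | clean label = i)
    for this particular instance.
    """
    k = len(clean_probs)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(k))
        for j in range(k)
    ]

def forward_corrected_loss(clean_probs, transition, noisy_label):
    """Negative log-likelihood of the observed noisy label."""
    return -math.log(noisy_posterior(clean_probs, transition)[noisy_label])

# Two instances with the same prediction but different noise levels.
probs = [0.8, 0.2]
reliable = [[0.95, 0.05], [0.05, 0.95]]
unreliable = [[0.6, 0.4], [0.3, 0.7]]
print(forward_corrected_loss(probs, reliable, 1))
print(forward_corrected_loss(probs, unreliable, 1))
```

Because the matrix differs from instance to instance, a disagreement between prediction and label is penalized less where the label is known to be unreliable.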

confidence score, instance-dependent noise, noise, (16 more...)

2001.03772

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

Luo, Luo, Ye, Haishan, Zhang, Tong

We consider nonconvex-concave minimax problems of the form $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$, where $f$ is strongly-concave in $\bf y$ but possibly nonconvex in $\bf x$. We focus on the stochastic setting, where we can only access an unbiased stochastic gradient estimate of $f$ at each iteration. This formulation includes many machine learning applications as special cases, such as adversarial training and certifying robustness in deep learning. We are interested in finding an ${\mathcal O}(\varepsilon)$-stationary point of the function $\Phi(\cdot)=\max_{\bf y} f(\cdot, {\bf y})$. The most popular algorithm to solve this problem is stochastic gradient descent ascent, which requires $\mathcal O(\kappa^3\varepsilon^{-4})$ stochastic gradient evaluations, where $\kappa$ is the condition number. In this paper, we propose a novel method called Stochastic Recursive gradiEnt Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction. This method achieves the best known stochastic gradient complexity of ${\mathcal O}(\kappa^3\varepsilon^{-3})$, and its dependency on $\varepsilon$ is optimal for this problem.
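The baseline named in the abstract, stochastic gradient descent ascent, can be sketched on a toy problem. This is not SREDA (no variance reduction is implemented). The objective $f(x, y) = y\sin x - y^2/2$, the step sizes, and the Gaussian gradient noise are all assumptions chosen for illustration. For this $f$, which is strongly concave in $y$ and nonconvex in $x$, we get $\Phi(x) = \sin^2(x)/2$.

```python
import math
import random

def gda(x=1.0, y=0.0, lr_x=0.05, lr_y=0.5, steps=2000, noise_std=0.0, seed=0):
    """Simultaneous (stochastic) gradient descent on x and ascent on y.

    Toy objective: f(x, y) = y*sin(x) - y**2 / 2.
    `noise_std` adds zero-mean Gaussian noise to both partial gradients,
    standing in for an unbiased stochastic gradient oracle.
    """
    rng = random.Random(seed)
    for _ in range(steps):
        grad_x = y * math.cos(x) + rng.gauss(0.0, noise_std)
        grad_y = math.sin(x) - y + rng.gauss(0.0, noise_std)
        x, y = x - lr_x * grad_x, y + lr_y * grad_y
    return x, y

def phi_grad(x):
    """Gradient of Phi(x) = max_y f(x, y) = sin(x)**2 / 2."""
    return math.sin(x) * math.cos(x)

x_det, _ = gda()                  # noiseless run
x_sto, _ = gda(noise_std=0.05)    # noisy run
print(abs(phi_grad(x_det)), abs(phi_grad(x_sto)))
```

The ascent step size is larger than the descent step size so that $y$ tracks the inner maximizer $\sin x$; with noise, the iterate only reaches a neighborhood of a stationary point, which is the regime where variance reduction is meant to help.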

complexity, inequality, stochastic gradient evaluation, (13 more...)

2001.03724

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)
Europe > Hungary > Csongrád-Csanád County > Szeged (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Hasan, Md Mahmudul, Wei, Shuangqing, Moharrer, Ali

Latent Factor Analysis of Gaussian Distributions under Graphical Constraints

Abstract: We explore the algebraic structure of the solution space of the convex optimization problem Constrained Minimum Trace Factor Analysis (CMTFA) when the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely, a latent star topology. In particular, we show that CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. We found explicit conditions for both rank $1$ and rank $n-1$ CMTFA solutions of $\Sigma_x$. As a basic attempt towards building a more general Gaussian tree, we found a necessary and a sufficient condition for multiple clusters, each having a rank $1$ CMTFA solution, to satisfy a minimum probability to combine together to build a Gaussian tree. To support our analytical findings, we present some numerical results demonstrating the usefulness of the contributions of our work.

Index Terms: Factor Analysis, MTFA, CMTFA, CMDFA

I. INTRODUCTION

A. Motivation

Factor Analysis (FA) is a commonly used tool in multivariate statistics to represent the correlation structure of a set of observables in terms of a significantly smaller number of variables called "latent factors". With its growing use in data mining and high-dimensional data analytics, factor analysis has already become a prolific area of research [1] [2]. Classical factor analysis models seek to decompose the correlation matrix $\Sigma_x$ of an $n$-dimensional random vector $X \in \mathbb{R}^n$ as the sum of a diagonal matrix $D$ and a Gramian matrix $\Sigma_x - D$.
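The decomposition $\Sigma_x = D + (\Sigma_x - D)$ can be illustrated with a generative star model in which each observable is a scaled copy of one latent factor plus independent noise, $X_i = a_i Z + W_i$. The sketch below builds such a covariance and checks that removing the diagonal noise part leaves a rank 1 matrix. It does not solve the CMTFA optimization problem; the loadings, noise variances, and helper names are hypothetical values chosen for the example.

```python
def star_covariance(loadings, noise_vars):
    """Covariance of X_i = a_i * Z + W_i with Var(Z) = 1, Var(W_i) = noise_vars[i]."""
    n = len(loadings)
    return [
        [loadings[i] * loadings[j] + (noise_vars[i] if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]

def subtract_diagonal(matrix, diag):
    """Return matrix - diag(diag)."""
    return [
        [v - (diag[i] if i == j else 0.0) for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]

def is_rank_one(matrix, tol=1e-9):
    """True if the matrix is nonzero and all of its 2x2 minors vanish."""
    n = len(matrix)
    nonzero = any(abs(v) > tol for row in matrix for v in row)
    minors_vanish = all(
        abs(matrix[i][k] * matrix[j][l] - matrix[i][l] * matrix[j][k]) <= tol
        for i in range(n) for j in range(i + 1, n)
        for k in range(n) for l in range(k + 1, n)
    )
    return nonzero and minors_vanish

a = [0.9, 0.5, -0.4]          # hypothetical factor loadings
d = [0.19, 0.75, 0.84]        # hypothetical noise variances
sigma = star_covariance(a, d)
print(is_rank_one(sigma), is_rank_one(subtract_diagonal(sigma, d)))
```

In practice the true $D$ is unknown, which is why a criterion such as minimum trace is needed to choose among the many admissible diagonal matrices.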

cmtfa solution, matrix, rank 1, (15 more...)

2001.02712

Country:

North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)