A group of new astronauts join NASA under the Artemis program and could be the first to step on Mars
After more than two years of training, 13 new astronauts have joined NASA under the Artemis program, the mission that will send the first woman to the Moon, and some of them may be the first humans to step on Mars. The candidates, who have been training since 2017, took part in the first public graduation ceremony for astronauts on Friday at the American space agency's Johnson Space Center in Houston. The group comprises six women and seven men, two of whom are Canadian Space Agency (CSA) astronauts, and all were chosen from a record-setting pool of more than 18,000 applicants. During the ceremony, each graduate was given a silver pin that symbolizes the Mercury 7, NASA's first astronaut group, selected in 1959. They will be awarded a gold pin once they complete their first spaceflights.
Symplectic networks: Intrinsic structure-preserving networks for identifying Hamiltonian systems
Jin, Pengzhan, Zhu, Aiqing, Karniadakis, George Em, Tang, Yifa
This work presents a framework for constructing neural networks that preserve the symplectic structure, so-called symplectic networks (SympNets). With the symplectic networks, we show numerical results on (i) solving Hamiltonian systems by learning from abundant data points over the phase space, and (ii) predicting phase flows by learning from a series of points that depend on time. All the experiments indicate that the symplectic networks perform much better than fully-connected networks without any prior information, especially in the prediction task, which conventional numerical methods cannot perform.
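As a rough illustration of what "preserving the symplectic structure" means, the sketch below composes shear maps on a one-degree-of-freedom phase space (q, p) and checks numerically that the Jacobian determinant stays at 1, which in two dimensions is equivalent to symplecticity. The function names, the `tanh` activation, and the parameter values are assumptions chosen for this example; this is not the SympNet architecture or training procedure from the paper.

```python
import math

def shear_q(q, p, a):
    """Linear shear: q is shifted by a multiple of p, p is unchanged."""
    return q + a * p, p

def shear_p(q, p, a):
    """Nonlinear shear: p is shifted by a function of q only, q is unchanged."""
    return q, p + a * math.tanh(q)

def layer(q, p, params=(0.5, -0.8, 0.3)):
    """Compose three shears; each has unit Jacobian determinant, so the
    composition does too (hypothetical parameter values)."""
    a1, a2, a3 = params
    q, p = shear_q(q, p, a1)
    q, p = shear_p(q, p, a2)
    q, p = shear_q(q, p, a3)
    return q, p

def jacobian_det(f, q, p, h=1e-6):
    """Central finite-difference estimate of the Jacobian determinant of f."""
    q_hi, p_hi = f(q + h, p)
    q_lo, p_lo = f(q - h, p)
    dq_dq, dp_dq = (q_hi - q_lo) / (2 * h), (p_hi - p_lo) / (2 * h)
    q_hi, p_hi = f(q, p + h)
    q_lo, p_lo = f(q, p - h)
    dq_dp, dp_dp = (q_hi - q_lo) / (2 * h), (p_hi - p_lo) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq

if __name__ == "__main__":
    print(jacobian_det(layer, 0.7, -1.2))  # close to 1 up to finite-difference error
```

Because every building block is symplectic by construction, the property holds for any parameter values, so it does not have to be learned from data.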
On Computation and Generalization of Generative Adversarial Imitation Learning
Chen, Minshuo, Wang, Yizhou, Liu, Tianyi, Yang, Zhuoran, Li, Xingguo, Wang, Zhaoran, Zhao, Tuo
Generative Adversarial Imitation Learning (GAIL) is a powerful and practical approach for learning sequential decision-making policies. Unlike Reinforcement Learning (RL), GAIL takes advantage of demonstration data from experts (e.g., humans), and learns both the policy and the reward function of the unknown environment. Despite significant empirical progress, the theory behind GAIL is still largely unknown. The major difficulty comes from the underlying temporal dependency of the demonstration data and the minimax computational formulation of GAIL, which lacks convex-concave structure. To bridge this gap between theory and practice, this paper investigates the theoretical properties of GAIL. Specifically, we show: (1) for GAIL with general reward parameterization, generalization can be guaranteed as long as the class of reward functions is properly controlled; (2) for GAIL where the reward is parameterized as a reproducing kernel function, GAIL can be efficiently solved by stochastic first-order optimization algorithms, which attain sublinear convergence to a stationary solution. To the best of our knowledge, these are the first results on statistical and computational guarantees of imitation learning with reward/policy function approximation. Numerical experiments are provided to support our analysis.
Authorship Attribution in Bangla literature using Character-level CNN
Khatun, Aisha, Rahman, Anisur, Islam, Md. Saiful, Marium-E-Jannat
Characters are the smallest unit of text from which stylometric signals can be extracted to determine the author of a text. In this paper, we investigate the effectiveness of character-level signals in authorship attribution of Bangla literature and show that the results are promising but improvable. The time and memory efficiency of the proposed model is much higher than that of its word-level counterparts, but its accuracy is 2-5% lower than that of the best-performing word-level models. We compare various word-based models and show that the proposed model performs increasingly better with larger datasets. We also analyze the effect of pre-training character embeddings on the diverse Bangla character set for authorship attribution. Pre-training improves performance by up to 10%. We used 2 datasets with 6 to 14 authors, balanced them before training, and compared the results.
How to Answer Why -- Evaluating the Explanations of AI Through Mental Model Analysis
To achieve optimal human-system integration in the context of user-AI interaction, it is important that users develop a valid representation of how AI works. In most everyday interaction with technical systems, users construct mental models (i.e., abstractions of the anticipated mechanisms a system uses to perform a given task). If no explicit explanations are provided by the system (e.g., by a self-explaining AI) or by other sources (e.g., an instructor), the mental model is typically formed from experience, i.e., the user's observations during the interaction. The congruence between this mental model and the actual system's functioning is vital, as the model is used for assumptions, predictions, and consequently for decisions regarding system use. A key question for human-centered AI research is therefore how to validly survey users' mental models. The objective of the present research is to identify suitable elicitation methods for mental model analysis. We evaluated whether mental models are suitable as an empirical research method. Additionally, methods of cognitive tutoring are integrated. We propose an exemplary method to evaluate explainable AI approaches in a human-centered way.
Parametric Probabilistic Quantum Memory
Sousa, Rodrigo S., Santos, Priscila G. M. dos, Veras, Tiago M. L., de Oliveira, Wilson R., da Silva, Adenilton J.
Probabilistic Quantum Memory (PQM) is a data structure that computes the distance from a binary input to all binary patterns stored in superposition in the memory. This data structure allows the development of heuristics to speed up architecture selection for artificial neural networks. In this work, we propose an improved parametric version of the PQM to perform pattern classification, and we also present a PQM quantum circuit suitable for Noisy Intermediate-Scale Quantum (NISQ) computers. We present a classical evaluation of a parametric PQM network classifier on public benchmark datasets. We also perform experiments to verify the viability of PQM on a 5-qubit quantum computer.
Introduction. Quantum computing is a computational paradigm that has been attracting increasing attention for decades now. Several quantum algorithms have time advantages over their best known classical counterparts [1, 2, 3, 4]. The current advances in quantum hardware are bringing us to the era of Noisy Intermediate-Scale Quantum (NISQ) computers [5]. The quest for quantum supremacy is the search for an efficient solution, on a quantum computer, of a task that current classical computers are not able to solve efficiently. Some authors argue that, given the current state of the art, we will achieve quantum supremacy in the next few years [6]. One of the approaches to achieving this supremacy and to expanding the potential applications of quantum computers is quantum machine learning [7]. Machine learning (ML) [8] aims at developing automated ways for computers to learn a specific task from a given set of data samples.
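For readers unfamiliar with the quantity a PQM works with, the following is a minimal classical sketch of the distance from a binary input to every stored binary pattern, taking the distance to be the Hamming distance (an assumption made for this example). It loops over the patterns one by one, whereas the PQM described above holds them in superposition; the function names and the sample patterns are hypothetical, and nothing here models the quantum circuit or the parametric classifier.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def distances_to_memory(query, patterns):
    """Classical stand-in: distance from the query to each stored pattern."""
    return [hamming(query, pattern) for pattern in patterns]

if __name__ == "__main__":
    memory = ["1010", "0000", "0101"]  # hypothetical stored patterns
    print(distances_to_memory("1010", memory))
```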
Bayesian Semi-supervised learning under nonparanormality
Semi-supervised learning is a classification method which makes use of both labeled data and unlabeled data for training. In this paper, we propose a semi-supervised learning algorithm using a Bayesian semi-supervised model. We make a general assumption that, after the same unknown transformation, the observations follow one of two multivariate normal distributions depending on their true labels. We use B-splines to put a prior on the transformation function for each component. To use unlabeled data in a semi-supervised setting, we assume the labels are missing at random. The posterior distributions can then be described using our assumptions, and we compute them by Gibbs sampling. The proposed method is then compared with several other available methods through an extensive simulation study. Finally, we apply the proposed method to real data for diagnosing breast cancer and classifying radar returns. We conclude that the proposed method has better prediction accuracy in a wide variety of cases.
Confidence Scores Make Instance-dependent Label-noise Learning Possible
Berthon, Antonin, Han, Bo, Niu, Gang, Liu, Tongliang, Sugiyama, Masashi
Learning with noisy labels has drawn a lot of attention. In this area, most recent works consider only class-conditional noise, where the label noise is independent of the input features. This noise model may not be faithful to many real-world applications. A few pioneering works have instead studied instance-dependent noise, but these methods rely on strong assumptions about the noise model. To alleviate this issue, we introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is associated with a confidence score. The confidence scores are sufficient to estimate the noise function of each instance with minimal assumptions. Moreover, such scores can be easily and cheaply derived during the construction of the dataset through crowdsourcing or automatic annotation. To handle CSIDN, we design a benchmark algorithm termed instance-level forward correction. Empirical results on synthetic and real-world datasets demonstrate the utility of our proposed method.
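The general idea behind forward correction can be sketched as follows: the model's clean-label probabilities are pushed through a label-transition matrix, and the loss is the negative log-likelihood of the observed noisy label. This is the generic forward-correction loss, shown here under the assumption that a transition matrix is supplied separately for each instance; how the confidence scores yield that matrix is the paper's contribution and is not reproduced. The function name and the numbers are hypothetical.

```python
import math

def forward_corrected_nll(clean_probs, transition, noisy_label):
    """Negative log-likelihood of a noisy label after forward correction.

    clean_probs[i]   : model estimate of P(clean label = i | x)
    transition[i][j] : assumed P(noisy label = j | clean label = i, x)
    """
    noisy_prob = sum(
        clean_probs[i] * transition[i][noisy_label]
        for i in range(len(clean_probs))
    )
    return -math.log(noisy_prob)

if __name__ == "__main__":
    probs = [0.9, 0.1]          # hypothetical model output for one instance
    t_instance = [[0.8, 0.2],   # hypothetical per-instance transition matrix
                  [0.3, 0.7]]
    print(forward_corrected_nll(probs, t_instance, noisy_label=1))
```

With an identity transition matrix the loss reduces to the ordinary cross-entropy on the observed label, which is a quick sanity check for any implementation.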
Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems
Luo, Luo, Ye, Haishan, Zhang, Tong
We consider nonconvex-concave minimax problems of the form $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$, where $f$ is strongly concave in $\bf y$ but possibly nonconvex in $\bf x$. We focus on the stochastic setting, where we can only access an unbiased stochastic gradient estimate of $f$ at each iteration. This formulation includes many machine learning applications as special cases, such as adversarial training and certifying robustness in deep learning. We are interested in finding an ${\mathcal O}(\varepsilon)$-stationary point of the function $\Phi(\cdot)=\max_{\bf y} f(\cdot, {\bf y})$. The most popular algorithm for this problem is stochastic gradient descent ascent, which requires $\mathcal O(\kappa^3\varepsilon^{-4})$ stochastic gradient evaluations, where $\kappa$ is the condition number. In this paper, we propose a novel method called Stochastic Recursive gradiEnt Descent Ascent (SREDA), which estimates gradients more efficiently using variance reduction. This method achieves the best known stochastic gradient complexity of ${\mathcal O}(\kappa^3\varepsilon^{-3})$, and its dependency on $\varepsilon$ is optimal for this problem.
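To make the baseline concrete, here is a minimal sketch of plain stochastic gradient descent ascent (not SREDA, and without variance reduction) on a toy objective chosen for this example: $f(x, y) = xy - y^2/2$, which is strongly concave in $y$ and gives $\Phi(x) = x^2/2$ with its stationary point at $x = 0$. The toy objective, step sizes, noise level, and function names are all assumptions; the only feature taken from the abstract is that the method sees unbiased noisy gradients.

```python
import random

def noisy_grads(x, y, rng, noise=0.1):
    """Unbiased stochastic gradients of the toy objective f(x, y) = x*y - y**2/2."""
    gx = y + noise * rng.gauss(0.0, 1.0)        # df/dx = y, plus zero-mean noise
    gy = (x - y) + noise * rng.gauss(0.0, 1.0)  # df/dy = x - y, plus zero-mean noise
    return gx, gy

def sgda(x0, y0, steps=5000, lr_x=0.01, lr_y=0.1, seed=0):
    """Descend in x and ascend in y, using one noisy gradient pair per step."""
    rng = random.Random(seed)
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = noisy_grads(x, y, rng)
        x -= lr_x * gx  # minimization player
        y += lr_y * gy  # maximization player, with a larger step size
    return x, y

if __name__ == "__main__":
    x, y = sgda(2.0, -1.0)
    print(x, y)  # both end up near 0, up to gradient noise
```

The larger step size for $y$ reflects a common choice in which the inner maximization is tracked faster than the outer minimization moves.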
Latent Factor Analysis of Gaussian Distributions under Graphical Constraints
Hasan, Md Mahmudul, Wei, Shuangqing, Moharrer, Ali
Abstract: We explore the algebraic structure of the solution space of the convex optimization problem Constrained Minimum Trace Factor Analysis (CMTFA) when the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely a latent star topology. In particular, we show that CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. We find explicit conditions for both the rank $1$ and the rank $n-1$ CMTFA solutions of $\Sigma_x$. As a basic attempt towards building a more general Gaussian tree, we find a necessary and a sufficient condition for multiple clusters, each having a rank $1$ CMTFA solution, to satisfy a minimum probability to combine together to build a Gaussian tree. To support our analytical findings, we present some numerical results demonstrating the usefulness of the contributions of our work.
Index Terms: Factor Analysis, MTFA, CMTFA, CMDFA
I. INTRODUCTION
A. Motivation
Factor Analysis (FA) is a commonly used tool in multivariate statistics to represent the correlation structure of a set of observables in terms of a significantly smaller number of variables called "latent factors". With its growing use in data mining and high-dimensional data analytics, factor analysis has already become a prolific area of research [1] [2]. Classical factor analysis models seek to decompose the correlation matrix $\Sigma_x$ of an $n$-dimensional random vector $X \in \mathbb{R}^n$ as the sum of a diagonal matrix $D$ and a Gramian matrix $\Sigma_x - D$.
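The decomposition into a diagonal matrix plus a Gramian matrix can be illustrated with a small sketch. It builds the covariance of a star-shaped model in which each observable is a scaled copy of one latent factor plus independent noise (so $\Sigma_x = aa^\top + D$), then checks that removing $D$ leaves a rank 1 matrix. The loadings, noise variances, and function names are hypothetical, and the code only illustrates the decomposition; it does not solve the CMTFA optimization problem.

```python
def star_covariance(loadings, noise_vars):
    """Covariance of X_i = a_i * Y + Z_i with Var(Y) = 1 and independent noise Z_i."""
    n = len(loadings)
    return [
        [loadings[i] * loadings[j] + (noise_vars[i] if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]

def subtract_diagonal(matrix, diag):
    """Return matrix - D, where D is the diagonal matrix built from diag."""
    n = len(matrix)
    return [
        [matrix[i][j] - (diag[i] if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

def is_rank_one(matrix, tol=1e-12):
    """A nonzero matrix has rank 1 exactly when all its 2x2 minors vanish."""
    n = len(matrix)
    if all(abs(v) <= tol for row in matrix for v in row):
        return False
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                for l in range(k + 1, n):
                    minor = matrix[i][k] * matrix[j][l] - matrix[i][l] * matrix[j][k]
                    if abs(minor) > tol:
                        return False
    return True

if __name__ == "__main__":
    a = [0.9, 0.5, -0.4]     # hypothetical factor loadings
    d = [0.19, 0.75, 0.84]   # hypothetical noise variances (diagonal of D)
    sigma = star_covariance(a, d)
    print(is_rank_one(subtract_diagonal(sigma, d)), is_rank_one(sigma))
```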