Asymptotic Unbiasedness of the Permutation Importance Measure in Random Forest Models
Variable selection in sparse regression models is an important task, as applications ranging from biomedical research to econometrics have shown. Especially for higher dimensional regression problems, for which the link function between response and covariates cannot be directly detected, the selection of informative variables is challenging. Under these circumstances, the Random Forest method is a helpful tool to predict new outcomes while delivering measures for variable selection. One common approach is to use the permutation importance. Because of its intuitive idea and flexible usage, it is important to explore the circumstances under which the permutation importance based on Random Forest correctly indicates informative covariates. Regarding the latter, we deliver theoretical guarantees for the validity of the permutation importance measure under specific assumptions and prove its (asymptotic) unbiasedness. An extensive simulation study verifies our findings.
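The idea behind the permutation importance can be illustrated with a minimal sketch: shuffle one covariate, re-evaluate the prediction error, and record the increase. The code below is not the Random Forest estimator analysed in the paper. The `predict` function is a hypothetical stand-in for a fitted model, and the data-generating process, sample size, and number of repetitions are assumptions chosen only for illustration.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` on rows X with targets y."""
    return sum((predict(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, feature, n_repeats=20, seed=0):
    """Average increase in MSE after randomly permuting one feature column."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        # Rebuild the rows with only the chosen feature replaced.
        X_perm = [row[:feature] + [value] + row[feature + 1:]
                  for row, value in zip(X, column)]
        increases.append(mse(predict, X_perm, y) - baseline)
    return sum(increases) / n_repeats


# Assumed toy data: feature 0 is informative, feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]


def predict(row):
    # Hypothetical stand-in for a fitted model; it uses only feature 0.
    return 2.0 * row[0]


importance_informative = permutation_importance(predict, X, y, feature=0)
importance_noise = permutation_importance(predict, X, y, feature=1)
```

In this toy setting the informative covariate receives a clearly positive importance, while the noise covariate receives an importance of zero because the stand-in predictor ignores it.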
A Study into Echocardiography View Conversion
Abdi, Amir H., Jafari, Mohammad H., Fels, Sidney, Tsang, Theresa, Abolmaesumi, Purang
Transthoracic echo is one of the most common means of cardiac study in clinical routine. During the echo exam, the sonographer captures a set of standard cross sections (echo views) of the heart. Each 2D echo view cuts through the 3D cardiac geometry via a unique plane. Consequently, different views share some limited information. In this work, we investigate the feasibility of generating a 2D echo view from another view using adversarial generative models. The objective optimized to train the view-conversion model is based on the ideas introduced by LSGAN, PatchGAN and Conditional GAN (cGAN). The size and length of the left ventricle (LV) in the generated target echo view are compared against those of the target ground truth to assess the validity of the echo view conversion. Results show that there is a correlation of 0.50 between the LV areas and 0.49 between the LV lengths of the generated target frames and the real target frames.
Hybrid Kronecker Product Decomposition and Approximation
Cai, Chencheng, Chen, Rong, Xiao, Han
Discovering the underlying low dimensional structure of high dimensional data has attracted a significant amount of research recently and has been shown to have a wide range of applications. As an effective dimension reduction tool, singular value decomposition is often used to analyze high dimensional matrices, which are traditionally assumed to have a low rank matrix approximation. In this paper, we propose a new approach. We assume a high dimensional matrix can be approximated by a sum of a small number of Kronecker products of matrices with potentially different configurations, named a hybrid Kronecker outer Product Approximation (hKoPA). It provides an extremely flexible way of dimension reduction compared to the low-rank matrix approximation. Challenges arise in estimating a hKoPA when the configurations of component Kronecker products are different or unknown. We propose an estimation procedure for when the set of configurations is given and a joint configuration determination and component estimation procedure for when the configurations are unknown. Specifically, a least squares backfitting algorithm is used when the configuration is given. When the configuration is unknown, an iterative greedy algorithm is used. Both simulation and real image examples show that the proposed algorithms have promising performance. The hybrid Kronecker product approximation may have potentially wider applications in low dimensional representation of high dimensional data.
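To make the model form concrete, the sketch below builds a matrix as a sum of two Kronecker products whose factors have different configurations (shapes). It only illustrates the structure being approximated. The estimation procedures from the paper (backfitting and the greedy configuration search) are not shown, and the matrices and helper names are made up for the example.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]


def add(M, N):
    """Entrywise sum of two matrices of equal shape."""
    return [[m + n for m, n in zip(row_m, row_n)] for row_m, row_n in zip(M, N)]


# First term: (2 x 2) Kronecker (2 x 3) gives a 4 x 6 matrix.
A1 = [[1, 2], [3, 4]]
B1 = [[1, 0, 1], [0, 1, 0]]

# Second term, different configuration: (4 x 2) Kronecker (1 x 3), also 4 x 6.
A2 = [[1, 0], [0, 1], [1, 1], [2, 0]]
B2 = [[1, 1, 1]]

M = add(kron(A1, B1), kron(A2, B2))

# The two terms are described by 4 + 6 + 8 + 3 = 21 numbers,
# versus 24 entries in the full 4 x 6 matrix.
n_params = sum(len(F) * len(F[0]) for F in (A1, B1, A2, B2))
```

For larger matrices the gap between the number of factor entries and the number of matrix entries grows quickly, which is what makes this kind of representation a dimension reduction tool.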
A pedestrian path-planning model in accordance with obstacle's danger with reinforcement learning
Trinh, Thanh-Trung, Vu, Dinh-Minh, Kimura, Masaomi
Most microscopic pedestrian navigation models use the concept of "forces" applied to the pedestrian agents to replicate the navigation environment. While the approach could provide believable results in regular situations, it does not always resemble natural pedestrian navigation behaviour in many typical settings. In our research, we proposed a novel approach using reinforcement learning for simulation of pedestrian agent path planning and collision avoidance problem. The primary focus of this approach is using human perception of the environment and danger awareness of interferences. The implementation of our model has shown that the path planned by the agent shares many similarities with a human pedestrian in several aspects such as following common walking conventions and human behaviours.
Risk-Aware MMSE Estimation
Kalogerias, Dionysios S., Chamon, Luiz F. O., Pappas, George J., Ribeiro, Alejandro
Despite the simplicity and intuitive interpretation of Minimum Mean Squared Error (MMSE) estimators, their effectiveness in certain scenarios is questionable. Indeed, minimizing squared errors on average does not provide any form of stability, as the volatility of the estimation error is left unconstrained. When this volatility is statistically significant, the average and realized performance of the MMSE estimator can differ drastically. To address this issue, we introduce a new risk-aware MMSE formulation which trades between mean performance and risk by explicitly constraining the expected predictive variance of the involved squared error. We show that, under mild moment boundedness conditions, the corresponding risk-aware optimal solution can be evaluated explicitly, and has the form of an appropriately biased nonlinear MMSE estimator. We further illustrate the effectiveness of our approach via several numerical examples, which also showcase the advantages of risk-aware MMSE estimation against risk-neutral MMSE estimation, especially in models involving skewed, heavy-tailed distributions.
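One way to write down the formulation described above is sketched below. The notation is an assumption made for illustration and may differ from the paper's: X is the hidden quantity, Y the observation, \hat{X}(Y) the estimator, and \varepsilon a user-chosen risk tolerance.

```latex
% Risk-neutral MMSE: minimize the mean squared error only.
\min_{\hat{X}} \; \mathbb{E}\big[ \| X - \hat{X}(Y) \|_2^2 \big]

% Risk-aware variant (sketch): same objective, with the expected
% conditional (predictive) variance of the squared error bounded by \varepsilon.
\min_{\hat{X}} \; \mathbb{E}\big[ \| X - \hat{X}(Y) \|_2^2 \big]
\quad \text{subject to} \quad
\mathbb{E}\Big[ \operatorname{Var}\big( \| X - \hat{X}(Y) \|_2^2 \,\big|\, Y \big) \Big] \le \varepsilon
```

A small \varepsilon favours estimators whose squared error fluctuates little given the observation, at the price of a larger average error. Letting \varepsilon grow recovers the risk-neutral problem.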
Optimization algorithms inspired by the geometry of dissipative systems
Bravetti, Alessandro, Daza-Torres, Maria L., Flores-Arguedas, Hugo, Betancourt, Michael
Alessandro Bravetti 1, Maria L. Daza-Torres 2, Hugo Flores-Arguedas 3, and Michael Betancourt 4
1 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas (IIMAS), Universidad Nacional Autónoma de México, A.P. 70-543, 04510 Ciudad de México, México, alessandro.bravetti@iimas.unam.mx
2 Universidad de Guadalajara, Guadalajara, México, mdazatorres@cimat.mx
3 Centro de Investigación en Matemáticas (CIMAT), Guanajuato, México, hugo.flores@cimat.mx
4 Symplectomorphic LLC, New York, USA, betan@symplectomorphic.com
arXiv:1912.02928v1
Abstract: Accelerated gradient methods are a powerful optimization tool in machine learning and statistics but their development has traditionally been driven by heuristic motivations. Recent research, however, has demonstrated that these methods can be derived as discretizations of dynamical systems, which in turn has provided a basis for more systematic investigations, especially into the structure of those dynamical systems and their structure preserving discretizations. In this work we introduce dynamical systems defined through a contact geometry which are not only naturally suited to the optimization goal but also subsume all previous methods based on geometric dynamical systems. These contact dynamical systems also admit a natural, robust discretization through geometric contact integrators. We demonstrate these features in paradigmatic examples which show that we can indeed obtain optimization algorithms that achieve oracle lower bounds on convergence rates while also improving on previous proposals in terms of stability.
Keywords: optimization, accelerated gradient, geometric integrators, contact geometry
Despite their practical utility and explicit convergence bounds, accelerated gradient methods have long been difficult to motivate from a fundamental theory.
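For readers unfamiliar with the accelerated gradient methods mentioned above, here is a minimal sketch of the classical Nesterov iteration on a small quadratic. This is the standard textbook scheme, not the contact-geometry integrators proposed in the paper. The test function, step size, and momentum schedule (k - 1) / (k + 2) are assumptions chosen for illustration.

```python
def objective(x):
    """Ill-conditioned convex quadratic f(x) = 0.5 * (x0^2 + 10 * x1^2)."""
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad(x):
    """Gradient of the quadratic objective."""
    return [x[0], 10.0 * x[1]]


def nesterov(x0, step, n_iters):
    """Nesterov accelerated gradient: extrapolate, then take a gradient step."""
    x = list(x0)
    x_prev = list(x0)
    for k in range(1, n_iters + 1):
        momentum = (k - 1) / (k + 2)
        # Look-ahead point built from the last two iterates.
        y = [xi + momentum * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev = x
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x


# Step size 1 / L, where L = 10 is the largest curvature of the quadratic.
x_final = nesterov([1.0, 1.0], step=0.1, n_iters=200)
final_value = objective(x_final)
```

The extrapolation step is what distinguishes the method from plain gradient descent, and it is this momentum-like term that the dynamical-systems viewpoint interprets as a discretized second-order flow.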
On the Intrinsic Privacy of Stochastic Gradient Descent
Hyland, Stephanie L., Tople, Shruti
Stephanie L. Hyland (Microsoft Research), Shruti Tople (Microsoft Research)
Abstract: Protecting the privacy of training data is important for the safe deployment of machine learning models. Private learning algorithms have been proposed that ensure strong differential-privacy (DP) guarantees. However, the additional noise required for such protection comes at the cost of reduced model utility. Meanwhile, the stochastic gradient descent (SGD) method, the most common optimization algorithm for neural networks, contains intrinsic randomness which has not been leveraged for privacy. Arguing that SGD guarantees intrinsic privacy, we investigate the extent to which this privacy can be quantified and used to improve the utility of privately learned models. In effect, we ask the question: "If SGD were a differentially-private mechanism, how good would it be?" In this work, we take the first step towards analysing the intrinsic privacy properties of SGD. Our primary contribution is a large-scale empirical analysis of SGD on both convex and non-convex objectives. To this end, we evaluate the inherent variability due to the stochasticity in SGD on 3 different datasets and calculate the ε values due to the intrinsic noise. First, we show that the variability in model parameters due to the random sampling almost always exceeds that due to changes in the data. We observe that SGD provides intrinsic ε values of 7.8, 6.9 [...]. Next, we propose a method to augment the intrinsic noise of SGD with additional noise to achieve the desired ε. Our augmented SGD outputs models that outperform existing approaches with the same privacy guarantee, thus closing the gap to noiseless utility between 0 [...]. Finally, we show that the existing theoretical bound on the sensitivity of SGD is not tight. By estimating the tightest bound empirically, we achieve near-noiseless performance at ε = 1, closing the utility gap to the noiseless model between 3 [...].
Our experiments provide concrete evidence that changing the seed in SGD is likely to have a far greater impact on the resulting model than including or excluding any given training example. By properly accounting for this intrinsic randomness, higher utility can be achieved without sacrificing further privacy. With these results, we hope to inspire the research community to further explore and characterise the randomness in SGD, its impact on privacy, and the parallels with generalisation in machine learning.
INTRODUCTION
Respecting the privacy of users contributing their data to train machine learning models is important.
A Clustering Approach to Edge Controller Placement in Software Defined Networks with Cost Balancing
Soleymanifar, Reza, Srivastava, Amber, Beck, Carolyn, Salapaka, Srinivasa
Abstract: In this work we introduce two novel deterministic annealing based clustering algorithms to address the problem of Edge Controller Placement (ECP) in wireless edge networks. These networks lie at the core of the fifth generation (5G) wireless systems and beyond. These algorithms, ECP-LL and ECP-LB, address the dominant leader-less and leader-based controller placement topologies and have linear computational complexity in terms of network size, maximum number of clusters and dimensionality of data. Each algorithm tries to place controllers close to edge node clusters and not far away from other controllers to maintain a reasonable balance between synchronization and delay costs. While the ECP problem can be conveniently expressed as a multi-objective mixed integer nonlinear program (MINLP), our algorithms outperform the state-of-the-art MINLP solver BARON in terms of both accuracy and speed. Our proposed algorithms have the competitive edge of avoiding poor local minima through a Shannon entropy term in the clustering objective function. Most ECP algorithms are highly susceptible to poor local minima and greatly depend on initialization.
Keywords: Clustering, deterministic annealing, 5G networks, software defined networks, wireless edge networks, edge controller placement
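The role of the entropy term can be seen in a generic deterministic annealing clustering loop, sketched below for one-dimensional data. This is a textbook-style illustration, not ECP-LL or ECP-LB: there is no synchronization cost between controllers, and the cooling schedule, perturbation size, and toy data are all assumptions.

```python
import math


def soft_assign(x, centers, temperature):
    """Gibbs association weights p(j | x) proportional to exp(-(x - c_j)^2 / T)."""
    d = [(x - c) ** 2 for c in centers]
    d_min = min(d)  # shift by the minimum distance for numerical stability
    w = [math.exp(-(dj - d_min) / temperature) for dj in d]
    total = sum(w)
    return [wj / total for wj in w]


def deterministic_annealing(points, k, t_start=50.0, t_end=1e-3,
                            cooling=0.8, inner_iters=20):
    """Cluster 1-D points by minimizing distortion minus T times entropy."""
    mean = sum(points) / len(points)
    centers = [mean] * k  # at high temperature all centers coincide
    t = t_start
    while t > t_end:
        # Small deterministic perturbation so coincident centers can split.
        centers = [c + 1e-3 * (j - (k - 1) / 2) for j, c in enumerate(centers)]
        for _ in range(inner_iters):
            probs = [soft_assign(x, centers, t) for x in points]
            for j in range(k):
                weight = sum(p[j] for p in probs)
                if weight > 0:
                    centers[j] = sum(p[j] * x for p, x in zip(probs, points)) / weight
        t *= cooling  # lowering T reduces the weight of the entropy term
    return sorted(centers)


# Assumed toy data: two well-separated groups of edge nodes on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers = deterministic_annealing(points, k=2)
```

At high temperature the entropy term dominates and every point is shared almost equally among the centers, which makes the result independent of initialization. As the temperature drops the assignments harden and the centers settle on the two groups.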
Transfer Learning from an Auxiliary Discriminative Task for Unsupervised Anomaly Detection
Muaz, Urwa, Sobolevsky, Stanislav
Unsupervised anomaly detection from high dimensional data like mobility networks is a challenging task. The study of different approaches to feature engineering from such high dimensional data has been a focus of research in this field. This study aims to investigate the transferability of features learned by network classification to unsupervised anomaly detection. We propose the use of an auxiliary classification task to extract features from unlabelled data by supervised learning, which can be used for unsupervised anomaly detection. We validate this approach by designing experiments to detect anomalies in mobility network data from New York and Taipei, and compare the results to traditional unsupervised feature learning approaches of PCA and autoencoders. We find that our feature learning approach yields the best anomaly detection performance for both datasets, outperforming other studied approaches. This establishes the utility of this approach to feature engineering, which can be applied to other problems of similar nature.
Combining Q-Learning and Search with Amortized Value Estimates
Hamrick, Jessica B., Bapst, Victor, Sanchez-Gonzalez, Alvaro, Pfaff, Tobias, Weber, Theophane, Buesing, Lars, Battaglia, Peter W.
We introduce "Search with Amortized Value Estimates" (SAVE), an approach for combining model-free Q-learning with model-based Monte-Carlo Tree Search (MCTS). In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values. The new Q-estimates are then used in combination with real experience to update the prior. This effectively amortizes the value computation performed by MCTS, resulting in a cooperative relationship between model-free learning and model-based search. SAVE can be implemented on top of any Q-learning agent with access to a model, which we demonstrate by incorporating it into agents that perform challenging physical reasoning tasks and Atari. SAVE consistently achieves higher rewards with fewer training steps, and, in contrast to typical model-based search approaches, yields strong performance with very small search budgets. By combining real experience with information computed during search, SAVE demonstrates that it is possible to improve on both the performance of model-free learning and the computational cost of planning.