Evolutionary Computation in Astronomy and Astrophysics: A Review

arXiv.org Artificial Intelligence

In general, Evolutionary Computation (EC) comprises a number of optimization methods inspired by the biological mechanisms of evolution. The methods catalogued in this area use Darwinian principles of evolution to produce algorithms that return high-quality solutions to hard-to-solve optimization problems. The main strength of EC is precisely that it provides good solutions even when computational resources (e.g., running time) are limited. Astronomy and Astrophysics are two fields that often require optimizing problems of high complexity or analyzing huge amounts of data, and so-called complete optimization methods are inherently limited by the size of the problem/data. For instance, reliable analysis of large amounts of data is central to modern astrophysics and the astronomical sciences in general. EC techniques perform well where other optimization methods are inherently limited (such as complete methods applied to NP-hard problems), and in the last ten years numerous proposals have emerged that apply, with greater or lesser success, evolutionary computation methodologies to common engineering problems. Some of these problems, such as the estimation of non-linear parameters, the development of automatic learning techniques, the implementation of control systems, or the resolution of multi-objective optimization problems, have had (and continue to have) a special impact in these fields. For these reasons, EC emerges as a feasible alternative to traditional methods. In this paper, we discuss some promising applications in this direction and a number of recent works in this area; the paper also includes a general description of EC to provide a global perspective to the reader and gives some guidelines for applying EC techniques in future research.
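
As a point of reference for readers unfamiliar with EC, the sketch below shows a minimal evolutionary loop in Python: random initialisation, fitness-based selection, Gaussian mutation, and elitist replacement. It is a generic illustration of the paradigm, not any specific algorithm surveyed in the paper; the population size, mutation scale, and the sphere-function test problem are arbitrary choices made for the example.

import random

def evolve(fitness, dim=10, pop_size=50, generations=200,
           mutation_scale=0.1, bounds=(-5.0, 5.0)):
    """Generic (mu + lambda)-style evolutionary loop: fitness-based selection,
    Gaussian mutation, elitist replacement. Illustrative only."""
    lo, hi = bounds
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                 # lower fitness = better here
        parents = pop[: pop_size // 2]        # truncation selection
        children = [[min(hi, max(lo, x + random.gauss(0.0, mutation_scale)))
                     for x in p] for p in parents]
        pop = parents + children              # elitist replacement
    return min(pop, key=fitness)

# Toy usage: minimise the sphere function sum(x_i^2).
best = evolve(lambda x: sum(v * v for v in x))
print(best)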


Robust Spatio-Temporal Signal Recovery from Noisy Counts in Social Media

arXiv.org Artificial Intelligence

Many real-world phenomena can be represented by a spatio-temporal signal: where, when, and how much. Social media is a tantalizing data source for those who wish to monitor such signals. Unlike most prior work, we assume that the target phenomenon is known and we are given a method to count its occurrences in social media. However, counting is plagued by sample bias, incomplete data, and, paradoxically, data scarcity -- issues inadequately addressed by prior work. We formulate signal recovery as a Poisson point process estimation problem. We explicitly incorporate human population bias, time delays and spatial distortions, and spatio-temporal regularization into the model to address the noisy count issues. We present an efficient optimization algorithm and discuss its theoretical properties. We show that our model is more accurate than commonly-used baselines. Finally, we present a case study on wildlife roadkill monitoring, where our model produces qualitatively convincing results.
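
The sketch below conveys the flavour of the estimation problem in a heavily simplified, one-dimensional form: counts follow a Poisson model with a known population offset, and a temporal smoothness penalty regularises the recovered log-intensity. The function name, step size, and penalty weight are illustrative assumptions; the paper's full model also handles time delays and spatial distortions and uses a more sophisticated optimizer.

import numpy as np

def recover_intensity(counts, population, lam=1.0, lr=0.01, iters=2000):
    """Penalised Poisson MLE on a 1-D time grid (illustrative sketch).

    Model:     counts[t] ~ Poisson(population[t] * exp(f[t]))
    Objective: sum_t (counts[t] * (f[t] + log pop[t]) - pop[t] * exp(f[t]))
               - lam * sum_t (f[t+1] - f[t])^2          (temporal smoothness)
    maximised by plain gradient ascent; the step size may need tuning.
    """
    counts = np.asarray(counts, dtype=float)
    offset = np.log(np.asarray(population, dtype=float))
    f = np.zeros_like(counts)
    for _ in range(iters):
        rate = np.exp(f + offset)
        grad = counts - rate                  # gradient of the Poisson log-likelihood
        diff = np.diff(f)                     # gradient of the smoothness penalty
        grad[:-1] += 2.0 * lam * diff
        grad[1:] -= 2.0 * lam * diff
        f += lr * grad
    return np.exp(f)                          # recovered per-capita intensity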


Robust Nonnegative Matrix Factorization via $L_1$ Norm Regularization

arXiv.org Machine Learning

Nonnegative Matrix Factorization (NMF) is a widely used technique in many applications such as face recognition, motion segmentation, etc. It approximates the nonnegative data in an original high dimensional space with a linear representation in a low dimensional space by using the product of two nonnegative matrices. In many applications data are often partially corrupted with large additive noise. When the positions of noise are known, some existing variants of NMF can be applied by treating these corrupted entries as missing values. However, the positions are often unknown in many real world applications, which prevents the usage of traditional NMF or other existing variants of NMF. This paper proposes a Robust Nonnegative Matrix Factorization (RobustNMF) algorithm that explicitly models the partial corruption as large additive noise without requiring the information of positions of noise. In practice, large additive noise can be used to model outliers. In particular, the proposed method jointly approximates the clean data matrix with the product of two nonnegative matrices and estimates the positions and values of outliers/noise. An efficient iterative optimization algorithm with a solid theoretical justification has been proposed to learn the desired matrix factorization. Experimental results demonstrate the advantages of the proposed algorithm.
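
A rough sketch of the modelling idea, assuming the decomposition X ≈ WH + S with S sparse: alternate standard multiplicative NMF updates against the outlier-corrected data, then soft-threshold the residual to re-estimate the outlier matrix S. This is an illustrative variant, not the paper's exact algorithm or its update rules.

import numpy as np

def robust_nmf(X, rank, lam=0.1, iters=200, eps=1e-9):
    """Illustrative robust NMF: X ~ W H + S with S sparse (outliers).

    Alternates multiplicative updates for nonnegative W, H against the
    outlier-corrected matrix max(X - S, 0), then soft-thresholds the
    residual to re-estimate S. Not the paper's exact update rules.
    """
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    S = np.zeros_like(X, dtype=float)
    for _ in range(iters):
        R = np.maximum(X - S, 0.0)                  # outlier-corrected, nonnegative target
        WH = W @ H
        W *= (R @ H.T) / (WH @ H.T + eps)           # multiplicative update for W
        WH = W @ H
        H *= (W.T @ R) / (W.T @ WH + eps)           # multiplicative update for H
        resid = X - W @ H
        S = np.sign(resid) * np.maximum(np.abs(resid) - lam, 0.0)  # L1 soft-threshold
    return W, H, S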


Estimation of causal orders in a linear non-Gaussian acyclic model: a method robust against latent confounders

arXiv.org Machine Learning

We consider learning a causal ordering of variables in a linear non-Gaussian acyclic model called LiNGAM. Several existing methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct, but the estimation results can be distorted if some of those assumptions are in fact violated. In this paper, we propose a new algorithm for learning causal orders that is robust against one typical violation of the model assumptions: latent confounders. We demonstrate the effectiveness of our method using artificial data.
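
For orientation, the sketch below shows a simplified DirectLiNGAM-style ordering procedure without any latent-confounder handling (which is precisely what the paper adds): repeatedly pick the variable whose regression residuals look most independent of it, using a crude nonlinear-correlation proxy, then regress it out of the remaining variables. The function name and the dependence proxy are assumptions made for illustration, not the paper's robust estimator.

import numpy as np

def causal_order(X):
    """Simplified DirectLiNGAM-style ordering (no latent-confounder handling).

    Repeatedly picks the variable whose regression residuals look most
    independent of it -- using |corr(tanh(x_j), residual)| as a crude
    dependence proxy -- then regresses it out of the remaining variables.
    Illustrative only; the paper's robust estimator works differently.
    """
    X = np.asarray(X, dtype=float).copy()
    X = X - X.mean(axis=0)
    remaining = list(range(X.shape[1]))
    order = []
    while remaining:
        scores = []
        for j in remaining:
            xj = X[:, j]
            dep = 0.0
            for k in remaining:
                if k == j:
                    continue
                beta = xj @ X[:, k] / (xj @ xj)
                resid = X[:, k] - beta * xj                 # residual of x_k on x_j
                dep += abs(np.corrcoef(np.tanh(xj), resid)[0, 1])
            scores.append((dep, j))
        _, best = min(scores)                               # most plausibly exogenous
        order.append(best)
        xb = X[:, best]
        for k in remaining:
            if k != best:
                X[:, k] -= (xb @ X[:, k] / (xb @ xb)) * xb  # remove its influence
        remaining.remove(best)
    return order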


Matrix Completion from Noisy Entries

arXiv.org Machine Learning

Given a low-rank matrix M, we consider the problem of reconstructing it from noisy observations of a small, random subset of its entries. The problem arises in a variety of applications, from collaborative filtering (the 'Netflix problem') to structure-from-motion and positioning. We study a low-complexity algorithm introduced by Keshavan et al. (2009), based on a combination of spectral techniques and manifold optimization, which we call OptSpace here. We prove performance guarantees that are order-optimal in a number of circumstances.
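
The sketch below is not OptSpace itself but a simpler spectral baseline in the same spirit (iterative rank-r SVD imputation, often called "hard impute"): it conveys how spectral structure can fill in missing entries. The function name and iteration count are arbitrary choices for the example.

import numpy as np

def complete_matrix(M_obs, mask, rank, iters=100):
    """Iterative rank-r SVD imputation ("hard impute") for noisy completion.

    Not OptSpace; a simpler spectral baseline: fill the missing entries with
    the current rank-r approximation, re-factor, repeat. mask is a boolean
    array marking the observed entries of M_obs.
    """
    X = np.where(mask, M_obs, 0.0)
    approx = X
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-r approximation of X
        X = np.where(mask, M_obs, approx)               # keep observed entries fixed
    return approx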



Skin-color based videos categorization

arXiv.org Artificial Intelligence

On dedicated websites, people can upload videos and share them with the rest of the world. Currently these videos are categorized manually with the help of the user community. In this paper, we propose a combination of color spaces with a Bayesian network approach for robust detection of skin color, followed by automated video categorization. Experimental results show that our method achieves satisfactory performance for categorizing videos based on skin color.
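
As a rough illustration of colour-space-based skin detection, the sketch below combines a common RGB heuristic with a YCbCr chrominance box and categorises a video by its average skin-pixel ratio. The threshold values are standard heuristics from the literature, not the paper's learned Bayesian-network model, and the category names are invented for the example.

import numpy as np

def skin_ratio(frame_rgb):
    """Fraction of pixels that look like skin, via simple RGB and YCbCr rules.

    A rule-based stand-in for the paper's Bayesian-network skin model; the
    threshold ranges are common literature heuristics, not the paper's values.
    frame_rgb: H x W x 3 array of 8-bit RGB values.
    """
    rgb = np.asarray(frame_rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # RGB heuristic: bright, reddish pixels.
    rule_rgb = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & (np.abs(r - g) > 15)
    # YCbCr heuristic: chrominance inside a typical skin box.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    rule_ycbcr = (cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)
    return float((rule_rgb & rule_ycbcr).mean())

def categorize_video(frames, threshold=0.25):
    """Label a video by its average skin-pixel ratio over sampled frames."""
    ratio = float(np.mean([skin_ratio(f) for f in frames]))
    return ("skin-dominant" if ratio > threshold else "other"), ratio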


Knapsack based Optimal Policies for Budget-Limited Multi-Armed Bandits

arXiv.org Artificial Intelligence

In budget-limited multi-armed bandit (MAB) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as in other variants of MAB, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing MABs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) KUBE; and (ii) fractional KUBE. The former provides up to 40% better performance in our experimental settings, whereas the latter is computationally less expensive. We also prove logarithmic upper bounds on the regret of both policies, and show that these bounds are asymptotically optimal (i.e., they differ from the best possible regret only by a constant factor).
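
The sketch below gives a greedy density-index policy in the spirit of fractional KUBE, assuming known per-pull costs: each round it pulls the affordable arm with the largest upper-confidence reward estimate per unit cost until the budget runs out. It is an illustrative stand-in, not the exact KUBE or fractional KUBE policies analysed in the paper.

import math
import random

def budgeted_density_policy(arms, budget):
    """Greedy density-index policy for budget-limited bandits (sketch).

    arms: list of (cost, sample_reward_fn). Each round pulls the affordable
    arm with the largest UCB reward estimate per unit cost, in the spirit of
    fractional KUBE; not the exact policies analysed in the paper.
    """
    n = [0] * len(arms)            # pull counts
    mean = [0.0] * len(arms)       # empirical mean rewards
    total, t = 0.0, 0
    while True:
        affordable = [i for i, (c, _) in enumerate(arms) if c <= budget]
        if not affordable:
            return total
        t += 1
        untried = [i for i in affordable if n[i] == 0]
        if untried:
            i = untried[0]         # pull every affordable arm once first
        else:
            i = max(affordable, key=lambda k:
                    (mean[k] + math.sqrt(2 * math.log(t) / n[k])) / arms[k][0])
        cost, sample = arms[i]
        reward = sample()
        n[i] += 1
        mean[i] += (reward - mean[i]) / n[i]
        budget -= cost
        total += reward

# Toy usage: two Bernoulli arms with different costs and a budget of 50.
arms = [(1.0, lambda: float(random.random() < 0.3)),
        (2.0, lambda: float(random.random() < 0.7))]
print(budgeted_density_policy(arms, budget=50.0))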


Randomized Smoothing for Stochastic Optimization

arXiv.org Machine Learning

We analyze convergence rates of stochastic optimization procedures for non-smooth convex optimization problems. By combining randomized smoothing techniques with accelerated gradient methods, we obtain convergence rates of stochastic optimization procedures, both in expectation and with high probability, that have optimal dependence on the variance of the gradient estimates. To the best of our knowledge, these are the first variance-based rates for non-smooth optimization. We give several applications of our results to statistical estimation problems, and provide experimental results that demonstrate the effectiveness of the proposed algorithms. We also describe how a combination of our algorithm with recent work on decentralized optimization yields a distributed stochastic optimization algorithm that is order-optimal.
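
A minimal sketch of the randomized-smoothing idea: approximate the gradient of the smoothed objective f_mu(x) = E[f(x + mu*Z)] by averaging subgradients at Gaussian-perturbed points, then take a gradient step. The paper combines this estimator with accelerated methods; the plain (non-accelerated) step, the parameter values, and the l1 toy problem below are simplifying assumptions.

import numpy as np

def smoothed_sgd(subgrad, x0, steps=500, lr=0.05, mu=0.1, samples=10, seed=0):
    """Randomized-smoothing gradient sketch for a non-smooth objective.

    Approximates the gradient of the smoothed function
        f_mu(x) = E_Z[ f(x + mu * Z) ],   Z ~ N(0, I)
    by averaging subgradients of f at Gaussian-perturbed points, then takes a
    plain (not accelerated) gradient step. Illustrative of the idea only.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        g = np.zeros_like(x)
        for _ in range(samples):
            z = rng.standard_normal(x.shape)
            g += subgrad(x + mu * z)          # subgradient at a perturbed point
        x -= lr * g / samples
    return x

# Toy usage: minimise the non-smooth f(x) = ||x||_1 (subgradient = sign).
print(smoothed_sgd(np.sign, np.array([3.0, -2.0, 1.5])))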


Clustering and Bayesian network for image of faces classification

arXiv.org Artificial Intelligence

In a content-based image retrieval (CBIR) system, target images are sorted by feature similarity with respect to the query. In this paper, we propose a new approach combining tangent distance, the k-means algorithm, and Bayesian networks for image classification. First, we use tangent distance to compute several tangent spaces representing the same image, with the aim of reducing the error in the classification phase. Second, we partition the image into blocks and compute a vector of descriptors for each block. Then, we use k-means to cluster the low-level features, including color and texture information, to build a vector of labels for each image. Finally, we apply five variants of Bayesian network classifiers (Naïve Bayes (NB), Global Tree Augmented Naïve Bayes (GTAN), Global Forest Augmented Naïve Bayes (GFAN), Tree Augmented Naïve Bayes for each class (TAN), and Forest Augmented Naïve Bayes for each class (FAN)) to classify the images of faces using the vector of labels. To validate the feasibility and effectiveness of the approach, we compare the results of GFAN to FAN and to the other classifiers (NB, GTAN, TAN). The results show that FAN outperforms GFAN, NB, GTAN, and TAN in overall classification accuracy.
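
A compressed sketch of the pipeline, under simplifying assumptions: per-block mean-colour descriptors stand in for the colour/texture features, a shared k-means codebook produces the vector of labels, and a multinomial Naïve Bayes classifier stands in for the five Bayesian-network variants; the tangent-distance step is omitted. All names and parameters are illustrative.

import numpy as np

def block_means(image, block=8):
    """Mean colour of each block x block tile: a crude low-level descriptor."""
    h, w, c = image.shape
    return np.array([image[i:i + block, j:j + block].reshape(-1, c).mean(axis=0)
                     for i in range(0, h - block + 1, block)
                     for j in range(0, w - block + 1, block)])

def fit_codebook(images, k=16, iters=50, block=8, seed=0):
    """k-means over all training block descriptors -> a shared label codebook."""
    rng = np.random.default_rng(seed)
    X = np.vstack([block_means(im, block) for im in images]).astype(float)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return centers

def label_hist(image, centers, block=8):
    """Histogram of nearest-codeword labels for one image (its 'vector of labels')."""
    X = block_means(image, block).astype(float)
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(centers)) / len(labels)

def nb_fit(hists, classes, alpha=1.0):
    """Multinomial Naive Bayes on label histograms (stands in for the BN variants)."""
    H, y = np.asarray(hists, dtype=float), np.asarray(classes)
    labels = sorted(set(classes))
    priors = {c: float(np.mean(y == c)) for c in labels}
    likes = {c: (H[y == c].sum(0) + alpha) / (H[y == c].sum() + alpha * H.shape[1])
             for c in labels}
    return priors, likes

def nb_predict(hist, priors, likes):
    """Pick the class with the highest log-posterior score for one histogram."""
    scores = {c: np.log(priors[c]) + float(np.sum(hist * np.log(likes[c])))
              for c in priors}
    return max(scores, key=scores.get)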