 Mathematical & Statistical Methods


best way to be a machine learning engineer

#artificialintelligence

Becoming a machine learning engineer requires a combination of skills and knowledge in areas such as mathematics, programming, data analysis, and machine learning algorithms. Learn the basics of mathematics and statistics: Machine learning requires a strong foundation in mathematics and statistics. You should be familiar with calculus, linear algebra, probability, and statistics. Master a programming language: You should learn a programming language such as Python or R, both of which are commonly used for machine learning. You should also be familiar with data structures, algorithms, and object-oriented programming.


Metrizing Fairness

arXiv.org Artificial Intelligence

We study supervised learning problems for predicting properties of individuals who belong to one of two demographic groups, and we seek predictors that are fair according to statistical parity. This means that the distributions of the predictions within the two groups should be close with respect to the Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators if unfairness is measured by the squared $\mathcal L^2$-distance or by a squared maximum mean discrepancy. In this case, the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs -- sometimes orders of magnitude faster. Finally, we identify conditions under which statistical parity can improve prediction accuracy.
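As a rough illustration of the squared maximum mean discrepancy mentioned above, the sketch below computes the standard unbiased U-statistic estimate of the squared MMD between the predictions of two groups. The Gaussian kernel, the bandwidth, and all names (`gaussian_kernel`, `mmd2_unbiased`, `preds_group0`, and so on) are assumptions made for this example; the paper's own estimator and kernel may differ.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between two 1-D arrays of predictions."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased U-statistic estimate of the squared MMD between samples x and y."""
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Drop the diagonal so each within-group average uses distinct pairs only.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()

# Hypothetical predictions for two demographic groups (synthetic data).
rng = np.random.default_rng(0)
preds_group0 = rng.normal(0.0, 1.0, size=500)
preds_same = rng.normal(0.0, 1.0, size=500)      # same distribution as group 0
preds_shifted = rng.normal(1.0, 1.0, size=500)   # shifted distribution

mmd_same = mmd2_unbiased(preds_group0, preds_same)
mmd_shifted = mmd2_unbiased(preds_group0, preds_shifted)
print(mmd_same, mmd_shifted)
```

Because the estimate is an average over pairs of samples, a mini-batch version of it has the same expectation, which is the property that makes a penalty of this kind compatible with stochastic gradient methods.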


Applications of Population Growth part1(Non Linear Dynamics)

#artificialintelligence

Abstract: In varying environments it is beneficial for organisms to utilize available cues to infer the conditions they may encounter and express potentially favorable traits. However, external cues can be unreliable or too costly to use. We consider an alternative strategy where organisms exploit internal sources of information. Even without sensing environmental cues, their internal states may become correlated with the environment as a result of selection, which then form a memory that helps predict future conditions. To demonstrate the adaptive value of such internal memory in varying environments, we revisit the classic example of seed dormancy in annual plants.


The joint node degree distribution in the Erd\H{o}s-R\'enyi network

arXiv.org Artificial Intelligence

The Erd\H{o}s-R\'enyi random graph is the simplest model for node degree distribution, and it is one of the most widely studied. In this model, pairs of $n$ vertices are selected and connected uniformly at random with probability $p$; consequently, the degree of a given vertex follows the binomial distribution. If the number of vertices is large, the binomial can be approximated by a normal distribution using the Central Limit Theorem, which is often allowed when $\min (np, n(1-p)) > 5$. This is true for every node independently. However, because the degrees of nodes in a graph are not independent, we aim in this paper to test whether the degrees of all nodes in the Erd\H{o}s-R\'enyi graph collectively have a multivariate normal (MVN) distribution. The hypothesis that the binomial is the distribution for the whole set of nodes is rejected by a chi-square goodness-of-fit test because of the dependence between degrees. Before testing MVN, we show that the covariance and correlation between the degrees of any pair of nodes in the graph are $p(1-p)$ and $1/(n-1)$, respectively. We test MVN under two assumptions, independent and dependent degrees, and we base our results on the percentages of rejected chi-square statistics, the $p$-values of the Anderson-Darling test, and a CDF comparison. We always achieve a good fit of the multivariate normal distribution with large values of $n$ and $p$, and a very poor fit when $n$ or $p$ is very small. The approximation seems valid when $np \geq 10$. We also compare the maximum likelihood estimates of $p$ in the MVN distribution under the independence and dependence assumptions. The estimators are assessed using bias, variance and mean square error.
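The covariance and correlation values quoted above can be checked with a small Monte Carlo simulation. The sketch below is only an illustration, not the authors' procedure: the graph size, edge probability, number of trials, and the helper name `sample_er_degrees` are all assumptions chosen for this example.

```python
import numpy as np

def sample_er_degrees(n, p, rng):
    """Return the degree vector of one Erdos-Renyi G(n, p) sample."""
    upper = np.triu(rng.random((n, n)) < p, k=1)  # independent edges above the diagonal
    adj = upper | upper.T                          # symmetrize; no self-loops
    return adj.sum(axis=1)

rng = np.random.default_rng(0)
n, p, trials = 20, 0.3, 20000                      # illustrative values only
degrees = np.array([sample_er_degrees(n, p, rng) for _ in range(trials)])

# Empirical covariance and correlation between the degrees of nodes 0 and 1.
emp_cov = np.cov(degrees[:, 0], degrees[:, 1])[0, 1]
emp_corr = np.corrcoef(degrees[:, 0], degrees[:, 1])[0, 1]

theory_cov = p * (1 - p)       # the two degrees share exactly one potential edge
theory_corr = 1 / (n - 1)      # each degree has variance (n - 1) p (1 - p)
print(emp_cov, theory_cov, emp_corr, theory_corr)
```

The two degrees depend on each other only through the single edge that may join the pair, which is why the covariance equals the variance of one Bernoulli($p$) variable.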


Domain Randomization for Robust, Affordable and Effective Closed-loop Control of Soft Robots

arXiv.org Artificial Intelligence

Figure 1: From top to bottom: a) naïve RL with training directly on the real world; b) RL where the policy is trained in simulation and tested on the real world; c) Sim-to-Real transfer with domain randomization increases robustness to modelling errors and enables environmental constraints exploitation; d) posterior distributions over simulator parameters may be automatically inferred from real-world data for use with DR.

Soft robotics is a rapidly developing field that has the potential to revolutionize how robots interact with their environment [1]. Unlike their rigid counterparts, soft robots are made from materials that can deform and adapt to their surroundings, enabling them to perform novel and unprecedented tasks in fields such as healthcare [2] and exploration [3]. However, controlling the complex dynamics of continuous soft robots is a challenging task, as an accurate modelling requires infinite degrees of freedom (DoF) [4] and nonlinear dynamics parameters that are difficult to accurately ... Many attempts have been made to control soft devices through model-based techniques, also pushed by the advancement of modelling techniques [6].


Working with Hyperbolic Random Graphs part1

#artificialintelligence

Abstract: Undirected hyperbolic graph models have been extensively used as models of scale-free small-world networks with high clustering coefficient. Here we present a simple directed hyperbolic model, where nodes randomly distributed on a hyperbolic disk are connected to a fixed number m of their nearest spatial neighbours. We also introduce a canonical version of this network (which we call "network with varied connection radius"), where the maximal length of an outgoing bond is space-dependent and is determined by fixing the average out-degree at m. We study local bond length, in-degree and reciprocity in these networks as a function of the spatial coordinates of the nodes, and show that the network has a distinct core-periphery structure. We show that for small densities of nodes the overall in-degree has a truncated power law distribution.


Working with Hyperbolic Random Graphs part2

#artificialintelligence

Abstract: We study random walks on the giant component of Hyperbolic Random Graphs (HRGs), in the regime when the degree distribution obeys a power law with exponent in the range $(2,3)$. In particular, we focus on the expected times for a random walk to hit a given vertex or visit, i.e. cover, all vertices. We show that up to multiplicative constants: the cover time is $n(\log n)^2$, the maximum hitting time is $n\log n$, and the average hitting time is $n$. The first two results hold in expectation and a.a.s. and the last in expectation (with respect to the HRG). We prove these results by determining the effective resistance either between an average vertex and the well-connected "center" of HRGs or between an appropriately chosen collection of extremal vertices.


An Online Algorithm for Chance Constrained Resource Allocation

arXiv.org Artificial Intelligence

This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem where the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future information. Moreover, in online applications, the resource consumption coefficients are often obtained by prediction. To model their uncertainties, we take chance constraints into consideration. To the best of our knowledge, this is the first time chance constraints are introduced in the online RAP. Assuming that the uncertain variables have known Gaussian distributions, the stochastic RAP can be transformed into a deterministic but nonlinear problem with integer second-order cone constraints. Next, we linearize this nonlinear problem and analyze the performance of the vanilla online primal-dual algorithm for solving the linearized stochastic RAP. Under mild technical assumptions, the optimality gap and constraint violation are both on the order of $\sqrt{n}$. Then, to further improve the performance of the algorithm, several modified online primal-dual algorithms with heuristic corrections are proposed. Finally, extensive numerical experiments on both synthetic and real data demonstrate the applicability and effectiveness of our methods.
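The Gaussian reformulation mentioned above rests on a textbook fact: if the coefficient vector $a$ is Gaussian with mean $\mu$ and covariance $\Sigma$, then $P(a^\top x \le b) \ge 1-\epsilon$ holds exactly when $\mu^\top x + \Phi^{-1}(1-\epsilon)\sqrt{x^\top \Sigma x} \le b$, which is a second-order cone constraint for $\epsilon \le 1/2$. The sketch below checks this for a single resource constraint; the symbols, the numerical values, and the function name `chance_constraint_holds` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist

def chance_constraint_holds(x, mu, sigma, b, eps):
    """Deterministic check of P(a @ x <= b) >= 1 - eps for a ~ N(mu, sigma)."""
    z = NormalDist().inv_cdf(1.0 - eps)   # standard normal quantile
    std = np.sqrt(x @ sigma @ x)          # standard deviation of a @ x
    return mu @ x + z * std <= b

rng = np.random.default_rng(0)
mu = np.array([2.0, 3.0, 1.5])            # hypothetical mean resource consumption
sigma = np.diag([0.25, 0.5, 0.1])         # hypothetical covariance matrix
x = np.array([1.0, 0.0, 1.0])             # a 0-1 decision vector
b, eps = 5.0, 0.05                        # capacity and allowed violation probability

# Monte Carlo estimate of the probability that the capacity is respected.
samples = rng.multivariate_normal(mu, sigma, size=200_000)
empirical_prob = np.mean(samples @ x <= b)
holds = chance_constraint_holds(x, mu, sigma, b, eps)
print(holds, empirical_prob)
```

The deterministic test and the sampled probability should agree: whenever the cone inequality holds, the empirical probability stays at or above $1-\epsilon$ up to sampling noise.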


Quantum Bayesian Computation

arXiv.org Artificial Intelligence

Quantum Bayesian Computation (QBC) is an emerging field that leverages the computational gains available from quantum computers to provide an exponential speed-up in Bayesian computation. Our paper adds to the literature in two ways. First, we show how von Neumann quantum measurement can be used to simulate machine learning algorithms such as Markov chain Monte Carlo (MCMC) and Deep Learning (DL) that are fundamental to Bayesian learning. Second, we describe data encoding methods needed to implement quantum machine learning, including the counterparts to traditional feature extraction and kernel embedding methods. Our goal then is to show how to apply quantum algorithms directly to statistical machine learning problems. On the theoretical side, we provide quantum versions of high dimensional regression, Gaussian processes (Q-GP) and stochastic gradient descent (Q-SGD). On the empirical side, we apply a Quantum FFT model to Chicago housing data. Finally, we conclude with directions for future research.


Regularized Newton Method with Global $O(1/k^2)$ Convergence

arXiv.org Artificial Intelligence

The history of Newton's method spans several centuries, and the method has become famous for being extremely fast and infamous for converging only from an initialization that is close to a solution. Despite the latter drawback, Newton's method is a cornerstone of convex optimization and it motivated the development of numerous popular algorithms, such as quasi-Newton and trust-region procedures. Its applications and extensions are countless, so we refer to the study in [19] that lists more than 1,000 references in total. Although widely acknowledged, the extreme behaviour of Newton's method is still startling. Why does it converge so efficiently from one initialization and hopelessly diverge from a tiny perturbation of the same initialization?
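The sensitivity to initialization described above can be reproduced on a classic one-dimensional example. The sketch below runs the plain, unregularized Newton iteration on the convex function $f(x) = \sqrt{1 + x^2}$, for which the Newton update simplifies to $x_{k+1} = -x_k^3$. The test function, the starting points, and the name `newton_iterates` are assumptions chosen for this illustration; it does not implement the regularized method of the paper.

```python
import math

def newton_iterates(x0, steps):
    """Run pure Newton steps on f(x) = sqrt(1 + x^2), which is minimized at x = 0."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = x / math.sqrt(1.0 + x * x)   # f'(x)
        hess = (1.0 + x * x) ** -1.5        # f''(x), always positive
        x = x - grad / hess                 # algebraically equal to -x**3
        xs.append(x)
    return xs

converging = newton_iterates(0.99, 8)   # starts just inside |x| < 1
diverging = newton_iterates(1.01, 8)    # starts just outside |x| < 1
print(abs(converging[-1]), abs(diverging[-1]))
```

Starting points 0.99 and 1.01 differ by only 0.02, yet the first sequence collapses to the minimizer while the second grows without bound, since $|x_k| = |x_0|^{3^k}$.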