Goto

Collaborating Authors

 Mathematical & Statistical Methods


Sub-sampled Newton Methods with Non-uniform Sampling

Neural Information Processing Systems

We consider the problem of finding the minimizer of a convex function $F: \mathbb R^d \rightarrow \mathbb R$ of the form $F(w) \defeq \sum_{i=1}^n f_i(w) + R(w)$ where a low-rank factorization of $\nabla^2 f_i(w)$ is readily available.We consider the regime where $n \gg d$. We propose randomized Newton-type algorithms that exploit \textit{non-uniform} sub-sampling of $\{\nabla^2 f_i(w)\}_{i=1}^{n}$, as well as inexact updates, as means to reduce the computational complexity, and are applicable to a wide range of problems in machine learning. Two non-uniform sampling distributions based on {\it block norm squares} and {\it block partial leverage scores} are considered. Under certain assumptions, we show that our algorithms inherit a linear-quadratic convergence rate in $w$ and achieve a lower computational complexity compared to similar existing methods. In addition, we show that our algorithms exhibit more robustness and better dependence on problem specific quantities, such as the condition number. We numerically demonstrate the advantages of our algorithms on several real datasets.


Learning from Rational Behavior: Predicting Solutions to Unknown Linear Programs

Neural Information Processing Systems

We define and study the problem of predicting the solution to a linear program (LP) given only partial information about its objective and constraints. This generalizes the problem of learning to predict the purchasing behavior of a rational agent who has an unknown objective function, that has been studied under the name "Learning from Revealed Preferences". We give mistake bound learning algorithms in two settings: in the first, the objective of the LP is known to the learner but there is an arbitrary, fixed set of constraints which are unknown. Each example is defined by an additional known constraint and the goal of the learner is to predict the optimal solution of the LP given the union of the known and unknown constraints. This models the problem of predicting the behavior of a rational agent whose goals are known, but whose resources are unknown. In the second setting, the objective of the LP is unknown, and changing in a controlled way. The constraints of the LP may also change every day, but are known. An example is given by a set of constraints and partial information about the objective, and the task of the learner is again to predict the optimal solution of the partially known LP.


Enhancing Observability in Distribution Grids using Smart Meter Data

arXiv.org Machine Learning

Abstract--Due to limited metering infrastructure, distribution grids are currently challenged by observability issues. On the other hand, smart meter data, including local voltage magnitudes and power injections, are communicated to the utility operator from grid buses with renewable generation and demand-response programs. This work employs grid data from metered buses towards inferring the underlying grid state. T o this end, a coupled formulation of the power flow problem (CPF) is put forth. Exploiting the high variability of injections at metered buses, the controllability of solar inverters, and the relative time-invariance of conventional loads, the idea is to solve the nonlinear power flow equations jointly over consecutive time instants. An intuitive and easily verifiable rule pertaining to the locations of metered and non-metered buses on the physical grid is shown to be a necessary and sufficient criterion for local observability in radial networks. T o account for noisy smart meter readings, a coupled power system state estimation (CPSSE) problem is further developed. Both CPF and CPSSE tasks are tackled via augmented semi-definite program relaxations. The observability criterion along with the CPF and CPSSE solvers are numerically corroborated using synthetic and actual solar generation and load data on the IEEE 34-bus benchmark feeder . Power flow (PF) and power system state estimation (PSSE) are central to planning, monitoring and control of electricity networks.


Introduction to Number Theory: Fascinating Facts and Conjectures about Primes and Other Special Numbers

@machinelearnbot

I discuss here off-the-beaten-path beautiful, even spectacular results from number theory: not just about prime numbers, but also about related problems such as integers that are sum of two squares. The connection between these numbers and prime numbers will appear later in this article. A few important unsolved mathematical conjectures are presented in a unified approach, and some new research material is also introduced, especially an attempt at generalizing and unifying concepts related to data set density and limiting distributions. The approach is very applied, focusing on algorithms, simulations, and big data, to help discover fascinating results. Even though some of the most exciting topics of mathematics are discussed here (including fundamental, century-old problems still unresolved as well as brand new hypotheses), most of the article can be understood by the layman. Among other things, you will learn some new ways to estimate Pi based on non-traditional experiments, or how a conjecture for prime numbers somehow generalizes to apply to Fibonacci numbers as well.


Safety Verification and Control for Collision Avoidance at Road Intersections

arXiv.org Artificial Intelligence

This paper presents the design of a supervisory algorithm that monitors safety at road intersections and overrides drivers with a safe input when necessary. The design of the supervisor consists of two parts: safety verification and control design. Safety verification is the problem to determine if vehicles will be able to cross the intersection without colliding with current drivers' inputs. We translate this safety verification problem into a jobshop scheduling problem, which minimizes the maximum lateness and evaluates if the optimal cost is zero. The zero optimal cost corresponds to the case in which all vehicles can cross each conflict area without collisions. Computing the optimal cost requires solving a Mixed Integer Nonlinear Programming (MINLP) problem due to the nonlinear second-order dynamics of the vehicles. We therefore estimate this optimal cost by formulating two related Mixed Integer Linear Programming (MILP) problems that assume simpler vehicle dynamics. We prove that these two MILP problems yield lower and upper bounds of the optimal cost. We also quantify the worst case approximation errors of these MILP problems. We design the supervisor to override the vehicles with a safe control input if the MILP problem that computes the upper bound yields a positive optimal cost. We theoretically demonstrate that the supervisor keeps the intersection safe and is non-blocking. Computer simulations further validate that the algorithms can run in real time for problems of realistic size.


Facebook's advice to students interested in artificial intelligence

#artificialintelligence

That's the gist of the advice to students interested in AI from Facebook's Yann LeCun and Joaquin Quiñonero Candela who run the company's Artificial Intelligence Lab and Applied Machine Learning group respectively. Tech companies often advocate STEM (science, technology, engineering and math), but today's tips are particularly pointed. The pair specifically note that students should eat their vegetables take Calc I, Calc II, Calc III, Linear Algebra, Probability and Statistics as early as possible. From this list, probability and statistics are perhaps the most interesting. From what I remember about high-school, those two subjects are regularly dismissed as too-obvious strategies for skirting the informal AP Calculus preference of top colleges and universities (AP Statistics is often thought of as a cop-out by students).


Facebook's advice to students interested in artificial intelligence

#artificialintelligence

That's the gist of the advice to students interested in AI from Facebook's Yann LeCun and Joaquin Quiñonero Candela who run the company's Artificial Intelligence Lab and Applied Machine Learning group respectively. Tech companies often advocate STEM (science, technology, engineering and math), but today's tips are particularly pointed. The pair specifically note that students should eat their vegetables take Calc I, Calc II, Calc III, Linear Algebra, Probability and Statistics as early as possible. From this list, probability and statistics are perhaps the most interesting. From what I remember about high-school, those two subjects are regularly dismissed as too-obvious strategies for skirting the informal AP Calculus preference of top colleges and universities (AP Statistics is often thought of as a cop-out by students).


16 analytic disciplines compared to data science

#artificialintelligence

What are the differences between data science, data mining, machine learning, statistics, operations research, and so on? Here I compare several analytic disciplines that overlap, to explain the differences and common denominators. Sometimes differences exist for nothing else other than historical reasons. Sometimes the differences are real and subtle. I also provide typical job titles, types of analyses, and industries traditionally attached to each discipline. Underlined domains are main sub-domains. It would be great if someone can add an historical perspective to my article. First, let's start by describing data science, the new discipline. Job titles include data scientist, chief scientist, senior analyst, director of analytics and many more.


Think Stats: Probability and Statistics for Programmers

#artificialintelligence

Think Stats emphasizes simple techniques you can use to explore real data sets and answer interesting questions. The book presents a case study using data from the National Institutes of Health. Readers are encouraged to work on a project with real datasets. If you have basic skills in Python, you can use them to learn concepts in probability and statistics. Think Stats is based on a Python library for probability distributions (PMFs and CDFs).


Chi-squared Amplification: Identifying Hidden Hubs

arXiv.org Machine Learning

We consider the following general hidden hubs model: an $n \times n$ random matrix $A$ with a subset $S$ of $k$ special rows (hubs): entries in rows outside $S$ are generated from the probability distribution $p_0 \sim N(0,\sigma_0^2)$; for each row in $S$, some $k$ of its entries are generated from $p_1 \sim N(0,\sigma_1^2)$, $\sigma_1>\sigma_0$, and the rest of the entries from $p_0$. The problem is to identify the high-degree hubs efficiently. This model includes and significantly generalizes the planted Gaussian Submatrix Model, where the special entries are all in a $k \times k$ submatrix. There are two well-known barriers: if $k\geq c\sqrt{n\ln n}$, just the row sums are sufficient to find $S$ in the general model. For the submatrix problem, this can be improved by a $\sqrt{\ln n}$ factor to $k \ge c\sqrt{n}$ by spectral methods or combinatorial methods. In the variant with $p_0=\pm 1$ (with probability $1/2$ each) and $p_1\equiv 1$, neither barrier has been broken. We give a polynomial-time algorithm to identify all the hidden hubs with high probability for $k \ge n^{0.5-\delta}$ for some $\delta >0$, when $\sigma_1^2>2\sigma_0^2$. The algorithm extends to the setting where planted entries might have different variances each at least as large as $\sigma_1^2$. We also show a nearly matching lower bound: for $\sigma_1^2 \le 2\sigma_0^2$, there is no polynomial-time Statistical Query algorithm for distinguishing between a matrix whose entries are all from $N(0,\sigma_0^2)$ and a matrix with $k=n^{0.5-\delta}$ hidden hubs for any $\delta >0$. The lower bound as well as the algorithm are related to whether the chi-squared distance of the two distributions diverges. At the critical value $\sigma_1^2=2\sigma_0^2$, we show that the general hidden hubs problem can be solved for $k\geq c\sqrt n(\ln n)^{1/4}$, improving on the naive row sum-based method.