If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Many people are keenly interested in machine learning, and with good reason. Machine learning is applicable to a wide variety domains, including engineering, education, healthcare, and government. The broad applicability of machine learning is a double edged sword: Although an ever increasing pool of people want to use machine learning, a decreasing portion of them understand the mathematics and computer science that are it's foundation. Just because people don't natively speak the language of machine learning doesn't mean that they're not going to try to apply machine learning to their problems. We should provide tools to the fledgling machine learning practitioner that allow them to understand what they're asking the system to do.
The recent deep learning hype aims to reach the Artificial General Intelligence (AGI): an AI that would express (supra-)human-like intelligence. Unfortunately current deep learning models are flawed in many ways: one of them is that they are unable to learn continuously as human does through years of schooling, and so on. Regardless of the far away goal of AGI, there are several practicals reasons why we want our model to learn continuously. A real applications of these two constraints is robotics: a robot in the wild should learn continuously its environment. Furthermore due to hardware limitation, it may neither store all previous data nor spend too much computational resource.
Graph matching and MAP inference are essential problems in computer vision and machine learning. We introduce a novel algorithm that can accommodate both problems and solve them efficiently. Recent graph matching algorithms are based on a general quadratic programming formulation, that takes in consideration both unary and second-order terms reflecting the similarities in local appearance as well as in the pairwise geometric relationships between the matched features. In this case the problem is NP-hard and a lot of effort has been spent in finding efficiently approximate solutions by relaxing the constraints of the original problem. Most algorithms find optimal continuous solutions of the modified problem, ignoring during the optimization the original discrete constraints.
Reliably recovering 3D human pose from monocular video requires constraints that bias the estimates towards typical human poses and motions. We define priors for people tracking using a Laplacian Eigenmaps Latent Variable Model (LELVM). LELVM is a probabilistic dimensionality reduction model that naturally combines the advantages of latent variable models---definining a multimodal probability density for latent and observed variables, and globally differentiable nonlinear mappings for reconstruction and dimensionality reduction---with those of spectral manifold learning methods---no local optima, ability to unfold highly nonlinear manifolds, and good practical scaling to latent spaces of high dimension. LELVM is computationally efficient, simple to learn from sparse training data, and compatible with standard probabilistic trackers such as particle filters. We analyze the performance of a LELVM-based probabilistic sigma point mixture tracker in several real and synthetic human motion sequences and demonstrate that LELVM provides sufficient constraints for robust operation in the presence of missing, noisy and ambiguous image measurements.
In this paper, we derive a method to refine a Bayes network diagnostic model by exploiting constraints implied by expert decisions on test ordering. At each step, the expert executes an evidence gathering test, which suggests the test's relative diagnostic value. We demonstrate that consistency with an expert's test selection leads to non-convex constraints on the model parameters. We incorporate these constraints by augmenting the network with nodes that represent the constraint likelihoods. Gibbs sampling, stochastic hill climbing and greedy search algorithms are proposed to find a MAP estimate that takes into account test ordering constraints and any data available.
Estimating 3D pose from monocular images is a highly ambiguous problem. Physical constraints can be exploited to restrict the space of feasible configurations. In this paper we propose an approach to constraining the prediction of a discriminative predictor. We first show that the mean prediction of a Gaussian process implicitly satisfies linear constraints if those constraints are satisfied by the training examples. We then show how, by performing a change of variables, a GP can be forced to satisfy quadratic constraints.
Kernel learning is a powerful framework for nonlinear data modeling. Using the kernel trick, a number of problems have been formulated as semidefinite programs (SDPs). These include Maximum Variance Unfolding (MVU) (Weinberger et al., 2004) in nonlinear dimensionality reduction, and Pairwise Constraint Propagation (PCP) (Li et al., 2008) in constrained clustering. Although in theory SDPs can be efficiently solved, the high computational complexity incurred in numerically processing the huge linear matrix inequality constraints has rendered the SDP approach unscalable. In this paper, we show that a large class of kernel learning problems can be reformulated as semidefinite-quadratic-linear programs (SQLPs), which only contain a simple positive semidefinite constraint, a second-order cone constraint and a number of linear constraints.
We propose a new class of consistency constraints for Linear Programming (LP) relaxations for finding the most probable (MAP) configuration in graphical models. Usual cluster-based LP relaxations enforce joint consistency of the beliefs of a cluster of variables, with computational cost increasing exponentially with the size of the clusters. By partitioning the state space of a cluster and enforcing consistency only across partitions, we obtain a class of constraints which, although less tight, are computationally feasible for large clusters. We show how to solve the cluster selection and partitioning problem monotonically in the dual LP, using the current beliefs to guide these choices. We obtain a dual message-passing algorithm and apply it to protein design problems where the variables have large state spaces and the usual cluster-based relaxations are very costly.
We study the problem of segmenting specific white matter structures of interest from Diffusion Tensor (DT-MR) images of the human brain. This is an important requirement in many Neuroimaging studies: for instance, to evaluate whether a brain structure exhibits group level differences as a function of disease in a set of images. Typically, interactive expert guided segmentation has been the method of choice for such applications, but this is tedious for large datasets common today. To address this problem, we endow an image segmentation algorithm with'advice' encoding some global characteristics of the region(s) we want to extract. This is accomplished by constructing (using expert-segmented images) an epitome of a specific region - as a histogram over a bag of'words' (e.g.,suitable feature descriptors).
The learning of appropriate distance metrics is a critical problem in classification. In this work, we propose a boosting-based technique, termed BoostMetric, for learning a Mahalanobis distance metric. One of the primary difficulties in learning such a metric is to ensure that the Mahalanobis matrix remains positive semidefinite. Semidefinite programming is sometimes used to enforce this constraint, but does not scale well. BoostMetric is instead based on a key observation that any positive semidefinite matrix can be decomposed into a linear positive combination of trace-one rank-one matrices.