AITopics | Jalali, Ali


Jalali, Ali


A Dirty Model for Multi-task Learning

Jalali, Ali, Sanghavi, Sujay, Ruan, Chao, Ravikumar, Pradeep K.

Neural Information Processing Systems, Feb-15-2020, 01:29:26 GMT

We consider the multiple linear regression problem, in a setting where some of the relevant features could be shared across the tasks. However, prior work cautions that the performance of block-regularized methods is very dependent on the extent to which the features are shared across tasks. We are far away from a realistic multi-task setting: not only does the set of relevant features have to be exactly the same across tasks, but their values have to be as well. Here, we ask the question: can we leverage support and parameter overlap when it exists, but not pay a penalty when it does not? Indeed, this falls under a more general question of whether we can model such "dirty data", which may not fall into a single neat structural bracket (all block-sparse, or all low-rank, and so on).

artificial intelligence, dirty model, machine learning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Clustering Partially Observed Graphs via Convex Optimization

Chen, Yudong, Jalali, Ali, Sanghavi, Sujay, Xu, Huan

arXiv.org Machine Learning, Jul-23-2014

This paper considers the problem of clustering a partially observed unweighted graph, i.e., one where for some node pairs we know there is an edge between them, for some others we know there is no edge, and for the remaining we do not know whether or not there is an edge. We want to organize the nodes into disjoint clusters so that there is relatively dense (observed) connectivity within clusters, and sparse across clusters. We take a novel yet natural approach to this problem, by focusing on finding the clustering that minimizes the number of "disagreements", i.e., the sum of the number of (observed) missing edges within clusters, and (observed) present edges across clusters. Our algorithm uses convex optimization; its basis is a reduction of disagreement minimization to the problem of recovering an (unknown) low-rank matrix and an (unknown) sparse matrix from their partially observed sum. We evaluate the performance of our algorithm on the classical Planted Partition/Stochastic Block Model. Our main theorem provides sufficient conditions for the success of our algorithm as a function of the minimum cluster size, edge density and observation probability; in particular, the results characterize the tradeoff between the observation probability and the edge density gap. When there is a constant number of clusters of equal size, our results are optimal up to logarithmic factors.
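The "disagreements" objective described in this abstract can be illustrated with a small sketch. This is not the paper's convex-optimization algorithm; it only evaluates the quantity being minimized for a given clustering. The function name `count_disagreements` and the dictionary encoding of observed pairs are assumptions made for illustration.

```python
def count_disagreements(observed, labels):
    """Count disagreements between a clustering and a partially observed graph.

    observed: dict mapping a node pair (u, v) to True (edge observed present)
              or False (edge observed absent). Unobserved pairs are simply
              missing from the dict and never contribute.
    labels:   dict mapping each node to its cluster id.
    """
    total = 0
    for (u, v), has_edge in observed.items():
        same_cluster = labels[u] == labels[v]
        if same_cluster and not has_edge:
            total += 1  # observed missing edge inside a cluster
        elif not same_cluster and has_edge:
            total += 1  # observed present edge across clusters
    return total


# Hypothetical toy graph: the pairs (1, 3) is unobserved and therefore absent.
observed_pairs = {
    (0, 1): True,
    (0, 2): False,
    (1, 2): True,
    (2, 3): True,
    (0, 3): False,
}
two_clusters = {0: 0, 1: 0, 2: 1, 3: 1}
one_cluster = {0: 0, 1: 0, 2: 0, 3: 0}
print(count_disagreements(observed_pairs, two_clusters))
print(count_disagreements(observed_pairs, one_cluster))
```

Only observed pairs are scored, which is why unobserved pairs never penalize either clustering.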

artificial intelligence, disagreement, optimization problem, (19 more...)

arXiv.org Machine Learning

1104.4803

Country:

Asia (0.28)
North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


A Lipschitz Exploration-Exploitation Scheme for Bayesian Optimization

Jalali, Ali, Azimi, Javad, Fern, Xiaoli, Zhang, Ruofei

arXiv.org Machine Learning, Jul-16-2013

The problem of optimizing unknown costly-to-evaluate functions has been studied for a long time in the context of Bayesian Optimization. Algorithms in this field aim to find the optimizer of the function by asking only a few function evaluations at locations carefully selected based on a posterior model. In this paper, we assume the unknown function is Lipschitz continuous. Leveraging the Lipschitz property, we propose an algorithm with a distinct exploration phase followed by an exploitation phase. The exploration phase aims to select samples that shrink the search space as much as possible. The exploitation phase then focuses on the reduced search space and selects samples closest to the optimizer. Considering the Expected Improvement (EI) as a baseline, we empirically show that the proposed algorithm significantly outperforms EI.
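To make the role of the Lipschitz assumption concrete, the sketch below shows one generic way a known Lipschitz constant can shrink a search space. It is an illustration of the general idea only, not the algorithm proposed in the paper. It assumes a one-dimensional maximization problem; the function names, the test function, and the constant `1.5` are all hypothetical.

```python
def lipschitz_upper_bound(x, samples, lipschitz_const):
    """Tightest upper bound on f(x) implied by the samples.

    For an L-Lipschitz f, f(x) <= f(xi) + L * |x - xi| for every sample xi.
    """
    return min(fx + lipschitz_const * abs(x - xi) for xi, fx in samples)


def prune_candidates(candidates, samples, lipschitz_const):
    """Keep only candidates that could still beat the best observed value."""
    best_seen = max(fx for _, fx in samples)
    return [
        x for x in candidates
        if lipschitz_upper_bound(x, samples, lipschitz_const) >= best_seen
    ]


def toy_function(x):
    # Hypothetical objective with maximizer at 0.7; |f'(x)| <= 1.4 on [0, 1].
    return -(x - 0.7) ** 2


samples = [(x, toy_function(x)) for x in (0.0, 0.5, 1.0)]
grid = [i / 10 for i in range(11)]
kept = prune_candidates(grid, samples, lipschitz_const=1.5)
print(kept)
```

Points whose upper bound falls below the best value already observed cannot contain the maximizer, so they are dropped from further consideration.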

algorithm, optimization problem, upstream oil & gas, (18 more...)

arXiv.org Machine Learning

1204.0047

Country:

North America > United States (0.68)
North America > Canada > Alberta (0.28)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.55)


A New Greedy Algorithm for Multiple Sparse Regression

Jalali, Ali, Sanghavi, Sujay

arXiv.org Machine Learning, Jun-7-2012

This paper proposes a new algorithm for multiple sparse regression in high dimensions, where the task is to estimate the support and values of several (typically related) sparse vectors from a few noisy linear measurements. Our algorithm is a "forward-backward" greedy procedure that, uniquely, operates on two distinct classes of objects. In particular, we organize our target sparse vectors as a matrix; our algorithm involves iterative addition and removal of both (a) individual elements, and (b) entire rows (corresponding to shared features), of the matrix. Analytically, we establish that our algorithm manages to recover the supports (exactly) and values (approximately) of the sparse vectors, under assumptions similar to existing approaches based on convex optimization. However, our algorithm has a much smaller computational complexity. Perhaps most interestingly, it is seen empirically to require visibly fewer samples. Ours represents the first attempt to extend greedy algorithms to the class of models that can only/best be represented by a combination of component structural assumptions (sparse and group-sparse, in our case).

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1206.1402

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Hybrid Batch Bayesian Optimization

Azimi, Javad, Jalali, Ali, Fern, Xiaoli

arXiv.org Artificial Intelligence, Apr-30-2012

Bayesian Optimization (BO) aims at optimizing an unknown non-convex/concave function that is costly to evaluate. We are interested in application scenarios where concurrent function evaluations are possible. Under such a setting, BO could choose to either sequentially evaluate the function, one input at a time, and wait for the output of the function before making the next selection, or evaluate the function at a batch of multiple inputs at once. These two different settings are commonly referred to as the sequential and batch settings of Bayesian Optimization. In general, the sequential setting leads to better optimization performance as each function evaluation is selected with more information, whereas the batch setting has an advantage in terms of the total experimental time (the number of iterations). In this work, our goal is to combine the strengths of both settings. Specifically, we systematically analyze Bayesian optimization using a Gaussian process as the posterior estimator and provide a hybrid algorithm that, based on the current state, dynamically switches between a sequential policy and a batch policy with variable batch sizes. We provide theoretical justification for our algorithm and present experimental results on eight benchmark BO problems. The results show that our method achieves substantial speedup (up to 78%) compared to a pure sequential policy, without suffering any significant performance loss.

artificial intelligence, experiment, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

1202.5597

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Energy (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)


Clustering using Max-norm Constrained Optimization

Jalali, Ali, Srebro, Nathan

arXiv.org Machine Learning, Apr-13-2012

We suggest using the max-norm as a convex surrogate constraint for clustering. We show how this yields a better exact cluster recovery guarantee than the previously suggested nuclear-norm relaxation, and study the effectiveness of our method, and other related convex relaxations, compared to other clustering approaches.

artificial intelligence, optimization problem, relaxation, (17 more...)

arXiv.org Machine Learning

1202.5598

Country:

North America > United States > Texas (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)


On Learning Discrete Graphical Models using Greedy Methods

Jalali, Ali, Johnson, Christopher C., Ravikumar, Pradeep K.

Neural Information Processing Systems, Dec-31-2011

In this paper, we address the problem of learning the structure of a pairwise graphical model from samples in a high-dimensional setting. Our first main result studies the sparsistency, or consistency in sparsity pattern recovery, properties of a forward-backward greedy algorithm as applied to general statistical models. As a special case, we then apply this algorithm to learn the structure of a discrete graphical model via neighborhood estimation. As a corollary of our general result, we derive sufficient conditions on the number of samples n, the maximum node-degree d and the problem size p, as well as other conditions on the model parameters, so that the algorithm recovers all the edges with high probability. Our result guarantees graph selection for samples scaling as n = Omega(d log(p)), in contrast to existing convex-optimization based algorithms that require a sample complexity of Omega(d^2 log(p)). Further, the greedy algorithm only requires a restricted strong convexity condition which is typically milder than irrepresentability assumptions. We corroborate these results using numerical simulations at the end.
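A forward-backward greedy procedure of the kind analyzed in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a squared-error loss on a single regression problem rather than a graphical-model likelihood, and the names `fit_on_support`, `squared_loss`, `forward_backward_greedy`, the stopping threshold `eps`, the backward factor `nu`, and the small `ridge` safeguard are all hypothetical choices.

```python
def fit_on_support(X, y, support, ridge=1e-9):
    """Least-squares coefficients using only the columns in `support`."""
    cols = sorted(support)
    k = len(cols)
    if k == 0:
        return {}
    # Normal equations; the tiny ridge term is only a numerical safeguard.
    gram = [[sum(r[a] * r[c] for r in X) + (ridge if a == c else 0.0)
             for c in cols] for a in cols]
    rhs = [sum(r[a] * yi for r, yi in zip(X, y)) for a in cols]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(gram[r][i]))
        gram[i], gram[piv] = gram[piv], gram[i]
        rhs[i], rhs[piv] = rhs[piv], rhs[i]
        for r in range(i + 1, k):
            factor = gram[r][i] / gram[i][i]
            for c in range(i, k):
                gram[r][c] -= factor * gram[i][c]
            rhs[r] -= factor * rhs[i]
    w = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(gram[i][c] * w[c] for c in range(i + 1, k))
        w[i] = (rhs[i] - tail) / gram[i][i]
    return dict(zip(cols, w))


def squared_loss(X, y, coef):
    total = 0.0
    for row, yi in zip(X, y):
        pred = sum(w * row[j] for j, w in coef.items())
        total += (yi - pred) ** 2
    return total / (2 * len(y))


def forward_backward_greedy(X, y, eps=1e-3, nu=0.5):
    p = len(X[0])
    support, coef = set(), {}
    current = squared_loss(X, y, coef)
    while True:
        # Forward step: add the feature that lowers the loss the most.
        trials = []
        for j in range(p):
            if j not in support:
                c = fit_on_support(X, y, support | {j})
                trials.append((squared_loss(X, y, c), j, c))
        if not trials:
            break
        new_loss, j, c = min(trials, key=lambda t: t[0])
        gain = current - new_loss
        if gain < eps:
            break  # no feature helps enough; stop
        support.add(j)
        coef, current = c, new_loss
        # Backward steps: drop features whose removal costs little,
        # spending at most a fraction nu of the forward gain in total.
        budget = nu * gain
        while len(support) > 1:
            trials = []
            for j in support:
                c = fit_on_support(X, y, support - {j})
                trials.append((squared_loss(X, y, c), j, c))
            new_loss, j, c = min(trials, key=lambda t: t[0])
            increase = new_loss - current
            if increase > budget:
                break
            support.remove(j)
            coef, current = c, new_loss
            budget -= increase
    return support, coef


# Hypothetical toy problem: orthogonal +/-1 design, only features 0 and 3 matter.
masks = [1, 2, 4, 3, 5, 6]
X = [[(-1) ** bin(i & m).count("1") for m in masks] for i in range(8)]
y = [2 * row[0] - 3 * row[3] for row in X]
print(forward_backward_greedy(X, y))
```

Capping the total backward loss increase at a fraction of the forward gain means every outer iteration strictly reduces the loss, so the loop terminates.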

artificial intelligence, greedy algorithm, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)


High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods

Johnson, Christopher C., Jalali, Ali, Ravikumar, Pradeep

arXiv.org Machine Learning, Dec-29-2011

In this paper we consider the task of estimating the non-zero pattern of the sparse inverse covariance matrix of a zero-mean Gaussian random vector from a set of iid samples. Note that this is also equivalent to recovering the underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We present two novel greedy approaches to solving this problem. The first estimates the non-zero covariates of the overall inverse covariance matrix using a series of global forward and backward greedy steps. The second estimates the neighborhood of each node in the graph separately, again using greedy forward and backward steps, and combines the intermediate neighborhoods to form an overall estimate. The principal contribution of this paper is a rigorous analysis of the sparsistency, or consistency in recovering the sparsity pattern of the inverse covariance matrix. Surprisingly, we show that both the local and global greedy methods learn the full structure of the model with high probability given just $O(d\log(p))$ samples, which is a significant improvement over the state-of-the-art $\ell_1$-regularized Gaussian MLE (Graphical Lasso), which requires $O(d^2\log(p))$ samples. Moreover, the restricted eigenvalue and smoothness conditions imposed by our greedy methods are much weaker than the strong irrepresentable conditions required by the $\ell_1$-regularization based methods. We corroborate our results with extensive simulations and examples, comparing our local and global greedy methods to the $\ell_1$-regularized Gaussian MLE, and comparing the Neighborhood Greedy method to nodewise $\ell_1$-regularized linear regression (Neighborhood Lasso).
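The gap between the two sample-complexity rates quoted in this abstract can be made tangible with a small calculation. The sketch below compares only the growth terms $d\log(p)$ and $d^2\log(p)$; the constants hidden by the $O(\cdot)$ notation are unknown here, so the numbers are relative scalings, not actual sample counts. The function names and the example values of `d` and `p` are assumptions for illustration.

```python
import math


def greedy_scaling(d, p):
    """Growth term d * log(p) quoted for the greedy methods."""
    return d * math.log(p)


def graphical_lasso_scaling(d, p):
    """Growth term d^2 * log(p) quoted for the l1-regularized Gaussian MLE."""
    return d ** 2 * math.log(p)


# Hypothetical problem sizes: maximum node degree d, number of variables p.
for d, p in [(5, 100), (10, 1000), (20, 10000)]:
    g = greedy_scaling(d, p)
    l = graphical_lasso_scaling(d, p)
    print(f"d={d:>2} p={p:>5}  d*log(p)={g:8.1f}  d^2*log(p)={l:9.1f}  ratio={l / g:.0f}")
```

Ignoring constants, the ratio between the two terms is exactly the maximum node degree `d`, so the advantage grows with graph density rather than with the number of variables.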

artificial intelligence, greedy algorithm, machine learning, (12 more...)

arXiv.org Machine Learning

1112.6411

Country: Europe (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Low-rank Matrix Recovery from Errors and Erasures

Chen, Yudong, Jalali, Ali, Sanghavi, Sujay, Caramanis, Constantine

arXiv.org Machine Learning, Aug-25-2011

This paper considers the recovery of a low-rank matrix from an observed version that simultaneously contains both (a) erasures: most entries are not observed, and (b) errors: values at a constant fraction of (unknown) locations are arbitrarily corrupted. We provide a new unified performance guarantee on when the natural convex relaxation of minimizing rank plus support succeeds in exact recovery. Our result allows for the simultaneous presence of random and deterministic components in both the error and erasure patterns. On the one hand, corollaries obtained by specializing this one single result in different ways recover (up to poly-log factors) all the existing works in matrix completion, and sparse and low-rank matrix recovery. On the other hand, our results also provide the first guarantees for (a) recovery when we observe a vanishing fraction of entries of a corrupted matrix, and (b) deterministic matrix completion.
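For readers unfamiliar with the "natural convex relaxation of minimizing rank plus support" mentioned in this abstract, its standard form replaces rank with the nuclear norm and support size with the entrywise $\ell_1$ norm. The block below is a sketch of that standard form; the symbols are assumptions for illustration ($M$ is the corrupted matrix, $\Omega$ the set of observed entries, $\mathcal{P}_{\Omega}$ the projection onto those entries, and $\lambda > 0$ a tradeoff weight), and the paper may use a different weighting or notation.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad
\mathcal{P}_{\Omega}(L + S) = \mathcal{P}_{\Omega}(M)
```

Here $L$ plays the role of the low-rank matrix to be recovered and $S$ the sparse matrix of errors; the constraint is enforced only on observed entries, which is how erasures enter the problem.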

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

1104.0354

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence (0.93)


On Learning Discrete Graphical Models Using Greedy Methods

Jalali, Ali, Johnson, Chris, Ravikumar, Pradeep

arXiv.org Machine Learning, Jul-16-2011

In this paper, we address the problem of learning the structure of a pairwise graphical model from samples in a high-dimensional setting. Our first main result studies the sparsistency, or consistency in sparsity pattern recovery, properties of a forward-backward greedy algorithm as applied to general statistical models. As a special case, we then apply this algorithm to learn the structure of a discrete graphical model via neighborhood estimation. As a corollary of our general result, we derive sufficient conditions on the number of samples n, the maximum node-degree d and the problem size p, as well as other conditions on the model parameters, so that the algorithm recovers all the edges with high probability. Our result guarantees graph selection for samples scaling as n = Omega(d^2 log(p)), in contrast to existing convex-optimization based algorithms that require a sample complexity of Omega(d^3 log(p)). Further, the greedy algorithm only requires a restricted strong convexity condition which is typically milder than irrepresentability assumptions. We corroborate these results using numerical simulations at the end.

artificial intelligence, greedy algorithm, machine learning, (17 more...)

arXiv.org Machine Learning

1107.3258

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
