AITopics | Stefankovic, Daniel

Collaborating Authors

Stefankovic, Daniel

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Understanding Diversity based Pruning of Neural Networks -- Statistical Mechanical Analysis

Acharyya, Rupam, Zhang, Boyu, Chattoraj, Ankani, Das, Shouman, Stefankovic, Daniel

arXiv.org Machine LearningJun-30-2020

Deep learning architectures with a huge number of parameters are often compressed using pruning techniques to ensure computational efficiency of inference during deployment. Despite multitude of empirical advances, there is no theoretical understanding of the effectiveness of different pruning methods. We address this issue by setting up the problem in the statistical mechanics formulation of a teacher-student framework and deriving generalization error (GE) bounds of specific pruning methods. This theoretical premise allows comparison between pruning methods and we use it to investigate compression of neural networks via diversity-based pruning methods. A recent work showed that Determinantal Point Process (DPP) based node pruning method is notably superior to competing approaches when tested on real datasets. Using GE bounds in the aforementioned setup we provide theoretical guarantees for their empirical observations. Another consistent finding in literature is that sparse neural networks (edge pruned) generalize better than dense neural networks (node pruned) for a fixed number of parameters. We use our theoretical setup to prove that baseline random edge pruning method performs better than DPP node pruning method. Finally, we draw motivation from our theoretical results to propose a DPP edge pruning technique for neural networks which empirically outperforms other competing pruning methods on real datasets.

deep learning, neural network, pruning, (19 more...)

arXiv.org Machine Learning

2006.16617

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

On The Projection Operator to A Three-view Cardinality Constrained Set

Yang, Haichuan, Gui, Shupeng, Ke, Chuyang, Stefankovic, Daniel, Fujimaki, Ryohei, Liu, Ji

arXiv.org Machine LearningJun-14-2017

The cardinality constraint is an intrinsic way to restrict the solution structure in many domains, for example, sparse learning, feature selection, and compressed sensing. To solve a cardinality constrained problem, the key challenge is to solve the projection onto the cardinality constraint set, which is NP-hard in general when there exist multiple overlapped cardinality constraints. In this paper, we consider the scenario where the overlapped cardinality constraints satisfy a Three-view Cardinality Structure (TVCS), which reflects the natural restriction in many applications, such as identification of gene regulatory networks and task-worker assignment problem. We cast the projection into a linear programming, and show that for TVCS, the vertex solution of this linear programming is the solution for the original projection problem. We further prove that such solution can be found with the complexity proportional to the number of variables and constraints. We finally use synthetic experiments and two interesting applications in bioinformatics and crowdsourcing to validate the proposed TVCS model and method.

constraint, health & medicine, optimization problem, (17 more...)

arXiv.org Machine Learning

1703.07345

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Rapid Mixing Swendsen-Wang Sampler for Stochastic Partitioned Attractive Models

Park, Sejun, Jang, Yunhun, Galanis, Andreas, Shin, Jinwoo, Stefankovic, Daniel, Vigoda, Eric

arXiv.org Machine LearningApr-6-2017

The Gibbs sampler is a particularly popular Markov chain used for learning and inference problems in Graphical Models (GMs). These tasks are computationally intractable in general, and the Gibbs sampler often suffers from slow mixing. In this paper, we study the Swendsen-Wang dynamics which is a more sophisticated Markov chain designed to overcome bottlenecks that impede the Gibbs sampler. We prove O(\log n) mixing time for attractive binary pairwise GMs (i.e., ferromagnetic Ising models) on stochastic partitioned graphs having n vertices, under some mild conditions, including low temperature regions where the Gibbs sampler provably mixes exponentially slow. Our experiments also confirm that the Swendsen-Wang sampler significantly outperforms the Gibbs sampler when they are used for learning parameters of attractive GMs.

artificial intelligence, machine learning, swendsen-wang chain, (16 more...)

arXiv.org Machine Learning

1704.02232

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Slice Normalized Dynamic Markov Logic Networks

Papai, Tivadar, Kautz, Henry, Stefankovic, Daniel

Neural Information Processing SystemsDec-31-2012

Markov logic is a widely used tool in statistical relational learning, which uses a weighted first-order logic knowledge base to specify a Markov random field (MRF) or a conditional random field (CRF). In many applications, a Markov logic network (MLN) is trained in one domain, but used in a different one. This paper focuses on dynamic Markov logic networks, where the domain of time points typically varies between training and testing. It has been previously pointed out that the marginal probabilities of truth assignments to ground atoms can change if one extends or reduces the domains of predicates in an MLN. We show that in addition to this problem, the standard way of unrolling a Markov logic theory into a MRF may result in time-inhomogeneity of the underlying Markov chain. Furthermore, even if these representational problems are not significant for a given domain, we show that the more practical problem of generating samples in a sequential conditional random field for the next slice relying on the samples from the previous slice has high computational cost in the general case, due to the need to estimate a normalization factor for each sample. We propose a new discriminative model, slice normalized dynamic Markov logic networks (SN-DMLN), that suffers from none of these issues. It supports efficient online inference, and can directly model influences between variables within a time slice that do not have a causal direction, in contrast with fully directed models (e.g., DBNs). Experimental results show an improvement in accuracy over previous approaches to online inference in dynamic Markov logic networks.

artificial intelligence, formula, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Subset Selection for Gaussian Markov Random Fields

Mahalanabis, Satyaki, Stefankovic, Daniel

arXiv.org Machine LearningSep-26-2012

Given a Gaussian Markov random field, we consider the problem of selecting a subset of variables to observe which minimizes the total expected squared prediction error of the unobserved variables. We first show that finding an exact solution is NP-hard even for a restricted class of Gaussian Markov random fields, called Gaussian free fields, which arise in semi-supervised learning and computer vision. We then give a simple greedy approximation algorithm for Gaussian free fields on arbitrary graphs. Finally, we give a message passing algorithm for general Gaussian Markov random fields on bounded tree-width graphs.

artificial intelligence, machine learning, matrix, (16 more...)

arXiv.org Machine Learning

1209.5991

Country:

North America > United States > New York (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback