AITopics

Shalev-Shwartz, Shai, Zhang, Tong

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

arXiv.org Machine LearningJan-30-2013

Stochastic Gradient Descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to their strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked good convergence analysis. This paper presents a new analysis of Stochastic Dual Coordinate Ascent (SDCA) showing that this class of methods enjoy strong theoretical guarantees that are comparable or better than SGD. This analysis justifies the effectiveness of SDCA for practical applications.

artificial intelligence, duality gap, machine learning, (13 more...)

arXiv.org Machine Learning

1209.1873

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Flexible and Approximate Computation through State-Space Reduction

Zhang, Weixiong

In the real world, insufficient information, limited computation resources, and complex problem structures often force an autonomous agent to make a decision in time less than that required to solve the problem at hand completely. Flexible and approximate computations are two approaches to decision making under limited computation resources. Flexible computation helps an agent to flexibly allocate limited computation resources so that the overall system utility is maximized. Approximate computation enables an agent to find the best satisfactory solution within a deadline. In this paper, we present two state-space reduction methods for flexible and approximate computation: quantitative reduction to deal with inaccurate heuristic information, and structural reduction to handle complex problem structures. These two methods can be applied successively to continuously improve solution quality if more computation is available. Our results show that these reduction methods are effective and efficient, finding better solutions with less computation than some existing well-known methods.

artificial intelligence, optimization problem, reduction, (18 more...)

1301.7418

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.69)

Zhang, Nevin Lianwen, Lee, Stephen S.

Planning with Partially Observable Markov Decision Processes: Advances in Exact Solution Method

There is much interest in using partially observable Markov decision processes (POMDPs) as a formal model for planning in stochastic domains. This paper is concerned with finding optimal policies for POMDPs. We propose several improvements to incremental pruning, presently the most efficient exact algorithm for solving POMDPs.

artificial intelligence, machine learning, vector, (16 more...)

1301.7417

Country: North America > United States > Rhode Island (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Probabilistic Inference in Influence Diagrams

Zhang, Nevin Lianwen

This paper is about reducing influence diagram (ID) evaluation into Bayesian network (BN) inference problems. Such reduction is interesting because it enables one to readily use one's favorite BN inference algorithm to efficiently evaluate IDs. Two such reduction methods have been proposed previously (Cooper 1988, Shachter and Peot 1992). This paper proposes a new method. The BN inference problems induced by the mew method are much easier to solve than those induced by the two previous methods.

artificial intelligence, machine learning, node, (17 more...)

1301.7416

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Networks from the Point of View of Chain Graphs

Studeny, Milan

AThe paper gives a few arguments in favour of the use of chain graphs for description of probabilistic conditional independence structures. Every Bayesian network model can be equivalently introduced by means of a factorization formula with respect to a chain graph which is Markov equivalent to the Bayesian network. A graphical characterization of such graphs is given. The class of equivalent graphs can be represented by a distinguished graph which is called the largest chain graph. The factorization formula with respect to the largest chain graph is a basis of a proposal of how to represent the corresponding (discrete) probability distribution in a computer (i.e. parametrize it). This way does not depend on the choice of a particular Bayesian network from the class of equivalent networks and seems to be the most efficient way from the point of view of memory demands. A separation criterion for reading independency statements from a chain graph is formulated in a simpler way. It resembles the well-known d-separation criterion for Bayesian networks and can be implemented locally.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1301.7414

Country: Europe > Czechia (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Switching Portfolios

Singer, Yoram

A constant rebalanced portfolio is an asset allocation algorithm which keeps the same distribution of wealth among a set of assets along a period of time. Recently, there has been work on on-line portfolio selection algorithms which are competitive with the best constant rebalanced portfolio determined in hindsight. By their nature, these algorithms employ the assumption that high returns can be achieved using a fixed asset allocation strategy. However, stock markets are far from being stationary and in many cases the wealth achieved by a constant rebalanced portfolio is much smaller than the wealth achieved by an ad-hoc investment strategy that adapts to changes in the market. In this paper we present an efficient Bayesian portfolio selection algorithm that is able to track a changing market. We also describe a simple extension of the algorithm for the case of a general transaction cost, including the transactions cost models recently investigated by Blum and kalai. We provide a simple analysis of the competitiveness of the algorithm and check its performance on real stock data from the New York Stock Exchange accumulated during a 22-year period.

algorithm, artificial intelligence, machine learning, (15 more...)

1301.7413

Country: North America > United States > New York (0.24)

Genre: Research Report (0.64)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams)

Shachter, Ross D.

One of the benefits of belief networks and influence diagrams is that so much knowledge is captured in the graphical structure. In particular, statements of conditional irrelevance (or independence) can be verified in time linear in the size of the graph. To resolve a particular inference query or decision problem, only some of the possible states and probability distributions must be specified, the "requisite information." This paper presents a new, simple, and efficient "Bayes-ball" algorithm which is well-suited to both new students of belief networks and state of the art implementations. The Bayes-ball algorithm determines irrelevant sets and requisite information more efficiently than existing methods, and is linear in the size of the graph for belief networks and influence diagrams.

artificial intelligence, machine learning, node, (14 more...)

1301.7412

Country: North America > United States > California (0.29)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sebastiani, Paola, Ramoni, Marco

Decision Theoretic Foundations of Graphical Model Selection

This paper describes a decision theoretic formulation of learning the graphical structure of a Bayesian Belief Network from data. This framework subsumes the standard Bayesian approach of choosing the model with the largest posterior probability as the solution of a decision problem with a 0-1 loss function and allows the use of more general loss functions able to trade-off the complexity of the selected model and the error of choosing an oversimplified model. A new class of loss functions, called disintegrable, is introduced, to allow the decision problem to match the decomposability of the graphical model. With this class of loss functions, the optimal solution to the decision problem can be found using an efficient bottom-up search strategy.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1301.741

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Empirical Evaluation of Approximation Algorithms for Probabilistic Decoding

Rish, Irina, Kask, Kalev, Dechter, Rina

It was recently shown that the problem of decoding messages transmitted through a noisy channel can be formulated as a belief updating task over a probabilistic network [McEliece]. Moreover, it was observed that iterative application of the (linear time) Pearl's belief propagation algorithm designed for polytrees outperformed state of the art decoding algorithms, even though the corresponding networks may have many cycles. This paper demonstrates empirically that an approximation algorithm approx-mpe for solving the most probable explanation (MPE) problem, developed within the recently proposed mini-bucket elimination framework [Dechter96], outperforms iterative belief propagation on classes of coding networks that have bounded induced width. Our experiments suggest that approximate MPE decoders can be good competitors to the approximate belief updating decoders.

algorithm, artificial intelligence, machine learning, (14 more...)

1301.7409

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)