AITopics

doi: 10.1613/jair.2575

10576

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report (0.92)

Industry: Leisure & Entertainment (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
(3 more...)

Salojärvi, Jarkko, Puolamäki, Kai, Savia, Eerika, Kaski, Samuel

Inference with Discriminative Posterior

arXiv.org Machine LearningNov-18-2008

We study Bayesian discriminative inference given a model family $p(c,\x, \theta)$ that is assumed to contain all our prior information but still known to be incorrect. This falls in between "standard" Bayesian generative modeling and Bayesian regression, where the margin $p(\x,\theta)$ is known to be uninformative about $p(c|\x,\theta)$. We give an axiomatic proof that discriminative posterior is consistent for conditional inference; using the discriminative posterior is standard practice in classical Bayesian regression, but we show that it is theoretically justified for model families of joint densities as well. A practical benefit compared to Bayesian regression is that the standard methods of handling missing values in generative modeling can be extended into discriminative inference, which is useful if the amount of data is small. Compared to standard generative modeling, discriminative posterior results in better conditional inference if the model family is incorrect. If the model family contains also the true model, the discriminative posterior gives the same result as standard Bayesian generative modeling. Practical computation is done with Markov chain Monte Carlo.

artificial intelligence, machine learning, posterior, (18 more...)

arXiv.org Machine Learning

0807.3470

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

arXiv.org Machine LearningNov-7-2008

Improved Estimation of High-dimensional Ising Models

Kolar, M., Xing, E. P.

We consider the problem of jointly estimating the parameters as well as the structure of binary valued Markov Random Fields, in contrast to earlier work that focus on one of the two problems. We formulate the problem as a maximization of $\ell_1$-regularized surrogate likelihood that allows us to find a sparse solution. Our optimization technique efficiently incorporates the cutting-plane algorithm in order to obtain a tighter outer bound on the marginal polytope, which results in improvement of both parameter estimates and approximation to marginals. On synthetic data, we compare our algorithm on the two estimation tasks to the other existing methods. We analyze the method in the high-dimensional setting, where the number of dimensions $p$ is allowed to grow with the number of observations $n$. The rate of convergence of the estimate is demonstrated to depend explicitly on the sparsity of the underlying graph.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

0811.1239

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Ryabko, Daniil, Hutter, Marcus

On the Possibility of Learning in Reactive Environments with Arbitrary Dependence

arXiv.org Artificial IntelligenceOct-31-2008

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. We find some sufficient conditions on the class of environments under which an agent exists which attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

0810.5636

Country:

Europe (0.68)
Oceania > Australia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Jeon, Sung-eok, Ji, Chunayi

Randomized Distributed Configuration Management of Wireless Networks: Multi-layer Markov Random Fields and Near-Optimality

arXiv.org Artificial IntelligenceSep-11-2008

Distributed configuration management is imperative for wireless infrastructureless networks where each node adjusts locally its physical and logical configuration through information exchange with neighbors. Two issues remain open. The first is the optimality. The second is the complexity. We study these issues through modeling, analysis, and randomized distributed algorithms. Modeling defines the optimality. We first derive a global probabilistic model for a network configuration which characterizes jointly the statistical spatial dependence of a physical- and a logical-configuration. We then show that a local model which approximates the global model is a two-layer Markov Random Field or a random bond model. The complexity of the local model is the communication range among nodes. The local model is near-optimal when the approximation error to the global model is within a given error bound. We analyze the trade-off between an approximation error and complexity, and derive sufficient conditions on the near-optimality of the local model. We validate the model, the analysis and the randomized distributed algorithms also through simulation.

algorithm, configuration, local model, (16 more...)

arXiv.org Artificial Intelligence

0809.1916

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(5 more...)

Genre: Research Report (0.81)

Industry: Telecommunications > Networks (0.89)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Journal of Artificial Intelligence ResearchJul-29-2008

Online Planning Algorithms for POMDPs

Ross, S., Pineau, J., Paquet, S., Chaib-draa, B.

Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.

algorithm, belief state, node, (15 more...)

doi: 10.1613/jair.2567

10559

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.67)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Freund, Yoav, Hsu, Daniel

A new Hedging algorithm and its application to inferring latent random variables

arXiv.org Artificial IntelligenceJun-30-2008

We present a new online learning algorithm for cumulative discounted gain. This learning algorithm does not use exponential weights on the experts. Instead, it uses a weighting scheme that depends on the regret of the master algorithm relative to the experts. In particular, experts whose discounted cumulative gain is smaller (worse) than that of the master algorithm receive zero weight. We also sketch how a regret-based algorithm can be used as an alternative to Bayesian averaging in the context of inferring latent random variables.

artificial intelligence, machine learning, normalhedge, (16 more...)

arXiv.org Artificial Intelligence

0806.4802

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Journal of Artificial Intelligence ResearchJun-27-2008

Refining the Execution of Abstract Actions with Learned Action Models

Stulp, F., Beetz, M.

Robots reason about abstract actions, such as "go to position `l'", in order to decide what to do or to generate plans for their intended course of action. The use of abstract actions enables robots to employ small action libraries, which reduces the search space for decision making. When executing the actions, however, the robot must tailor the abstract actions to the specific task and situation context at hand. In this article we propose a novel robot action execution system that learns success and performance models for possible specializations of abstract actions. At execution time, the robot uses these models to optimize the execution of abstract actions to the respective task contexts. The robot can so use abstract actions for efficient reasoning, without compromising the performance of action execution. We show the impact of our action execution model in three robotic domains and on two kinds of action execution problems: (1) the instantiation of free action parameters to optimize the expected performance of action sequences; (2) the automatic introduction of additional subgoals to make action sequences more reliable.

action model, robot, subgoal refinement, (13 more...)

doi: 10.1613/jair.2479

10554

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
(3 more...)

Csaji, B. C., Monostori, L.

Adaptive Stochastic Resource Control: A Machine Learning Approach

Journal of Artificial Intelligence ResearchJun-25-2008

The paper investigates stochastic resource allocation problems with scarce, reusable resources and non-preemtive, time-dependent, interconnected tasks. This approach is a natural generalization of several standard resource management problems, such as scheduling and transportation problems. First, reactive solutions are considered and defined as control policies of suitably reformulated Markov decision processes (MDPs). We argue that this reformulation has several favorable properties, such as it has finite state and action spaces, it is aperiodic, hence all policies are proper and the space of control policies can be safely restricted. Next, approximate dynamic programming (ADP) methods, such as fitted Q-learning, are suggested for computing an efficient control policy. In order to compactly maintain the cost-to-go function, two representations are studied: hash tables and support vector regression (SVR), particularly, nu-SVRs. Several additional improvements, such as the application of limited-lookahead rollout algorithms in the initial phases, action space decomposition, task clustering and distributed sampling are investigated, too. Finally, experimental results on both benchmark and industry-related data are presented.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

doi: 10.1613/jair.2548

10553

Country:

North America > United States > Massachusetts (0.28)
Europe > Hungary (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Oliehoek, F. A., Spaan, M. T. J., Vlassis, N.

Optimal and Approximate Q-value Functions for Decentralized POMDPs

Journal of Artificial Intelligence ResearchMay-28-2008

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.

agent, dec-pomdp, q-value function, (12 more...)

doi: 10.1613/jair.2447

10549

Country:

North America > United States (0.27)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Greece > Crete > Chania (0.04)

Genre: Overview (0.45)

Industry:

Leisure & Entertainment > Games (0.67)
Law Enforcement & Public Safety > Fire & Emergency Services (0.48)
Leisure & Entertainment > Sports (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)