AITopics

We develop a protocol for optimizing dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of communication switches. This protocol requires only local communication and simple computations which are distributed among devices. The protocol is scalable to large networks. As a motivating example, we discuss a problem involving optimization of power consumption, delay, and buffer overflow in a sensor network. Our approach builds on policy gradient methods for optimization of Markov decision processes. The protocol can be viewed as an extension of policy gradient methods to a context involving a team of agents optimizing aggregate performance through asynchronous distributed communication and computation. We establish that the dynamics of the protocol approximate the solution to an ordinary differential equation that follows the gradient of the performance objective.

ith sensor, protocol, sensor, (12 more...)

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Semiconductors & Electronics (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Rudary, Matthew R., Singh, Satinder P.

A Nonlinear Predictive State Representation

Predictive state representations (PSRs) use predictions of a set of tests to represent the state of controlled dynamical systems. One reason why this representation is exciting as an alternative to partially observable Markov decision processes (POMDPs) is that PSR models of dynamical systems may be much more compact than POMDP models. Empirical work on PSRs to date has focused on linear PSRs, which have not allowed for compression relative to POMDPs. We introduce a new notion of tests which allows us to define a new type of PSR that is nonlinear in general and allows for exponential compression in some deterministic dynamical systems. These new tests, called e-tests, are related to the tests used by Rivest and Schapire [1] in their work with the diversity representation, but our PSR avoids some of the pitfalls of their representation--in particular, its potential to be exponentially larger than the equivalent POMDP.

dynamical system, psr, representation, (15 more...)

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Fern, Alan, Yoon, Sungwook, Givan, Robert

Approximate Policy Iteration with a Policy Language Bias

We explore approximate policy iteration, replacing the usual costfunction learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and stochastic variants) by solving such domains as extremely large MDPs.

api, control knowledge, planning domain, (14 more...)

Country: North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
(2 more...)

Poupart, Pascal, Boutilier, Craig

Bounded Finite State Controllers

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic finite state controllers, combining several advantages of gradient ascent (efficiency, search through restricted controller space) and policy iteration (less vulnerability to local optima).

belief state, controller, node, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.30)
Europe > Sweden > Stockholm > Stockholm (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(6 more...)

Industry: Government > Regional Government > North America Government > United States Government (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Chang, Yu-han, Ho, Tracey, Kaelbling, Leslie P.

All learning is Local: Multi-agent Learning in Global Reward Games

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms. We provide a simple and efficient algorithm that in part uses a linear system to model the world from a single agent's limited perspective, and takes advantage of Kalman filtering to allow an agent to construct a good training signal and learn an effective policy.

agent, optimal policy, reward signal, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Parkes, David C., Singh, Satinder P.

An MDP-Based Approach to Online Mechanism Design

Online mechanism design (MD) considers the problem of providing incentives to implement desired system-wide outcomes in systems with self-interested agents that arrive and depart dynamically. Agents can choose to misrepresent their arrival and departure times, in addition to information about their value for different outcomes. We consider the problem of maximizing the total longterm value of the system despite the self-interest of agents. The online MD problem induces a Markov Decision Process (MDP), which when solved can be used to implement optimal policies in a truth-revealing Bayesian-Nash equilibrium.

agent, mechanism, vcg mechanism, (17 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Michigan (0.04)

Industry: Consumer Products & Services > Travel (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Theocharous, Georgios, Kaelbling, Leslie P.

Approximate Planning in POMDPs with Macro-Actions

Recent research has demonstrated that useful POMDP solutions do not require consideration of the entire belief space. We extend this idea with the notion of temporal abstraction. We present and explore a new reinforcement learning algorithm over grid-points in belief space, which uses macro-actions and Monte Carlo updates of the Q-values. We apply the algorithm to a large scale robot navigation task and demonstrate that with temporal abstraction we can consider an even smaller part of the belief space, we can learn POMDP policies faster, and we can do information gathering more efficiently.

algorithm, belief space, pomdp, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Pineau, Joelle, Gordon, Geoffrey J., Thrun, Sebastian

Applying Metric-Trees to Belief-Point POMDPs

Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algorithms do not exploit this property. This paper presents a new metric-tree algorithm which can be used in the context of POMDP planning to sort belief points spatially, and then perform fast value function updates over groups of points. We present results showing that this approach can reduce computation in point-based POMDP algorithms for a wide range of problems.

algorithm, belief point, node, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Hughes, Nicholas P., Tarassenko, Lionel, Roberts, Stephen J.

Markov Models for Automated ECG Interval Analysis

We examine the use of hidden Markov and hidden semi-Markov models for automatically segmenting an electrocardiogram waveform into its constituent waveform features. An undecimated wavelet transform is used to generate an overcomplete representation of the signal that is more appropriate for subsequent modelling. We show that the state durations implicit in a standard hidden Markov model are ill-suited to those of real ECG features, and we investigate the use of hidden semi-Markov models for improved state duration modelling.

ecg waveform, state duration, waveform, (13 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Hungary > Budapest > Budapest (0.04)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Felzenszwalb, Pedro F., Huttenlocher, Daniel P., Kleinberg, Jon M.

Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis

In applying Hidden Markov Models to the analysis of massive data streams, it is often necessary to use an artificially reduced set of states; this is due in large part to the fact that the basic HMM estimation algorithms have a quadratic dependence on the size of the state set. We present algorithms that reduce this computational bottleneck to linear or near-linear time, when the states can be embedded in an underlying grid of parameters. This type of state representation arises in many domains; in particular, we show an application to traffic analysis at a high-volume Web site.

algorithm, probability, sequence, (15 more...)

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)