AITopics

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.35)

Lemmon, Michael, Szymanski, Peter T.

Interior Point Implementations of Alternating Minimization Training

Neural Information Processing SystemsDec-31-1995

AM techniques were first introduced in soft-competitive learning algorithms[l]. Thistraining procedure was later shown to be closely related to Expectation-Maximization algorithms used by the statistical estimation community[2].

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)

AUTOCELL: An Intelligent Cellular Mobile Network Management System

Low, Chee-Meng, Tan, You-Tong, Choo, Soo-Yong, Lau, Sie-Hung, Tay, Soo-Meng

AI MagazineDec-15-1995

AUTOCELL is a system developed to assist in the operation and management of cellular mobile networks operated by Singapore Telecom. Its deployment is in line with the company's strategic move to introduce intelligent software into its operations. With the help of AI concepts and techniques, the system has enhanced the operational efficiency and network capacity and increased customer satisfaction with the network.

artificial intelligence, autocell, optimization problem, (18 more...)

AI Magazine

Country: Asia > Singapore (0.27)

Industry:

Telecommunications > Networks (1.00)
Information Technology > Networks (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Russell, S. J., Subramanian, D.

Provably Bounded-Optimal Agents

Journal of Artificial Intelligence ResearchMay-1-1995

Since its inception, artificial intelligence has relied upon a theoretical foundation centered around perfect rationality as the desired property of intelligent systems. We argue, as others have done, that this foundation is inadequate because it imposes fundamentally unsatisfiable requirements. As a result, there has arisen a wide gap between theory and practice in AI, hindering progress in the field. We propose instead a property called bounded optimality. Roughly speaking, an agent is bounded-optimal if its program is a solution to the constrained optimization problem presented by its architecture and the task environment. We show how to construct agents with this property for a simple class of machine architectures in a broad class of real-time environments. We illustrate these results using a simple model of an automated mail sorting facility. We also define a weaker property, asymptotic bounded optimality (ABO), that generalizes the notion of optimality in classical complexity theory. We then construct universal ABO programs, i.e., programs that are ABO no matter what real-time constraints are applied. Universal ABO programs can be used as building blocks for more complex systems. We conclude with a discussion of the prospects for bounded optimality as a theoretical basis for AI, and relate it to similar trends in philosophy, economics, and game theory.

arrival time average utility, provably bounded-optimal agent, rule 0, (2 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.133

AI Access Foundation

10134

Journal of Artificial Intelligence Research

Technology:

Information Technology > Architecture > Real Time Systems (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)

Buckland, Kenneth M., Lawrence, Peter D.

Transition Point Dynamic Programming

Transition point dynamic programming (TPDP) is a memorybased, reinforcement learning, direct dynamic programming approach to adaptive optimal control that can reduce the learning time and memory usage required for the control of continuous stochastic dynamic systems. TPDP does so by determining an ideal set of transition points (TPs) which specify only the control action changes necessary for optimal control. TPDP converges to an ideal TP set by using a variation of Q-Iearning to assess the merits of adding, swapping and removing TPs from states throughout the state space. When applied to a race track problem, TPDP learned the optimal control policy much sooner than conventional Q-Iearning, and was able to do so using less memory. 1 INTRODUCTION Dynamic programming (DP) approaches can be utilized to determine optimal control policies for continuous stochastic dynamic systems when the state spaces of those systems have been quantized with a resolution suitable for control (Barto et al., 1991). DP controllers, in lheir simplest form, are memory-based controllers that operate by repeatedly updating cost values associated with every state in the discretized state space (Barto et al., 1991).

buckland, q-iearning, tpdp, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Lu, Chien-Ping, Mjolsness, Eric

Two-Dimensional Object Localization by Coarse-to-Fine Correlation Matching

Two tightly coupled subproblems need to be solved for locating and recognizing the model: the correspondence problem (how are scene features put into correspondence with model features?),

line segment, objective function, two-dimensional object localization, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Elfadel, I. M., J. L. Wyatt, Jr.

The "Softmax" Nonlinearity: Derivation Using Statistical Mechanics and Useful Properties as a Multiterminal Analog Circuit Element

In this paper, we show a reciprocal implementation of the "softmax" nonlinearity that is usually used to enforce local competition between neurons [Peterson, 1989]. We show that the circuit is passive and incrementally passive, and we explicitly compute its content and co-content functions. This circuit adds a new element to the library of the analog circuit designer that can be combined with reciprocal constraint boxes [Harris, 1988] and nonlinear resistive fuses [Harris, 1989] to form fast, analog VLSI optimization networks.

circuit implementation, implementation, multiterminal element, (11 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > San Diego County > San Diego (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Jaakkola, Tommi, Jordan, Michael I., Singh, Satinder P.

Convergence of Stochastic Iterative Dynamic Programming Algorithms

Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DPbased learning algorithms to the powerful techniques of stochastic approximation via a new convergence theorem, enabling us to establish a class of convergent algorithms to which both TD("\) and Q-Iearning belong. 1 INTRODUCTION Learning to predict the future and to find an optimal way of controlling it are the basic goals of learning systems that interact with their environment. A variety of algorithms are currently being studied for the purposes of prediction and control in incompletely specified, stochastic environments. Here we consider learning algorithms defined in Markov environments. There are actions or controls (u) available for the learner that affect both the state transition probabilities, and the probability distribution for the immediate, state dependent costs (Ci(u)) incurred by the learner.

algorithm, convergence, theorem 1, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.73)

Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming

Atkeson, Christopher G.

Dynamic programming provides a methodology to plan trajectories and design controllers and estimators for nonlinear systems. However, general dynamic programming is computationally intractable. We have developed procedures that allow more complex planning problems to be solved. We have modified the State Increment Dynamic Programming approach of Larson (1968) in several ways: 1. In State Increment DP, a constant action is integrated to form a trajectory segment from the center of a cell to its boundary. We use second order local trajectory optimization (Differential Dynamic Programming) to generate an optimal trajectory and form an optimal policy in a tube surrounding the optimal trajectory within a cell. The trajectory segment and local policy are globally optimal, up to the resolution of the representation of the value function on the boundary of the cell.

optimal trajectory, trajectory, value function, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Buckland, Kenneth M., Lawrence, Peter D.

Transition Point Dynamic Programming

Transition point dynamic programming (TPDP) is a memorybased, reinforcement learning, direct dynamic programming approach to adaptive optimal control that can reduce the learning time and memory usage required for the control of continuous stochastic dynamic systems. TPDP does so by determining an ideal set of transition points (TPs) which specify only the control action changes necessary for optimal control. TPDP converges to an ideal TP set by using a variation of Q-Iearning to assess the merits of adding, swapping and removing TPs from states throughout the state space. When applied to a race track problem, TPDP learned the optimal control policy much sooner than conventional Q-Iearning, and was able to do so using less memory. 1 INTRODUCTION Dynamic programming (DP) approaches can be utilized to determine optimal control policies for continuous stochastic dynamic systems when the state spaces of those systems have been quantized with a resolution suitable for control (Barto et al., 1991). DP controllers, in lheir simplest form, are memory-based controllers that operate by repeatedly updating cost values associated with every state in the discretized state space (Barto et al., 1991).

buckland, q-iearning, tpdp, (14 more...)