An Alternative Model for Mixtures of Experts
Xu, Lei, Jordan, Michael I., Hinton, Geoffrey E.
Hinton, Dept. of Computer Science, University of Toronto, Toronto, M5S 1A4, Canada. Abstract: We propose an alternative model for mixtures of experts which uses a different parametric form for the gating network. The modified model is trained by the EM algorithm. In comparison with earlier models, trained by either EM or gradient ascent, there is no need to select a learning step size. We report simulation experiments which show that the new architecture yields faster convergence. We also apply the new model to two problem domains: piecewise nonlinear function approximation and the combination of multiple previously trained classifiers. 1 Introduction: For the mixtures of experts architecture (Jacobs, Jordan, Nowlan & Hinton, 1991), the EM algorithm decouples the learning process in a manner that fits well with the modular structure and yields a considerably improved rate of convergence (Jordan & Jacobs, 1994).
On-line Learning of Dichotomies
Barkai, N., Seung, H. S., Sompolinsky, H.
The performance of online algorithms for learning dichotomies is studied. In online learning, the number of examples P is equivalent to the learning time, since each example is presented only once. The learning curve, or generalization error as a function of P, depends on the schedule at which the learning rate is lowered. For a target that is a perceptron rule, the learning curve of the perceptron algorithm can decrease as fast as P^{-1}, if the schedule is optimized. If the target is not realizable by a perceptron, the perceptron algorithm does not generally converge to the solution with lowest generalization error.
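As an illustration of the setting described above, here is a minimal sketch of an online perceptron in which each example is seen exactly once and the learning rate is lowered over time. The schedule eta0 / t and the names `online_perceptron` and `eta0` are assumptions made for this example; the abstract does not state which schedule is optimal.

```python
def online_perceptron(stream, dim, eta0=1.0):
    """Single pass over (x, y) pairs with labels y in {-1, +1}.

    The learning rate eta0 / t is an assumed schedule, chosen only to
    show a rate that decreases with the number of examples seen.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = eta0 / t
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        if margin <= 0:  # misclassified (or on the boundary): update
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


# Two examples, each presented once.
examples = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
weights = online_perceptron(examples, dim=2)
```

Because each example is consumed once, the loop index t plays the role of both the number of examples P and the learning time.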
Unsupervised Classification of 3D Objects from 2D Views
Suzuki, Satoshi, Ando, Hiroshi
Satoshi Suzuki, Hiroshi Ando, ATR Human Information Processing Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan, satoshi@hip.atr.co.jp, ando@hip.atr.co.jp. Abstract: This paper presents an unsupervised learning scheme for categorizing 3D objects from their 2D projected images. The scheme exploits an auto-associative network's ability to encode each view of a single object into a representation that indicates its view direction. We propose two models that employ different classification mechanisms: the first model selects an auto-associative network whose recovered view best matches the input view, and the second model is based on a modular architecture whose additional network classifies the views by splitting the input space nonlinearly. We demonstrate the effectiveness of the proposed classification models through simulations using 3D wire-frame objects. 1 Introduction: The human visual system can recognize various 3D (three-dimensional) objects from their 2D (two-dimensional) retinal images although the images vary significantly as the viewpoint changes. Recent computational models have explored how to learn to recognize 3D objects from their projected views (Poggio & Edelman, 1990). Most existing models are, however, based on supervised learning, i.e., during training the teacher tells which object each view belongs to.
Adaptive Elastic Input Field for Recognition Improvement
Minoru Asogawa. For machines to perform classification tasks, such as speech and character recognition, appropriately handling deformed patterns is a key to achieving high performance. The author presents a new type of classification system, an Adaptive Input Field Neural Network (AIFNN), which includes a simple pre-trained neural network and an elastic input field attached to an input layer. By using an iterative method, AIFNN can determine an optimal affine translation for an elastic input field to compensate for the original deformations. The convergence of the AIFNN algorithm is shown. AIFNN is applied to handwritten numeral recognition. As a result, 10.83% of originally misclassified patterns are correctly categorized and total performance is improved, without modifying the neural network. 1 Introduction: For machines to accomplish classification tasks, such as speech and character recognition, appropriately handling deformed patterns is a key to achieving high performance [Simard 92] [Simard 93] [Hinton 92] [Barnard 91]. The number of reasonable deformations of patterns is enormous, since they can be either linear translations (an affine translation or a time shifting) or nonlinear deformations (a set of combinations of partial translations), or both. Although a simple neural network (e.g. a 3-layered neural network) is able to adapt
Boltzmann Chains and Hidden Markov Models
Saul, Lawrence K., Jordan, Michael I.
Statistical models of discrete time series have a wide range of applications, most notably to problems in speech recognition (Juang & Rabiner, 1991) and molecular biology (Baldi, Chauvin, Hunkapiller, & McClure, 1992). A common problem in these fields is to find a probabilistic model, and a set of model parameters, that account for sequences of observed data. Hidden Markov models (HMMs) have been particularly successful at modeling discrete time series. One reason for this is the powerful learning rule (Baum, 1972), a special case of the Expectation-Maximization (EM) procedure for maximum likelihood estimation (Dempster, Laird, & Rubin, 1977).
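To make "a model that accounts for sequences of observed data" concrete, the sketch below computes the likelihood an HMM assigns to a discrete observation sequence using the standard forward recursion, the quantity that maximum likelihood estimation tries to increase. The function name and the toy parameters are assumptions chosen for illustration; they do not come from the paper.

```python
def forward_likelihood(pi, A, B, obs):
    """Return P(obs) under an HMM via the forward recursion.

    pi[i]   : initial probability of hidden state i
    A[i][j] : probability of moving from state i to state j
    B[i][o] : probability that state i emits symbol o
    """
    n = len(pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


# Toy two-state, two-symbol model (illustrative values only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
likelihood = forward_likelihood(pi, A, B, [0, 1])
```

The recursion costs on the order of n^2 operations per time step, rather than summing over every possible hidden state path.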
DAS: Intelligent Scheduling Systems for Shipbuilding
Lee, Jae Kyu, Lee, Kyoung Jun, Hong, June Seok, Kim, Wooju, Kim, Eun Young, Choi, Soo Yeoul, Kim, Ho Dong, Yang, Ok Ryul, Choi, Hyung Rim
Daewoo Shipbuilding Company, one of the largest shipbuilders in the world, has experienced a great deal of trouble with the planning and scheduling of its production process. To solve these problems, from 1991 to 1993, Korea Advanced Institute of Science and Technology (KAIST) and Daewoo jointly conducted the Daewoo Shipbuilding Scheduling (DAS) Project. To integrate the scheduling expert systems for shipbuilding, we used a hierarchical scheduling architecture. To automate the dynamic spatial layout of objects in various areas of the shipyard, we developed spatial scheduling expert systems. For reliable estimation of person-hour requirements, we implemented the neural network-based person-hour estimator. In addition, we developed the paneled-block assembly shop scheduler and the long-range production planner. For this large-scale project, we devised a phased development strategy consisting of three phases: (1) vision revelation, (2) data-dependent realization, and (3) prospective enhancement. The DAS systems were successfully launched in January 1994 and are actively being used as indispensable systems in the shipyard, resulting in significant improvement in productivity and visible and positive effects in many areas.
Montreal Wrap-Up
Randy Davis announced the appointment of six new program managers at ARPA. At IJCAI-95, Randall Davis assumed the office of president of the American Association for Artificial Intelligence (AAAI). Davis is a professor of electrical engineering and computer science and associate director of the AI Lab at the Massachusetts Institute ... Davis succeeds Barbara Grosz, Gordon McKay professor of computer science in the Division of Applied Sciences at Harvard University. For many attending the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), the most difficult problem was choosing which session to attend in the rich, varied program. ... data-mining application from the U.S. Department of the Treasury that identifies potential money laundering to a small mobile LEGO robot ... for consideration this year," noted Ray Perrault of SRI International, chair of the conference. "This is more than at IJCAI-93 and at recent National Conferences on AI in the ... rate was under 25 percent, showing that there is a great deal of work going on, and the scientific standard of IJCAI matches or exceeds that of ...
AUTOCELL: An Intelligent Cellular Mobile Network Management System
Low, Chee-Meng, Tan, You-Tong, Choo, Soo-Yong, Lau, Sie-Hung, Tay, Soo-Meng
AUTOCELL is a system developed to assist in the operation and management of cellular mobile networks operated by Singapore Telecom. Its deployment is in line with the company's strategic move to introduce intelligent software into its operations. With the help of AI concepts and techniques, the system has enhanced the operational efficiency and network capacity and increased customer satisfaction with the network.
OPUS: An Efficient Admissible Algorithm for Unordered Search
OPUS is a branch and bound search algorithm that enables efficient admissible search through spaces for which the order of search operator application is not significant. The algorithm's search efficiency is demonstrated with respect to very large machine learning search spaces. The use of admissible search is of potential value to the machine learning community as it means that the exact learning biases to be employed for complex learning tasks can be precisely specified and manipulated. OPUS also has potential for application in other areas of artificial intelligence, notably, truth maintenance.
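The idea of admissible search over a space where operator order does not matter can be sketched as follows. This is a simplified illustration, not the published OPUS algorithm: it visits each subset of operators at most once by only letting a node apply operators that come later in a fixed list, and it prunes a branch when an optimistic bound shows nothing below it can beat the best value found so far. The knapsack-style objective, the item data, and all function names are assumptions made for the example.

```python
def unordered_search(operators, value, optimistic):
    """Branch and bound over subsets of operators.

    value(node)            : score of a subset (higher is better)
    optimistic(node, rest) : upper bound on the score of any subset
                             formed by adding operators from rest to node
    """
    best = (value(frozenset()), frozenset())

    def expand(chosen, remaining):
        nonlocal best
        for i, op in enumerate(remaining):
            node = chosen | {op}
            rest = remaining[i + 1:]  # later operators only: no subset is revisited
            v = value(node)
            if v > best[0]:
                best = (v, node)
            # Pruning is safe only if the bound never underestimates.
            if optimistic(node, rest) > best[0]:
                expand(node, rest)

    expand(frozenset(), tuple(operators))
    return best


# Hypothetical objective: maximize total worth within a weight limit.
ITEMS = {"a": (2, 3), "b": (3, 4), "c": (4, 5), "d": (5, 6)}  # name: (weight, worth)
CAPACITY = 5

def value(node):
    if sum(ITEMS[k][0] for k in node) > CAPACITY:
        return float("-inf")
    return sum(ITEMS[k][1] for k in node)

def optimistic(node, rest):
    if sum(ITEMS[k][0] for k in node) > CAPACITY:
        return float("-inf")  # adding items cannot restore feasibility
    return sum(ITEMS[k][1] for k in node) + sum(ITEMS[k][1] for k in rest)

result = unordered_search(ITEMS, value, optimistic)
```

Because the bound never underestimates what a branch can achieve, the pruned search returns the same optimum as exhaustive enumeration of all subsets.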
Translating between Horn Representations and their Characteristic Models
Characteristic models are an alternative, model-based representation for Horn expressions. It has been shown that these two representations are incomparable and that each has its advantages over the other. It is therefore natural to ask what the cost is of translating, back and forth, between these representations. Interestingly, the same translation questions arise in database theory, where they have applications to the design of relational databases. This paper studies the computational complexity of these problems. Our main result is that the two translation problems are equivalent under polynomial reductions, and that they are equivalent to the corresponding decision problem. Namely, translating is equivalent to deciding whether a given set of models is the set of characteristic models for a given Horn expression. We also relate these problems to the hypergraph transversal problem, a well-known problem which is related to other applications in AI and for which no polynomial time algorithm is known. It is shown that in general our translation problems are at least as hard as the hypergraph transversal problem, and in a special case they are equivalent to it.
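For readers unfamiliar with the representation, the sketch below uses the usual definition: the models of a Horn expression are closed under intersection, and the characteristic models are those that cannot be obtained by intersecting other models. The encoding (each model as the frozenset of variables it makes true) and the function names are assumptions for illustration. The brute-force approach is exponential in general and is not one of the translation procedures studied in the paper.

```python
def intersection_closure(models):
    """Smallest superset of `models` closed under pairwise intersection."""
    closed = set(models)
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                c = a & b
                if c not in closed:
                    closed.add(c)
                    changed = True
    return closed


def characteristic_models(models):
    """Models that are not intersections of the other models."""
    closed = intersection_closure(models)
    return {m for m in closed if m not in intersection_closure(closed - {m})}


# Models given as sets of true variables (toy example).
M = {frozenset("abc"), frozenset("ab"), frozenset("bc"), frozenset("b")}
char = characteristic_models(M)
```

Here the model {b} is dropped because it equals {a, b} ∩ {b, c}, and closing the remaining three models under intersection recovers the full set M.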