osition
Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld
Khajehnejad, Moein, Habibollahi, Forough, Paul, Aswin, Razi, Adeel, Kagan, Brett J.
How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to the state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game `Pong'. Using DishBrain, a system that embodies in vitro neural networks with in silico computation using a high-density multi-electrode array, we contrasted the learning rate and the performance of these biological systems against time-matched learning from three state-of-the-art deep RL algorithms (i.e., DQN, A2C, and PPO) in the same game environment. This allowed a meaningful comparison between biological neural systems and deep RL. We find that when samples are limited to a real-world time course, even these very simple biological cultures outperformed deep RL algorithms across various game performance characteristics, implying a higher sample efficiency. Ultimately, even when tested across multiple types of information input to assess the impact of higher dimensional data input, biological neurons showcased faster learning than all deep reinforcement learning agents.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > Switzerland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
- Leisure & Entertainment > Games > Computer Games (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
GIB: Imperfect Information in a Computationally Challenging Game
This paper investigates the problems arising in the construction of a program to play the game of contract bridge. These problems include both the difficulty of solving the game's perfect information variant, and techniques needed to address the fact that bridge is not, in fact, a perfect information game. GIB, the program being described, involves five separate technical advances: partition search, the practical application of Monte Carlo techniques to realistic problems, a focus on achievable sets to solve problems inherent in the Monte Carlo approach, an extension of alpha-beta pruning from total orders to arbitrary distributive lattices, and the use of squeaky wheel optimization to find approximately optimal solutions to cardplay problems. GIB is currently believed to be of approximately expert caliber, and is currently the strongest computer bridge program in the world.
- Europe > Denmark (0.04)
- Oceania > Australia (0.04)
- North America > United States > Oregon > Lane County > Eugene (0.04)
- (2 more...)
Cue Phrase Classification Using Machine Learning
Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (Cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique for not only automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods.
A Divergence Critic for Inductive Proof
Inductive theorem provers often diverge. This paper describes a simple critic, a computer program which monitors the construction of inductive proofs attempting to identify diverging proof attempts. Divergence is recognized by means of a ``difference matching'' procedure. The critic then proposes lemmas and generalizations which ``ripple'' these differences away so that the proof can go through without divergence. The critic enables the theorem prover Spike to prove many theorems completely automatically from the definitions alone.
The Difficulties of Learning Logic Programs with Cut
Bergadano, F., Gunetti, D., Trinchero, U.
As real logic programmers normally use cut (!), an effective learning procedure for logic programs should be able to deal with it. Because the cut predicate has only a procedural meaning, clauses containing cut cannot be learned using an extensional evaluation method, as is done in most learning systems. On the other hand, searching a space of possible programs (instead of a space of independent clauses) is unfeasible. An alternative solution is to generate first a candidate base program which covers the positive examples, and then make it consistent by inserting cut where appropriate. The problem of learning programs with cut has not been investigated before and this seems to be a natural and reasonable approach. We generalize this scheme and investigate the difficulties that arise. Some of the major shortcomings are actually caused, in general, by the need for intensional evaluation. As a conclusion, the analysis of this paper suggests, on precise and technical grounds, that learning cut is difficult, and current induction techniques should probably be restricted to purely declarative logic languages.