 connectionist system


From Words to Worlds: Compositionality for Cognitive Architectures

arXiv.org Artificial Intelligence

Large language models (LLMs) are very performant connectionist systems, but do they exhibit more compositionality? More importantly, is that part of why they perform so well? We present empirical analyses across four LLM families (12 models) and three task categories, including a novel task introduced below. Our findings reveal a nuanced relationship in learning of compositional strategies by LLMs -- while scaling enhances compositional abilities, instruction tuning often has a reverse effect. Such disparity brings forth some open issues regarding the development and improvement of large language models in alignment with human cognitive capacities.


Analysis of Distributed Representation of Constituent Structure in Connectionist Systems

Neural Information Processing Systems

A general method, the tensor product representation, is described for the distributed representation of value/variable bindings. The method allows the fully distributed representation of symbolic structures: the roles in the structures, as well as the fillers for those roles, can be arbitrarily non-local. Fully and partially localized special cases reduce to existing cases of connectionist representations of structured data; the tensor product representation generalizes these and the few existing examples of fully distributed representations of structures. The representation saturates gracefully as larger structures are represented; it permits recursive construction of complex representations from simpler ones; it respects the independence of the capacities to generate and maintain multiple bindings in parallel; it extends naturally to continuous structures and continuous representational patterns; it permits values to also serve as variables; it enables analysis of the interference of symbolic structures stored in associative memories; and it leads to characterization of optimal distributed representations of roles and a recirculation algorithm for learning them.


DeepMind Combines Logic and Neural Networks to Extract Rules from Noisy Data

#artificialintelligence

In his book "The Master Algorithm", artificial intelligence researcher Pedro Domingos explores the idea of a single algorithm that can combine the major schools of machine learning. The idea is, without a doubt, extremely ambitious, but we are already seeing some iterations of it. Last year, Google published a research paper under the catchy title "One Model to Learn Them All" that combines heterogeneous learning techniques under a single machine learning model.


The Next AI Milestone: Bridging the Semantic Gap – Intuition Machine – Medium

#artificialintelligence

John Launchbury of DARPA has an excellent video that I recommend everyone watch (viewing just the slides will give the wrong impression of the content). Statistical Learning -- Where programmers create statistical models for specific problem domains and train them on big data. Contextual Adaptation -- Where systems construct contextual explanatory models for classes of real world phenomena. The presentation is somewhat simplified because it lumps all of machine learning, Bayesian methods and Deep Learning into a single category. There are many more approaches to AI that don't fit within DARPA's 3 waves.


Analysis of Distributed Representation of Constituent Structure in Connectionist Systems

Neural Information Processing Systems

A general method, the tensor product representation, is described for the distributed representation of value/variable bindings. The method allows the fully distributed representation of symbolic structures: the roles in the structures, as well as the fillers for those roles, can be arbitrarily non-local. Fully and partially localized special cases reduce to existing cases of connectionist representations of structured data; the tensor product representation generalizes these and the few existing examples of fully distributed representations of structures. The representation saturates gracefully as larger structures are represented; it permits recursive construction of complex representations from simpler ones; it respects the independence of the capacities to generate and maintain multiple bindings in parallel; it extends naturally to continuous structures and continuous representational patterns; it permits values to also serve as variables; it enables analysis of the interference of symbolic structures stored in associative memories; and it leads to characterization of optimal distributed representations of roles and a recirculation algorithm for learning them.

Introduction

Any model of complex information processing in networks of simple processors must solve the problem of representing complex structures over network elements. Connectionist models of realistic natural language processing, for example, must employ computationally adequate representations of complex sentences. Many connectionists feel that to develop connectionist systems with the computational power required by complex tasks, distributed representations must be used: an individual processing unit must participate in the representation of multiple items, and each item must be represented as a pattern of activity of multiple processors. Connectionist models have used more or less distributed representations of more or less complex structures, but little if any general analysis of the problem of distributed representation of complex information has been carried out. This paper reports results of an analysis of a general method called the tensor product representation.
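
As a rough illustration of the binding idea described in this abstract, the sketch below binds each filler to a role with an outer (tensor) product, superimposes the bindings by addition, and recovers a filler by taking the inner product with its role vector. The vectors and the helper names (`outer`, `add`, `unbind`) are hypothetical and chosen for illustration only; exact recovery here relies on the assumption that the role vectors are orthonormal, which the general method does not require.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def outer(filler: Vector, role: Vector) -> Matrix:
    """Bind one filler to one role as their outer (tensor) product."""
    return [[f * r for r in role] for f in filler]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Superimpose two bindings by element-wise addition."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def unbind(structure: Matrix, role: Vector) -> Vector:
    """Recover the filler bound to `role` (exact only for orthonormal roles)."""
    return [sum(x * r for x, r in zip(row, role)) for row in structure]


# Hypothetical distributed filler patterns (illustrative values only).
filler_a = [1.0, 0.0, 1.0]
filler_b = [0.0, 2.0, -1.0]

# Hypothetical orthonormal role vectors. Both are non-local:
# every role unit is active in both patterns.
s = 2 ** -0.5
role_first = [s, s]
role_second = [s, -s]

# The whole structure is the sum of its filler/role bindings.
structure = add(outer(filler_a, role_first), outer(filler_b, role_second))

recovered_a = unbind(structure, role_first)
recovered_b = unbind(structure, role_second)
```

With non-orthogonal role vectors the same `unbind` step would return a mixture of fillers, which is one simple way to see the interference between stored structures that the abstract mentions.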



Probabilistic Characterization of Neural Model Computations

Neural Information Processing Systems

This viewpoint allows the class of probability distributions, P, the neural network can acquire to be explicitly specified. Learning algorithms for the neural network which search for the "most probable" member of P can then be designed. Statistical tests which decide if the "true" or environmental probability distribution is in P can also be developed. Example applications of the theory to the highly nonlinear back-propagation learning algorithm, and the networks of Hopfield and Anderson are discussed.

Introduction

A connectionist system is a network of simple neuron-like computing elements which can store and retrieve information, and most importantly make generalizations. Using terminology suggested by Rumelhart & McClelland [1], the computing elements of a connectionist system are called units, and each unit is associated with a real number indicating its activity level. The activity level of a given unit in the system can also influence the activity level of another unit. The degree of influence between two such units is often characterized by a parameter of the system known as a connection strength. During the information retrieval process some subset of the units in the system are activated, and these units in turn activate neighboring units via the inter-unit connection strengths.
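
The retrieval process described above can be sketched in a few lines. The excerpt does not specify an update rule, so the logistic squashing function, the synchronous update, the `step` helper, and the three-unit network below are all assumptions made for illustration, not the paper's model.

```python
import math
from typing import List, Set


def logistic(x: float) -> float:
    """Squash a net input into an activity level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def step(activity: List[float], weights: List[List[float]], clamped: Set[int]) -> List[float]:
    """One synchronous update of all units.

    weights[i][j] is the connection strength from unit j to unit i.
    Clamped units keep their externally set activity level.
    """
    updated = []
    for i, row in enumerate(weights):
        if i in clamped:
            updated.append(activity[i])
        else:
            net_input = sum(w * a for w, a in zip(row, activity))
            updated.append(logistic(net_input))
    return updated


# Hypothetical three-unit network: unit 0 excites unit 1 and inhibits unit 2.
weights = [
    [0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0],
    [-4.0, 0.0, 0.0],
]

# Retrieval cue: unit 0 is activated and held on; the others start at rest.
activity = [1.0, 0.0, 0.0]
for _ in range(3):
    activity = step(activity, weights, clamped={0})
```

After the updates, unit 1 settles at a high activity level and unit 2 at a low one, reflecting the sign of each unit's connection strength from the cued unit.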

