A Simple Oscillatory Short-Term Memory

AAAI Conferences

Oscillatory neural networks have been an increasing focus of study over the last several years. Here we consider simple oscillatory memories for short-term retention of items occurring as temporal sequences. By incorporating decay as well as interference, we find that it is easy to match behavioral data from human subjects recalling temporal sequences under different situations by adjusting a single parameter in the model. These results suggest that simple oscillatory memories capture at least some key properties of human short-term memory, and might be used effectively in future biologically-inspired cognitive architectures.


Measuring Rates of Human Memory Retrieval

AAAI Conferences

Memory retrieval is a spontaneous process difficult to measure in naturalistic settings. By adapting an automated paging process, we measured spontaneous autobiographical and prospective memory retrieval probability, and found the derived frequency of recall in a given time period to be significantly higher than expected. Altogether, this research provides a quantitative characterization of human memory.


Taking a Mental Stance Towards Artificial Systems

AAAI Conferences

This paper argues that supervised cognitive growth in artifacts will be very difficult to achieve without detailed knowledge about systems' internal states. Physical information is too low level to provide a useful understanding of a system's behavior, and it is more pragmatically useful to take a mental stance towards an artificial system and interpret its actions in terms of mental states. This mental stance is similar to Dennett's intentional stance, except the ascription of beliefs and rationality in the intentional stance is replaced by the attribution of low level mental states in the mental stance. In some cases it might also be useful to take a conscious stance towards an artificial system that interprets its behavior as the outcome of a conscious decision making process. Since most artifacts lack language, automatic analysis techniques have to be used to identify the contents of their minds, and the second half of this paper suggests how some of the earlier work of Aleksander and Atlas can be applied in this area.


Back to the Basics – Redefining Information, Knowledge, Intelligence, and Artificial Intelligence Using Only the Adaptive Systems Theory

AAAI Conferences

Decades ago, Alan Turing proposed a test to show whether a machine has intelligence, a test that has yet to be replaced by a more comprehensive theory. The same test, however, says nothing about what intelligence is. This paper proposes a definition based on a system's ability to deal with uncertainty, which is the main attribute of our intelligence. It introduces a new adaptive system theory and the Viable Complex System (VCS) concept, which is applied to organisms, social organizations, and the design and architecture of IT systems. All VCSs share a dual structure built on two function types: operations (i.e., resource processing) and change (adaptability). A system adapts by learning from its interactions with the environment how to improve its chances of survival. All systems sharing common operations are part of a realm, and some systems can live in two realms at the same time. In conclusion, we define information as the interaction between two similar VCSs, and intelligence as a property of adaptive systems that exist in the context of two realms (e.g., humans being both biological organisms and members of society). We extend the model to quantify intelligence through a new term called information density. This concept associates the complexity of the logic embedded in a message, especially logic related to changes, with the system's ability to process that logic in its quest to survive. The more intelligent the system, the better it is at extracting information towards higher efficiency and higher viability. We close the paper with two case studies from our practice that show how this model can be applied in IT when designing enterprise systems.


DeSTIN: A Scalable Deep Learning Architecture with Application to High-Dimensional Robust Pattern Recognition

AAAI Conferences

The topic of deep learning systems has received significant attention during the past few years, particularly as a biologically-inspired approach to processing high-dimensional signals. The latter often involve spatiotemporal information that may span large scales, rendering its representation in the general case highly challenging. Deep learning networks attempt to overcome this challenge by means of a hierarchical architecture that is composed of common circuits with similar (and often cortically influenced) functionality. The goal of such systems is to represent sensory observations in a manner that will later facilitate robust pattern classification, mimicking a key attribute of the mammalian brain. This stands in contrast with the mainstream approach of pre-processing the data so as to reduce its dimensionality, a paradigm that often results in sub-optimal performance. This paper presents a Deep SpatioTemporal Inference Network (DeSTIN), a scalable deep learning architecture that relies on a combination of unsupervised learning and Bayesian inference. Dynamic pattern learning forms an inherent way of capturing complex spatiotemporal dependencies. Simulation results demonstrate the core capabilities of the proposed framework, particularly in the context of high-dimensional signal classification.


Iconic Training and Effective Information: Evaluating Meaning in Discrete Neural Networks

AAAI Conferences

In discussions about the physical support of conscious experience, a recent trend has been introduced (by Tononi and various colleagues) that measures the capacity of a network to discriminate among different states and integrate the information generated by this discrimination. This capacity to generate and integrate information can be used to understand the information processing in a network and Tononi has claimed that it is also linked to conscious experience. This paper describes experiments in which networks of weightless neurons were used to explore how different connection patterns and architectures affected the effective information generated by a network. The training of these networks using easily recognizable images made it easy to monitor their internal states, and this supports the interpretation of the system using the mental stance, which is described in a companion paper. By applying the same training to different architectures we were also able to study how the informational relationships depended on a combination of training and other dynamic effects.


Causal Inference on Discrete Data using Additive Noise Models

arXiv.org Machine Learning

Inferring causal relations by analyzing statistical dependences among observed random variables is a challenging task if no controlled randomized experiments are available. So-called constraint-based approaches to causal discovery (Pearl, 2000; Spirtes et al., 1993) select among all directed acyclic graphs (DAGs) those that satisfy the Markov condition and the faithfulness assumption, i.e., those for which the observed independences are imposed by the structure rather than being a result of specific choices of parameters of the Bayesian network. These approaches are unable to distinguish among causal DAGs that impose the same independences. In particular, it is impossible to distinguish between X → Y and Y → X. More recently, several methods have been suggested that use not only conditional independences, but also more sophisticated properties of the joint distribution. For simplicity, we explain the ideas for the two-variable setting, since this case is particularly challenging. Kano & Shimizu (2003) use models Y = f(X) + N (1), where f is a linear function and N is additive noise that is independent of the hypothetical cause X. This is an example of an additive noise model from X to Y. Apart from trivial
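
The linear additive noise idea described above can be illustrated with a small sketch. It covers only the continuous, linear two-variable case mentioned in the abstract, not the discrete method the paper itself develops. Everything here is an assumption made for illustration: the synthetic data (uniform cause and uniform noise), the function names, and above all the dependence score, which is a crude stand-in (absolute correlation between squared regressor and squared residual) for a proper statistical independence test.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def dependence_score(cause, effect):
    """Crude dependence measure between the regressor and the residual.

    Assumption: a real analysis would use a proper independence test.
    Least-squares residuals are always uncorrelated with the regressor,
    so we compare squared values to pick up higher-order dependence.
    """
    slope, intercept = fit_line(cause, effect)
    residuals = [e - (slope * c + intercept) for c, e in zip(cause, effect)]
    centre = mean(cause)
    sq_cause = [(c - centre) ** 2 for c in cause]
    sq_resid = [r ** 2 for r in residuals]
    return abs(pearson(sq_cause, sq_resid))


def infer_direction(xs, ys):
    """Prefer the direction whose residual looks more independent."""
    forward = dependence_score(xs, ys)   # model Y = f(X) + N
    backward = dependence_score(ys, xs)  # model X = g(Y) + N'
    label = "X -> Y" if forward < backward else "Y -> X"
    return label, forward, backward


# Synthetic data generated as Y = X + N with non-Gaussian X and N.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
ys = [x + rng.uniform(-1.0, 1.0) for x in xs]
direction, forward, backward = infer_direction(xs, ys)
print(direction, round(forward, 3), round(backward, 3))
```

The sketch relies on the noise being non-Gaussian: with Gaussian cause and noise, a linear model fits equally well in both directions and the two scores would not separate.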


ParamILS: An Automatic Algorithm Configuration Framework

Journal of Artificial Intelligence Research

The identification of performance-optimizing parameter settings is an important part of the development and application of algorithms. We describe an automatic framework for this algorithm configuration problem. More formally, we provide methods for optimizing a target algorithm's performance on a given class of problem instances by varying a set of ordinal and/or categorical parameters. We review a family of local-search-based algorithm configuration procedures and present novel techniques for accelerating them by adaptively limiting the time spent for evaluating individual configurations. We describe the results of a comprehensive experimental evaluation of our methods, based on the configuration of prominent complete and incomplete algorithms for SAT. We also present what is, to our knowledge, the first published work on automatically configuring the CPLEX mixed integer programming solver. All the algorithms we considered had default parameter settings that were manually identified with considerable effort. Nevertheless, using our automated algorithm configuration procedures, we achieved substantial and consistent performance improvements.
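
As a rough illustration of local-search-based configuration over categorical parameters, the following is a generic iterated local search sketch. It is not the ParamILS implementation and it omits the adaptive time-limiting techniques the abstract mentions. The parameter space, the `toy_cost` function (a stand-in for actually running a target algorithm on benchmark instances), and all names are hypothetical.

```python
import random

# Hypothetical parameter space: each parameter takes one of a few values.
PARAM_SPACE = {
    "restart_strategy": ["none", "luby", "geometric"],
    "noise": [0.0, 0.1, 0.2, 0.4],
    "heuristic": ["a", "b", "c"],
}


def toy_cost(config):
    """Stand-in for measured runtime of the target algorithm (lower is better)."""
    cost = {"none": 3.0, "luby": 1.0, "geometric": 2.0}[config["restart_strategy"]]
    cost += abs(config["noise"] - 0.2) * 10
    cost += {"a": 2.0, "b": 0.5, "c": 1.5}[config["heuristic"]]
    return cost


def neighbours(config):
    """All configurations that differ from `config` in exactly one parameter."""
    for name, values in PARAM_SPACE.items():
        for value in values:
            if value != config[name]:
                candidate = dict(config)
                candidate[name] = value
                yield candidate


def local_search(config, cost):
    """First-improvement descent until no single change helps."""
    current, current_cost = config, cost(config)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current):
            candidate_cost = cost(candidate)
            if candidate_cost < current_cost:
                current, current_cost = candidate, candidate_cost
                improved = True
                break
    return current, current_cost


def perturb(config, rng, strength=2):
    """Randomly reassign `strength` parameters to escape a local optimum."""
    new_config = dict(config)
    for name in rng.sample(sorted(PARAM_SPACE), strength):
        new_config[name] = rng.choice(PARAM_SPACE[name])
    return new_config


def iterated_local_search(cost, iterations=20, seed=0):
    rng = random.Random(seed)
    start = {name: values[0] for name, values in PARAM_SPACE.items()}
    best, best_cost = local_search(start, cost)
    for _ in range(iterations):
        candidate, candidate_cost = local_search(perturb(best, rng), cost)
        if candidate_cost <= best_cost:  # accept equal-or-better configurations
            best, best_cost = candidate, candidate_cost
    return best, best_cost


best_config, best_cost = iterated_local_search(toy_cost)
print(best_config, best_cost)
```

In a real setting each cost evaluation means running the target algorithm, often on many instances, which is why limiting the time spent on unpromising configurations matters so much.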


Distinguishing Cause and Effect via Second Order Exponential Models

arXiv.org Machine Learning

We propose a method to infer causal structures containing both discrete and continuous variables. The idea is to select causal hypotheses for which the conditional density of every variable, given its causes, becomes smooth. We define a family of smooth densities and conditional densities by second order exponential models, i.e., by maximizing conditional entropy subject to first and second statistical moments. If some of the variables take only values in proper subsets of R^n, these conditionals can induce different families of joint distributions even for Markov-equivalent graphs. We consider the case of one binary and one real-valued variable where the method can distinguish between cause and effect. Using this example, we show that sometimes a causal hypothesis must be rejected because P(effect|cause) and P(cause) share algorithmic information (which is atypical if they are chosen independently). In this way, our method is in the same spirit as faithfulness-based causal inference because it also rejects non-generic mutual adjustments among DAG parameters.


Which graphical models are difficult to learn?

arXiv.org Machine Learning

We consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms systematically fail when the Markov random field develops long-range correlations. More precisely, this phenomenon appears to be related to the Ising model phase transition (although it does not coincide with it).