Goto

Collaborating Authors

 Genre


Learning, Social Intelligence and the Turing Test - why an "out-of-the-box" Turing Machine will not pass the Turing Test

arXiv.org Artificial Intelligence

The Turing Test (TT) checks for human intelligence, rather than any putative general intelligence. It involves repeated interaction requiring learning in the form of adaption to the human conversation partner. It is a macro-level post-hoc test in contrast to the definition of a Turing Machine (TM), which is a prior micro-level definition. This raises the question of whether learning is just another computational process, i.e. can be implemented as a TM. Here we argue that learning or adaption is fundamentally different from computation, though it does involve processes that can be seen as computations. To illustrate this difference we compare (a) designing a TM and (b) learning a TM, defining them for the purpose of the argument. We show that there is a well-defined sequence of problems which are not effectively designable but are learnable, in the form of the bounded halting problem. Some characteristics of human intelligence are reviewed including it's: interactive nature, learning abilities, imitative tendencies, linguistic ability and context-dependency. A story that explains some of these is the Social Intelligence Hypothesis. If this is broadly correct, this points to the necessity of a considerable period of acculturation (social learning in context) if an artificial intelligence is to pass the TT. Whilst it is always possible to 'compile' the results of learning into a TM, this would not be a designed TM and would not be able to continually adapt (pass future TTs). We conclude three things, namely that: a purely "designed" TM will never pass the TT; that there is no such thing as a general intelligence since it necessary involves learning; and that learning/adaption and computation should be clearly distinguished.


A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem

arXiv.org Machine Learning

We consider solving the $\ell_1$-regularized least-squares ($\ell_1$-LS) problem in the context of sparse recovery, for applications such as compressed sensing. The standard proximal gradient method, also known as iterative soft-thresholding when applied to this problem, has low computational cost per iteration but a rather slow convergence rate. Nevertheless, when the solution is sparse, it often exhibits fast linear convergence in the final stage. We exploit the local linear convergence using a homotopy continuation strategy, i.e., we solve the $\ell_1$-LS problem for a sequence of decreasing values of the regularization parameter, and use an approximate solution at the end of each stage to warm start the next stage. Although similar strategies have been studied in the literature, there have been no theoretical analysis of their global iteration complexity. This paper shows that under suitable assumptions for sparse recovery, the proposed homotopy strategy ensures that all iterates along the homotopy solution path are sparse. Therefore the objective function is effectively strongly convex along the solution path, and geometric convergence at each stage can be established. As a result, the overall iteration complexity of our method is $O(\log(1/\epsilon))$ for finding an $\epsilon$-optimal solution, which can be interpreted as global geometric rate of convergence. We also present empirical results to support our theoretical analysis.


Generalisation of language and knowledge models for corpus analysis

arXiv.org Artificial Intelligence

This paper takes new look on language and knowledge modelling for corpus linguistics. Using ideas of Chaitin, a line of argument is made against language/knowledge separation in Natural Language Processing. A simplistic model, that generalises approaches to language and knowledge, is proposed. One of hypothetical consequences of this model is Strong AI.


Combining Voting Rules Together

arXiv.org Artificial Intelligence

We propose a simple method for combining together voting rules that performs a run-off between the different winners of each voting rule. We prove that this combinator has several good properties. For instance, even if just one of the base voting rules has a desirable property like Condorcet consistency, the combination inherits this property. In addition, we prove that combining voting rules together in this way can make finding a manipulation more computationally difficult. Finally, we study the impact of this combinator on approximation methods that find close to optimal manipulations.


SAS+ Planning as Satisfiability

Journal of Artificial Intelligence Research

Planning as satisfiability is a principal approach to planning with many eminent advantages. The existing planning as satisfiability techniques usually use encodings compiled from STRIPS. We introduce a novel SAT encoding scheme (SASE) based on the SAS+ formalism. The new scheme exploits the structural information in SAS+, resulting in an encoding that is both more compact and efficient for planning. We prove the correctness of the new encoding by establishing an isomorphism between the solution plans of SASE and that of STRIPS based encodings. We further analyze the transition variables newly introduced in SASE to explain why it accommodates modern SAT solving algorithms and improves performance. We give empirical statistical results to support our analysis. We also develop a number of techniques to further reduce the encoding size of SASE, and conduct experimental studies to show the strength of each individual technique. Finally, we report extensive experimental results to demonstrate significant improvements of SASE over the state-of-the-art STRIPS based encoding schemes in terms of both time and memory efficiency.


Generalized Beta Mixtures of Gaussians

arXiv.org Machine Learning

In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better properties than traditional Cauchy and double exponential priors. We first propose a new class of normal scale mixtures through a novel generalized beta distribution that encompasses many interesting priors as special cases. This encompassing framework should prove useful in comparing competing priors, considering properties and revealing close connections. We then develop a class of variational Bayes approximations through the new hierarchy presented that will scale more efficiently to the types of truly massive data sets that are now encountered routinely.


A Generalized Least Squares Matrix Decomposition

arXiv.org Machine Learning

Variables in many massive high-dimensional data sets are structured, arising for example from measurements on a regular grid as in imaging and time series or from spatial-temporal measurements as in climate studies. Classical multivariate techniques ignore these structural relationships often resulting in poor performance. We propose a generalization of the singular value decomposition (SVD) and principal components analysis (PCA) that is appropriate for massive data sets with structured variables or known two-way dependencies. By finding the best low rank approximation of the data with respect to a transposable quadratic norm, our decomposition, entitled the Generalized least squares Matrix Decomposition (GMD), directly accounts for structural relationships. As many variables in high-dimensional settings are often irrelevant or noisy, we also regularize our matrix decomposition by adding two-way penalties to encourage sparsity or smoothness. We develop fast computational algorithms using our methods to perform generalized PCA (GPCA), sparse GPCA, and functional GPCA on massive data sets. Through simulations and a whole brain functional MRI example we demonstrate the utility of our methodology for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.


Differential Privacy for Functions and Functional Data

arXiv.org Machine Learning

Differential privacy is a framework for privately releasing summaries of a database. Previous work has focused mainly on methods for which the output is a finite dimensional vector, or an element of some discrete set. We develop methods for releasing functions while preserving differential privacy. Specifically, we show that adding an appropriate Gaussian process to the function of interest yields differential privacy. When the functions lie in the same RKHS as the Gaussian process, then the correct noise level is established by measuring the "sensitivity" of the function in the RKHS norm. As examples we consider kernel density estimation, kernel support vector machines, and functions in reproducing kernel Hilbert spaces.


A Probabilistic Transmission Expansion Planning Methodology based on Roulette Wheel Selection and Social Welfare

arXiv.org Artificial Intelligence

Abstract: A new probabilistic methodology for transmission expansion planning (TEP) th at does not require a priori specification of new/additional transmission capacities and uses the concept of social welfare has been proposed. Two new concepts have been introduced in this paper: (i) roulette wheel methodology has been used to calculate t he capacity of new transmission lines and (ii) load flow analysis has been used to calculate expected demand not served (EDNS). The overall methodology has been implemented on a modified IEEE 5 - bus test system. Simulations show an important result: addit ion of only new transmission lines is not sufficient to minimize EDNS. Nowadays, the need for appropriate planned power syste ms to reduce generation cost, minimize the consumer cost and improve the quality of the power supply has become imperative [1] - [3]. As a result, transmission expansion planning (TEP) is gaining more significance.


Imitation learning of motor primitives and language bootstrapping in robots

arXiv.org Artificial Intelligence

Imitation learning in robots, also called programing by demonstration, has made important advances in recent years, allowing humans to teach context dependant motor skills/tasks to robots. We propose to extend the usual contexts investigated to also include acoustic linguistic expressions that might denote a given motor skill, and thus we target joint learning of the motor skills and their potential acoustic linguistic name. In addition to this, a modification of a class of existing algorithms within the imitation learning framework is made so that they can handle the unlabeled demonstration of several tasks/motor primitives without having to inform the imitator of what task is being demonstrated or what the number of tasks are, which is a necessity for language learning, i.e; if one wants to teach naturally an open number of new motor skills together with their acoustic names. Finally, a mechanism for detecting whether or not linguistic input is relevant to the task is also proposed, and our architecture also allows the robot to find the right framing for a given identified motor primitive. With these additions it becomes possible to build an imitator that bridges the gap between imitation learning and language learning by being able to learn linguistic expressions using methods from the imitation learning community. In this sense the imitator can learn a word by guessing whether a certain speech pattern present in the context means that a specific task is to be executed. The imitator is however not assumed to know that speech is relevant and has to figure this out on its own by looking at the demonstrations: indeed, the architecture allows the robot to transparently also learn tasks which should not be triggered by an acoustic word, but for example by the color or position of an object or a gesture made by someone in the environment. To demonstrate this ability to find the ...