Goto

Collaborating Authors

 Industry


Complex networks and human language

arXiv.org Artificial Intelligence

This paper introduces how human languages can be studied in light of recent development of network theories. There are two directions of exploration. One is to study networks existing in the language system. Various lexical networks can be built based on different relationships between words, being semantic or syntactic. Recent studies have shown that these lexical networks exhibit small-world and scale-free features. The other direction of exploration is to study networks of language users (i.e. social networks of people in the linguistic community), and their role in language evolution. Social networks also show small-world and scale-free features, which cannot be captured by random or regular network models. In the past, computational models of language change and language emergence often assume a population to have a random or regular structure, and there has been little discussion how network structures may affect the dynamics. In the second part of the paper, a series of simulation models of diffusion of linguistic innovation are used to illustrate the importance of choosing realistic conditions of population structure for modeling language change. Four types of social networks are compared, which exhibit two categories of diffusion dynamics. While the questions about which type of networks are more appropriate for modeling still remains, we give some preliminary suggestions for choosing the type of social networks for modeling.


Universal Algorithmic Intelligence: A mathematical top->down approach

arXiv.org Artificial Intelligence

Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline how the AIXI model can formally solve a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXItl that is still effectively more intelligent than any other time t and length l bounded agent. The computation time of AIXItl is of the order t x 2^l. The discussion includes formal definitions of intelligence order relations, the horizon problem and relations of the AIXI theory to other AI approaches.


A Delta Debugger for ILP Query Execution

arXiv.org Artificial Intelligence

Because query execution is the most crucial part of Inductive Logic Programming (ILP) algorithms, a lot of effort is invested in developing faster execution mechanisms. These execution mechanisms typically have a low-level implementation, making them hard to debug. Moreover, other factors such as the complexity of the problems handled by ILP algorithms and size of the code base of ILP data mining systems make debugging at this level a very difficult job. In this work, we present the trace-based debugging approach currently used in the development of new execution mechanisms in hipP, the engine underlying the ACE Data Mining system. This debugger uses the delta debugging algorithm to automatically reduce the total time needed to expose bugs in ILP execution, thus making manual debugging step much lighter.


A Note on Local Ultrametricity in Text

arXiv.org Artificial Intelligence

Structures that are inherent to data of any type can be of import ance, and hierarchical structure is a prime example. In this work we take text corpora and assess the extent of hierarchical structure among words co nstituting the texts. By comprehensively taking context into account we seek to study hierarchical structures in the domain semantics. The data studied in Rammal et al. (1986) and Murtagh (2004) is point pattern data: observational features with their measurements on many coordinate dimensions. Data may be instead presented as time-varyin g signals and in a similar way, related to the findings of Rammal et al. (1986) and 1 Murtagh (2004), we have investigated ultrametric-related prope rties of time series or 1D signals in Murtagh (2005a).


On Approximating Optimal Weighted Lobbying, and Frequency of Correctness versus Average-Case Polynomial Time

arXiv.org Artificial Intelligence

We investigate issues related to two hard problems related to voting, the optimal weighted lobbying problem and the winner problem for Dodgson elections. Regarding the former, Christian et al. [CFRS06] showed that optimal lobbying is intractable in the sense of parameterized complexity. We provide an efficient greedy algorithm that achieves a logarithmic approximation ratio for this problem and even for a more general variant--optimal weighted lobbying. We prove that essentially no better approximation ratio than ours can be proven for this greedy algorithm. The problem of determining Dodgson winners is known to be complete for parallel access to NP [HHR97]. Homan and Hemaspaandra [HH06] proposed an efficient greedy heuristic for finding Dodgson winners with a guaranteed frequency of success, and their heuristic is a ``frequently self-knowingly correct algorithm.'' We prove that every distributional problem solvable in polynomial time on the average with respect to the uniform distribution has a frequently self-knowingly correct polynomial-time algorithm. Furthermore, we study some features of probability weight of correctness with respect to Procaccia and Rosenschein's junta distributions [PR07].


Computing Good Nash Equilibria in Graphical Games

arXiv.org Artificial Intelligence

This paper addresses the problem of fair equilibrium selection in graphical games. Our approach is based on the data structure called the {\em best response policy}, which was proposed by Kearns et al. \cite{kls} as a way to represent all Nash equilibria of a graphical game. In \cite{egg}, it was shown that the best response policy has polynomial size as long as the underlying graph is a path. In this paper, we show that if the underlying graph is a bounded-degree tree and the best response policy has polynomial size then there is an efficient algorithm which constructs a Nash equilibrium that guarantees certain payoffs to all participants. Another attractive solution concept is a Nash equilibrium that maximizes the social welfare. We show that, while exactly computing the latter is infeasible (we prove that solving this problem may involve algebraic numbers of an arbitrarily high degree), there exists an FPTAS for finding such an equilibrium as long as the best response policy has polynomial size. These two algorithms can be combined to produce Nash equilibria that satisfy various fairness criteria.


Spines of Random Constraint Satisfaction Problems: Definition and Connection with Computational Complexity

arXiv.org Artificial Intelligence

We study the connection between the order of phase transitions in combinatorial problems and the complexity of decision algorithms for such problems. We rigorously show that, for a class of random constraint satisfaction problems, a limited connection between the two phenomena indeed exists. Specifically, we extend the definition of the spine order parameter of Bollobas et al. to random constraint satisfaction problems, rigorously showing that for such problems a discontinuity of the spine is associated with a $2^{Ω(n)}$ resolution complexity (and thus a $2^{Ω(n)}$ complexity of DPLL algorithms) on random instances. The two phenomena have a common underlying cause: the emergence of ``large'' (linear size) minimally unsatisfiable subformulas of a random formula at the satisfiability phase transition. We present several further results that add weight to the intuition that random constraint satisfaction problems with a sharp threshold and a continuous spine are ``qualitatively similar to random 2-SAT''. Finally, we argue that it is the spine rather than the backbone parameter whose continuity has implications for the decision complexity of combinatorial problems, and we provide experimental evidence that the two parameters can behave in a different manner.


Causes and Explanations: A Structural-Model Approach, Part I: Causes

arXiv.org Artificial Intelligence

We propose a new definition of actual cause, using structural equations to model counterfactuals. We show that the definition yields a plausible and elegant account of causation that handles well examples which have caused problems for other definitions and resolves major difficulties in the traditional account.


Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

arXiv.org Machine Learning

Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical, as well as biological systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is computationally NP-hard, Moreover two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered and an efficient algorithm is proposed for estimation of high dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although lasso can only consistently estimate the true network under stringent assumptions, adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods in simulated, as well as real, data examples.


Sparse Convolved Multiple Output Gaussian Processes

arXiv.org Machine Learning

Recently there has been an increasing interest in methods that deal with multiple outputs. This has been motivated partly by frameworks like multitask learning, multisensor networks or structured output data. From a Gaussian processes perspective, the problem reduces to specifying an appropriate covariance function that, whilst being positive semi-definite, captures the dependencies between all the data points and across all the outputs. One approach to account for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform we establish dependencies between output variables. The main drawbacks of this approach are the associated computational and storage demands. In this paper we address these issues. We present different sparse approximations for dependent output Gaussian processes constructed through the convolution formalism. We exploit the conditional independencies present naturally in the model. This leads to a form of the covariance similar in spirit to the so called PITC and FITC approximations for a single output. We show experimental results with synthetic and real data, in particular, we show results in pollution prediction, school exams score prediction and gene expression data.