AITopics

Although the Internet AS-level topology has been extensively studied over the past few years, little is known about the details of the AS taxonomy. An AS "node" can represent a wide variety of organizations, e.g., large ISP, or small private business, university, with vastly different network characteristics, external connectivity patterns, network growth tendencies, and other properties that we can hardly neglect while working on veracious Internet representations in simulation environments. In this paper, we introduce a radically new approach based on machine learning techniques to map all the ASes in the Internet into a natural AS taxonomy. We successfully classify 95.3% of ASes with expected accuracy of 78.1%. We release to the community the AS-level topology dataset augmented with: 1) the AS taxonomy information and 2) the set of AS attributes we used to classify ASes. We believe that this dataset will serve as an invaluable addition to further understanding of the structure and evolution of the Internet.

artificial intelligence, ase, machine learning, (16 more...)

cs/0604015

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.40)

Semi-Supervised Learning -- A Statistical Physics Approach

Getz, Gad, Shental, Noam, Domany, Eytan

We present a novel approach to semi-supervised learning which is based on statistical physics. Most of the former work in the field of semi-supervised learning classifies the points by minimizing a certain energy function, which corresponds to a minimal k-way cut solution. In contrast to these methods, we estimate the distribution of classifications, instead of the sole minimal k-way cut, which yields more accurate and robust results. Our approach may be applied to all energy functions used for semi-supervised learning. The method is based on sampling using a Mul-ticanonical Markov chain Monte-Carlo algorithm, and has a straightforward probabilistic interpretation, which allows for soft assignments of points to classes, and also to cope with yet unseen class types. The suggested approach is demonstrated on a toy data set and on two real-life data sets of gene expression.

classification, labelled point, machine learning, (16 more...)

cs/0604011

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.69)
Health & Medicine > Therapeutic Area > Hematology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Can an Organism Adapt Itself to Unforeseen Circumstances?

Melkikh, Alexey V.

A model of an organism as an au tonomous intelligent system has been proposed. This model was used to analyz e learning of an organism in various environmental conditions. Processes of learning were divided into two types: strong and weak processes taking place in the absence an d the presence of aprioristic information about an object respectively. Weak lear ning is synonymous to adaptation when aprioristic programs already available in a system (an organism) are started. It was shown that strong learning is impossible fo r both an organism and any autonomous intelligent system. It was shown also that the knowledge base of an organism cannot be updated. Therefore, all behavior programs of an organism are congenital. A model of a conditioned reflex as a series of consecutive measurements of environmental parameters has been advanced. Repeated measurements are necessary in this case to reduce the error during decision making.

artificial intelligence, expert system, machine learning, (17 more...)

cs/0604009

Country:

North America > United States (0.94)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.36)

If a tree casts a shadow is it telling the time?

Abbott, Russ

Physical processes are computations only when we use them to externalize thought. Computation is the performance of one or more fixed processes within a contingent environment. We reformula te the Church-Turing thesis so that it applies to programs rather than to c omputability. When suitably formulated agent-based computing in an open, multi-scalar environment represents the current consensus view of how we interact with the world. But we don't know how to formulate multi-scalar environments. Keywords: agents, agent-based, agent-based computation, Church-Turing thesis, Church's thesis, computing, computation, envir onment ideas, interaction, interactive computation, models, multi-scalar envir onment, thought, thought tools, unconventional computation.

artificial intelligence, computation, computer, (17 more...)

cs/0603125

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.75)

Approximation Algorithms for K-Modes Clustering

He, Zengyou

In this paper, we study clustering with respect to the k-modes objective function, a natural formulation of clustering for categorical data. One of the main contributions of this paper is to establish the connection between k-modes and k-median, i.e., the optimum of k-median is at most twice the optimum of k-modes for the same categorical data clustering problem. Based on this observation, we derive a deterministic algorithm that achieves an approximation factor of 2. Furthermore, we prove that the distance measure in k-modes defines a metric. Hence, we are able to extend existing approximation algorithms for metric k-median to k-modes. Empirical results verify the superiority of our method.

approximation algorithm, artificial intelligence, machine learning, (2 more...)

cs/0603120

Genre: Research Report (0.40)

Technology:

Information Technology > Mathematics of Computing (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.53)

Ryabko, Daniil, Hutter, Marcus

Asymptotic Learnability of Reinforcement Problems with Arbitrary Dependence

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions. The task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. We find some sufficient conditions on the class of environments under which an agent exists which attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.

machine learning, reinforcement learning, value-stable environment, (17 more...)

cs/0603110

Country: Europe (0.68)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Gorban, A. N., Sumner, N. R., Zinovyev, A. Y.

Topological Grammars for Data Approximation

A method of {\it topological grammars} is proposed for multidimensional data approximation. For data with complex topology we define a {\it principal cubic complex} of low dimension and given complexity that gives the best approximation for the dataset. This complex is a generalization of linear and non-linear principal manifolds and includes them as particular cases. The problem of optimal principal complex construction is transformed into a series of minimization problems for quadratic functionals. These quadratic functionals have a physically transparent interpretation in terms of elastic energy. For the energy computation, the whole complex is represented as a system of nodes and springs. Topologically, the principal complex is a product of one-dimensional continuums (represented by graphs), and the grammars describe how these continuums transform during the process of optimal complex construction. This factorization of the whole process onto one-dimensional transformations using minimization of quadratic energy functionals allow us to construct efficient algorithms.

artificial intelligence, graph, machine learning, (18 more...)

doi: 10.1016/j.aml.2006.04.022

cs/0603090

Country:

Europe > United Kingdom (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Sakhanenko, Nikita A., Luger, George F., Makaruk, Hanna E., Holtkamp, David B.

Application of Support Vector Regression to Interpolation of Sparse Shock Physics Data Sets

Experimental physics, along with many other fields in applied and basic research, uses experiments, physical tests, and observations to gain insight into various phenomena and to validate hypotheses and models. Shock p hysics is a field that explores the response of materials to the extremes of p ressure, deformation, and temperature which are present when shock waves interact with those materials [17]. High explosive (HE) or propellant guns are often used to generate these strong shock waves. Many different diagnostic ap proaches have been used to probe these phenomena [8]. Because of the energetic nature of the shock wave drive, often a large amount of experimental equipment is destroyed during the test.

artificial intelligence, experiment, machine learning, (14 more...)

cs/0603081

Country: North America > United States > New Mexico (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Yet Another Efficient Unification Algorithm

Suciu, Alin

The unification algorithm is at the heart of the logic program ming paradigm [3]. Starting with the classic algorithm of Robinson [5], the unification algor ithm was developed to become more and more efficient [4]. Even nowadays the unification theory is sti ll under development and is receiving continuous scrutiny from the scientific community [2]. The present paper presents yet another efficient unification a lgorithm centered on a data structure called Unification Table, which borrows some ideas from the data structures used by the Warren's Abstract Machine [1]. The next paragraph presents in detail the proposed unificati on algorithm, giving the C-style pseudo code. An example of application of the algorithm take n from [1] is also presented.

artificial intelligence, logic & formal reasoning, programming language, (15 more...)

cs/0603080

Country:

Europe > Romania (0.15)
North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Software > Programming Languages (0.71)

Hoyer, Patrik O., Shimizu, Shohei, Kerminen, Antti J.

Estimation of linear, non-gaussian causal models in the presence of confounding latent variables

The estimation of linear causal models (also known as structural equation models) from data is a well-known problem which has received much attention in the past. Most previous work has, however, made an explicit or implicit assumption of gaussianity, limiting the identifiability of the models. We have recently shown (Shimizu et al, 2005; Hoyer et al, 2006) that for non-gaussian distributions the full causal model can be estimated in the no hidden variables case. In this contribution, we discuss the estimation of the model when confounding latent variables are present. Although in this case uniqueness is no longer guaranteed, there is at most a finite set of models which can fit the data. We develop an algorithm for estimating this set, and describe numerical simulations which confirm the theoretical arguments and demonstrate the practical viability of the approach. Full Matlab code is provided for all simulations.

artificial intelligence, latent variable, latent variable lingam model, (16 more...)

cs/0603038

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.83)