Computational Learning Theory
Potential Boosters?
Duffy, Nigel, Helmbold, David P.
Simply changing the potential function allows one to create new algorithms related to AdaBoost. However, these new algorithms are generally not known to have the formal boosting property. This paper examines the question of which potential functions lead to new algorithms that are boosters. The two main results are general sets of conditions on the potential; one set implies that the resulting algorithm is a booster, while the other implies that the algorithm is not. These conditions are applied to previously studied potential functions, such as those used by LogitBoost and Doom II.
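For concreteness, a minimal hypothetical sketch of the potential-function view of boosting (illustrative only, not the paper's formal construction): each round reweights examples by the negative slope of a potential phi evaluated at their current margins, and swapping phi yields a different AdaBoost-like algorithm.

```python
import numpy as np

# Illustrative sketch: each boosting round reweights example i by the negative
# slope of a potential phi at its current margin y_i * F(x_i); the next weak
# learner is trained on these weights. The potentials and margins below are
# hypothetical examples, not the paper's constructions.

def example_weights(margins, dphi):
    """Weights for the next round: w_i proportional to -phi'(margin_i)."""
    w = -dphi(margins)
    return w / w.sum()

# AdaBoost corresponds to the exponential potential phi(m) = exp(-m) ...
ada_dphi = lambda m: -np.exp(-m)
# ... while a logistic potential phi(m) = log(1 + exp(-m)) keeps the weight of
# badly misclassified examples bounded (LogitBoost-style behavior).
logistic_dphi = lambda m: -1.0 / (1.0 + np.exp(m))

margins = np.array([2.0, 0.1, -1.5])              # hypothetical margins
print(example_weights(margins, ada_dphi))         # sharp emphasis on the mistake
print(example_weights(margins, logistic_dphi))    # milder, bounded emphasis
```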
A Theory of Universal Artificial Intelligence based on Algorithmic Complexity
Decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameterless theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible. We outline for a number of problem classes, including sequence prediction, strategic games, function minimization, reinforcement and supervised learning, how the AIXI model can formally solve them. The major drawback of the AIXI model is that it is uncomputable. To overcome this problem, we construct a modified algorithm AIXI-tl, which is still effectively more intelligent than any other time t and space l bounded agent. The computation time of AIXI-tl is of the order t·2^l. Other discussed topics are formal definitions of intelligence order relations, the horizon problem, and relations of the AIXI theory to other AI approaches.
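A rough sketch of where the t·2^l computation time comes from (assumptions only: the interpreter run_for_t_steps is a hypothetical stand-in, and the proof-based validation of value claims in the actual AIXI-tl construction is omitted):

```python
from itertools import product

# Toy illustration: enumerate every candidate program of length l bits
# (2^l of them) and run each for at most t steps, keeping the best claim,
# which gives total work on the order of t * 2^l.

def aixitl_step(run_for_t_steps, t, l):
    best_claim, best_action = float("-inf"), None
    for bits in product((0, 1), repeat=l):                # 2^l candidate programs
        claimed_value, action = run_for_t_steps(bits, t)  # at most t steps each
        if claimed_value > best_claim:
            best_claim, best_action = claimed_value, action
    return best_action

# Dummy interpreter: treats the bit string itself as a value claim and an action.
toy = lambda bits, t: (sum(bits), bits[0])
print(aixitl_step(toy, t=100, l=4))
```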
Almost Linear VC Dimension Bounds for Piecewise Polynomial Networks
Bartlett, Peter L., Maiorov, Vitaly, Meir, Ron
We compute upper and lower bounds on the VC dimension of feedforward networks of units with piecewise polynomial activation functions. We show that if the number of layers is fixed, then the VC dimension grows as W log W, where W is the number of parameters in the network. The VC dimension is an important measure of the complexity of a class of binary-valued functions, since it characterizes the amount of data required for learning in the PAC setting (see [BEHW89, Vap82]). In this paper, we establish upper and lower bounds on the VC dimension of a specific class of multi-layered feedforward neural networks. Let F be the class of binary-valued functions computed by a feedforward neural network with W weights and k computational (non-input) units, each with a piecewise polynomial activation function. The best previously known upper bound for such networks is O(W^2), and it is matched by a lower bound for networks of unbounded depth, which would lead one to conclude that the bounds are in fact tight up to a constant.
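To make precise the sense in which the VC dimension characterizes the amount of data required in the PAC setting, the standard sample-complexity bounds (not part of the abstract above; constants omitted) read:

```latex
% Standard PAC sample-complexity bounds for a class of VC dimension d,
% added here only for reference; constants are omitted.
m(\epsilon,\delta) \;=\; O\!\left(\frac{d\,\log(1/\epsilon) + \log(1/\delta)}{\epsilon}\right)
\quad\text{(sufficient, realizable case)},
\qquad
m(\epsilon,\delta) \;=\; \Omega\!\left(\frac{d + \log(1/\delta)}{\epsilon}\right)
\quad\text{(necessary)}.
```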
Tight Bounds for the VC-Dimension of Piecewise Polynomial Networks
Sakurai, Akito
O(ws(s log d + log(dqh/s))) and O(ws((h/s) log q + log(dqh/s))) are upper bounds for the VC-dimension of a set of neural networks of units with piecewise polynomial activation functions, where s is the depth of the network, h is the number of hidden units, w is the number of adjustable parameters, q is the maximum number of polynomial segments of the activation function, and d is the maximum degree of the polynomials; also Ω(ws log(dqh/s)) is a lower bound for the VC-dimension of such a network set, and the bounds are tight for the cases s = Θ(h) and s constant. For the special case q = 1, the VC-dimension is Θ(ws log d). In spite of its importance, we had been unable to obtain VC-dimension values for practical types of networks until fairly tight upper and lower bounds were obtained ([6], [8], [9], and [10]) for linear threshold element networks, in which all elements perform a threshold function on a weighted sum of inputs. This is mainly because the differentiability of the functions is needed to perform backpropagation or other learning algorithms. Unfortunately, explicit bounds obtained so far for the VC-dimension of sigmoidal networks exhibit large gaps (O(w^2 h^2) ([3]), Ω(w log h) for bounded depth, and Ω(wh) for unbounded depth) and are hard to improve. For the piecewise linear case, Maass obtained a result that the VC-dimension is O(w^2 log q), where q is the number of linear pieces of the function ([5]).
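The bound expressions are easier to read when evaluated numerically; the following sketch only spells out how the three expressions scale (O/Ω constants ignored, natural logarithms used, and the parameter values are hypothetical):

```python
import math

# Hedged numeric illustration of the stated VC-dimension bounds; constants
# hidden by O/Omega are ignored and natural logarithms are used.

def vc_upper_a(w, s, h, q, d):   # O(ws(s log d + log(dqh/s)))
    return w * s * (s * math.log(d) + math.log(d * q * h / s))

def vc_upper_b(w, s, h, q, d):   # O(ws((h/s) log q + log(dqh/s)))
    return w * s * ((h / s) * math.log(q) + math.log(d * q * h / s))

def vc_lower(w, s, h, q, d):     # Omega(ws log(dqh/s))
    return w * s * math.log(d * q * h / s)

# Example: depth 2, 100 weights, 10 hidden units, 3-piece quadratic activations.
w, s, h, q, d = 100, 2, 10, 3, 2
print(vc_upper_a(w, s, h, q, d), vc_upper_b(w, s, h, q, d), vc_lower(w, s, h, q, d))
```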
Unsupervised and Supervised Clustering: The Mutual Information between Parameters and Observations
Herschkowitz, Didier, Nadal, Jean-Pierre
Recent works in parameter estimation and neural coding have demonstrated that optimal performance is related to the mutual information between parameters and data. We consider the mutual information in the case where the dependency in the parameter (a vector θ) of the conditional p.d.f. of each observation (a vector …
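For reference, the quantity discussed is the usual mutual information between the parameter θ and the data D; this standard definition is added here and is not taken from the truncated abstract:

```latex
% Mutual information between the parameter theta and the observed data D
% (standard definition, for reference only).
I(\theta; D) \;=\; \int p(\theta, D)\,\log\frac{p(\theta, D)}{p(\theta)\,p(D)}\;d\theta\,dD
\;=\; H(\theta) - H(\theta \mid D).
```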
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random relative to every contemplated hypothesis, and these hypotheses are in turn random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the negative log universal probability of the model and the negative log probability of the data given the model should be minimized. If we restrict the model class to finite sets, then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general, we show that data compression is almost always the best strategy, both in hypothesis identification and in prediction.
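A compact restatement of the Bayes-to-ideal-MDL step described above (K denotes prefix Kolmogorov complexity, and the universal prior assigns a hypothesis H probability of about 2^{-K(H)}):

```latex
% From Bayes's rule to ideal MDL: maximizing the posterior is minimizing the
% sum of the two negative log terms; the universal prior turns the first term
% into the Kolmogorov complexity of the hypothesis.
\max_H \Pr(H \mid D) \;=\; \max_H \frac{\Pr(D \mid H)\,\Pr(H)}{\Pr(D)}
\;\Longleftrightarrow\;
\min_H \bigl[\, -\log \Pr(D \mid H) \;-\; \log \Pr(H) \,\bigr],
\qquad
\Pr(H) = 2^{-K(H)} \;\Rightarrow\;
\min_H \bigl[\, K(H) \;-\; \log \Pr(D \mid H) \,\bigr].
```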
AAAI News
However, all eligible students are encouraged to apply. After the conference, an expense report will be required. Applications may be sent to scholarships@aaai.org or to 445 Burgess Drive, Menlo Park, CA 94025. All student scholarship recipients will be required to participate in the Student Volunteer Program to support the conference. Information about the Fifteenth National Conference on Artificial Intelligence (AAAI-98) will be available in late March by writing to ncai@aaai.org. Please note that the deadline for early registrations is May 27, 1998. The conference will be held in July at the Monona Terrace Convention Center, designed by Frank Lloyd Wright, in Madison, Wisconsin. Colocated events include the Third Annual Genetic Programming Conference (GP-98), July 22-25; the Eleventh Annual Conference on Computational Learning Theory (COLT '98), July 24-26 (theory.lcs.mit.…); and the Fifteenth International Conference on Machine Learning (ICML '98), also in July.
A Convergence Proof for the Softassign Quadratic Assignment Algorithm
Rangarajan, Anand, Yuille, Alan L., Gold, Steven, Mjolsness, Eric
The softassign quadratic assignment algorithm has recently emerged as an effective strategy for a variety of optimization problems in pattern recognition and combinatorial optimization. While the effectiveness of the algorithm was demonstrated in thousands of simulations, there was no known proof of convergence. Here, we provide a proof of convergence for the most general form of the algorithm.
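For orientation, a minimal sketch of the softassign step that the convergence result concerns (assumed form: exponentiate a benefit matrix Q at inverse temperature beta and Sinkhorn-balance it toward a doubly stochastic match matrix; this is a plain illustration, not the paper's proof apparatus):

```python
import numpy as np

# Minimal sketch of the softassign step: exponentiate the benefit matrix at
# inverse temperature beta, then alternate row and column normalizations
# (Sinkhorn balancing) toward a doubly stochastic match matrix.

def softassign(Q, beta, n_sinkhorn=50):
    M = np.exp(beta * (Q - Q.max()))          # subtract max for numerical stability
    for _ in range(n_sinkhorn):
        M /= M.sum(axis=1, keepdims=True)     # normalize each row
        M /= M.sum(axis=0, keepdims=True)     # normalize each column
    return M

# Toy usage: larger beta pushes the result toward a hard permutation matrix.
rng = np.random.default_rng(0)
print(np.round(softassign(rng.random((4, 4)), beta=20.0), 2))
```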