AITopics

Mair, Jamie F., Rose, Dominic C., Garrahan, Juan P.

Training neural network ensembles via trajectory sampling

arXiv.org Artificial Intelligence May-10-2023

In machine learning, there is renewed interest in neural network ensembles (NNEs), whereby predictions are obtained as an aggregate from a diverse set of smaller models, rather than from a single larger model. Here, we show how to define and train an NNE using techniques from the study of rare trajectories in stochastic systems. We define an NNE in terms of the trajectory of the model parameters under a simple, discrete-time diffusive dynamics, and train the NNE by biasing these trajectories towards a small time-integrated loss, as controlled by appropriate counting fields which act as hyperparameters. We demonstrate the viability of this technique on a range of simple supervised learning tasks. We discuss potential advantages of our trajectory sampling approach compared with more conventional gradient-based methods.
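The idea of biasing parameter trajectories towards a small time-integrated loss can be illustrated with a toy sketch. It is not the authors' sampler: it assumes a one-parameter model y = theta * x, a free initial point, and a plain single-site Metropolis update swapped in for simplicity. A trajectory (theta_1, ..., theta_T) gets a log-weight equal to the Gaussian random-walk log-probability with step size `sigma`, minus `s` times the summed loss, with `s` playing the role of a counting field. All function and parameter names are hypothetical.

```python
import math
import random

def loss(theta, data):
    """Mean squared error of the one-parameter model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def local_log_weight(traj, t, value, data, sigma, s):
    """Terms of the trajectory log-weight that depend on traj[t] = value."""
    lw = -s * loss(value, data)  # bias towards a small time-integrated loss
    if t > 0:  # diffusive step from the previous model
        lw -= (value - traj[t - 1]) ** 2 / (2 * sigma ** 2)
    if t + 1 < len(traj):  # diffusive step to the next model
        lw -= (traj[t + 1] - value) ** 2 / (2 * sigma ** 2)
    return lw

def sample_trajectory(data, T=20, sigma=0.1, s=50.0, sweeps=500, step=0.1, seed=0):
    """Single-site Metropolis sampling of a biased trajectory (assumed scheme)."""
    rng = random.Random(seed)
    traj = [0.0] * T
    for _ in range(sweeps):
        for t in range(T):
            proposal = traj[t] + rng.gauss(0.0, step)
            delta = (local_log_weight(traj, t, proposal, data, sigma, s)
                     - local_log_weight(traj, t, traj[t], data, sigma, s))
            # Symmetric proposal, so accept with probability min(1, exp(delta)).
            if delta >= 0 or rng.random() < math.exp(delta):
                traj[t] = proposal
    return traj

def ensemble_predict(traj, x):
    """Aggregate prediction: average over the models along the trajectory."""
    return sum(theta * x for theta in traj) / len(traj)

# Toy data generated by y = 2x; the trajectory plays the role of the ensemble.
data = [(k / 10, 2.0 * k / 10) for k in range(-10, 11)]
traj = sample_trajectory(data)
print(ensemble_predict(traj, 1.0))
```

In this sketch, a larger `s` concentrates the models around the loss minimum, and a smaller `sigma` ties neighbouring models along the trajectory more closely together.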

artificial intelligence, machine learning, trajectory, (16 more...)

arXiv:2209.11116

Country:

Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Neural Information Processing Systems Apr-6-2023, 19:28:49 GMT

Dynamics of Generalization in Linear Perceptrons

We study the evolution of the generalization ability of a simple linear perceptron with $N$ inputs which learns to imitate a "teacher perceptron". The system is trained on $p = \alpha N$ binary example inputs and the generalization ability measured by testing for agreement with the teacher on all $2^N$ possible binary input patterns. The dynamics may be solved analytically and exhibits a phase transition from imperfect to perfect generalization at $\alpha = 1$. Except at this point the generalization ability approaches its asymptotic value exponentially, with critical slowing down near the transition; the relaxation time is $\propto (1 - \sqrt{\alpha})^{-2}$. Right at the critical point, the approach to perfect generalization follows a power law $\propto t^{-1/2}$.
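The teacher-student setup in this abstract can be explored numerically with a small sketch. It assumes batch gradient descent on the squared error as the learning dynamics, which the abstract does not specify. It also uses the normalized squared distance between student and teacher weights as a stand-in for generalization ability. That distance equals the mean squared output difference over all 2^N inputs with entries ±1, because distinct input components are uncorrelated over that set. The function name and all parameters are hypothetical.

```python
import random

def train_student(N, alpha, steps=1000, eta=0.1, seed=0):
    """Return the normalized error |w - u|^2 / |u|^2 after each training step."""
    rng = random.Random(seed)
    p = int(alpha * N)  # number of training examples, p = alpha * N
    teacher = [rng.gauss(0.0, 1.0) for _ in range(N)]
    inputs = [[rng.choice((-1.0, 1.0)) for _ in range(N)] for _ in range(p)]
    targets = [sum(u * x for u, x in zip(teacher, xs)) for xs in inputs]
    student = [0.0] * N
    norm = sum(u * u for u in teacher)
    errors = []
    for _ in range(steps):
        # Gradient of the summed squared error over the training set.
        grad = [0.0] * N
        for xs, y in zip(inputs, targets):
            residual = y - sum(w * x for w, x in zip(student, xs))
            for i in range(N):
                grad[i] += residual * xs[i]
        student = [w + eta / N * g for w, g in zip(student, grad)]
        errors.append(sum((w - u) ** 2 for w, u in zip(student, teacher)) / norm)
    return errors

# Below alpha = 1 the examples do not pin down the teacher; above it they do.
for alpha in (0.5, 3.0):
    print(alpha, train_student(N=10, alpha=alpha)[-1])
```

At this small N the transition is smeared out, so the sketch only illustrates the qualitative contrast between the two regimes, not the critical exponents.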

generalization, generalization ability, linear perceptron, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.66)

Neural Information Processing Systems Apr-6-2023, 16:58:12 GMT

Efficient Learning of Linear Perceptrons

We consider the existence of efficient algorithms for learning the class of half-spaces in $\mathbb{R}^n$ in the agnostic learning model (i.e., making no prior assumptions on the example-generating distribution). The resulting combinatorial problem - finding the best agreement half-space over an input sample - is NP hard to approximate to within some constant factor. We suggest a way to circumvent this theoretical bound by introducing a new measure of success for such algorithms. An algorithm is $\mu$-margin successful if the agreement ratio of the half-space it outputs is as good as that of any half-space once training points that are inside the $\mu$-margins of its separating hyper-plane are disregarded. We prove crisp computational complexity results with respect to this success measure: On one hand, for every positive $\mu$, there exist efficient (poly-time) $\mu$-margin successful learning algorithms.
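The margin-based success measure can be made concrete with a short sketch. It reflects one reading of the definition and is not taken from the paper. A half-space is a pair (w, b), and its agreement ratio is the fraction of labelled points on the correct side. For a competitor, points within distance mu of its hyperplane are not counted, while the denominator stays the full sample size (an assumption). The definition quantifies over all half-spaces, whereas the hypothetical helper `beats_competitors` only checks a finite list.

```python
import math

def signed_distance(w, b, x):
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def agreement(w, b, sample):
    """Fraction of labelled points (x, y), y in {-1, +1}, classified correctly."""
    hits = sum(1 for x, y in sample if y * signed_distance(w, b, x) > 0)
    return hits / len(sample)

def margin_agreement(w, b, sample, mu):
    """Like agreement, but points within distance mu of the hyperplane do not count."""
    hits = sum(1 for x, y in sample if y * signed_distance(w, b, x) > mu)
    return hits / len(sample)

def beats_competitors(output, competitors, sample, mu):
    """Check the mu-margin condition against a finite list of competitors."""
    out_score = agreement(output[0], output[1], sample)
    return all(out_score >= margin_agreement(cw, cb, sample, mu)
               for cw, cb in competitors)

# Toy sample on a line; half_b separates it perfectly, half_a misses one point.
sample = [((2.0, 0.0), 1), ((0.1, 0.0), 1), ((-0.1, 0.0), 1), ((-2.0, 0.0), -1)]
half_a = ((1.0, 0.0), 0.0)
half_b = ((1.0, 0.0), 0.5)
print(beats_competitors(half_a, [half_b], sample, mu=0.5))
print(beats_competitors(half_a, [half_b], sample, mu=0.0))
```

With mu = 0 the check reduces to comparing plain agreement ratios, which is the problem the abstract describes as hard to approximate. A positive mu relaxes the comparison by discounting points close to the competitor's boundary.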

algorithm, efficient learning, linear perceptron, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)

Marion, Glenn, Saad, David

Hyperparameters Evidence and Generalisation for an Unrealisable Rule

Using a statistical mechanical formalism we calculate the evidence, generalisation error and consistency measure for a linear perceptron trained and tested on a set of examples generated by a nonlinear teacher. The teacher is said to be unrealisable because the student can never model it without error. Our model allows us to interpolate between the known case of a linear teacher, and an unrealisable, nonlinear teacher. A comparison of the hyperparameters which maximise the evidence with those that optimise the performance measures reveals that, in the nonlinear case, the evidence procedure is a misleading guide to optimising performance. Finally, we explore the extent to which the evidence procedure is unreliable and find that, despite being sub-optimal, in some circumstances it might be a useful method for fixing the hyperparameters.

1 INTRODUCTION

The analysis of supervised learning, or learning from examples, is a major field of research within neural networks.

evidence procedure, generalisation error, performance measure, (13 more...)

Country:

Europe > United Kingdom (0.14)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Learning in large linear perceptrons and why the thermodynamic limit is relevant to the real world

Sollich, Peter

We first rederive the known results for the 'thermodynamic limit' of infinite perceptron size N and show explicitly that 9

correction, generalization error, thermodynamic limit, (13 more...)

Country:

North America > United States > New York (0.05)
North America > United States > Indiana > Grant County > Marion (0.04)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.64)
