Structure of universal formulas

Neural Information Processing Systems

By universal formulas we understand parameterized analytic expressions that have a fixed complexity, but nevertheless can approximate any continuous function on a compact set. There exist various examples of such formulas, including some in the form of neural networks. In this paper we analyze the essential structural elements of these highly expressive models. We introduce a hierarchy of expressiveness classes connecting the global approximability property to the weaker property of infinite VC dimension, and prove a series of classification results for several increasingly complex functional families. As a consequence, we show that fixed-size neural networks with not more than one layer of neurons having transcendental activations (e.g., sine or standard sigmoid) cannot in general approximate functions on arbitrary finite sets.
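The infinite VC dimension mentioned in the abstract can be made concrete for a sine activation: the one-parameter family sign(sin(wx)) shatters the points x_i = 10^(-i), because the parameter w can encode an arbitrary labeling in its decimal digits. A minimal sketch of this classical construction (the specific weight formula below is the textbook one, not taken from this paper):

```python
import math
from itertools import product

def shattering_weight(labels):
    # Classical construction: w = pi * (1 + sum_i (1 - y_i) * 10^i).
    # The digits of w / pi encode the labels, so sign(sin(w * 10^-i)) = y_i.
    return math.pi * (1 + sum((1 - y) * 10 ** (i + 1) for i, y in enumerate(labels)))

n = 4
points = [10 ** -(i + 1) for i in range(n)]  # x_i = 10^-i, i = 1..n

# Verify that every one of the 2^n labelings is realized by some w:
for labels in product((0, 1), repeat=n):
    w = shattering_weight(labels)
    realized = tuple(1 if math.sin(w * x) > 0 else 0 for x in points)
    assert realized == labels
```

Since n was arbitrary, the family shatters arbitrarily large finite sets, i.e., its VC dimension is infinite — which is the weaker property the paper's hierarchy connects to global approximability.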


Speeding-up Evolutionary Algorithms to solve Black-Box Optimization Problems

Echevarrieta, Judith, Arza, Etor, Pérez, Aritz

arXiv.org Machine Learning

Population-based evolutionary algorithms are often considered when approaching computationally expensive black-box optimization problems. They employ a selection mechanism to choose the best solutions from a given population after comparing their objective values, which are then used to generate the next population. This iterative process explores the solution space efficiently, leading to improved solutions over time. However, these algorithms require a large number of evaluations to provide a quality solution, which might be computationally expensive when the evaluation cost is high. In some cases, it is possible to replace the original objective function with a less accurate approximation of lower cost. This introduces a trade-off between the evaluation cost and its accuracy. In this paper, we propose a technique capable of choosing an appropriate approximate function cost during the execution of the optimization algorithm. The proposal finds the minimum evaluation cost at which the solutions are still properly ranked, and consequently, more evaluations can be computed in the same amount of time with minimal accuracy loss. An experimental section on four very different problems reveals that the proposed approach can reach the same objective value in less than half of the time in certain cases.
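The core idea — find the cheapest evaluation at which the population is still ranked exactly as the true objective would rank it — can be sketched as follows. Everything here is illustrative: the objective `true_f`, the surrogate's error model (a deterministic term shrinking with cost), and the cost levels are invented for the example, not taken from the paper:

```python
import math

def true_f(x):
    # expensive exact objective (hypothetical)
    return (x - 1.2) ** 2

def approx_f(x, cost):
    # cheap surrogate whose error shrinks as the evaluation cost grows
    # (invented error model for the sketch)
    return true_f(x) + math.sin(37.0 * x) / cost

def ranking(pop, f):
    # indices of the population sorted by objective value
    return sorted(range(len(pop)), key=lambda i: f(pop[i]))

def minimal_reliable_cost(pop, costs):
    # smallest cost at which the surrogate ranks the population
    # exactly as the true objective does
    exact = ranking(pop, true_f)
    for c in sorted(costs):
        if ranking(pop, lambda x, c=c: approx_f(x, c)) == exact:
            return c
    return max(costs)

pop = [0.0, 0.5, 1.0, 2.0, 3.0]
cost = minimal_reliable_cost(pop, [1, 10, 100, 1000])
```

Since selection in an evolutionary algorithm only consumes the ranking, any cheaper evaluation that preserves it lets the same time budget buy more generations with no loss in selection quality.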


Statistical learning theory and empirical risk

#artificialintelligence

Here, I'll give an overview of the theoretical concepts behind statistical learning. Supervised learning plays a key role in learning from examples: useful information can be extracted from large datasets, and the problem of learning from examples amounts to approximating a function from sparse, noisy data. In supervised learning, a network is trained on a dataset of the form T = {(x_k, d_k)}, k = 1, ..., Q. Using a multilayer perceptron (MLP) with a sufficient number of hidden neurons, it is possible to approximate a given function to any arbitrary degree of accuracy.


r/MachineLearning - [D] What is the best neural network structure to approximate functions?

#artificialintelligence

I have heard that both feed-forward networks and recurrent neural networks are universal function approximators, i.e., they can approximate any function arbitrarily closely given enough hidden layers and hidden units. Is there a difference between smooth and non-smooth functions? Which is better: a few hidden layers with many hidden units per layer, or deep learning, i.e., many hidden layers? Has anyone tried using LSTM networks for this kind of stuff?


Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations

Yang, Yang, Pesavento, Marius, Chatzinotas, Symeon, Ottersten, Björn

arXiv.org Machine Learning

In this paper, we propose a successive convex approximation framework for sparse optimization where the nonsmooth regularization function in the objective function is nonconvex and it can be written as the difference of two convex functions. The proposed framework is based on a nontrivial combination of the majorization-minimization framework and the successive convex approximation framework proposed in literature for a convex regularization function. The proposed framework has several attractive features, namely, i) flexibility, as different choices of the approximate function lead to different type of algorithms; ii) fast convergence, as the problem structure can be better exploited by a proper choice of the approximate function and the stepsize is calculated by the line search; iii) low complexity, as the approximate function is convex and the line search scheme is carried out over a differentiable function; iv) guaranteed convergence to a stationary point. We demonstrate these features by two example applications in subspace learning, namely, the network anomaly detection problem and the sparse subspace clustering problem. Customizing the proposed framework by adopting the best-response type approximation, we obtain soft-thresholding with exact line search algorithms for which all elements of the unknown parameter are updated in parallel according to closed-form expressions. The attractive features of the proposed algorithms are illustrated numerically.
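The closed-form soft-thresholding update mentioned in the abstract can be illustrated on the simplest convex case. The sketch below is plain proximal gradient (ISTA) for the LASSO with a fixed 1/L stepsize — not the authors' parallel best-response algorithm with exact line search — and the problem sizes are arbitrary:

```python
import numpy as np

def soft_threshold(v, tau):
    # closed-form proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, steps=300):
    # minimize 0.5 * ||A x - b||^2 + lam * ||x||_1
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)
        # every coordinate is updated in parallel by one soft-threshold
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(30, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
```

The convexity of the approximate function is what buys the closed-form update: each subproblem separates across coordinates, so no inner solver is needed — the feature the abstract lists under "low complexity".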