Tight Complexity Bounds for Optimizing Composite Objectives
We provide tight upper and lower bounds on the complexity of minimizing the average of m convex functions using gradient and prox oracles of the component functions. We show a significant gap between the complexity of deterministic vs randomized optimization. For smooth functions, we show that accelerated gradient descent (AGD) and an accelerated variant of SVRG are optimal in the deterministic and randomized settings respectively, and that a gradient oracle is sufficient for the optimal rate. For non-smooth functions, having access to prox oracles reduces the complexity and we present optimal methods based on smoothing that improve over methods using just gradient accesses.
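To fix notation, the problem is minimizing F(w) = (1/m) \sum_{i=1}^m f_i(w) given oracles for the components f_i. Below is a minimal sketch (not the authors' implementation) of the variance-reduced gradient step at the heart of SVRG-style methods, which the accelerated variant builds on; the least-squares components, step size, and epoch count are all illustrative.

```python
import numpy as np

def svrg_step(grads, w, w_snapshot, full_grad_snapshot, lr, i):
    """One SVRG update: stochastic gradient at w, corrected by the
    snapshot gradient so the estimator is unbiased with low variance."""
    g = grads[i](w) - grads[i](w_snapshot) + full_grad_snapshot
    return w - lr * g

# Illustrative components: f_i(w) = 0.5 * (a_i . w - b_i)^2
rng = np.random.default_rng(0)
m, d = 100, 10
A, b = rng.normal(size=(m, d)), rng.normal(size=m)
grads = [lambda w, i=i: (A[i] @ w - b[i]) * A[i] for i in range(m)]

w = np.zeros(d)
for epoch in range(20):
    w_snap = w.copy()
    full_grad = np.mean([g(w_snap) for g in grads], axis=0)  # one full pass
    for _ in range(m):
        i = rng.integers(m)
        w = svrg_step(grads, w, w_snap, full_grad, lr=0.01, i=i)
```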
Fast learning rates with heavy-tailed losses
We study fast learning rates when the losses are not necessarily bounded and may have a distribution with heavy tails. To enable such analyses, we introduce two new conditions: (i) the envelope function \sup_{f \in \mathcal{F}} \ell \circ f, where \ell is the loss function and \mathcal{F} is the hypothesis class, exists and is L^r-integrable, and (ii) \ell satisfies the multi-scale Bernstein's condition on \mathcal{F}. Under these assumptions, we prove that learning rates faster than O(n^{-1/2}) can be obtained and, depending on r and the multi-scale Bernstein's powers, can be arbitrarily close to O(n^{-1}). We then verify these assumptions and derive fast learning rates for the problem of vector quantization by k-means clustering with heavy-tailed distributions. The analyses enable us to obtain novel learning rates that extend and complement existing results in the literature from both theoretical and practical viewpoints.
Gaussian Process Bandit Optimisation with Multi-fidelity Evaluations
In many scientific and engineering applications, we are tasked with the optimisation of an expensive-to-evaluate black-box function f. Traditional methods for this problem assume the availability of this single function alone. However, in many cases, cheap approximations to f may be obtainable. For example, the expensive real-world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to cheaply eliminate regions of low function value, reserve the expensive evaluations of f for a small but promising region, and thereby speedily identify the optimum.
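As a toy illustration of this two-stage idea (not the paper's Gaussian-process algorithm), the sketch below screens a coarse grid with a hypothetical cheap approximation and then spends the expensive budget only in the surviving region; both functions and the quantile threshold are made up.

```python
import numpy as np

f = lambda x: -(x - 0.7) ** 2                                  # expensive black box
f_cheap = lambda x: -(x - 0.65) ** 2 + 0.01 * np.sin(20 * x)   # biased cheap proxy

# Stage 1: screen a coarse grid with the cheap approximation only.
grid = np.linspace(0.0, 1.0, 201)
cheap_vals = f_cheap(grid)
promising = grid[cheap_vals >= np.quantile(cheap_vals, 0.9)]

# Stage 2: spend the expensive budget only inside the promising region.
candidates = np.linspace(promising.min(), promising.max(), 20)
x_best = candidates[np.argmax([f(x) for x in candidates])]
print(f"estimated optimum near x = {x_best:.3f}")
```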
Supervised Learning with Tensor Networks
Tensor networks are approximations of high-order tensors which are efficient to work with and have been very successful for physics and mathematics applications. We demonstrate how algorithms for optimizing tensor networks can be adapted to supervised learning tasks by using matrix product states (tensor trains) to parameterize non-linear kernel learning models. For the MNIST data set we obtain less than 1% test set classification error. We discuss an interpretation of the additional structure imparted by the tensor network to the learned model.
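The following is a minimal numpy sketch of the evaluation step such a model requires, assuming the sine/cosine local feature map used in this line of work; the bond dimension, random cores, and single scalar output are illustrative stand-ins, not the trained MNIST model.

```python
import numpy as np

def local_features(x):
    """Local feature map: a pixel value x in [0, 1] becomes a 2-vector,
    so the whole input is a rank-N product state."""
    return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

def mps_evaluate(cores, x):
    """Contract a matrix product state (tensor train) against the
    product-state features of x. cores[j] has shape (D_left, 2, D_right)."""
    msg = np.ones(1)
    for core, xj in zip(cores, x):
        # Contract the physical index with the local feature vector,
        # then pass the bond-dimension message to the next core.
        msg = msg @ np.einsum('lpr,p->lr', core, local_features(xj))
    return msg.item()  # scalar score (one output, for simplicity)

rng = np.random.default_rng(0)
N, D = 16, 4  # number of pixels, bond dimension
shapes = [(1, 2, D)] + [(D, 2, D)] * (N - 2) + [(D, 2, 1)]
cores = [0.5 * rng.normal(size=s) for s in shapes]
score = mps_evaluate(cores, rng.uniform(size=N))
```

The key point of the parameterization is that the contraction costs O(N D^2) rather than the exponential cost of the full weight tensor.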
Learning values across many orders of magnitude
Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates. This is important in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the behavior policy. Our main motivation is prior work on learning to play Atari games, where the rewards were clipped to a predetermined range. This clipping facilitates learning across many different games with a single learning algorithm, but a clipped reward function can result in qualitatively different behavior.
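A simplified sketch of adaptive target normalization with running statistics is shown below; the full method in the paper additionally rewrites the output layer so that past predictions are preserved whenever the scale changes. The decay rate beta and the example targets are illustrative.

```python
import numpy as np

class TargetNormalizer:
    """Running-statistics normalizer for regression or value targets."""
    def __init__(self, beta=0.01):
        self.beta = beta
        self.mean, self.sq_mean = 0.0, 1.0

    def update(self, target):
        # Exponential moving estimates of the first and second moments.
        self.mean += self.beta * (target - self.mean)
        self.sq_mean += self.beta * (target ** 2 - self.sq_mean)

    @property
    def std(self):
        return max(np.sqrt(self.sq_mean - self.mean ** 2), 1e-6)

    def normalize(self, target):
        return (target - self.mean) / self.std

    def denormalize(self, y):
        return y * self.std + self.mean

norm = TargetNormalizer()
for t in [0.1, 0.5, 1000.0, 2000.0]:  # targets spanning orders of magnitude
    norm.update(t)
    print(norm.normalize(t))          # the learner only ever sees O(1) values
```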
Using Social Dynamics to Make Individual Predictions: Variational Inference with a Stochastic Kinetic Model
Social dynamics is concerned primarily with interactions among individuals and the resulting group behaviors, modeling the temporal evolution of social systems via the interactions of individuals within these systems. In particular, the availability of large-scale data from social networks and sensor networks offers an unprecedented opportunity to predict state-changing events at the individual level. Examples of such events include disease transmission, opinion transition in elections, and rumor propagation. Unlike previous research focusing on the collective effects of social systems, this study makes efficient inferences at the individual level. In order to cope with dynamic interactions among a large number of individuals, we introduce the stochastic kinetic model to capture adaptive transition probabilities and propose an efficient variational inference algorithm whose complexity grows linearly, rather than exponentially, with the number of individuals.
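To fix ideas about what a stochastic kinetic model is, the sketch below forward-simulates a minimal one, a Gillespie-style SIR epidemic matching the disease-transmission example; it illustrates the state-dependent event rates such models capture, not the paper's variational inference algorithm. All parameters are made up.

```python
import numpy as np

def gillespie_sir(S, I, R, beta, gamma, t_max, rng):
    """Forward-simulate a stochastic kinetic (SIR) model: events fire
    with state-dependent rates, as in Gillespie's exact algorithm."""
    t, N = 0.0, S + I + R
    path = [(t, S, I, R)]
    while t < t_max and I > 0:
        infect = beta * S * I / N   # rate of the event S + I -> 2I
        recover = gamma * I         # rate of the event I -> R
        total = infect + recover
        t += rng.exponential(1.0 / total)       # time to the next event
        if rng.uniform() < infect / total:      # which event fired?
            S, I = S - 1, I + 1
        else:
            I, R = I - 1, R + 1
        path.append((t, S, I, R))
    return path

path = gillespie_sir(S=99, I=1, R=0, beta=0.3, gamma=0.1,
                     t_max=100.0, rng=np.random.default_rng(0))
```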
Minimax Estimation of Maximum Mean Discrepancy with Radial Kernels
Maximum Mean Discrepancy (MMD) is a distance on the space of probability measures which has found numerous applications in machine learning and nonparametric testing. This distance is based on the notion of embedding probabilities in a reproducing kernel Hilbert space. In this paper, we present the first known lower bounds for the estimation of MMD based on finite samples. Our lower bounds hold for any radial universal kernel on \R^d and match the existing upper bounds up to constants that depend only on the properties of the kernel. Using these lower bounds, we establish the minimax rate optimality of the empirical estimator and its U-statistic variant, which are usually employed in applications.
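The estimators in question are standard; for concreteness, here is a minimal sketch of the unbiased U-statistic estimator of squared MMD with a Gaussian (radial) kernel. The bandwidth, sample sizes, and test distributions are illustrative.

```python
import numpy as np

def rbf(X, Y, sigma=1.0):
    """Gaussian (radial) kernel matrix between sample sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_u(X, Y, sigma=1.0):
    """Unbiased U-statistic estimator of squared MMD between the
    distributions that generated X and Y (diagonal terms excluded)."""
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = rbf(X, X, sigma), rbf(Y, Y, sigma), rbf(X, Y, sigma)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 2))
Y = rng.normal(0.5, 1.0, size=(500, 2))  # shifted distribution
print(mmd2_u(X, Y))  # clearly positive; near 0 when the samples match
```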
Combining Fully Convolutional and Recurrent Neural Networks for 3D Biomedical Image Segmentation
Segmentation of 3D images is a fundamental problem in biomedical image analysis. Deep learning (DL) approaches have achieved state-of-the-art segmentation performance. To exploit the 3D contexts using neural networks, known DL segmentation methods, including 3D convolution, 2D convolution on the planes orthogonal to 2D slices, and LSTM in multiple directions, all suffer from incompatibility with the highly anisotropic dimensions in common 3D biomedical images. In this paper, we propose a new DL framework for 3D image segmentation, based on a combination of a fully convolutional network (FCN) and a recurrent neural network (RNN), which are responsible for exploiting the intra-slice and inter-slice contexts, respectively. To the best of our knowledge, this is the first DL framework for 3D image segmentation that explicitly leverages 3D image anisotropism.
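A schematic sketch of the FCN-plus-RNN decomposition follows (not the paper's exact architecture, which the abstract does not specify): a 2D convolutional encoder processes each slice for intra-slice context, and a recurrent layer scans the anisotropic slice axis per pixel for inter-slice context. The channel width and the choice of a GRU are illustrative.

```python
import torch
import torch.nn as nn

class SliceFCNPlusRNN(nn.Module):
    """Per-slice 2D FCN (intra-slice) followed by a per-pixel RNN
    over the slice axis (inter-slice); a schematic sketch only."""
    def __init__(self, channels=8):
        super().__init__()
        self.fcn = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(channels, channels, batch_first=True,
                          bidirectional=True)
        self.head = nn.Conv2d(2 * channels, 1, 1)  # per-pixel logits

    def forward(self, volume):                     # volume: (B, S, H, W)
        B, S, H, W = volume.shape
        feats = self.fcn(volume.reshape(B * S, 1, H, W))         # per slice
        C = feats.shape[1]
        seq = feats.reshape(B, S, C, H * W).permute(0, 3, 1, 2)  # (B, HW, S, C)
        seq, _ = self.rnn(seq.reshape(B * H * W, S, C))          # scan slices
        seq = seq.reshape(B, H * W, S, 2 * C).permute(0, 2, 3, 1)
        return self.head(seq.reshape(B * S, 2 * C, H, W)).reshape(B, S, H, W)

logits = SliceFCNPlusRNN()(torch.randn(2, 5, 32, 32))  # (B, S, H, W) logits
```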
Consistent Kernel Mean Estimation for Functions of Random Variables
We provide a theoretical foundation for non-parametric estimation of functions of random variables using kernel mean embeddings. We show that for any continuous function f, consistent estimators of the mean embedding of a random variable X lead to consistent estimators of the mean embedding of f(X). For Matérn kernels and sufficiently smooth functions we also provide rates of convergence. Our results extend to functions of multiple random variables. If the variables are dependent, we require an estimator of the mean embedding of their joint distribution as a starting point; if they are independent, it is sufficient to have separate estimators of the mean embeddings of their marginal distributions.
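The consistency result licenses a very simple plug-in estimator: push the samples of X through f and average the resulting kernel sections. A minimal sketch with a Gaussian kernel on the real line; the function f and the sample sizes are illustrative.

```python
import numpy as np

k = lambda x, y: np.exp(-(x - y) ** 2 / 2)  # Gaussian kernel on R
f = lambda x: np.sin(x) + 0.5 * x ** 2      # any continuous function

rng = np.random.default_rng(0)
x_samples = rng.normal(size=5000)           # samples of X

def mu_fX(t):
    """Estimated mean embedding of f(X), evaluated at the point t:
    push each sample of X through f and average the kernel sections."""
    return np.mean(k(f(x_samples), t))

# Sanity check against a large fresh sample of f(X) drawn directly.
fresh = f(rng.normal(size=200000))
t = 1.0
print(mu_fX(t), np.mean(k(fresh, t)))       # the two estimates agree
```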
Probing the Compositionality of Intuitive Functions
How do people learn about complex functional structure? Taking inspiration from other areas of cognitive science, we propose that this is accomplished by harnessing compositionality: complex structure is decomposed into simpler building blocks. We show that participants prefer compositional over non-compositional function extrapolations, that samples from the human prior over functions are best described by a compositional model, and that people perceive compositional functions as more predictable than their non-compositional but otherwise similar counterparts. We argue that the compositional nature of intuitive functions is consistent with broad principles of human cognition.
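One standard way to formalize compositional function structure, in the spirit of the compositional model described here, is a grammar over Gaussian process kernels: sums and products of simple base kernels are again valid kernels and encode richer structure. A minimal sketch with illustrative hyperparameters:

```python
import numpy as np

# Base kernels on scalar inputs (hyperparameters are illustrative).
rbf = lambda x, y: np.exp(-(x - y) ** 2 / 2)
lin = lambda x, y: x * y
per = lambda x, y: np.exp(-2 * np.sin(np.pi * np.abs(x - y)) ** 2)

# Composition: sums and products of kernels are valid kernels, so
# complex structure is built from simple, reusable building blocks.
compose = lambda x, y: lin(x, y) * per(x, y) + rbf(x, y)

# Sample one function from the corresponding GP prior.
xs = np.linspace(0, 4, 100)
K = np.array([[compose(a, b) for b in xs] for a in xs])
L = np.linalg.cholesky(K + 1e-6 * np.eye(100))  # jitter for stability
f = L @ np.random.default_rng(0).normal(size=100)
```

Here the sampled function has a growing periodic component plus smooth local variation, the kind of structure a non-compositional kernel must approximate rather than express directly.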