AITopics

Collaborating Authors

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, here is the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

User-Dependent Neural Sequence Models for Continuous-Time Event Data

Alex Boyd, Robert Bamler, Stephan Mandt

Neural Information Processing Systems, Mar-21-2025, 21:31:48 GMT

Continuous-time event data are common in applications such as individual behavior data, financial transactions, and medical health records. Modeling such data can be very challenging, in particular for applications with many different types of events, since it requires a model to predict the event types as well as the time of occurrence. Recurrent neural networks that parameterize time-varying intensity functions are the current state-of-the-art for predictive modeling with such data. These models typically assume that all event sequences come from the same data distribution. However, in many applications event sequences are generated by different sources, or users, and their characteristics can be very different. In this paper, we extend the broad class of neural marked point process models to mixtures of latent embeddings, where each mixture component models the characteristic traits of a given user. Our approach relies on augmenting these models with a latent variable that encodes user characteristics, represented by a mixture model over user behavior that is trained via amortized variational inference. We evaluate our methods on four large real-world datasets and demonstrate systematic improvements from our approach over existing work for a variety of predictive metrics such as log-likelihood, next event ranking, and source-of-sequence identification.

artificial intelligence, machine learning, sequence, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry:

Media (0.94)
Information Technology > Security & Privacy (0.93)
Health & Medicine (0.66)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Online Inventory Problems Beyond the i.i.d. Setting with Online Convex Optimization

Neural Information Processing Systems, Mar-21-2025, 21:31:34 GMT

We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses. Our motivation is to consider general demands, losses and dynamics to go beyond standard models which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d.

artificial intelligence, inventory problem, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > France (0.14)
Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)


On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Tao Lin

Neural Information Processing Systems, Mar-21-2025, 21:31:28 GMT

We analyze the influence of adversarial training on the loss landscape of machine learning models. To this end, we first provide analytical studies of the properties of adversarial loss functions under different adversarial budgets. We then demonstrate that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients. Our conclusions are validated by numerical analyses, which show that training under large adversarial budgets impedes the escape from suboptimal random initialization, causes non-vanishing gradients, and makes the model find sharper minima. Based on these observations, we show that a periodic adversarial scheduling (PAS) strategy can effectively overcome these challenges, yielding better results than vanilla adversarial training while being much less sensitive to the choice of learning rate.

adversarial training, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Without Learning Rate Resets

Neural Information Processing Systems, Mar-21-2025, 21:31:09 GMT

Clean and robust error on the test set under various adversarial attacks. The numbers in brackets indicate the standard deviation across different runs. For example, 28.25(47) stands for 28.25 ± 0.47. We thank the reviewers for their constructive comments. The table above shows that our PAS strategy still yields better performance under stronger attacks.
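The bracket notation used in this excerpt, where 28.25(47) denotes a mean of 28.25 with a standard deviation of 0.47, can be unpacked mechanically. The following is a minimal sketch, not code from the paper: the function name `parse_concise` is hypothetical, and it assumes the bracketed digits are aligned with the last decimal places of the mean.

```python
import re
from typing import Tuple


def parse_concise(text: str) -> Tuple[float, float]:
    """Split a value such as '28.25(47)' into (mean, standard deviation)."""
    match = re.fullmatch(r"(-?\d+)(?:\.(\d+))?\((\d+)\)", text.strip())
    if match is None:
        raise ValueError(f"not in value(uncertainty) form: {text!r}")
    whole, frac, unc = match.groups()
    frac = frac or ""
    mean = float(f"{whole}.{frac}") if frac else float(whole)
    # Assumption: the bracketed digits share the decimal scale of the mean,
    # so '47' after a two-decimal mean means 47 / 100.
    std = int(unc) / 10 ** len(frac)
    return mean, std


print(parse_concise("28.25(47)"))  # (28.25, 0.47)
```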

artificial intelligence, learning rate reset, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


On the Power of Differentiable Learning versus PAC and SQ Learning

Emmanuel Abbe

Neural Information Processing Systems, Mar-21-2025, 21:31:05 GMT

We study the power of learning via mini-batch stochastic gradient descent (SGD) on the population loss, and batch Gradient Descent (GD) on the empirical loss, of a differentiable model or neural network, and ask what learning problems can be learnt using these paradigms. We show that SGD and GD can always simulate learning with statistical queries (SQ), but their ability to go beyond that depends on the precision ρ of the gradient calculations relative to the minibatch size b (for SGD) and sample size m (for GD). With fine enough precision relative to minibatch size, namely when bρ is small enough, SGD can go beyond SQ learning and simulate any sample-based learning algorithm and thus its learning power is equivalent to that of PAC learning; this extends prior work that achieved this result for b = 1. Similarly, with fine enough precision relative to the sample size m, GD can also simulate any sample-based learning algorithm based on m samples. In particular, with polynomially many bits of precision (i.e. when ρ is exponentially small), SGD and GD can both simulate PAC learning regardless of the mini-batch size.
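To make the two quantities in this abstract concrete, the sketch below runs mini-batch SGD with batch size b while rounding each gradient to the nearest multiple of ρ. This is only an illustration under stated assumptions: reading "precision ρ" as rounding to a grid of width ρ is our assumption, the 1-D regression task and all function names are hypothetical, and the code does not reproduce or demonstrate the paper's simulation results.

```python
import random
from typing import List, Tuple


def round_to_precision(value: float, rho: float) -> float:
    """Round value to the nearest multiple of rho (assumed meaning of 'precision rho')."""
    return round(value / rho) * rho


def sgd_step(w: float, batch: List[Tuple[float, float]], lr: float, rho: float) -> float:
    """One mini-batch step on the squared loss of the 1-D model y ~ w * x."""
    # Average gradient of 0.5 * (w * x - y)^2 over the b = len(batch) samples.
    grad = sum((w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * round_to_precision(grad, rho)


def train(rho: float, b: int, steps: int = 500, lr: float = 0.1, seed: int = 0) -> float:
    """Fit a toy target y = 2 * x with fresh samples at every step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = []
        for _ in range(b):
            x = rng.uniform(-1.0, 1.0)
            batch.append((x, 2.0 * x))
        w = sgd_step(w, batch, lr, rho)
    return w


fine = train(rho=1e-6, b=8)   # fine precision: w ends close to the target 2.0
coarse = train(rho=1.0, b=8)  # coarse precision: small gradients round to zero and w stalls
```

In this toy setting the coarse run stops moving once the averaged gradient falls below ρ/2, which gives some intuition for why the abstract states its conditions in terms of ρ relative to b.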

artificial intelligence, machine learning, query, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)


cc225865b743ecc91c4743259813f604-Paper.pdf

Neural Information Processing Systems, Mar-21-2025, 21:31:02 GMT

artificial intelligence, machine learning, query, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)


Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure

Neural Information Processing Systems, Mar-21-2025, 21:30:55 GMT

In this work, we study the generalizability of diffusion models by looking into the hidden properties of the learned score functions, which are essentially a series of deep denoisers trained on various noise levels. We observe that as diffusion models transition from memorization to generalization, their corresponding nonlinear diffusion denoisers exhibit increasing linearity. This discovery leads us to investigate the linear counterparts of the nonlinear diffusion models, which are a series of linear models trained to match the function mappings of the nonlinear diffusion denoisers. Interestingly, these linear denoisers are approximately the optimal denoisers for a multivariate Gaussian distribution characterized by the empirical mean and covariance of the training dataset. This finding implies that diffusion models have an inductive bias towards capturing and utilizing the Gaussian structure (covariance information) of the training dataset for data generation. We empirically demonstrate that this inductive bias is a unique property of diffusion models in the generalization regime, which becomes increasingly evident when the model's capacity is relatively small compared to the training dataset size. In the case where the model is highly overparameterized, this inductive bias emerges during the initial training phases before the model fully memorizes its training data. Our study provides crucial insights into understanding the notably strong generalization phenomenon recently observed in real-world diffusion models.
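The "optimal denoiser for a Gaussian" mentioned in this abstract has a closed form that is linear in the noisy input. The sketch below shows the one-dimensional case only, as a simplification of the multivariate statement in the abstract. It assumes additive noise of the form x_noisy = x0 + sigma * eps with standard normal eps; the function names are hypothetical and the paper's own noise parameterization may differ.

```python
from statistics import fmean, pvariance
from typing import Sequence, Tuple


def fit_gaussian(samples: Sequence[float]) -> Tuple[float, float]:
    """Empirical mean and (population) variance of a 1-D training set."""
    mu = fmean(samples)
    return mu, pvariance(samples, mu)


def gaussian_denoiser(x_noisy: float, mu: float, var: float, sigma: float) -> float:
    """Posterior mean E[x0 | x_noisy] when x0 ~ N(mu, var) and x_noisy = x0 + sigma * eps."""
    # The gain shrinks the input toward the data mean; it is 1 with no noise
    # and approaches 0 as the noise level grows.
    gain = var / (var + sigma ** 2)
    return mu + gain * (x_noisy - mu)


mu, var = fit_gaussian([1.0, 2.0, 3.0])
low_noise = gaussian_denoiser(5.0, mu, var, sigma=0.0)   # returns the input unchanged
high_noise = gaussian_denoiser(5.0, mu, var, sigma=1e6)  # collapses toward the mean
```

In the multivariate case the scalar gain becomes a matrix built from the covariance, so the denoiser remains an affine map of the noisy input.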

artificial intelligence, diffusion model, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Germany (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


50d277e84b2bcbaadcd84548a87e8cc4-Supplemental-Conference.pdf

Neural Information Processing Systems, Mar-21-2025, 21:30:48 GMT

artificial intelligence, querypose, sparse multi-person pose regression, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.59)


QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query

Neural Information Processing Systems, Mar-21-2025, 21:30:44 GMT

We propose a sparse end-to-end multi-person pose regression framework, termed QueryPose, which can directly predict multi-person keypoint sequences from the input image. The existing end-to-end methods rely on dense representations to preserve the spatial detail and structure for precise keypoint localization. However, the dense paradigm introduces complex and redundant post-processes during inference. In our framework, each human instance is encoded by several learnable spatial-aware part-level queries associated with an instance-level query. First, we propose the Spatial Part Embedding Generation Module (SPEGM) that considers the local spatial attention mechanism to generate several spatial-sensitive part embeddings, which contain spatial details and structural information for enhancing the part-level queries. Second, we introduce the Selective Iteration Module (SIM) to adaptively update the sparse part-level queries via the generated spatial-sensitive part embeddings stage-by-stage. Based on the two proposed modules, the part-level queries are able to fully encode the spatial details and structural information for precise keypoint regression. With bipartite matching, QueryPose avoids the hand-designed post-processes and surpasses the existing dense end-to-end methods with 73.6 AP on the MS COCO mini-val set and 72.7 AP on the CrowdPose test set.
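The bipartite matching step mentioned in this abstract pairs each ground-truth instance with exactly one prediction so that the total matching cost is minimal. The sketch below illustrates the idea by brute force on a tiny cost matrix. It is not the QueryPose implementation: the cost values and the function name `match_predictions` are hypothetical, and it assumes at least as many predictions as targets.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def match_predictions(cost: Sequence[Sequence[float]]) -> Tuple[List[Tuple[int, int]], float]:
    """Return the one-to-one (prediction, target) pairing with the lowest total cost.

    cost[i][j] is the (hypothetical) cost of assigning prediction i to target j.
    """
    n_pred, n_tgt = len(cost), len(cost[0])
    best_pairs: List[Tuple[int, int]] = []
    best_total = float("inf")
    # Try every way of giving each target its own distinct prediction.
    for chosen in permutations(range(n_pred), n_tgt):
        total = sum(cost[i][j] for j, i in enumerate(chosen))
        if total < best_total:
            best_total = total
            best_pairs = [(i, j) for j, i in enumerate(chosen)]
    return best_pairs, best_total


# Three predictions, two ground-truth targets; the unmatched prediction is left over.
pairs, total = match_predictions([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
```

Exhaustive search is factorial in the number of predictions, so it is only suitable for illustration; practical systems typically solve the same assignment problem with the Hungarian algorithm.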

artificial intelligence, machine learning, part-level query, (12 more...)

Neural Information Processing Systems

Country: Asia > China (0.29)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

