Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
Marina Danilova
In this paper, we propose a new accelerated stochastic first-order method called clipped-SSTM for smooth convex stochastic optimization with heavy-tailed noise in the stochastic gradients, and we derive the first high-probability complexity bounds for this method, closing a gap in the theory of stochastic optimization with heavy-tailed noise. Our method is based on a special variant of accelerated Stochastic Gradient Descent (SGD) combined with clipping of the stochastic gradients. We extend the method to the strongly convex case and prove new complexity bounds that outperform the state-of-the-art results in this setting. Finally, we extend our proof technique and derive the first non-trivial high-probability complexity bounds for SGD with clipping without a light-tails assumption on the noise.
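To make the clipping operation concrete, here is a minimal sketch of one SGD step with clipped stochastic gradients in Python; the `stochastic_grad` oracle, step size `gamma`, and clipping level `lam` are illustrative placeholders, not the paper's clipped-SSTM parameter schedule.

```python
import numpy as np

def clip(g, lam):
    """Clipping operator: rescale g so its norm is at most lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)

def clipped_sgd_step(x, stochastic_grad, gamma, lam):
    """One SGD step with a clipped stochastic gradient.

    x               -- current iterate (np.ndarray)
    stochastic_grad -- oracle returning a (possibly heavy-tailed) gradient sample
    gamma           -- step size
    lam             -- clipping level
    """
    g = stochastic_grad(x)
    return x - gamma * clip(g, lam)
```

Clipping caps the influence of any single heavy-tailed gradient sample, which is what makes high-probability guarantees attainable without a light-tails assumption.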
Figure 1: (a) Some of the features extracted by LFOICA.
We would like to thank all reviewers for the constructive comments. The citation indices below are consistent with those in the main paper. To R#1: MoG and sparsity have been studied. We will further discuss the suggested references in the "practical consideration" subsection. With additional constraints on A, the permutation and scaling ambiguity can be further reduced.
Learning with Fitzpatrick Losses
Fenchel-Young losses are a family of loss functions, encompassing the squared, logistic and sparsemax losses, among others. They are convex w.r.t. the model output and the target, separately. Each Fenchel-Young loss is implicitly associated with a link function, that maps model outputs to predictions. For instance, the logistic loss is associated with the soft argmax link function. Can we build new loss functions associated with the same link function as Fenchel-Young losses?
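As a concrete instance of the construction described above, here is a minimal sketch of the multiclass logistic Fenchel-Young loss and its soft argmax link, using the standard form L(theta, y) = Omega*(theta) + Omega(y) - <theta, y> with Omega the negative Shannon entropy; the function names are ours.

```python
import numpy as np
from scipy.special import logsumexp, softmax

def fy_logistic_loss(theta, y):
    """Multiclass logistic loss written in Fenchel-Young form.

    theta -- model scores (logits), shape (k,)
    y     -- target distribution (e.g. one-hot), shape (k,)

    L(theta, y) = Omega*(theta) + Omega(y) - <theta, y>, where Omega is the
    negative Shannon entropy, so Omega* is log-sum-exp.  For a one-hot y this
    reduces to the usual cross-entropy: logsumexp(theta) - theta[class].
    """
    omega_y = np.sum(y[y > 0] * np.log(y[y > 0]))  # negative entropy of y
    return logsumexp(theta) + omega_y - theta @ y

def link(theta):
    """The associated link function: gradient of Omega*, i.e. the soft argmax."""
    return softmax(theta)
```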
Explaining Datasets in Words: Statistical Models with Natural Language Parameters
To make sense of massive data, we often first fit simplified models and then interpret the parameters; for example, we cluster the text embeddings and then interpret the mean parameters of each cluster. However, these parameters are often high-dimensional and hard to interpret. To make model parameters directly interpretable, we introduce a family of statistical models--including clustering, time series, and classification models--parameterized by natural language predicates. For example, a cluster of text about COVID could be parameterized by the predicate "discusses COVID". To learn these statistical models effectively, we develop a model-agnostic algorithm that optimizes continuous relaxations of predicate parameters with gradient descent and discretizes them by prompting language models (LMs). Finally, we apply our framework to a wide range of problems: taxonomizing user chat dialogues, characterizing how they evolve across time, finding categories where one language model is better than the other, clustering math problems based on subareas, and explaining visual features in memorable images. Our framework is highly versatile, applicable to both textual and visual domains, and can be easily steered to focus on specific properties.
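A rough sketch of the two-phase recipe, with a plain soft-clustering loop standing in for the paper's gradient-based continuous relaxation, and a hypothetical `prompt_lm` callable standing in for the LM discretization step:

```python
import numpy as np

def fit_predicate_clusters(embeddings, texts, k, prompt_lm, iters=50):
    """Soft-cluster texts, then ask an LM to name each cluster with a predicate.

    embeddings -- (n, d) array of (ideally unit-norm) text embeddings
    texts      -- list of n raw texts
    k          -- number of clusters
    prompt_lm  -- hypothetical callable: list of example texts -> predicate string
    """
    rng = np.random.default_rng(0)
    centers = embeddings[rng.choice(len(texts), k, replace=False)]

    for _ in range(iters):
        sims = embeddings @ centers.T                        # (n, k) similarities
        soft = np.exp(5.0 * (sims - sims.max(axis=1, keepdims=True)))
        soft /= soft.sum(axis=1, keepdims=True)              # relaxed (soft) assignments
        centers = (soft.T @ embeddings) / soft.sum(axis=0)[:, None]

    predicates = []
    for j in range(k):
        top = np.argsort(embeddings @ centers[j])[-5:]       # most representative texts
        predicates.append(prompt_lm([texts[i] for i in top]))
    return predicates
```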
Assessing Social and Intersectional Biases in Contextualized Word Representations
Social bias in machine learning has drawn significant attention, with work ranging from demonstrations of bias in a multitude of applications, to curated definitions of fairness for different contexts, to algorithms that mitigate bias. In natural language processing, gender bias has been shown to exist in context-free word embeddings. Recently, contextual word representations have outperformed word embeddings in several downstream NLP tasks. These word representations are conditioned on their context within a sentence, and can also be used to encode the entire sentence. In this paper, we analyze the extent to which state-of-the-art models for contextual word representations, such as BERT and GPT-2, encode biases with respect to gender, race, and intersectional identities. To this end, we propose assessing bias at the contextual word level. This novel approach captures the contextual effects of bias missing in context-free word embeddings, yet avoids confounding effects that underestimate bias at the sentence encoding level. We demonstrate evidence of bias at the corpus level, find varying evidence of bias in embedding association tests, show in particular that racial bias is strongly encoded in contextual word models, and observe that bias effects for intersectional minorities are exacerbated beyond their constituent minority identities. Further, evaluating bias effects at the contextual word level captures biases that are not captured at the sentence level, confirming the need for our novel approach.
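For reference, the embedding association tests mentioned above compute a Cohen's-d-style effect size between two target word sets and two attribute word sets; a minimal sketch follows, where applying it at the contextual word level (our reading of the setup) means the inputs are contextual token embeddings from a model such as BERT.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(X, Y, A, B):
    """WEAT-style effect size between target sets X, Y and attribute sets A, B.

    Each argument is a list of embedding vectors.  Returns a Cohen's-d-style
    statistic; larger |d| means a stronger differential association.
    """
    def s(w):  # association of one word with A versus B
        return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

    sx = [s(x) for x in X]
    sy = [s(y) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```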
Reviewer 1: Thank you for the detailed, thought-provoking comments, which will help us improve the final work. We will include a test associating M/F names and occupation words in the final version. Indeed, the method uses the contextual representation, so there is no pooling involved. Reviewer 2: Thank you for the incisive comments, which help us improve our discussion and interpretation of results. Indeed, our word lists were constructed in prior work (Caliskan et al. [6] and May et al. [23]); for the extension of the Heilman double-bind tests to race, we kept the word lists consistent with this prior work. Some prior work [23] also found negative effect sizes for BERT and GPT (for sentence encodings).
Dendritic Integration Inspired Artificial Neural Networks Capture Data Correlation
Douglas Zhou
Incorporating biological neuronal properties into Artificial Neural Networks (ANNs) to enhance computational capabilities is under active investigation in the field of deep learning. Inspired by recent findings indicating that dendrites adhere to a quadratic integration rule for synaptic inputs, this study explores the computational benefits of quadratic neurons. We theoretically demonstrate that quadratic neurons inherently capture correlation within structured data, a feature that grants them superior generalization abilities over traditional neurons; this is substantiated by few-shot learning experiments. Furthermore, we integrate the quadratic rule into Convolutional Neural Networks (CNNs) in a biologically plausible way, resulting in innovative architectures--Dendritic integration inspired CNNs (Dit-CNNs). Our Dit-CNNs compete favorably with state-of-the-art models across multiple classification benchmarks, e.g., ImageNet-1K, while retaining the simplicity and efficiency of traditional CNNs.
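A minimal sketch of what a quadratic neuron can look like, using the common multiplicative form y = (W1 x) * (W2 x) + W3 x; this PyTorch layer is illustrative and not necessarily the exact Dit-CNN block.

```python
import torch
import torch.nn as nn

class QuadraticNeuron(nn.Module):
    """Quadratic neuron layer: y = (W1 x) * (W2 x) + W3 x + b.

    The multiplicative term makes the output depend on products of input
    pairs, so a single layer can respond to correlation structure that a
    purely linear neuron cannot.
    """

    def __init__(self, in_features, out_features):
        super().__init__()
        self.w1 = nn.Linear(in_features, out_features, bias=False)
        self.w2 = nn.Linear(in_features, out_features, bias=False)
        self.w3 = nn.Linear(in_features, out_features)  # carries the bias term

    def forward(self, x):
        return self.w1(x) * self.w2(x) + self.w3(x)
```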
Humanoid Locomotion as Next Token Prediction
We cast real-world humanoid control as a next token prediction problem, akin to predicting the next word in language. Our model is a causal transformer trained via autoregressive prediction of sensorimotor sequences. To account for the multimodal nature of the data, we perform prediction in a modality-aligned way, and for each input token predict the next token from the same modality. This general formulation enables us to leverage data with missing modalities, such as videos without actions. We train our model on a dataset of sequences from a prior neural network policy, a model-based controller, motion capture, and YouTube videos of humans. We show that our model enables a real humanoid robot to walk in San Francisco zero-shot. Our model can transfer to the real world even when trained on only 27 hours of walking data, and can generalize to commands not seen during training. These findings suggest a promising path toward learning challenging real-world control tasks by generative modeling of sensorimotor sequences.
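One simple way to realize "leverage data with missing modalities" is to mask the autoregressive loss at positions whose modality is absent; the sketch below assumes flattened (T, V) logits and a boolean presence mask, both our assumptions about tensor layout rather than the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def masked_next_token_loss(logits, targets, present):
    """Autoregressive loss that skips tokens of missing modalities.

    logits  -- (T, V) next-token predictions from the causal transformer
    targets -- (T,) ids of the interleaved sensorimotor tokens
    present -- (T,) bool; False marks slots whose modality is missing,
               e.g. action tokens for YouTube clips that have no actions
    """
    loss = F.cross_entropy(logits, targets, reduction="none")  # (T,)
    present = present.float()
    return (loss * present).sum() / present.sum().clamp(min=1.0)
```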
We thank the reviewers for their thoughtful feedback. Below, we provide clarifications to their concerns. Reviewer R1. (Sufficient conditions for learnability) As noted on Line 72, these online-learning-with-dynamics instances comprise the broad class for which our learnability characterization is tight. We will elaborate further in our revised draft. For example, consider linear dynamical systems parameterized by matrices (A, B).
Chirality Nets for Human Pose Regression
Raymond Yeh, Yuan-Ting Hu, Alexander Schwing
We propose Chirality Nets, a family of deep nets that is equivariant to the "chirality transform," i.e., the transformation that creates a chiral pair. Through parameter sharing and odd and even symmetry, we propose and prove variants of standard deep net building blocks that satisfy the equivariance property, including fully connected layers, convolutional layers, batch normalization, and LSTM/GRU cells. The proposed layers lead to a more data-efficient representation and a reduction in computation by exploiting symmetry. We evaluate chirality nets on the task of human pose regression, which naturally exhibits the left/right mirroring of the human body. We study three pose regression tasks: 3D pose estimation from video, 2D pose forecasting, and skeleton-based activity recognition. Our approach achieves or matches state-of-the-art results, with larger gains on small datasets and in limited-data settings.
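To illustrate the parameter-sharing idea, here is a minimal chirality-equivariant fully connected layer for the simplified case where the chirality transform just swaps the left and right halves of the feature vector (the full transform also negates mirrored coordinates); the class name and layout are ours.

```python
import torch
import torch.nn as nn

class ChiralLinear(nn.Module):
    """Fully connected layer equivariant to swapping left/right feature halves.

    The weight has block structure [[A, B], [B, A]], which guarantees
    layer(swap(x)) == swap(layer(x)).
    """

    def __init__(self, half_in, half_out):
        super().__init__()
        self.A = nn.Parameter(torch.randn(half_out, half_in) * 0.01)
        self.B = nn.Parameter(torch.randn(half_out, half_in) * 0.01)

    def forward(self, x):                     # x: (..., 2 * half_in)
        xl, xr = x.chunk(2, dim=-1)
        yl = xl @ self.A.T + xr @ self.B.T    # left-half output
        yr = xl @ self.B.T + xr @ self.A.T    # right-half output
        return torch.cat([yl, yr], dim=-1)
```

Because A and B are shared across the two halves, the layer uses half the free parameters of an unconstrained linear layer of the same size, consistent with the data-efficiency claim above.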