AITopics

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Textual Training for the Hassle-Free Removal of Unwanted Visual Data: Case Studies on OOD and Hateful Image Detection

Sangha Park

Neural Information Processing Systems, Mar-27-2025, 12:22:10 GMT

In our study, we explore methods for detecting unwanted content lurking in visual datasets. We provide a theoretical analysis demonstrating that a model capable of successfully partitioning visual data can be obtained using only textual data. Based on the analysis, we propose Hassle-Free Textual Training (HFTT), a streamlined method capable of acquiring detectors for unwanted visual content, using only synthetic textual data in conjunction with pre-trained vision-language models. HFTT features an innovative objective function that significantly reduces the necessity for human involvement in data annotation. Furthermore, HFTT employs a clever textual data synthesis method, effectively emulating the integration of unknown visual data distribution into the training process at no extra cost. The unique characteristics of HFTT extend its utility beyond traditional out-of-distribution detection, making it applicable to tasks that address more abstract concepts. We complement our analyses with experiments in out-of-distribution detection and hateful image detection.

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Asia > South Korea (0.14)
Europe > Spain (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Grow and Merge: A Unified Framework for Continuous Categories Discovery

Neural Information Processing Systems, Mar-27-2025, 12:21:59 GMT

Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories. In this work, we focus on application scenarios where unlabeled data are continuously fed into the category discovery system. We refer to this as the Continuous Category Discovery (CCD) problem, which is significantly more challenging than the static setting. A common challenge faced by novel category discovery is that different sets of features are needed for classification and category discovery: class-discriminative features are preferred for classification, while rich and diverse features are more suitable for new category mining. This challenge becomes more severe in the dynamic setting, as the system is asked to deliver good performance for known classes over time and, at the same time, continuously discover new classes from unlabeled data. To address this challenge, we develop a framework of Grow and Merge (GM) that works by alternating between a growing phase and a merging phase: in the growing phase, it increases the diversity of features through continuous self-supervised learning for effective category mining, and in the merging phase, it merges the grown model with a static one to ensure satisfactory performance for known classes. Our extensive studies verify that the proposed GM framework is significantly more effective than the state-of-the-art approaches for continuous category discovery.

artificial intelligence, category, machine learning, (13 more...)

Genre: Research Report > Promising Solution (0.34)

9098e2901b4eb54772f83535f89cb8ac-Supplemental-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 12:21:45 GMT

artificial intelligence, machine learning, pre-training, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.73)

DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions

Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song

Neural Information Processing Systems, Mar-27-2025, 12:21:42 GMT

As it is empirically observed that Vision Transformers (ViTs) are quite insensitive to the order of input tokens, the need for an appropriate self-supervised pretext task that enhances the location awareness of ViTs is becoming evident. To address this, we present DropPos, a novel pretext task designed to reconstruct Dropped Positions. The formulation of DropPos is simple: we first drop a large random subset of positional embeddings, and then the model classifies the actual position of each non-overlapping patch among all possible positions, based solely on its visual appearance. To avoid trivial solutions, we increase the difficulty of this task by keeping only a subset of patches visible. Additionally, considering that different patches may have similar visual appearances, we propose position smoothing and attentive reconstruction strategies to relax this classification problem, since it is not necessary to reconstruct their exact positions in these cases.
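The target construction described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `make_droppos_targets`, the parameters `pos_drop_ratio` and `sigma`, and the Gaussian form of the smoothed labels are all assumptions made for the example. The abstract only states that positions are dropped at random and that the classification target is relaxed for similar-looking patches. The sketch also omits the patch-masking step and attentive reconstruction.

```python
import math
import random


def make_droppos_targets(grid_size, pos_drop_ratio, sigma, seed=0):
    """Pick patches whose positional embedding is dropped and build soft targets.

    All names and the Gaussian smoothing kernel are hypothetical; they only
    illustrate the idea of classifying a patch's position among all positions.
    """
    rng = random.Random(seed)
    num_patches = grid_size * grid_size
    num_dropped = int(num_patches * pos_drop_ratio)

    # Randomly choose which patches lose their positional embedding.
    dropped = sorted(rng.sample(range(num_patches), num_dropped))

    targets = {}
    for idx in dropped:
        row, col = divmod(idx, grid_size)
        weights = []
        for other in range(num_patches):
            other_row, other_col = divmod(other, grid_size)
            dist_sq = (row - other_row) ** 2 + (col - other_col) ** 2
            # Nearby positions receive part of the probability mass (assumed kernel).
            weights.append(math.exp(-dist_sq / (2 * sigma ** 2)))
        total = sum(weights)
        # Normalize so each target is a distribution over all positions.
        targets[idx] = [w / total for w in weights]
    return dropped, targets


dropped, targets = make_droppos_targets(grid_size=4, pos_drop_ratio=0.75, sigma=1.0)
print(len(dropped), "of 16 positions dropped")
```

A model trained against such targets would minimize a cross-entropy between its predicted position distribution and each soft target. A one-hot target would correspond to the unrelaxed classification problem.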

artificial intelligence, computer vision, machine learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Links to Resources

Neural Information Processing Systems, Mar-27-2025, 12:21:35 GMT

Table 7: Examples of Generated Cartoon Descriptions

Canny description
GPT-4o: A knight in armor is riding a horse, holding a lance with a traffic [...] A line of businessmen in suits [...]
Human Written [20]: There are two men on a horse. They are wearing soldier outfits. Businessmen follow behind them.

Uncanny description
GPT-4o: It's unusual to see a medieval knight leading modern businessmen as if [...]
Human Written [20]: There are businessmen following a two guys on horses who are soldiers.

We use GPT-4o to generate descriptions for each cartoon. We used the five-shot method to generate a set of descriptions.

caption, large language model, machine learning, (22 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

Neural Information Processing Systems, Mar-27-2025, 12:21:34 GMT

We present a novel multimodal preference dataset for creative tasks, consisting of over 250 million human ratings on more than 2.2 million captions, collected through crowdsourcing rating data for The New Yorker's weekly cartoon caption contest over the past eight years. This unique dataset supports the development and evaluation of multimodal large language models and preference-based fine-tuning algorithms for humorous caption generation. We propose novel benchmarks for judging the quality of model-generated captions, utilizing both GPT4 and human judgments to establish ranking-based evaluation strategies. Our experimental results highlight the limitations of current fine-tuning methods, such as RLHF and DPO, when applied to creative tasks. Furthermore, we demonstrate that even state-of-the-art models like GPT4 and Claude currently underperform top human contestants in generating humorous captions. As we conclude this extensive data collection effort, we release the entire preference dataset to the research community, fostering further advancements in AI humor generation and evaluation.

caption, large language model, machine learning, (18 more...)

Country: North America > United States > New York (0.27)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

fMRI predictors based on language models of increasing complexity recover brain left lateralization

Neural Information Processing Systems, Mar-27-2025, 12:19:37 GMT

Over the past decade, studies of naturalistic language processing, in which participants are scanned while listening to continuous text, have flourished. Using word embeddings at first, then large language models, researchers have created encoding models to analyze the brain signals. Presenting these models with the same text as the participants makes it possible to identify brain areas where there is a significant correlation between the functional magnetic resonance imaging (fMRI) time series and the ones predicted by the models' artificial neurons. One intriguing finding from these studies is that they have revealed highly symmetric bilateral activation patterns, somewhat at odds with the well-known left lateralization of language processing. Here, we report analyses of an fMRI dataset where we manipulate the complexity of large language models, testing 28 pretrained models from 8 different families, ranging from 124M to 14.2B parameters.
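The correlation between a measured fMRI time series and a model-predicted one, as mentioned in this abstract, can be illustrated with a plain Pearson coefficient. This is a minimal sketch under stated assumptions: the helper `pearson_r` and the two short time series are hypothetical, and the abstract does not specify which correlation statistic or significance test the authors use.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length time series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("time series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for one voxel: measured signal vs. encoding-model prediction.
measured = [0.1, 0.4, 0.35, 0.8, 0.6]
predicted = [0.0, 0.5, 0.3, 0.9, 0.5]
r = pearson_r(measured, predicted)
print(round(r, 3))
```

Computing such a score for every voxel gives a brain map of where the model's predictions track the measured signal.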

artificial intelligence, large language model, natural language, (6 more...)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.61)
Health & Medicine > Therapeutic Area > Neurology (0.45)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)

Hierarchical Uncertainty Exploration via Feedforward Posterior Trees

Neural Information Processing Systems, Mar-27-2025, 12:19:37 GMT

When solving ill-posed inverse problems, one often desires to explore the space of potential solutions rather than be presented with a single plausible reconstruction. Valuable insights into these feasible solutions and their associated probabilities are embedded in the posterior distribution. However, when confronted with data of high dimensionality (such as images), visualizing this distribution becomes a formidable challenge, necessitating the application of effective summarization techniques before user examination. In this work, we introduce a new approach for visualizing posteriors across multiple levels of granularity using tree-valued predictions. Our method predicts a tree-valued hierarchical summarization of the posterior distribution for any input measurement, in a single forward pass of a neural network.

artificial intelligence, machine learning, probability, (17 more...)

Country:

Asia > Middle East (0.14)
Europe > Germany (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

A Appendix

Neural Information Processing Systems, Mar-27-2025, 12:19:27 GMT

Memory Cost of Self-attention Weights in DETR: DETR has six encoder-decoder pairs. Figure 1 presents the structure of the encoder, decoder, and embedded Multi-Head Self-Attention (MHSA) layer. Each MHSA layer has a self-attention weight tensor produced by the multiplication of Query and Key as shown in Figure 1. The memory cost of this tensor during training under different hyperparameter settings and optimization strategies is plotted in Figure 2. It shows that more attention heads, especially large downsampling ratios, significantly increase the memory cost. Additionally, Adam and AdamW optimizers, commonly used to train vision transformers, take more memory than simple SGD.
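As a rough back-of-the-envelope illustration of why this tensor is costly, the sketch below estimates the storage for one layer's attention weights. It assumes the usual shape (batch, heads, tokens, tokens) and float32 values. The function name and the example settings are hypothetical, and the estimate ignores gradients, optimizer state, and any implementation-specific buffers.

```python
def attention_weight_bytes(batch_size, num_heads, num_tokens, bytes_per_element=4):
    """Bytes needed to store one layer's self-attention weight tensor.

    Assumes shape (batch_size, num_heads, num_tokens, num_tokens), i.e. the
    product of Query and Key, stored as float32 by default.
    """
    return batch_size * num_heads * num_tokens * num_tokens * bytes_per_element


# Hypothetical settings, chosen only for illustration.
base = attention_weight_bytes(batch_size=2, num_heads=8, num_tokens=1000)
more_heads = attention_weight_bytes(batch_size=2, num_heads=16, num_tokens=1000)
more_tokens = attention_weight_bytes(batch_size=2, num_heads=8, num_tokens=2000)

print(base / 2**20, "MiB for the base setting")
print(more_heads / base)   # cost grows linearly with the number of heads
print(more_tokens / base)  # cost grows quadratically with the number of tokens
```

Under these assumptions, the number of tokens (set by the feature-map resolution) dominates the cost, since it enters the product twice.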

artificial intelligence, attention weight, memory cost, (16 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.50)