865dfbde8a344b44095495f3591f7407-AuthorFeedback.pdf
We thank the reviewers and AC for their thoughtful comments and thorough review. We will include detailed comparisons in the camera-ready version of the paper. We agree with the reviewer's statement that the entropy of the average is not the same as the average of the entropies. We will describe this calculation in detail in the appendix. We will make this conceptual point explicit in the camera-ready version. We will mention it explicitly in the camera-ready version.
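For context, the point conceded here is Jensen's inequality for the (concave) Shannon entropy: the entropy of an average of distributions is at least the average of their entropies. A minimal numerical illustration (ours, not from the rebuttal):

import numpy as np

def entropy(p):
    # Shannon entropy in nats; assumes p is a valid probability vector.
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p, where=p > 0, out=np.zeros_like(p)))

p = np.array([0.9, 0.1])
q = np.array([0.1, 0.9])

h_of_avg = entropy((p + q) / 2)           # entropy of the average distribution
avg_of_h = (entropy(p) + entropy(q)) / 2  # average of the entropies

# Entropy is concave, so H((p+q)/2) >= (H(p)+H(q))/2, equality iff p == q.
print(h_of_avg, avg_of_h)  # ~0.693 vs ~0.325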
Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
Exploring unknown environments efficiently is a fundamental challenge in unsupervised goal-conditioned reinforcement learning. While selecting exploratory goals at the frontier of previously explored states is an effective strategy, the policy during training may still have only a limited ability to reach rare goals on the frontier, resulting in reduced exploratory behavior. We propose "Cluster Edge Exploration" (CE²) to address this limitation.
What Makes a " Good " Data Augmentation in Knowledge Distillation - A Statistical Perspective (with Appendix)
Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. Especially, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD? Our investigation from a statistical perspective suggests that a good DA scheme should reduce the covariance of the teacher-student cross-entropy.
What Makes a " Good " Data Augmentation in Knowledge Distillation - A Statistical Perspective
Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. In particular, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD? Our investigation from a statistical perspective suggests that a good DA scheme should reduce the covariance of the teacher-student cross-entropy.
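To make the proposed criterion concrete, here is a hedged sketch that estimates the spread of per-sample teacher-student cross-entropy under a given DA scheme; the paper's exact statistic may differ from this variance estimate, and teacher, student, and augment are assumed placeholders rather than the authors' code.

import torch
import torch.nn.functional as F

def ts_crossentropy_spread(teacher, student, images, augment, n_rounds=10):
    # Estimate the spread (here, variance) of per-sample teacher-student
    # cross-entropy under a data augmentation scheme. Lower spread is the
    # property the abstract associates with a "good" DA for KD.
    # `augment` is a hypothetical callable mapping a batch to an augmented batch;
    # `teacher` and `student` are assumed to be modules returning logits.
    ce_values = []
    teacher.eval(); student.eval()
    with torch.no_grad():
        for _ in range(n_rounds):
            x = augment(images)
            t_prob = F.softmax(teacher(x), dim=1)      # teacher soft labels
            s_logp = F.log_softmax(student(x), dim=1)  # student log-probs
            ce = -(t_prob * s_logp).sum(dim=1)         # per-sample T-S cross-entropy
            ce_values.append(ce)
    return torch.cat(ce_values).var().item()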
Hierarchical Granularity Transfer Learning Shaobo Min
In the real world, object categories usually form a hierarchical granularity tree. Nowadays, most researchers focus on recognizing categories at a specific granularity, e.g., basic level or sub(ordinate) level. Compared with basic-level categories, sub-level categories provide more valuable information, but their training annotations are harder to acquire. Therefore, an attractive problem is how to transfer the knowledge learned from basic-level annotations to sub-level recognition. In this paper, we introduce a new task, named Hierarchical Granularity Transfer Learning (HGTL), to recognize sub-level categories with basic-level annotations and semantic descriptions for hierarchical categories. Different from other recognition tasks, HGTL has a serious granularity gap, i.e., the two granularities share an image space but have different category domains, which impedes knowledge transfer. To this end, we propose a novel Bi-granularity Semantic Preserving Network (BigSPN) to bridge the granularity gap for robust knowledge transfer. Specifically, BigSPN constructs separate visual encoders for the different granularities, which are aligned with a shared semantic interpreter via a novel subordinate entropy loss. Experiments on three benchmarks with hierarchical granularities show that BigSPN is an effective framework for Hierarchical Granularity Transfer Learning.
861637a425ef06e6d539aaaff113d1d5-AuthorFeedback.pdf
We thank the reviewers for their comments. As all reviewers agreed, HGTL is an interesting and novel task that should attract further research. Yes, we have in fact adapted DA to HGTL in a two-stage manner; all compared methods, as well as BigSPN, are two-stage models, as stated in Q2. Q1: Illustrating why a new visual encoder is needed.
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe Bartosz Piotrowski University of Cambridge IDEAS NCBR University of Warsaw IMPAN Wenda Li Mateja Jamnik
Text embeddings are essential for many tasks, such as document retrieval, clustering, and semantic similarity assessment. In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion, given a suite of pre-trained decoder-only language models. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning methods for text-embedding models at different computational budget levels. The resulting recipe, which we obtain through extensive experiments, can be used by practitioners to make informed design choices for their embedding models. Specifically, our findings suggest that full fine-tuning and low-rank adaptation fine-tuning produce optimal models at lower and higher computational budgets, respectively.
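As a concrete instance of the contrastive training the abstract refers to, here is a minimal in-batch-negatives InfoNCE sketch in PyTorch; the pooling of hidden states into the two embedding tensors and the temperature value are our assumptions, not the paper's recipe.

import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, doc_emb, temperature=0.05):
    # Standard in-batch-negatives InfoNCE loss for contrastive embedding
    # training. query_emb and doc_emb are (batch, dim) tensors of pooled
    # hidden states from the language model; the i-th query and i-th
    # document are assumed to form the positive pair.
    q = F.normalize(query_emb, dim=1)
    d = F.normalize(doc_emb, dim=1)
    logits = q @ d.T / temperature                    # cosine similarities
    labels = torch.arange(q.size(0), device=q.device) # positives on the diagonal
    return F.cross_entropy(logits, labels)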
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically Manolis Zampetakis Yale University
While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks. In this work, we present Tree of Attacks with Pruning (TAP), an automated method for generating jailbreaks that only requires black-box access to the target LLM. TAP utilizes an attacker LLM to iteratively refine candidate (attack) prompts until one of the refined prompts jailbreaks the target. In addition, before sending prompts to the target, TAP assesses them and prunes the ones unlikely to result in jailbreaks, reducing the number of queries sent to the target LLM. In empirical evaluations, we observe that TAP generates prompts that jailbreak state-of-the-art LLMs (including GPT4-Turbo and GPT4o) for more than 80% of the prompts. This significantly improves upon previous state-of-the-art black-box methods for generating jailbreaks while requiring fewer queries. Furthermore, TAP is also capable of jailbreaking LLMs protected by state-of-the-art guardrails, e.g., LlamaGuard.
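The abstract fixes the shape of the algorithm (branch with an attacker LLM, prune, then query the target), so a hedged sketch of the loop may help; every function name below is a hypothetical stand-in, and the paper's exact prompting, scoring, and tree management differ.

def tap(goal, attacker_refine, likely_jailbreak, query_target, is_jailbroken,
        branching=4, depth=10):
    # Sketch of the TAP loop described in the abstract. attacker_refine is
    # the attacker LLM, likely_jailbreak the pruning/assessment step,
    # query_target the black-box target, is_jailbroken the success check.
    candidates = [goal]                        # root of the attack tree
    for _ in range(depth):
        # Branch: the attacker LLM proposes refinements of each candidate.
        children = [attacker_refine(p) for p in candidates
                    for _ in range(branching)]
        # Prune: drop prompts judged unlikely to jailbreak before querying
        # the target; this is what keeps the query count low.
        children = [p for p in children if likely_jailbreak(p)]
        for prompt in children:
            response = query_target(prompt)
            if is_jailbroken(response):
                return prompt, response
        candidates = children or candidates    # keep exploring if all pruned
    return None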
Neural Conditional Probability for Uncertainty Quantification
We introduce Neural Conditional Probability (NCP), an operator-theoretic approach to learning conditional distributions with a focus on statistical inference tasks. NCP can be used to build conditional confidence regions and extract key statistics such as conditional quantiles, mean, and covariance. It offers streamlined learning via a single unconditional training phase, allowing efficient inference without the need for retraining even when the conditioning changes. By leveraging the approximation capabilities of neural networks, NCP efficiently handles a wide variety of complex probability distributions. We provide theoretical guarantees that ensure both optimization consistency and statistical accuracy. In experiments, we show that NCP with a 2-hidden-layer network matches or outperforms leading methods. This demonstrates that a minimalistic architecture with a theoretically grounded loss can achieve competitive results, even when compared with more complex architectures.
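To illustrate the train-once, infer-anywhere workflow the abstract describes, here is one possible post-hoc inference sketch: if the trained model can supply nonnegative weights over unconditional samples of Y for any conditioning point x, conditional statistics follow by reweighting. This reading and every name below are our assumptions, not NCP's actual estimator.

import numpy as np

def conditional_stats(weights, y_samples, quantiles=(0.05, 0.5, 0.95)):
    # Hypothetical inference step: `weights` are assumed to be w_i(x) >= 0,
    # produced by the trained model for a conditioning point x, over
    # unconditional samples y_i seen in the single training phase.
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    y = np.asarray(y_samples, dtype=float)
    mean = float(np.sum(w * y))                 # reweighted conditional mean
    order = np.argsort(y)
    cdf = np.cumsum(w[order])                   # weighted empirical CDF
    qs = {q: float(y[order][np.searchsorted(cdf, q)]) for q in quantiles}
    return mean, qs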