AITopics

af790b7ae573771689438bbcfc5933fe-Paper-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 12:17:39 GMT

artificial intelligence, bayesian inference, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Confidence Regulation Neurons in Language Models

Neural Information Processing SystemsMar-27-2025, 12:17:28 GMT

Despite their widespread use, the mechanisms by which large language models (LLMs) represent and regulate uncertainty in next-token predictions remain largely unexplored. This study investigates two critical components believed to influence this uncertainty: the recently discovered entropy neurons and a new set of components that we term token frequency neurons. Entropy neurons are characterized by an unusually high weight norm and influence the final layer normalization (LayerNorm) scale to effectively scale down the logits. Our work shows that entropy neurons operate by writing onto an unembedding null space, allowing them to impact the residual stream norm with minimal direct effect on the logits themselves. We observe the presence of entropy neurons across a range of models, up to 7 billion parameters. On the other hand, token frequency neurons, which we discover and describe here for the first time, boost or suppress each token's logit proportionally to its log frequency, thereby shifting the output distribution towards or away from the unigram distribution. Finally, we present a detailed case study where entropy neurons actively manage confidence in the setting of induction, i.e. detecting and continuing repeated subsequences.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.92)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > UAE (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Active Observing in Continuous-time Control

Neural Information Processing SystemsMar-27-2025, 12:17:20 GMT

The control of continuous-time environments while actively deciding when to take costly observations in time is a crucial yet unexplored problem, particularly relevant to real-world scenarios such as medicine, low-power systems, and resource management. Existing approaches either rely on continuous-time control methods that take regular, expensive observations in time or discrete-time control with costly observation methods, which are inapplicable to continuous-time settings due to the compounding discretization errors introduced by time discretization. In this work, we are the first to formalize the continuous-time control problem with costly observations. Our key theoretical contribution shows that observing at regular time intervals is not optimal in certain environments, while irregular observation policies yield higher expected utility. This perspective paves the way for the development of novel methods that can take irregular observations in continuous-time control with costly observations. We empirically validate our theoretical findings in various continuous-time environments, including a cancer simulation, by constructing a simple initial method to solve this new problem, with a heuristic threshold on the variance of reward rollouts in an offline continuous-time model-based model predictive control (MPC) planner. Although determining the optimal method remains an open problem, our work offers valuable insights and understanding of this unique problem, laying the foundation for future research in this area.

dynamic model, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)
(3 more...)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Active Observing in Continuous-time Control

Neural Information Processing SystemsMar-27-2025, 12:17:16 GMT

The control of continuous-time environments while actively deciding when to take costly observations in time is a crucial yet unexplored problem, particularly relevant to real-world scenarios such as medicine, low-power systems, and resource management. Existing approaches either rely on continuous-time control methods that take regular, expensive observations in time or discrete-time control with costly observation methods, which are inapplicable to continuous-time settings due to the compounding discretization errors introduced by time discretization. In this work, we are the first to formalize the continuous-time control problem with costly observations. Our key theoretical contribution shows that observing at regular time intervals is not optimal in certain environments, while irregular observation policies yield higher expected utility. This perspective paves the way for the development of novel methods that can take irregular observations in continuous-time control with costly observations. We empirically validate our theoretical findings in various continuous-time environments, including a cancer simulation, by constructing a simple initial method to solve this new problem, with a heuristic threshold on the variance of reward rollouts in an offline continuous-time model-based model predictive control (MPC) planner. Although determining the optimal method remains an open problem, our work offers valuable insights and understanding of this unique problem, laying the foundation for future research in this area.

dynamic model, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)
(3 more...)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Tolerant Algorithms for Learning with Arbitrary Covariate Shift

Neural Information Processing SystemsMar-27-2025, 12:17:09 GMT

We study the problem of learning under arbitrary distribution shift, where the learner is trained on a labeled set from one distribution but evaluated on a different, potentially adversarially generated test distribution. We focus on two frameworks: PQ learning [GKKM20], allowing abstention on adversarially generated parts of the test distribution, and TDS learning [KSV24b], permitting abstention on the entire test distribution if distribution shift is detected. All prior known algorithms either rely on learning primitives that are computationally hard even for simple function classes, or end up abstaining entirely even in the presence of a tiny amount of distribution shift. We address both these challenges for natural function classes, including intersections of halfspaces and decision trees, and standard training distributions, including Gaussians. For PQ learning, we give efficient learning algorithms, while for TDS learning, our algorithms can tolerate moderate amounts of distribution shift. At the core of our approach is an improved analysis of spectral outlier-removal techniques from learning with nasty noise. Our analysis can (1) handle arbitrarily large fraction of outliers, which is crucial for handling arbitrary distribution shifts, and (2) obtain stronger bounds on polynomial moments of the distribution after outlier removal, yielding new insights into polynomial regression under distribution shifts. Lastly, our techniques lead to novel results for tolerant testable learning [RV23], and learning with nasty noise.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.27)
North America > Canada (0.27)

Genre: Research Report > Experimental Study (0.92)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)

Add feedback

Open-Vocabulary Object Detection via Language Hierarchy

Neural Information Processing SystemsMar-27-2025, 12:17:02 GMT

Recent studies on generalizable object detection have attracted increasing attention with additional weak supervision from large-scale datasets with image-level labels. However, weakly-supervised detection learning often suffers from image-to-box label mismatch, i.e., image-level labels do not convey precise object information. We design Language Hierarchical Self-training (LHST) that introduces language hierarchy into weakly-supervised detector training for learning more generalizable detectors. LHST expands the image-level labels with language hierarchy and enables co-regularization between the expanded labels and self-training. Specifically, the expanded labels regularize self-training by providing richer supervision and mitigating the image-to-box label mismatch, while self-training allows assessing and selecting the expanded labels according to the predicted reliability. In addition, we design language hierarchical prompt generation that introduces language hierarchy into prompt generation which helps bridge the vocabulary gaps between training and testing. Extensive experiments show that the proposed techniques achieve superior generalization performance consistently across 14 widely studied object detection datasets.

artificial intelligence, dataset, machine learning, (12 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Transportation (0.68)
Information Technology (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

af66ac99716a64476c07ae8b089d59f8-Paper-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 12:17:00 GMT

artificial intelligence, machine learning, perturbation, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report (0.68)

Industry:

Government (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

Add feedback

Garmin Connect Plus brings AI to your wrist, but it's not free

ZDNetMar-27-2025, 12:16:50 GMT

The main reason I always have a Garmin watch on my wrist is due to the comprehensive ecosystem powered by Garmin Connect. This service will continue to provide a free, personalized experience, via its phone app and website, but Garmin is now launching another option, called Garmin Connect Plus, for those who want more from their devices. Garmin Connect Plus is an optional subscription service for 6.99 monthly or 69.99 annually. There's a free 30-day trial to test the service. It has Active Intelligence, powered by AI, which is designed to give you insights personalized to your daily health and activity data collected by various Garmin devices.

artificial intelligence, garmin connect, garmin connect plus bring ai, (7 more...)

ZDNet

Technology: Information Technology > Artificial Intelligence (0.40)

Add feedback

903f778fe1341e5351b5b63e0e6b197f-Supplemental-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 12:16:46 GMT

artificial intelligence, machine learning, point cloud, (13 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Slagle - 2024 - SpaceByte: Towards Deleting Tokenization from Large Language Modeling

Kevin Slagle

Neural Information Processing SystemsMar-27-2025, 12:14:11 GMT

Tokenization is widely used in large language models because it significantly improves performance. However, tokenization imposes several disadvantages, such as performance biases, increased adversarial vulnerability, decreased characterlevel modeling performance, and increased modeling complexity. To address these disadvantages without sacrificing performance, we propose SpaceByte, a novel byte-level decoder architecture that closes the performance gap between byte-level and subword autoregressive language modeling. SpaceByte consists of a byte-level Transformer model, but with extra larger transformer blocks inserted in the middle of the layers. We find that performance is significantly improved by applying these larger blocks only after certain bytes, such as space characters, which typically denote word boundaries. Our experiments show that for a fixed training and inference compute budget, SpaceByte outperforms other byte-level architectures and roughly matches the performance of tokenized Transformer architectures.

large language model, machine learning, spacebyte, (22 more...)

Neural Information Processing Systems

Country: