
5 AI prompts to put serious money in your pocket

FOX News

A majority of small businesses are using artificial intelligence and finding out it can save time and money. So, you want to start making money using AI but you're not trying to build Skynet or learn 15 coding languages first? Good, because neither am I. You don't need to become the next Sam Altman or have a Ph.D. in machine learning to turn artificial intelligence into real income. What you do need is curiosity, a dash of creativity, and the right prompts.


Google AI Overviews still struggles to answer basic questions and count

Mashable

Remember those old-school sports and action movies -- think Billy Bob in Varsity Blues -- where they'd ask dazed people simple questions to see if they're concussed? How many fingers am I holding up? Or, what year is it? Well, even by that low, low standard, Google's AI Overviews may not pass concussion protocol. This week, folks noticed that Google's AI Overviews couldn't reliably discern that the year was, in fact, 2025. There were a number of posts about it online.


SeTAR: Out-of-Distribution Detection with Selective Low-Rank Approximation

Neural Information Processing Systems

Out-of-distribution (OOD) detection is crucial for the safe deployment of neural networks. Existing CLIP-based approaches perform OOD detection by devising novel scoring functions or sophisticated fine-tuning methods. In this work, we propose SeTAR, a novel, training-free OOD detection method that leverages selective low-rank approximation of weight matrices in vision-language and vision-only models. SeTAR enhances OOD detection via post-hoc modification of the model's weight matrices using a simple greedy search algorithm. Based on SeTAR, we further propose SeTAR+FT, a fine-tuning extension optimizing model performance for OOD detection tasks. Extensive evaluations on ImageNet1K and Pascal-VOC benchmarks show SeTAR's superior performance, reducing the relative false positive rate by up to 18.95% and 36.80%, respectively.
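
The core operation here, replacing a weight matrix with a low-rank approximation, is easy to illustrate. The sketch below shows a rank-k truncation via SVD; the rank, the matrix size, and the variable names are illustrative assumptions, and the paper's greedy search over which matrices and ranks to modify is not reproduced.

```python
# Minimal sketch: low-rank approximation of a weight matrix via truncated SVD.
# This illustrates the kind of post-hoc weight modification SeTAR performs;
# the paper's greedy search over layers and ranks is omitted.
import numpy as np

def low_rank_approx(W: np.ndarray, rank: int) -> np.ndarray:
    """Return the best rank-`rank` approximation of W in Frobenius norm."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] @ np.diag(S[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)
W = rng.normal(size=(768, 768))          # stand-in for a projection weight
W_hat = low_rank_approx(W, rank=256)     # illustrative rank, not the paper's
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))  # relative error
```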


Order-Independence Without Fine Tuning

Neural Information Processing Systems

The development of generative language models that can create long and coherent textual outputs via autoregression has led to a proliferation of uses and a corresponding sweep of analyses as researchers work to determine the limitations of this new paradigm. Unlike humans, these 'Large Language Models' (LLMs) are highly sensitive to small changes in their inputs, leading to unwanted inconsistency in their behavior. One problematic inconsistency when LLMs are used to answer multiple-choice questions or analyze multiple inputs is order dependency: the output of an LLM can (and often does) change significantly when sub-sequences are swapped, despite both orderings being semantically identical. In this paper we present Set-Based Prompting, a technique that guarantees the output of an LLM will not have order dependence on a specified set of sub-sequences. We show that this method provably eliminates order dependency, and that it can be applied to any transformer-based LLM to enable text generation that is unaffected by re-orderings. Delving into the implications of our method, we show that, despite our inputs being out of distribution, the impact on expected accuracy is small, where the expectation is taken over uniformly chosen orderings of the candidate responses, and the impact is usually significantly smaller in practice. Thus, Set-Based Prompting can be used as a 'drop-in' method on fully trained models. Finally, we discuss how our method's success suggests that other strong guarantees can be obtained on LLM performance via modifying the input representations.
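
The mechanism the abstract describes can be sketched at the input level: give every candidate sub-sequence the same positional indices and mask attention between sub-sequences, so no ordering information reaches the model. The sketch below builds only these inputs; the function name, lengths, and mask convention are assumptions for illustration, not the paper's code.

```python
# Sketch of order-independent inputs in the spirit of Set-Based Prompting:
# parallel sub-sequences share positional indices and are masked off from
# one another, so their ordering cannot affect the model's output.
import numpy as np

def set_based_inputs(prefix_len, option_lens):
    total = prefix_len + sum(option_lens)
    pos = np.zeros(total, dtype=int)
    pos[:prefix_len] = np.arange(prefix_len)
    # Every option starts at the same position, as if each came "next".
    start, spans = prefix_len, []
    for length in option_lens:
        pos[start:start + length] = prefix_len + np.arange(length)
        spans.append((start, start + length))
        start += length
    # Causal mask within the prefix and within each option; options attend
    # to the prefix but never to each other.
    mask = np.zeros((total, total), dtype=bool)
    mask[:prefix_len, :prefix_len] = np.tril(
        np.ones((prefix_len, prefix_len), dtype=bool))
    for a, b in spans:
        mask[a:b, :prefix_len] = True                            # option -> prefix
        mask[a:b, a:b] = np.tril(np.ones((b - a, b - a), bool))  # within option
    return pos, mask

pos, mask = set_based_inputs(prefix_len=4, option_lens=[3, 2])
print(pos)  # both options share positions 4.. regardless of their order
```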


Hugging Face's new humanoid robot HopeJr may only cost $3,000

Mashable

If you want a robot assistant to live in your home and act vaguely like a human, you might be in luck. Hugging Face, a company that largely specializes in machine learning but has branched out into robotics recently, has a new humanoid robot called HopeJr coming out potentially by the end of 2025. As you can see in a video posted to X, it has a pretty wide range of movement capabilities. Per TechCrunch, it is specifically capable of 66 independent movements. The caption on the video claims it is capable of walking and "manipulating many objects," though we don't get to see the bot walk in the video.


MinMax Methods for Optimal Transport and Beyond: Regularization, Approximation and Numerics

Neural Information Processing Systems

We study MinMax solution methods for a general class of optimization problems related to (and including) optimal transport. Theoretically, the focus is on fitting a large class of problems into a single MinMax framework and generalizing regularization techniques known from classical optimal transport. We show that regularization techniques justify the use of neural networks to solve such problems by proving approximation theorems and illustrating fundamental issues if no regularization is used. We further study the relation to the literature on generative adversarial nets, and analyze which algorithmic techniques used therein are particularly suitable to the class of problems studied in this paper. Several numerical experiments showcase the generality of the setting and highlight which theoretical insights are most beneficial in practice.
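
For concreteness, the best-known regularization from classical optimal transport is the entropic one, solvable by Sinkhorn iterations. The sketch below shows that classical baseline, not the paper's MinMax neural formulation; the problem sizes and the regularization strength are illustrative.

```python
# Minimal Sinkhorn sketch: entropy-regularized optimal transport between two
# discrete distributions. This is the classical regularization the abstract
# alludes to, not the paper's MinMax neural scheme.
import numpy as np

def sinkhorn(a, b, C, eps=0.1, iters=200):
    """a, b: source/target weights; C: cost matrix; eps: regularization."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # transport plan

n = 5
rng = np.random.default_rng(1)
x, y = rng.normal(size=n), rng.normal(size=n)
C = (x[:, None] - y[None, :]) ** 2       # squared-distance cost
P = sinkhorn(np.full(n, 1 / n), np.full(n, 1 / n), C)
print(P.sum(), (P * C).sum())            # total mass ~1, regularized OT cost
```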


Generative Retrieval Meets Multi-Graded Relevance

Neural Information Processing Systems

Generative retrieval represents a novel approach to information retrieval. It uses an encoder-decoder architecture to directly produce relevant document identifiers (docids) for queries. While this method offers benefits, current approaches are limited to scenarios with binary relevance data, overlooking the potential for documents to have multi-graded relevance. Extending generative retrieval to accommodate multi-graded relevance poses challenges, including the need to reconcile likelihood probabilities for docid pairs and the possibility of multiple relevant documents sharing the same identifier.
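
The scoring step the abstract describes can be sketched with any off-the-shelf seq2seq model: rank candidate docid strings by their likelihood given the query. The model choice, prompt, and docid scheme below are placeholders, and the paper's multi-graded training objective is not shown.

```python
# Sketch: scoring docid strings with a seq2seq model, the core move of
# generative retrieval. "t5-small" and the docid strings are placeholders;
# a real system would train the model to emit docids and decode them with
# prefix-constrained beam search.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

query = "retrieve: effects of caffeine on sleep"
docids = ["doc-102-3", "doc-88-1", "doc-7-9"]   # hypothetical identifiers

scores = {}
with torch.no_grad():
    for d in docids:
        enc = tok(query, return_tensors="pt")
        labels = tok(d, return_tensors="pt").input_ids
        loss = model(**enc, labels=labels).loss  # mean NLL per target token
        scores[d] = -loss.item()                 # higher = more likely

print(max(scores, key=scores.get))               # best-scoring docid
```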


Does Worst-Performing Agent Lead the Pack? Analyzing Agent Dynamics in Unified Distributed SGD

Neural Information Processing Systems

Distributed learning is essential to train machine learning algorithms across heterogeneous agents while maintaining data privacy. We conduct an asymptotic analysis of Unified Distributed SGD (UD-SGD), exploring a variety of communication patterns, including decentralized SGD and local SGD within Federated Learning (FL), as well as the increasing communication interval in the FL setting. In this study, we assess how different sampling strategies, such as i.i.d. sampling, shuffling, and Markovian sampling, affect the convergence of UD-SGD.
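
To make the communication patterns concrete, here is a toy local-SGD (FedAvg-style) simulation: a few agents take several local gradient steps on their own data, then average parameters. The data, step size, and fixed communication interval are illustrative assumptions, not the paper's setting; an increasing interval would grow `local_steps` over rounds.

```python
# Toy local SGD / FedAvg: agents take local gradient steps on private data,
# then communicate by averaging parameters. All values are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n_agents, dim, local_steps, rounds, lr = 4, 3, 5, 50, 0.05
w_true = rng.normal(size=dim)

# Heterogeneous agents: each holds its own samples of the same linear model.
data = []
for _ in range(n_agents):
    X = rng.normal(size=(32, dim))
    data.append((X, X @ w_true + 0.1 * rng.normal(size=32)))

w = np.zeros(dim)
for _ in range(rounds):
    local_models = []
    for X, y in data:
        wi = w.copy()
        for _ in range(local_steps):        # local SGD steps between rounds
            i = rng.integers(len(y))
            grad = (wi @ X[i] - y[i]) * X[i]  # squared-loss gradient
            wi -= lr * grad
        local_models.append(wi)
    w = np.mean(local_models, axis=0)       # communication: parameter average

print(np.linalg.norm(w - w_true))           # distance to the true model
```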


Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers

Neural Information Processing Systems

Despite the remarkable empirical performance of transformers, their theoretical understanding remains elusive. Here, we consider a deep multi-head self-attention network that is closely related to transformers yet analytically tractable. We develop a statistical mechanics theory of Bayesian learning in this model, deriving exact equations for the network's predictor statistics in the finite-width thermodynamic limit, i.e., N, P → ∞ with P/N = O(1), where N is the network width and P is the number of training examples. Our theory shows that the predictor statistics are expressed as a sum of independent kernels, each one pairing different attention paths, defined as information pathways through different attention heads across layers. The kernels are weighted according to a task-relevant kernel combination mechanism that aligns the total kernel with the task labels. As a consequence, this interplay between attention paths enhances generalization performance. Experiments confirm our findings on both synthetic and real-world sequence classification tasks. Finally, our theory explicitly relates the kernel combination mechanism to properties of the learned weights, allowing for a qualitative transfer of its insights to models trained via gradient descent. As an illustration, we demonstrate an efficient size reduction of the network by pruning those attention heads that are deemed less relevant by our theory.
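
The kernel combination idea can be made concrete with standard kernel-target alignment: weight each candidate kernel by how well it aligns with the labels, then sum. The alignment-based weighting below is a common stand-in, not the paper's exact mechanism, and all shapes are illustrative.

```python
# Illustrative kernel combination: weight candidate "path" kernels by their
# alignment with the task labels, then sum them into one task-aligned kernel.
import numpy as np

def alignment(K, y):
    """Kernel-target alignment between kernel K and labels y."""
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

rng = np.random.default_rng(3)
n, n_paths = 40, 3
y = np.sign(rng.normal(size=n))          # toy binary labels

# Candidate PSD kernels, e.g., one per attention path.
kernels = []
for _ in range(n_paths):
    F = rng.normal(size=(n, 8))
    kernels.append(F @ F.T)

w = np.array([max(alignment(K, y), 0.0) for K in kernels])
w /= w.sum()
K_total = sum(wi * Ki for wi, Ki in zip(w, kernels))  # weighted combination
print(w)  # more task-aligned paths receive larger weights
```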


Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

Neural Information Processing Systems

Recent works have shown the remarkable superiority of transformer models in reinforcement learning (RL), where the decision-making problem is formulated as sequential generation. Transformer-based agents can exhibit self-improvement in online environments when provided with task contexts, such as multiple trajectories, a paradigm called in-context RL. However, due to the quadratic computational complexity of attention in transformers, current in-context RL methods suffer from huge computational costs as the task horizon increases. In contrast, the Mamba model is renowned for its efficient processing of long-term dependencies, which provides an opportunity for in-context RL to solve tasks that require long-term memory. To this end, we first implement Decision Mamba (DM) by replacing the backbone of Decision Transformer (DT) with Mamba.
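
For readers unfamiliar with the sequence formulation, the sketch below shows how a trajectory becomes the (return-to-go, state, action) token stream that Decision Transformer, and by extension Decision Mamba, models autoregressively; the rewards, states, and actions are toy values.

```python
# Sketch of the Decision-Transformer-style sequence format that Decision
# Mamba inherits: each timestep contributes (return-to-go, state, action)
# tokens, and the model learns to predict the next action. Values are toy.
import numpy as np

rewards = np.array([1.0, 0.0, 2.0, 1.0])
states = np.arange(4)                    # placeholder discrete states
actions = np.array([0, 1, 1, 0])

# Return-to-go: the sum of future rewards from each timestep onward.
rtg = np.cumsum(rewards[::-1])[::-1]

sequence = []
for g, s, a in zip(rtg, states, actions):
    sequence += [("R", float(g)), ("s", int(s)), ("a", int(a))]

print(sequence[:6])  # the autoregressive backbone consumes this stream
```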