A Walsh Hadamard Derived Linear Vector Symbolic Architecture
Vector Symbolic Architectures (VSAs) support a binding operation that is commutative and associative, along with an inverse operation, allowing one to construct symbolic-style manipulations over real-valued vectors. Most VSAs were developed before deep learning and automatic differentiation became popular and instead focused on efficacy in hand-designed systems. In this work, we introduce the Hadamard-derived linear Binding (HLB), which is designed to have favorable computational efficiency, to be effective in classic VSA tasks, and to perform well in differentiable systems.
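To make these algebraic properties concrete, here is a minimal sketch using MAP-style element-wise multiplication as a stand-in binding operator; the exact HLB binding and unbinding operators are defined in the paper itself and may differ.

```python
# Illustrative sketch of the VSA binding properties described above, using
# element-wise multiplication over random bipolar hypervectors. This is a
# generic stand-in, not the HLB operator from the paper.
import numpy as np

rng = np.random.default_rng(0)
d = 4096                                         # hypervector dimensionality (illustrative)
x, y, z = rng.choice([-1.0, 1.0], size=(3, d))   # random bipolar hypervectors

bind = lambda a, b: a * b                        # element-wise product

assert np.allclose(bind(x, y), bind(y, x))                    # commutativity
assert np.allclose(bind(bind(x, y), z), bind(x, bind(y, z)))  # associativity

# Inverse: a bipolar vector is its own inverse under element-wise product,
# so binding with y a second time exactly recovers x.
assert np.allclose(bind(bind(x, y), y), x)
```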
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Mobile device operation tasks are increasingly becoming a popular multi-modal AI application scenario. Current Multi-modal Large Language Models (MLLMs), constrained by their training data, lack the capability to function effectively as operation assistants. Instead, MLLM-based agents, which enhance capabilities through tool invocation, are gradually being applied to this scenario. However, the two major navigation challenges in mobile device operation tasks, task progress navigation and focus content navigation, are difficult to solve effectively under the single-agent architecture of existing work. This is due to the overly long token sequences and the interleaved text-image data format, which limit performance.
Credit Attribution and Stable Compression
Roi Livni, Shay Moran
Credit attribution is crucial across various fields. In academic research, proper citation acknowledges prior work and establishes original contributions. Similarly, in generative models, such as those trained on existing artworks or music, it is important to ensure that any generated content influenced by these works appropriately credits the original creators. We study credit attribution by machine learning algorithms. We propose new definitions, relaxations of Differential Privacy, that weaken the stability guarantees for a designated subset of datapoints. These datapoints can be used non-stably with permission from their owners, potentially in exchange for compensation. Meanwhile, each of the remaining datapoints is guaranteed to have no significant influence on the algorithm's output. Our framework extends well-studied notions of stability, including Differential Privacy (the case where the designated subset is empty), differentially private learning with public data (where the public datapoints are fixed in advance), and stable sample compression (where the datapoints are selected adaptively by the algorithm). We examine the expressive power of these stability notions within the PAC learning framework, provide a comprehensive characterization of learnability for algorithms adhering to these principles, and propose directions and questions for future research.
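One plausible formalization of such a relaxation, written in hypothetical notation (the paper's precise definition may differ):

```latex
% Let F \subseteq S denote the designated subset of datapoints that may be
% used non-stably. An algorithm A is (\varepsilon, \delta)-stable outside F
% if, for all neighboring datasets S, S' differing in a single datapoint
% z \notin F, and every measurable event E,
\[
  \Pr[A(S) \in E] \;\le\; e^{\varepsilon}\,\Pr[A(S') \in E] + \delta .
\]
% Differential Privacy is recovered as the special case F = \emptyset,
% i.e., when no datapoints are exempt from the stability guarantee.
```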
Beyond accuracy: Tracking more like Human via Visual Search
Xuchen Li
Human visual search ability enables efficient and accurate tracking of an arbitrary moving target, which is a significant research interest in cognitive neuroscience. The recently proposed Central-Peripheral Dichotomy (CPD) theory sheds light on how humans effectively process visual information and track moving targets in complex environments. However, existing visual object tracking algorithms still fall short of matching human performance in maintaining tracking over time, particularly in complex scenarios requiring robust visual search skills. These scenarios often involve Spatio-Temporal Discontinuities (i.e., STDChallenge), prevalent in long-term tracking and global instance tracking. To address this issue, we conduct research from a human-like modeling perspective: (1) Inspired by the CPD, we propose a new tracker named CPDTrack to achieve human-like visual search ability. The central vision of CPDTrack leverages the spatio-temporal continuity of videos to introduce priors and enhance localization precision, while the peripheral vision improves global awareness and detects object movements.
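A minimal, hypothetical sketch of the central/peripheral split described above (function names and sizes are illustrative, not the actual CPDTrack implementation): the central branch processes a high-resolution crop around the previous target location, exploiting spatio-temporal continuity, while the peripheral branch processes a coarse view of the whole frame for global awareness.

```python
# Hypothetical two-view preprocessing inspired by the CPD description.
import numpy as np

def central_crop(frame, center, size=128):
    """High-resolution crop centered on the previous target position."""
    h, w = frame.shape[:2]
    cx, cy = center
    x0 = int(np.clip(cx - size // 2, 0, w - size))
    y0 = int(np.clip(cy - size // 2, 0, h - size))
    return frame[y0:y0 + size, x0:x0 + size]

def peripheral_view(frame, stride=8):
    """Coarse, downsampled view of the full frame for global awareness."""
    return frame[::stride, ::stride]

frame = np.zeros((720, 1280, 3), dtype=np.uint8)   # dummy video frame
prev_center = (640, 360)                           # previous frame's estimate
central = central_crop(frame, prev_center)         # precise-localization input
peripheral = peripheral_view(frame)                # movement-detection input
```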
The Good Robot podcast: Transhumanist fantasies with Alexander Thomas
Hosted by Eleanor Drage and Kerry McInerney, The Good Robot is a podcast which explores the many complex intersections between gender, feminism and technology. In this episode, Eleanor talks to Alexander Thomas, a filmmaker and academic who leads the BA in Media Production at the University of East London. They discuss his new book about transhumanism, a philosophical movement that aims to improve human capabilities through technology and whose followers include Jeff Bezos, Elon Musk, Larry Page, and, apparently, the DJ Steve Aoki. Alex is himself one of the foremost commentators on transhumanism. He explores transhumanist fantasies about the future of the human, which are obsessed with the extremes of possibility: transhumanists believe that AI will bring us either radical abundance or total extinction.
BERTs are Generative In-Context Learners
While in-context learning is commonly associated with causal language models, such as GPT, we demonstrate that this capability also 'emerges' in masked language models. Through an embarrassingly simple inference technique, we enable an existing masked model, DeBERTa, to perform generative tasks without additional training or architectural changes. Our evaluation reveals that masked and causal language models behave very differently, as each clearly outperforms the other on different categories of tasks. These complementary strengths suggest that the field's focus on causal models for in-context learning may be limiting: both architectures can develop these capabilities, but with distinct advantages, pointing toward promising hybrid approaches that combine the strengths of both objectives.
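A minimal sketch of one simple way to generate from a masked LM: append a [MASK] token, predict it from the left context, commit the prediction, and repeat. The paper's exact inference technique may differ in its details; the checkpoint choice and greedy decoding here are illustrative assumptions.

```python
# Sketch: greedy, one-token-at-a-time generation with a masked LM.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "microsoft/deberta-v3-base"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

ids = tokenizer("The capital of France is", return_tensors="pt",
                add_special_tokens=False).input_ids
mask_id = tokenizer.mask_token_id

with torch.no_grad():
    for _ in range(5):  # generate five tokens
        # Append a mask token and predict it from the full left context.
        ids = torch.cat([ids, torch.tensor([[mask_id]])], dim=1)
        logits = model(ids).logits
        ids[0, -1] = logits[0, -1].argmax()  # commit the greedy prediction

print(tokenizer.decode(ids[0], skip_special_tokens=True))
```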
Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems
Reinforcement Learning (RL) is a powerful method for controlling dynamic systems, but its learning mechanism can lead to unpredictable actions that undermine the safety of critical systems. Here, we propose RL with Adaptive Regularization (RL-AR), an algorithm that enables safe RL exploration by combining the RL policy with a policy regularizer that hard-codes the safety constraints. RL-AR performs policy combination via a "focus module," which determines the appropriate combination depending on the state: it relies more on the safe policy regularizer in less-exploited states while allowing unbiased convergence in well-exploited states. In a series of critical control applications, we demonstrate that RL-AR not only ensures safety during training but also achieves returns competitive with those of model-free RL methods that disregard safety.
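A minimal sketch of such a state-dependent policy combination (the names and the exact combination rule are assumptions, not the paper's implementation): a focus weight beta(s) in [0, 1] mixes the safe regularizer policy with the learned RL policy.

```python
# Sketch: blending a safe regularizer policy with a learned RL policy
# via a state-dependent focus weight.
import numpy as np

def combined_action(state, rl_policy, safe_policy, focus):
    """Lean on the safe policy where the state is under-explored
    (beta near 1) and on the RL policy where it is well-explored
    (beta near 0)."""
    beta = focus(state)
    return beta * safe_policy(state) + (1.0 - beta) * rl_policy(state)

# Toy stand-ins for illustration only:
rl_policy   = lambda s: np.tanh(s)               # learned, possibly unsafe
safe_policy = lambda s: np.clip(s, -0.1, 0.1)    # hard-coded, conservative
visits = {}                                      # crude exploitation counter

def focus(state):
    n = visits.get(round(float(state), 1), 0)
    return 1.0 / (1.0 + n)                       # more visits -> trust RL more

print(combined_action(0.5, rl_policy, safe_policy, focus))
```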
A. OpenXLand Components
In this environment, the agent receives reward when the orange sphere makes contact with the blue pyramid. We see that the orange sphere is elevated, so the agent must find it and use the ramps to reach it. As for the blue pyramid, we do not see it because it is not there: the agent must first bring the orange sphere near the black rounded cube to spawn one. This environment also contains a grey pyramid that serves as a distraction. Importantly, if the agent brings the grey pyramid near the black rounded cube, both will disappear, making it impossible for the agent to spawn a blue pyramid and subsequently obtain its reward, as the sketch below illustrates.
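A hypothetical sketch of these production rules (object and method names are illustrative; OpenXLand specifies its rules in its own configuration format):

```python
# Sketch of the spawn/despawn/reward rules described above.
from dataclasses import dataclass, field

@dataclass
class World:
    objects: set = field(default_factory=lambda: {
        "orange_sphere", "black_rounded_cube", "grey_pyramid"})
    reward: float = 0.0

    def near(self, a, b):
        """Rule trigger: object a is brought near object b."""
        if {a, b} <= self.objects:
            if {a, b} == {"orange_sphere", "black_rounded_cube"}:
                self.objects.add("blue_pyramid")   # spawn the goal object
            elif {a, b} == {"grey_pyramid", "black_rounded_cube"}:
                self.objects -= {a, b}             # both vanish: task now unsolvable

    def contact(self, a, b):
        """Reward trigger: orange sphere touches blue pyramid."""
        if {a, b} == {"orange_sphere", "blue_pyramid"} and {a, b} <= self.objects:
            self.reward += 1.0

w = World()
w.near("orange_sphere", "black_rounded_cube")  # spawns the blue pyramid
w.contact("orange_sphere", "blue_pyramid")     # agent receives reward
print(w.reward)                                # 1.0
```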