Goto

Collaborating Authors

 mover



Supervised Word Mover's Distance

Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger

Neural Information Processing Systems

Recently, a new document metric called the word mover's distance (WMD) has been proposed with unprecedented results on k NN-based document classification. The WMD elevates high-quality word embeddings to a document metric by formulating the distance between two documents as an optimal transport problem between the embedded words. However, the document distances are entirely unsupervised and lack a mechanism to incorporate supervision when available. In this paper we propose an efficient technique to learn a supervised metric, which we call the Supervised-WMD (S-WMD) metric.


MagBotSim: Physics-Based Simulation and Reinforcement Learning Environments for Magnetic Robotics

Bergmann, Lara, Grothues, Cedric, Neumann, Klaus

arXiv.org Artificial Intelligence

Magnetic levitation is about to revolutionize in-machine material flow in industrial automation. Such systems are flexibly configurable and can include a large number of independently actuated shuttles (movers) that dynamically rebalance production capacity. Beyond their capabilities for dynamic transportation, these systems possess the inherent yet unexploited potential to perform manipulation. By merging the fields of transportation and manipulation into a coordinated swarm of magnetic robots (MagBots), we enable manufacturing systems to achieve significantly higher efficiency, adaptability, and compactness. To support the development of intelligent algorithms for magnetic levitation systems, we introduce MagBotSim (Magnetic Robotics Simulation): a physics-based simulation for magnetic levitation systems. By framing magnetic levitation systems as robot swarms and providing a dedicated simulation, this work lays the foundation for next generation manufacturing systems powered by Magnetic Robotics. MagBotSim's documentation, videos, experiments, and code are available at: https://ubi-coro.github.io/MagBotSim/



End-to-End Low-Level Neural Control of an Industrial-Grade 6D Magnetic Levitation System

Hartmann, Philipp, Stranghöner, Jannick, Neumann, Klaus

arXiv.org Artificial Intelligence

Magnetic levitation is poised to revolutionize industrial automation by integrating flexible in-machine product transport and seamless manipulation. It is expected to become the standard drive for automated manufacturing. However, controlling such systems is inherently challenging due to their complex, unstable dynamics. Traditional control approaches, which rely on hand-crafted control engineering, typically yield robust but conservative solutions, with their performance closely tied to the expertise of the engineering team. In contrast, neural control learning presents a promising alternative. This paper presents the first neural controller for 6D magnetic levitation. Trained end-to-end on interaction data from a proprietary controller, it directly maps raw sensor data and 6D reference poses to coil current commands. The neural controller can effectively generalize to previously unseen situations while maintaining accurate and robust control. These results underscore the practical feasibility of learning-based neural control in complex physical systems and suggest a future where such a paradigm could enhance or even substitute traditional engineering approaches in demanding real-world applications. The trained neural controller, source code, and demonstration videos are publicly available at https://sites.google.com/view/neural-maglev.


MOVER: Multimodal Optimal Transport with Volume-based Embedding Regularization

You, Haochen, Liu, Baojing

arXiv.org Artificial Intelligence

Recent advances in multimodal learning have largely relied on pairwise contrastive objectives to align different modalities, such as text, video, and audio, in a shared embedding space. While effective in bi-modal setups, these approaches struggle to generalize across multiple modalities and often lack semantic structure in high-dimensional spaces. In this paper, we propose MOVER, a novel framework that combines optimal transport-based soft alignment with volume-based geometric regularization to build semantically aligned and structured multimodal representations. By integrating a transport-guided matching mechanism with a geometric volume minimization objective (GAVE), MOVER encourages consistent alignment across all modalities in a modality-agnostic manner. Experiments on text-video-audio retrieval tasks demonstrate that MOVER significantly outperforms prior state-of-the-art methods in both zero-shot and finetuned settings. Additional analysis shows improved generalization to unseen modality combinations and stronger structural consistency in the learned embedding space.


Note: Harnessing Tellurium Nanoparticles in the Digital Realm Plasmon Resonance, in the Context of Brewster's Angle and the Drude Model for Fake News Adsorption in Incomplete Information Games

Kawahata, Yasuko

arXiv.org Artificial Intelligence

This note explores the innovative application of soliton theory and plasmonic phenomena in modeling user behavior and engagement within digital health platforms. By introducing the concept of soliton solutions, we present a novel approach to understanding stable patterns of health improvement behaviors over time. Additionally, we delve into the role of tellurium nanoparticles and their plasmonic properties in adsorbing fake news, thereby influencing user interactions and engagement levels. Through a theoretical framework that combines nonlinear dynamics with the unique characteristics of tellurium nanoparticles, we aim to provide new insights into the dynamics of user engagement in digital health environments. Our analysis highlights the potential of soliton theory in capturing the complex, nonlinear dynamics of user behavior, while the application of plasmonic phenomena offers a promising avenue for enhancing the sensitivity and effectiveness of digital health platforms. This research ventures into an uncharted territory where optical phenomena such as Brewster's Angle and Snell's Law, along with the concept of spin solitons, are metaphorically applied to address the challenge of fake news dissemination. By exploring the analogy between light refraction, reflection, and the propagation of information in digital platforms, we unveil a novel perspective on how the 'angle' at which information is presented can significantly affect its acceptance and spread. Additionally, we propose the use of tellurium nanoparticles to manage 'information waves' through mechanisms akin to plasmonic resonance and soliton dynamics. This theoretical exploration aims to bridge the gap between physical sciences and digital communication, offering insights into the development of strategies for mitigating misinformation.


Plasmon Resonance Model: Investigation of Analysis of Fake News Diffusion Model with Third Mover Intervention Using Soliton Solution in Non-Complete Information Game under Repeated Dilemma Condition

Kawahata, Yasuko

arXiv.org Artificial Intelligence

In this study, we attempt to model the prominent problem of fake news diffusion in modern society using the framework of incomplete information games and nonlinear partial differential equations. In particular, we focus on the plasmon resonance phenomenon, in which fake news diffuses rapidly under certain conditions and causes significant social impact, and aim to theoretically elucidate its mechanism. We also incorporate the concepts of first movers, second movers, and third movers in game theory to explore how their strategies affect the dynamics of fake news diffusion. The proliferation of fake news is a complex process in which truth and misinformation intersect, and its effects reach across political, economic, and social strata. To address this issue, it is essential to understand how fake news is widely accepted and shared. In this study, we liken this diffusion process to the concept of plasmon resonance in physics to model the phenomenon of the rapid amplification of fake Figure 1: Comparison of Third Mover Soliton Solution and news within a particular social group. Plasmon resonance is Parabolic Strategies under Plasmon Influence a resonance phenomenon that occurs when electron density waves interact with light on a metal surface.


OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models

Xu, Hainiu, Zhao, Runcong, Zhu, Lixing, Du, Jinhua, He, Yulan

arXiv.org Artificial Intelligence

Neural Theory-of-Mind (N-ToM), machine's ability to understand and keep track of the mental states of others, is pivotal in developing socially intelligent agents. However, prevalent N-ToM benchmarks have several shortcomings, including the presence of ambiguous and artificial narratives, absence of personality traits and preferences, a lack of questions addressing characters' psychological mental states, and limited diversity in the questions posed. In response to these issues, we construct OpenToM, a new benchmark for assessing N-ToM with (1) longer and clearer narrative stories, (2) characters with explicit personality traits, (3) actions that are triggered by character intentions, and (4) questions designed to challenge LLMs' capabilities of modeling characters' mental states of both the physical and psychological world. Using OpenToM, we reveal that state-of-the-art LLMs thrive at modeling certain aspects of mental states in the physical world but fall short when tracking characters' mental states in the psychological world.


Differentiable Earth Mover's Distance for Data Compression at the High-Luminosity LHC

Shenoy, Rohan, Duarte, Javier, Herwig, Christian, Hirschauer, James, Noonan, Daniel, Pierini, Maurizio, Tran, Nhan, Suarez, Cristina Mantilla

arXiv.org Artificial Intelligence

The Earth mover's distance (EMD) is a useful metric for image recognition and classification, but its usual implementations are not differentiable or too slow to be used as a loss function for training other algorithms via gradient descent. In this paper, we train a convolutional neural network (CNN) to learn a differentiable, fast approximation of the EMD and demonstrate that it can be used as a substitute for computing-intensive EMD implementations. We apply this differentiable approximation in the training of an autoencoder-inspired neural network (encoder NN) for data compression at the high-luminosity LHC at CERN. The goal of this encoder NN is to compress the data while preserving the information related to the distribution of energy deposits in particle detectors. We demonstrate that the performance of our encoder NN trained using the differentiable EMD CNN surpasses that of training with loss functions based on mean squared error.