Free Energy Mixer

Lu, Jiecheng, Yang, Shihao

arXiv.org Machine Learning

Standard attention stores keys/values losslessly but reads them via a per-head convex average, blocking channel-wise selection. We propose the Free Energy Mixer (FEM): a free-energy (log-sum-exp) read that applies a value-driven, per-channel log-linear tilt to a fast prior (e.g., from queries/keys in standard attention) over indices. Unlike methods that attempt to improve and enrich the $(q,k)$ scoring distribution, FEM treats it as a prior and yields a value-aware posterior read at unchanged complexity, smoothly moving from averaging to per-channel selection as the learnable inverse temperature increases, while still preserving parallelism and the original asymptotic complexity ($O(T^2)$ for softmax; $O(T)$ for linearizable variants). We instantiate a two-level gated FEM that is plug-and-play with standard and linear attention, linear RNNs and SSMs. It consistently outperforms strong baselines on NLP, vision, and time-series at matched parameter budgets.
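The abstract's free-energy (log-sum-exp) read can be illustrated with a minimal NumPy sketch. This is one plausible reading of the mechanism, not the paper's implementation: the function name, shapes, and the specific `beta` values are assumptions. The read computes, per channel c, $F_c = \beta^{-1}\log\sum_t p_t\,e^{\beta v_{t,c}}$ over a prior $p$ from the $(q,k)$ scores, interpolating between the convex average (small $\beta$) and per-channel max (large $\beta$):

```python
import numpy as np

def free_energy_read(prior_logits, values, beta):
    """Per-channel free-energy (log-sum-exp) read.

    prior_logits: (T,) unnormalized scores (e.g. q.k / sqrt(d))
    values:       (T, C) value vectors
    beta:         inverse temperature; beta -> 0 recovers the convex
                  average, large beta approaches a per-channel max.
    Returns:      (C,) read vector.
    """
    log_p = prior_logits - np.logaddexp.reduce(prior_logits)  # log prior
    # F_c = (1/beta) * log sum_t p_t * exp(beta * v_{t,c}), in log space
    tilted = log_p[:, None] + beta * values                   # (T, C)
    return np.logaddexp.reduce(tilted, axis=0) / beta

rng = np.random.default_rng(0)
scores = rng.normal(size=6)
vals = rng.normal(size=(6, 4))
p = np.exp(scores - np.logaddexp.reduce(scores))

small = free_energy_read(scores, vals, beta=1e-4)   # ~ weighted mean
large = free_energy_read(scores, vals, beta=200.0)  # ~ per-channel max
print(np.allclose(small, p @ vals, atol=1e-3))      # True
print(np.allclose(large, vals.max(axis=0), atol=0.05))
```

Working in log space (`np.logaddexp.reduce`) keeps the tilt numerically stable for large `beta`, which matches the claim that selection sharpens smoothly as the learnable inverse temperature grows.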


Object Scene Representation Transformer

Neural Information Processing Systems

Figure 3: Example views of scenes from CLEVR-3D (left) and MSN-Easy (right). Figure 6: Novel view (left) with a slot removed (center) or a slot added from another scene (right).


qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices

Chumpitaz-Flores, Pedro, Duong, My, Mao, Ying, Hua, Kaixun

arXiv.org Artificial Intelligence

Clustering on NISQ hardware is constrained by data loading and limited qubits. We present \textbf{qc-kmeans}, a hybrid compressive $k$-means that summarizes a dataset with a constant-size Fourier-feature sketch and selects centroids by solving small per-group QUBOs with shallow QAOA circuits. The QFF sketch estimator is unbiased with mean-squared error $O(\varepsilon^2)$ for $B,S=\Theta(\varepsilon^{-2})$, and the peak-qubit requirement $q_{\text{peak}}=\max\{D,\lceil \log_2 B\rceil + 1\}$ does not scale with the number of samples. A refinement step with elitist retention ensures non-increasing surrogate cost. In Qiskit Aer simulations (depth $p{=}1$), the method ran with $\le 9$ qubits on low-dimensional synthetic benchmarks and achieved competitive sum-of-squared errors relative to quantum baselines; runtimes are not directly comparable. On nine real datasets (up to $4.3\times 10^5$ points), the pipeline maintained constant peak-qubit usage in simulation. Under IBM noise models, accuracy was similar to the idealized setting. Overall, qc-kmeans offers a NISQ-oriented formulation with shallow, bounded-width circuits and competitive clustering quality in simulation.
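The key idea of a constant-size Fourier-feature sketch can be shown with a classical analogue (random Fourier features); this is not the paper's QFF circuit, and the function names, kernel bandwidth, and sketch size `B` are assumptions. The sketch is a length-$B$ vector independent of the sample count $n$, yet it suffices to estimate a centroid's average kernel affinity to the whole dataset:

```python
import numpy as np

def rff_sketch(X, B, sigma=1.0, seed=0):
    """Constant-size random-Fourier-feature sketch of a dataset.

    The sketch length depends only on B, not on the number of
    samples n -- a classical analogue of a compressive summary.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(B, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=B)
    Z = np.sqrt(2.0 / B) * np.cos(X @ W.T + b)   # (n, B) features
    return Z.mean(axis=0), (W, b)                # sketch: (B,) vector

def centroid_affinity(c, sketch, params):
    """Estimate (1/n) * sum_x k(x, c) for the Gaussian kernel from
    the sketch alone -- no access to the raw data."""
    W, b = params
    zc = np.sqrt(2.0 / len(b)) * np.cos(W @ c + b)
    return float(zc @ sketch)

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
sketch, params = rff_sketch(X, B=2048)

# Compare against the exact average Gaussian-kernel affinity at c = 0.
c = np.zeros(2)
exact = np.exp(-0.5 * ((X - c) ** 2).sum(axis=1)).mean()
approx = centroid_affinity(c, sketch, params)
print(abs(exact - approx) < 0.1)
```

The estimator is unbiased over the random feature draw, and its error shrinks like $O(1/\sqrt{B})$, which mirrors the abstract's $O(\varepsilon^2)$ mean-squared error for $B=\Theta(\varepsilon^{-2})$.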


Apriel-H1: Towards Efficient Enterprise Reasoning Models

Ostapenko, Oleksiy, Kumar, Luke, Li, Raymond, Kocetkov, Denis, Lamy-Poirier, Joel, Radhakrishna, Shruthan, Parikh, Soham, Mishra, Shambhavi, Paquet, Sebastien, Sunkara, Srinivas, Bécaert, Valérie, Madhusudhan, Sathwik Tejaswi, Scholak, Torsten

arXiv.org Artificial Intelligence

Large Language Models (LLMs) achieve remarkable reasoning capabilities through transformer architectures with attention mechanisms. However, transformers suffer from quadratic time and memory complexity in the multi-head attention (MHA) module and require caching key-value states during inference, which severely limits throughput and scalability. High inference throughput is critical for agentic tasks, long-context reasoning, efficient deployment under high request loads, and more efficient test-time compute scaling. State Space Models (SSMs) such as Mamba offer a promising alternative with linear inference complexity and a constant memory footprint via recurrent computation with fixed-size hidden states. In this technical report we introduce the Apriel-H1 family of hybrid LLMs that combine transformer attention and SSM sequence mixers for efficient reasoning at the 15B scale. These models are obtained through incremental distillation from a pretrained reasoning transformer, Apriel-Nemotron-15B-Thinker, progressively replacing less critical attention layers with linear Mamba blocks. We release multiple post-distillation variants of Apriel-H1-15B-Thinker with different SSM-to-MHA ratios and analyse how reasoning performance degrades as more Mamba layers replace MHA. Additionally, we release a 30/50 hybrid variant of Apriel-H1, further fine-tuned on a supervised dataset of reasoning traces, achieving over 2x higher inference throughput when deployed in the production-ready vLLM environment, with minimal degradation in reasoning performance. This shows that distilled hybrid SSM-Transformer architectures can deliver substantial efficiency gains over the pretrained transformer equivalent without substantially compromising the reasoning quality.
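The efficiency argument in this abstract rests on SSMs computing with a fixed-size recurrent state. A minimal diagonal linear-recurrence sketch (deliberately much simpler than Mamba, with made-up parameter names) shows why memory per decoding step is constant rather than growing with sequence length:

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Minimal diagonal linear recurrence (not Mamba itself):
        h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t.
    The only state carried between steps is the fixed-size vector h,
    independent of t -- unlike attention, which caches all T
    key/value states during inference.
    """
    h = np.zeros_like(a)
    ys = []
    for xt in x:                # O(T) time, O(1) state
        h = a * h + b * xt
        ys.append(c * h)
    return np.stack(ys)

T, d = 8, 4
x = np.ones((T, d))
a = np.full(d, 0.5)             # decay of the hidden state
b = np.ones(d)
c = np.ones(d)
y = ssm_scan(x, a, b, c)

# With constant input, h_t converges geometrically to b/(1 - a) = 2.
print(np.allclose(y[-1], 2.0, atol=0.01))   # True
```

In a hybrid like Apriel-H1, layers of this recurrent form replace some MHA layers, so the per-token decoding cost of those layers no longer depends on context length.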



CLAQS: Compact Learnable All-Quantum Token Mixer with Shared-ansatz for Text Classification

Chen, Junhao, Zhou, Yifan, Jiang, Hanqi, Pan, Yi, Li, Yiwei, Zhao, Huaqin, Zhang, Wei, Wang, Yingfeng, Liu, Tianming

arXiv.org Artificial Intelligence

Quantum compute is scaling fast, from cloud QPUs to high-throughput GPU simulators, making it timely to prototype quantum NLP beyond toy tasks. However, devices remain qubit- and depth-limited, training can be unstable, and classical attention is compute- and memory-heavy. This motivates compact, phase-aware quantum token mixers that stabilize amplitudes and scale to long sequences. We present CLAQS, a compact, fully quantum token mixer for text classification that jointly learns complex-valued mixing and nonlinear transformations within a unified quantum circuit. To enable stable end-to-end optimization, we apply $\ell_1$ normalization to regulate amplitude scaling and introduce a two-stage parameterized quantum architecture that decouples shared token embeddings from a window-level quantum feed-forward module. Operating under a sliding-window regime with document-level aggregation, CLAQS requires only eight data qubits and shallow circuits, yet achieves 91.64% accuracy on SST-2 and 87.08% on IMDB, outperforming both classical Transformer baselines and strong hybrid quantum-classical counterparts.
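Two preprocessing ideas in this abstract, $\ell_1$ normalization for amplitude regulation and the sliding-window regime, are simple enough to sketch classically. This is an illustrative sketch only; the function names, the stride of 1, and the toy values are assumptions, and the actual CLAQS encoding lives inside a quantum circuit:

```python
import numpy as np

def l1_normalize(v, eps=1e-9):
    """Scale a feature vector so its absolute entries sum to 1,
    keeping amplitudes bounded before they enter the circuit."""
    return v / (np.abs(v).sum() + eps)

def sliding_windows(tokens, w):
    """Split a token sequence into overlapping windows (stride 1);
    per-window outputs are then aggregated at document level."""
    return [tokens[i:i + w] for i in range(len(tokens) - w + 1)]

emb = np.array([3.0, -1.0, 0.5, 0.5])
n = l1_normalize(emb)
print(abs(np.abs(n).sum() - 1.0) < 1e-6)    # True: |entries| sum to 1
print(sliding_windows([1, 2, 3, 4, 5], 3))  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

Bounding the norm before encoding is what lets a shallow, eight-qubit circuit process arbitrarily long documents one window at a time.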


ROSflight 2.0: Lean ROS 2-Based Autopilot for Unmanned Aerial Vehicles

Moore, Jacob, Tokumaru, Phil, Reid, Ian, Sutherland, Brandon, Ritchie, Joseph, Snow, Gabe, McLain, Tim

arXiv.org Artificial Intelligence

ROSflight is a lean, open-source autopilot ecosystem for unmanned aerial vehicles (UAVs). Designed by researchers for researchers, it is built to lower the barrier to entry to UAV research and accelerate the transition from simulation to hardware experiments by maintaining a lean (not full-featured), well-documented, and modular codebase. This publication builds on previous treatments and describes significant additions to the architecture that improve the modularity and usability of ROSflight, including the transition from ROS 1 to ROS 2, supported hardware, low-level actuator mixing, and the simulation environment. We believe that these changes improve the usability of ROSflight and enable it to accelerate research in areas like advanced air mobility. Hardware results are provided, showing that ROSflight is able to control a multirotor over a serial connection at 400 Hz while closing all control loops on the companion computer. In recent years, interest in unmanned aerial vehicles (UAVs) has increased significantly. Technological advances have enabled numerous applications of UAVs, including package delivery, photography, search-and-rescue, firefighting, and military applications. Advanced air mobility (AAM), a category broadly referring to increasing autonomy in urban areas for civilian use, is also currently an area of high interest.