 blimp


Different types of syntactic agreement recruit the same units within large language models

Kryvosheieva, Daria, de Varda, Andrea, Fedorenko, Evelina, Tuckute, Greta

arXiv.org Artificial Intelligence

Large language models (LLMs) can reliably distinguish grammatical from ungrammatical sentences, but how grammatical knowledge is represented within the models remains an open question. We investigate whether different syntactic phenomena recruit shared or distinct components in LLMs. Using a functional localization approach inspired by cognitive neuroscience, we identify the LLM units most responsive to 67 English syntactic phenomena in seven open-weight models. These units are consistently recruited across sentences containing the phenomena and causally support the models' syntactic performance. Critically, different types of syntactic agreement (e.g., subject-verb, anaphor, determiner-noun) recruit overlapping sets of units, suggesting that agreement constitutes a meaningful functional category for LLMs. This pattern holds in English, Russian, and Chinese. Further, in a cross-lingual analysis of 57 diverse languages, structurally more similar languages share more units for subject-verb agreement. Taken together, these findings reveal that syntactic agreement, a critical marker of syntactic dependencies, constitutes a meaningful category within LLMs' representational spaces.
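
The localization-and-overlap idea in the abstract above can be illustrated with a minimal sketch. The selection statistic (absolute difference of mean activations), the Jaccard overlap measure, and the names `top_k_units` and `unit_overlap` are assumptions made for illustration; the abstract does not state the paper's exact criterion.

```python
import numpy as np

def top_k_units(gram_acts, ungram_acts, k):
    """Pick the k units whose mean activation differs most between
    grammatical and ungrammatical sentences.

    Both inputs are arrays of shape (n_sentences, n_units).
    The absolute-mean-difference statistic is an illustrative assumption.
    """
    gram_acts = np.asarray(gram_acts, dtype=float)
    ungram_acts = np.asarray(ungram_acts, dtype=float)
    diff = np.abs(gram_acts.mean(axis=0) - ungram_acts.mean(axis=0))
    # argsort is ascending, so the last k indices have the largest differences
    return set(np.argsort(diff)[-k:].tolist())

def unit_overlap(units_a, units_b):
    """Jaccard overlap between two sets of unit indices (an assumed measure)."""
    union = units_a | units_b
    if not union:
        raise ValueError("both unit sets are empty")
    return len(units_a & units_b) / len(union)

# Toy data: 4 sentences, 6 units. Phenomenon A drives units 0 and 1,
# phenomenon B drives units 1 and 2.
baseline = np.zeros((4, 6))
acts_a = np.zeros((4, 6))
acts_a[:, [0, 1]] = 1.0
acts_b = np.zeros((4, 6))
acts_b[:, [1, 2]] = 1.0

units_a = top_k_units(acts_a, baseline, k=2)
units_b = top_k_units(acts_b, baseline, k=2)
shared = unit_overlap(units_a, units_b)
```

In this toy setup the two phenomena share one of three selected units, which is the kind of overlap the abstract reports for different agreement types.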


POSESTITCH-SLT: Linguistically Inspired Pose-Stitching for End-to-End Sign Language Translation

Joshi, Abhinav, Sharma, Vaibhav, Singh, Sanjeet, Modi, Ashutosh

arXiv.org Artificial Intelligence

Sign language translation remains a challenging task due to the scarcity of large-scale, sentence-aligned datasets. Prior work has focused on various feature extraction and architectural changes to support neural machine translation for sign languages. We propose POSESTITCH-SLT, a novel pre-training scheme inspired by a linguistic-template-based sentence generation technique. With translation comparison on two sign language datasets, How2Sign and iSign, we show that a simple transformer-based encoder-decoder architecture outperforms the prior art when template-generated sentence pairs are included in training. We achieve BLEU-4 score improvements from 1.97 to 4.56 on How2Sign and from 0.55 to 3.43 on iSign, surpassing prior state-of-the-art methods for pose-based gloss-free translation. The results demonstrate the effectiveness of template-driven synthetic supervision in low-resource sign language settings.


Lightweight Tracking Control for Computationally Constrained Aerial Systems with the Newton-Raphson Method

Morales-Cuadrado, Evanns, Baird, Luke, Wardi, Yorai, Coogan, Samuel

arXiv.org Artificial Intelligence

We investigate the performance of a lightweight tracking controller, based on a flow version of the Newton-Raphson method, applied to a miniature blimp and a mid-size quadrotor. This tracking technique has been shown to enjoy theoretical guarantees of performance and has been applied with success in simulation studies and on mobile robots with simple motion models. This paper investigates the technique through real-world flight experiments on aerial hardware platforms subject to realistic deployment and onboard computational constraints. The technique's performance is assessed in comparison with the established control frameworks of feedback linearization for the blimp, and nonlinear model predictive control for both quadrotor and blimp. The performance metrics under consideration are (i) root mean square error of flight trajectories with respect to target trajectories, (ii) algorithms' computation times, and (iii) CPU energy consumption associated with the control algorithms. The experimental findings show that the Newton-Raphson flow-based tracking controller achieves comparable or superior tracking performance to the baseline methods with substantially reduced computation time and energy expenditure. The past two decades have seen a significant shift in the nature of hardware research for trajectory control of aerial platforms like quadrotors. At first, testing and verification of novel techniques relied heavily on numerical simulators, later transitioning to real-world deployments that depended on ground station computers and simplified models. Today, powerful single-board computers (SBCs) have enabled research to shift toward onboard execution even for computationally intensive control methods [2]-[4].
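
Metric (i) above, the root mean square error between flown and target trajectories, can be sketched as follows. The function name `trajectory_rmse` and the convention of taking the Euclidean position error at each sample are assumptions; the excerpt does not say how the paper aggregates per-axis errors.

```python
import numpy as np

def trajectory_rmse(actual, target):
    """RMSE between two trajectories sampled at the same time instants.

    Each input has shape (n_samples, n_dims). The per-sample error is the
    Euclidean distance between actual and target positions (an assumed
    convention, not necessarily the paper's).
    """
    actual = np.asarray(actual, dtype=float)
    target = np.asarray(target, dtype=float)
    if actual.shape != target.shape:
        raise ValueError("trajectories must have the same shape")
    err = np.linalg.norm(actual - target, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

# Two samples in 3-D: the first is on target, the second is 5 m away.
flown = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]
reference = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
rmse = trajectory_rmse(flown, reference)
```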


Blimp-based Crime Scene Analysis

Cooney, Martin, Alonso-Fernandez, Fernando

arXiv.org Artificial Intelligence

Crime is a critical problem that often takes place behind closed doors, posing additional difficulties for investigators. To bring hidden truths to light, evidence at indoor crime scenes must be documented before any contamination or degradation occurs. Here, we address this challenge from the perspective of artificial intelligence (AI), computer vision, and robotics. Specifically, we explore the use of a blimp as a "floating camera" to drift over and record evidence with minimal disturbance. Adopting a rapid prototyping approach, we develop a proof-of-concept to investigate capabilities required for manual or semi-autonomous operation. Our results demonstrate the feasibility of equipping indoor blimps with various components (such as RGB and thermal cameras, LiDARs, and WiFi, with 20 minutes of battery life). Moreover, we confirm the core premise: that such blimps can be used to observe crime scene evidence while generating little airflow. We conclude by proposing some ideas related to detection (e.g., of bloodstains), mapping, and path planning, with the aim of stimulating further discussion and exploration.


Design of a Formation Control System to Assist Human Operators in Flying a Swarm of Robotic Blimps

Wu, Tianfu, Fu, Jiaqi, Meng, Wugang, Cho, Sungjin, Zhan, Huanzhe, Zhang, Fumin

arXiv.org Artificial Intelligence

Formation control is essential for swarm robotics, enabling coordinated behavior in complex environments. In this paper, we introduce a novel formation control system for an indoor blimp swarm using a specialized leader-follower approach enhanced with a dynamic leader-switching mechanism. This strategy allows any blimp to take on the leader role, distributing maneuvering demands across the swarm and enhancing overall formation stability. Only the leader blimp is manually controlled by a human operator, while follower blimps use onboard monocular cameras and a laser altimeter for relative position and altitude estimation. A leader-switching scheme is proposed to assist the human operator in maintaining the stability of the swarm, especially when a sharp turn is performed. Experimental results confirm that the leader-switching mechanism effectively maintains stable formations and adapts to dynamic indoor environments while assisting the human operator.


Robust Safety Critical Control Under Multiple State and Input Constraints: Volume Control Barrier Function Method

Dong, Jinyang, Wu, Shizhen, Liu, Rui, Liang, Xiao, Lu, Biao, Fang, Yongchun

arXiv.org Artificial Intelligence

IEEE Transactions on Industrial Electronics. Jinyang Dong, Shizhen Wu, Rui Liu, Xiao Liang, Senior Member, IEEE, Biao Lu, Member, IEEE, and Yongchun Fang, Senior Member, IEEE. Abstract: In this paper, the safety-critical control problem for uncertain systems under multiple control barrier function (CBF) constraints and input constraints is investigated. A novel framework is proposed to generate a safety filter that minimizes changes to reference inputs when safety risks arise, ensuring a balance between safety and performance. A nonlinear disturbance observer (DOB) based on the robust integral of the sign of the error (RISE) is used to estimate system uncertainties, ensuring that the estimation error converges to zero exponentially. This error bound is integrated into the safety-critical controller to reduce conservativeness while ensuring safety. To further address the challenges arising from multiple CBF and input constraints, a novel Volume CBF (VCBF) is proposed by analyzing the feasible space of the quadratic programming (QP) problem. To ensure that the feasible space does not vanish under disturbances, a DOB-VCBF-based method is introduced, ensuring system safety while maintaining the feasibility of the resulting QP. Subsequently, several groups of simulation and experimental results are provided to validate the effectiveness of the proposed controller. Introduction: As automation systems have become integral to our daily lives, the development of safe and high-performance controllers for these systems is of paramount importance. To meet this need, the Control Barrier Function (CBF) is a powerful tool to ensure the safety of control systems [1].
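
The notion of a safety filter that minimally changes the reference input can be illustrated with the textbook single-constraint case. This is not the paper's VCBF or DOB-based method: it assumes one CBF constraint already written in the linear form a·u ≥ b (for example a = L_g h(x), b = -L_f h(x) - αh(x)), no input bounds, and no disturbance. Under those assumptions the QP that minimizes ||u - u_ref||² has a closed-form projection solution. The name `cbf_safety_filter` is hypothetical.

```python
import numpy as np

def cbf_safety_filter(u_ref, a, b):
    """Closed-form solution of: minimize ||u - u_ref||^2 subject to a . u >= b.

    Assumes a single affine CBF constraint and unbounded inputs
    (a simplification; the multi-constraint case needs a QP solver).
    """
    u_ref = np.asarray(u_ref, dtype=float)
    a = np.asarray(a, dtype=float)
    slack = float(a @ u_ref) - b
    if slack >= 0.0:
        # The reference input already satisfies the constraint: leave it alone.
        return u_ref
    norm_sq = float(a @ a)
    if norm_sq == 0.0:
        # a = 0 and b > 0: no input can satisfy the constraint.
        raise ValueError("constraint is infeasible for every input")
    # Project u_ref onto the boundary of the half-space a . u >= b.
    return u_ref - (slack / norm_sq) * a

# The reference violates the constraint u[0] >= 1, so only u[0] is adjusted.
u_safe = cbf_safety_filter(u_ref=[0.0, 2.0], a=[1.0, 0.0], b=1.0)
# The reference is already safe, so it passes through unchanged.
u_same = cbf_safety_filter(u_ref=[2.0, 2.0], a=[1.0, 0.0], b=1.0)
```

With several CBF constraints plus input bounds, the feasible set is an intersection of half-spaces that may become empty, which is the feasibility issue the abstract says the VCBF is designed to address.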


MochiSwarm: A testbed for robotic blimps in realistic environments

Xu, Jiawei, Vu, Thong, D'Antonio, Diego S., Saldaña, David

arXiv.org Artificial Intelligence

Testing aerial robots in tasks such as pickup-and-delivery and surveillance significantly benefits from high energy efficiency and scalability of the deployed robotic system. This paper presents MochiSwarm, an open-source testbed of light-weight robotic blimps, ready for multi-robot operation without external localization. We introduce the system design in hardware, software, and perception, which capitalizes on modularity, low cost, and light weight. The hardware allows for rapid modification, which enables the integration of additional sensors to enhance autonomy for different scenarios. The software framework supports different actuation models and communication between the base station and multiple blimps. The detachable perception module allows independent blimps to perform tasks that involve detection and autonomous actuation. We showcase a differential-drive module as an example, whose autonomy is enabled by visual servoing using the perception module. A case study of pickup-and-delivery tasks with up to 12 blimps highlights the autonomy of the MochiSwarm without external infrastructure.


BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models

Haller, Patrick, Golde, Jonas, Akbik, Alan

arXiv.org Artificial Intelligence

This paper explores the potential of recurrent neural networks (RNNs) and other subquadratic architectures as competitive alternatives to transformer-based models in low-resource language modeling scenarios. We utilize HGRN2 (Qin et al., 2024), a recently proposed RNN-based architecture, and comparatively evaluate its effectiveness against transformer-based baselines and other subquadratic architectures (LSTM, xLSTM, Mamba). Our experimental results show that BABYHGRN, our HGRN2 language model, outperforms transformer-based models in both the 10M and 100M word tracks of the challenge, as measured by their performance on the BLiMP, EWoK, GLUE and BEAR benchmarks. Further, we show the positive impact of knowledge distillation. Our findings challenge the prevailing focus on transformer architectures and indicate the viability of RNN-based models, particularly in resource-constrained environments.
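
BLiMP, named among the benchmarks above, scores a model on minimal pairs: the model is counted correct on a pair when it assigns a higher probability to the grammatical sentence than to the ungrammatical one. A minimal sketch of that accuracy computation follows. The function name `minimal_pair_accuracy` and the toy scorer are hypothetical; in practice `log_prob` would wrap a real language model's sentence log-probability.

```python
from typing import Callable, Iterable, Tuple

def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    log_prob: Callable[[str], float],
) -> float:
    """Fraction of (grammatical, ungrammatical) pairs for which the scorer
    prefers the grammatical sentence."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one minimal pair")
    correct = sum(1 for good, bad in pairs if log_prob(good) > log_prob(bad))
    return correct / len(pairs)

# Toy scorer with made-up log-probabilities, standing in for a real model.
toy_scores = {
    "The cats sleep.": -5.0,
    "The cats sleeps.": -7.0,
    "She sees herself.": -9.0,
    "She sees himself.": -8.0,
}
toy_pairs = [
    ("The cats sleep.", "The cats sleeps."),
    ("She sees herself.", "She sees himself."),
]
accuracy = minimal_pair_accuracy(toy_pairs, toy_scores.__getitem__)
```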


BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency

Haga, Akari, Fukatsu, Akiyo, Oba, Miyu, Bisazza, Arianna, Oseki, Yohei

arXiv.org Artificial Intelligence

While current large language models have achieved remarkable success, their data efficiency remains a challenge to overcome. Recently it has been suggested that child-directed speech (CDS) can improve training data efficiency of modern language models based on Transformer neural networks. However, it is not yet understood which specific properties of CDS are effective for training these models. In the context of the BabyLM Challenge, we focus on Variation Sets (VSs), sets of consecutive utterances expressing a similar intent with slightly different words and structures, which are ubiquitous in CDS. To assess the impact of VSs on training data efficiency, we augment CDS data with different proportions of artificial VSs and use these datasets to train an auto-regressive model, GPT-2. We find that the best proportion of VSs depends on the evaluation benchmark: BLiMP and GLUE scores benefit from the presence of VSs, but EWoK scores do not. Additionally, the results vary depending on multiple factors such as the number of epochs and the order of utterance presentation. Taken together, these findings suggest that VSs can have a beneficial influence on language models, while leaving room for further investigation.


What Should Baby Models Read? Exploring Sample-Efficient Data Composition on Model Performance

Yam, Hong Meng, Paek, Nathan J

arXiv.org Artificial Intelligence

We explore the impact of pre-training data composition on the performance of small language models in a sample-efficient setting. Using datasets limited to 10 million words, we evaluate several dataset sources, including child-directed speech (CHILDES), classic books (Gutenberg), synthetic data (TinyStories), and a mix of these (Mix) across different model sizes ranging from 18 million to 705 million parameters. Our experiments show that smaller models (e.g., GPT2-97M) perform better when trained on a diverse dataset like Mix, while larger models (e.g., GPT2-705M, Llama-360M) perform better when trained on more complex and rich datasets like Gutenberg. Models trained on the CHILDES and TinyStories datasets underperformed across all model sizes. These findings suggest that the optimal dataset for sample-efficient training depends on the model size, and that neither child-directed speech nor simplified stories are optimal for language models of all sizes. We highlight the importance of considering both dataset composition and model capacity for effective sample-efficient language model training.