Overview
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Kazemkhani, Saman, Pandya, Aarav, Cornelisse, Daphne, Shacklett, Brennan, Vinitsky, Eugene
Multi-agent learning algorithms have been successful at generating superhuman planning in a wide variety of games but have had little impact on the design of deployed multi-agent planners. A key bottleneck in applying these techniques to multi-agent planning is that they require billions of steps of experience. To enable the study of multi-agent planning at this scale, we present GPUDrive, a GPU-accelerated, multi-agent simulator built on top of the Madrona Game Engine that can generate over a million steps of experience per second. Observation, reward, and dynamics functions are written directly in C++, allowing users to define complex, heterogeneous agent behaviors that are lowered to high-performance CUDA. We show that using GPUDrive we are able to effectively train reinforcement learning agents over many scenes in the Waymo Motion dataset, yielding highly effective goal-reaching agents in minutes for individual scenes and generally capable agents in a few hours.
Evaluating the Impact of Advanced LLM Techniques on AI-Lecture Tutors for a Robotics Course
Kahl, Sebastian, Löffler, Felix, Maciol, Martin, Ridder, Fabian, Schmitz, Marius, Spanagel, Jennifer, Wienkamp, Jens, Burgahn, Christopher, Schilling, Malte
This study evaluates the performance of Large Language Models (LLMs) as an Artificial Intelligence-based tutor for a university course. In particular, different advanced techniques are utilized, such as prompt engineering, Retrieval-Augmented Generation (RAG), and fine-tuning. We assessed the different models and applied techniques using common similarity metrics like BLEU-4, ROUGE, and BERTScore, complemented by a small human evaluation of helpfulness and trustworthiness. Our findings indicate that RAG combined with prompt engineering significantly enhances model responses and produces better factual answers. In the context of education, RAG appears to be an ideal technique, as it is based on enriching the input of the model with additional information and material that is usually already available for a university course. Fine-tuning, on the other hand, can produce quite small yet still strong expert models, but poses the danger of overfitting. Our study further asks how the performance of LLMs should be measured and how well current measurements represent correctness or relevance. We find high correlation on similarity metrics and a bias of most of these metrics towards shorter responses. Overall, our research points to both the potential and challenges of integrating LLMs in educational settings, suggesting a need for balanced training approaches and advanced evaluation frameworks.
Detection and Characterization of Coordinated Online Behavior: A Survey
Mannocci, Lorenzo, Mazza, Michele, Monreale, Anna, Tesconi, Maurizio, Cresci, Stefano
Coordination is a fundamental aspect of life. The advent of social media has made it integral also to online human interactions, such as those that characterize thriving online communities and social movements. At the same time, coordination is also core to effective disinformation, manipulation, and hate campaigns. This survey collects, categorizes, and critically discusses the body of work produced as a result of the growing interest in coordinated online behavior. We reconcile industry and academic definitions, propose a comprehensive framework to study coordinated online behavior, and review and critically discuss the existing detection and characterization methods. Our analysis identifies open challenges and promising directions of research, serving as a guide for scholars, practitioners, and policymakers in understanding and addressing the complexities inherent to online coordination.
Low-Power Vibration-Based Predictive Maintenance for Industry 4.0 using Neural Networks: A Survey
Vasilache, Alexandru, Nitzsche, Sven, Floegel, Daniel, Schuermann, Tobias, von Dosky, Stefan, Bierweiler, Thomas, Mußler, Marvin, Kälber, Florian, Hohmann, Soeren, Becker, Juergen
The advancements in smart sensors for Industry 4.0 offer ample opportunities for low-powered predictive maintenance and condition monitoring. However, traditional approaches in this field rely on processing in the cloud, which incurs high costs in energy and storage. This paper investigates the potential of neural networks for low-power on-device computation of vibration sensor data for predictive maintenance. We review the literature on Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs) for vibration-based predictive maintenance by analyzing datasets, data preprocessing, network architectures, and hardware implementations. Our findings suggest that no satisfactory standard benchmark dataset exists for evaluating neural networks in predictive maintenance tasks. Furthermore, frequency-domain transformations are commonly employed for preprocessing. SNNs mainly use shallow feed-forward architectures, whereas ANNs explore a wider range of models and deeper networks. Finally, we highlight the need for future research on hardware implementations of neural networks for low-power predictive maintenance applications and the development of a standardized benchmark dataset.
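To illustrate the frequency-domain preprocessing mentioned above, the following is a minimal sketch using a naive discrete Fourier transform in plain Python. The function names `magnitude_spectrum` and `dominant_bin` and the synthetic sine signal are assumptions made for illustration; they are not taken from the surveyed papers, and a real pipeline would more likely use an FFT library and windowing.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return normalized DFT magnitudes for bins 0..N//2 of a real signal."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT: correlate the signal with a complex exponential.
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(acc) / n)
    return spectrum


def dominant_bin(samples):
    """Return the index of the strongest non-DC frequency bin."""
    spectrum = magnitude_spectrum(samples)
    # Bin 0 is the constant (DC) offset, so it is skipped.
    return max(range(1, len(spectrum)), key=lambda k: spectrum[k])


if __name__ == "__main__":
    # Hypothetical vibration signal: 4 full cycles across 32 samples.
    signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
    print("dominant bin:", dominant_bin(signal))
```

The resulting magnitude vector, rather than the raw waveform, would then serve as the network input in this kind of setup.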
Automatic Pull Request Description Generation Using LLMs: A T5 Model Approach
Sakib, Md Nazmus, Islam, Md Athikul, Arifin, Md Mashrur
Developers create pull request (PR) descriptions to provide an overview of their changes and explain the motivations behind them. These descriptions help reviewers and fellow developers quickly understand the updates. Despite their importance, some developers omit these descriptions. To tackle this problem, we propose an automated method for generating PR descriptions based on commit messages and source code comments. This method frames the task as a text summarization problem, for which we utilized the T5 text-to-text transfer model. We fine-tuned a pre-trained T5 model using a dataset containing 33,466 PRs. The model's effectiveness was assessed using ROUGE metrics, which are recognized for their strong alignment with human evaluations. Our findings reveal that the T5 model significantly outperforms LexRank, which served as our baseline for comparison.
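As a rough illustration of the kind of overlap score ROUGE reports, here is a minimal sketch of ROUGE-1 (unigram overlap) in plain Python. It assumes whitespace tokenization and lowercasing, which is simpler than standard ROUGE tooling, and the function name `rouge1` and the example strings are hypothetical rather than taken from the paper.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return (precision, recall, f1) of unigram overlap between two texts."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection clips each word to its minimum count in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical generated vs. reference PR description.
    generated = "fix login bug"
    reference = "fix the login bug"
    print(rouge1(generated, reference))
```

In this toy case every generated word appears in the reference, so precision is 1.0 while recall is 0.75, because the reference contains one word the candidate lacks.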
Y Social: an LLM-powered Social Media Digital Twin
Rossetti, Giulio, Stella, Massimo, Cazabet, Rémy, Abramski, Katherine, Cau, Erica, Citraro, Salvatore, Failla, Andrea, Improta, Riccardo, Morini, Virginia, Pansanella, Valentina
Online social media (OSM henceforth) have revolutionized the way we exchange information. From the user's perspective, these digital ecosystems are largely effortless [136], enabling convenient ways of exchanging personal content [1], seeking information [129] and synchronizing with others [37]. This convenience has catalyzed a massive digital shift in social and information exchanges from offline to online settings [136], which has provided novel access to massive amounts of online data regarding human behaviour [141]. Unconstrained by geographical barriers, the massive adoption of social media has given rise to novel phenomena that are absent in in-person interactions, such as the influence of complexity and artificial intelligence. Complexity in social media is strongly related to the motto "more is different" [7]: the idea that the co-occurrence of many, even similar, interactions within the same context can lead to unexpected phenomena. Examples include acts as simple and seemingly insignificant as following another user, or re-sharing content. Taken individually, these actions can be understood in terms of a user's activity, psychology, and engagement [91, 97, 141], but when repeated by vast amounts of users, these actions can determine the unexpected rise
Risks, Causes, and Mitigations of Widespread Deployments of Large Language Models (LLMs): A Survey
Sakib, Md Nazmus, Islam, Md Athikul, Pathak, Royal, Arifin, Md Mashrur
Recent advancements in Large Language Models (LLMs), such as ChatGPT and LLaMA, have significantly transformed Natural Language Processing (NLP) with their outstanding abilities in text generation, summarization, and classification. Nevertheless, their widespread adoption introduces numerous challenges, including issues related to academic integrity, copyright, environmental impacts, and ethical considerations such as data bias, fairness, and privacy. The rapid evolution of LLMs also raises concerns regarding the reliability and generalizability of their evaluations. This paper offers a comprehensive survey of the literature on these subjects, systematically gathered and synthesized from Google Scholar. Our study provides an in-depth analysis of the risks associated with specific LLMs, identifying sub-risks, their causes, and potential solutions. Furthermore, we explore the broader challenges related to LLMs, detailing their causes and proposing mitigation strategies. Through this literature analysis, our survey aims to deepen the understanding of the implications and complexities surrounding these powerful models.
Collecting Large-Scale Robotic Datasets on a High-Speed Mobile Platform
Lin, Yuxin, Ma, Jiaxuan, Gu, Sizhe, Kong, Jipeng, Xu, Bowen, Zhao, Xiting, Zhao, Dengji, Cao, Wenhan, Schwertfeger, Sören
Mobile robotics datasets are essential for robotics research, for example for research on Simultaneous Localization and Mapping (SLAM). The ShanghaiTech Mapping Robot was therefore constructed; it features a multitude of high-performance sensors and a 16-node cluster to collect all this data. That robot is based on a Clearpath Husky mobile base with a maximum speed of 1 meter per second. This is fine for indoor datasets, but collecting large-scale outdoor datasets requires a faster platform. This system paper introduces our high-speed mobile platform for data collection. The mapping robot is secured on a rear-steered flatbed car with maximum field of view. Additionally, two encoders collect odometry data from two of the car wheels, and an external sensor plate houses down-looking RGB and event cameras. With this setup, a dataset covering more than 10 km in the underground parking garage and the outdoor areas of our campus was collected and is published with this paper.
Ontological Relations from Word Embeddings
d'Aquin, Mathieu, Nauer, Emmanuel
It has been reliably shown that the similarity of word embeddings obtained from popular neural models such as BERT effectively approximates a form of semantic similarity between the meanings of those words. It is therefore natural to wonder whether those embeddings contain enough information to connect those meanings through ontological relationships such as subsumption. If so, large knowledge models could be built that are capable of semantically relating terms based on the information encapsulated in word embeddings produced by pre-trained models, with implications not only for ontologies (ontology matching, ontology evolution, etc.) but also for the ability to integrate ontological knowledge in neural models. In this paper, we test how embeddings produced by several pre-trained models can be used to predict relations existing between classes and properties of popular upper-level and general ontologies. We show that even a simple feed-forward architecture on top of those embeddings can achieve promising accuracies, with varying generalisation abilities depending on the input data. To achieve that, we produce a dataset that can be used to further enhance those models, opening new possibilities for applications integrating knowledge from web ontologies.
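The embedding similarity this abstract starts from is commonly measured with cosine similarity; the sketch below shows that computation in plain Python. The three-dimensional vectors are invented toy values, not real BERT embeddings, and the function name `cosine_similarity` is an assumption for illustration. The paper's feed-forward relation classifier is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; 0.0 is used by convention.
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy embeddings (real model embeddings have hundreds of dimensions).
    embeddings = {
        "dog": [0.9, 0.1, 0.3],
        "animal": [0.8, 0.2, 0.4],
        "car": [0.1, 0.9, 0.2],
    }
    print("dog/animal:", cosine_similarity(embeddings["dog"], embeddings["animal"]))
    print("dog/car:", cosine_similarity(embeddings["dog"], embeddings["car"]))
```

Note that cosine similarity is symmetric, whereas a relation such as subsumption is directional, which is one reason a learned model on top of the embeddings is of interest.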
A Systematic Review on Long-Tailed Learning
Zhang, Chongsheng, Almpanidis, George, Fan, Gaojuan, Deng, Binquan, Zhang, Yanbo, Liu, Ji, Kamel, Aouaidjia, Soda, Paolo, Gama, João
Long-tailed data is a special type of multi-class imbalanced data with a very large number of minority/tail classes that have a very significant combined influence. Long-tailed learning aims to build high-performance models on datasets with long-tailed distributions, which can identify all the classes with high accuracy, in particular the minority/tail classes. It is a cutting-edge research direction that has attracted a remarkable amount of research effort in the past few years. In this paper, we present a comprehensive survey of the latest advances in long-tailed visual learning. We first propose a new taxonomy for long-tailed learning, which consists of eight different dimensions, including data balancing, neural architecture, feature enrichment, logits adjustment, loss function, bells and whistles, network optimization, and post hoc processing techniques. Based on our proposed taxonomy, we present a systematic review of long-tailed learning methods, discussing their commonalities and alignable differences. We also analyze the differences between imbalance learning and long-tailed learning approaches. Finally, we discuss prospects and future directions in this field.
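As one concrete instance of the "logits adjustment" dimension named above, the sketch below applies a widely used post hoc form: subtracting the scaled log class prior from each logit so that tail classes are no longer penalized by their rarity. This is an illustrative assumption about what such a method can look like, not the survey's own formulation; the function names, the `tau` parameter, and the toy numbers are hypothetical.

```python
import math


def adjust_logits(logits, class_priors, tau=1.0):
    """Subtract tau * log(prior) from each logit (post hoc logit adjustment)."""
    if len(logits) != len(class_priors):
        raise ValueError("need one prior per class")
    return [z - tau * math.log(p) for z, p in zip(logits, class_priors)]


def predict(logits):
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    # Hypothetical two-class case: class 0 is the head (90%), class 1 the tail (10%).
    priors = [0.9, 0.1]
    raw_logits = [2.0, 1.8]
    adjusted = adjust_logits(raw_logits, priors)
    print("raw prediction:", predict(raw_logits))
    print("adjusted prediction:", predict(adjusted))
```

With these toy values the raw logits narrowly favor the head class, while the adjusted logits favor the tail class, since the correction for the rare class is much larger than for the frequent one.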