Materials
Embodied Neuromorphic Artificial Intelligence for Robotics: Perspectives, Challenges, and Research Development Stack
Putra, Rachmad Vidya Wicaksana, Marchisio, Alberto, Zayer, Fakhreddine, Dias, Jorge, Shafique, Muhammad
Robotic technologies have become indispensable for improving human productivity, since they help humans complete diverse, complex, and intensive tasks quickly, accurately, and efficiently. Therefore, robotic technologies have been deployed in a wide range of applications, from personal to industrial use cases. However, current robotic technologies and their computing paradigm still lack the embodied intelligence needed to interact efficiently with operational environments, respond with correct/expected actions, and adapt to changes in those environments. Toward this, recent advances in neuromorphic computing with Spiking Neural Networks (SNNs) have demonstrated the potential to enable embodied intelligence for robotics through a bio-plausible computing paradigm that mimics how the biological brain works, known as "neuromorphic artificial intelligence (AI)". However, the field of neuromorphic AI-based robotics is still at an early stage; its development and deployment for solving real-world problems therefore expose new challenges in different design aspects, such as accuracy, adaptability, efficiency, reliability, and security. To address these challenges, this paper discusses how we can enable embodied neuromorphic AI for robotic systems through our perspectives: (P1) Embodied intelligence based on effective learning rules, training mechanisms, and adaptability; (P2) Cross-layer optimizations for energy-efficient neuromorphic computing; (P3) Representative and fair benchmarks; (P4) Low-cost reliability and safety enhancements; (P5) Security and privacy for neuromorphic computing; and (P6) A synergistic development for energy-efficient and robust neuromorphic-based robotics. Furthermore, this paper identifies research challenges and opportunities, and elaborates our vision for future research development toward embodied neuromorphic AI for robotics.
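As a rough illustration of the spiking computation that SNNs build on, the sketch below simulates a single discrete-time leaky integrate-and-fire (LIF) neuron. It is not taken from the paper: the function name `simulate_lif` and the decay, threshold, and reset values are assumptions chosen only for illustration.

```python
def simulate_lif(inputs, decay=0.9, threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron in discrete time.

    All parameter values are illustrative assumptions, not values from the paper.
    Returns a list with 1 where the neuron spikes and 0 elsewhere.
    """
    v = 0.0          # membrane potential
    spikes = []
    for current in inputs:
        # Leak the previous potential, then integrate the new input.
        v = decay * v + current
        if v >= threshold:
            spikes.append(1)
            v = v_reset   # reset after emitting a spike
        else:
            spikes.append(0)
    return spikes


# A constant input drives the neuron to spike at regular intervals.
spike_train = simulate_lif([0.3] * 8)
print(spike_train)
```

Information is carried by the timing of the spikes rather than by continuous activations, so computation is only needed when a spike occurs.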
Universal Functional Regression with Neural Operator Flows
Shi, Yaozhong, Gao, Angela F., Ross, Zachary E., Azizzadenesheli, Kamyar
The notion of inference on function spaces is essential to the physical sciences and engineering, where the governing equations are frequently partial differential equations (PDEs) describing the evolution of functions in space and time. In particular, it is often desirable to infer the values of a function everywhere in a physical domain given a sparse number of observation points. There are numerous types of problems in which functional regression plays an important role, such as inverse problems, time series forecasting, and data imputation/assimilation. Functional regression problems can be particularly challenging for real-world datasets because the underlying stochastic process is often unknown. Much of the work on functional regression and inference has relied on Gaussian processes (GPs) (Rasmussen and Williams, 2006), a specific type of stochastic process in which any finite collection of points has a multivariate Gaussian distribution. Some of the earliest applications focused on analyzing geological data, such as the locations of valuable ore deposits, to identify where new deposits might be found (Chiles and Delfiner, 2012). GP regression (GPR) provides several advantages for functional inference including robustness and mathematical tractability for various problems. This has led to the use of GPR in an assortment of scientific and engineering fields, where precision and reliability in predictions and inferences can significantly impact outcomes (Deringer et al., 2021; Aigrain and Foreman-Mackey, 2023). Despite widespread adoption, the assumption of a GP prior for functional inference problems can be rather limiting, particularly in scenarios where the data exhibit heavy-tailed or multimodal distributions, e.g.
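To make the GP regression baseline concrete, here is a minimal sketch of inferring a function value from a few sparse observations under a zero-mean GP prior. It illustrates standard GPR only, not the neural operator flow method of the paper. The squared-exponential kernel, its hyperparameters, the noise level, and the helper names (`rbf_kernel`, `solve`, `gp_posterior`) are all assumptions made for the example.

```python
import math


def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and variance at x_new for a zero-mean GP prior."""
    n = len(x_obs)
    # Covariance of the observations, with a small noise term on the diagonal.
    K = [[rbf_kernel(x_obs[i], x_obs[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf_kernel(x, x_new) for x in x_obs]
    alpha = solve(K, y_obs)    # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf_kernel(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, var


# Sparse observations of an (illustrative) underlying function.
x_obs = [0.0, 1.0, 2.5]
y_obs = [math.sin(x) for x in x_obs]
mean, var = gp_posterior(x_obs, y_obs, 1.7)
print(f"posterior at x=1.7: mean={mean:.3f}, variance={var:.3f}")
```

The posterior at any query point is itself Gaussian, which is the source of both the tractability and the limitation noted above: a single Gaussian cannot represent heavy-tailed or multimodal behavior.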
Foundation Models for Structural Health Monitoring
Benfenati, Luca, Pagliari, Daniele Jahier, Zanatta, Luca, Velez, Yhorman Alexander Bedoya, Acquaviva, Andrea, Poncino, Massimo, Macii, Enrico, Benini, Luca, Burrello, Alessio
Structural Health Monitoring (SHM) is a critical task for ensuring the safety and reliability of civil infrastructures, typically realized on bridges and viaducts by means of vibration monitoring. In this paper, we propose for the first time the use of Transformer neural networks, with a Masked Auto-Encoder architecture, as Foundation Models for SHM. We demonstrate the ability of these models to learn generalizable representations from multiple large datasets through self-supervised pre-training, which, coupled with task-specific fine-tuning, allows them to outperform state-of-the-art traditional methods on diverse tasks, including Anomaly Detection (AD) and Traffic Load Estimation (TLE). We then extensively explore model size versus accuracy trade-offs and experiment with Knowledge Distillation (KD) to improve the performance of smaller Transformers, enabling their embedding directly into the SHM edge nodes. We showcase the effectiveness of our foundation models using data from three operational viaducts. For AD, we achieve a near-perfect 99.9% accuracy with a monitoring time span of just 15 windows. In contrast, a state-of-the-art method based on Principal Component Analysis (PCA) obtains its first good result (95.03% accuracy) only considering 120 windows. On two different TLE tasks, our models obtain state-of-the-art performance on multiple evaluation metrics (R$^2$ score, MAE% and MSE%). On the first benchmark, we achieve an R$^2$ score of 0.97 and 0.85 for light and heavy vehicle traffic, respectively, while the best previous approach stops at 0.91 and 0.84. On the second one, we achieve an R$^2$ score of 0.54 versus the 0.10 of the best existing method.
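For readers comparing the reported R$^2$ values, the following sketch computes the standard coefficient of determination and the mean absolute error. The traffic-load numbers are made up for illustration, and the paper's percentage variants (MAE% and MSE%) are not reproduced here because their exact normalization is not stated in the abstract.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical vehicle counts per window (not data from the paper).
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 39]
print(r2_score(y_true, y_pred), mean_absolute_error(y_true, y_pred))
```

An R$^2$ of 1 means perfect prediction, while an R$^2$ of 0 is what a model achieves by always predicting the mean of the targets, which helps put a score of 0.10 versus 0.54 in context.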
CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems
Rosenthal, Sara, Sil, Avirup, Florian, Radu, Roukos, Salim
Large-scale research in this area began with the tasks of Machine Reading Comprehension (Rajpurkar et al., 2016; Rogers et al., 2023; Fisch et al., 2021), and Information Retrieval (Manning et al., 2008; Voorhees and Harman, 2005; Thakur et al., 2021) and has more recently come to be known as Retrieval Augmented Generation (Lewis et al., 2021; Guu et al., 2020), which encompasses both tasks. The recent popularity of generative AI with Large Language Models (LLM), such as GPT (Brown et al., 2020), Llama (Touvron et al., [...] (NQ) (Kwiatkowski et al., 2019) and SQuAD (Rajpurkar et al., 2016, 2018) which are just a few words. It is grounded on a single gold passage, in contrast to other long-form question answering (LFQA) datasets such as ELI5 (Fan et al., 2019) where gold passages are not available. It is built from a subset of the highly successful Natural Questions (Kwiatkowski et al., 2019) dataset for extractive QA from Wikipedia documents based on users' real web search queries - specifically, the subset of NQ that has long answers (passages) but no short extractive answers.
Global Mapping of Exposure and Physical Vulnerability Dynamics in Least Developed Countries using Remote Sensing and Machine Learning
Dimasaka, Joshua, Geiß, Christian, So, Emily
As the world marks the midterm of the Sendai Framework for Disaster Risk Reduction 2015-2030, many countries are still struggling to monitor their climate and disaster risk because large-scale surveys of the distribution of exposure and physical vulnerability are expensive and, hence, are not on track to reduce risks amidst the intensifying effects of climate change. We present an ongoing effort to map this vital information using machine learning and time-series remote sensing from publicly available Sentinel-1 SAR GRD and Sentinel-2 Harmonized MSI. We introduce the development of "OpenSendaiBench", consisting of 47 countries, most of which are least developed countries (LDCs); train ResNet-50 deep learning models; and demonstrate the approach on the region of Dhaka, Bangladesh, by mapping the distribution of its informal constructions. As a pioneering effort in auditing global disaster risk over time, this paper aims to advance the area of large-scale risk quantification in informing our collective long-term efforts in reducing climate and disaster risk.
OpenChemIE: An Information Extraction Toolkit For Chemistry Literature
Fan, Vincent, Qian, Yujie, Wang, Alex, Wang, Amber, Coley, Connor W., Barzilay, Regina
Information extraction from chemistry literature is vital for constructing up-to-date reaction databases for data-driven chemistry. Complete extraction requires combining information across text, tables, and figures, whereas prior work has mainly investigated extracting reactions from single modalities. In this paper, we present OpenChemIE to address this complex challenge and enable the extraction of reaction data at the document level. OpenChemIE approaches the problem in two steps: extracting relevant information from individual modalities and then integrating the results to obtain a final list of reactions. For the first step, we employ specialized neural models that each address a specific task for chemistry information extraction, such as parsing molecules or reactions from text or figures. We then integrate the information from these modules using chemistry-informed algorithms, allowing for the extraction of fine-grained reaction data from reaction condition and substrate scope investigations. Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%. Additionally, the reaction extraction results of OpenChemIE attain an accuracy score of 64.3% when directly compared against the Reaxys chemical database. We provide OpenChemIE freely to the public as an open-source package, as well as through a web interface.
Intelligent Robotic Control System Based on Computer Vision Technology
Che, Chang, Zheng, Haotian, Huang, Zengyi, Jiang, Wei, Liu, Bo
Computer vision is a kind of simulation of biological vision using computers and related equipment. It is an important part of the field of artificial intelligence. Its research goal is to give computers the ability to recognize three-dimensional environmental information from two-dimensional images. Computer vision builds on image processing, signal processing, probability and statistical analysis, computational geometry, neural networks, machine learning theory, and computer information processing technology to analyze and process visual information. The article explores the intersection of computer vision technology and robotic control, highlighting its importance in various fields such as industrial automation, healthcare, and environmental protection. Computer vision technology, which simulates human visual observation, plays a crucial role in enabling robots to perceive and understand their surroundings, leading to advancements in tasks like autonomous navigation, object recognition, and waste management. By integrating computer vision with robot control, robots gain the ability to interact intelligently with their environment, improving efficiency, quality, and environmental sustainability.
A CRISP-DM-based Methodology for Assessing Agent-based Simulation Models using Process Mining
Bemthuis, Rob H., Govers, Ruben R., Asadi, Amin
Agent-based simulation (ABS) models are potent tools for analyzing complex systems. However, understanding and validating ABS models can be a significant challenge. To address this challenge, cutting-edge data-driven techniques offer sophisticated capabilities for analyzing the outcomes of ABS models. One such technique is process mining, which encompasses a range of methods for discovering, monitoring, and enhancing processes by extracting knowledge from event logs. However, applying process mining to event logs derived from ABSs is not trivial, and deriving meaningful insights from the resulting process models adds a further layer of complexity. Although process mining is invaluable in extracting insights from ABS models, the research landscape lacks comprehensive methodological guidance for its application in ABS evaluation. In this paper, we propose a methodology, based on the CRoss-Industry Standard Process for Data Mining (CRISP-DM) methodology, to assess ABS models using process mining techniques. We incorporate process mining techniques into the stages of the CRISP-DM methodology, facilitating the analysis of ABS model behaviors and their underlying processes. We demonstrate our methodology using an established agent-based model, the Schelling model of segregation. Our results show that our proposed methodology can effectively assess ABS models through produced event logs, potentially paving the way for enhanced agent-based model validity and more insightful decision-making.
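One basic building block of process discovery is the directly-follows relation, which counts how often one activity is immediately followed by another within the same case. The sketch below computes it from a small event log. The log format (case id, activity, timestamp), the activity names, and the choice of treating each agent as a case are assumptions for illustration, not the event log design used in the paper.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count (activity, next_activity) pairs within each case of an event log.

    event_log: iterable of (case_id, activity, timestamp) tuples.
    """
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for events in traces.values():
        # Order each case's events by timestamp to obtain its trace.
        ordered = [activity for _, activity in sorted(events)]
        counts.update(zip(ordered, ordered[1:]))
    return counts


# Hypothetical log from a segregation-style simulation, one case per agent.
log = [
    ("agent_1", "evaluate", 0),
    ("agent_1", "move", 1),
    ("agent_1", "evaluate", 2),
    ("agent_1", "stay", 3),
    ("agent_2", "evaluate", 0),
    ("agent_2", "stay", 1),
]
for (src, dst), n in sorted(directly_follows(log).items()):
    print(f"{src} -> {dst}: {n}")
```

Many discovery algorithms start from counts like these, which is one reason why the choice of case notion and activity labels in an ABS event log strongly shapes the resulting process model.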
Categorical semiotics: Foundations for Knowledge Integration
The integration of knowledge extracted from diverse models, whether described by domain experts or generated by machine learning algorithms, has historically been challenged by the absence of a suitable framework for specifying and integrating structures, learning processes, data transformations, and data models or rules. In this work, we extend algebraic specification methods to address these challenges within such a framework. In particular, we tackle the challenging task of developing a comprehensive framework for defining and analyzing deep learning architectures. We believe that previous efforts have fallen short by failing to establish a clear connection between the constraints a model must adhere to and its actual implementation. Our methodology employs graphical structures that resemble Ehresmann's sketches, interpreted within a universe of fuzzy sets. This approach offers a unified theory that elegantly encompasses both deterministic and non-deterministic neural network designs. Furthermore, we highlight how this theory naturally incorporates fundamental concepts from computer science and automata theory. Our extended algebraic specification framework, grounded in graphical structures akin to Ehresmann's sketches, offers a promising solution for integrating knowledge across disparate models and domains. By bridging the gap between domain-specific expertise and machine-generated insights, we pave the way for more comprehensive, collaborative, and effective approaches to knowledge integration and modeling.
OpenMines: A Light and Comprehensive Mining Simulation Environment for Truck Dispatching
Meng, Shi, Tian, Bin, Zhang, Xiaotong, Qi, Shuangying, Zhang, Caiji, Zhang, Qiang
Mine fleet management algorithms can significantly reduce operational costs and enhance productivity in mining systems. Most current fleet management algorithms are evaluated in self-implemented or proprietary simulation environments, which makes replication and comparison difficult. This paper models the simulation environment for mine fleet management from a complex systems perspective. Building upon previous work, we introduce probabilistic, user-defined events for random event simulation and implement various evaluation metrics and baselines, effectively reflecting the robustness of fleet management algorithms against unforeseen incidents. We present "OpenMines", an open-source framework encompassing the entire process of mine system modeling, algorithm development, and evaluation, facilitating future algorithm comparison and replication in the field. Code is available at https://github.com/370025263/openmines.
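The idea of probabilistic, user-defined events can be sketched in a few lines. The example below is not the OpenMines API: the `RandomEvent` class, the `simulate_cycles` function, and all probabilities and durations are hypothetical, and serve only to show how random incidents can be injected into a haul-cycle simulation in a reproducible way.

```python
import random
from dataclasses import dataclass


@dataclass
class RandomEvent:
    """A hypothetical user-defined incident (not an OpenMines class)."""
    name: str
    probability: float    # chance of firing on each haul cycle
    delay_minutes: float  # extra time added when it fires


def simulate_cycles(n_cycles, base_cycle_minutes, events, seed=0):
    """Return total haul time and the list of (cycle, event name) that fired."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    total = 0.0
    fired = []
    for cycle in range(n_cycles):
        total += base_cycle_minutes
        for event in events:
            # Each event fires independently with its own probability.
            if rng.random() < event.probability:
                total += event.delay_minutes
                fired.append((cycle, event.name))
    return total, fired


events = [
    RandomEvent("truck_breakdown", probability=0.05, delay_minutes=45.0),
    RandomEvent("road_blockage", probability=0.10, delay_minutes=15.0),
]
total, fired = simulate_cycles(100, base_cycle_minutes=30.0, events=events, seed=42)
print(f"total time: {total:.0f} min, incidents: {len(fired)}")
```

Running the same dispatching policy across many seeds and comparing the spread of total haul time is one simple way to gauge robustness against unforeseen incidents.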