Asaba
Machine Learning Epidemic Predictions Using Agent-based Wireless Sensor Network Models
Nwokoye, Chukwunonso Henry, Oluchi, Blessing, Waldron, Sharna, Ezzeh, Peace
The lack of epidemiological data in wireless sensor networks (WSNs) is a fundamental difficulty in constructing robust models to forecast and mitigate threats like viruses and worms. Many studies have looked at different epidemic models for WSNs, focusing on the manner in which malware infections spread given the network's specific properties, including energy limits and node mobility. In this study, an agent-based realization of the susceptible-exposed-infected-recovered-vaccinated (SEIRV) mathematical model was employed for machine learning (ML) predictions. Using tools such as NetLogo's BehaviorSpace and Python, two synthetic epidemic datasets were generated and prepared for the application of several ML algorithms. Posed as a regression problem, the infected and recovered nodes were predicted, and the performance of these algorithms was compared using the error metrics of the train and test sets. The predictions performed quite well, with low error metrics and high R values (0.997, 1.000, 0.999, 1.000), indicating an effective fit to the training set. The validation values were lower (0.992, 0.998, 0.971, and 0.999), as is typical when evaluating model performance on unseen data. Judging from the recorded performances, support vector, linear, Lasso, Ridge, and ElasticNet regression were among the worst-performing algorithms, while Random Forest, XGBoost, Decision Trees, and K-nearest neighbor had the best model performances. In recent years, the world as we know it has been changing due to breakthroughs in numerous linked innovations including smart electrical grids [1], the IoT, long-term evolution, 5G connectivity [2] and cyber-physical systems [3] such as wireless sensor networks (WSN).
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > Ontario > Durham Region > Oshawa (0.04)
- (2 more...)
- Africa > Nigeria > Delta State > Asaba (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Robots (0.95)
- Information Technology > Artificial Intelligence > Cognitive Science (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Lossless Vocabulary Reduction for Auto-Regressive Language Models
Chijiwa, Daiki, Hasegawa, Taku, Nishida, Kyosuke, Yamaguchi, Shin'ya, Ohba, Tomoya, Sakao, Tamao, Takeuchi, Susumu
Tokenization -- the process of decomposing a given text into a sequence of subwords called tokens -- is one of the key components in the development of language models. In particular, auto-regressive language models generate texts token by token, i.e., by predicting the next-token distribution given the previous ones, and thus tokenization directly affects their efficiency in text generation. Since each language model has its own vocabulary as a set of possible tokens, language models struggle to cooperate with each other at the level of next-token distributions, for example in model ensembling. In this paper, we establish a theoretical framework of lossless vocabulary reduction, which efficiently converts a given auto-regressive language model into one with an arbitrarily small vocabulary without any loss in accuracy. As an application, we demonstrate that language models with different tokenizations can cooperate with each other efficiently through their maximal common vocabulary.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.04)
- (10 more...)
- Africa > Nigeria > Delta State > Asaba (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Robots (0.95)
- Information Technology > Artificial Intelligence > Cognitive Science (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
SpeLLM: Character-Level Multi-Head Decoding
Ben-Artzy, Amit, Schwartz, Roy
Scaling LLM vocabulary is often used to reduce input sequence length and alleviate attention's quadratic cost. Yet, current LLM architectures impose a critical bottleneck to this procedure: the output projection layer scales linearly with vocabulary size, rendering substantial expansion impractical. We propose SpeLLM, a method that decouples input and output vocabularies by predicting character-level strings through multiple output heads. In SpeLLM, each of the $k$ linear heads predicts a single character simultaneously, enabling the model to represent a much larger output space using smaller, independent linear heads. We present a self-distillation approach for converting a standard LLM to a SpeLLM. Our experiments with four pre-trained LLMs show their SpeLLM variants achieve competitive performance on downstream tasks while reducing runtime by 5.1% on average across models. Our approach provides a potential avenue for reducing LLM costs, while increasing support for underrepresented languages and domains.
- North America > United States (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore (0.04)
- (3 more...)
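The scaling argument in the SpeLLM abstract above can be illustrated with a rough weight count. This is only a sketch under stated assumptions, not the authors' implementation: the hidden size, vocabulary size, number of heads and character-set size below are hypothetical values chosen for illustration, and biases are ignored.

```python
def single_projection_weights(hidden_dim: int, vocab_size: int) -> int:
    """Weights in one dense output projection: grows linearly with vocab_size."""
    return hidden_dim * vocab_size


def multi_head_weights(hidden_dim: int, num_heads: int, charset_size: int) -> int:
    """Weights in num_heads independent linear heads, each scoring one character."""
    return num_heads * hidden_dim * charset_size


def representable_strings(num_heads: int, charset_size: int) -> int:
    """Number of distinct character strings that num_heads heads can jointly emit."""
    return charset_size ** num_heads


# Hypothetical sizes, not taken from the paper.
HIDDEN_DIM = 4096
VOCAB_SIZE = 128_000
NUM_HEADS = 10       # plays the role of k in the abstract
CHARSET_SIZE = 256

standard = single_projection_weights(HIDDEN_DIM, VOCAB_SIZE)
spelled = multi_head_weights(HIDDEN_DIM, NUM_HEADS, CHARSET_SIZE)
space = representable_strings(NUM_HEADS, CHARSET_SIZE)

print(f"single projection weights: {standard:,}")
print(f"character-head weights:    {spelled:,}")
print(f"output space of the heads: {space:,} strings")
```

With these assumed sizes the character heads use far fewer weights than the single projection while covering a much larger space of output strings, which is the trade-off the abstract describes.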
BETTY Dataset: A Multi-modal Dataset for Full-Stack Autonomy
Nye, Micah, Raji, Ayoub, Saba, Andrew, Erlich, Eidan, Exley, Robert, Goyal, Aragya, Matros, Alexander, Misra, Ritesh, Sivaprakasam, Matthew, Bertogna, Marko, Ramanan, Deva, Scherer, Sebastian
We present the BETTY dataset, a large-scale, multi-modal dataset collected on several autonomous racing vehicles, targeting supervised and self-supervised state estimation, dynamics modeling, motion forecasting, perception, and more. Existing large-scale datasets, especially autonomous vehicle datasets, focus primarily on supervised perception, planning, and motion forecasting tasks. Our work enables multi-modal, data-driven methods by including all sensor inputs and the outputs from the software stack, along with semantic metadata and ground truth information. The dataset encompasses 4 years of data, currently comprising over 13 hours and 32TB, collected on autonomous racing vehicle platforms. This data spans 6 diverse racing environments, including high-speed oval courses, for single and multi-agent algorithm evaluation in feature-sparse scenarios, as well as high-speed road courses with high longitudinal and lateral accelerations and tight, GPS-denied environments. It captures highly dynamic states, such as 63 m/s crashes, loss of tire traction, and operation at the limit of stability. By offering a large breadth of cross-modal and dynamic data, the BETTY dataset enables the training and testing of full autonomy stack pipelines, pushing the performance of all algorithms to the limits. The current dataset is available at https://pitt-mit-iac.github.io/betty-dataset/.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Africa > Nigeria > Delta State > Asaba (0.04)
$\Lambda$CDM and early dark energy in latent space: a data-driven parametrization of the CMB temperature power spectrum
Piras, Davide, Herold, Laura, Lucie-Smith, Luisa, Komatsu, Eiichiro
Finding the best parametrization for cosmological models in the absence of first-principle theories is an open question. We propose a data-driven parametrization of cosmological models given by the disentangled 'latent' representation of a variational autoencoder (VAE) trained to compress cosmic microwave background (CMB) temperature power spectra. We consider a broad range of $\Lambda$CDM and beyond-$\Lambda$CDM cosmologies with an additional early dark energy (EDE) component. We show that these spectra can be compressed into 5 ($\Lambda$CDM) or 8 (EDE) independent latent parameters, as expected when using temperature power spectra alone, and which reconstruct spectra at an accuracy well within the Planck errors. These latent parameters have a physical interpretation in terms of well-known features of the CMB temperature spectrum: these include the position, height and even-odd modulation of the acoustic peaks, as well as the gravitational lensing effect. The VAE also discovers one latent parameter which entirely isolates the EDE effects from those related to $\Lambda$CDM parameters, thus revealing a previously unknown degree of freedom in the CMB temperature power spectrum. We further showcase how to place constraints on the latent parameters using Planck data as typically done for cosmological parameters, obtaining latent values consistent with previous $\Lambda$CDM and EDE cosmological constraints. Our work demonstrates the potential of a data-driven reformulation of current beyond-$\Lambda$CDM phenomenological models into the independent degrees of freedom to which the data observables are sensitive.
- Europe > Switzerland > Geneva > Geneva (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- (8 more...)
Using Generative AI and Multi-Agents to Provide Automatic Feedback
Guo, Shuchen, Latif, Ehsan, Zhou, Yifan, Huang, Xuan, Zhai, Xiaoming
This study investigates the use of generative AI and multi-agent systems to provide automatic feedback in educational contexts, particularly for student constructed responses in science assessments. The research addresses a key gap in the field by exploring how multi-agent systems, called AutoFeedback, can improve the quality of GenAI-generated feedback, overcoming known issues such as over-praise and over-inference that are common in single-agent large language models (LLMs). The study developed a multi-agent system consisting of two AI agents: one for generating feedback and another for validating and refining it. The system was tested on a dataset of 240 student responses, and its performance was compared to that of a single-agent LLM. Results showed that AutoFeedback significantly reduced the occurrence of over-praise and over-inference errors, providing more accurate and pedagogically sound feedback. The findings suggest that multi-agent systems can offer a more reliable solution for generating automated feedback in educational settings, highlighting their potential for scalable and personalized learning support. These results have important implications for educators and researchers seeking to leverage AI in formative assessments, offering a pathway to more effective feedback mechanisms that enhance student learning outcomes.
- North America > United States > Georgia > Clarke County > Athens (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Instructional Material (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting > Online (1.00)
- Education > Assessment & Standards (0.88)
RACECAR -- The Dataset for High-Speed Autonomous Racing
Kulkarni, Amar, Chrosniak, John, Ducote, Emory, Sauerbeck, Florian, Saba, Andrew, Chirimar, Utkarsh, Link, John, Cellina, Marcello, Behl, Madhur
This paper describes the first open dataset for full-scale and high-speed autonomous racing. Multi-modal sensor data has been collected from fully autonomous Indy race cars operating at speeds of up to 170 mph (273 kph). Six teams that raced in the Indy Autonomous Challenge have contributed to this dataset. The dataset spans 11 interesting racing scenarios across two race tracks, which include solo laps, multi-agent laps, overtaking situations, high accelerations, banked tracks, obstacle avoidance, and pit entry and exit at different speeds. The dataset contains data from 27 racing sessions across the 11 scenarios with over 6.5 hours of sensor data recorded from the track. The data is organized and released in both ROS2 and nuScenes format. We have also developed the ROS2-to-nuScenes conversion library to achieve this. The RACECAR data is unique because of the high-speed environment of autonomous racing. We present several benchmark problems on localization, object detection and tracking (LiDAR, Radar, and Camera), and mapping using the RACECAR data to explore issues that arise at the limits of operation of the vehicle.
- North America > United States > Nevada > Clark County > Las Vegas (0.05)
- North America > United States > Indiana > Marion County > Indianapolis (0.05)
- North America > United States > Virginia (0.04)
- (3 more...)
An Artificial Intelligence-based model for cell killing prediction: development, validation and explainability analysis of the ANAKIN model
Cordoni, Francesco G., Missiaggia, Marta, Scifoni, Emanuele, La Tessa, Chiara
The present work develops ANAKIN: an Artificial iNtelligence bAsed model for (radiation induced) cell KIlliNg prediction. ANAKIN is trained and tested over 513 cell survival experiments with different types of radiation contained in the publicly available PIDE database. We show how ANAKIN accurately predicts several relevant biological endpoints over a wide range of ion beams and for a high number of cell lines. We compare the predictions of ANAKIN to the only two radiobiological models for RBE prediction used in clinics, that is the Microdosimetric Kinetic Model (MKM) and the Local Effect Model (LEM version III), showing how ANAKIN has higher accuracy over all considered biological endpoints. Finally, via modern techniques of Explainable Artificial Intelligence (XAI), we show how ANAKIN predictions can be understood and explained, highlighting how ANAKIN is in fact able to reproduce relevant well-known biological patterns, such as the overkilling effect.
- North America > United States > California > Alameda County > Berkeley (0.14)
- Europe > Switzerland > Ticino > Bellinzona (0.04)
- Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
- (6 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.46)