available data
Augmented data and neural networks for robust epidemic forecasting: application to COVID-19 in Italy
Dimarco, Giacomo, Ferrarese, Federica, Pareschi, Lorenzo
In this work, we propose a data augmentation strategy aimed at improving the training phase of neural networks and, consequently, the accuracy of their predictions. Our approach relies on generating synthetic data through a suitable compartmental model combined with the incorporation of uncertainty. The available data are then used to calibrate the model, which is further integrated with deep learning techniques to produce additional synthetic data for training. The results show that neural networks trained on these augmented datasets exhibit significantly improved predictive performance. We focus in particular on two different neural network architectures: Physics-Informed Neural Networks (PINNs) and Nonlinear Autoregressive (NAR) models. The NAR approach proves especially effective for short-term forecasting, providing accurate quantitative estimates by directly learning the dynamics from data and avoiding the additional computational cost of embedding physical constraints into the training. In contrast, PINNs yield less accurate quantitative predictions but capture the qualitative long-term behavior of the system, making them more suitable for exploring broader dynamical trends. Numerical simulations of the second phase of the COVID-19 pandemic in the Lombardy region (Italy) validate the effectiveness of the proposed approach.
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Epidemiology (1.00)
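The augmentation idea in the abstract above (calibrate a compartmental model, then perturb it to generate extra training trajectories) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: a plain SIR model stands in for the paper's "suitable compartmental model", uncertainty is modelled as a uniform relative perturbation of the rates, and the names `simulate_sir`, `augment`, `beta_hat`, `gamma_hat` and `rel_spread` are hypothetical.

```python
import random

def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler; return daily (S, I, R) fractions."""
    s, i, r = s0, i0, r0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        trajectory.append((s, i, r))
    return trajectory

def augment(beta_hat, gamma_hat, rel_spread, n_samples, days, seed=0):
    """Sample rates around calibrated values (assumed uniform noise) and simulate each draw."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        beta = beta_hat * (1 + rng.uniform(-rel_spread, rel_spread))
        gamma = gamma_hat * (1 + rng.uniform(-rel_spread, rel_spread))
        dataset.append(simulate_sir(beta, gamma, 0.99, 0.01, 0.0, days))
    return dataset

# Illustrative values only: 50 synthetic 60-day trajectories around beta=0.3, gamma=0.1.
synthetic = augment(beta_hat=0.3, gamma_hat=0.1, rel_spread=0.2, n_samples=50, days=60)
```

Each synthetic trajectory could then be added to the training set of a forecasting network; the calibration step and the network itself are outside the scope of this sketch.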
Peering Partner Recommendation for ISPs using Machine Learning
Alam, Md Ibrahim Ibne, Senapati, Ankur, Mahmood, Anindo, Yuksel, Murat, Kar, Koushik
Internet service providers (ISPs) need to connect with other ISPs to provide global connectivity services to their users. To ensure global connectivity, ISPs can either use transit service(s) or establish direct peering relationships between themselves via Internet exchange points (IXPs). Peering offers more room for ISP-specific optimizations and is preferred, but it often involves a lengthy and complex process. Automating peering partner selection can enhance efficiency in the global Internet ecosystem. We explore the use of publicly available data on ISPs to develop a machine learning (ML) model that can predict whether an ISP pair should peer or not. At first, we explore public databases, e.g., PeeringDB, CAIDA, etc., to gather data on ISPs. Then, we evaluate the performance of three broad types of ML models for predicting peering relationships: tree-based, neural network-based, and transformer-based. Among these, we observe that tree-based models achieve the highest accuracy and efficiency in our experiments. The XGBoost model trained with publicly available data showed promising performance, with a 98% accuracy rate in predicting peering partners. In addition, the model demonstrated great resilience to variations in time, space, and missing data. We envision that ISPs can adopt our method to fully automate the peering partner selection process, thus transitioning to a more efficient and optimized Internet ecosystem.
- North America > United States > Florida > Orange County > Orlando (0.14)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Europe (0.04)
Foundation models for time series forecasting: Application in conformal prediction
Achour, Sami, Bouher, Yassine, Nguyen, Duong, Chesneau, Nicolas
The zero-shot capabilities of foundation models (FMs) for time series forecasting hold promise for conformal prediction, as most of the available data can be allocated to calibration. This study compares the performance of Time Series Foundation Models (TSFMs) with traditional methods, including statistical models and gradient boosting, within a conformal prediction setting. Our findings highlight two key advantages of TSFMs. First, when the volume of data is limited, TSFMs provide more reliable conformalized prediction intervals than classic models, thanks to their superior predictive accuracy. Second, the calibration process is more stable because more data are used for calibration. Moreover, the fewer data are available, the more pronounced these benefits become, as classic models require a substantial amount of data for effective training. These results underscore the potential of foundation models in improving conformal prediction reliability in time series applications, particularly in data-constrained cases. All the code to reproduce the experiments is available on GitHub.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
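To make concrete why a zero-shot forecaster leaves more data for calibration, here is a minimal split conformal sketch. It is not the study's code: a "repeat the last value" rule stands in for a pretrained forecaster that needs no training, the nonconformity score is assumed to be the absolute one-step-ahead error, and the function names are hypothetical. The usual coverage guarantee also assumes exchangeable scores, which time series only approximately satisfy.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity if n is too small."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

def naive_forecast(history):
    """Stand-in for a zero-shot forecaster: predict the last observed value."""
    return history[-1]

def conformal_interval(series, n_calibration, alpha=0.1):
    """Calibrate on the last n_calibration points, then return an interval for the next step."""
    start = len(series) - n_calibration
    if start < 1:
        raise ValueError("need at least one observation before the calibration window")
    scores = [abs(series[t] - naive_forecast(series[:t])) for t in range(start, len(series))]
    q = conformal_quantile(scores, alpha)
    point = naive_forecast(series)
    return point - q, point + q

# No observations are spent on training, so almost the whole series calibrates the interval.
lower, upper = conformal_interval(list(range(21)), n_calibration=19, alpha=0.1)
```

A model that must be trained would have to give up part of the series for fitting, leaving fewer calibration scores and a coarser quantile estimate.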
A data augmentation strategy for deep neural networks with application to epidemic modelling
Awais, Muhammad, Ali, Abu Sayfan, Dimarco, Giacomo, Ferrarese, Federica, Pareschi, Lorenzo
In this work, we integrate the predictive capabilities of compartmental disease dynamics models with the ability of machine learning to analyze complex, high-dimensional data and uncover patterns that conventional models may overlook. Specifically, we present a proof of concept demonstrating the application of data-driven methods and deep neural networks to a recently introduced SIR-type model with social features, including a saturated incidence rate, to improve epidemic prediction and forecasting. Our results show that a robust data augmentation strategy through suitable data-driven models can improve the reliability of Feed-Forward Neural Networks (FNNs) and Nonlinear Autoregressive Networks (NARs), making them viable alternatives to Physics-Informed Neural Networks (PINNs). This approach enhances the ability to handle nonlinear dynamics and offers scalable, data-driven solutions for epidemic forecasting, prioritizing predictive accuracy over the constraints of physics-based models. Numerical simulations of the post-lockdown phase of the COVID-19 epidemic in Italy and Spain validate our methodology.
- Europe > Italy (0.28)
- Europe > Spain (0.27)
- North America > United States > New York (0.04)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Epidemiology (1.00)
CAS: Confidence Assessments of classification algorithms for Semantic segmentation of EO data
Dionelis, Nikolaos, Longepe, Nicolas
Confidence assessments of semantic segmentation algorithms in remote sensing are important. It is desirable for a model to know a priori whether it is producing an incorrect output. Evaluations of the confidence assigned to the estimates of models for the task of classification in Earth Observation (EO) are crucial, as they can be used to achieve improved semantic segmentation performance and prevent high error rates during inference and deployment. The model we develop, the Confidence Assessments of classification algorithms for Semantic segmentation (CAS) model, performs confidence evaluations at both the segment and pixel levels, and outputs both labels and confidence. The outcome of this work has important applications. The main application is the evaluation of EO Foundation Models on semantic segmentation downstream tasks, in particular land cover classification using satellite Copernicus Sentinel-2 data. The evaluation shows that the proposed model is effective and outperforms alternative baseline models.
A neural network-based approach to hybrid systems identification for control
Fabiani, Filippo, Stellato, Bartolomeo, Masti, Daniele, Goulart, Paul J.
We consider the problem of designing a machine learning-based model of an unknown dynamical system from a finite number of (state-input)-successor state data points, such that the model obtained is also suitable for optimal control design. We propose a specific neural network (NN) architecture that yields a hybrid system with piecewise-affine dynamics that is differentiable with respect to the network's parameters, thereby enabling the use of derivative-based training procedures. We show that a careful choice of our NN's weights produces a hybrid system model with structural properties that are highly favourable when used as part of a finite horizon optimal control problem (OCP). Specifically, we show that optimal solutions with strong local optimality guarantees can be computed via nonlinear programming, in contrast to classical OCPs for general hybrid systems which typically require mixed-integer optimization. In addition to being well-suited for optimal control design, numerical simulations illustrate that our NN-based technique enjoys very similar performance to state-of-the-art system identification methodologies for hybrid systems and it is competitive on nonlinear benchmarks.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Italy (0.04)
Generative AI Degrades Online Communities
ChatGPT generates believable text about nearly any subject, but there is a big difference between "believable" and "correct." ChatGPT, like other LLMs, is trained on large swaths of publicly available data, in large part scraped from online forums such as Stack Overflow and Reddit. Given differences in the volume of available data, ChatGPT's performance naturally varies by topic and may in turn affect communities to different degrees. We observed that ChatGPT's impact on Stack Overflow participation varies significantly across topics, aligning with its expected performance based on available training data. Topics related to open-source tools and general-purpose programming languages (for example, Python, R) appeared to experience larger declines in participation and contribution than proprietary and closed technologies, such as those employed for enterprise server-side development (for example, Spring Framework, AWS, Azure).
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)
Cross-Validation Conformal Risk Control
Cohen, Kfir M., Park, Sangwoo, Simeone, Osvaldo, Shamai, Shlomo
Conformal risk control (CRC) is a recently proposed technique that is applied post hoc to a conventional point predictor to provide calibration guarantees. Generalizing conformal prediction (CP), CRC ensures calibration for a set predictor that is extracted from the point predictor to control a risk function such as the probability of miscoverage or the false negative rate. The original CRC requires the available data set to be split between training and validation data sets. This can be problematic when data availability is limited, resulting in inefficient set predictors. In this paper, a novel CRC method is introduced that is based on cross-validation, rather than on validation as in the original CRC. The proposed cross-validation CRC (CV-CRC) extends a version of the jackknife-minmax from CP to CRC, allowing for the control of a broader range of risk functions. CV-CRC is proved to offer theoretical guarantees on the average risk of the set predictor. Furthermore, numerical experiments show that CV-CRC can reduce the average set size with respect to CRC when the available data are limited.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
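For orientation, the validation-based CRC that the abstract above takes as its starting point can be sketched as follows. This is not the proposed CV-CRC: it is a minimal illustration of choosing a set-size parameter on held-out data so that a false negative rate stays below a target. The set predictor (include every label whose score is at least `1 - lam`), the loss bound of 1, the search grid, and all function names are assumptions made for the example.

```python
def false_negative_rate(scores, true_labels, lam):
    """Fraction of true labels missing from the set {k : scores[k] >= 1 - lam}."""
    predicted = {k for k, score in enumerate(scores) if score >= 1 - lam}
    missed = [k for k in true_labels if k not in predicted]
    return len(missed) / len(true_labels)

def crc_threshold(val_scores, val_labels, alpha, grid, bound=1.0):
    """Smallest lam in grid with (n * empirical_risk + bound) / (n + 1) <= alpha.

    Assumes the loss is non-increasing in lam and bounded above by `bound`.
    Falls back to the largest grid value when no candidate meets the target.
    """
    n = len(val_scores)
    for lam in sorted(grid):
        empirical_risk = sum(
            false_negative_rate(s, y, lam) for s, y in zip(val_scores, val_labels)
        ) / n
        if (n * empirical_risk + bound) / (n + 1) <= alpha:
            return lam
    return max(grid)

# Toy validation split: 9 examples, two labels each, label 0 is the true one.
val_scores = [[0.75, 0.25]] * 9
val_labels = [[0]] * 9
lam_hat = crc_threshold(val_scores, val_labels, alpha=0.2, grid=[0.0, 0.5, 1.0])
```

The `bound / (n + 1)` term shows why small validation sets hurt: with only 9 validation points, no threshold can certify a risk below 0.1, which is the kind of inefficiency that motivates reusing data through cross-validation.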
CLadder: Assessing Causal Reasoning in Language Models
Jin, Zhijing, Chen, Yuen, Leeb, Felix, Gresele, Luigi, Kamal, Ojasv, Lyu, Zhiheng, Blin, Kevin, Adauto, Fernando Gonzalez, Kleiman-Weiner, Max, Sachan, Mrinmaya, Schölkopf, Bernhard
The ability to perform causal reasoning is widely considered a core feature of intelligence. In this work, we investigate whether large language models (LLMs) can coherently reason about causality. Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules. To address this, we propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al. We compose a large dataset, CLadder, with 10K samples: based on a collection of causal graphs and queries (associational, interventional, and counterfactual), we obtain symbolic questions and ground-truth answers, through an oracle causal inference engine. These are then translated into natural language. We evaluate multiple LLMs on our dataset, and we introduce and evaluate a bespoke chain-of-thought prompting strategy, CausalCoT. We show that our task is highly challenging for LLMs, and we conduct an in-depth analysis to gain deeper insights into the causal reasoning abilities of LLMs. Our data is open-sourced at https://huggingface.co/datasets/causalNLP/cladder, and our code can be found at https://github.com/causalNLP/cladder.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Hong Kong (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Workflow (0.94)
- Research Report (0.81)
- Overview (0.68)
- Health & Medicine > Therapeutic Area > Immunology (0.93)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.92)
- Education (0.67)
Algorithmic Information Forecastability
Amigo, Glauco, Díaz-Pachón, Daniel Andrés, Marks, Robert J., Baylis, Charles
Not all time series can be forecast: the flips of a fair coin, for example, cannot. Others, like the repeated {01} sequence {010101...}, can be forecast exactly. Algorithmic information theory can provide a measure of forecastability that lies between these extremes. The degree of forecastability is a function of only the data. For prediction (or classification) of labeled data, we propose three categories for forecastability: oracle forecastability for predictions that are always exact, precise forecastability for errors up to a bound, and probabilistic forecastability for any other predictions. Examples are given in each case.
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
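The two extremes named in the abstract above, a fair coin and the repeated 01 sequence, can be contrasted with a rough numerical proxy. Algorithmic (Kolmogorov) complexity is not computable, so this sketch substitutes the output size of a general-purpose compressor, which only gives an upper bound; it is not the forecastability measure defined in the paper, and `compressed_fraction` is a hypothetical helper.

```python
import random
import zlib

def compressed_fraction(bits: str) -> float:
    """Compressed size divided by raw size; lower values suggest more exploitable structure."""
    raw = bits.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)

# Exactly forecastable: the period-2 pattern 0101...
periodic = "01" * 2000

# Not forecastable: simulated fair coin flips (seeded only for reproducibility).
rng = random.Random(0)
coin_flips = "".join(rng.choice("01") for _ in range(4000))

print(compressed_fraction(periodic), compressed_fraction(coin_flips))
```

The periodic sequence shrinks to a small fraction of its length, while the coin flips cannot be compressed much below one bit per symbol, mirroring the gap between exact and impossible forecasting.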