South America
Weighted Random Dot Product Graphs
Marenco, Bernardo, Bermolen, Paola, Fiori, Marcelo, Larroca, Federico, Mateos, Gonzalo
Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in higher-order moments. We derive statistical guarantees for an estimator of the nodes' latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.
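To make the estimation step concrete, here is a minimal sketch of an adjacency spectral embedding in Python with NumPy. It assumes (the abstract does not spell this out) that the latent positions for the k-th moment are estimated by embedding the entrywise k-th power of the symmetric weight matrix. The function names `adjacency_spectral_embedding` and `moment_embeddings` are hypothetical and chosen for illustration only.

```python
import numpy as np


def adjacency_spectral_embedding(A, d):
    """Return the n x d embedding U_d |S_d|^(1/2) of a symmetric matrix A.

    The d eigenvalues of largest magnitude are kept, so rows of the
    result have inner products that approximate the entries of A.
    """
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)            # eigenvalues in ascending order
    top = np.argsort(np.abs(eigvals))[::-1][:d]     # indices of the d largest |eigenvalues|
    return eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))


def moment_embeddings(W, d, num_moments):
    """Assumed estimator: embed the entrywise k-th power of W for k = 1..num_moments."""
    W = np.asarray(W, dtype=float)
    return [adjacency_spectral_embedding(W ** k, d) for k in range(1, num_moments + 1)]
```

For a noiseless rank-one matrix P = x x^T, the embedding with d = 1 recovers x up to sign, so the product of the embedding with its transpose reproduces P.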
Categorical and geometric methods in statistical, manifold, and machine learning
Lê, Hông Vân, Minh, Hà Quang, Protin, Frederic, Tuschmann, Wilderich
We present and discuss applications of the category of probabilistic morphisms, initially developed in \cite{Le2023}, as well as some geometric methods to several classes of problems in statistical, machine and manifold learning which shall be, along with many other topics, considered in depth in the forthcoming book \cite{LMPT2024}.
On the Residual-based Neural Network for Unmodeled Distortions in Coordinate Transformation
Rofatto, Vinicius Francisco, de Almeida, Luiz Felipe Rodrigues, Matsuoka, Marcelo Tomio, Klein, Ivandro, Veronez, Mauricio Roberto, Junior, Luiz Gonzaga Da Silveira
Coordinate transformation models often fail to account for nonlinear and spatially dependent distortions, leading to significant residual errors in geospatial applications. Here we propose a residual-based neural correction strategy, in which a neural network learns to model only the systematic distortions left by an initial geometric transformation. By focusing solely on residual patterns, the proposed method reduces model complexity and improves performance, particularly in scenarios with sparse or structured control point configurations. We evaluate the method using both simulated datasets with varying distortion intensities and sampling strategies and real-world image georeferencing tasks. Compared with a direct neural network coordinate converter and classical transformation models, the residual-based neural correction delivers more accurate and stable results under challenging conditions, while maintaining comparable performance in ideal cases. These findings demonstrate the effectiveness of residual modelling as a lightweight and robust alternative for improving coordinate transformation accuracy.
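The two-stage idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the initial geometric transformation is taken to be a 2D affine fit, and a quadratic least-squares model stands in for the neural network so that the example needs only NumPy. All function names are hypothetical.

```python
import numpy as np


def fit_affine(src, dst):
    """Stage 1: least-squares 2D affine transform mapping src to dst."""
    A = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params  # shape (3, 2)


def apply_affine(params, pts):
    """Apply the fitted affine transform to an (n, 2) array of points."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ params


def quadratic_features(pts):
    """Feature map for the residual model (a stand-in for a neural network)."""
    x, y = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])


def fit_residual_model(params, src, dst):
    """Stage 2: model only what the affine transform leaves unexplained."""
    residuals = dst - apply_affine(params, src)
    coef, *_ = np.linalg.lstsq(quadratic_features(src), residuals, rcond=None)
    return coef  # shape (6, 2)


def transform(params, coef, pts):
    """Full prediction: geometric transform plus learned residual correction."""
    return apply_affine(params, pts) + quadratic_features(pts) @ coef
```

The residual model never has to relearn the bulk affine mapping, which is the source of the reduced model complexity the abstract describes.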
Multitask LSTM for Arboviral Outbreak Prediction Using Public Health Data
Farias, Lucas R. C., Silva, Talita P., Araujo, Pedro H. M.
This paper presents a multitask learning approach based on long short-term memory (LSTM) networks for the joint prediction of arboviral outbreaks and case counts of dengue, chikungunya, and Zika in Recife, Brazil. Leveraging historical public health data from DataSUS (2017-2023), the proposed model concurrently performs binary classification (outbreak detection) and regression (case forecasting) tasks. A sliding window strategy was adopted to construct temporal features using varying input lengths (60, 90, and 120 days), with hyperparameter optimization carried out using Keras Tuner. Model evaluation used time series cross-validation for robustness and a held-out test from 2023 for generalization assessment. The results show that longer windows improve dengue regression accuracy, while classification performance peaked at intermediate windows, suggesting an optimal trade-off between sequence length and generalization. The multitask architecture delivers competitive performance across diseases and tasks, demonstrating the feasibility and advantages of unified modeling strategies for scalable epidemic forecasting in data-limited public health scenarios.
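The sliding window construction can be sketched in a few lines. The abstract does not state the forecast horizon or how an outbreak is defined, so this sketch assumes a one-step-ahead target and a fixed case-count threshold. The name `make_windows` and the parameter `outbreak_threshold` are hypothetical.

```python
import numpy as np


def make_windows(cases, window, outbreak_threshold):
    """Build sliding-window inputs with paired regression and classification targets.

    Assumptions (not from the source): the target is the value that
    immediately follows each window, and an outbreak is any target
    strictly above `outbreak_threshold`.
    """
    cases = np.asarray(cases, dtype=float)
    n = len(cases) - window
    if n <= 0:
        raise ValueError("series must be longer than the window")
    X = np.stack([cases[i:i + window] for i in range(n)])  # shape (n, window)
    y_reg = cases[window:]                                  # next-step case count
    y_cls = (y_reg > outbreak_threshold).astype(int)        # binary outbreak flag
    return X, y_reg, y_cls
```

Calling `make_windows(series, 60, threshold)`, then again with 90 and 120, would produce the three input lengths compared in the abstract. Both targets come from the same windows, which is what lets a single network share its recurrent layers across the two tasks.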
Scientists studying spherical UFO say they've discovered alien technology
Scientists have released the first X-ray images of a mysterious, sphere-shaped object recovered in Colombia, which locals claim is of alien origin. The so-called 'UFO' was spotted in March over the town of Buga, zig-zagging through the sky in a way that defies the movement of conventional aircraft. The object was recovered shortly after it landed and has since been analyzed by scientists, who discovered it features three layers of metal-like material and 18 microspheres surrounding a central nucleus they are calling 'a chip.' Dr Jose Luis Velazquez, a radiologist who examined the sphere, reported finding 'no welds or joints,' which would typically indicate human fabrication. He and his team concluded: 'It is of artificial origin, in that it shows no evidence of welding, and its internal structure is composed of high-density elements. More testing is needed to establish its origin.'
How Russia and Ukraine Are Playing Trump's Blame Game
On May 9th, Vladimir Putin will oversee a parade in Moscow's Red Square, commemorating the Soviet Union's victory in the Second World War, an annual display of military bravado that, since Russia's full-scale invasion of Ukraine, in 2022, has taken on more explicit political undertones. The country's triumph over Nazism is presented as proof of its righteousness in the current war--and of its role as a global power. Last year, as intercontinental ballistic missiles capable of carrying nuclear warheads rolled across the square, Putin linked the "radiant memory" of those who gave up their lives in the Second World War with "our brothers-in-arms who have fallen in the struggle against neo-Nazism and in the righteous fight for Russia"--that is, Russian soldiers killed in the current war in Ukraine. This year, the celebrations in Moscow serve another purpose: a way for Putin to show that he is not geopolitically isolated--China's Xi Jinping and Brazil's Luiz Inácio Lula da Silva are expected to attend.
Adversarial Robustness of Deep Learning Models for Inland Water Body Segmentation from SAR Images
Kothari, Siddharth, Murali, Srinivasan, Kothari, Sankalp, Verma, Ujjwal, Sreevalsan-Nair, Jaya
Inland water body segmentation from Synthetic Aperture Radar (SAR) images is an important task needed for several applications, such as flood mapping. While SAR sensors capture data in all-weather conditions as high-resolution images, differentiating water and water-like surfaces from SAR images is not straightforward. Inland water bodies, such as large river basins, have complex geometry, which adds to the challenge of segmentation. U-Net is a widely used deep learning model for land-water segmentation of SAR images. In practice, manual annotation is often used to generate the corresponding water masks as ground truth. Manual annotation of the images is prone to label noise owing to data poisoning attacks, especially due to complex geometry. In this work, we simulate manual errors in the form of adversarial attacks on the U-Net model and study the robustness of the model to human errors in annotation. Our results indicate that U-Net can tolerate a certain level of corruption before its performance drops significantly. This finding highlights the crucial role that the quality of manual annotations plays in determining the effectiveness of the segmentation model. The code and the new dataset, along with adversarial examples for robust training, are publicly available. (GitHub link - https://github.com/GVCL/IWSeg-SAR-Poison.git)
Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text
Healey, Jennifer, Byrum, Laurie, Akhtar, Md Nadeem, Bhargava, Surabhi, Sinha, Moumita
LLM evaluation is challenging even in the case of base models. In real world deployments, evaluation is further complicated by the interplay of task specific prompts and experiential context. At scale, bias evaluation is often based on short context, fixed choice benchmarks that can be rapidly evaluated; however, these can lose validity when the LLMs' deployed context differs. Large scale human evaluation is often seen as too intractable and costly. Here we present our journey towards developing a semi-automated bias evaluation framework for free text responses that has human insights at its core. We discuss how we developed an operational definition of bias that helped us automate our pipeline and a methodology for classifying bias beyond multiple choice. We additionally comment on how human evaluation helped us uncover problematic templates in a bias benchmark.
Single-Sample and Robust Online Resource Allocation
Ghuge, Rohan, Singla, Sahil, Wang, Yifan
The Online Resource Allocation problem is a central problem in many areas of Computer Science, Operations Research, and Economics. In this problem, we sequentially receive $n$ stochastic requests for $m$ kinds of shared resources, where each request can be satisfied in multiple ways, consuming different amounts of resources and generating different values. The goal is to achieve a $(1-ε)$-approximation to the hindsight optimum, where $ε>0$ is a small constant, assuming each resource has a large budget. In this paper, we investigate the learnability and robustness of online resource allocation. Our primary contribution is a novel Exponential Pricing algorithm with the following properties: 1. It requires only a \emph{single sample} from each of the $n$ request distributions to achieve a $(1-ε)$-approximation for online resource allocation with large budgets. Such an algorithm was previously unknown, even with access to polynomially many samples, as prior work either assumed full distributional knowledge or was limited to i.i.d.\,or random-order arrivals. 2. It is robust to corruptions in the outliers model and the value augmentation model. Specifically, it maintains its $(1 - ε)$-approximation guarantee under both these robustness models, resolving the open question posed in Argue, Gupta, Molinaro, and Singla (SODA'22). 3. It operates as a simple item-pricing algorithm that ensures incentive compatibility. The intuition behind our Exponential Pricing algorithm is that the price of a resource should adjust exponentially as it is overused or underused. It differs from conventional approaches that use an online learning algorithm for item pricing. This departure guarantees that the algorithm will never run out of any resource, but loses the usual no-regret properties of online learning algorithms, necessitating a new analytical approach.
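The pricing intuition can be illustrated with a toy example. This is not the paper's algorithm: the price formula, the even-pace baseline, the rate `eta`, and the explicit budget check are all assumptions made for illustration, and the function names are hypothetical. Each resource's price grows exponentially when its consumption runs ahead of an even spending pace and shrinks when it lags behind.

```python
import math


def exponential_prices(consumed, budgets, step, total_steps, eta):
    """Toy rule: price = exp(eta * (used - even-pace level) / budget)."""
    prices = []
    for used, budget in zip(consumed, budgets):
        expected = budget * step / total_steps  # consumption under an even pace
        prices.append(math.exp(eta * (used - expected) / budget))
    return prices


def allocate(requests, budgets, eta=5.0):
    """Serve each request with its best positive-utility option at posted prices.

    `requests` is a list of requests; each request is a list of
    (value, usage_per_resource) options. The budget check is a
    safeguard added for this toy and is not claimed from the source.
    """
    consumed = [0.0] * len(budgets)
    total_value = 0.0
    for step, options in enumerate(requests):
        prices = exponential_prices(consumed, budgets, step, len(requests), eta)
        best_utility, best = 0.0, None
        for value, usage in options:
            if any(c + u > b for c, u, b in zip(consumed, usage, budgets)):
                continue  # option would exceed a budget
            utility = value - sum(p * u for p, u in zip(prices, usage))
            if utility > best_utility:
                best_utility, best = utility, (value, usage)
        if best is not None:
            total_value += best[0]
            consumed = [c + u for c, u in zip(consumed, best[1])]
    return total_value, consumed
```

Because every request simply picks its utility-maximizing option at the posted prices, the rule has the item-pricing form that the abstract associates with incentive compatibility.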
Simulation to Reality: Testbeds and Architectures for Connected and Automated Vehicles
Klüner, David, Schäfer, Simon, Hegerath, Lucas, Xu, Jianye, Kahle, Julius, Ibrahim, Hazem, Kampmann, Alexandru, Alrifaee, Bassam
Ensuring the safe and efficient operation of connected and automated vehicles (CAVs) relies heavily on the software framework used. A software framework needs to ensure real-time properties, reliable communication, and efficient resource utilization. Furthermore, a software framework needs to enable seamless transition between testing stages, from simulation to small-scale to full-scale experiments. In this paper, we survey prominent software frameworks used for in-vehicle and inter-vehicle communication in CAVs. We analyze these frameworks regarding opportunities and challenges, such as their real-time properties and transitioning capabilities. Additionally, we delve into the tooling requirements necessary for addressing the associated challenges. We illustrate the practical implications of these challenges through case studies focusing on critical areas such as perception, motion planning, and control. Furthermore, we identify research gaps in the field, highlighting areas where further investigation is needed to advance the development and deployment of safe and efficient CAV systems.