Redondo Beach
No more fireworks? Big change coming to 4th of July at Pasadena's Rose Bowl
Marking the end of a longtime tradition, the Fourth of July celebration at the Rose Bowl in Pasadena will not feature a fireworks show this year. Instead, there will be a drone show. The move comes as some venues have switched from fireworks to drone shows -- in which a fleet of drones performs a choreographed light show -- to celebrate the 4th of July. But drone shows have fallen flat for some. Notably Redondo Beach and Laguna Beach switched back to fireworks after trying out drone shows, and some promoters of fireworks shows have voiced criticism over efforts to transition to drone shows.
Simulation to Reality: Testbeds and Architectures for Connected and Automated Vehicles
Klüner, David, Schäfer, Simon, Hegerath, Lucas, Xu, Jianye, Kahle, Julius, Ibrahim, Hazem, Kampmann, Alexandru, Alrifaee, Bassam
Ensuring the safe and efficient operation of CAVs relies heavily on the software framework used. A software framework needs to ensure real-time properties, reliable communication, and efficient resource utilization. Furthermore, a software framework needs to enable seamless transition between testing stages, from simulation to small-scale to full-scale experiments. In this paper, we survey prominent software frameworks used for in-vehicle and inter-vehicle communication in CAVs. We analyze these frameworks regarding opportunities and challenges, such as their real-time properties and transitioning capabilities. Additionally, we delve into the tooling requirements necessary for addressing the associated challenges. We illustrate the practical implications of these challenges through case studies focusing on critical areas such as perception, motion planning, and control. Furthermore, we identify research gaps in the field, highlighting areas where further investigation is needed to advance the development and deployment of safe and efficient CAV systems.
Information maximization for a broad variety of multi-armed bandit games
Barbier-Chebbah, Alex, Vestergaard, Christian L., Masson, Jean-Baptiste
Information and free-energy maximization are physics principles that provide general rules for an agent to optimize actions in line with specific goals and policies. These principles are the building blocks for designing decision-making policies capable of efficient performance with only partial information. Notably, the information maximization principle has shown remarkable success in the classical bandit problem and has recently been shown to yield optimal algorithms for Gaussian and sub-Gaussian reward distributions. This article explores a broad extension of physics-based approaches to more complex and structured bandit problems. To this end, we cover three distinct types of bandit problems, where information maximization is adapted and leads to strong performance. Since the main challenge of information maximization lies in avoiding over-exploration, we highlight how information is tailored at various levels to mitigate this issue, paving the way for more efficient and robust decision-making strategies.
Better Private Distribution Testing by Leveraging Unverified Auxiliary Data
Aliakbarpour, Maryam, Burudgunte, Arnav, Cannone, Clément, Rubinfeld, Ronitt
Accurately analyzing data while preserving individual privacy is a fundamental challenge in statistical inference. Since its formulation nearly two decades ago, Differential Privacy (DP) [DMNS06] has emerged as the leading framework for privacy-preserving data analysis, providing strong mathematical privacy guarantees and gaining adoption by major entities such as the U.S. Census Bureau, Amazon [Ama24], Google [EPK14], Microsoft [DKY17], and Apple [Dif17; TVVKFSD17]. Unfortunately, DP guarantees often come at the cost of increased data requirements or computational resources, which has limited the widespread adoption of differential privacy in spite of its theoretical appeal. To address this issue, a recent line of work has investigated whether access to even small amounts of additional public data could help mitigate this loss of performance. Promising results for various tasks have been shown, both experimentally [KST20; LLHR24; BZHZK24; DORKSF24] and theoretically [BKS22; BBCKS23]. The use of additional auxiliary information is very enticing, as such access is available in many real-world applications: for example, hospitals handling sensitive patient data might leverage public datasets, records from different periods or locations, or synthetic data generated by machine learning models to improve analysis. Similarly, medical or socio-econonomic studies focusing on a minority or protected group can leverage statistical data from the overall population. However, integrating public data introduces its own challenges, as it often lacks guarantees regarding its accuracy or relevance to private datasets.
Streaming, Memory Limited Algorithms for Community Detection
Se-Young Yun, marc lelarge, Alexandre Proutiere
In this paper, we consider sparse networks consisting of a finite number of nonoverlapping communities, i.e. disjoint clusters, so that there is higher density within clusters than across clusters. Both the intra-and inter-cluster edge densities vanish when the size of the graph grows large, making the cluster reconstruction problem nosier and hence difficult to solve. We are interested in scenarios where the network size is very large, so that the adjacency matrix of the graph is hard to manipulate and store. The data stream model in which columns of the adjacency matrix are revealed sequentially constitutes a natural framework in this setting. For this model, we develop two novel clustering algorithms that extract the clusters asymptotically accurately. The first algorithm is offline, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size. The second algorithm is online, as it may classify a node when the corresponding column is revealed and then discard this information. This algorithm requires a memory growing sub-linearly with the network size. To construct these efficient streaming memory-limited clustering algorithms, we first address the problem of clustering with partial information, where only a small proportion of the columns of the adjacency matrix is observed and develop, for this setting, a new spectral algorithm which is of independent interest.
A Novel End-To-End Event Geolocation Method Leveraging Hyperbolic Space and Toponym Hierarchies
Abstract: Timely detection and geolocation of events based on social data can provide critical information for applications such as crisis response and resource allocation. However, most existing methods are greatly affected by event detection errors, leading to insufficient geolocation accuracy. To this end, this paper proposes a novel end-to-end event geolocation method (GTOP) leveraging Hyperbolic space and toponym hierarchies. Specifically, the proposed method contains one event detection module and one geolocation module. The event detection module constructs a heterogeneous information networks based on social data, and then constructs a homogeneous message graph and combines it with the text and time feature of the message to learning initial features of nodes. Node features are updated in Hyperbolic space and then fed into a classifier for event detection. To reduce the geolocation error, this paper proposes a noise toponym filtering algorithm (HIST) based on the hierarchical structure of toponyms. HIST analyzes the hierarchical structure of toponyms mentioned in the event cluster, taking the highly frequent city-level locations as the coarsegrained locations for events. To further improve the geolocation accuracy, we propose a fine-grained pseudo toponyms generation algorithm (FIT) based on the output of HIST, and combine generated pseudo toponyms with filtered toponyms to locate events based on the geographic center points of the combined toponyms. Extensive experiments are conducted on the Chinese dataset constructed in this paper and another public English dataset. The experimental results show that the proposed method is superior to the state-of-the-art baselines.
Optimal Algorithms for Augmented Testing of Discrete Distributions
Aliakbarpour, Maryam, Indyk, Piotr, Rubinfeld, Ronitt, Silwal, Sandeep
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor's quality, measured by its total variation distance from $p$. A key advantage of our algorithms is their adaptability to the precision of the prediction. Specifically, our algorithms can self-adjust their sample complexity based on the accuracy of the available prediction, operating without any prior knowledge of the estimation's accuracy (i.e. they are consistent). Additionally, we never use more samples than the standard approaches require, even if the predictions provide no meaningful information (i.e. they are also robust). We provide lower bounds to indicate that the improvements in sample complexity achieved by our algorithms are information-theoretically optimal. Furthermore, experimental results show that the performance of our algorithms on real data significantly exceeds our worst-case guarantees for sample complexity, demonstrating the practicality of our approach.