Law
Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance
Individuals often make different decisions when faced with the same context, due to personal preferences and background. For instance, judges may vary in their leniency towards certain drug-related offenses, and doctors may vary in their preference for how to start treatment for certain types of patients.
Supplementary materials: Video compression dataset and benchmark of learning-based video-quality metrics
Anastasia Antsiferova
Below we describe the steps for calculating the different versions of these metrics. To avoid overfitting on our dataset, we used already fitted image- and video-quality-assessment models with public source code. We used mean temporal pooling to aggregate scores from multiple frames. We intend to include more data on this research in future publications.
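As a minimal sketch of the mean temporal pooling mentioned above, the snippet below averages per-frame scores into a single video-level score. The function name `mean_temporal_pooling` and the example scores are hypothetical and are not taken from the supplementary materials; the sketch assumes only that each frame already has one scalar quality score.

```python
from statistics import fmean


def mean_temporal_pooling(frame_scores):
    """Aggregate per-frame quality scores into one video-level score.

    Hypothetical helper: assumes one scalar score per frame and
    returns their arithmetic mean.
    """
    scores = list(frame_scores)
    if not scores:
        raise ValueError("need at least one frame score")
    return fmean(scores)


# Made-up per-frame scores, for illustration only.
example_scores = [0.5, 0.75, 1.0]
video_score = mean_temporal_pooling(example_scores)
```

Mean pooling weights every frame equally, so a short run of badly degraded frames has limited influence on the video-level score.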
Legal Zero-Days: A Novel Risk Vector for Advanced AI Systems
Sadler, Greg, Sherburn, Nathan
We introduce the concept of "Legal Zero-Days" as a novel risk vector for advanced AI systems. Legal Zero-Days are previously undiscovered vulnerabilities in legal frameworks that, when exploited, can cause immediate and significant societal disruption without requiring litigation or other processes before impact. We present a risk model for identifying and evaluating these vulnerabilities, demonstrating their potential to bypass safeguards or impede government responses to AI incidents. Using the 2017 Australian dual citizenship crisis as a case study, we illustrate how seemingly minor legal oversights can lead to large-scale governance disruption. We develop a methodology for creating "legal puzzles" as evaluation instruments for assessing AI systems' capabilities to discover such vulnerabilities. Our findings suggest that while current AI models may not reliably find impactful Legal Zero-Days, future systems may develop this capability, presenting both risks and opportunities for improving legal robustness. This work contributes to the broader effort to identify and mitigate previously unrecognized risks from frontier AI systems.