
Law

WCLD: Curated Large Dataset of Criminal Cases from Wisconsin Circuit Courts

Elliott Ash

Neural Information Processing Systems

We used reliable public data from 1970 to 2020 to curate attributes such as prior criminal counts and recidivism outcomes. The dataset contains a large number of samples from five racial groups, along with information such as sex and age (at judgment and at first offense).




Police launch talks on stricter drone rules in Japan

The Japan Times

The National Police Agency on Tuesday held the first meeting of an expert panel on measures against illegal drone flights, in light of the growing threat of drones being used for terrorism and other purposes. The panel plans to compile a report by the end of the year on expanding the list of no-fly zones and penalties, with a view to revising the drone control law. The law was established in 2016 in the wake of an incident in which a drone fell on the roof of the Prime Minister's Office in Tokyo. It currently bans drone flights within about 300 meters of important facilities such as the National Diet Building, the Imperial Palace and nuclear power plants.


Efficient Prediction of Pass@k Scaling in Large Language Models

arXiv.org Machine Learning

Assessing the capabilities and risks of frontier AI systems is a critical area of research, and recent work has shown that repeated sampling from models can dramatically increase both. For instance, repeated sampling has been shown to increase their capabilities, such as solving difficult math and coding problems, but it has also been shown to increase their potential for harm, such as being jailbroken. Such results raise a crucial question for both capability and safety forecasting: how can one accurately predict a model's behavior when scaled to a massive number of attempts, given a vastly smaller sampling budget? This question is directly relevant to model providers, who serve hundreds of millions of users daily, and to governmental regulators, who seek to prevent harms. To answer this question, we make three contributions. First, we find that standard methods for fitting these laws suffer from statistical shortcomings that hinder predictive accuracy, especially in data-limited scenarios. Second, we remedy these shortcomings by introducing a robust estimation framework, which uses a beta-binomial distribution to generate more accurate predictions from limited data. Third, we propose a dynamic sampling strategy that allocates a greater budget to harder problems. Combined, these innovations enable more reliable prediction of rare risks and capabilities at a fraction of the computational cost.
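
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each problem's per-attempt success probability p is drawn from a Beta(alpha, beta) distribution, so that success counts out of n attempts are beta-binomial and the aggregate pass@k is 1 - E[(1 - p)^k] = 1 - B(alpha, beta + k) / B(alpha, beta). The function names are hypothetical, and a simple method-of-moments fit stands in for whatever fitting procedure the paper actually uses.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Log of the Beta function B(a, b), via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def fit_beta_binomial_moments(successes: list[int], n: int) -> tuple[float, float]:
    """Method-of-moments estimate of (alpha, beta).

    successes[i] is the number of successes out of n attempts on problem i.
    ASSUMPTION: this fit is an illustrative stand-in, not the paper's procedure.
    """
    m1 = sum(successes) / len(successes)                 # first sample moment
    m2 = sum(s * s for s in successes) / len(successes)  # second sample moment
    if m1 <= 0 or m1 >= n:
        raise ValueError("all-fail or all-pass data; moment fit is undefined")
    denom = n * (m2 / m1 - m1 - 1) + m1
    if denom <= 0:
        raise ValueError("counts are not overdispersed; moment fit is undefined")
    alpha = (n * m1 - m2) / denom
    beta = (n - m1) * (n - m2 / m1) / denom
    return alpha, beta


def expected_pass_at_k(alpha: float, beta: float, k: int) -> float:
    """Aggregate pass@k when per-problem success probability p ~ Beta(alpha, beta).

    Uses E[(1 - p)^k] = B(alpha, beta + k) / B(alpha, beta).
    """
    log_all_fail = log_beta(alpha, beta + k) - log_beta(alpha, beta)
    return 1.0 - math.exp(log_all_fail)


if __name__ == "__main__":
    # Synthetic counts for illustration only: 11 problems, 10 attempts each.
    counts = list(range(11))
    alpha, beta = fit_beta_binomial_moments(counts, n=10)
    for k in (1, 10, 1000):
        print(k, expected_pass_at_k(alpha, beta, k))
```

The point of the sketch is that a small budget (here 10 attempts per problem) is enough to fit two parameters, after which pass@k can be extrapolated to any k in closed form. A moment fit can fail or be unstable on sparse data, which is one reason a more careful estimator would be needed in practice.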