Goto

Collaborating Authors

 policymaker


How one controversial startup hopes to cool the planet

MIT Technology Review

And why many scientists are freaked out about the first serious for-profit company moving into the solar geoengineering field. Stardust Solutions believes that it can solve climate change--for a price. The Israel-based geoengineering startup has said it expects nations will soon pay it more than a billion dollars a year to launch specially equipped aircraft into the stratosphere. Once they've reached the necessary altitude, those planes will disperse particles engineered to reflect away enough sunlight to cool down the planet, purportedly without causing environmental side effects. The proprietary (and still secret) particles could counteract all the greenhouse gases the world has emitted over the last 150 years, the company stated in a 2023 pitch deck it presented to venture capital firms. In fact, it's the "only technologically feasible solution" to climate change, the company said. The company disclosed it raised $60 million in funding in October, marking by far the largest known funding round to date for a startup working on solar geoengineering.


Left Leaning Models: How AI Evaluates Economic Policy?

Chupilkin, Maxim

arXiv.org Artificial Intelligence

Would artificial intelligence (AI) cut interest rates or adopt conservative monetary policy? Would it deregulate or opt for a more controlled economy? As AI use by economic policymakers, academics, and market participants grows exponentially, it is becoming critical to understand AI preferences over economic policy. However, these preferences are not yet systematically evaluated and remain a black box. This paper makes a conjoint experiment on leading large language models (LLMs) from OpenAI, Anthropic, and Google, asking them to evaluate economic policy under multi-factor constraints. The results are remarkably consistent across models: most LLMs exhibit a strong preference for high growth, low unemployment, and low inequality over traditional macroeconomic concerns such as low inflation and low public debt. Scenario-specific experiments show that LLMs are sensitive to context but still display strong preferences for low unemployment and low inequality even in monetary-policy settings. Numerical sensitivity tests reveal intuitive responses to quantitative changes but also uncover non-linear patterns such as loss aversion.


From Prediction to Foresight: The Role of AI in Designing Responsible Futures

Perez-Ortiz, Maria

arXiv.org Artificial Intelligence

In an era marked by rapid technological advancements and complex global challenges, responsible foresight has emerged as an essential framework for policymakers aiming to navigate future uncertainties and shape the future. Responsible foresight entails the ethical anticipation of emerging opportunities and risks, with a focus on fostering proactive, sustainable, and accountable future design. This paper coins the term "responsible computational foresight", examining the role of human-centric artificial intelligence and computational modeling in advancing responsible foresight, establishing a set of foundational principles for this new field and presenting a suite of AI-driven foresight tools currently shaping it. AI, particularly in conjunction with simulations and scenario analysis, enhances policymakers' ability to address uncertainty, evaluate risks, and devise strategies geared toward sustainable, resilient futures. However, responsible foresight extends beyond mere technical forecasting; it demands a nuanced understanding of the interdependencies within social, environmental, economic and political systems, alongside a commitment to ethical, long-term decision-making that supports human intelligence. We argue that AI will play a role as a supportive tool in responsible, human-centered foresight, complementing rather than substituting policymaker judgment to enable the proactive shaping of resilient and ethically sound futures. This paper advocates for the thoughtful integration of AI into foresight practices to empower policymakers and communities as they confront the grand challenges of the 21st century.


A Framework for Human-Reason-Aligned Trajectory Evaluation in Automated Vehicles

Suryana, Lucas Elbert, Rahmani, Saeed, Calvert, Simeon Craig, Zgonnikov, Arkady, van Arem, Bart

arXiv.org Artificial Intelligence

One major challenge for the adoption and acceptance of automated vehicles (AVs) is ensuring that they can make sound decisions in everyday situations that involve ethical tension. Much attention has focused on rare, high-stakes dilemmas such as trolley problems. Yet similar conflicts arise in routine driving when human considerations, such as legality, efficiency, and comfort, come into conflict. Current AV planning systems typically rely on rigid rules, which struggle to balance these competing considerations and often lead to behaviour that misaligns with human expectations. This paper introduces a reasons-based trajectory evaluation framework that operationalises the tracking condition of Meaningful Human Control (MHC). The framework represents human agents reasons (e.g., regulatory compliance) as quantifiable functions and evaluates how well candidate trajectories align with them. It assigns adjustable weights to agent priorities and includes a balance function to discourage excluding any agent. To demonstrate the approach, we use a real-world-inspired overtaking scenario, which highlights tensions between compliance, efficiency, and comfort. Our results show that different trajectories emerge as preferable depending on how agents reasons are weighted, and small shifts in priorities can lead to discrete changes in the selected action. This demonstrates that everyday ethical decisions in AV driving are highly sensitive to the weights assigned to the reasons of different human agents.


Recommendations and Reporting Checklist for Rigorous & Transparent Human Baselines in Model Evaluations

Wei, Kevin L., Paskov, Patricia, Dev, Sunishchal, Byun, Michael J., Reuel, Anka, Roberts-Gaal, Xavier, Calcott, Rachel, Coxon, Evie, Deshpande, Chinmay

arXiv.org Artificial Intelligence

In this position paper, we argue that human baselines in foundation model evaluations must be more rigorous and more transparent to enable meaningful comparisons of human vs. AI performance, and we provide recommendations and a reporting checklist towards this end. Human performance baselines are vital for the machine learning community, downstream users, and policymakers to interpret AI evaluations. Models are often claimed to achieve "super-human" performance, but existing baselining methods are neither sufficiently rigorous nor sufficiently well-documented to robustly measure and assess performance differences. Based on a meta-review of the measurement theory and AI evaluation literatures, we derive a framework with recommendations for designing, executing, and reporting human baselines. We synthesize our recommendations into a checklist that we use to systematically review 115 human baselines (studies) in foundation model evaluations and thus identify shortcomings in existing baselining methods; our checklist can also assist researchers in conducting human baselines and reporting results. We hope our work can advance more rigorous AI evaluation practices that can better serve both the research community and policymakers. Data is available at: https://github.com/kevinlwei/human-baselines


Lost in Translation: Policymakers are not really listening to Citizen Concerns about AI

Aaronson, Susan Ariel, Moreno, Michael

arXiv.org Artificial Intelligence

The worlds people have strong opinions about artificial intelligence (AI), and they want policymakers to listen. Governments are inviting public comment on AI, but as they translate input into policy, much of what citizens say is lost. Policymakers are missing a critical opportunity to build trust in AI and its governance. This paper compares three countries, Australia, Colombia, and the United States, that invited citizens to comment on AI risks and policies. Using a landscape analysis, the authors examined how each government solicited feedback and whether that input shaped governance. Yet in none of the three cases did citizens and policymakers establish a meaningful dialogue. Governments did little to attract diverse voices or publicize calls for comment, leaving most citizens unaware or unprepared to respond. In each nation, fewer than one percent of the population participated. Moreover, officials showed limited responsiveness to the feedback they received, failing to create an effective feedback loop. The study finds a persistent gap between the promise and practice of participatory AI governance. The authors conclude that current approaches are unlikely to build trust or legitimacy in AI because policymakers are not adequately listening or responding to public concerns. They offer eight recommendations: promote AI literacy; monitor public feedback; broaden outreach; hold regular online forums; use innovative engagement methods; include underrepresented groups; respond publicly to input; and make participation easier.


Reproducibility: The New Frontier in AI Governance

Mason-Williams, Israel, Mason-Williams, Gabryel

arXiv.org Artificial Intelligence

AI policymakers are responsible for delivering effective governance mechanisms that can provide safe, aligned and trustworthy AI development. However, the information environment offered to policymakers is characterised by an unnecessarily low Signal-To-Noise Ratio, favouring regulatory capture and creating deep uncertainty and divides on which risks should be prioritised from a governance perspective. We posit that the current publication speeds in AI combined with the lack of strong scientific standards, via weak reproducibility protocols, effectively erodes the power of policymakers to enact meaningful policy and governance protocols. Our paper outlines how AI research could adopt stricter reproducibility guidelines to assist governance endeavours and improve consensus on the AI risk landscape. We evaluate the forthcoming reproducibility crisis within AI research through the lens of crises in other scientific domains; providing a commentary on how adopting preregistration, increased statistical power and negative result publication reproducibility protocols can enable effective AI governance. While we maintain that AI governance must be reactive due to AI's significant societal implications we argue that policymakers and governments must consider reproducibility protocols as a core tool in the governance arsenal and demand higher standards for AI research. Code to replicate data and figures: https://github.com/IFMW01/reproducibility-the-new-frontier-in-ai-governance


Lack of data on government shutdown blurs US economy insights

Al Jazeera

The government has shut down - what's next? Will a government shutdown hurt the economy? From Wall Street trading floors to the United States Federal Reserve to economists sipping coffee in their home offices, the first Friday morning of the month typically brings a quiet hush around 8:30am Eastern time in the US [12:30 GMT] as everyone awaits the Labor Department's crucial monthly jobs report. But with the government shut down, no information was released on Friday about hiring in September. Hiring has ground nearly to a halt, threatening to drag down the broader economy. Yet at the same time, consumers -- particularly higher-income earners -- are still spending, and some businesses are ramping up investments in data centres developing artificial intelligence models.


Towards Effective E-Participation of Citizens in the European Union: The Development of AskThePublic

Messerschmidt, Nils, Sprenkamp, Kilian, Sartipi, Amir, Wu, Xiaohui, Tchappi, Igor, Zavolokina, Liudmila, Fridgen, Gilbert

arXiv.org Artificial Intelligence

E-participation platforms are an important asset for governments in increasing trust and fostering democratic societies. By engaging public and private institutions and individuals, policymakers can make informed and inclusive decisions. However, current approaches of primarily static nature struggle to integrate citizen feedback effectively. Drawing on the Media Richness Theory and applying the Design Science Research method, we explore how a chatbot can address these shortcomings to improve the decision-making abilities for primary stakeholders of e-participation platforms. Leveraging the "Have Y our Say" platform, which solicits feedback on initiatives and regulations by the European Commission, a Large Language Model-based chatbot, called AskThePublic is created, providing policymakers, journalists, researchers, and interested citizens with a convenient channel to explore and engage with citizen input. Evaluating AskThePublic in 11 semi-structured interviews with public sector-affiliated experts, we find that the interviewees value the interactive and structured responses as well as enhanced language capabilities.


NeurIPS should lead scientific consensus on AI policy

Bommasani, Rishi

arXiv.org Artificial Intelligence

Designing wise AI policy is a grand challenge for society. To design such policy, policymakers should place a premium on rigorous evidence and scientific consensus. While several mechanisms exist for evidence generation, and nascent mechanisms tackle evidence synthesis, we identify a complete void on consensus formation. In this position paper, we argue NeurIPS should actively catalyze scientific consensus on AI policy. Beyond identifying the current deficit in consensus formation mechanisms, we argue that NeurIPS is the best option due its strengths and the paucity of compelling alternatives. To make progress, we recommend initial pilots for NeurIPS by distilling lessons from the IPCC's leadership to build scientific consensus on climate policy. We dispel predictable counters that AI researchers disagree too much to achieve consensus and that policy engagement is not the business of NeurIPS. NeurIPS leads AI on many fronts, and it should champion scientific consensus to create higher quality AI policy.