transparency
How the Pope's Magnifica Humanitas offers a template for individuals to meet the AI moment
Despite a lack of regulation, we still have the ability to steer artificial intelligence in ways that can benefit our common humanity. Pope Leo XIV's new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: "Technology is never neutral." As the pope says, the choice before us--the choice AI presents--is one between the Tower of Babel and the rebuilding of our common humanity. In the biblical story of the Tower of Babel, humans sought to build a massive structure that reached all the way to Heaven, only to have their project thwarted when God made those involved unable to understand one another. It was a pursuit fixated on relentless growth, divorced from any concern about God's commandments or the human cost. It resulted in failure and atomization.
Ethical Considerations for Responsible Data Curation
Human-centric computer vision (HCCV) datasets constructed through nonconsensual web scraping lack crucial metadata for comprehensive fairness and robustness evaluations. Current remedies are post hoc, lack persuasive justification for adoption, or fail to provide proper contextualization for appropriate application. Our research focuses on proactive, domain-specific recommendations, covering purpose, privacy and consent, and diversity, for curating HCCV evaluation datasets, addressing privacy and bias concerns. We adopt an ante hoc reflective perspective, drawing from current practices, guidelines, dataset withdrawals, and audits, to inform our considerations and recommendations.
Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives
Given a set of calibrated images of a scene, we present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives. While many approaches focus on recovering high-fidelity 3D scenes, we focus on parsing a scene into mid-level 3D representations made of a small set of textured primitives. Such representations are interpretable, easy to manipulate and suited for physics-based simulations. Moreover, unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images through differentiable rendering.
Counterfactual Explanations Can Be Manipulated
Counterfactual explanations are emerging as an attractive option for providing recourse to individuals adversely impacted by algorithmic decisions. As they are deployed in critical applications (e.g. law enforcement, financial lending), it becomes important to ensure that we clearly understand the vulnerabilities of these methods and find ways to address them. However, there is little understanding of the vulnerabilities and shortcomings of counterfactual explanations. In this work, we introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated. More specifically, we show counterfactual explanations may converge to drastically different counterfactuals under a small perturbation, indicating they are not robust. Leveraging this insight, we introduce a novel objective to train seemingly fair models where counterfactual explanations find much lower cost recourse under a slight perturbation. We describe how these models can unfairly provide low-cost recourse for specific subgroups in the data while appearing fair to auditors. We perform experiments on loan and violent crime prediction data sets where certain subgroups achieve up to 20x lower cost recourse under the perturbation. These results raise concerns regarding the dependability of current counterfactual explanation techniques, which we hope will inspire investigations in robust counterfactual explanations.
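The quantity at stake in this abstract, the recourse cost before and after a small input perturbation, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the model is a plain linear classifier whose nearest counterfactual has a closed form, and the function names are hypothetical. A linear model will not reproduce the manipulation the paper describes, which depends on a counterfactual search converging to different solutions; the sketch only shows the kind of ratio an auditor might compute.

```python
import math

def linear_score(w, b, x):
    # Signed score of an assumed linear classifier; the boundary is score == 0.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def closest_counterfactual(w, b, x):
    # Minimum-L2 projection of x onto the decision boundary of the linear model.
    score = linear_score(w, b, x)
    norm_sq = sum(wi * wi for wi in w)
    return [xi - score * wi / norm_sq for wi, xi in zip(w, x)]

def recourse_cost(x, x_cf):
    # L2 distance between an input and its counterfactual.
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, x_cf)))

def cost_ratio_under_perturbation(w, b, x, delta):
    # Ratio of recourse cost at x to recourse cost at x + delta.
    # Undefined (division by zero) if x + delta lies exactly on the boundary.
    base = recourse_cost(x, closest_counterfactual(w, b, x))
    x_pert = [xi + di for xi, di in zip(x, delta)]
    pert = recourse_cost(x_pert, closest_counterfactual(w, b, x_pert))
    return base / pert

# Hypothetical model and input, chosen only to make the arithmetic easy to follow.
w, b = [1.0, 1.0], -4.0
x = [1.0, 1.0]
ratio = cost_ratio_under_perturbation(w, b, x, delta=[0.1, 0.0])
```

For a well-behaved model the ratio stays close to 1 for small perturbations; a ratio far above 1 for one subgroup is the kind of signal the abstract's "up to 20x lower cost recourse" refers to.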
Robot Talk Episode 148 – Ethical robot behaviour, with Alan Winfield
Alan Winfield is Professor of Robot Ethics at the University of the West of England (UWE), Visiting Professor at the University of York, and Associate Fellow of the Cambridge Centre for the Future of Intelligence. Alan co-founded the Bristol Robotics Laboratory, where his research is focussed on the science, engineering and ethics of cognitive robotics. Alan is an advocate for robot ethics; he chairs the advisory board of the Responsible Technology Institute at the University of Oxford and has co-drafted new standards on ethical risk assessment and transparency. Robot Talk is a weekly podcast that explores the exciting world of robotics, artificial intelligence and autonomous machines.
RedPajama: an Open Dataset for Training Large Language Models
Large language models are increasingly becoming a cornerstone technology in artificial intelligence, the sciences, and society as a whole, yet the optimal strategies for dataset composition and filtering remain largely elusive. Many of the top-performing models lack transparency in their dataset curation and model development processes, posing an obstacle to the development of fully open language models. In this paper, we identify three core data-related challenges that must be addressed to advance open-source language models. These include (1) transparency in model development, including the data curation process, (2) access to large quantities of high-quality data, and (3) availability of artifacts and metadata for dataset curation and analysis. To address these challenges, we release RedPajama-V1, an open reproduction of the LLaMA training dataset. In addition, we release RedPajama-V2, a massive web-only dataset consisting of raw, unfiltered text data together with quality signals and metadata. Together, the RedPajama datasets comprise over 100 trillion tokens spanning multiple domains and with their quality signals facilitate the filtering of data, aiming to inspire the development of numerous new datasets. To date, these datasets have already been used in the training of strong language models used in production, such as Snowflake Arctic, Salesforce's XGen and AI2's OLMo. To provide insight into the quality of RedPajama, we present a series of analyses and ablation studies with decoder-only language models with up to 1.6B parameters. Our findings demonstrate how quality signals for web data can be effectively leveraged to curate high-quality subsets of the dataset, underscoring the potential of RedPajama to advance the development of transparent and high-performing language models at scale.
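The idea of shipping raw text together with quality signals, and leaving the filtering to the user, can be sketched as a rule-based pass over per-document signals. This is a minimal illustration under stated assumptions: the signal names (`word_count`, `frac_duplicate_lines`), the thresholds, and the `Document` type are all hypothetical and are not taken from the RedPajama release.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    # Hypothetical per-document quality signals, e.g. {"word_count": 120}.
    signals: dict = field(default_factory=dict)

# Hypothetical rules: each maps a signal name to a predicate on its value.
RULES = {
    "word_count": lambda v: v >= 50,
    "frac_duplicate_lines": lambda v: v <= 0.3,
}

def passes(doc, rules=RULES):
    # A document missing a required signal is rejected rather than silently kept.
    return all(
        name in doc.signals and rule(doc.signals[name])
        for name, rule in rules.items()
    )

def curate(docs, rules=RULES):
    # Keep only the documents that satisfy every rule.
    return [d for d in docs if passes(d, rules)]

docs = [
    Document("long, mostly unique page", {"word_count": 400, "frac_duplicate_lines": 0.05}),
    Document("short stub", {"word_count": 12, "frac_duplicate_lines": 0.0}),
    Document("repetitive page", {"word_count": 300, "frac_duplicate_lines": 0.8}),
    Document("page with no signals"),
]
kept = curate(docs)
```

Keeping the rules in a plain dictionary means different curated subsets can be produced from the same raw data by swapping the rule set, which is the workflow the abstract describes.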
The Intelligible and Effective Graph Neural Additive Network
Graph Neural Networks (GNNs) have emerged as the predominant approach for learning over graph-structured data. However, most GNNs operate as black-box models and require post-hoc explanations, which may not suffice in high-stakes scenarios where transparency is crucial. In this paper, we present a GNN that is interpretable by design. Our model, Graph Neural Additive Network (GNAN), is a novel extension of the interpretable class of Generalized Additive Models, and can be visualized and fully understood by humans. GNAN is designed to be fully interpretable, offering both global and local explanations at the feature and graph levels through direct visualization of the model. These visualizations describe exactly how the model uses the relationships between the target variable, the features, and the graph. We demonstrate the intelligibility of GNANs in a series of examples on different tasks and datasets. In addition, we show that the accuracy of GNAN is on par with black-box GNNs, making it suitable for critical applications where transparency is essential, alongside high accuracy.
The greatest risk of AI in higher education isn't cheating – it's the erosion of learning itself
Public debate about artificial intelligence in higher education has largely orbited a familiar worry: cheating. Will students use chatbots to write essays? Should universities ban the tech? But focusing so much on cheating misses the larger transformation already underway, one that extends far beyond student misconduct and even the classroom. Universities are adopting AI across many areas of institutional life.
Humans' love of crystals goes back at least 6 million years
Experiments with chimpanzees show a shared love of shiny things. Crystals have been found alongside human remains at several archeological dig sites. Primates of all stripes really love their crystals. Archeologists have found the shiny rocks at dig sites dating back as far as 780,000 years.