Collaborating Authors


Survey XII: What Is the Future of Ethical AI Design? – Imagining the Internet


Results released June 16, 2021 – Pew Research Center and Elon University's Imagining the Internet Center asked experts where they thought efforts aimed at ethical artificial intelligence design would stand in the year 2030. Some 602 technology innovators, developers, business and policy leaders, researchers and activists responded to this specific question. The Question – Regarding the application of AI Ethics by 2030: In recent years, there have been scores of convenings and even more papers generated proposing ethical frameworks for the application of artificial intelligence (AI). They cover a host of issues including transparency, justice and fairness, privacy, freedom and human autonomy, beneficence and non-maleficence, freedom, trust, sustainability and dignity. Our questions here seek your predictions about the possibilities for such efforts. By 2030, will most of the AI systems being used by organizations of all sorts employ ethical principles focused primarily on the public ...

Large-sample evidence on the impact of unconventional oil and gas development on surface waters


Hydraulic fracturing uses a water-based mixture to open up tight oil and gas formations. The process is mostly contained, but concerns remain about the potential for surface water contamination. Bonetti et al. found a small increase in certain ions associated with hydraulic fracturing across several locations in the United States (see the Perspective by Hill and Ma). These small increases appeared 90 to 180 days after new wells were put in and suggest some surface water contamination. The magnitude appears small but may require that more attention be paid to monitoring near-well surface waters. Science , aaz2185, this issue p. [896][1]; see also abk3433, p. [853][2] The impact of unconventional oil and gas development on water quality is a major environmental concern. We built a large geocoded database that combines surface water measurements with horizontally drilled wells stimulated by hydraulic fracturing (HF) for several shales to examine whether temporal and spatial well variation is associated with anomalous salt concentrations in United States watersheds. We analyzed four ions that could indicate water impact from unconventional development. We found very small concentration increases associated with new HF wells for barium, chloride, and strontium but not bromide. All ions showed larger, but still small-in-magnitude, increases 91 to 180 days after well spudding. Our estimates were most pronounced for wells with larger amounts of produced water, wells located over high-salinity formations, and wells closer and likely upstream from water monitors. [1]: /lookup/doi/10.1126/science.aaz2185 [2]: /lookup/doi/10.1126/science.abk3433

Drone-based water sampling goes deep


Water sampling and analysis methods today are logistically complex, labor-intensive, time-consuming, and costly. Could drones, which are relatively cheap provide part of the solution? After two years of research and development, a company called Reign Maker believes the answer is yes as it roles out the world's first drone-based water sampling and data collection system, designed to increase sampling rates and accuracy while reducing reliance on field personnel and equipment, such as boats and boots. The solution is called Nixie, and the company claims it can increase sample rates by 75% while reducing costs by 90%. "The New York City Department of Environmental Protection alone collects 14,000 water quality samples a year, collecting 30 samples a day using boats, captains, and a crew of three at an average cost of $100 per sample," says founder and CEO Jessica Chosid.

Addressing the Long-term Impact of ML Decisions via Policy Regret Machine Learning

Machine Learning (ML) increasingly informs the allocation of opportunities to individuals and communities in areas such as lending, education, employment, and beyond. Such decisions often impact their subjects' future characteristics and capabilities in an a priori unknown fashion. The decision-maker, therefore, faces exploration-exploitation dilemmas akin to those in multi-armed bandits. Following prior work, we model communities as arms. To capture the long-term effects of ML-based allocation decisions, we study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm. We focus on reward functions that are initially increasing in the number of pulls but may become (and remain) decreasing after a certain point. We argue that an acceptable sequential allocation of opportunities must take an arm's potential for growth into account. We capture these considerations through the notion of policy regret, a much stronger notion than the often-studied external regret, and present an algorithm with provably sub-linear policy regret for sufficiently long time horizons. We empirically compare our algorithm with several baselines and find that it consistently outperforms them, in particular for long time horizons.

Data-driven discovery of interpretable causal relations for deep learning material laws with uncertainty propagation Machine Learning

This paper presents a computational framework that generates ensemble predictive mechanics models with uncertainty quantification (UQ). We first develop a causal discovery algorithm to infer causal relations among time-history data measured during each representative volume element (RVE) simulation through a directed acyclic graph (DAG). With multiple plausible sets of causal relationships estimated from multiple RVE simulations, the predictions are propagated in the derived causal graph while using a deep neural network equipped with dropout layers as a Bayesian approximation for uncertainty quantification. We select two representative numerical examples (traction-separation laws for frictional interfaces, elastoplasticity models for granular assembles) to examine the accuracy and robustness of the proposed causal discovery method for the common material law predictions in civil engineering applications.

The State of AI Ethics Report (January 2021) Artificial Intelligence

The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020. It aims to help anyone, from machine learning experts to human rights activists and policymakers, quickly digest and understand the field's ever-changing developments. Through research and article summaries, as well as expert commentary, this report distills the research and reporting surrounding various domains related to the ethics of AI, including: algorithmic injustice, discrimination, ethical AI, labor impacts, misinformation, privacy, risk and security, social media, and more. In addition, The State of AI Ethics includes exclusive content written by world-class AI Ethics experts from universities, research institutes, consulting firms, and governments. Unique to this report is "The Abuse and Misogynoir Playbook," written by Dr. Katlyn Tuner (Research Scientist, Space Enabled Research Group, MIT), Dr. Danielle Wood (Assistant Professor, Program in Media Arts and Sciences; Assistant Professor, Aeronautics and Astronautics; Lead, Space Enabled Research Group, MIT) and Dr. Catherine D'Ignazio (Assistant Professor, Urban Science and Planning; Director, Data + Feminism Lab, MIT). The piece (and accompanying infographic), is a deep-dive into the historical and systematic silencing, erasure, and revision of Black women's contributions to knowledge and scholarship in the United Stations, and globally. Exposing and countering this Playbook has become increasingly important following the firing of AI Ethics expert Dr. Timnit Gebru (and several of her supporters) at Google. This report should be used not only as a point of reference and insight on the latest thinking in the field of AI Ethics, but should also be used as a tool for introspection as we aim to foster a more nuanced conversation regarding the impacts of AI on the world.

A Unifying and Canonical Description of Measure-Preserving Diffusions Machine Learning

A complete recipe of measure-preserving diffusions in Euclidean space was recently derived unifying several MCMC algorithms into a single framework. In this paper, we develop a geometric theory that improves and generalises this construction to any manifold. We thereby demonstrate that the completeness result is a direct consequence of the topology of the underlying manifold and the geometry induced by the target measure $P$; there is no need to introduce other structures such as a Riemannian metric, local coordinates, or a reference measure. Instead, our framework relies on the intrinsic geometry of $P$ and in particular its canonical derivative, the deRham rotationnel, which allows us to parametrise the Fokker--Planck currents of measure-preserving diffusions using potentials. The geometric formalism can easily incorporate constraints and symmetries, and deliver new important insights, for example, a new complete recipe of Langevin-like diffusions that are suited to the construction of samplers. We also analyse the reversibility and dissipative properties of the diffusions, the associated deterministic flow on the space of measures, and the geometry of Langevin processes. Our article connects ideas from various literature and frames the theory of measure-preserving diffusions in its appropriate mathematical context.

Linear Convergence of the Subspace Constrained Mean Shift Algorithm: From Euclidean to Directional Data Machine Learning

This paper studies linear convergence of the subspace constrained mean shift (SCMS) algorithm, a well-known algorithm for identifying a density ridge defined by a kernel density estimator. By arguing that the SCMS algorithm is a special variant of a subspace constrained gradient ascent (SCGA) algorithm with an adaptive step size, we derive linear convergence of such SCGA algorithm. While the existing research focuses mainly on density ridges in the Euclidean space, we generalize density ridges and the SCMS algorithm to directional data. In particular, we establish the stability theorem of density ridges with directional data and prove the linear convergence of our proposed directional SCMS algorithm.

Bridging observation, theory and numerical simulation of the ocean using Machine Learning Machine Learning

Progress within physical oceanography has been concurrent with the increasing sophistication of tools available for its study. The incorporation of machine learning (ML) techniques offers exciting possibilities for advancing the capacity and speed of established methods and also for making substantial and serendipitous discoveries. Beyond vast amounts of complex data ubiquitous in many modern scientific fields, the study of the ocean poses a combination of unique challenges that ML can help address. The observational data available is largely spatially sparse, limited to the surface, and with few time series spanning more than a handful of decades. Important timescales span seconds to millennia, with strong scale interactions and numerical modelling efforts complicated by details such as coastlines. This review covers the current scientific insight offered by applying ML and points to where there is imminent potential. We cover the main three branches of the field: observations, theory, and numerical modelling. Highlighting both challenges and opportunities, we discuss both the historical context and salient ML tools. We focus on the use of ML in situ sampling and satellite observations, and the extent to which ML applications can advance theoretical oceanographic exploration, as well as aid numerical simulations. Applications that are also covered include model error and bias correction and current and potential use within data assimilation. While not without risk, there is great interest in the potential benefits of oceanographic ML applications; this review caters to this interest within the research community.

Randomized Algorithms for Scientific Computing (RASC) Artificial Intelligence

Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of that workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.