 Scientific Discovery


Toward Human-AI Co-creation to Accelerate Material Discovery

arXiv.org Artificial Intelligence

There is an increasing need in our society to achieve faster advances in science to tackle urgent problems such as climate change, environmental hazards, sustainable energy systems, and pandemics. In certain domains like chemistry, scientific discovery carries the extra burden of assessing the risks of proposed novel solutions before moving to the experimental stage. Despite several recent advances in machine learning and AI to address some of these challenges, there is still a gap in technologies to support end-to-end discovery applications, integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling knowledge consumption and production in a timely and efficient way for subject matter experts (SMEs). Furthermore, the discovery of novel functional materials strongly relies on the development of exploration strategies in the chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity that often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims at enabling human-AI co-creation to reduce the time until the first discovery and the opportunity costs involved. This framework relies on a knowledge base with domain and process knowledge, and user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.


Causal Structural Hypothesis Testing and Data Generation Models

arXiv.org Artificial Intelligence

A vast amount of expert and domain knowledge is captured by causal structural priors, yet there has been little research on testing such priors for generalization and data synthesis purposes. We propose a novel model architecture, Causal Structural Hypothesis Testing, that can use nonparametric, structural causal knowledge and approximate a causal model's functional relationships using deep neural networks. We use these architectures for comparing structural priors, akin to hypothesis testing, using a deliberate (non-random) split of training and testing data. Extensive simulations demonstrate the effectiveness of out-of-distribution generalization error as a proxy for causal structural prior hypothesis testing and offer a statistical baseline for interpreting results. We show that the variational version of the architecture, Causal Structural Variational Hypothesis Testing, can improve performance in low-SNR regimes. Due to the simplicity and low parameter count of the models, practitioners can test and compare structural prior hypotheses on small datasets and use the priors with the best generalization capacity to synthesize much larger, causally informed datasets. Finally, we validate our methods on a synthetic pendulum dataset and show a use case on a real-world trauma surgery ground-level falls dataset. Our code is available on GitHub.
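As a rough illustration of the idea of comparing structural priors by their out-of-distribution error under a deliberate split, the sketch below uses simulated data and ordinary least squares in place of the deep networks described in the abstract. The variables `x`, `w`, `y`, the data-generating process, and the helper `ood_mse` are all assumptions made up for this example; it is not the authors' architecture.

```python
import numpy as np

# Hypothetical data-generating process (an assumption for illustration):
# x causes y, while w is only a noisy child of x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
w = x + rng.normal(size=n)
y = 2.0 * x + 0.1 * rng.normal(size=n)

# Deliberate (non-random) split: train on low x, test on high x.
train = x < 0
test = ~train

def ood_mse(feature):
    """Fit y ~ feature on the training region, score on the held-out region."""
    design = np.column_stack([feature[train], np.ones(train.sum())])
    coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
    pred = coef[0] * feature[test] + coef[1]
    return float(np.mean((y[test] - pred) ** 2))

mse_x = ood_mse(x)  # structural prior "x -> y"
mse_w = ood_mse(w)  # structural prior "w -> y"
print(f"OOD MSE with x as parent: {mse_x:.3f}")
print(f"OOD MSE with w as parent: {mse_w:.3f}")
```

Under these assumptions, the prior that matches the simulated mechanism extrapolates to the held-out region with much lower error, which is the sense in which generalization error can serve as a proxy for comparing structural hypotheses.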


Data-Driven Computational Imaging for Scientific Discovery

#artificialintelligence

In computational imaging, hardware for signal sampling and software for object reconstruction are designed in tandem for improved capability. Examples of such systems include computed tomography (CT), magnetic resonance imaging (MRI), and super-resolution microscopy. In contrast to more traditional cameras, these devices take indirect measurements and use computational algorithms for reconstruction. This allows for advanced capabilities such as super-resolution or three-dimensional imaging, pushing forward the frontier of scientific discovery. However, these techniques generally require a large number of measurements, causing low throughput, motion artifacts, and/or radiation damage, which limits their applications. Data-driven approaches to reducing the number of measurements needed have been proposed, but they predominantly require a ground truth or reference dataset, which may be impossible to collect. This work outlines a self-supervised approach and explores the future work that is necessary to make such a technique usable for real applications. Light-emitting diode (LED) array microscopy, a modality that allows visualization of transparent objects in two and three dimensions with high resolution and field-of-view, is used as an illustrative example. We release our code at https://github.com/vganapati/LED_PVAE and our experimental data at https://doi.org/10.6084/m9.figshare.21232088 .


Schmidt Futures Will Invest Additional $148 Million In Artificial Intelligence Research

#artificialintelligence

Schmidt Futures, a philanthropic initiative co-founded by former Google CEO and Chairman Eric Schmidt and his wife Wendy, is expanding its investment in artificial intelligence research. Schmidt Futures announced today that it was investing $148 million to fund the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a program of Schmidt Futures. With this newest funding, Schmidt Futures has now committed a total of $400 million to support the development of artificial intelligence (AI) for scientific discovery and other advances in technology and engineering fields. According to the announcement, the new funding will initially support about 160 postdoctoral fellows at nine universities around the world to learn and apply AI methods to their research. The fellowship is expected to expand to more institutions and countries in the future.


How you can contribute to scientific discoveries from your couch

PBS NewsHour

When you picture a scientist, do you see a white coat-clad PhD-holder pipetting away at a lab bench? Or maybe a skygazer with a different day job who goes out on clear nights for a good view of the stars? Historically speaking, both of those examples fit the bill. German-British astronomer William Herschel was originally an amateur who observed the night sky using homemade telescopes. He discovered Uranus in 1781, working alongside his sister, Caroline Herschel, who made multiple discoveries herself.


Applications of Hypothesis Testing, Part 3 (Advanced Statistics)

#artificialintelligence

Abstract: In many scenarios, such as genome-wide association studies, where dependences between variables commonly exist, it is often of interest to infer the interaction effects in the model. However, testing pairwise interactions among millions of variables in complex and high-dimensional data suffers from low statistical power and huge computational cost. To address these challenges, we propose a two-stage testing procedure with false discovery rate (FDR) control, which is known as a less conservative multiple-testing correction. Theoretically, the difficulty in the FDR control is due to the data dependence among test statistics in the two stages, and the fact that the number of hypothesis tests conducted in the second stage depends on the screening result in the first stage. By using the Cramér-type moderate deviation technique, we show that our procedure controls FDR at the desired level asymptotically in the generalized linear model (GLM), where the model is allowed to be misspecified.
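For readers unfamiliar with FDR control, a minimal sketch of the standard single-stage Benjamini-Hochberg step-up procedure is given below. This is not the two-stage procedure proposed in the abstract; the function name `benjamini_hochberg` and the example p-values are assumptions chosen for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds              # compare i-th smallest p to alpha * i / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest rank meeting its threshold
        rejected[order[:k + 1]] = True          # step-up: reject everything up to rank k
    return rejected

# Made-up p-values, purely for illustration.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
mask = benjamini_hochberg(p_values, alpha=0.05)
print(mask)
```

The step-up rule rejects every hypothesis ranked at or below the largest rank whose p-value meets its threshold, which is why it is less conservative than a Bonferroni-style correction that compares every p-value to alpha / m.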


Computer scientist, Data scientist or similar with a focus on knowledge management (f/m/x) - Data Discovery for Anonymised Health Data

#artificialintelligence

The focus of the DLR Institute for Data Science in Jena is to find solutions for the major challenges of the digitalisation age. The research focuses on the areas of data extraction and mobilisation, data management and preparation, and data analysis and intelligence. The position is part of the BMBF project Avatar (anonymisation of personal health data by creating virtual avatars). Topics include, in particular, the semantic modelling of relevant metadata and data discovery. The overall goal of the project is to provide anonymised health data for both academic and commercial research.


Mediamorphosis: How AI is enabling a new paradigm for work and play

#artificialintelligence

Text-to-image AI systems such as DALL-E 2, Imagen and Midjourney are growing in popularity and capability right now, offering creators a revolutionary new way to produce content. Generating images from text prompts is a radical new approach to art-making and creative expression. But it also gives us the first glimpse of a fundamental shift in how we can better communicate and collaborate with our machines.


ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time

arXiv.org Artificial Intelligence

Humans have the remarkable ability to recognize and acquire novel visual concepts in a zero-shot manner. Given a high-level, symbolic description of a novel concept in terms of previously learned visual concepts and their relations, humans can recognize novel concepts without seeing any examples. Moreover, they can acquire new concepts by parsing and communicating symbolic structures using learned visual concepts and relations. Endowing machines with these capabilities is pivotal in improving their generalization capability at inference time. In this work, we introduce Zero-shot Concept Recognition and Acquisition (ZeroC), a neuro-symbolic architecture that can recognize and acquire novel concepts in a zero-shot way. ZeroC represents concepts as graphs of constituent concept models (as nodes) and their relations (as edges). To allow inference-time composition, we employ energy-based models (EBMs) to model concepts and relations. We design the ZeroC architecture so that it allows a one-to-one mapping between the symbolic graph structure of a concept and its corresponding EBM, which, for the first time, allows acquiring new concepts, communicating their graph structure, and applying them to classification and detection tasks (even across domains) at inference time. We introduce algorithms for learning and inference with ZeroC. We evaluate ZeroC on a challenging grid-world dataset designed to probe zero-shot concept recognition and acquisition, and demonstrate its capability.


Riemannian geometry as a unifying theory for robot motion learning and control

arXiv.org Artificial Intelligence

Riemannian geometry is a mathematical field which has been the cornerstone of revolutionary scientific discoveries such as the theory of general relativity. Despite early uses in robot design and recent applications for exploiting data with specific geometries, it mostly remains overlooked in robotics. With this blue sky paper, we argue that Riemannian geometry provides the most suitable tools to analyze and generate well-coordinated, energy-efficient motions of robots with many degrees of freedom. Via preliminary solutions and novel research directions, we discuss how Riemannian geometry may be leveraged to design and combine physically-meaningful synergies for robotics, and how this theory also opens the door to coupling motion synergies with perceptual inputs.