Abduction, or inference to the best explanation, is a form of inference that goes from data describing something to a hypothesis that best explains or accounts for the data.
D is a collection of data (facts, observations, givens).
H explains D (would, if true, explain D).
No other hypothesis can explain D as well as H does.
... Therefore, H is probably true.
– Josephson & Josephson, Abductive Inference
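The schema above can be read as a procedure: score each candidate hypothesis by how well it would explain D, and accept the best one only if it clearly beats its rivals. A minimal sketch (the hypotheses, scores, and acceptance margin below are hypothetical, not from the source):

```python
# A minimal sketch of the Josephson schema with hypothetical data.
# Each hypothesis gets a score for how well it would explain the data D;
# we accept the best hypothesis only if it clearly outscores the rest.

def best_explanation(scores, margin=0.2):
    """Return the best hypothesis if it beats all rivals by `margin`, else None."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if best[1] - runner_up[1] >= margin:
        return best[0]
    return None  # no hypothesis explains D clearly better than the rest

# Hypothetical explanatory scores for observed data D
scores = {"H1: faulty sensor": 0.9, "H2: power surge": 0.5, "H3: user error": 0.4}
print(best_explanation(scores))  # -> H1: faulty sensor
```

The margin encodes the schema's third premise: without a clear lead over rival hypotheses, the inference to "H is probably true" is not licensed.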
A U.S. Department of Energy initiative could refurbish existing supercomputers, turning them into high-performance artificial intelligence machines.

WASHINGTON, D.C.--The U.S. Department of Energy (DOE) is planning a major initiative to use artificial intelligence (AI) to speed up scientific discoveries. At a meeting here last week, DOE officials said they will likely ask Congress for between $3 billion and $4 billion over 10 years, roughly the amount the agency is spending to build next-generation "exascale" supercomputers. "That's a good starting point," says Earl Joseph, CEO of Hyperion Research, a high-performance computing analysis firm in St. Paul that tracks AI research funding. He notes, though, that DOE's planned spending is modest compared with the feverish investment in AI by China and industry.
Mathematical proofs are both paradigms of certainty and some of the most explicitly justified arguments that we have in the cultural record. Their very explicitness, however, leads to a paradox, because their probability of error grows exponentially as the argument expands. Here we show that under a cognitively plausible belief formation mechanism that combines deductive and abductive reasoning, mathematical arguments can undergo what we call an epistemic phase transition: a dramatic and rapidly propagating jump from uncertainty to near-complete confidence at reasonable levels of claim-to-claim error rates. To show this, we analyze an unusual dataset of forty-eight machine-aided proofs from the formalized reasoning system Coq, including major theorems ranging from ancient to 21st-century mathematics, along with four hand-constructed cases from Euclid, Apollonius, Spinoza, and Andrew Wiles. Our results bear both on recent work in the history and philosophy of mathematics, and on a question, basic to cognitive science, of how we form beliefs, and justify them to others.
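The exponential error growth the abstract refers to can be made concrete: if each claim-to-claim step of a purely deductive argument is independently correct with probability 1 - eps, confidence in the conclusion of an n-step argument is (1 - eps)^n. A minimal sketch (the per-step error rate here is illustrative, not a figure from the paper):

```python
# Confidence in an n-step purely deductive argument when each step
# has an independent per-step error rate eps: (1 - eps) ** n.
def deductive_confidence(n_steps, eps):
    return (1 - eps) ** n_steps

# Even a 1% per-step error rate erodes confidence quickly as proofs grow.
for n in (10, 100, 1000):
    print(n, deductive_confidence(n, 0.01))
```

At eps = 0.01, confidence already falls below 40% by 100 steps and is effectively zero by 1000; this decay is the baseline against which the abstract's combined deductive-abductive mechanism produces its jump to near-complete confidence.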
The recipes for those proteins--called genes--are encoded in our DNA. An error in the genetic recipe may result in a malformed protein, which could cause disease or death for an organism. Many diseases, therefore, are fundamentally linked to proteins. But just because you know the genetic recipe for a protein doesn't mean you automatically know its shape. Proteins are composed of chains of amino acids (also referred to as amino acid residues).
To understand the natural world, scientists must measure it--but how do we define our units? Over the decades, scientists have gradually redefined classic units in terms of universal constants, such as using the speed of light to help define the length of a meter. But the scientific unit of mass, the kilogram, remained pegged to "Le Grand K," a metallic cylinder stored at a facility in France. If that ingot's mass varied for whatever reason, scientists would have to recalibrate their instruments. No more: In 2019, scientists agreed to adopt a new kilogram definition based on a fundamental factor in physics called Planck's constant and the improved definitions for the units of electrical current, temperature, and the number of particles in a given substance.
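The meter example above can be checked numerically: since 1983 the meter has been defined as the distance light travels in vacuum in 1/299,792,458 of a second, which fixes the speed of light exactly, and the 2019 redefinition does the same for the kilogram by fixing the Planck constant. A small sketch of how a fixed constant pins down a unit:

```python
# The 1983 meter definition fixes the speed of light exactly:
# one meter is the distance light travels in 1/299792458 of a second.
C = 299_792_458            # speed of light in m/s, exact by definition
travel_time = 1 / C        # seconds light needs to cover one meter
meter = C * travel_time    # distance covered in that time, in meters

# The 2019 kilogram definition fixes the Planck constant exactly.
H_PLANCK = 6.62607015e-34  # J*s, exact by definition since 2019

print(round(meter, 12))    # 1.0, up to floating-point rounding
```

Because the constants are fixed by definition rather than measured, there is no artifact like "Le Grand K" whose drift could force instruments to be recalibrated.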
More detailed analysis would follow from initial discoveries of interesting and significant parameter correlations within complex high-dimensional data. An article was recently published in Nature on "Statistical Errors – p Values, the Gold Standard of Statistical Validity, Are Not as Reliable as Many Scientists Assume" (Regina Nuzzo, Nature, 506, 150-152, 2014). In this article, Columbia University statistician Andrew Gelman suggests that, instead of doing multiple separate small studies, "researchers would first do small exploratory studies and gather potentially interesting findings without worrying too much about false alarms. Then, on the basis of these results, the authors would decide exactly how they planned to confirm the findings." In other words, a disciplined scientific methodology that includes both exploratory and confirmatory analyses can be documented within an open science framework (e.g., https://osf.io).
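The motivation for separating exploratory from confirmatory analysis can be simulated: test enough null hypotheses on pure noise and some will look "significant" by chance. A sketch with synthetic data (the study sizes and counts below are illustrative):

```python
import math
import random

random.seed(0)  # reproducible illustration

def noise_study(n):
    """One 'study': the mean of n pure-noise observations (true effect = 0)."""
    return sum(random.gauss(0, 1) for _ in range(n)) / n

n, trials = 30, 100
# Two-sided z-test threshold at roughly p < 0.05 for the mean of n
# standard-normal samples (standard error = 1 / sqrt(n)).
threshold = 1.96 / math.sqrt(n)
false_alarms = sum(abs(noise_study(n)) > threshold for _ in range(trials))
print(false_alarms, "apparent discoveries out of", trials, "null studies")
```

By chance alone, on the order of 5 of the 100 null studies clear the p < 0.05 bar, which is why exploratory hits need a separate, pre-planned confirmatory pass before being reported.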
Recent improvements in whole slide scanning systems, GPU computing, and deep learning make automated slide analysis well-equipped to solve new and challenging analysis tasks. These learning methods are trained on labeled data. This could be anything from annotating many examples of mitosis to labeling tissue types or categorizing a full slide or set of slides from a particular patient sample. The goal is then to learn a mapping from the input images to the desired output on the training data. Then the same model can be applied to unseen data.
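The train-then-apply pattern described above can be sketched without any imaging libraries; the 2-D features and tissue labels below are hypothetical stand-ins for slide-derived descriptors and annotations, and a nearest-centroid rule stands in for a deep network:

```python
# Minimal supervised-learning sketch: fit on labeled data, predict on unseen data.
# Features are hypothetical stand-ins for image-derived descriptors.

def fit(X, y):
    """Compute one centroid per class from labeled training data."""
    groups = {}
    for features, label in zip(X, y):
        groups.setdefault(label, []).append(features)
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in groups.items()
    }

def predict(model, x):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: dist(model[label]))

# Labeled training data: two hypothetical tissue classes with toy 2-D features
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y_train = ["stroma", "stroma", "tumor", "tumor"]
model = fit(X_train, y_train)
print(predict(model, [0.85, 0.95]))  # -> tumor
```

The split mirrors the paragraph's workflow: `fit` learns the mapping from labeled examples, and `predict` applies the frozen model to unseen inputs.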
Natural language understanding (NLU) of text is a fundamental challenge in AI, and it has received significant attention throughout the history of NLP research. This primary goal has been studied under different tasks, such as Question Answering (QA) and Textual Entailment (TE). In this thesis, we investigate the NLU problem through the QA task and focus on the aspects that make it a challenge for the current state-of-the-art technology. This thesis is organized into three main parts: In the first part, we explore multiple formalisms to improve existing machine comprehension systems. We propose a formulation for abductive reasoning in natural language and show its effectiveness, especially in domains with limited training data. Additionally, to help reasoning systems cope with irrelevant or redundant information, we create a supervised approach to learn and detect the essential terms in questions. In the second part, we propose two new challenge datasets. In particular, we create two datasets of natural language questions where (i) the first one requires reasoning over multiple sentences; (ii) the second one requires temporal common sense reasoning. We hope that the two proposed datasets will motivate the field to address more complex problems. In the final part, we present the first formal framework for multi-step reasoning algorithms, in the presence of a few important properties of language use, such as incompleteness and ambiguity. We apply this framework to prove fundamental limitations for reasoning algorithms. These theoretical results provide additional intuition for the existing empirical evidence in the field.
Rice University statistician Genevera Allen says scientists must keep questioning the accuracy and reproducibility of scientific discoveries made by machine-learning techniques until researchers develop new computational systems that can critique themselves. Allen, associate professor of statistics, computer science and electrical and computer engineering at Rice and of pediatrics-neurology at Baylor College of Medicine, will address the topic in both a press briefing and a general session today at the 2019 Annual Meeting of the American Association for the Advancement of Science (AAAS). "The question is, 'Can we really trust the discoveries that are currently being made using machine-learning techniques applied to large data sets?'" "The answer in many situations is probably, 'Not without checking,' but work is underway on next-generation machine-learning systems that will assess the uncertainty and reproducibility of their predictions." Machine learning (ML) is a branch of statistics and computer science concerned with building computational systems that learn from data rather than following explicit instructions. Allen said much attention in the ML field has focused on developing predictive models that allow ML to make predictions about future data based on its understanding of data it has studied. "A lot of these techniques are designed to always make a prediction," she said.