Scientific Discovery

Data Scientist - IoT BigData Jobs


DuPont has a rich history of scientific discovery that has enabled countless innovations, and today we're looking for more people, in more places, to collaborate with us to make life the best that it can be. DuPont Pioneer is aggressively building Big Data and Predictive Analytics capabilities in order to deliver improved services to our customers. We seek a strong data scientist with a background in math, statistics, machine learning, and scientific computing to join our team. This is a critical position with the potential to make an immediate, significant impact on our business. The successful candidate will have an extensive background in statistical computing and machine learning through courses or thesis/dissertation, and proven experience validating models against experimental data.

Steve Jobs said Silicon Valley needs serendipity, but is it even possible in a Zoom world?


Late Apple co-founder and CEO Steve Jobs stressed the importance of serendipity in Silicon Valley, by which he meant chance, unplanned in-person encounters between tech employees. His successor, Tim Cook, on Monday held a virtual conference for developers in front of the empty seats of the Steve Jobs Theater on Apple's Cupertino campus. Jobs told his biographer, Walter Isaacson, that when he commissioned the headquarters for the animated film studio Pixar, in the East Bay, he made sure it was an open structure, where everything converged on an atrium. Jobs believed, as Isaacson described it, that creativity is a result of serendipity. Serendipity, meaning discoveries that happen as a result of chance encounters, is the exact term he used, and to Jobs, it meant in-person meetings.

Machine Learning Tool Could Provide Unexpected Scientific Insights into COVID-19


Berkeley Lab researchers (clockwise from top left) Kristin Persson, John Dagdelen, Gerbrand Ceder, and Amalie Trewartha led development of COVIDScholar, a text-mining tool for COVID-19-related scientific literature. A team of materials scientists at Lawrence Berkeley National Laboratory (Berkeley Lab) – scientists who normally spend their time researching things like high-performance materials for thermoelectrics or battery cathodes – has built a text-mining tool in record time to help the global scientific community synthesize the mountain of scientific literature on COVID-19 being generated every day. The tool is live. The hope is that it could eventually enable "automated science." "On Google and other search engines people search for what they think is relevant," said Berkeley Lab scientist Gerbrand Ceder, one of the project leads.

Department of Energy plans major AI push to speed scientific discoveries


A U.S. Department of Energy initiative could refurbish existing supercomputers, turning them into high-performance artificial intelligence machines. WASHINGTON, D.C.--The U.S. Department of Energy (DOE) is planning a major initiative to use artificial intelligence (AI) to speed up scientific discoveries. At a meeting here last week, DOE officials said they will likely ask Congress for between $3 billion and $4 billion over 10 years, roughly the amount the agency is spending to build next-generation "exascale" supercomputers. "That's a good starting point," says Earl Joseph, CEO of Hyperion Research, a high-performance computing analysis firm in St. Paul that tracks AI research funding. He notes, though, that DOE's planned spending is modest compared with the feverish investment in AI by China and industry.

Autonomous discovery in the chemical sciences part I: Progress

Artificial Intelligence

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this first part, we describe a classification for discoveries of physical matter (molecules, materials, devices), processes, and models and how they are unified as search problems. We then introduce a set of questions and considerations relevant to assessing the extent of autonomy. Finally, we describe many case studies of discoveries accelerated by or resulting from computer assistance and automation from the domains of synthetic chemistry, drug discovery, inorganic chemistry, and materials science. These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modelling. Part two reflects on these case studies and identifies a set of open challenges for the field.

How We Improved Data Discovery for Data Scientists at Spotify


Not only does this provide useful information to users in the moment, but it has also helped raise awareness and increase the adoption of Lexikon. Since launching the Lexikon Slack Bot, we've seen a sustained 25% increase in the number of Lexikon links shared on Slack per week. You just listened to a track by a new artist on your Discover Weekly and you're hooked. You want to hear more and learn about the artist. So, you go to the artist page on Spotify where you can check out the most popular tracks across different albums, read an artist bio, check out playlists where people tend to discover the artist, and explore similar artists.

Nonzero-sum Adversarial Hypothesis Testing Games

Neural Information Processing Systems

We study nonzero-sum hypothesis testing games that arise in the context of adversarial classification, in both the Bayesian as well as the Neyman-Pearson frameworks. We first show that these games admit mixed strategy Nash equilibria, and then we examine some interesting concentration phenomena of these equilibria. Our main results are on the exponential rates of convergence of classification errors at equilibrium, which are analogous to the well-known Chernoff-Stein lemma and Chernoff information that describe the error exponents in the classical binary hypothesis testing problem, but with parameters derived from the adversarial model. The results are validated through numerical experiments. Papers published at the Neural Information Processing Systems Conference.
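For readers unfamiliar with the classical quantities the abstract compares against, the following is a minimal sketch of the non-adversarial case only. It does not reproduce the paper's game-theoretic model. It assumes two discrete distributions with full support; the function names and the example distributions are illustrative and not taken from the paper. The Chernoff-Stein lemma gives the KL divergence as the best type-II error exponent when the type-I error is held fixed, while the Chernoff information, C(P, Q) = -min over λ in [0, 1] of log Σ P(x)^λ Q(x)^(1-λ), is the best exponent of the Bayesian error probability.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def chernoff_information(p, q, iters=200):
    """Chernoff information in nats, assuming p and q have full support.

    The objective log(sum_x p(x)^lam * q(x)^(1 - lam)) is convex in lam,
    so a ternary search over [0, 1] finds its minimum.
    """
    def objective(lam):
        return math.log(sum(pi ** lam * qi ** (1.0 - lam) for pi, qi in zip(p, q)))

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2  # minimum lies to the left of m2
        else:
            lo = m1  # minimum lies to the right of m1
    return -objective((lo + hi) / 2.0)


# Illustrative distributions (assumed, not from the paper).
p = [0.5, 0.5]
q = [0.9, 0.1]
stein_exponent = kl_divergence(p, q)           # exponent in the Neyman-Pearson setting
bayes_exponent = chernoff_information(p, q)    # exponent in the Bayesian setting
```

The Bayesian exponent is never larger than either KL divergence, which is one way to see that fixing one error type (the Neyman-Pearson setting) allows the other error to decay faster.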

Chance discovery brings quantum computing using standard microchips a step closer


A study to prod an antimony nucleus (buried in the middle of this device) with magnetic fields became one with electric fields when a key wire melted a gap in it. An accidental innovation has given a dark-horse approach to quantum computing a boost. For decades, scientists have dreamed of using atomic nuclei embedded in silicon--the familiar stuff of microchips--as quantum bits, or qubits, in a superpowerful quantum computer, manipulating them with magnetic fields. Now, researchers in Australia have stumbled across a way to control such a nucleus with more-manageable electric fields, raising the prospect of controlling the qubits in much the same way as transistors in an ordinary microchip. "That's incredibly important," says Thaddeus Ladd, a research physicist at HRL Laboratories LLC, a private research company.

Why Philip Pullman Is Obsessed with Panpsychism - Facts So Romantic


Philip Pullman is once again having a moment, thanks to the new blockbuster adaptation of His Dark Materials by the BBC and HBO. His fantasy classic--filled with witches, talking bears and "daemons" (people's alter-egos that take animal form)--is rendered in glorious steampunk detail. Pullman has also returned to the fictional world of his heroine, Lyra Belacqua, with a new trilogy, The Book of Dust, which probes more deeply into the central question of his earlier books: What is the nature of consciousness? Pullman loves to write about big ideas, and recent scientific discoveries about dark matter and the Higgs boson have inspired certain plot elements in his novels. The biggest mystery in these books--an enigmatic substance called Dust--comes right out of current debates among scientists and philosophers about the origins of consciousness and the provocative theory of panpsychism.

PAPRIKA: Private Online False Discovery Rate Control

Machine Learning

In the modern era of big data, data analyses play an important role in decision-making in healthcare, information technology, and government agencies. The growing availability of large-scale datasets and ease of data analysis, while beneficial to society, has created a severe crisis of reproducibility in science. In 2011, Bayer HealthCare reviewed 67 in-house projects and found that it could replicate fewer than 25 percent, and that over two-thirds of the projects had major inconsistencies [oSEM19]. One major reason is that random noise in the data can often be mistaken for interesting signals, which does not lead to valid and reproducible results. This problem is particularly relevant when testing multiple hypotheses, where there is an increased chance of false discoveries based on noise in the data. For example, an analyst may conduct 250 hypothesis tests and find that 11 are significant at the 5% level. This may be exciting to the researcher who publishes a paper based on these findings, but elementary statistics suggests that (in expectation) 250 × 0.05 = 12.5 of those tests should be significant at that level purely by chance, even if the null hypotheses were all true. To avoid such problems, statisticians have developed tools for controlling overall error rates when performing multiple hypothesis tests. In hypothesis testing problems, the null hypothesis of no interesting scientific discovery (e.g., a drug has no effect), is tested against the alternative hypothesis of a particular scientific theory being true (e.g., a drug