AITopics | Scientific Discovery


Scientific Discovery

"The problem of giving rules for producing true scientific statements has been replaced by the problem of finding efficient heuristic rules for culling the reasonable candidates for an explanation from an appropriate set of possible candidates [and finding methods for constructing the candidates]."
– B. Buchanan, quoted in Lindley Darden. Recent Work in Computational Scientific Discovery.


Listen To Whistler Waves NASA Recorded From Space

International Business Times, Nov-15-2017, 22:25:02 GMT

Researchers have made a breakthrough discovery about the impulsive electron loss that happens in the Earth's upper atmosphere. A paper on the research was published in Geophysical Research Letters on Wednesday and details the scientific discoveries two spacecraft made about the loss and its cause, according to NASA. The CubeSat FIREBIRD II was one of those craft, and it recorded the electron microburst when it happened. It observed the microbursts from its orbit 310 miles above Earth, while one of the Van Allen Probes, which orbits a bit higher up, captured a rising-tone lower-band chorus. That chorus of waves had a duration and cadence highly similar to those of the microburst that FIREBIRD II had captured.

artificial intelligence, chorus, scientific discovery, (6 more...)

International Business Times

Country: North America > United States (0.97)

Industry:

Government > Space Agency (0.97)
Government > Regional Government > North America Government > United States Government (0.97)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.58)


Kernel Two-Sample Hypothesis Testing Using Kernel Set Classification

Masnadi-Shirazi, Hamed

arXiv.org Machine Learning, Nov-13-2017

The two-sample hypothesis testing problem is studied for the challenging scenario of high dimensional data sets with small sample sizes. We show that the two-sample hypothesis testing problem can be posed as a one-class set classification problem. In the set classification problem the goal is to classify a set of data points that are assumed to have a common class. We prove that the average probability of error given a set is less than or equal to the Bayes error and decreases as a power of $n$, the number of sample data points in the set. We use the positive definite Set Kernel for directly mapping sets of data to an associated Reproducing Kernel Hilbert Space, without the need to learn a probability distribution. We specifically solve the two-sample hypothesis testing problem using a one-class SVM in conjunction with the proposed Set Kernel. We compare the proposed method with the Maximum Mean Discrepancy, F-Test and T-Test methods on a number of challenging simulated data sets with high dimension and small sample size. We also perform two-sample hypothesis testing experiments on six cancer gene expression data sets and achieve zero type-I and type-II error results on all data sets.
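For orientation, the Maximum Mean Discrepancy baseline named in the abstract can be sketched in a few lines. This is a minimal illustration of the standard biased MMD² estimate, not the paper's Set Kernel or its one-class SVM method. The RBF kernel, the `gamma` bandwidth, and the function names `rbf`, `mean_kernel`, and `mmd2_biased` are assumptions introduced here. The averaged pairwise kernel in `mean_kernel` is one generic way to compare two sets of points directly; the paper's Set Kernel may be defined differently.

```python
from math import exp


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


def mean_kernel(set_a, set_b, gamma=1.0):
    """Average pairwise kernel value between two sets of points.

    Assumption: a generic mean-embedding similarity between sets,
    used here only for illustration.
    """
    total = sum(rbf(a, b, gamma) for a in set_a for b in set_b)
    return total / (len(set_a) * len(set_b))


def mmd2_biased(sample_x, sample_y, gamma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy."""
    return (
        mean_kernel(sample_x, sample_x, gamma)
        + mean_kernel(sample_y, sample_y, gamma)
        - 2.0 * mean_kernel(sample_x, sample_y, gamma)
    )
```

A value near zero suggests the two samples look alike under the chosen kernel; a larger value suggests they differ. Turning the statistic into a test still requires a threshold, for example from a permutation procedure, which is omitted here.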

artificial intelligence, dimension, machine learning, (17 more...)

arXiv.org Machine Learning

1706.05612

Country:

North America > United States > New York (0.04)
North America > United States > Iowa (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.76)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)


Theory-guided Data Science: A New Paradigm for Scientific Discovery from Data

Karpatne, Anuj, Atluri, Gowtham, Faghmous, James, Steinbach, Michael, Banerjee, Arindam, Ganguly, Auroop, Shekhar, Shashi, Samatova, Nagiza, Kumar, Vipin

arXiv.org Artificial Intelligence, Nov-13-2017

Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.

artificial intelligence, knowledge, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TKDE.2017.2720168

1612.08544

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Minnesota (0.04)
North America > United States > North Carolina (0.04)
(13 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
(3 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)


Trend Analysis of Fragmented Time Series: Hypothesis Testing Based Adaptive Spline Filtering Method

#artificialintelligence, Nov-2-2017, 22:20:24 GMT

Missing data present significant challenges to trend analysis of time series. Straightforward approaches that fill in missing data with constant or zero values or with linear trends can severely degrade the quality, and hence the reliability, of the trend analysis. We present a robust adaptive approach to discover the trends from fragmented time series. The approach proposed in this paper is based on the HASF (Hypothesis-testing-based Adaptive Spline Filtering) trend analysis algorithm, which can accommodate non-uniform sampling and is therefore inherently robust to missing data. HASF adapts the nodes of the spline based on hypothesis testing and variance minimization, which adds to its robustness.

artificial intelligence, data mining, node, (12 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.85)


Data Science- Hypothesis Testing Using Minitab and R

@machinelearnbot, Oct-29-2017, 04:05:22 GMT

Formulating the null and alternate hypotheses for a normality test; choice of the null hypothesis based on absence of action, and vice versa for the alternate hypothesis; checking for normality in Minitab; interpreting the Q–Q plot; comparing the computed p-value with α (alpha) to decide whether or not to take the action; steps for performing the 1-sample Z test; selection of the appropriate hypothesis in Minitab.
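The decision rule described above (compare the computed p-value with α) can be sketched for the 1-sample Z test. The course itself uses Minitab and R, so this Python version is only an illustration. The function name `one_sample_z_test` and its parameters are hypothetical, and the sketch assumes a two-sided alternative and a known population standard deviation `sigma`.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample Z test with known population standard deviation.

    Returns the z statistic, the p-value, and whether the null hypothesis
    (population mean equals mu0) is rejected at significance level alpha.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha
```

Calling `one_sample_z_test(measurements, mu0=10.0, sigma=2.0)` returns the statistic together with the reject or do-not-reject decision, which mirrors the "take the action or not" framing in the course outline.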

artificial intelligence, data science-hypothesis testing, scientific discovery

@machinelearnbot

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Data Science (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)


Google's AI is a "new paradigm" that unites humans and machines

#artificialintelligence, Oct-29-2017, 00:01:04 GMT

Google is fully aware of artificial intelligence's (AI) potential -- DeepMind's AlphaGo AI is one of today's most well-known examples of its capabilities -- and in an earnings call this week, the company made it clear they believe the future of technology lies with AI. During the call, Sundar Pichai, CEO of Alphabet (Google's parent company), praised the company's decision to invest in AI early, highlighting the concept's trajectory from "a research project to something that can solve new problems for a billion people a day," according to an Inverse report. Pichai went on to note how Google's AI research is already producing products that utilize machine learning, such as the Google Clips camera that debuted earlier this month. "Even though we are in the early days of AI, we are already rethinking how to build products around machine learning," said Pichai. "It's a new paradigm compared to mobile-first software, and I'm thrilled how Google is leading the way."

artificial intelligence, creativity & intelligence, machine learning, (8 more...)

#artificialintelligence

Industry: Information Technology > Services (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.62)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.62)


From Distance Correlation to Multiscale Generalized Correlation

Shen, Cencheng, Priebe, Carey E., Vogelstein, Joshua T.

arXiv.org Machine Learning, Oct-26-2017

Understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age. We proposed the Multiscale Generalized Correlation (MGC) in Shen et al. 2017 as a novel correlation measure, which worked well empirically and aided a number of real-data discoveries. But there is a wide gap on the theoretical side, e.g., the population statistic, the convergence from sample to population, and how well the algorithmic Sample MGC performs. To better understand its underlying mechanism, in this paper we formalize the population version of local distance correlations, MGC, and the optimal local scale between the underlying random variables, by utilizing the characteristic functions and incorporating the nearest-neighbor machinery. The population version enables a seamless connection with, and significant improvement to, the algorithmic Sample MGC, both theoretically and in practice, which further allows a number of desirable asymptotic and finite-sample properties to be proved and explored for MGC. The advantages of MGC are further illustrated via a comprehensive set of simulations with linear, nonlinear, univariate, multivariate, and noisy dependencies, where it loses almost no power against monotone dependencies while achieving superior performance against general dependencies.
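As background for the abstract above, the following is a minimal sketch of the ordinary sample distance correlation for one-dimensional data, the global statistic that MGC's local distance correlations build on. It is not an implementation of MGC: there is no nearest-neighbor truncation and no search over local scales. The function names `_centered_distances` and `distance_correlation` are introduced here for illustration.

```python
from math import sqrt


def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row and column means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    # The distance matrix is symmetric, so row means equal column means.
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    denom = sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0.0:
        # One of the sequences is constant, so no dependence can be measured.
        return 0.0
    return sqrt(max(dcov2, 0.0) / denom)
```

Unlike Pearson correlation, this statistic is positive for a symmetric nonlinear relationship such as `y = x**2` on data centered at zero, which is the kind of general dependency the abstract refers to.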

artificial intelligence, correlation, machine learning, (17 more...)

arXiv.org Machine Learning

1710.09768

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.54)


Uncommon Hypothesis Tests to Debunk Common Misconceptions

@machinelearnbot, Oct-13-2017, 22:45:38 GMT

I gave a talk about p-values and hypothesis testing at BIDS. Please check out my slides! P-values get a large share of the blame for the replication crisis in science. People take for granted that the tests they use work without justifying the leap from data to model. Often, reported p-values are erroneous because the underlying model doesn't accurately describe the way the data arose.

artificial intelligence, scientific discovery, uncommon hypothesis test, (1 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.69)


Open data from the Large Hadron Collider sparks new discovery

Engadget, Oct-2-2017, 14:25:03 GMT

Back in 2014, CERN released the data from its Large Hadron Collider (LHC) experiments onto an online portal called the Open Data portal. It was an unprecedented move, making data from the LHC's experiments available to those who don't have access to a particle accelerator. It's not completely up-to-date; there's a three-year embargo on results, so, generally speaking, the most recent data being uploaded is from the year 2014. This was the first time results of any particle collider experiment have been released to the public, and now it's produced results. Last week, a team from MIT released an article in Physical Review Letters that used data from the Compact Muon Solenoid (CMS), one of the LHC's main detectors, to explain a feature within high-energy particle collisions.

artificial intelligence, discovery, scientific discovery, (9 more...)

Engadget

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)


The Fourth Paradigm: Data-Intensive Scientific Discovery - Microsoft Research

@machinelearnbot, Sep-10-2017, 02:44:43 GMT

Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud computing technologies. In The Fourth Paradigm: Data-Intensive Scientific Discovery, the collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized. "The individual essays--and The Fourth Paradigm as a whole--give readers a glimpse of the horizon for 21st-century research and, at their best, a peek at what lies beyond." "The impact of Jim Gray's thinking is continuing to get people to think in a new way about how data and software are redefining what it means to do science." "I often tell people working in eScience that they aren't in this field because they are visionaries or super-intelligent--it's because they care about science and they are alive now.

artificial intelligence, data-intensive scientific discovery, software engineering, (5 more...)

@machinelearnbot

Country: North America > United States > Arizona (0.07)

Technology:

Information Technology > Software Engineering (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.78)
