 Scientific Discovery


Discovering Relationships and their Structures Across Disparate Data Modalities

arXiv.org Machine Learning

Determining how certain properties are related to other properties is fundamental to scientific discovery. As data collection rates accelerate, it is becoming increasingly difficult yet ever more important to determine whether one property of data (e.g., cloud density) is related to another (e.g., grass wetness). Only if two properties are related are further investigations into the geometry of the relationship warranted. While existing approaches can test whether two properties are related, they may require unfeasibly large sample sizes in real data scenarios, and do not address how they are related. Our key insight is that one can adaptively restrict the analysis to the "jointly local" observations---that is, one can estimate the scales with the most informative neighbors for determining the existence and geometry of a relationship. "Multiscale Graph Correlation" (MGC) is a framework that extends global procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship. In several real data applications, MGC uniquely detects the presence and reveals the geometry of the relationships.
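As a rough illustration of the kind of "global procedure" that the abstract says MGC extends, the sketch below runs a permutation test with distance correlation on one-dimensional data. This is not MGC itself: it uses all pairwise distances rather than adaptively chosen local scales, and the function names (`centered_distances`, `distance_correlation`, `permutation_pvalue`) are hypothetical, chosen only for this example.

```python
import math
import random


def centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # dist is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Average of the elementwise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation (a global statistic, not multiscale)."""
    a = centered_distances(x)
    b = centered_distances(y)
    dcov2 = _mean_product(a, b)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0.0:
        return 0.0  # one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def permutation_pvalue(x, y, reps=199, seed=0):
    """Shuffle y to approximate the null distribution of the statistic."""
    rng = random.Random(seed)
    observed = distance_correlation(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if distance_correlation(x, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (reps + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.uniform(-1.0, 1.0) for _ in range(40)]
    y = [v * v for v in x]  # nonlinear dependence with near-zero linear correlation
    stat, p = permutation_pvalue(x, y)
    print(f"distance correlation = {stat:.3f}, permutation p-value = {p:.3f}")
```

A multiscale method in the spirit of the abstract would, roughly speaking, repeat this kind of computation while restricting each observation to its nearest neighbors and then select the most informative scale; that selection step is omitted here.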


Io-Tahoe Announces Machine Learning Smart Data Discovery Platform

#artificialintelligence

Io-Tahoe, a machine learning-driven smart data discovery company, recently announced the launch of its smart data discovery platform at the Gartner Data & Analytics 2018 Summit, where it will showcase the product. The new version adds Data Catalog, a feature designed to let data owners and stewards use a machine learning-based smart catalog to create, maintain, and search business rules. It also helps define policies and provides governance workflow functionality. It reportedly enables a business user to govern the rules and define policies for critical data elements. It allows data-driven enterprises to automatically enrich information about their data, regardless of the underlying technology, and to build a data catalog.


Io-Tahoe announces smart data discovery solution - SD Times

#artificialintelligence

Io-Tahoe today announced the general availability of its smart data discovery solution, with the addition of a new machine-learning Data Catalog feature that enables the creation of business rules and policy definition.


Scientific reasoning on paper

Science

Helping students develop skills in both critical thinking and scientific reasoning is fundamental to science education. However, the relationship between these two constructs remains largely unknown. Dowd et al. examined this issue by investigating how students' critical thinking skills related to scientific reasoning in the context of undergraduate thesis writing. The authors used the BioTAP rubric to assess scientific reasoning and the California Critical Thinking Skills Test to assess critical thinking. Results support the role of inference in scientific reasoning in writing, while also revealing other aspects of scientific reasoning (epistemological considerations and writing conventions) not related to critical thinking. In considering future implications for instruction, the authors suggest that further research into the impact of interventions focused on specific critical thinking skills (i.e., inference) for improved science reasoning in writing is needed.


Diving into GDPR Personal Data Discovery

#artificialintelligence



Benevolent AI: Shaping the Future of Scientific Discovery

#artificialintelligence

Hi there, could you tell us a little about Benevolent AI and your motivations as a company? Despite the huge growth of knowledge, scientific discovery has not changed for 50 years. It's impossible for humans alone to process all the information potentially available to advance scientific research. A new scientific paper is published every 30 seconds, and there are 10,000 updates to PubMed every day. Consequently, only a small fraction of globally generated scientific information can form 'useable' knowledge.


Universal Hypothesis Testing with Kernels: Asymptotically Optimal Tests for Goodness of Fit

arXiv.org Machine Learning

We characterize the asymptotic performance of nonparametric goodness of fit testing, otherwise known as the universal hypothesis testing that dates back to Hoeffding (1965). The exponential decay rate of the type-II error probability is used as the asymptotic performance metric, hence an optimal test achieves the maximum decay rate subject to a constant level constraint on the type-I error probability. We show that two classes of Maximum Mean Discrepancy (MMD) based tests attain this optimality on $\mathbb R^d$, while a Kernel Stein Discrepancy (KSD) based test achieves a weaker one under this criterion. In the finite sample regime, these tests have similar statistical performance in our experiments, while the KSD based test is more computationally efficient. Key to our approach are Sanov's theorem from large deviation theory and recent results on the weak convergence properties of the MMD and KSD.
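To make the MMD statistic mentioned in the abstract concrete, here is a minimal sketch of the biased (V-statistic) estimate of squared MMD with a Gaussian kernel on one-dimensional data. It assumes the target distribution is represented by reference draws, which may differ from how the paper's tests are constructed; the bandwidth, sample sizes, and function names are illustrative assumptions, and no rejection threshold is computed.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def _mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (one point from each list)."""
    total = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(sample, reference, bandwidth=1.0):
    """Biased estimate of squared MMD between a sample and reference draws."""
    return (
        _mean_kernel(sample, sample, bandwidth)
        + _mean_kernel(reference, reference, bandwidth)
        - 2.0 * _mean_kernel(sample, reference, bandwidth)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    reference = [rng.gauss(0.0, 1.0) for _ in range(150)]  # draws from the target
    matching = [rng.gauss(0.0, 1.0) for _ in range(150)]   # same distribution
    shifted = [rng.gauss(1.0, 1.0) for _ in range(150)]    # mean shifted by 1
    print("MMD^2, matching sample:", mmd_squared(matching, reference))
    print("MMD^2, shifted sample: ", mmd_squared(shifted, reference))
```

A goodness-of-fit test would compare this statistic against a threshold chosen to control the type-I error; how that threshold is set is exactly where the paper's asymptotic analysis applies, so it is left out of this sketch.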


Cyclica CEO Naheed Kurji Says AI Could Create a New Paradigm for Drug Development - Top Chinese CRO, Biopharma News, Drug Development News WXPRESS

#artificialintelligence

Toronto-based Cyclica President and CEO Naheed Kurji acknowledges that artificial intelligence (AI) is a transformative technology, but he contends it is not the "silver bullet" for drug discovery and development. Instead, he says that AI together with cloud-based computing could serve as a catalyst for a new approach to drug development. Kurji emphasizes that it is important to create "a virtual drug discovery ecosystem where a number of companies who are expert in their space come together and present a more holistic solution than any individual one could do itself because there is no one silver bullet to this problem. The market is so big and there are so many issues, one company can't do it alone." Kurji leads a five-year-old company that has developed and validated a cloud-based platform, called Ligand Express, which uses biophysics, bioinformatics and AI to help pharmaceutical companies navigate the drug discovery pipeline by assessing the safety and efficacy of drugs. The integrated platform enables companies to screen potential small-molecule drugs against repositories of structurally characterized proteins, or 'proteomes', to identify significant protein targets. The platform then leverages AI to determine the biological relevance of these targets, and systems biology data to link this information to particular biological pathways or diseases. Kurji says Cyclica's platform, broadly launched in November 2017, is already being used by some of the top 50 pharma companies globally.


Machine learning & data discovery: You can't analyze what you can't find

#artificialintelligence

Corporate data teams are under intense pressure to leverage machine learning and other technologies to transform everything from customer engagement to supply chain management. But data science doesn't do you much good without good data. So if you want to successfully compete and innovate in today's data-driven marketplace, you better be able to put the right information resources into the right hands -- quickly, efficiently, and comprehensively.


The Vestibular Domain

AI Magazine

Although not all researchers agree on the exact bounds of scientific discovery, theory formation is clearly at the core of the domain. Relevant AI research done in scientific discovery includes Kocabas (1992); Karp (1989); Prager, Belanger, and De Mori (1989); Kulkarni and Simon (1988); and Langley et al. (1987). I consider model-based discovery to be a diagnosis and design problem. More precisely, model-based theory refinement can be seen as a four-step process: (1) gather data, (2) compare the data to model-based predictions, (3) identify the sources of discrepancies between the predictions and the field data, and (4) fix these discrepancies by modifying the model. The first three steps are traditionally addressed by diagnosis systems, but the fourth step requires design techniques.