Scientific Discovery


Building for the Blockchain

#artificialintelligence

If you're here, we assume that you're a developer/hacker who's intrigued by the blockchain. You're convinced that you understand how it works, and now you're itching to figure out what the blockchain means for you and your developer skill set. If you need more of a primer, we'd recommend starting with the Bitcoin white paper and the Ethereum white paper. Our goal in this post is to: (1) explain how blockchain development differs from existing development paradigms, (2) provide context for the opportunities and challenges in this space, and (3) point you to resources that will give you the foundation to start developing in this new paradigm. Internet applications benefit from network effects because they maintain centralized silos of information. Built upon shared, open protocols (e.g.



Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

arXiv.org Artificial Intelligence

One important assumption underlying common classification models is the stationarity of the data. In real-world streaming applications, however, the data concept, indicated by the joint distribution of features and labels, is not stationary but drifts over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in its predictive performance. Unfortunately, most existing concept drift detection methods rely on the strong and over-optimistic assumption that true labels are available immediately for all already-classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with a Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed under this framework. In experiments with benchmark datasets, our methods demonstrate clear advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) while using significantly fewer labels.
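The abstract stops at the idea; as a rough illustration of what a layered, label-frugal test can look like, here is a small Python sketch in the spirit of Request-and-Reverify. It watches an unsupervised signal (prediction confidences) and only "requests" true labels to re-verify when that first layer fires. The window inputs, thresholds, and the request_labels callback are assumptions for illustration; this is not the paper's HHT-CU or HHT-AG procedure.

    # Two-layer drift check: layer 1 is unsupervised, layer 2 uses requested labels.
    # Simplified sketch only; not the HHT-CU / HHT-AG tests from the paper.
    import numpy as np
    from scipy.stats import ks_2samp, norm


    def layer1_uncertainty_shift(ref_conf, cur_conf, alpha=0.01):
        """Flag a potential drift if the distribution of prediction confidences
        in the current window differs from the reference window (KS test)."""
        _, p_value = ks_2samp(ref_conf, cur_conf)
        return p_value < alpha


    def layer2_reverify(ref_err, cur_err, alpha=0.01):
        """Confirm drift with requested labels: one-sided two-proportion z-test
        on the 0/1 error rates of the reference and current windows."""
        p1, p2 = np.mean(ref_err), np.mean(cur_err)
        n1, n2 = len(ref_err), len(cur_err)
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
        se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        if se == 0:
            return False
        z = (p2 - p1) / se  # has the error rate increased?
        return norm.sf(z) < alpha


    def detect_drift(ref_conf, cur_conf, ref_err, request_labels, alpha=0.01):
        """Request labels for the current window only if layer 1 fires."""
        if not layer1_uncertainty_shift(ref_conf, cur_conf, alpha):
            return False            # no label request, no drift signalled
        cur_err = request_labels()  # expensive step, triggered only when needed
        return layer2_reverify(ref_err, cur_err, alpha)

A drift is signalled only when both layers agree, so labels are requested for a window only after the cheap, unsupervised layer has raised a suspicion.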


Faster data discovery and access - Forrester Names IBM a Leader in Machine Learning Data Catalogs - Watson

#artificialintelligence

The promise of AI is that it will deliver digital transformation and improve productivity and efficiency across businesses. For many of our customers, IBM Watson has already helped deliver on this promise – by enriching customer interactions, accelerating research and discovery, empowering employees, and mitigating risk. The next step for businesses is to make AI ubiquitous by operationalizing their workflows across the full AI lifecycle. IBM is committed to delivering these fundamental, end-to-end AI capabilities and giving enterprises everything they need. For example, consider the critical step of understanding and preparing data for productive and speedy use in analytical tools, machine learning and deep learning.


Astroinformatics - Wikipedia

#artificialintelligence

Astroinformatics is primarily focused on developing the tools, methods, and applications of computational science, data science, and statistics for research and education in data-oriented astronomy.[1] Early efforts in this direction included data discovery, metadata standards development, data modeling, astronomical data dictionary development, data access, information retrieval,[3] data integration, and data mining[4] in the astronomical Virtual Observatory initiatives.[5][6][7] Further development of the field, along with astronomy community endorsement, was presented to the United States National Research Council in 2009 in the Astroinformatics "State of the Profession" position paper for the 2010 Astronomy and Astrophysics Decadal Survey.[8] That position paper provided the basis for the subsequent, more detailed exposition of the field in the Informatics Journal paper Astroinformatics: Data-Oriented Astronomy Research and Education.[1] Astroinformatics as a distinct field of research was inspired by work in the fields of Bioinformatics and Geoinformatics, and by the eScience work[9] of Jim Gray at Microsoft Research, whose legacy was remembered and continued through the Jim Gray eScience Awards.[10]


Errol Morris Refutes It Thus

Slate

The 18th-century Irish philosopher Bishop George Berkeley concluded that, since all we know of the universe is what our senses convey to us, things in the world exist only to the extent that we perceive them. They have no material reality, but are phenomena in and of our minds, or the mind of God. Samuel Johnson famously countered this philosophy by kicking a large stone and saying, "I refute it thus!" Two hundred years later, while American campuses roiled with protests against the Vietnam War, the philosopher, historian, and physicist Thomas Kuhn met with a grad student at Princeton's legendary Institute for Advanced Study to discuss the student's paper. The professor and student disagreed on some fundamental ideas, and the conversation grew heated.


Analytics and the AML Paradigm Shift

#artificialintelligence

Financial organizations are using artificial intelligence and machine learning to strengthen their fight against financial crimes. David Stewart, Director of Pre-Sales for the Global Security Intelligence Practice at SAS, offers tips to help separate fact from market hype when reviewing new data analytics tools.


Questioning Truth, Reality, and the Role of Scientific Progress

WIRED

It's an interesting time to be making a case for philosophy in science. On the one hand, some scientists working on ideas such as string theory or the multiverse, ideas that reach far beyond our current means to test them, are forced to make a philosophical defense of research that can't rely on traditional hypothesis testing. On the other hand, some physicists, such as Richard Feynman and Stephen Hawking, were notoriously dismissive of the value of the philosophy of science. That value is asserted with gentle but firm assurance by Michela Massimi, the recent recipient of the Wilkins-Bernal-Medawar Medal, an award given annually by the UK's Royal Society. Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.


How AI and NLP can broaden data discovery, accessibility and maintain governance. - ODBMS.org

#artificialintelligence

The challenge of controlling and protecting data is a big one, but the bigger question is how to make people more productive with corporate information while maintaining standards of compliance and governance for broad access and use in the age of digital business. Applying AI and natural language processing within the various stages of data analytics is a key way to democratize data and build in safeguards for broad use. Below is a Q&A with Ayush Parashar, a co-founder and Vice President of Engineering at Unifi Software. Often the quest for security can eclipse data usability. How can AI be applied both to discover data and to ensure information is not seen or used by those who shouldn't have access to certain kinds of data?


Robust Hypothesis Testing Using Wasserstein Uncertainty Sets

arXiv.org Machine Learning

We develop a novel, computationally efficient, and general framework for robust hypothesis testing. The new framework features a new way to construct uncertainty sets under the null and the alternative distributions: sets centered around the empirical distributions and defined via the Wasserstein metric, so our approach is data-driven and free of distributional assumptions. We develop a convex safe approximation of the minimax formulation and show that this approximation yields a nearly optimal detector among the family of all possible tests. By exploiting the structure of the least favorable distribution, we also develop a tractable reformulation of the approximation, whose complexity is independent of the dimension of the observation space and can be nearly independent of the sample size in general. A real-data example using human activity data demonstrates the excellent performance of the new robust detector.
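The abstract is high-level, so here is a small sketch to build intuition for the statistic it is built around: the Wasserstein distance between empirical distributions, shown in the simplest one-dimensional case and calibrated with a plain permutation test on synthetic data. All of the data and thresholds are assumptions for illustration; the paper's detector is a convex, minimax construction over Wasserstein uncertainty sets, not this permutation test.

    # Toy illustration: 1-D Wasserstein distance between two empirical samples
    # as a two-sample test statistic, calibrated by permutation. Not the paper's
    # minimax-optimal robust detector; just intuition for the distance itself.
    import numpy as np
    from scipy.stats import wasserstein_distance


    def wasserstein_permutation_test(x, y, n_perm=1000, seed=0):
        """Return the observed W1 distance between x and y and a permutation
        p-value for the null that both samples share one distribution."""
        rng = np.random.default_rng(seed)
        observed = wasserstein_distance(x, y)
        pooled = np.concatenate([x, y])
        exceed = 0
        for _ in range(n_perm):
            rng.shuffle(pooled)
            if wasserstein_distance(pooled[:len(x)], pooled[len(x):]) >= observed:
                exceed += 1
        p_value = (exceed + 1) / (n_perm + 1)
        return observed, p_value


    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        x = rng.normal(0.0, 1.0, 200)   # reference sample
        y = rng.normal(0.5, 1.0, 200)   # mean-shifted sample
        print(wasserstein_permutation_test(x, y))

In this toy setup the distance is computed directly from the two samples; the paper instead centers an uncertainty set of distributions around each empirical sample and optimizes the detector against the worst case within those sets.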