Scientific Discovery


Why Vegetarians Miss Fewer Flights – Five Bizarre Insights from Data

#artificialintelligence

A "Robot Scientist" – Machine learning automates a kind of scientific research It's weird: Vegetarians miss fewer flights. It's wacky: People who like "curly fries" on Facebook are more intelligent. You're traveling through a dimension whose boundaries are that of imagination. Well, it's not really The Twilight Zone, but let's call it "The'Freakonomics' of big data," or "The Ripley's Believe It or Not of data science." We live in a weird and wacky world, full of bizarre and surprising connections, and these connections are reflected within all those tons of data constantly being collected.


AI-powered Sales, a Paradigm Shift - PCQuest

#artificialintelligence

A result-oriented sales team is the driving force behind, and a determining factor in, the success of any business. Traditional sales methodologies have given way to more advanced techniques thanks to rapid technology adoption, and digital transformation has further enhanced the sales pipeline with the integration of artificial intelligence and machine learning. 'One size fits all' is a misnomer for something as complex as a Customer Relationship Management (CRM) system: the CRM needs to be fitted to the organization rather than the other way around.


Sling adds Discovery, Science to its lineup

Engadget

Sling TV's lineup of available channels is getting bigger. The streaming TV service is adding nine new channels from Discovery Networks that offer live and on-demand content, including the flagship Discovery Channel and MotorTrend. The best news for Sling subscribers: some of the channels will be added to your package for free. Access to the channels will be split across Sling's two separate service packages, both of which cost $25 per month. Sling Blue will get Discovery Channel, Investigation Discovery and TLC.


The Structure of Optimal Private Tests for Simple Hypotheses

arXiv.org Machine Learning

Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple hypotheses: given two distributions $P$ and $Q$, and a privacy level $\varepsilon$, how many i.i.d. samples are needed to distinguish $P$ from $Q$ subject to $\varepsilon$-differential privacy, and what sort of tests have optimal sample complexity? Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. Our result is an analogue of the classical Neyman-Pearson lemma in the setting of private hypothesis testing. We also give an application of our result to private change-point detection. Our characterization applies more generally to hypothesis tests satisfying essentially any notion of algorithmic stability, which is known to imply strong generalization bounds in adaptive data analysis, and thus our results have applications even when privacy is not a primary concern.
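The test the abstract points to is a randomized, clamped variant of the log-likelihood ratio test. A minimal Python sketch of that general idea follows: clamp each sample's log-likelihood ratio so the summed statistic has bounded sensitivity, add Laplace noise calibrated to that sensitivity and $\varepsilon$, and threshold. The clamping bound, noise scale, threshold, and toy Bernoulli example are illustrative assumptions, not the constants or the exact mechanism from the paper.

```python
import numpy as np

def private_clamped_llr_test(samples, log_p, log_q, eps, clamp=1.0, threshold=0.0):
    """Illustrative eps-differentially-private test of H0: data ~ Q vs H1: data ~ P.

    Each sample's log-likelihood ratio log P(x) - log Q(x) is clamped to
    [-clamp, clamp], so changing one sample shifts the sum by at most 2*clamp
    (its sensitivity). Laplace noise of scale 2*clamp/eps then makes the
    released statistic eps-DP.
    """
    llr = np.array([log_p(x) - log_q(x) for x in samples])
    clamped = np.clip(llr, -clamp, clamp)
    stat = clamped.sum() + np.random.laplace(scale=2.0 * clamp / eps)
    return stat > threshold  # True -> reject Q in favor of P

# Toy usage: distinguish Bernoulli(0.6) from Bernoulli(0.4) with eps = 1.
if __name__ == "__main__":
    data = np.random.default_rng(0).binomial(1, 0.6, size=500)
    log_p = lambda x: np.log(0.6) if x == 1 else np.log(0.4)
    log_q = lambda x: np.log(0.4) if x == 1 else np.log(0.6)
    print(private_clamped_llr_test(data, log_p, log_q, eps=1.0))
```

Clamping is what keeps the noise small: without it, a single outlying sample could move the statistic arbitrarily far, forcing noise large enough to drown out the signal.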


Amazon hit by unexplained data leak just days before Black Friday

The Independent

Amazon has suffered a customer data leak less than two days before Black Friday. Amazon customer service contacted people to warn them that their names and email addresses had been compromised, though it is not yet clear how many customers were affected or how it happened. An Amazon spokesperson told The Independent: "We have fixed the issue and informed customers who may have been impacted." The customer message stated: "We're contacting you to let you know that our website inadvertently disclosed your name and email address due to a technical error. The issue has been fixed."


A tutorial on MDL hypothesis testing for graph analysis

arXiv.org Machine Learning

When analysing graph structure, it can be difficult to determine whether patterns found are due to chance, or due to structural aspects of the process that generated the data. Hypothesis tests are often used to support such analyses. These allow us to make statistical inferences about which null models are responsible for the data, and they can be used as a heuristic in searching for meaningful patterns. The minimum description length (MDL) principle [6, 4] allows us to build such hypothesis tests, based on efficient descriptions of the data. Broadly: we translate the regularity we are interested in into a code for the data, and if this code describes the data more efficiently than a code corresponding to the null model, by a sufficient margin, we may reject the null model. This is a frequentist approach to MDL, based on hypothesis testing. Bayesian approaches to MDL for model selection rather than model rejection are more common, but for the purposes of pattern analysis, a hypothesis testing approach provides a more natural fit with existing literature. We provide a brief illustration of this principle based on the running example of analysing the size of the largest clique in a graph. We illustrate how a code can be constructed to efficiently represent graphs with large cliques, and how the description length of the data under this code can be compared to the description length under a code corresponding to a null model to show that the null model is highly unlikely to have generated the data.
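A hedged sketch of the decision rule the abstract describes, assuming you supply the two codelength functions (in bits): by the MDL no-hypercompression inequality, data generated by the null model is compressed k or more bits below its null codelength with probability at most 2^-k, so rejecting the null when the pattern-aware code saves at least -log2(alpha) bits gives a test at significance level alpha. The function names and the clique-example hook are placeholders, not the tutorial's own code.

```python
import math

def mdl_reject_null(codelength_null, codelength_pattern, graph, alpha=0.05):
    """Frequentist MDL test: reject the null model when the pattern-aware code
    beats the null-model code by at least -log2(alpha) bits on this graph.

    No-hypercompression: under the null model, any fixed alternative code saves
    k or more bits with probability at most 2**-k, so the significance level
    of this test is alpha.
    """
    l_null = codelength_null(graph)        # bits to encode the graph under the null model
    l_pattern = codelength_pattern(graph)  # bits under the code built around the pattern
    saving = l_null - l_pattern
    return saving >= -math.log2(alpha), saving

# Hypothetical usage for the running clique example in the text:
# reject, saving = mdl_reject_null(er_codelength, large_clique_codelength, G)
```

For alpha = 0.05 the required saving is about 4.3 bits; a larger saving is stronger evidence against the null, but a small or negative saving never confirms it.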


Using Any Surface to Realize a New Paradigm for Wireless Communications

Communications of the ACM

Wireless communications have undeniably shaped our everyday lives. We expect ubiquitous connectivity to the Internet, with increasing demands for higher data rates and low lag everywhere: at work, at home, on the road, even with massive crowds of Internet users around us. Despite impressive breakthroughs in almost every part of our wireless devices--from antennas and hardware to operating software--this demand is getting increasingly challenging to address. The large scale of research efforts and investment in the fifth generation (5G) of wireless communications reflects the enormity of the challenge. A valuable and seemingly unnoticed resource could be exploited to meet this goal.


Automation could accelerate China's scientific rise, close gap with U.S.

#artificialintelligence

"We're in the middle of a paradigm shift, a time when the choice of experiments and the execution of experiments are not really things that people do," says Bob Murphy, the head of the computational biology department at Carnegie Mellon University. Details: Experimental science is expensive. In biology, for example, pricey equipment and labor mean that scientists can't do all the experiments they would like. Instead, they have to prioritize the ones they think will give them the most information about the questions they are after, and then extrapolate to estimate the outcomes of the experiments they didn't do. Automating science makes it easier to do big experiments, allowing more people to participate -- and potentially boosting the scientific output of countries that have traditionally trailed the U.S.


The Rise of Dataism: A Threat to Freedom or a Scientific Revolution?

#artificialintelligence

What would happen if we made all of our data public--everything from wearables monitoring our biometrics, all the way to smartphones monitoring our location, our social media activity, and even our internet search history? Would such insights into our lives simply provide companies and politicians with greater power to invade our privacy and manipulate us by using our psychological profiles against us? A burgeoning new philosophy called dataism doesn't think so. In fact, adherents of this trending ideology believe that liberating the flow of data is the supreme value of the universe, and that it could be the key to unleashing the greatest scientific revolution in the history of humanity. First mentioned by David Brooks in his 2013 New York Times article "The Philosophy of Data," dataism is an ethical system that has been most heavily explored and popularized by renowned historian Yuval Noah Harari.


Poker-Faced Trading: Will This Theory Change Your Strategy?

#artificialintelligence

Sure, you've heard of game theory. And sure, trading is like a game: you devise a strategy, learn the rules, and try to beat everyone else to the punch by finding trends first. You've even heard that letting your emotions get the best of you is a terrible trading strategy. To start undermining our own self-destructive habits, it helps to understand that the game favors those who figure out the game behind the game: how to play to win by controlling the emotions that lead us to bad decisions. Game theory can be applied to human trading because the object of a trade is to "win" a profit.