 saffron



Feedback-Enhanced Online Multiple Testing with Applications to Conformal Selection

Lu, Lin, Huo, Yuyang, Ren, Haojie, Wang, Zhaojun, Zou, Changliang

arXiv.org Machine Learning

We study online multiple testing with feedback, where decisions are made sequentially and the true state of the hypothesis is revealed after the decision has been made, either instantly or with a delay. We propose GAIF, a feedback-enhanced generalized alpha-investing framework that dynamically adjusts thresholds using revealed outcomes, ensuring finite-sample false discovery rate (FDR)/marginal FDR control. Extending GAIF to online conformal testing, we construct independent conformal $p$-values and introduce a feedback-driven model selection criterion to identify the best model/score, thereby improving statistical power. We demonstrate the effectiveness of our methods through numerical simulations and real-data applications.
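For readers unfamiliar with conformal p-values, the following is a minimal sketch of the standard split-conformal p-value, assuming that larger scores mean "less like the calibration data". It is not the independent construction proposed in the paper above; the function name `conformal_p_value` and the score convention are illustrative assumptions.

```python
def conformal_p_value(calibration_scores, test_score):
    """Standard split-conformal p-value (illustrative sketch only).

    Assumption: larger scores indicate a point that conforms less to the
    calibration data. The "+1" terms account for the test point itself.
    """
    n = len(calibration_scores)
    # Count calibration scores at least as extreme as the test score.
    at_least_as_extreme = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least_as_extreme) / (n + 1)


if __name__ == "__main__":
    calibration = [0.1, 0.4, 0.35, 0.8]
    print(conformal_p_value(calibration, 0.5))
```

Note that p-values computed this way against a shared calibration set are generally dependent across test points, which is one reason a dedicated construction is needed when independence is required.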


SAFFRON and LORD Ensure Online Control of the False Discovery Rate Under Positive Dependence

Fisher, Aaron

arXiv.org Machine Learning

Online testing procedures assume that hypotheses are observed in sequence, and allow the significance thresholds for upcoming tests to depend on the test statistics observed so far. Some of the most popular online methods include alpha investing, LORD++ (hereafter, LORD), and SAFFRON. These three methods have been shown to provide online control of the "modified" false discovery rate (mFDR). However, to our knowledge, they have only been shown to control the traditional false discovery rate (FDR) under an independence condition on the test statistics. Our work bolsters these results by showing that SAFFRON and LORD additionally ensure online control of the FDR under nonnegative dependence. Because alpha investing can be recovered as a special case of the SAFFRON framework, the same result applies to this method as well. Our result also allows for certain forms of adaptive stopping times, for example, stopping after a certain number of rejections have been observed. For convenience, we also provide simplified versions of the LORD and SAFFRON algorithms based on geometric alpha allocations.
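As a rough illustration of how an online procedure sets each threshold from past decisions, here is a minimal sketch of the LORD++ update as it is commonly stated in the online FDR literature, paired with a geometric allocation sequence. It is not the simplified algorithm from the paper above; the names `geometric_gamma` and `lord_plus_plus`, the default `w0`, and the `ratio` parameter are assumptions made for this example.

```python
def geometric_gamma(t, ratio=0.5):
    """Assumed allocation: gamma_t = (1 - ratio) * ratio**(t - 1), t = 1, 2, ...

    The sequence is nonnegative and sums to 1 over all t.
    """
    return (1.0 - ratio) * ratio ** (t - 1)


def lord_plus_plus(p_values, alpha=0.05, w0=0.025, ratio=0.5):
    """Sketch of LORD++ with a geometric gamma sequence (requires w0 <= alpha).

    Test level at time t:
        gamma_t * w0
        + (alpha - w0) * gamma_{t - tau_1}          (after the first rejection)
        + alpha * sum_j gamma_{t - tau_j}, j >= 2   (after later rejections)
    where tau_j is the time of the j-th rejection.
    """
    rejection_times = []
    decisions, levels = [], []
    for t, p in enumerate(p_values, start=1):
        level = geometric_gamma(t, ratio) * w0
        for j, tau in enumerate(rejection_times):
            # The first rejection earns back (alpha - w0); later ones earn alpha.
            payout = (alpha - w0) if j == 0 else alpha
            level += payout * geometric_gamma(t - tau, ratio)
        reject = p <= level
        if reject:
            rejection_times.append(t)
        decisions.append(reject)
        levels.append(level)
    return decisions, levels


if __name__ == "__main__":
    decisions, levels = lord_plus_plus([0.01, 0.5, 0.001])
    print(decisions, levels)
```

Each threshold depends only on earlier decisions, so a hypothesis can be accepted or rejected as soon as its p-value arrives.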


PAPRIKA: Private Online False Discovery Rate Control

Zhang, Wanrong, Kamath, Gautam, Cummings, Rachel

arXiv.org Machine Learning

In hypothesis testing, a false discovery occurs when a hypothesis is incorrectly rejected due to noise in the sample. When adaptively testing multiple hypotheses, the probability of a false discovery increases as more tests are performed. Thus the problem of False Discovery Rate (FDR) control is to find a procedure for testing multiple hypotheses that accounts for this effect in determining the set of hypotheses to reject. The goal is to minimize the number (or fraction) of false discoveries, while maintaining a high true positive rate (i.e., correct discoveries). In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample. Unlike previous work in this direction, we focus on the online setting, meaning that a decision about each hypothesis must be made immediately after the test is performed, rather than waiting for the output of all tests as in the offline setting. We provide new private algorithms based on state-of-the-art results in non-private online FDR control. Our algorithms have strong provable guarantees for privacy and statistical performance as measured by FDR and power. We also provide experimental results to demonstrate the efficacy of our algorithms in a variety of data environments.


The Power of Batching in Multiple Hypothesis Testing

Zrnic, Tijana, Jiang, Daniel L., Jordan, Michael I.

arXiv.org Machine Learning

One important partition of algorithms for controlling the false discovery rate (FDR) in multiple testing is into offline and online algorithms. The former generally achieve significantly higher power of discovery, while the latter allow making decisions sequentially as well as adaptively formulating hypotheses based on past observations. Using existing methodology, it is unclear how one could trade off the benefits of these two broad families of algorithms, all the while preserving their formal FDR guarantees. To this end, we introduce $\text{Batch}_{\text{BH}}$ and $\text{Batch}_{\text{St-BH}}$, algorithms for controlling the FDR when a possibly infinite sequence of batches of hypotheses is tested by repeated application of one of the most widely used offline algorithms, the Benjamini-Hochberg (BH) method or Storey's improvement of the BH method. We show that our algorithms interpolate between existing online and offline methodology, thus trading off the best of both worlds.
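The offline Benjamini-Hochberg step-up procedure mentioned above can be sketched as follows. This shows only the classical single-batch BH rule, not the batched algorithms the paper introduces; the function name `benjamini_hochberg` is chosen for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical BH step-up procedure on one batch of p-values.

    Returns a list of booleans aligned with the input: True means rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank
    # Reject the k smallest p-values, including any below rank k that
    # failed their own threshold (this is what makes it "step-up").
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.02, 0.03, 0.04, 0.045], alpha=0.05))
```

Because the rule needs the whole batch sorted before any decision is made, it cannot be applied one hypothesis at a time, which is the gap that online and batched procedures address.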


ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls

Tian, Jinjin, Ramdas, Aaditya

arXiv.org Machine Learning

Major internet companies routinely perform tens of thousands of A/B tests each year. Such large-scale sequential experimentation has resulted in a recent spurt of new algorithms that can provably control the false discovery rate (FDR) in a fully online fashion. However, current state-of-the-art adaptive algorithms can suffer from a significant loss in power if null p-values are conservative (stochastically larger than the uniform distribution), a situation that occurs frequently in practice. In this work, we introduce a new adaptive discarding method called ADDIS that provably controls the FDR and achieves the best of both worlds: it enjoys appreciable power increase over all existing methods if nulls are conservative (the practical case), and rarely loses power if nulls are exactly uniformly distributed (the ideal case). We provide several practical insights on robust choices of tuning parameters, and extend the idea to asynchronous and offline settings as well.


Nothing Artificial About How AI Is Transforming MRO

#artificialintelligence

The loudest industry buzz has been about using big data and artificial intelligence (AI) for predictive maintenance, or turning unscheduled events into scheduled ones by forecasting likely failures. But surprise events still occur, and AI can also help troubleshoot them faster and more effectively. Any tool that enables predictive maintenance also helps troubleshooting, as it often points to causes of likely failures. But to provide maximum diagnostic benefits, some AI techniques can also be used in different ways. For example, natural language processing can translate mechanics' plain-spoken inquiries into text that helps find answers.


Intel/Saffron AI Plan Sidesteps Deep Learning EE Times

#artificialintelligence

Intel's $1 billion investment in the AI ecosystem is one of the well-publicized talking points at the processor company. The Intel empire boasts a breadth of AI technologies it has amassed by acquisition and Intel Capital investments in AI startups. The acquired companies seemingly useful to Intel's AI ambitions thus far include Altera (2015), Saffron (2015), Nervana (2016), Movidius (2016) and Mobileye (2017). Intel Capital has also fattened its AI portfolio with startups Mighty AI, Data Robot, Lumiata, CognitiveScale, Aeye Inc., Element AI and others. Unclear is how Intel is going to stitch all this together.


Machine Learning can Help Doctors Diagnose Disease

#artificialintelligence

Dr. Partho Sengupta had a hunch. A leading cardiologist now practicing at the West Virginia University Heart and Vascular Institute, Sengupta wanted to know whether the emerging field of machine learning could help solve a problem that had long vexed heart doctors. Driven by his conviction and curiosity, Sengupta cold-called data scientists at Saffron, a pioneering artificial intelligence company in North Carolina's Research Triangle acquired by Intel in 2015, with an idea for a novel experiment. Several phone calls and one proof of concept later, Sengupta and Saffron were able to show that a particular type of machine learning can be a powerful--even lifesaving--aid to cardiologists. The groundbreaking work also holds promise for delivering on the triple aims of healthcare reform: lowering costs, elevating quality of care, and improving access. The idea for the experiment had its genesis in Sengupta's office, where, like every other cardiologist, Sengupta struggled to distinguish between two very different diseases with dangerously similar symptoms.


Intel Launches Products for Artificial Intelligence Growth

#artificialintelligence

Intel Corporation (INTC) is gearing up to become a major provider of solutions for the Artificial Intelligence (AI) market, which is anticipated to be worth $5.05 billion by 2020 as per MarketsAndMarkets.com. The company recently announced a plethora of products, technologies and investments, which will augment its presence in the AI market. Additionally, Intel expects the next-generation Xeon Phi processors to deliver up to four times better performance than the previous generation for deep learning. Code-named Knights Mill, the co-processor will be available in 2017. Acquisitions to Expand Footprint: Apart from Nervana, the Movidius acquisition also expands Intel's AI product portfolio.