
Collaborating Authors: Zhang, Martin J.


NeuralFDR: Learning Discovery Thresholds from Hypothesis Features

Neural Information Processing Systems (also on arXiv.org Machine Learning)

As datasets grow richer, an important challenge is to leverage the full set of features in the data to maximize the number of useful discoveries while controlling false positives. We address this problem in the context of multiple hypothesis testing, where for each hypothesis we observe a p-value along with a set of features specific to that hypothesis. For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait. We have a rich set of features for each variant (e.g., its location, conservation, epigenetics, etc.) which could inform how likely the variant is to have a true association. However, popular testing approaches, such as the Benjamini-Hochberg procedure (BH) and independent hypothesis weighting (IHW), either ignore these features or assume that the features are categorical or univariate. We propose a new algorithm, NeuralFDR, which automatically learns a discovery threshold as a function of all the hypothesis features. We parametrize the discovery threshold as a neural network, which enables flexible handling of multi-dimensional discrete and continuous features as well as efficient end-to-end optimization. We prove that NeuralFDR has strong false discovery rate (FDR) guarantees, and show that it makes substantially more discoveries on synthetic and real datasets. Moreover, we demonstrate that the learned discovery threshold is directly interpretable.
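The abstract does not spell out the training objective, so the following is a minimal sketch of the idea under stated assumptions: a small network maps features to a per-hypothesis threshold t(x), discoveries are counted with a smooth surrogate of the indicator p_i <= t(x_i), and false discoveries are estimated with the mirroring heuristic #{p_i > 1 - t(x_i)}, which is valid when null p-values are uniform. The class and function names, the smoothing temperature, and the Lagrangian penalty `lam` are illustrative choices, not the authors' exact procedure; in particular, the cross-validation step NeuralFDR uses to certify its FDR guarantee is omitted here.

```python
import torch
import torch.nn as nn

class ThresholdNet(nn.Module):
    """Maps hypothesis features x to a discovery threshold t(x) in (0, 1)."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def train_threshold(p, x, alpha=0.1, lam=10.0, steps=2000):
    """Hypothetical training loop (not the paper's exact procedure).

    p: (n,) tensor of p-values; x: (n, d) tensor of hypothesis features.
    Maximizes soft discovery count subject to an estimated-FDP penalty.
    """
    model = ThresholdNet(x.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        t = model(x)
        # Smooth surrogate for the discovery indicator p_i <= t(x_i).
        disc = torch.sigmoid((t - p) / 0.01).sum()
        # Mirroring estimate of false discoveries: count p_i > 1 - t(x_i),
        # which matches the discovery region in size under a uniform null.
        false = torch.sigmoid((p - (1 - t)) / 0.01).sum()
        # Lagrangian: maximize discoveries while keeping the estimated
        # false-discovery count below an alpha fraction of discoveries.
        loss = -disc + lam * torch.relu(false - alpha * disc)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

At test time, one would freeze the trained network and discover hypothesis i whenever p_i <= t(x_i).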


Contrastive Principal Component Analysis

arXiv.org Machine Learning

We present a new technique called contrastive principal component analysis (cPCA), designed to discover low-dimensional structure that is unique to a dataset, or enriched in one dataset relative to other data. The technique is a generalization of standard PCA for the setting where multiple datasets are available -- e.g., a treatment and a control group, or a mixed versus a homogeneous population -- and the goal is to explore patterns that are specific to one of the datasets. We conduct a wide variety of experiments in which cPCA identifies important dataset-specific patterns that are missed by PCA, demonstrating that it is useful for many applications: subgroup discovery, visualizing trends, feature selection, denoising, and data-dependent standardization. We provide geometrical interpretations of cPCA and show that it satisfies desirable theoretical guarantees. We also extend cPCA to nonlinear settings in the form of kernel cPCA. We have released our code as a Python package, with documentation on GitHub.
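As a concrete illustration of the contrastive idea, here is a minimal NumPy sketch assuming the contrastive-covariance formulation usually associated with cPCA: project onto the top eigenvectors of C_target - alpha * C_background, where the contrast strength alpha is a hyperparameter. The function and parameter names below are ours for illustration, not the released package's API.

```python
import numpy as np

def cpca(target, background, alpha=1.0, n_components=2):
    """Contrastive PCA sketch: find directions with high variance in the
    target data but low variance in the background data."""
    # Center each dataset independently.
    tc = target - target.mean(axis=0)
    bc = background - background.mean(axis=0)
    # Empirical covariance matrices.
    c_target = tc.T @ tc / (tc.shape[0] - 1)
    c_background = bc.T @ bc / (bc.shape[0] - 1)
    # Contrastive covariance: variance enriched in the target relative
    # to the background, traded off by the contrast strength alpha.
    c_contrast = c_target - alpha * c_background
    # Top eigenvectors of the symmetric contrastive covariance.
    eigvals, eigvecs = np.linalg.eigh(c_contrast)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    # Project the target data onto the contrastive directions.
    return target @ components
```

With alpha = 0 this reduces to standard PCA on the target data; larger alpha increasingly penalizes directions that also vary in the background.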

