
Collaborating Authors: Kundaje, Anshul


DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA

arXiv.org Artificial Intelligence

Recent advances in self-supervised models for natural language, vision, and protein sequences have inspired the development of large genomic DNA language models (DNALMs). These models aim to learn generalizable representations of diverse DNA elements, potentially enabling various genomic prediction, interpretation, and design tasks. Despite their potential, existing benchmarks do not adequately assess the capabilities of DNALMs on key downstream applications involving an important class of non-coding DNA elements critical for regulating gene activity. In this study, we introduce DART-Eval, a suite of representative benchmarks specifically focused on regulatory DNA to evaluate model performance across zero-shot, probed, and fine-tuned scenarios against contemporary ab initio models as baselines. Our benchmarks target biologically meaningful downstream tasks such as functional sequence feature discovery, predicting cell-type-specific regulatory activity, and counterfactual prediction of the impacts of genetic variants. We find that current DNALMs exhibit inconsistent performance and do not offer compelling gains over alternative baseline models for most tasks, while requiring significantly more computational resources. We discuss potentially promising modeling, data curation, and evaluation strategies for the next generation of DNALMs. Our code is available at https://github.com/kundajelab/DART-Eval.
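As a concrete illustration of the "probed" evaluation setting, the sketch below trains a linear probe on frozen sequence embeddings. The `embed_sequences` function is a hypothetical placeholder for any model-specific embedding extractor; the actual evaluation harness lives in the linked repository.

```python
# Minimal sketch of linear probing over frozen DNALM embeddings.
# `embed_sequences` is a hypothetical stand-in, not part of DART-Eval's API.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def embed_sequences(seqs):
    # Placeholder: replace with a real DNALM forward pass that returns one
    # fixed-length vector per sequence (e.g., mean-pooled token embeddings).
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(seqs), 256))

# Toy data: binary labels for regulatory vs. background sequences.
rng = np.random.default_rng(1)
train_seqs, train_y = ["ACGT"] * 200, rng.integers(0, 2, 200)
test_seqs, test_y = ["ACGT"] * 100, rng.integers(0, 2, 100)

# Fit the probe on frozen embeddings; only the linear layer is trained.
probe = LogisticRegression(max_iter=1000)
probe.fit(embed_sequences(train_seqs), train_y)
scores = probe.predict_proba(embed_sequences(test_seqs))[:, 1]
print("linear-probe AUROC:", roc_auc_score(test_y, scores))
```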


Calibration with Bias-Corrected Temperature Scaling Improves Domain Adaptation Under Label Shift in Modern Neural Networks

arXiv.org Machine Learning

Label shift refers to the phenomenon where the marginal probability p(y) of observing a particular class changes between the training and test distributions while the conditional probability p(x|y) stays fixed. This is relevant in settings such as medical diagnosis, where a classifier trained to predict disease based on observed symptoms may need to be adapted to a different distribution where the baseline frequency of the disease is higher. Given calibrated estimates of p(y|x), one can apply an EM algorithm to correct for the shift in class imbalance between the training and test distributions without ever needing to calculate p(x|y). Unfortunately, modern neural networks typically fail to produce well-calibrated probabilities, compromising the effectiveness of this approach. Although Temperature Scaling can greatly reduce miscalibration in these networks, it can leave behind a systematic bias in the probabilities that still poses a problem. To address this, we extend Temperature Scaling with class-specific bias parameters, which largely eliminates systematic bias in the calibrated probabilities and allows for effective domain adaptation under label shift. We term our calibration approach "Bias-Corrected Temperature Scaling". On experiments with CIFAR10, we find that EM with Bias-Corrected Temperature Scaling significantly outperforms both EM with Temperature Scaling and the recently proposed Black-Box Shift Estimation.
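The two components described above can be sketched compactly: bias-corrected calibration computes softmax(z / T + b) with a scalar temperature T and per-class biases b fit by minimizing validation NLL, and a Saerens-style EM loop then re-estimates the test-set class priors from the calibrated probabilities. The plain gradient-descent fit below is an illustrative simplification, not the paper's exact optimization procedure.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_bcts(val_logits, val_labels, steps=500, lr=0.1):
    """Fit scalar temperature T and per-class biases b by gradient descent
    on validation NLL, so calibrated probs are softmax(z / T + b)."""
    n, k = val_logits.shape
    onehot = np.eye(k)[val_labels]
    T, b = 1.0, np.zeros(k)
    for _ in range(steps):
        p = softmax(val_logits / T + b)
        resid = (p - onehot) / n            # d(NLL)/d(logits of softmax)
        b -= lr * resid.sum(axis=0)         # gradient w.r.t. per-class biases
        T -= lr * (resid * (-val_logits / T**2)).sum()  # gradient w.r.t. T
        T = max(T, 1e-2)                    # keep the temperature positive
    return T, b

def em_label_shift(test_probs, train_prior, iters=200):
    """Saerens-style EM: re-estimate test class priors from calibrated
    probabilities without ever modeling p(x|y)."""
    prior = train_prior.copy()
    for _ in range(iters):
        adj = test_probs * (prior / train_prior)  # reweight by prior ratio
        adj /= adj.sum(axis=1, keepdims=True)     # renormalize per example
        prior = adj.mean(axis=0)                  # M-step: new prior estimate
    return prior, adj
```

With well-calibrated `test_probs` from `fit_bcts`, the returned `prior` estimates the shifted test-time class frequencies and `adj` gives the adapted posteriors.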


TF-MoDISco v0.4.4.2-alpha: Technical Note

arXiv.org Machine Learning

TF-MoDISco (Transcription Factor Motif Discovery from Importance Scores) is an algorithm for identifying motifs in basepair-level importance scores computed on genomic sequences. This paper describes the methods behind TF-MoDISco v0.4.4.2-alpha (https://github.
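Although the abstract is truncated here, the core input to TF-MoDISco is a per-base importance track. The sketch below is a hypothetical, simplified illustration of the kind of first step involved: extracting short high-scoring windows ("seqlets") as candidates for motif clustering. The real algorithm's thresholding and clustering are considerably more involved.

```python
# Illustrative seqlet extraction from a per-base importance track.
# The window width, percentile threshold, and greedy non-overlap rule are
# simplifications for demonstration, not TF-MoDISco's actual parameters.
import numpy as np

def find_seqlets(importance, window=15, percentile=95):
    # Sum absolute importance over each sliding window of fixed width.
    window_scores = np.convolve(np.abs(importance), np.ones(window), mode="valid")
    threshold = np.percentile(window_scores, percentile)
    # Greedily accept non-overlapping windows in order of decreasing score.
    order = np.argsort(window_scores)[::-1]
    taken = np.zeros(len(window_scores), dtype=bool)
    starts = []
    for i in order:
        if window_scores[i] < threshold:
            break
        if not taken[max(0, i - window + 1): i + window].any():
            starts.append(int(i))
            taken[i] = True
    return sorted(starts)

rng = np.random.default_rng(0)
track = rng.normal(0, 0.1, 1000)
track[200:215] += 1.0                  # planted high-importance motif
print(find_seqlets(track))             # recovers a start near position 200
```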


Computationally Efficient Measures of Internal Neuron Importance

arXiv.org Machine Learning

The challenge of assigning importance to individual neurons in a network is of interest when interpreting deep learning models. In recent work, Dhamdhere et al. proposed Total Conductance, a "natural refinement of Integrated Gradients" for attributing importance to internal neurons. Unfortunately, the authors found that calculating conductance in TensorFlow required the addition of several custom gradient operators and did not scale well. In this work, we show that the formula for Total Conductance is mathematically equivalent to Path Integrated Gradients computed on a hidden layer in the network. We provide a scalable implementation of Total Conductance using standard TensorFlow gradient operators that we call Neuron Integrated Gradients. We compare Neuron Integrated Gradients to DeepLIFT, a pre-existing computationally efficient approach that is applicable to calculating internal neuron importance. We find that DeepLIFT produces strong empirical results and is faster to compute, but because it lacks the theoretical properties of Neuron Integrated Gradients, it may not always be preferred in practice. Colab notebook reproducing results: http://bit.ly/neuronintegratedgradients
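A minimal PyTorch sketch of the equivalence claimed above: computing Integrated Gradients with a hidden layer's activations treated as the input recovers Total Conductance for that layer. The `trunk`/`head` split of the network is a hypothetical device for exposing the layer of interest; the original implementation uses TensorFlow.

```python
import torch

def neuron_integrated_gradients(trunk, head, x, baseline, target, steps=50):
    h = trunk(x).detach()             # hidden activations for the input
    h0 = trunk(baseline).detach()     # hidden activations for the baseline
    total_grad = torch.zeros_like(h)
    # Accumulate gradients of the target output w.r.t. the hidden layer
    # along the straight-line path from h0 to h.
    for alpha in torch.linspace(0.0, 1.0, steps):
        h_interp = (h0 + alpha * (h - h0)).requires_grad_(True)
        out = head(h_interp)[:, target].sum()
        total_grad += torch.autograd.grad(out, h_interp)[0]
    # Riemann approximation of the path integral, scaled by (h - h0).
    return (h - h0) * total_grad / steps

# Example usage with a toy two-part network:
trunk = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU())
head = torch.nn.Linear(16, 3)
x, x0 = torch.randn(4, 8), torch.zeros(4, 8)
attributions = neuron_integrated_gradients(trunk, head, x, x0, target=0)
```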


Learning to Abstain via Curve Optimization

arXiv.org Machine Learning

In practical applications of machine learning, it is often desirable to identify and abstain on examples where a model's predictions are likely to be incorrect. We consider the problem of selecting a budget-constrained subset of test examples to abstain on, with the goal of maximizing performance on the remaining examples. We develop a novel approach to this problem by analytically optimizing the expected marginal improvement in a desired performance metric, such as the area under the ROC curve or Precision-Recall curve. We compare our approach to other abstention techniques for deep learning models based on posterior probability and uncertainty estimates obtained using test-time dropout. On various tasks in computer vision, natural language processing, and bioinformatics, we demonstrate the consistent effectiveness of our approach over other techniques. We also introduce novel diagnostics based on influence functions to understand the behavior of abstention methods in the presence of noisy training data, and leverage the insights to propose a new influence-based abstention method.
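For context, here is a sketch of the posterior-probability baseline the paper compares against (not the curve-optimization method itself): given a fixed budget, abstain on the examples whose predicted probabilities sit closest to the decision boundary, then evaluate the metric on the retained examples. The toy data generation is purely illustrative.

```python
# Budget-constrained abstention via the posterior-probability baseline.
import numpy as np
from sklearn.metrics import roc_auc_score

def abstain_by_confidence(probs, labels, budget):
    # Least-confident binary predictions are those nearest p = 0.5.
    uncertainty = -np.abs(probs - 0.5)
    abstain_idx = np.argsort(uncertainty)[-budget:]
    keep = np.ones(len(probs), dtype=bool)
    keep[abstain_idx] = False
    return roc_auc_score(labels[keep], probs[keep])

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 1000)
probs = np.clip(labels * 0.6 + rng.normal(0.2, 0.25, 1000), 0, 1)
print("AUROC after abstaining on 100 examples:",
      abstain_by_confidence(probs, labels, budget=100))
```

The paper's contribution is to replace this generic uncertainty ranking with the expected marginal improvement of each example for the specific target metric.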


Unsupervised Learning from Noisy Networks with Applications to Hi-C Data

Neural Information Processing Systems

Complex networks play an important role in a plethora of disciplines in the natural sciences. Cleaning up noisy observed networks poses an important challenge in network analysis. Existing methods utilize labeled data to alleviate the noise effect in the network. However, labeled data is usually expensive to collect, while unlabeled data can be gathered cheaply. In this paper, we propose an optimization framework to mine useful structures from noisy networks in an unsupervised manner. The key feature of our optimization framework is its ability to utilize local structures as well as global patterns in the network. We extend our method to incorporate multi-resolution networks in order to add further resistance to high levels of noise. We also generalize our framework to utilize partial labels to enhance the performance. We specifically focus our method on multi-resolution Hi-C data by recovering clusters of genomic regions that co-localize in 3D space. Additionally, we use Capture-C-generated partial labels to further denoise the Hi-C network. We empirically demonstrate the effectiveness of our framework in denoising the network and improving community detection results.
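As a rough illustration of the general recipe (not the paper's actual optimization framework, multi-resolution extension, or Capture-C label integration), the sketch below denoises a symmetric noisy network with a low-rank spectral projection and clusters the resulting embedding to recover communities.

```python
# Illustrative low-rank denoising and community detection on a noisy network.
import numpy as np
from sklearn.cluster import KMeans

def spectral_denoise_and_cluster(A, n_communities):
    A = (A + A.T) / 2                           # enforce symmetry
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-n_communities:]
    embedding = vecs[:, top] * vals[top]        # low-rank spectral embedding
    denoised = embedding @ vecs[:, top].T       # rank-k reconstruction of A
    labels = KMeans(n_clusters=n_communities, n_init=10).fit_predict(embedding)
    return denoised, labels

# Toy example: two planted communities plus Gaussian noise.
rng = np.random.default_rng(0)
block = np.kron(np.eye(2), np.ones((50, 50)))   # block-diagonal ground truth
A = block * 0.8 + rng.normal(0, 0.3, (100, 100))
_, communities = spectral_denoise_and_cluster(A, n_communities=2)
```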