AITopics | Tsang, Michael

Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation

Luo, Liang, Zhang, Buyun, Tsang, Michael, Ma, Yinbin, Chu, Ching-Hsiang, Chen, Yuxin, Li, Shen, Hao, Yuchen, Zhao, Yanli, Lakshminarayanan, Guna, Wen, Ellie Dingqiao, Park, Jongsoo, Mudigere, Dheevatsa, Naumov, Maxim

arXiv.org Artificial Intelligence May-2-2024

We study a mismatch between the deep learning recommendation models' flat architecture, common distributed training paradigm and hierarchical data center topology. To address the associated inefficiencies, we propose Disaggregated Multi-Tower (DMT), a modeling technique that consists of (1) Semantic-preserving Tower Transform (SPTT), a novel training paradigm that decomposes the monolithic global embedding lookup process into disjoint towers to exploit data center locality; (2) Tower Module (TM), a synergistic dense component attached to each tower to reduce model complexity and communication volume through hierarchical feature interaction; and (3) Tower Partitioner (TP), a feature partitioner to systematically create towers with meaningful feature interactions and load balanced assignments to preserve model quality and training throughput via learned embeddings. We show that DMT can achieve up to 1.9x speedup compared to the state-of-the-art baselines without losing accuracy across multiple generations of hardware at large data center scales. Since the embedding tables can be huge, the state-of-the-art practices train these models in a hybrid fashion: the sparse embedding tables are sharded across devices in a model-parallel manner, while the dense components are replicated in a data-parallel manner and synchronized through AllReduce operations.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2403.00877

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Cloud Computing (0.89)

How does this interaction affect me? Interpretable attribution for feature interactions

Tsang, Michael, Rambhatla, Sirisha, Liu, Yan

arXiv.org Machine Learning Jun-19-2020

Machine learning transparency calls for interpretable explanations of how inputs relate to predictions. Feature attribution is a way to analyze the impact of features on predictions. Feature interactions are the contextual dependence between features that jointly impact predictions. There are a number of methods that extract feature interactions in prediction models; however, the methods that assign attributions to interactions are either uninterpretable, model-specific, or non-axiomatic. We propose an interaction attribution and detection framework called Archipelago which addresses these problems and is also scalable in real-world settings. Our experiments on standard annotation labels indicate our approach provides significantly more interpretable explanations than comparable methods, which is important for analyzing the impact of interactions on predictions. We also provide accompanying visualizations of our approach that give new insights into deep neural networks.

deep learning, interaction, neural network, (20 more...)

arXiv.org Machine Learning

2006.10965

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Feature Interaction Interpretability: A Case for Explaining Ad-Recommendation Systems via Neural Interaction Detection

Tsang, Michael, Cheng, Dehua, Liu, Hanpeng, Feng, Xue, Zhou, Eric, Liu, Yan

arXiv.org Machine Learning Jun-19-2020

Recommendation is a prevalent application of machine learning that affects many users; therefore, it is important for recommender models to be accurate and interpretable. In this work, we propose a method to both interpret and augment the predictions of black-box recommender systems. In particular, we propose to interpret feature interactions from a source recommender model and explicitly encode these interactions in a target recommender model, where both source and target models are black-boxes. By not assuming the structure of the recommender system, our approach can be used in general settings. In our experiments, we focus on a prominent use of machine learning recommendation: ad-click prediction. We found that our interaction interpretations are both informative and predictive, e.g., significantly outperforming existing recommender models. What's more, the same approach to interpret interactions can provide new insights into domains even beyond recommendation, such as text and image classification.

deep learning, interaction, neural network, (21 more...)

arXiv.org Machine Learning

2006.10966

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Leisure & Entertainment (0.93)
Transportation (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability

Tsang, Michael, Liu, Hanpeng, Purushotham, Sanjay, Murali, Pavankumar, Liu, Yan

Neural Information Processing Systems Feb-14-2020, 17:27:40 GMT

Neural networks are known to model statistical interactions, but they entangle the interactions at intermediate hidden layers for shared representation learning. We propose a framework, Neural Interaction Transparency (NIT), that disentangles the shared learning across different interactions to obtain their intrinsic lower-order and interpretable structure. This is done through a novel regularizer that directly penalizes interaction order. We show that disentangling interactions reduces a feedforward neural network to a generalized additive model with interactions, which can lead to transparent models that perform comparably to the state-of-the-art models. NIT is also flexible and efficient; it can learn generalized additive models with maximum $K$-order interactions by training only $O(1)$ models.

artificial intelligence, interaction, neural network, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Extracting Interpretable Concept-Based Decision Trees from CNNs

Chyung, Conner, Tsang, Michael, Liu, Yan

arXiv.org Machine Learning Jun-16-2019

In an attempt to gather a deeper understanding of how convolutional neural networks (CNNs) reason about human-understandable concepts, we present a method to infer labeled concept data from hidden layer activations and interpret the concepts through a shallow decision tree. The decision tree can provide information about which concepts a model deems important, as well as provide an understanding of how the concepts interact with each other. Experiments demonstrate that the extracted decision tree is capable of accurately representing the original CNN's classifications at low tree depths, thus encouraging human-in-the-loop understanding of discriminative concepts.

decision tree, decision tree learning, deep learning, (17 more...)

arXiv.org Machine Learning

1906.04664

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability

Tsang, Michael, Liu, Hanpeng, Purushotham, Sanjay, Murali, Pavankumar, Liu, Yan

Neural Information Processing Systems Dec-31-2018

Neural networks are known to model statistical interactions, but they entangle the interactions at intermediate hidden layers for shared representation learning. We propose a framework, Neural Interaction Transparency (NIT), that disentangles the shared learning across different interactions to obtain their intrinsic lower-order and interpretable structure. This is done through a novel regularizer that directly penalizes interaction order. We show that disentangling interactions reduces a feedforward neural network to a generalized additive model with interactions, which can lead to transparent models that perform comparably to the state-of-the-art models. NIT is also flexible and efficient; it can learn generalized additive models with maximum $K$-order interactions by training only $O(1)$ models.

artificial intelligence, interaction, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability

Tsang, Michael, Liu, Hanpeng, Purushotham, Sanjay, Murali, Pavankumar, Liu, Yan

Neural Information Processing Systems Dec-31-2018

Neural networks are known to model statistical interactions, but they entangle the interactions at intermediate hidden layers for shared representation learning. We propose a framework, Neural Interaction Transparency (NIT), that disentangles the shared learning across different interactions to obtain their intrinsic lower-order and interpretable structure. This is done through a novel regularizer that directly penalizes interaction order. We show that disentangling interactions reduces a feedforward neural network to a generalized additive model with interactions, which can lead to transparent models that perform comparably to the state-of-the-art models. NIT is also flexible and efficient; it can learn generalized additive models with maximum $K$-order interactions by training only $O(1)$ models.

health & medicine, interaction, neural network, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Can I trust you more? Model-Agnostic Hierarchical Explanations

Tsang, Michael, Sun, Youbang, Ren, Dongxu, Liu, Yan

arXiv.org Machine Learning Dec-11-2018

Interactions such as double negation in sentences and scene interactions in images are common forms of complex dependencies captured by state-of-the-art machine learning models. We propose Mahé, a novel approach to provide Model-agnostic hierarchical éxplanations of how powerful machine learning models, such as deep neural networks, capture these interactions as either dependent on or free of the context of data instances. Specifically, Mahé provides context-dependent explanations by a novel local interpretation algorithm that effectively captures any-order interactions, and obtains context-free explanations through generalizing context-dependent interactions to explain global behaviors. Experimental results show that Mahé obtains improved local interaction interpretations over state-of-the-art methods and successfully explains interactions that are context-free.

deep learning, interaction, neural network, (20 more...)

arXiv.org Machine Learning

1812.04801

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Media (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Detecting Statistical Interactions from Neural Network Weights

Tsang, Michael, Cheng, Dehua, Liu, Yan

arXiv.org Machine Learning Feb-27-2018

Interpreting neural networks is a crucial and challenging task in machine learning. In this paper, we develop a novel framework for detecting statistical interactions captured by a feedforward multilayer neural network by directly interpreting its learned weights. Depending on the desired interactions, our method can achieve significantly better or similar interaction detection performance compared to the state-of-the-art without searching an exponential solution space of possible interactions. We obtain this accuracy and efficiency by observing that interactions between input features are created by the non-additive effect of nonlinear activation functions, and that interacting paths are encoded in weight matrices. We demonstrate the performance of our method and the importance of discovered interactions via experimental results on both synthetic datasets and real-world application datasets.

deep learning, interaction, neural network, (18 more...)

arXiv.org Machine Learning

1705.04977

Country:

Europe (0.28)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
