dictionary feature
Towards Interpretable Deep Neural Networks for Tabular Data
Elhadri, Khawla, Schlötterer, Jörg, Seifert, Christin
Tabular data is the foundation of many applications in fields such as finance and healthcare. Although DNNs tailored for tabular data achieve competitive predictive performance, they are blackboxes with little interpretability. We introduce XNNTab, a neural architecture that uses a sparse autoencoder (SAE) to learn a dictionary of monosemantic features within the latent space used for prediction. Using an automated method, we assign human-interpretable semantics to these features. This allows us to represent predictions as linear combinations of semantically meaningful components. Empirical evaluations demonstrate that XNNTab attains performance on par with or exceeding that of state-of-the-art, black-box neural models and classical machine learning approaches while being fully interpretable.
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning
Wu, John, Wu, David, Sun, Jimeng
Medical coding, the translation of unstructured clinical text into standardized medical codes, is a crucial but time-consuming healthcare practice. Though large language models (LLM) could automate the coding process and improve the efficiency of such tasks, interpretability remains paramount for maintaining patient trust. Current efforts in interpretability of medical coding applications rely heavily on label attention mechanisms, which often leads to the highlighting of extraneous tokens irrelevant to the ICD code. To facilitate accurate interpretability in medical language models, this paper leverages dictionary learning that can efficiently extract sparsely activated representations from dense language model embeddings in superposition. Compared with common label attention mechanisms, our model goes beyond token-level representations by building an interpretable dictionary which enhances the mechanistic-based explanations for each ICD code prediction, even when the highlighted tokens are medically irrelevant. We show that dictionary features can steer model behavior, elucidate the hidden meanings of upwards of 90% of medically irrelevant tokens, and are human interpretable.
DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
Wu, John, Wu, David, Sun, Jimeng
Predicting high-dimensional or extreme multilabels, such as in medical coding, requires both accuracy and interpretability. Existing works often rely on local interpretability methods, failing to provide comprehensive explanations of the overall mechanism behind each label prediction within a multilabel set. We propose a mechanistic interpretability module called DIctionary Label Attention (\method) that disentangles uninterpretable dense embeddings into a sparse embedding space, where each nonzero element (a dictionary feature) represents a globally learned medical concept. Through human evaluations, we show that our sparse embeddings are more human understandable than its dense counterparts by at least 50 percent. Our automated dictionary feature identification pipeline, leveraging large language models (LLMs), uncovers thousands of learned medical concepts by examining and summarizing the highest activating tokens for each dictionary feature. We represent the relationships between dictionary features and medical codes through a sparse interpretable matrix, enhancing the mechanistic and global understanding of the model's predictions while maintaining competitive performance and scalability without extensive human annotation.
Vaporetto: Efficient Japanese Tokenization Based on Improved Pointwise Linear Classification
Akabe, Koichi, Kanda, Shunsuke, Oda, Yusuke, Mori, Shinsuke
This paper proposes an approach to improve the runtime efficiency of Japanese tokenization based on the pointwise linear classification (PLC) framework, which formulates the whole tokenization process as a sequence of linear classification problems. Our approach optimizes tokenization by leveraging the characteristics of the PLC framework and the task definition. Our approach involves (1) composing multiple classifications into array-based operations, (2) efficient feature lookup with memory-optimized automata, and (3) three orthogonal pre-processing methods for reducing actual score calculation. Thus, our approach makes the tokenization speed 5.7 times faster than the current approach based on the same model without decreasing tokenization accuracy. Our implementation is available at https://github.com/daac-tools/vaporetto under the MIT or Apache-2.0 license.
Interactive Semantic Featuring for Text Classification
Jandot, Camille, Simard, Patrice, Chickering, Max, Grangier, David, Suh, Jina
In text classification, dictionaries can be used to define human-comprehensible features. We propose an improvement to dictionary features called smoothed dictionary features. These features recognize document contexts instead of n-grams. We describe a principled methodology to solicit dictionary features from a teacher, and present results showing that models built using these human-comprehensible features are competitive with models trained with Bag of Words features.
Learning Dictionaries for Named Entity Recognition using Minimal Supervision
Neelakantan, Arvind, Collins, Michael
This paper describes an approach for automatic construction of dictionaries for Named Entity Recognition (NER) using large amounts of unlabeled data and a few seed examples. We use Canonical Correlation Analysis (CCA) to obtain lower dimensional embeddings (representations) for candidate phrases and classify these phrases using a small number of labeled examples. Our method achieves 16.5% and 11.3% F-1 score improvement over co-training on disease and virus NER respectively. We also show that by adding candidate phrase embeddings as features in a sequence tagger gives better performance compared to using word embed-dings.
Online Detection of Abnormal Events Using Incremental Coding Length
Dutta, Jayanta Kumar (University of Memphis) | Banerjee, Bonny (University of Memphis)
We present an unsupervised approach for abnormal event detection in videos. We propose, given a dictionary of features learned from local spatiotemporal cuboids using the sparse coding objective, the abnormality of an event depends jointly on two factors: the frequency of each feature in reconstructing all events (or, rarity of a feature) and the strength by which it is used in reconstructing the current event (or, the absolute coefficient). The Incremental Coding Length (ICL) of a feature is a measure of its entropy gain. Given a dictionary, the ICL computation does not involve any parameter, is computationally efficient and has been used for saliency detection in images with impressive results. In this paper, the rarity of a dictionary feature is learned online as its average energy, a function of its ICL. The proposed approach is applicable to real world streaming videos. Experiments on three benchmark datasets and evaluations in comparison with a number of mainstream algorithms show that the approach is comparable to the state-of-the-art.
ICE: Enabling Non-Experts to Build Models Interactively for Large-Scale Lopsided Problems
Simard, Patrice, Chickering, David, Lakshmiratan, Aparna, Charles, Denis, Bottou, Leon, Suarez, Carlos Garcia Jurado, Grangier, David, Amershi, Saleema, Verwey, Johan, Suh, Jina
Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. When the teacher is restricted to labeling examples selected by the machine, this problem is an instance of active learning. When the teacher can provide additional information to the machine (e.g., suggestions on what examples or predictive features should be used) as the learning task progresses, then the problem becomes one of interactive learning. To accommodate the two-way communication channel needed for efficient interactive learning, the teacher and the machine need an environment that supports an interaction language. The machine can access, process, and summarize more examples than the teacher can see in a lifetime. Based on the machine's output, the teacher can revise the definition of the task or make it more precise. Both the teacher and the machine continuously learn and benefit from the interaction. We have built a platform to (1) produce valuable and deployable models and (2) support research on both the machine learning and user interface challenges of the interactive learning problem. The platform relies on a dedicated, low-latency, distributed, in-memory architecture that allows us to construct web-scale learning machines with quick interaction speed. The purpose of this paper is to describe this architecture and demonstrate how it supports our research efforts. Preliminary results are presented as illustrations of the architecture but are not the primary focus of the paper.