
Collaborating Authors: Noraset, Thanapon


Some Insights of Construction of Feature Graph to Learn Pairwise Feature Interactions with Graph Neural Networks

arXiv.org Machine Learning

Feature interaction is crucial in predictive machine learning models, as it captures the relationships between features that influence model performance. In this work, we focus on pairwise interactions and investigate their importance in constructing feature graphs for Graph Neural Networks (GNNs). Rather than proposing new methods, we leverage existing GNN models and tools to explore the relationship between feature graph structures and their effectiveness in modeling interactions. Through experiments on synthetic datasets, we find that edges between interacting features are important for enabling GNNs to model feature interactions effectively. We also observe that including non-interaction edges can act as noise, degrading model performance. Furthermore, we provide theoretical support for sparse feature graph selection using the Minimum Description Length (MDL) principle. We prove that feature graphs retaining only necessary interaction edges yield a more efficient and interpretable representation than complete graphs, aligning with Occam's Razor. Our findings offer both theoretical insights and practical guidelines for designing feature graphs that improve the performance and interpretability of GNN models.
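
As a concrete illustration of the abstract's central point, the sketch below treats each input feature as a graph node and lets a GNN pass messages only along edges declared as interactions, so that spurious non-interaction edges would only inject noise. This is a hypothetical minimal model written for this summary, not the paper's implementation; the class name, dimensions, and the toy interaction edge are all assumptions.

```python
import torch
import torch.nn as nn

class FeatureGraphGNN(nn.Module):
    """Toy GNN over a feature graph: nodes are features, edges are
    (assumed) pairwise interactions. Hypothetical, for illustration only."""

    def __init__(self, n_features, edges, hidden=16):
        super().__init__()
        # Keep only the declared interaction edges (made undirected here);
        # this is the sparse graph the abstract argues for.
        src, dst = zip(*(edges + [(j, i) for (i, j) in edges]))
        self.register_buffer("src", torch.tensor(src))
        self.register_buffer("dst", torch.tensor(dst))
        self.embed = nn.Linear(1, hidden)          # lift scalar feature values to node states
        self.msg = nn.Linear(2 * hidden, hidden)   # message from a (src, dst) node pair
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x):                          # x: (batch, n_features)
        h = self.embed(x.unsqueeze(-1))            # (batch, n_features, hidden)
        # One round of message passing along interaction edges only.
        m = self.msg(torch.cat([h[:, self.src], h[:, self.dst]], dim=-1)).relu()
        h = h.index_add(1, self.dst, m)            # aggregate messages at destination nodes
        return self.readout(h.mean(dim=1))         # graph-level prediction

# Usage: suppose y = x0 * x1 + x2, so only features 0 and 1 interact
# and the feature graph keeps the single edge (0, 1).
model = FeatureGraphGNN(n_features=3, edges=[(0, 1)])
y_hat = model(torch.randn(8, 3))                   # (8, 1)
```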


Controlling Global Statistics in Recurrent Neural Network Text Generation

AAAI Conferences

Recurrent neural network language models (RNNLMs) are an essential component for many language generation tasks such as machine translation, summarization, and automated conversation. Often, we would like to subject the text generated by the RNNLM to constraints, in order to overcome systemic errors (e.g., word repetition) or achieve application-specific goals (e.g., more positive sentiment). In this paper, we present a method for training RNNLMs to simultaneously optimize likelihood and follow a given set of statistical constraints on text generation. The problem is challenging because the statistical constraints are defined over aggregate model behavior, rather than model parameters, meaning that a straightforward parameter regularization approach is insufficient. We solve this problem using a dynamic regularizer that updates as training proceeds, based on the generative behavior of the RNNLMs. Our experiments show that the dynamic regularizer outperforms both generic training and a static regularization baseline. The approach is successful at improving word-level repetition statistics by a factor of four in RNNLMs on a definition modeling task. It also improves model perplexity when the statistical constraints are n-gram statistics taken from a large corpus.
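
To make the distinction between parameter-level and behavior-level constraints concrete, the sketch below adds a penalty whose strength is re-estimated at each step from the model's own samples, here targeting a word-repetition rate. It is a simplified stand-in for the idea, not the paper's training procedure: the toy LM, the penalty update rule, and the differentiable repetition proxy are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    """Toy RNN language model standing in for an RNNLM. Illustrative only."""

    def __init__(self, vocab=50, hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, hid)
        self.rnn = nn.GRU(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, tokens):                 # tokens: (batch, time)
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)                     # logits: (batch, time, vocab)

def repetition_rate(seqs):
    # Fraction of positions whose token repeats the previous token.
    return (seqs[:, 1:] == seqs[:, :-1]).float().mean()

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
target_rep = 0.02       # assumed constraint on the aggregate repetition statistic
lam = 0.0               # dynamic regularizer weight, updated from model behavior

for step in range(100):
    batch = torch.randint(0, 50, (16, 20))     # stand-in for real training text
    logits = model(batch[:, :-1])
    nll = F.cross_entropy(logits.reshape(-1, 50), batch[:, 1:].reshape(-1))

    # Measure current generative behavior: sample next tokens under the model
    # (a cheap proxy for free-running generation) and compare to the target.
    with torch.no_grad():
        samples = torch.distributions.Categorical(logits=logits).sample()
        gap = repetition_rate(samples).item() - target_rep
        lam = max(0.0, lam + 0.5 * gap)        # strengthen penalty only while violated

    # Differentiable proxy: probability mass the model puts on repeating
    # the previous token at each step.
    p_repeat = logits.softmax(-1).gather(-1, batch[:, :-1].unsqueeze(-1)).mean()
    loss = nll + lam * p_repeat
    opt.zero_grad()
    loss.backward()
    opt.step()
```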


Definition Modeling: Learning to Define Word Embeddings in Natural Language

AAAI Conferences

Distributed representations of words have been shown to capture lexical semantics, based on their effectiveness in word similarity and analogical relation tasks. But these tasks evaluate lexical semantics only indirectly. In this paper, we study whether it is possible to utilize distributed representations to generate dictionary definitions of words, as a more direct and transparent representation of the embeddings' semantics. We introduce definition modeling, the task of generating a definition for a given word and its embedding. We present different definition model architectures based on recurrent neural networks and experiment with the models on multiple data sets. Our results show that a model that controls dependencies between the word being defined and the definition words performs significantly better, and that a character-level convolution layer that leverages morphology can complement word-level embeddings. Our analysis reveals which components of our models contribute to accuracy. Finally, the errors made by a definition model may provide insight into the shortcomings of word embeddings.
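
A minimal version of the basic setup, an RNN decoder conditioned at every step on the embedding of the word being defined, might look like the sketch below. All names and dimensions are illustrative assumptions; the paper's stronger variants (controlled dependencies on the defined word and a character-level convolution over its spelling) are deliberately omitted.

```python
import torch
import torch.nn as nn

class DefinitionModel(nn.Module):
    """Toy definition model: an RNN decoder generates definition tokens
    conditioned on the embedding of the word being defined. Illustrative only."""

    def __init__(self, vocab=1000, word_dim=300, hid=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, hid)
        # Each decoder input = previous definition token + the defined word's embedding.
        self.rnn = nn.GRU(hid + word_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, word_vec, def_tokens):
        # word_vec: (batch, word_dim) pretrained embedding of the word to define
        # def_tokens: (batch, time) definition prefix (teacher forcing)
        tok = self.emb(def_tokens)
        cond = word_vec.unsqueeze(1).expand(-1, tok.size(1), -1)
        h, _ = self.rnn(torch.cat([tok, cond], dim=-1))
        return self.out(h)                     # logits over the definition vocabulary

model = DefinitionModel()
logits = model(torch.randn(4, 300), torch.randint(0, 1000, (4, 12)))  # (4, 12, 1000)
```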