Goto

Collaborating Authors

 graphmax


Graphmax for Text Generation

arXiv.org Artificial Intelligence

In text generation, a large language model (LM) makes a choice of each new word based only on the former selection of its context using the softmax function. Nevertheless, the link statistics information of concurrent words based on a scene-specific corpus is valuable in choosing the next word, which can help to ensure the topic of the generated text to be aligned with the current task. To fully explore the co-occurrence information,we propose a graphmax function for task-specific text generation. Using the graph-based regularization, graphmax enables the final word choice to be determined by both the global knowledge from the LM and the local knowledge from the scene-specific corpus. The traditional softmax function is regularized with a graph total variation (GTV) term, which incorporates the local knowledge into the LM and encourages the model to consider the statistical relationships between words in a scene-specific corpus. The proposed graphmax is versatile and can be readily plugged into any large pre-trained LM for text generation and machine translation. Through extensive experiments, we demonstrate that the new GTV-based regularization can improve performances in various natural language processing tasks in comparison with existing methods. Moreover, through human experiments, we observe that participants can easily distinguish the text generated by graphmax or softmax.


Graphmax for Text Generation

Journal of Artificial Intelligence Research

In text generation, a large language model (LM) makes a choice of each new word based only on the former selection of its context using the softmax function. Nevertheless, the link statistics information of concurrent words based on a scene-specific corpus is valuable in choosing the next word, which can help to ensure the topic of the generated text to be aligned with the current task. To fully explore the co-occurrence information, we propose a graphmax function for task-specific text generation. Using the graph-based regularization, graphmax enables the final word choice to be determined by both the global knowledge from the LM and the local knowledge from the scene-specific corpus. The traditional softmax function is regularized with a graph total variation (GTV) term, which incorporates the local knowledge into the LM and encourages the model to consider the statistical relationships between words in a scene-specific corpus. The proposed graphmax is versatile and can be readily plugged into any large pre-trained LM for text generation and machine translation. Through extensive experiments, we demonstrate that the new GTV-based regularization can improve performances in various natural language processing (NLP) tasks in comparison with existing methods. Moreover, through human experiments, we observe that participants can easily distinguish the text generated by graphmax or softmax.


Unsupervised Graph Representation by Periphery and Hierarchical Information Maximization

arXiv.org Machine Learning

Deep representation learning on non-Euclidean data types, such as graphs, has gained significant attention in recent years. Invent of graph neural networks has improved the state-of-the-art for both node and the entire graph representation in a vector space. However, for the entire graph representation, most of the existing graph neural networks are trained on a graph classification loss in a supervised way. But obtaining labels of a large number of graphs is expensive for real world applications. Thus, we aim to propose an unsupervised graph neural network to generate a vector representation of an entire graph in this paper. For this purpose, we combine the idea of hierarchical graph neural networks and mutual information maximization into a single framework. We also propose and use the concept of periphery representation of a graph and show its usefulness in the proposed algorithm which is referred as GraPHmax. We conduct thorough experiments on several real-world graph datasets and compare the performance of GraPHmax with a diverse set of both supervised and unsupervised baseline algorithms. Experimental results show that we are able to improve the state-of-the-art for multiple graph level tasks on several real-world datasets, while remain competitive on the others.