Ding, Hongyi
RakutenAI-7B: Extending Large Language Models for Japanese
Rakuten Group, Levine, Aaron, Huang, Connie, Wang, Chenguang, Batista, Eduardo, Szymanska, Ewa, Ding, Hongyi, Chou, Hou Wei, Pessiot, Jean-François, Effendi, Johanes, Chiu, Justin, Ohlhus, Kai Torben, Chopra, Karan, Shinzato, Keiji, Murakami, Koji, Xiong, Lee, Chen, Lei, Kubota, Maki, Tkachenko, Maksim, Lee, Miroku, Takahashi, Naoki, Jwalapuram, Prathyusha, Tatsushima, Ryutaro, Jain, Saurabh, Yadav, Sunil Kumar, Cai, Ting, Chen, Wei-Te, Xia, Yandi, Nakayama, Yuki, Higashiyama, Yutaka
We introduce RakutenAI-7B, a suite of Japanese-oriented large language models that achieve the best performance on the Japanese LM Harness benchmarks among the open 7B models. Along with the foundation model, we release instruction- and chat-tuned models, RakutenAI-7B-instruct and RakutenAI-7B-chat respectively, under the Apache 2.0 license.
Variational Inference for Gaussian Process with Panel Count Data
Ding, Hongyi, Lee, Young, Sato, Issei, Sugiyama, Masashi
We present the first framework for Gaussian-process-modulated Poisson processes when the temporal data appear in the form of panel counts. Panel count data arise when experimental subjects are observed only at discrete time points, so that only the numbers of event occurrences between consecutive observation times are available; the exact occurrence timestamps of the events are unknown. We present an efficient variational inference method based on a Gaussian-process-modulated intensity function, and derive a tractable lower bound that alleviates the intractability of the evidence lower bound inherent in the variational inference framework. Our algorithm outperforms classical methods on both synthetic data and three real-world panel count data sets.
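The panel-count setup above can be illustrated with a minimal numpy sketch: a latent function is drawn from a Gaussian process, squared to give a non-negative intensity, and only the event counts between consecutive observation times are recorded, each Poisson-distributed with mean equal to the integral of the intensity over that panel. All hyperparameters (kernel scale, length-scale, observation times) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense grid over the observation window.
grid = np.linspace(0.0, 10.0, 200)

def rbf(x, y, scale=1.0, length=1.5):
    """RBF kernel with hypothetical hyperparameters."""
    return scale * np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / length ** 2)

# Draw one latent function from the GP prior.
K = rbf(grid, grid) + 1e-8 * np.eye(grid.size)
f = rng.multivariate_normal(np.zeros(grid.size), K)

# Squaring the GP draw guarantees a non-negative intensity.
intensity = f ** 2

# Panel count data: subjects are observed only at these discrete times.
obs_times = np.array([0.0, 2.5, 5.0, 7.5, 10.0])

def expected_counts(intensity, grid, obs_times):
    """Integrate the intensity over each panel with the trapezoid rule."""
    out = []
    for a, b in zip(obs_times[:-1], obs_times[1:]):
        mask = (grid >= a) & (grid <= b)
        xs, ys = grid[mask], intensity[mask]
        out.append(np.sum(0.5 * (ys[:-1] + ys[1:]) * np.diff(xs)))
    return np.array(out)

# Expected count per panel, and a simulated panel-count observation.
mu = expected_counts(intensity, grid, obs_times)
counts = rng.poisson(mu)

# Poisson log-likelihood of the panel counts (up to an additive constant);
# the paper's variational inference bounds this quantity tractably.
loglik = np.sum(counts * np.log(mu) - mu)
```

This only simulates the generative model; the paper's contribution is the tractable variational lower bound for inferring the latent intensity from such counts.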
Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling
Ding, Hongyi, Sato, Issei, Sugiyama, Masashi
Analyzing the underlying structure of multiple time sequences provides insight into social networks and human activities. In this work, we present Bayesian nonparametric Poisson process allocation (BaNPPA), a generative model that automatically infers the number of latent functions in temporal data. We model the intensity of each sequence as an infinite mixture of latent functions, each of which is the square of a function drawn from a Gaussian process. A technical challenge in the inference of such mixture models is the identifiability issue between the mixture coefficients and the scale of the latent functions. We address this issue by regulating the volume of each latent function, and derive a variational inference algorithm that is computationally efficient and scales well to large data sets. Finally, we demonstrate the usefulness of the proposed Bayesian nonparametric model through experiments on both synthetic and real-world data sets.
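The volume-regulation idea can be sketched as follows: each latent function is the square of a GP draw, rescaled so that its integral (volume) over the window is fixed to 1, which removes the scale ambiguity between the latent functions and the mixture weights. The number of latent functions, kernel length-scale, and mixture weights below are illustrative assumptions, and a truncated finite mixture stands in for the paper's infinite (nonparametric) one.

```python
import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 100)
dt = grid[1] - grid[0]

def rbf(x, length=0.2):
    """RBF kernel with a hypothetical length-scale."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * d ** 2 / length ** 2)

K = rbf(grid) + 1e-8 * np.eye(grid.size)

# Truncated sketch: three latent functions, each the square of a GP draw,
# standing in for the infinite mixture in BaNPPA.
n_latent = 3
latent = rng.multivariate_normal(np.zeros(grid.size), K, size=n_latent) ** 2

# Volume regulation: rescale each latent function so its integral over the
# window is 1, removing the scale ambiguity between functions and weights.
volumes = latent.sum(axis=1) * dt
latent /= volumes[:, None]

# One sequence's intensity is a weighted mixture of the shared latent
# functions (hypothetical weights).
weights = np.array([2.0, 0.5, 1.0])
intensity = weights @ latent

# Because every latent function integrates to 1, the total intensity mass
# of the sequence equals the sum of its mixture weights.
total_mass = intensity.sum() * dt
```

With the volumes fixed, the weights alone carry the scale of each sequence, which is what makes the coefficients identifiable during variational inference.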