AITopics | Kamigaito, Hidetaka

Collaborating Authors

Kamigaito, Hidetaka

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Long-Tail Crisis in Nearest Neighbor Language Models

Nishida, Yuto, Morishita, Makoto, Deguchi, Hiroyuki, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial IntelligenceMar-28-2025

The $k$-nearest-neighbor language model ($k$NN-LM), one of the retrieval-augmented language models, improves the perplexity for given text by directly accessing a large datastore built from any text data during inference. A widely held hypothesis for the success of $k$NN-LM is that its explicit memory, i.e., the datastore, enhances predictions for long-tail phenomena. However, prior works have primarily shown its ability to retrieve long-tail contexts, leaving the model's performance remain underexplored in estimating the probabilities of long-tail target tokens during inference. In this paper, we investigate the behavior of $k$NN-LM on low-frequency tokens, examining prediction probability, retrieval accuracy, token distribution in the datastore, and approximation error of the product quantization. Our experimental results reveal that $k$NN-LM does not improve prediction performance for low-frequency tokens but mainly benefits high-frequency tokens regardless of long-tail contexts in the datastore.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.22426

Country:

North America (0.46)
Asia > Middle East (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Considering Length Diversity in Retrieval-Augmented Summarization

Juseon-Do, null, Hwang, Jaesung, Kwon, Jingun, Kamigaito, Hidetaka, Okumura, Manabu

arXiv.org Artificial IntelligenceMar-12-2025

This study investigates retrieval-augmented summarization by specifically examining the impact of exemplar summary lengths under length constraints, not covered by previous work. We propose a Diverse Length-aware Maximal Marginal Relevance (DL-MMR) algorithm to better control summary lengths. This algorithm combines the query relevance with diverse target lengths in retrieval-augmented summarization. Unlike previous methods that necessitate exhaustive exemplar exemplar relevance comparisons using MMR, DL-MMR considers the exemplar target length as well and avoids comparing exemplars to each other, thereby reducing computational cost and conserving memory during the construction of an exemplar pool. Experimental results showed the effectiveness of DL-MMR, which considers length diversity, compared to the original MMR algorithm. DL-MMR additionally showed the effectiveness in memory saving of 781,513 times and computational cost reduction of 500,092 times, while maintaining the same level of informativeness.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.09249

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)

Add feedback

AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts

Murakami, Soichiro, Zhang, Peinan, Kamigaito, Hidetaka, Takamura, Hiroya, Okumura, Manabu

arXiv.org Artificial IntelligenceFeb-11-2025

Effective linguistic choices that attract potential customers play crucial roles in advertising success. This study aims to explore the linguistic features of ad texts that influence human preferences. Although the creation of attractive ad texts is an active area of research, progress in understanding the specific linguistic features that affect attractiveness is hindered by several obstacles. First, human preferences are complex and influenced by multiple factors, including their content, such as brand names, and their linguistic styles, making analysis challenging. Second, publicly available ad text datasets that include human preferences are lacking, such as ad performance metrics and human feedback, which reflect people's interests. To address these problems, we present AdParaphrase, a paraphrase dataset that contains human preferences for pairs of ad texts that are semantically equivalent but differ in terms of wording and style. This dataset allows for preference analysis that focuses on the differences in linguistic features. Our analysis revealed that ad texts preferred by human judges have higher fluency, longer length, more nouns, and use of bracket symbols. Furthermore, we demonstrate that an ad text-generation model that considers these findings significantly improves the attractiveness of a given text. The dataset is publicly available at: https://github.com/CyberAgentAILab/AdParaphrase.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.04674

Genre: Research Report > New Finding (1.00)

Industry: Marketing (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Tonguescape: Exploring Language Models Understanding of Vowel Articulation

Sakajo, Haruki, Sakai, Yusuke, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial IntelligenceJan-29-2025

Vowels are primarily characterized by tongue position. Humans have discovered these features of vowel articulation through their own experience and explicit objective observation such as using MRI. With this knowledge and our experience, we can explain and understand the relationship between tongue positions and vowels, and this knowledge is helpful for language learners to learn pronunciation. Since language models (LMs) are trained on a large amount of data that includes linguistic and medical fields, our preliminary studies indicate that an LM is able to explain the pronunciation mechanisms of vowels. However, it is unclear whether multi-modal LMs, such as vision LMs, align textual information with visual information. One question arises: do LMs associate real tongue positions with vowel articulation? In this study, we created video and image datasets from the existing real-time MRI dataset and investigated whether LMs can understand vowel articulation based on tongue positions using vision-based information. Our findings suggest that LMs exhibit potential for understanding vowels and tongue positions when reference examples are provided while they have difficulties without them. Our code for dataset building is available on GitHub.

large language model, machine learning, vowel, (21 more...)

arXiv.org Artificial Intelligence

2501.17643

Country:

North America > United States > Louisiana (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

Can Impressions of Music be Extracted from Thumbnail Images?

Harada, Takashi, Motomitsu, Takehiro, Hayashi, Katsuhiko, Sakai, Yusuke, Kamigaito, Hidetaka

arXiv.org Artificial IntelligenceJan-5-2025

In recent years, there has been a notable increase in research on machine learning models for music retrieval and generation systems that are capable of taking natural language sentences as inputs. However, there is a scarcity of large-scale publicly available datasets, consisting of music data and their corresponding natural language descriptions known as music captions. In particular, non-musical information such as suitable situations for listening to a track and the emotions elicited upon listening is crucial for describing music. This type of information is underrepresented in existing music caption datasets due to the challenges associated with extracting it directly from music data. To address this issue, we propose a method for generating music caption data that incorporates non-musical aspects inferred from music thumbnail images, and validated the effectiveness of our approach through human evaluations. Additionally, we created a dataset with approximately 360,000 captions containing non-musical aspects. Leveraging this dataset, we trained a music retrieval model and demonstrated its effectiveness in music retrieval tasks through evaluation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2501.02511

Country:

Asia > Japan (0.14)
Asia > Thailand (0.14)
Asia > India (0.14)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain

Ozaki, Shintaro, Kato, Yuta, Feng, Siyuan, Tomita, Masayo, Hayashi, Kazuki, Obara, Ryoma, Oyamada, Masafumi, Hayashi, Katsuhiko, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial IntelligenceDec-28-2024

Retrieval Augmented Generation (RAG) complements the knowledge of Large Language Models (LLMs) by leveraging external information to enhance response accuracy for queries. This approach is widely applied in several fields by taking its advantage of injecting the most up-to-date information, and researchers are focusing on understanding and improving this aspect to unlock the full potential of RAG in such high-stakes applications. However, despite the potential of RAG to address these needs, the mechanisms behind the confidence levels of its outputs remain underexplored, although the confidence of information is very critical in some domains, such as finance, healthcare, and medicine. Our study focuses the impact of RAG on confidence within the medical domain under various configurations and models. We evaluate confidence by treating the model's predicted probability as its output and calculating Expected Calibration Error (ECE) and Adaptive Calibration Error (ACE) scores based on the probabilities and accuracy. In addition, we analyze whether the order of retrieved documents within prompts calibrates the confidence. Our findings reveal large variation in confidence and accuracy depending on the model, settings, and the format of input prompts. These results underscore the necessity of optimizing configurations based on the specific model and conditions.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.20309

Country:

Asia (0.93)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CoAM: Corpus of All-Type Multiword Expressions

Ide, Yusuke, Tanner, Joshua, Nohejl, Adam, Hoffman, Jacob, Vasselli, Justin, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial IntelligenceDec-23-2024

Multiword expressions (MWEs) refer to idiomatic sequences of multiple words. MWE identification, i.e., detecting MWEs in text, can play a key role in downstream tasks such as machine translation. Existing datasets for MWE identification are inconsistently annotated, limited to a single type of MWE, or limited in size. To enable reliable and comprehensive evaluation, we created CoAM: Corpus of All-Type Multiword Expressions, a dataset of 1.3K sentences constructed through a multi-step process to enhance data quality consisting of human annotation, human review, and automated consistency checking. MWEs in CoAM are tagged with MWE types, such as Noun and Verb, to enable fine-grained error analysis. Annotations for CoAM were collected using a new interface created with our interface generator, which allows easy and flexible annotation of MWEs in any form, including discontinuous ones. Through experiments using CoAM, we find that a fine-tuned large language model outperforms the current state-of-the-art approach for MWE identification. Furthermore, analysis using our MWE type tagged data reveals that Verb MWEs are easier than Noun MWEs to identify across approaches.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.18151

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.84)

Industry: Transportation (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Add feedback

IRR: Image Review Ranking Framework for Evaluating Vision-Language Models

Hayashi, Kazuki, Onishi, Kazuma, Suzuki, Toma, Ide, Yusuke, Gobara, Seiji, Saito, Shigeki, Sakai, Yusuke, Kamigaito, Hidetaka, Hayashi, Katsuhiko, Watanabe, Taro

arXiv.org Artificial IntelligenceDec-16-2024

Large-scale Vision-Language Models (LVLMs) process both images and text, excelling in multimodal tasks such as image captioning and description generation. However, while these models excel at generating factual content, their ability to generate and evaluate texts reflecting perspectives on the same image, depending on the context, has not been sufficiently explored. To address this, we propose IRR: Image Review Rank, a novel evaluation framework designed to assess critic review texts from multiple perspectives. IRR evaluates LVLMs by measuring how closely their judgments align with human interpretations. We validate it using a dataset of images from 15 categories, each with five critic review texts and annotated rankings in both English and Japanese, totaling over 2,000 data instances. The datasets are available at https://hf.co/datasets/naist-nlp/Wiki-ImageReview1.0. Our results indicate that, although LVLMs exhibited consistent performance across languages, their correlation with human annotations was insufficient, highlighting the need for further advancements. These findings highlight the limitations of current evaluation methods and the need for approaches that better capture human reasoning in Vision & Language tasks.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2402.12121

Country:

Asia (1.00)
North America > United States (0.46)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

NERsocial: Efficient Named Entity Recognition Dataset Construction for Human-Robot Interaction Utilizing RapidNER

Atuhurra, Jesse, Kamigaito, Hidetaka, Ouchi, Hiroki, Shindo, Hiroyuki, Watanabe, Taro

arXiv.org Artificial IntelligenceNov-27-2024

Adapting named entity recognition (NER) methods to new domains poses significant challenges. We introduce RapidNER, a framework designed for the rapid deployment of NER systems through efficient dataset construction. RapidNER operates through three key steps: (1) extracting domain-specific sub-graphs and triples from a general knowledge graph, (2) collecting and leveraging texts from various sources to build the NERsocial dataset, which focuses on entities typical in human-robot interaction, and (3) implementing an annotation scheme using Elasticsearch (ES) to enhance efficiency. NERsocial, validated by human annotators, includes six entity types, 153K tokens, and 99.4K sentences, demonstrating RapidNER's capability to expedite dataset creation.

artificial intelligence, natural language, text processing, (19 more...)

arXiv.org Artificial Intelligence

2412.09634

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Indonesia (0.67)

Genre: Research Report > New Finding (0.46)

Industry:

Media > News (1.00)
Media > Music (1.00)
Leisure & Entertainment > Sports > Motorsports (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Theoretical Aspects of Bias and Diversity in Minimum Bayes Risk Decoding

Kamigaito, Hidetaka, Deguchi, Hiroyuki, Sakai, Yusuke, Hayashi, Katsuhiko, Watanabe, Taro

arXiv.org Artificial IntelligenceOct-19-2024

Text generation commonly relies on greedy and beam decoding that limit the search space and degrade output quality. Minimum Bayes Risk (MBR) decoding can mitigate this problem by utilizing automatic evaluation metrics and model-generated pseudo-references. Previous studies have conducted empirical analyses to reveal the improvement by MBR decoding, and reported various observations. However, despite these observations, the theoretical relationship between them remains uncertain. To address this, we present a novel theoretical interpretation of MBR decoding from the perspective of bias-diversity decomposition. We decompose errors in the estimated quality of generated hypotheses in MBR decoding into two key factors: bias, which reflects the closeness between utility functions and human evaluations, and diversity, which represents the variation in the estimated quality of utility functions. Our theoretical analysis reveals the difficulty in simultaneously improving both bias and diversity, and highlights the effectiveness of increasing diversity to enhance MBR decoding performance. This analysis verifies the alignment between our theoretical insights and the empirical results reported in previous work. Furthermore, to support our theoretical findings, we propose a new metric, pseudo-bias, which approximates the bias term using gold references. We also introduce a new MBR approach, Metric-augmented MBR (MAMBR), which increases diversity by adjusting the behavior of utility functions without altering the pseudo-references. Experimental results across multiple NLP tasks show that the decomposed terms in the bias-diversity decomposition correlate well with performance, and that MAMBR improves text generation quality by modifying utility function behavior. Our code will be available at https://github.com/naist-nlp/mbr-bias-diversity.

computational linguistic, large language model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.15021

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)

Add feedback