Goto

Collaborating Authors: Morishima, Shigeo


Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions

arXiv.org Artificial Intelligence

Replying to formal emails is time-consuming and cognitively demanding, as it requires crafting polite phrasing and providing an adequate response to the sender's demands. Although systems built on Large Language Models (LLMs) have been designed to simplify the email-replying process, users still need to provide detailed prompts to obtain the expected output. We therefore proposed and evaluated an LLM-powered question-and-answer (QA)-based approach in which users reply to emails by answering a set of simple, short questions generated from the incoming email. We developed a prototype system, ResQ, and conducted controlled and field experiments with 12 and 8 participants, respectively. Our results demonstrated that the QA-based approach improves the efficiency of replying to emails and reduces workload while maintaining email quality, compared to a conventional prompt-based approach that requires users to craft appropriate prompts to obtain email drafts. We discuss how the QA-based approach influences the email reply process and interpersonal relationship dynamics, as well as the opportunities and challenges of using a QA-based approach in AI-mediated communication.
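
The abstract does not include implementation details; the sketch below illustrates the two LLM calls such a QA loop plausibly needs (question generation, then answer-grounded drafting) using the OpenAI chat API. The model name and prompts are assumptions, not ResQ's actual implementation.

```python
# Minimal sketch of a QA-based reply flow; prompts and model are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_questions(incoming_email: str) -> str:
    """Ask the LLM to turn the sender's demands into short questions."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[
            {"role": "system", "content": "Extract the sender's requests as a "
             "numbered list of short, simple questions the recipient can "
             "answer in a few words."},
            {"role": "user", "content": incoming_email},
        ],
    )
    return resp.choices[0].message.content

def draft_reply(incoming_email: str, qa_pairs: str) -> str:
    """Compose a polite formal reply grounded in the user's answers."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Write a polite, formal reply to "
             "the email below, using only the provided question-answer pairs "
             "as content."},
            {"role": "user", "content": f"Email:\n{incoming_email}\n\nAnswers:\n{qa_pairs}"},
        ],
    )
    return resp.choices[0].message.content
```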


Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People

arXiv.org Artificial Intelligence

Navigation robots powered by Visual Language Navigation (VLN) have the potential to guide blind people by understanding and executing route instructions provided by sighted passersby. This capability allows robots to operate in environments that are often unknown a priori. Existing VLN models are insufficient for this scenario because they must understand routes described from human memory, which frequently contain stutters, errors, and omitted details, unlike instructions obtained by thinking out loud, such as those in the Room-to-Room dataset. However, no existing benchmark simulates instructions recalled from human memory in environments where blind people navigate. To this end, we present Memory-Maze, a benchmark that simulates the scenario of seeking route instructions for guiding blind people. Our benchmark contains a maze-like structured virtual environment and novel route-instruction data recalled from human memory. To collect natural-language instructions, we conducted two studies: one with sighted passersby onsite and one with annotators online. Our analysis demonstrates that the instructions collected onsite were longer and contained more varied wording. Alongside our benchmark, we propose a VLN model better equipped to handle this scenario. Our proposed VLN model uses Large Language Models (LLMs) to parse instructions and generate Python code for robot control. We further show that the existing state-of-the-art model performed suboptimally on our benchmark, whereas our proposed method outperformed it by a fair margin. Our findings suggest that future research should exercise caution when considering VLN technology for practical applications, as real-world instructions have different characteristics from those collected in traditional settings.
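
The abstract states that the model parses instructions with an LLM and emits Python code for robot control. The sketch below illustrates that idea under stated assumptions: the tiny robot API (move_forward, turn_left, ...) and the model name are hypothetical placeholders, not the paper's actual interface.

```python
# Sketch of instruction-to-code translation; the robot API is hypothetical.
from openai import OpenAI

ROBOT_API = """
move_forward(meters: float)
turn_left()
turn_right()
stop()
"""

def instruction_to_code(instruction: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[
            {"role": "system",
             "content": "Translate the route instruction into Python calls "
                        f"using ONLY this API:\n{ROBOT_API}\nOutput code only."},
            {"role": "user", "content": instruction},
        ],
    )
    return resp.choices[0].message.content

# instruction_to_code("Go straight about five meters, then turn right at the corner.")
# might yield:  move_forward(5.0)
#               turn_right()
```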


Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning

arXiv.org Artificial Intelligence

This paper presents a novel metric learning approach to address the performance gap between normal and silent speech in visual speech recognition (VSR). The difference in lip movements between the two poses a challenge for existing VSR models, which exhibit degraded accuracy when applied to silent speech. To solve this issue and to tackle the scarcity of training data for silent speech, we leverage the shared literal content between normal and silent speech and present a metric learning approach based on visemes. Specifically, we map inputs of the two speech types close to each other in a latent space if they have similar viseme representations. By minimizing the Kullback-Leibler divergence of the predicted viseme probability distributions between and within the two speech types, our model effectively learns and predicts viseme identities. Our evaluation demonstrates that our method improves the accuracy of silent VSR, even when limited training data is available.
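
A minimal PyTorch sketch of the objective described above, under a simplified reading of the abstract: paired normal/silent clips share viseme labels, each branch is trained to predict them, and a symmetric KL term pulls the two predicted distributions together. The pairing scheme and the unit loss weighting are assumptions.

```python
# Simplified viseme-based KL objective; not the authors' exact loss.
import torch
import torch.nn.functional as F

def viseme_kl_loss(logits_normal: torch.Tensor,
                   logits_silent: torch.Tensor,
                   viseme_targets: torch.Tensor) -> torch.Tensor:
    """logits_*: (batch, num_visemes) predictions for paired utterances with
    the same literal content; viseme_targets: (batch,) viseme class ids."""
    log_p_normal = F.log_softmax(logits_normal, dim=-1)
    log_p_silent = F.log_softmax(logits_silent, dim=-1)
    # Within-type term: each branch should predict the viseme identity.
    ce = (F.nll_loss(log_p_normal, viseme_targets)
          + F.nll_loss(log_p_silent, viseme_targets))
    # Between-type term: symmetric KL pulls the two distributions together.
    kl = (F.kl_div(log_p_silent, log_p_normal.exp(), reduction="batchmean")
          + F.kl_div(log_p_normal, log_p_silent.exp(), reduction="batchmean"))
    return ce + kl
```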


Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

arXiv.org Artificial Intelligence

Language learners should regularly read challenging materials as part of their study routine. Nevertheless, constantly referring to dictionaries is time-consuming and distracting. This paper presents a novel gaze-driven sentence simplification system designed to enhance reading comprehension while keeping learners focused on the content. Our system incorporates machine learning models tailored to individual learners, combining eye-gaze features and linguistic features to assess sentence comprehension. When the system identifies comprehension difficulties, it provides simplified versions by replacing complex vocabulary and grammar with simpler alternatives via GPT-3.5. We conducted an experiment with 19 English learners, collecting data on their eye movements while they read English text. The results demonstrated that our system can accurately estimate sentence-level comprehension. Additionally, we found that GPT-3.5 simplification improved readability in terms of traditional readability metrics and individual word difficulty, paraphrasing across different linguistic levels.
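
The abstract describes per-learner models combining gaze and linguistic features; below is a minimal sketch of such a comprehension classifier. The specific features, the logistic-regression choice, and the toy data are illustrative assumptions, not the paper's pipeline.

```python
# Toy per-learner comprehension classifier over gaze + linguistic features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One row per sentence: [total fixation time (s), regression count,
# mean fixation duration (s), sentence length, mean word frequency band];
# label 1 = comprehension difficulty. Values are made up for illustration.
X = np.array([[2.1, 3, 0.24, 18, 4.2],
              [0.8, 0, 0.18, 9, 5.6],
              [3.4, 5, 0.31, 22, 3.9]])
y = np.array([1, 0, 1])

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# A sentence predicted as difficult would then be sent to GPT-3.5
# for vocabulary and grammar simplification.
print(model.predict_proba(X[:1]))
```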


Memory Efficient Diffusion Probabilistic Models via Patch-based Generation

arXiv.org Artificial Intelligence

Diffusion probabilistic models have been successful in generating high-quality and diverse images. However, traditional models, whose input and output are high-resolution images, suffer from excessive memory requirements, making them less practical for edge devices. Previous approaches for generative adversarial networks proposed a patch-based method that uses positional encoding and global content information. Nevertheless, designing a patch-based approach for diffusion probabilistic models is non-trivial. In this paper, we present a diffusion probabilistic model that generates images on a patch-by-patch basis. We propose two conditioning methods for patch-based generation. First, we propose position-wise conditioning using a one-hot representation to ensure patches are generated in their proper positions. Second, we propose Global Content Conditioning (GCC) to ensure patches have coherent content when concatenated. We evaluate our model qualitatively and quantitatively on the CelebA and LSUN bedroom datasets and demonstrate a moderate trade-off between maximum memory consumption and generated image quality. Specifically, when an entire image is divided into 2 x 2 patches, our proposed approach can reduce the maximum memory consumption by half while maintaining comparable image quality.
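
A minimal sketch of the position-wise one-hot conditioning for a 2 x 2 patch layout. Concatenating the one-hot code as extra input channels is an assumption about the wiring; the abstract does not specify how the condition enters the denoiser.

```python
# One-hot position conditioning for patch-based generation (illustrative).
import torch

def add_position_condition(patch: torch.Tensor, index: int,
                           n_patches: int = 4) -> torch.Tensor:
    """patch: (batch, channels, H, W); index: patch position in [0, n_patches)."""
    b, _, h, w = patch.shape
    onehot = torch.zeros(b, n_patches, h, w, device=patch.device)
    onehot[:, index] = 1.0  # broadcast the one-hot code over every pixel
    return torch.cat([patch, onehot], dim=1)  # extra conditioning channels

x = torch.randn(8, 3, 32, 32)              # noisy 32x32 patch of a 64x64 image
cond = add_position_condition(x, index=2)  # third patch in raster order
assert cond.shape == (8, 7, 32, 32)        # 3 image + 4 position channels
```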


RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions

arXiv.org Artificial Intelligence

A major challenge in education is providing review schedules that present learned items at appropriate intervals to each student so that memory is retained over time. In recent years, item review has been formulated as a sequential decision-making problem to realize adaptive instruction based on each student's knowledge state. Prior work has reported that reinforcement learning can help realize instructional strategies that maintain a high memory rate in mathematical models of student learning. However, optimization using reinforcement learning requires a large number of interactions and thus cannot be applied directly to actual students. In this study, we propose a framework for optimizing teaching strategies by constructing a virtual model of the student while minimizing interaction with the actual teaching target. In addition, we conducted an experiment simulating actual instruction using the mathematical model and confirmed that our framework's performance is comparable to that of conventional teaching methods. Our framework allows the mathematical models used in experiments to be directly replaced with human students, and our results can serve as a bridge between theoretical instructional optimization and practical applications in e-learning systems.
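
A toy sketch of the virtual-student idea: scheduling is optimized against a cheap stand-in model rather than a real learner. The exponential forgetting model and the greedy review policy below are illustrative substitutes for the paper's student model and reinforcement-learning policy.

```python
# Virtual student + review scheduling, heavily simplified for illustration.
import math

class VirtualStudent:
    """Recall probability decays exponentially; each review resets the clock
    and strengthens the item (a crude stand-in for the paper's model)."""
    def __init__(self, n_items: int, decay: float = 0.1):
        self.decay = decay
        self.last_review = [0.0] * n_items
        self.strength = [1.0] * n_items

    def recall_prob(self, item: int, t: float) -> float:
        elapsed = t - self.last_review[item]
        return math.exp(-self.decay * elapsed / self.strength[item])

    def review(self, item: int, t: float) -> None:
        self.strength[item] += 1.0   # spacing effect, crudely modeled
        self.last_review[item] = t

def greedy_policy(student: VirtualStudent, t: float) -> int:
    # Review the item currently most at risk of being forgotten.
    probs = [student.recall_prob(i, t) for i in range(len(student.strength))]
    return probs.index(min(probs))

student = VirtualStudent(n_items=5)
for step in range(50):                  # interact only with the virtual model
    item = greedy_policy(student, t=float(step))
    student.review(item, t=float(step))
print([round(student.recall_prob(i, 50.0), 2) for i in range(5)])
```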


Automatic Paper Summary Generation from Visual and Textual Information

arXiv.org Artificial Intelligence

Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts. In response, this paper proposes the paper summary generation (PSG) task and a simple but effective method for automatically generating an academic paper summary from raw PDF data. We realize PSG by combining a supervised, vision-based component detector with an unsupervised, language-based important-sentence extractor; the pipeline is applicable to manuscripts in formats the detector has been trained on. We quantitatively evaluate the vision-based component extraction and qualitatively show that our system can extract both visual items and sentences that are helpful for understanding. Summaries produced by our PSG pipeline for the 979 manuscripts accepted at the Conference on Computer Vision and Pattern Recognition (CVPR) 2018 are made available. We believe the proposed method will give researchers a better way to keep up with important academic papers.
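
The abstract does not name the unsupervised sentence extractor, so the sketch below uses a plain word-frequency scorer as a stand-in to illustrate extractive selection; it is not the authors' method.

```python
# Frequency-based extractive sentence selection (illustrative stand-in).
import re
from collections import Counter

def extract_summary(text: str, k: int = 3) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:k]
    return sorted(ranked, key=sentences.index)  # keep original order

# paper_body = text extracted from the PDF body components
# print(extract_summary(paper_body))
```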


Understanding Fake Faces

arXiv.org Artificial Intelligence

Face recognition is one of the most active research topics in computer vision (CV), and deep neural networks (DNNs) are now closing the gap between human-level and machine performance in face verification. However, although the performance gap appears to be narrowing in terms of accuracy, a curious question arises: is the face understanding of AI really close to that of humans? In the present study, in an effort to probe this brain-driven question, we conduct image-based detection, classification, and generation using an in-house fake-face database. This database has two configurations: (i) false-positive face detections produced using both the Viola-Jones (VJ) method and convolutional neural networks (CNNs), and (ii) simulacra that have fundamental characteristics resembling faces but are completely artificial. The results suggest that a gap remains between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we obtained insights that can advance the progress of face-understanding models.
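
Configuration (i) consists of detector false positives. As an illustration, the sketch below harvests Viola-Jones false positives with OpenCV by running the detector over images known to contain no faces; the path and detector parameters are placeholders, not the paper's collection protocol.

```python
# Harvest Viola-Jones false positives from face-free images (illustrative).
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def false_positive_faces(image_path: str):
    img = cv2.imread(image_path)        # assumes the path is readable
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # On an image with no real faces, every detection is a "fake face".
    return [img[y:y + h, x:x + w] for (x, y, w, h) in boxes]

# crops = false_positive_faces("no_faces/landscape_001.jpg")  # placeholder path
```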