AITopics | Kimura, Akisato

BGM2Pose: Active 3D Human Pose Estimation with Non-Stationary Sounds

Shibata, Yuto, Oumi, Yusuke, Irie, Go, Kimura, Akisato, Aoki, Yoshimitsu, Isogawa, Mariko

arXiv.org Artificial Intelligence, Mar-1-2025

We propose BGM2Pose, a non-invasive 3D human pose estimation method using arbitrary music (e.g., background music) as active sensing signals. Unlike existing approaches that significantly limit practicality by employing intrusive chirp signals within the audible range, our method utilizes natural music that causes minimal discomfort to humans. Estimating human poses from standard music presents significant challenges. In contrast to sound sources specifically designed for measurement, regular music varies in both volume and pitch. These dynamic changes in signals caused by music are inevitably mixed with alterations in the sound field resulting from human motion, making it hard to extract reliable cues for pose estimation. To address these challenges, BGM2Pose introduces a Contrastive Pose Extraction Module that employs contrastive learning and hard negative sampling to eliminate musical components from the recorded data, isolating the pose information. Additionally, we propose a Frequency-wise Attention Module that enables the model to focus on subtle acoustic variations attributable to human movement by dynamically computing attention across frequency bands. Experiments suggest that our method outperforms the existing methods, demonstrating substantial potential for real-world applications. Our datasets and code will be made publicly available.
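
The contrastive learning with hard negatives mentioned above can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch and not the authors' Contrastive Pose Extraction Module: the function name `info_nce`, the cosine similarity, and the temperature value are all assumptions made for illustration. The pairing of samples is also assumed here: a positive could be the same pose recorded under different music, and a hard negative a different pose under the same music.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss on cosine similarities (hypothetical helper)."""
    a = normalize(anchor)
    sims = [dot(a, normalize(positive))]
    sims += [dot(a, normalize(n)) for n in negatives]
    logits = [s / temperature for s in sims]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]

anchor = [1.0, 0.0]
positive = [0.9, 0.1]
easy_loss = info_nce(anchor, positive, [[-1.0, 0.0]])
hard_loss = info_nce(anchor, positive, [[0.8, 0.6]])
print(easy_loss, hard_loss)
```

A negative that lies close to the anchor contributes a much larger loss than a distant one, which is why hard negatives give a stronger training signal.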

artificial intelligence, conference, machine learning

2503.00389

Genre: Research Report > Experimental Study (0.48)

Industry:

Information Technology (0.68)
Leisure & Entertainment (0.48)
Media > Music (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.94)

Acoustic-based 3D Human Pose Estimation Robust to Human Position

Oumi, Yusuke, Shibata, Yuto, Irie, Go, Kimura, Akisato, Aoki, Yoshimitsu, Isogawa, Mariko

arXiv.org Artificial Intelligence, Nov-8-2024

This paper explores the problem of 3D human pose estimation from only low-level acoustic signals. The existing active acoustic sensing-based approach for 3D human pose estimation implicitly assumes that the target user is positioned along a line between loudspeakers and a microphone. Because reflection and diffraction of sound by the human body cause only subtle acoustic signal changes compared to sound obstruction, the accuracy of the existing model degrades significantly when subjects deviate from this line, limiting its practicality in real-world scenarios. To overcome this limitation, we propose a novel method composed of a position discriminator and a reverberation-resistant model. The former predicts the standing positions of subjects and applies adversarial learning to extract subject position-invariant features. The latter utilizes acoustic signals before the estimation target time as references to enhance robustness against the variations in sound arrival times due to diffraction and reflection. We construct an acoustic pose estimation dataset that covers diverse human locations and demonstrate through experiments that our proposed method outperforms existing approaches.

artificial intelligence, estimation, machine learning

2411.07165

Country: Asia > Japan (0.14)

Genre: Research Report > Promising Solution (0.48)

Industry: Energy > Oil & Gas > Upstream (0.56)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Selective Scene Text Removal

Mitani, Hayato, Kimura, Akisato, Uchida, Seiichi

arXiv.org Artificial Intelligence, Oct-3-2023

Scene text removal (STR) is an image transformation task that removes text regions from scene images. Conventional STR methods remove all scene text, which means that they cannot select which text to remove. In this paper, we propose a novel task setting named selective scene text removal (SSTR) that removes only target words specified by the user. Although SSTR is a more complex task than STR, the proposed multi-module structure enables efficient training for SSTR. Experimental results show that the proposed method can remove target words as expected.

artificial intelligence, machine learning, natural language

2309.0041

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.14)

Genre: Research Report (0.84)

Industry:

Health & Medicine (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.51)

Toward Defensive Letter Design

Kataoka, Rentaro, Kimura, Akisato, Uchida, Seiichi

arXiv.org Artificial Intelligence, Sep-4-2023

A major approach for defending against adversarial attacks aims only at making image classifiers more resilient and does not consider the visual objects in images, such as pandas and cars. This means that visual objects themselves cannot take any defensive actions, and they remain vulnerable to adversarial attacks. In contrast, letters are artificial symbols, and we can freely control their appearance as long as they remain readable. In other words, we can make letters more defensive against the attacks. This paper poses three research questions related to the adversarial vulnerability of letter images: (1) How defensive are the letters against adversarial attacks? (2) Can we estimate how defensive a given letter image is before attacks? (3) Can we control the letter images to be more defensive against adversarial attacks? To answer the first and second questions, we measure the defensibility of letters by employing the Iterative Fast Gradient Sign Method (I-FGSM) and then build a deep regression model for estimating the defensibility of each letter image. We also propose a two-step method based on a generative adversarial network (GAN) for generating character images with higher defensibility, which addresses the third research question.
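
To make the I-FGSM step concrete, here is a minimal sketch on a toy logistic classifier. The classifier, the step size `alpha`, the budget `eps`, and the use of "number of steps until the prediction flips" as a stand-in for defensibility are all assumptions made for illustration; the abstract does not state how defensibility is scored, and the paper works with image classifiers rather than this two-feature model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic classifier."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def ifgsm_steps_to_flip(w, b, x, y, alpha=0.01, eps=0.3, max_iter=100):
    """Run I-FGSM; return the step at which the prediction flips, else None."""
    x_adv = list(x)
    for step in range(1, max_iter + 1):
        p = predict(w, b, x_adv)
        # Gradient of the cross-entropy loss with respect to the input.
        grad = [(p - y) * wi for wi in w]
        for i, g in enumerate(grad):
            moved = x_adv[i] + alpha * sign(g)
            # Stay within eps of the original value and inside [0, 1].
            moved = max(x[i] - eps, min(x[i] + eps, moved))
            x_adv[i] = max(0.0, min(1.0, moved))
        if (predict(w, b, x_adv) >= 0.5) != (y == 1):
            return step
    return None

w, b = [2.0, -1.0], 0.0
x, y = [0.6, 0.2], 1
steps_large_budget = ifgsm_steps_to_flip(w, b, x, y, alpha=0.01, eps=0.5)
steps_small_budget = ifgsm_steps_to_flip(w, b, x, y, alpha=0.01, eps=0.3)
print(steps_large_budget, steps_small_budget)
```

Under this toy measure, an input that needs more steps (or survives the whole budget) would count as more defensive.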

artificial intelligence, defensibility, machine learning

2309.01452

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Imitation networks: Few-shot learning of neural networks from scratch

Kimura, Akisato, Ghahramani, Zoubin, Takeuchi, Koh, Iwata, Tomoharu, Ueda, Naonori

arXiv.org Machine Learning, Feb-12-2018

In this paper, we propose imitation networks, a simple but effective method for training neural networks with a limited amount of training data. Our approach inherits the idea of knowledge distillation, which transfers knowledge from a deep or wide reference model to a shallow or narrow target model. The proposed method employs this idea to mimic the predictions of reference estimators that are much more robust against overfitting than the network we want to train. Unlike almost all previous work on knowledge distillation, which requires a large amount of labeled training data, the proposed method requires only a small amount of training data. Instead, we introduce pseudo training examples that are optimized as a part of the model parameters. Experimental results for several benchmark datasets demonstrate that the proposed method outperformed all the other baselines, such as naive training of the target model and standard knowledge distillation.
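
The knowledge distillation idea this abstract builds on can be sketched as a soft-target loss. This shows only standard distillation with a temperature, not the pseudo-example optimization that imitation networks add; the function names and the temperature value are assumptions for illustration, and the reference estimators in the paper need not produce logits at all.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(reference_logits, target_logits, temperature=2.0):
    """Cross-entropy between softened reference and target predictions."""
    p_ref = softmax(reference_logits, temperature)
    p_tgt = softmax(target_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(p_ref, p_tgt))

reference = [2.0, 1.0, 0.0]
matching = distillation_loss(reference, [2.0, 1.0, 0.0])
mismatched = distillation_loss(reference, [0.0, 1.0, 2.0])
print(matching, mismatched)
```

The loss is smallest when the target model reproduces the reference distribution, so minimizing it pulls the target's predictions toward the reference's.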

inductive learning, neural network, pseudo example

1802.03039

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Weakly Supervised Collective Feature Learning From Curated Media

Mukuta, Yusuke (The University of Tokyo) | Kimura, Akisato (NTT Communication Science Laboratories) | Adrian, David B. (Technical University of Munich) | Ghahramani, Zoubin (University of Cambridge)

AAAI Conferences, Feb-8-2018

The current state of the art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community-contributed text tags as weak labels; however, the concepts underlying a single text tag strongly depend on the users. We instead present a new paradigm for learning discriminative features by making full use of the human curation process on social networking services (SNSs). During the process of content curation, SNS users collect content items manually from various sources and group them by context, all for their own benefit. Due to the nature of this process, we can assume that (1) content items in the same group share the same semantic concept and (2) groups sharing the same images might have related semantic concepts. Through these insights, we can define human-curated groups as weak labels from which our proposed framework can learn discriminative features as a representation in the space of semantic concepts the users intended when creating the groups. We show that this feature learning can be formulated as a problem of link prediction for a bipartite graph whose nodes correspond to content items and human-curated groups, and propose a novel method for feature learning based on sparse coding or network fine-tuning.

curated group, deep learning, neural network

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.48)

Industry: Information Technology > Services (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Denoising random forests

Hibino, Masaya, Kimura, Akisato, Yamashita, Takayoshi, Yamauchi, Yuji, Fujiyoshi, Hironobu

arXiv.org Machine Learning, Oct-30-2017

This paper proposes a novel type of random forest, called denoising random forests, that is robust against noise contained in test samples. Such noise-corrupted samples cause serious damage to the estimation performance of random forests, since unexpected child nodes are often selected and the leaf nodes that the input sample reaches are sometimes far from those for a clean sample. Our main idea for tackling this problem originates from a binary indicator vector that encodes a traversal path of a sample in the forest. Our proposed method effectively employs this vector by introducing denoising autoencoders into random forests. A denoising autoencoder can be trained with indicator vectors produced from clean and noisy input samples, and non-leaf nodes where incorrect decisions are made can be identified by comparing the input and output of the trained denoising autoencoder. Multiple traversal paths with respect to the nodes with incorrect decisions caused by the noise can then be considered for the estimation.
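
The binary indicator vector for a traversal path can be illustrated on a single toy tree. This is a sketch under assumptions: the tree layout, the node numbering, and the "one bit per visited node" encoding are chosen here for illustration, and the paper's exact encoding may differ. For a forest, one would presumably concatenate the per-tree vectors.

```python
# Toy binary tree: node id -> (feature index, threshold, left child, right child).
# Node ids that do not appear as keys are leaves.
TREE = {
    0: (0, 0.5, 1, 2),
    1: (1, 0.3, 3, 4),
    2: (1, 0.7, 5, 6),
}
NUM_NODES = 7

def traversal_indicator(tree, num_nodes, x):
    """Return a 0/1 vector with a 1 for every node the sample visits."""
    indicator = [0] * num_nodes
    node = 0
    while True:
        indicator[node] = 1
        if node not in tree:  # reached a leaf
            return indicator
        feature, threshold, left, right = tree[node]
        node = left if x[feature] <= threshold else right

clean = traversal_indicator(TREE, NUM_NODES, [0.2, 0.9])
noisy = traversal_indicator(TREE, NUM_NODES, [0.6, 0.9])  # first feature perturbed
print(clean)
print(noisy)
```

A small perturbation of the first feature sends the sample down the other branch at the root, so the two vectors disagree from that node onward; this is the kind of discrepancy a denoising autoencoder trained on such vectors could be asked to repair.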

decision tree learning, neural network, traversal path

1710.11004

Country: Asia > Japan > Honshū (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Infinite Plaid Models for Infinite Bi-Clustering

Ishiguro, Katsuhiko (NTT Corporation) | Sato, Issei (The University of Tokyo) | Nakano, Masahiro (NTT Corporation) | Kimura, Akisato (NTT Corporation) | Ueda, Naonori (NTT Corporation)

AAAI Conferences, Apr-19-2016

We propose a probabilistic model for non-exhaustive and overlapping (NEO) bi-clustering. Our goal is to extract a few sub-matrices from the given data matrix, where entries of a sub-matrix are characterized by a specific distribution or parameters. Existing NEO bi-clustering methods typically require the number of sub-matrices to be extracted, which is essentially difficult to fix a priori. In this paper, we extend the plaid model, known as one of the best NEO bi-clustering algorithms, to allow infinite bi-clustering, that is, NEO bi-clustering without specifying the number of sub-matrices. Our model can represent infinite sub-matrices formally. We develop an MCMC inference without the finite truncation, which potentially addresses all possible numbers of sub-matrices. Experiments quantitatively and qualitatively verify the usefulness of the proposed model. The results reveal that our model can offer more precise and in-depth analysis of sub-matrices.

artificial intelligence, health & medicine, plaid model

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Honshū (0.14)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

Non-Negative Multiple Matrix Factorization

Takeuchi, Koh (Researcher, NTT Communication Science Labs) | Ishiguro, Katsuhiko (NTT Corporation) | Kimura, Akisato (Researcher, NTT Communication Science Labs) | Sawada, Hiroshi (Researcher, NTT Communication Science Labs)

AAAI Conferences, Aug-3-2013

Non-negative Matrix Factorization (NMF) is a traditional unsupervised machine learning technique for decomposing a matrix into a set of bases and coefficients under the non-negative constraint. NMF with sparse constraints is also known for extracting reasonable components from noisy data. However, NMF tends to give undesired results in the case of highly sparse data, because the information included in the data is insufficient to decompose. Our key idea is that we can ease this problem if complementary data are available that we could integrate into the estimation of the bases and coefficients. In this paper, we propose a novel matrix factorization method called Non-negative Multiple Matrix Factorization (NMMF), which utilizes complementary data as auxiliary matrices that share the row or column indices of the target matrix. The data sparseness is improved by decomposing the target and auxiliary matrices simultaneously, since auxiliary matrices provide information about the bases and coefficients. We formulate NMMF as a generalization of NMF, and then present a parameter estimation procedure derived from the multiplicative update rule. We examined NMMF in both synthetic and real data experiments. The effect of the auxiliary matrices appeared in the improved NMMF performance. We also confirmed that the bases that NMMF obtained from the real data were intuitive and reasonable thanks to the non-negative constraint.
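
As an illustration of how an auxiliary matrix can share a factor with the target, the following sketch writes out multiplicative updates for one simplified case. It assumes a squared Frobenius loss and a single auxiliary matrix $Y$ that shares row indices with the target $X$; the symbols $W$, $H$, $A$, and the weight $\alpha$ are chosen here for illustration, and the abstract does not state which divergence the paper uses. Here $\odot$ and the fractions denote element-wise product and division.

```latex
\begin{aligned}
\min_{W,\,H,\,A \,\ge\, 0}\quad & \|X - WH\|_F^2 + \alpha\,\|Y - WA\|_F^2 \\[4pt]
H &\leftarrow H \odot \frac{W^\top X}{W^\top W H}, \\[4pt]
A &\leftarrow A \odot \frac{W^\top Y}{W^\top W A}, \\[4pt]
W &\leftarrow W \odot \frac{X H^\top + \alpha\, Y A^\top}{W H H^\top + \alpha\, W A A^\top}.
\end{aligned}
```

Setting $\alpha = 0$ recovers the usual NMF updates for $W$ and $H$, which is one way to see the joint factorization as a generalization of NMF: the shared bases $W$ receive evidence from both $X$ and $Y$, which helps when $X$ alone is too sparse.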

non-negative multiple matrix factorization

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Designing various component analysis at will

Kimura, Akisato, Sugiyama, Masashi, Sakano, Hitoshi, Kameoka, Hirokazu

arXiv.org Machine Learning, Oct-5-2012

This paper provides a generic framework of component analysis (CA) methods introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful: The framework includes not only (1) the standard CA methods but also (2) several regularization techniques, (3) weighted extensions, (4) some clustering methods, and (5) their semi-supervised extensions. This paper also presents quite a simple methodology for designing a desired CA method from the proposed framework: Adopting the known GPEs as templates, and generating a new method by combining these templates appropriately.
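
One way to see why a pairwise form for scatter matrices is natural is the classical identity that the total scatter about the mean equals a sum over all sample pairs scaled by 1/(2n). The sketch below checks that identity numerically; it is not the paper's definition of GPE, and the function names are assumptions made for illustration.

```python
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def add_scaled(acc, m, scale):
    """Add scale * m to acc in place."""
    for r in range(len(acc)):
        for c in range(len(acc[r])):
            acc[r][c] += scale * m[r][c]

def scatter_centered(xs):
    """Total scatter: sum of (x - mean)(x - mean)^T over all samples."""
    n, d = len(xs), len(xs[0])
    mean = [sum(x[k] for x in xs) / n for k in range(d)]
    s = [[0.0] * d for _ in range(d)]
    for x in xs:
        diff = [x[k] - mean[k] for k in range(d)]
        add_scaled(s, outer(diff, diff), 1.0)
    return s

def scatter_pairwise(xs):
    """Same matrix written as a sum over ordered sample pairs."""
    n, d = len(xs), len(xs[0])
    s = [[0.0] * d for _ in range(d)]
    for xi in xs:
        for xj in xs:
            diff = [xi[k] - xj[k] for k in range(d)]
            add_scaled(s, outer(diff, diff), 1.0 / (2 * n))
    return s

samples = [[1.0, 2.0], [3.0, 0.0], [2.0, 5.0]]
centered = scatter_centered(samples)
pairwise = scatter_pairwise(samples)
print(centered)
print(pairwise)
```

Replacing the uniform pair weight with other weights is the usual route from this identity to weighted or locality-aware variants, which is consistent with the abstract's mention of weighted extensions.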

artificial intelligence, health & medicine, matrix

1207.3554

Country:

Asia > Japan > Honshū (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.95)
