AITopics | Wang, Ruili

Collaborating Authors

Wang, Ruili

KUNPENG: An Embodied Large Model for Intelligent Maritime

Wang, Naiyao, Jiang, Tongbang, Wang, Ye, Qiu, Shaoyang, Zhang, Bo, Xie, Xinqiang, Li, Munan, Wang, Chunliu, Wang, Yiyang, Ren, Hongxiang, Wang, Ruili, Shan, Hongjun, Liu, Hongbo

arXiv.org Artificial Intelligence, Jul-12-2024

Intelligent maritime, an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods. It covers multiple aspects such as smart vessels, route optimization, and safe navigation, aiming to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic maritime environment, along with diverse and heterogeneous large-scale data sources, presents challenges for real-time decision-making in intelligent maritime. In this paper, we propose KUNPENG, the first-ever embodied large model for intelligent maritime in smart ocean construction, which consists of six systems. The model perceives multi-source heterogeneous data for the cognition of environmental interaction and makes autonomous decision strategies, which intelligent vessels use to perform navigation behaviors under safety and emergency guarantees and to continuously optimize power, achieving embodied intelligence in maritime. In comprehensive maritime task evaluations, KUNPENG has demonstrated excellent performance.

large language model, machine learning, real time system, (18 more...)

2407.09048

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Industry:

Energy (1.00)
Information Technology > Security & Privacy (0.46)
Transportation > Infrastructure & Services (0.35)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(4 more...)

PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition

Lei, Chengxi, Singh, Satwinder, Hou, Feng, Jia, Xiaoyun, Wang, Ruili

arXiv.org Artificial Intelligence, Dec-13-2023

Most current speech data augmentation methods operate on either the raw waveform or the amplitude spectrum of speech. In this paper, we propose a novel speech data augmentation method called PhasePerturbation that operates dynamically on the phase spectrum of speech. Instead of statically rotating a phase by a constant degree, PhasePerturbation utilizes three dynamic phase spectrum operations, i.e., a randomization operation, a frequency masking operation, and a temporal masking operation, to enhance the diversity of speech data. We conduct experiments on wav2vec2.0 pre-trained ASR models by fine-tuning them with the PhasePerturbation-augmented TIMIT corpus. The experimental results demonstrate a 10.9% relative reduction in the word error rate (WER) compared with the baseline model fine-tuned without any augmentation operation. Furthermore, the proposed method achieves additional improvements (12.9% and 15.9%) in WER when combined with Vocal Tract Length Perturbation (VTLP) and SpecAug, which are both amplitude spectrum-based augmentation methods. The results highlight the capability of PhasePerturbation to improve current amplitude spectrum-based augmentation methods.
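
The three phase operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perturb_phase`, the uniform noise range `max_shift`, the mask widths, and the choice to set masked phases to zero are all assumptions made for illustration. The input is assumed to be a complex spectrogram stored as a list of rows (frequency bins), each a list of complex values (frames).

```python
import cmath
import math
import random


def perturb_phase(spec, rng, max_shift=math.pi / 4, freq_width=2, time_width=2):
    """Perturb only the phase of a complex spectrogram (hypothetical sketch).

    spec: list of rows (frequency bins), each a list of complex frames.
    All parameter names and defaults are illustrative assumptions.
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Pick the start of one frequency band and one run of frames to mask.
    f0 = rng.randrange(n_freq - freq_width + 1)
    t0 = rng.randrange(n_time - time_width + 1)

    out = []
    for f, row in enumerate(spec):
        new_row = []
        for t, value in enumerate(row):
            magnitude, phase = cmath.polar(value)
            # Randomization: add an independent random offset to the phase.
            phase += rng.uniform(-max_shift, max_shift)
            # Frequency and temporal masking: zero the phase inside the masks.
            if f0 <= f < f0 + freq_width or t0 <= t < t0 + time_width:
                phase = 0.0
            # Recombine; the magnitude spectrum is left unchanged.
            new_row.append(cmath.rect(magnitude, phase))
        out.append(new_row)
    return out
```

Because only the phase is touched, a sketch like this could in principle be combined with amplitude-based augmentations, which is the complementarity the abstract reports.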

artificial intelligence, deep learning, machine learning, (15 more...)

2312.08571

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

A Novel Self-training Approach for Low-resource Speech Recognition

Singh, Satwinder, Hou, Feng, Wang, Ruili

arXiv.org Artificial Intelligence, Aug-9-2023

In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.
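
As a rough illustration of two ideas in this abstract, the sketch below shows a generic confidence filter for pseudo-labels and the usual formula for relative WER improvement. It is a textbook-style self-training step, not the procedure proposed in the paper; the function names, the `(utterance_id, transcript, confidence)` tuple layout, and the 0.9 threshold are assumptions.

```python
def select_pseudo_labels(predictions, threshold=0.9):
    """Keep pseudo-labels whose confidence meets the threshold.

    predictions: iterable of (utterance_id, transcript, confidence) tuples
    produced by a teacher model on unlabeled speech (hypothetical layout).
    """
    return [
        (utt_id, transcript)
        for utt_id, transcript, confidence in predictions
        if confidence >= threshold
    ]


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER improvement in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

For example, with made-up numbers, a baseline WER of 20.0 and a new WER of 17.0 give a relative reduction of 15%; the retained pseudo-labels would be added to the labeled set before the model is trained again.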

artificial intelligence, deep learning, machine learning, (17 more...)

2308.05269

Country:

Asia (0.28)
Oceania > New Zealand (0.14)

Genre: Research Report (0.64)

Industry: Media (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

How to Design Translation Prompts for ChatGPT: An Empirical Study

Gao, Yuan, Wang, Ruili, Hou, Feng

arXiv.org Artificial Intelligence, Apr-21-2023

The recently released ChatGPT has demonstrated surprising abilities in natural language understanding and natural language generation. Machine translation relies heavily on the abilities of language understanding and generation. Thus, in this paper, we explore how to assist machine translation with ChatGPT. We adopt several translation prompts on a wide range of translations. Our experimental results show that ChatGPT with designed translation prompts can achieve performance comparable to or better than commercial translation systems for high-resource language translations. We further evaluate the translation quality using multiple references, and ChatGPT achieves superior performance compared to commercial systems. We also conduct experiments on domain-specific translations; the final results show that ChatGPT is able to comprehend the provided domain keyword and adjust accordingly to output proper translations. Finally, we perform few-shot prompts that show consistent improvement across different base prompts. Our work provides empirical evidence that ChatGPT still has great potential in translations.
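
A base prompt, a domain keyword, and few-shot examples, as described above, might be assembled along the following lines. The wording of the template and the function `build_translation_prompt` are hypothetical; the prompts actually evaluated in the paper are not reproduced on this page.

```python
def build_translation_prompt(text, src, tgt, domain=None, examples=()):
    """Assemble a translation prompt (illustrative template, not the paper's).

    domain: optional domain keyword appended to the instruction.
    examples: optional (source, target) pairs used as few-shot demonstrations.
    """
    instruction = f"Translate the following {src} text into {tgt}."
    if domain:
        instruction += f" The text is from the {domain} domain."

    lines = [instruction]
    # Few-shot demonstrations come before the sentence to translate.
    for source_example, target_example in examples:
        lines.append(f"{src}: {source_example}")
        lines.append(f"{tgt}: {target_example}")

    # The final pair is left open for the model to complete.
    lines.append(f"{src}: {text}")
    lines.append(f"{tgt}:")
    return "\n".join(lines)
```

Calling it with `domain=None` and no examples yields the plain base prompt, so the three prompt variants differ only in the optional arguments.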

machine learning, natural language, translation, (19 more...)

2304.02182

Country: Asia (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DEEPF0: End-To-End Fundamental Frequency Estimation for Music and Speech Signals

Singh, Satwinder, Wang, Ruili, Qiu, Yuanhang

arXiv.org Artificial Intelligence, Feb-11-2021

We propose a novel pitch estimation technique called DeepF0, which leverages the available annotated data to learn directly from the raw audio in a data-driven manner. F0 estimation is important in various speech processing and music information retrieval applications. Existing deep learning models for pitch estimation have relatively limited learning capabilities due to their shallow receptive field. The proposed model addresses this issue by extending the receptive field of the network through dilated convolutional blocks. The dilation factor increases the network's receptive field exponentially without increasing the parameters of the model exponentially. To make the training process faster and more efficient, DeepF0 is augmented with residual blocks with residual connections. Our empirical evaluation demonstrates that the proposed model outperforms the baselines in terms of raw pitch accuracy and raw chroma accuracy even while using 77.4% fewer network parameters. We also show that our model estimates pitch reasonably well even under various levels of accompaniment noise.
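
The receptive-field claim can be checked with a short calculation. For a stack of stride-1 one-dimensional convolutions with kernel size k and dilations d_i, the receptive field is 1 + sum((k - 1) * d_i). The kernel size and dilation schedule below are assumptions chosen for illustration, not DeepF0's actual configuration.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Four layers, kernel size 3 (illustrative values only).
plain = receptive_field(3, [1, 1, 1, 1])    # no dilation
dilated = receptive_field(3, [1, 2, 4, 8])  # dilation doubles per layer
```

Here `plain` is 9 and `dilated` is 31. Both stacks have the same number of weights per layer, so doubling the dilation widens the receptive field exponentially with depth while the parameter count grows only linearly.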

dataset, deep learning, neural network, (19 more...)

2102.06306

Country: Oceania > New Zealand (0.14)

Genre: Research Report (0.50)

Industry:

Media > Music (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-scale Hierarchical Residual Network for Dense Captioning

Tian, Yan, Wang, Xun, Wu, Jiachen, Wang, Ruili, Yang, Bailin

Journal of Artificial Intelligence Research, Jan-30-2019

Recent research on dense captioning based on the recurrent neural network and the convolutional neural network has made great progress. However, mapping from an image feature space to a description space is a nonlinear and multimodal task, which makes it difficult for current methods to get accurate results. In this paper, we put forward a novel approach for dense captioning based on hourglass-structured residual learning. Discriminant feature maps are obtained by incorporating densely connected networks and residual learning in our model. Finally, the performance of the approach is demonstrated on the Visual Genome V1.0 dataset and the region-labelled MS-COCO (Microsoft Common Objects in Context) dataset. The experimental results show that our approach outperforms most current methods.

artificial intelligence, machine learning, neural network, (17 more...)

doi: 10.1613/jair.1.11338

AI Access Foundation

11338

Country: Asia > China > Zhejiang Province (0.14)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation

Yin, Shi, Zhou, Yi, Li, Chenguang, Wang, Shangfei, Ji, Jianmin, Chen, Xiaoping, Wang, Ruili

arXiv.org Artificial Intelligence, Sep-4-2018

We propose KDSL, a new word sense disambiguation (WSD) framework that utilizes knowledge to automatically generate sense-labeled data for supervised learning. First, from WordNet, we automatically construct a semantic knowledge base called DisDict, which provides refined feature words that highlight the differences among word senses, i.e., synsets. Second, we automatically generate new sense-labeled data from unlabeled corpora using DisDict. Third, these generated data, together with manually labeled data and unlabeled data, are fed to a neural framework that conducts supervised and unsupervised learning jointly to model the semantic relations among synsets, feature words and their contexts. The experimental results show that KDSL outperforms several representative state-of-the-art methods on various major benchmarks. Interestingly, it performs relatively well even when manually labeled data is unavailable, thus providing a potential solution for similar tasks that lack manual annotations.

deep learning, neural network, synset, (22 more...)

1808.09888

Country: Oceania > Australia (0.14)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)