AITopics | Calgary

Collaborating Authors

Calgary

3D-SeqMOS: A Novel Sequential 3D Moving Object Segmentation in Autonomous Driving

Li, Qipeng, Zhuang, Yuan, Chen, Yiwen, Huai, Jianzhu, Li, Miao, Ma, Tianbing, Tang, Yufei, Liang, Xinlian

arXiv.org Artificial IntelligenceJul-18-2023

For the SLAM system in robotics and autonomous driving, the accuracy of front-end odometry and back-end loop-closure detection determine the whole intelligent system performance. But the LiDAR-SLAM could be disturbed by current scene moving objects, resulting in drift errors and even loop-closure failure. Thus, the ability to detect and segment moving objects is essential for high-precision positioning and building a consistent map. In this paper, we address the problem of moving object segmentation from 3D LiDAR scans to improve the odometry and loop-closure accuracy of SLAM. We propose a novel 3D Sequential Moving-Object-Segmentation (3D-SeqMOS) method that can accurately segment the scene into moving and static objects, such as moving and static cars. Different from the existing projected-image method, we process the raw 3D point cloud and build a 3D convolution neural network for MOS task. In addition, to make full use of the spatio-temporal information of point cloud, we propose a point cloud residual mechanism using the spatial features of current scan and the temporal features of previous residual scans. Besides, we build a complete SLAM framework to verify the effectiveness and accuracy of 3D-SeqMOS. Experiments on SemanticKITTI dataset show that our proposed 3D-SeqMOS method can effectively detect moving objects and improve the accuracy of LiDAR odometry and loop-closure detection. The test results show our 3D-SeqMOS outperforms the state-of-the-art method by 12.4%. We extend the proposed method to the SemanticKITTI: Moving Object Segmentation competition and achieve the 2nd in the leaderboard, showing its effectiveness.

artificial intelligence, machine learning, point cloud, (19 more...)

arXiv.org Artificial Intelligence

2307.09044

Country:

Asia > China > Hubei Province > Wuhan (0.05)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(9 more...)

Genre: Research Report > New Finding (0.34)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts

Hubert, Rebekka, Sokolov, Artem, Riezler, Stefan

arXiv.org Artificial IntelligenceJul-17-2023

End-to-end automatic speech translation (AST) relies on data that combines audio inputs with text translation outputs. Previous work used existing large parallel corpora of transcriptions and translations in a knowledge distillation (KD) setup to distill a neural machine translation (NMT) into an AST student model. While KD allows using larger pretrained models, the reliance of previous KD approaches on manual audio transcripts in the data pipeline restricts the applicability of this framework to AST. We present an imitation learning approach where a teacher NMT system corrects the errors of an AST student without relying on manual transcripts. We show that the NMT teacher can recover from errors in automatic transcriptions and is able to correct erroneous translations of the AST student, leading to improvements of about 4 BLEU points over the standard AST end-to-end baseline on the English-German CoVoST-2 and MuST-C datasets, respectively. Code and data are publicly available.\footnote{\url{https://github.com/HubReb/imitkd_ast/releases/tag/v1.1}}

machine learning, natural language, transcript, (18 more...)

arXiv.org Artificial Intelligence

2307.08426

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Berlin (0.04)
North America > United States > Texas (0.04)
(13 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent

Dalleiger, Sebastian, Vreeken, Jilles

arXiv.org Artificial IntelligenceJul-14-2023

Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Artificial Intelligence

2307.07615

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(19 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Add feedback

Speech Diarization and ASR with GMM

Sharma, Aayush Kumar, Bhavikatti, Vineet, Nidawani, Amogh, Siddappaji, Dr., P, Sanath, Mishra, Dr Geetishree

arXiv.org Artificial IntelligenceJul-11-2023

In this research paper, we delve into the topics of Speech Diarization and Automatic Speech Recognition (ASR). Speech diarization involves the separation of individual speakers within an audio stream. By employing the ASR transcript, the diarization process aims to segregate each speaker's utterances, grouping them based on their unique audio characteristics. On the other hand, Automatic Speech Recognition refers to the capability of a machine or program to identify and convert spoken words and phrases into a machine-readable format. In our speech diarization approach, we utilize the Gaussian Mixer Model (GMM) to represent speech segments. The inter-cluster distance is computed based on the GMM parameters, and the distance threshold serves as the stopping criterion. ASR entails the conversion of an unknown speech waveform into a corresponding written transcription. The speech signal is analyzed using synchronized algorithms, taking into account the pitch frequency. Our primary objective typically revolves around developing a model that minimizes the Word Error Rate (WER) metric during speech transcription.

artificial intelligence, machine learning, speech diarization, (14 more...)

arXiv.org Artificial Intelligence

2307.05637

Country:

Asia > India > Karnataka > Bengaluru (0.06)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)

Genre: Research Report (0.40)

Industry:

Media (0.35)
Leisure & Entertainment (0.35)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)

Add feedback

Designing Mixed-Initiative Video Games

Yang, Daijin

arXiv.org Artificial IntelligenceJul-7-2023

The development of Artificial Intelligence (AI) enables humans to co-create content with machines. The unexpectedness of AI-generated content can bring inspiration and entertainment to users. However, the co-creation interactions are always designed for content creators and have poor accessibility. To explore gamification of mixed-initiative co-creation and make human-AI interactions accessible and fun for players, I prototyped Snake Story, a mixed-initiative game where players can select AI-generated texts to write a story of a snake by playing a "Snake" like game. A controlled experiment was conducted to investigate the dynamics of player-AI interactions with and without the game component in the designed interface. As a result of a study with 11 players (n=11), I found that players utilized different strategies when playing with the two versions, game mechanics significantly affected the output stories, players' creative process, as well as role perceptions, and players with different backgrounds showed different preferences for the two versions. Based on these results, I further discussed considerations for mixed-initiative game design. This work aims to inspire the design of engaging co-creation experiences.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2307.03877

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(3 more...)

Add feedback

WACO: Word-Aligned Contrastive Learning for Speech Translation

Ouyang, Siqi, Ye, Rong, Li, Lei

arXiv.org Artificial IntelligenceJul-7-2023

End-to-end Speech Translation (E2E ST) aims to directly translate source speech into target text. Existing ST methods perform poorly when only extremely small speech-text data are available for training. We observe that an ST model's performance closely correlates with its embedding similarity between speech and source transcript. In this paper, we propose Word-Aligned COntrastive learning (WACO), a simple and effective method for extremely low-resource speech-to-text translation. Our key idea is bridging word-level representations for both speech and text modalities via contrastive learning. We evaluate WACO and other methods on the MuST-C dataset, a widely used ST benchmark, and on a low-resource direction Maltese-English from IWSLT 2023. Our experiments demonstrate that WACO outperforms the best baseline by 9+ BLEU points with only 1-hour parallel ST data. Code is available at https://github.com/owaski/WACO.

machine learning, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2212.09359

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > Canada > Quebec > Montreal (0.04)
(9 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

An AI tool for automated analysis of large-scale unstructured clinical cine CMR databases

Mariscal-Harana, Jorge, Asher, Clint, Vergani, Vittoria, Rizvi, Maleeha, Keehn, Louise, Kim, Raymond J., Judd, Robert M., Petersen, Steffen E., Razavi, Reza, King, Andrew, Ruijsink, Bram, Puyol-Antón, Esther

arXiv.org Artificial IntelligenceJul-5-2023

Artificial intelligence (AI) techniques have been proposed for automating analysis of short axis (SAX) cine cardiac magnetic resonance (CMR), but no CMR analysis tool exists to automatically analyse large (unstructured) clinical CMR datasets. We develop and validate a robust AI tool for start-to-end automatic quantification of cardiac function from SAX cine CMR in large clinical databases. Our pipeline for processing and analysing CMR databases includes automated steps to identify the correct data, robust image pre-processing, an AI algorithm for biventricular segmentation of SAX CMR and estimation of functional biomarkers, and automated post-analysis quality control to detect and correct errors. The segmentation algorithm was trained on 2793 CMR scans from two NHS hospitals and validated on additional cases from this dataset (n=414) and five external datasets (n=6888), including scans of patients with a range of diseases acquired at 12 different centres using CMR scanners from all major vendors. Median absolute errors in cardiac biomarkers were within the range of inter-observer variability: <8.4mL (left ventricle volume), <9.2mL (right ventricle volume), <13.3g (left ventricular mass), and <5.9% (ejection fraction) across all datasets. Stratification of cases according to phenotypes of cardiac disease and scanner vendors showed good performance across all groups. We show that our proposed tool, which combines image pre-processing steps, a domain-generalisable AI algorithm trained on a large-scale multi-domain CMR dataset and quality control steps, allows robust analysis of (clinical or research) databases from multiple centres, vendors, and cardiac diseases. This enables translation of our tool for use in fully-automated processing of large multi-centre databases.

artificial intelligence, machine learning, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2206.08137

Country:

Europe > United Kingdom > England > Greater London > London (0.28)
North America > United States > North Carolina > Durham County > Durham (0.14)
North America > Canada > Quebec > Montreal (0.14)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Using BOLD-fMRI to Compute the Respiration Volume per Time (RTV) and Respiration Variation (RV) with Convolutional Neural Networks (CNN) in the Human Connectome Development Cohort

Addeh, Abdoljalil, Vega, Fernando, Williams, Rebecca J, Golestani, Ali, Pike, G. Bruce, MacDonald, M. Ethan

arXiv.org Artificial IntelligenceJul-3-2023

In many fMRI studies, respiratory signals are unavailable or do not have acceptable quality. Consequently, the direct removal of low-frequency respiratory variations from BOLD signals is not possible. This study proposes a one-dimensional CNN model for reconstruction of two respiratory measures, RV and RVT. Results show that a CNN can capture informative features from resting BOLD signals and reconstruct realistic RV and RVT timeseries. It is expected that application of the proposed method will lower the cost of fMRI studies, reduce complexity, and decrease the burden on participants as they will not be required to wear a respiratory bellows.

artificial intelligence, calgary, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.05426

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.09)
Oceania > Australia (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.49)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.71)
Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation

Cui, Yuhao, Wang, Xiongwei, Zhao, Zhongzhou, Zhou, Wei, Chen, Haiqing

arXiv.org Artificial IntelligenceJun-27-2023

Existing fine-grained intensity regulation methods rely on explicit control through predicted emotion probabilities. However, these high-level semantic probabilities are often inaccurate and unsmooth at the phoneme level, leading to bias in learning. Especially when we attempt to mix multiple emotion intensities for specific phonemes, resulting in markedly reduced controllability and naturalness of the synthesis. To address this issue, we propose the CAScaded Explicit and Implicit coNtrol framework (CASEIN), which leverages accurate disentanglement of emotion manifolds from the reference speech to learn the implicit representation at a lower semantic level. This representation bridges the semantical gap between explicit probabilities and the synthesis model, reducing bias in learning. In experiments, our CASEIN surpasses existing methods in both controllability and naturalness. Notably, we are the first to achieve fine-grained control over the mixed intensity of multiple emotions.

artificial intelligence, emotion, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.0002

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multisensor Multiobject Tracking With High-Dimensional Object States

Zhang, Wenyu, Meyer, Florian

arXiv.org Artificial IntelligenceJun-27-2023

Passive monitoring of acoustic or radio sources has important applications in modern convenience, public safety, and surveillance. A key task in passive monitoring is multiobject tracking (MOT). This paper presents a Bayesian method for multisensor MOT for challenging tracking problems where the object states are high-dimensional, and the measurements follow a nonlinear model. Our method is developed in the framework of factor graphs and the sum-product algorithm (SPA) and implemented using random samples or "particles". The multimodal probability density functions (pdfs) provided by the SPA are effectively represented by a Gaussian mixture model (GMM). To perform the operations of the SPA in high-dimensional spaces, we make use of Particle flow (PFL). Here, particles are migrated towards regions of high likelihood based on the solution of a partial differential equation. This makes it possible to obtain good object detection and tracking performance even in challenging multisensor MOT scenarios with single sensor measurements that have a lower dimension than the object positions. We perform a numerical evaluation in a passive acoustic monitoring scenario where multiple sources are tracked in 3-D from 1-D time-difference-of-arrival (TDOA) measurements provided by pairs of hydrophones. Our numerical results demonstrate favorable detection and estimation accuracy compared to state-of-the-art reference techniques.

artificial intelligence, machine learning, particle, (19 more...)

arXiv.org Artificial Intelligence

2212.14556

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback