
Collaborating Authors

 Yang, Yuting


BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR

arXiv.org Artificial Intelligence

Recently, the Mixture of Experts (MoE) architecture, such as LR-MoE, has often been used to alleviate the impact of language confusion on the multilingual ASR (MASR) task. However, it still faces language confusion issues, especially in mismatched-domain scenarios. In this paper, we decouple the language confusion in LR-MoE into confusion in self-attention and confusion in the router. To alleviate the language confusion in self-attention, we propose, building on LR-MoE, to apply an attention-MoE architecture for MASR. In our new architecture, MoE is utilized not only in the feed-forward network (FFN) but also in self-attention. In addition, to improve the robustness of the LID-based router against language confusion, we propose expert pruning and router augmentation methods. Combining the above, we obtain the boosted language-routing MoE (BLR-MoE) architecture. We verify the effectiveness of the proposed BLR-MoE on a 10,000-hour MASR dataset.
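As a rough illustration of the language-routing idea described above (a minimal numpy sketch with hypothetical names, not the paper's implementation), a LID router can assign each frame to a language-specific expert FFN:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class LanguageRoutedFFN:
    """One expert FFN per language; a LID router picks the expert per frame."""
    def __init__(self, d_model, d_ff, n_langs, seed=0):
        rng = np.random.default_rng(seed)
        self.router_w = rng.normal(size=(d_model, n_langs)) * 0.02
        self.experts = [
            (rng.normal(size=(d_model, d_ff)) * 0.02,
             rng.normal(size=(d_ff, d_model)) * 0.02)
            for _ in range(n_langs)
        ]

    def __call__(self, x):
        # x: (T, d_model). Hard routing: argmax over per-frame LID logits.
        lang = softmax(x @ self.router_w).argmax(axis=-1)   # (T,)
        y = np.empty_like(x)
        for lid, (w1, w2) in enumerate(self.experts):
            mask = lang == lid
            if mask.any():
                y[mask] = np.maximum(x[mask] @ w1, 0.0) @ w2  # ReLU FFN
        return y, lang

ffn = LanguageRoutedFFN(d_model=8, d_ff=16, n_langs=3)
y, lang = ffn(np.random.default_rng(1).normal(size=(5, 8)))
```

The paper additionally applies the same routing to self-attention and hardens the router (pruning, augmentation); this sketch only shows the FFN dispatch.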


AI-Based Teat Shape and Skin Condition Prediction for Dairy Management

arXiv.org Artificial Intelligence

Dairy owners spend significant effort to keep their animals healthy. There is good reason to hope that technologies such as computer vision and artificial intelligence (AI) could reduce these costs, yet obstacles arise when adapting advanced tools to farming environments. In this work, we adapt AI tools to dairy cow teat localization, teat shape, and teat skin condition classification. We also curate a data collection and analysis methodology for a Machine Learning (ML) pipeline. The resulting teat shape prediction model achieves a mean Average Precision (mAP) of 0.783, and the teat skin condition model achieves an mAP of 0.828. Our work leverages existing ML vision models to facilitate the individualized identification of teat health and skin conditions, applying AI to the dairy management industry.


3rd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

arXiv.org Artificial Intelligence

Video Object Segmentation (VOS) is a vital task in computer vision, focusing on distinguishing foreground objects from the background across video frames. The Motion Expression guided Video Segmentation (MeViS) Track is designed to advance the study of natural-language-guided VOS in complex environments, with the goal of fostering the development of a more comprehensive and robust pixel-level understanding of video scenes in such settings and realistic scenarios through the inclusion of new videos, sentences, and annotations [7]. Our work draws inspiration from the Cutie model, and we investigate the effects of object memory and the total number of memory frames.


Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows

arXiv.org Artificial Intelligence

We consider ML query processing in distributed systems where GPU-enabled workers coordinate to execute complex queries: a computing style often seen in applications that interact with users in support of image processing and natural language processing. Yet intelligent edge applications differ from cloud microservices in important ways, so we cannot just use the same techniques employed in web frameworks. Whereas the outer tiers of today's cloud are dominated by lightweight, stateless, containerized applications that can be upscaled or downscaled at low cost, ML depends on large objects (hyperparameters, model parameters, and supporting databases) and often entails hardware-accelerated computation using devices preconfigured with the proper firmware. When shifting a task to a device that has not previously run it, computation cannot begin until all the prerequisites are in place. We can and do launch new ML instances when additional capacity is needed, but scheduling strategies must evolve to avoid thrashing. In such systems, coscheduling of GPU memory management and task placement represents a promising opportunity. We propose Compass, a novel framework that unifies these functions to reduce job latency while using resources efficiently, placing tasks where data dependencies will be satisfied, collocating tasks from the same job (when this will not overload the host or its GPU), and efficiently managing GPU memory. Comparison with other state-of-the-art schedulers shows a significant reduction in completion times.


Towards Efficient Verification of Quantized Neural Networks

arXiv.org Artificial Intelligence

Quantization replaces floating point arithmetic with integer arithmetic in deep neural network models, providing more efficient on-device inference with less power and memory. In this work, we propose a framework for formally verifying properties of quantized neural networks. Our baseline technique is based on integer linear programming which guarantees both soundness and completeness. We then show how efficiency can be improved by utilizing gradient-based heuristic search methods and also bound-propagation techniques. We evaluate our approach on perception networks quantized with PyTorch. Our results show that we can verify quantized networks with better scalability and efficiency than the previous state of the art.
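The bound-propagation idea mentioned above can be illustrated with a small sketch (not the paper's actual encoding; all names, shapes, and the requantization scheme are assumptions): propagating sound elementwise bounds through a single integer-arithmetic layer by splitting the weight matrix into its positive and negative parts.

```python
import numpy as np

def quantized_layer_bounds(lo, hi, w_int, b_int, scale_shift):
    """Sound interval bounds through one integer layer
    y = relu((x @ W + b) >> s); names and shapes are illustrative."""
    w_pos = np.maximum(w_int, 0)   # positive weights pair lo->lo, hi->hi
    w_neg = np.minimum(w_int, 0)   # negative weights flip the pairing
    # numpy's >> on signed ints is an arithmetic shift (floor division),
    # matching a floor-based integer requantization step.
    out_lo = (lo @ w_pos + hi @ w_neg + b_int) >> scale_shift
    out_hi = (hi @ w_pos + lo @ w_neg + b_int) >> scale_shift
    return np.maximum(out_lo, 0), np.maximum(out_hi, 0)

lo = np.array([-2, 0], dtype=np.int64)
hi = np.array([3, 4], dtype=np.int64)
w = np.array([[2, -1], [1, 3]], dtype=np.int64)
b = np.array([1, -2], dtype=np.int64)
out_lo, out_hi = quantized_layer_bounds(lo, hi, w, b, scale_shift=1)
```

Unlike the integer-linear-programming baseline, such interval bounds are sound but incomplete: any input in the box yields an output in the box, but the box may be loose.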


Cascade: A Platform for Delay-Sensitive Edge Intelligence

arXiv.org Artificial Intelligence

Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail-latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.


The Robust Semantic Segmentation UNCV2023 Challenge Results

arXiv.org Artificial Intelligence

This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023. The challenge was centered around semantic segmentation in urban environments, with a particular focus on natural adversarial scenarios. The report presents the results of 19 submitted entries, with numerous techniques drawing inspiration from cutting-edge uncertainty quantification methodologies presented at prominent computer vision and machine learning conferences and journals over the past few years. Within this document, the challenge is introduced, shedding light on its purpose and objectives, which primarily revolved around enhancing the robustness of semantic segmentation in urban scenes under varying natural adversarial conditions. The report then delves into the top-performing solutions. Moreover, the document aims to provide a comprehensive overview of the diverse solutions deployed by all participants. By doing so, it seeks to offer readers a deeper insight into the array of strategies that can be leveraged to effectively handle the inherent uncertainties associated with autonomous driving and semantic segmentation, especially within urban environments.


Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning

arXiv.org Artificial Intelligence

The unified streaming and non-streaming speech recognition model has achieved great success due to its comprehensive capabilities. In this paper, we propose to improve the accuracy of the unified model by bridging the inherent representation gap between the streaming and non-streaming modes with a contrastive objective. Specifically, the top-layer hidden representations at the same frame of the streaming and non-streaming modes are regarded as a positive pair, encouraging the representation of the streaming mode to be close to its non-streaming counterpart. Multiple negative samples are randomly selected from the remaining frames of the same sample under the non-streaming mode. Experimental results demonstrate that the proposed method achieves consistent improvements over the unified model in both streaming and non-streaming modes. Our method achieves a CER of 4.66% in the streaming mode and a CER of 4.31% in the non-streaming mode, which sets a new state of the art on the AISHELL-1 benchmark.
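The positive/negative construction above can be sketched as a frame-level InfoNCE-style loss (a minimal numpy illustration assuming cosine similarity and a temperature; not the authors' exact objective):

```python
import numpy as np

def frame_contrastive_loss(h_stream, h_full, tau=0.1):
    """For each frame t, the positive is the non-streaming representation
    at the same frame t; negatives are the non-streaming representations
    at all other frames of the same utterance."""
    # Normalize so the dot product is cosine similarity.
    a = h_stream / np.linalg.norm(h_stream, axis=1, keepdims=True)
    b = h_full / np.linalg.norm(h_full, axis=1, keepdims=True)
    sim = a @ b.T / tau                               # (T, T) similarities
    sim = sim - sim.max(axis=1, keepdims=True)        # numerical stability
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))                    # positives on diagonal

rng = np.random.default_rng(0)
h = rng.normal(size=(6, 16))
# Matching representations should score lower than unrelated ones.
loss_match = frame_contrastive_loss(h, h)
loss_rand = frame_contrastive_loss(h, rng.normal(size=(6, 16)))
```

Minimizing this pulls each streaming frame representation toward its non-streaming counterpart while pushing it away from other frames.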


Improving CTC-based ASR Models with Gated Interlayer Collaboration

arXiv.org Artificial Intelligence

CTC-based automatic speech recognition (ASR) models without an external language model usually lack the capacity to model conditional dependencies and textual interactions. In this paper, we present a Gated Interlayer Collaboration (GIC) mechanism to improve the performance of CTC-based models, which introduces textual information into the model and thus relaxes the conditional independence assumption of CTC-based models. Specifically, we consider the weighted sum of token embeddings as the textual representation for each position, where the position-specific weights are the softmax probability distribution produced by inter-layer auxiliary CTC losses. The textual representations are then fused with the acoustic features via a gate unit. Experiments on the AISHELL-1, TEDLIUM2, and AIDATATANG corpora show that the proposed method outperforms several strong baselines.
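The fusion step can be sketched roughly as follows (a numpy illustration; shapes, names, and the exact gate design are assumptions, not the paper's code): the interlayer CTC posterior weights the token-embedding table to give a textual vector per frame, and a sigmoid gate mixes it with the acoustic feature.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gic_fuse(acoustic, ctc_logits, token_emb, w_gate):
    """acoustic: (T, d) frame features; ctc_logits: (T, V) interlayer
    CTC logits; token_emb: (V, d) embedding table; w_gate: (2d, d)."""
    probs = softmax(ctc_logits)      # (T, V) per-frame token posterior
    textual = probs @ token_emb      # (T, d) weighted sum of embeddings
    g = sigmoid(np.concatenate([acoustic, textual], axis=-1) @ w_gate)
    return g * acoustic + (1.0 - g) * textual   # gated fusion, (T, d)

rng = np.random.default_rng(0)
T, V, d = 4, 10, 8
out = gic_fuse(rng.normal(size=(T, d)), rng.normal(size=(T, V)),
               rng.normal(size=(V, d)), rng.normal(size=(2 * d, d)))
```

The gate lets each frame decide, elementwise, how much textual evidence to blend into the acoustic representation passed to the next layer.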


A Dual Prompt Learning Framework for Few-Shot Dialogue State Tracking

arXiv.org Artificial Intelligence

The dialogue state tracking (DST) module is an important component of task-oriented dialog systems for understanding users' goals and needs. Collecting dialogue state labels, including slots and values, can be costly, especially with the wide application of dialogue systems in more and more newly emerging domains. In this paper, we focus on how to utilize the language understanding and generation abilities of pre-trained language models for DST. We design a dual prompt learning framework for few-shot DST. Specifically, we consider the learning of slot generation and value generation as dual tasks, and two prompts are designed based on this dual structure to incorporate task-related knowledge for the two tasks respectively. In this way, the DST task can be formulated efficiently as a language modeling task under few-shot settings. Experimental results on two task-oriented dialogue datasets show that the proposed method not only outperforms existing state-of-the-art few-shot methods but can also generate unseen slots. This indicates that DST-related knowledge can be probed from pre-trained language models (PLMs) and utilized to address low-resource DST efficiently with the help of prompt learning.