AITopics | Wang, Guoqing

Collaborating Authors

Wang, Guoqing

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Open-Vocabulary Calibration for Vision-Language Models

Wang, Shuoyuan, Wang, Jindong, Wang, Guoqing, Zhang, Bob, Zhou, Kaiyang, Wei, Hongxin

arXiv.org Artificial IntelligenceFeb-7-2024

Vision-language models (VLMs) have emerged as formidable tools, showing their strong capability in handling various open-vocabulary tasks in image recognition, text-driven visual content generation, and visual chatbots, to name a few. In recent years, considerable efforts and resources have been devoted to adaptation methods for improving downstream performance of VLMs, particularly on parameter-efficient fine-tuning methods like prompt learning. However, a crucial aspect that has been largely overlooked is the confidence calibration problem in fine-tuned VLMs, which could greatly reduce reliability when deploying such models in the real world. This paper bridges the gap by systematically investigating the confidence calibration problem in the context of prompt learning and reveals that existing calibration methods are insufficient to address the problem, especially in the open-vocabulary setting. To solve the problem, we present a simple and effective approach called Distance-Aware Calibration (DAC), which is based on scaling the temperature using as guidance the distance between predicted text labels and base classes. The experiments with 7 distinct prompt learning methods applied across 11 diverse downstream datasets demonstrate the effectiveness of DAC, which achieves high efficacy without sacrificing the inference speed.

calibration, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

2402.04655

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.34)

Add feedback

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

Lei, Yinjie, Wang, Zixuan, Chen, Feng, Wang, Guoqing, Wang, Peng, Yang, Yang

arXiv.org Artificial IntelligenceOct-24-2023

Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction. Compared to conventional single-modal 3D understanding, introducing an additional modality not only elevates the richness and precision of scene interpretation but also ensures a more robust and resilient understanding. This becomes especially crucial in varied and challenging environments where solely relying on 3D data might be inadequate. While there has been a surge in the development of multi-modal 3D methods over past three years, especially those integrating multi-camera images (3D+2D) and textual descriptions (3D+language), a comprehensive and in-depth review is notably absent. In this article, we present a systematic survey of recent progress to bridge this gap. We begin by briefly introducing a background that formally defines various 3D multi-modal tasks and summarizes their inherent challenges. After that, we present a novel taxonomy that delivers a thorough categorization of existing methods according to modalities and tasks, exploring their respective strengths and limitations. Furthermore, comparative results of recent approaches on several benchmark datasets, together with insightful analysis, are offered. Finally, we discuss the unresolved issues and provide several potential avenues for future research.

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2310.15676

Country:

Asia > China (0.14)
Oceania > Australia (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.92)
(3 more...)

Add feedback

Blind quantum machine learning with quantum bipartite correlator

Li, Changhao, Li, Boning, Amer, Omar, Shaydulin, Ruslan, Chakrabarti, Shouvanik, Wang, Guoqing, Xu, Haowei, Tang, Hao, Schoch, Isidor, Kumar, Niraj, Lim, Charles, Li, Ju, Cappellaro, Paola, Pistoia, Marco

arXiv.org Artificial IntelligenceOct-19-2023

Distributed quantum computing is a promising computational paradigm for performing computations that are beyond the reach of individual quantum devices. Privacy in distributed quantum computing is critical for maintaining confidentiality and protecting the data in the presence of untrusted computing nodes. In this work, we introduce novel blind quantum machine learning protocols based on the quantum bipartite correlator algorithm. Our protocols have reduced communication overhead while preserving the privacy of data from untrusted parties. We introduce robust algorithm-specific privacy-preserving mechanisms with low computational overhead that do not require complex cryptographic techniques. We then validate the effectiveness of the proposed protocols through complexity and privacy analysis. Our findings pave the way for advancements in distributed quantum computing, opening up new possibilities for privacy-aware machine learning applications in the era of quantum technologies.

artificial intelligence, machine learning, server, (19 more...)

arXiv.org Artificial Intelligence

2310.12893

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

PPGN: Phrase-Guided Proposal Generation Network For Referring Expression Comprehension

Yang, Chao, Wang, Guoqing, Li, Dongsheng, Shen, Huawei, Feng, Su, Jiang, Bin

arXiv.org Artificial IntelligenceDec-20-2020

Reference expression comprehension (REC) aims to find the location that the phrase refer to in a given image. Proposal generation and proposal representation are two effective techniques in many two-stage REC methods. However, most of the existing works only focus on proposal representation and neglect the importance of proposal generation. As a result, the low-quality proposals generated by these methods become the performance bottleneck in REC tasks. In this paper, we reconsider the problem of proposal generation, and propose a novel phrase-guided proposal generation network (PPGN). The main implementation principle of PPGN is refining visual features with text and generate proposals through regression. Experiments show that our method is effective and achieve SOTA performance in benchmark datasets.

artificial intelligence, image understanding, proposal, (17 more...)

arXiv.org Artificial Intelligence

2012.1089

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.37)

Add feedback