Yang, Xun
InstanT: Semi-supervised Learning with Instance-dependent Thresholds
Li, Muyang, Wu, Runze, Liu, Haoyu, Yu, Jun, Yang, Xun, Han, Bo, Liu, Tongliang
Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, assigns pseudo-labels to confident unlabeled instances and incorporates them into the training set. The selection criterion for confident instances is therefore crucial to the success of SSL. Recently, there has been growing interest in SSL methods that use dynamic or adaptive thresholds. Yet these methods typically apply the same threshold to all samples, or use class-dependent thresholds for instances of a given class, while neglecting instance-level information. In this paper, we propose studying instance-dependent thresholds, which offer the highest degree of freedom compared with existing methods. Specifically, we devise a novel instance-dependent threshold function for all unlabeled instances by utilizing their instance-level ambiguity and the instance-dependent error rates of pseudo-labels, so that instances more likely to receive incorrect pseudo-labels are assigned higher thresholds. Furthermore, we demonstrate that our instance-dependent threshold function provides a bounded probabilistic guarantee for the correctness of the pseudo-labels it assigns.
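To make the idea concrete, below is a minimal sketch of instance-dependent pseudo-label selection. The ambiguity measure (the gap between the top-2 class probabilities), the base threshold `base_tau`, and the scaling factor `alpha` are illustrative assumptions; the paper's actual threshold function, which also incorporates instance-dependent error rates, differs.

```python
import numpy as np

def instance_thresholds(probs, base_tau=0.9, alpha=0.5):
    """Toy instance-dependent thresholds (illustrative, not the paper's function).

    probs: (N, C) softmax outputs for N unlabeled instances.
    Ambiguity is the complement of the top-2 probability gap; more ambiguous
    instances get a higher threshold, so likely-incorrect pseudo-labels
    are filtered out.
    """
    top2 = np.sort(probs, axis=1)[:, -2:]          # two largest probs per row
    ambiguity = 1.0 - (top2[:, 1] - top2[:, 0])    # in [0, 1]; 1 = most ambiguous
    return np.clip(base_tau + alpha * (ambiguity - ambiguity.mean()), 0.0, 1.0)

def select_pseudo_labels(probs):
    tau = instance_thresholds(probs)               # one threshold per instance
    conf = probs.max(axis=1)
    keep = conf >= tau                             # keep confident instances only
    return np.flatnonzero(keep), probs.argmax(axis=1)[keep]

probs = np.array([[0.97, 0.02, 0.01],   # confident -> kept
                  [0.50, 0.45, 0.05]])  # ambiguous -> filtered out
idx, labels = select_pseudo_labels(probs)
print(idx, labels)
```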
NTU4DRadLM: 4D Radar-centric Multi-Modal Dataset for Localization and Mapping
Zhang, Jun, Zhuge, Huayang, Liu, Yiyao, Peng, Guohao, Wu, Zhenyu, Zhang, Haoyuan, Lyu, Qiyang, Li, Heshan, Zhao, Chunyang, Kircali, Dogan, Mharolkar, Sanat, Yang, Xun, Yi, Su, Wang, Yuanzhe, Wang, Danwei
Simultaneous Localization and Mapping (SLAM) is moving towards the robust-perception age. However, LiDAR- and visual-SLAM can easily fail in adverse conditions (rain, snow, smoke, fog, etc.). In comparison, SLAM based on 4D radar, thermal camera, and IMU can work robustly, yet only a little related literature exists. A major reason is the lack of related datasets, which seriously hinders the research. Although some 4D-radar datasets have been proposed in the past four years, they are mainly designed for object detection rather than SLAM, and they normally do not include a thermal camera. Therefore, in this paper, NTU4DRadLM is presented to meet this requirement. Its main characteristics are: 1) it is the only dataset that simultaneously includes all six sensors: 4D radar, thermal camera, IMU, 3D LiDAR, visual camera, and RTK GPS; 2) it is specifically designed for SLAM tasks, providing fine-tuned ground-truth odometry and intentionally formulated loop closures; 3) it considers both a low-speed robot platform and a fast-speed unmanned vehicle platform; 4) it covers structured, unstructured, and semi-structured environments; 5) it considers both middle- and large-scale outdoor environments, i.e., the six trajectories range from 246 m to 6.95 km; 6) it comprehensively evaluates three types of SLAM algorithms. In total, the dataset covers around 17.6 km and 85 min, occupies about 50 GB, and will be accessible from this link: https://github.com/junzhang2016/NTU4DRadLM
Redundancy-aware Transformer for Video Question Answering
Li, Yicong, Yang, Xun, Zhang, An, Feng, Chun, Wang, Xiang, Chua, Tat-Seng
This paper identifies two kinds of redundancy in the current VideoQA paradigm. Specifically, current video encoders tend to holistically embed all video clues at different granularities in a hierarchical manner, which inevitably introduces neighboring-frame redundancy that can overwhelm detailed visual clues at the object level. Subsequently, prevailing vision-language fusion designs introduce cross-modal redundancy by exhaustively fusing all visual elements with question tokens without explicitly differentiating their pairwise vision-language interactions, thus making a pernicious impact on the answering. To this end, we propose a novel transformer-based architecture that models VideoQA in a redundancy-aware manner. To address the neighboring-frame redundancy, we introduce a video encoder structure that emphasizes object-level change in neighboring frames, while adopting an out-of-neighboring message-passing scheme that imposes attention only on distant frames. As for the cross-modal redundancy, we equip our fusion module with a novel adaptive sampling scheme, which explicitly differentiates the vision-language interactions by identifying a small subset of visual elements that exclusively support the answer. Upon these advancements, we find this Redundancy-aware transformer (RaFormer) can achieve state-of-the-art results on multiple VideoQA benchmarks.
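As a rough illustration of the out-of-neighboring message-passing idea, the sketch below builds an attention mask that lets each frame attend only to frames outside a fixed neighborhood. The neighborhood radius, tensor shapes, and single-head dot-product attention are assumptions made for illustration, not the paper's exact design.

```python
import torch

def out_of_neighboring_mask(num_frames: int, radius: int = 2) -> torch.Tensor:
    """Boolean mask where True marks frame pairs allowed to attend.

    Following the out-of-neighboring idea (a sketch, not RaFormer's exact
    scheme): frame i attends only to frames j with |i - j| > radius, so
    redundant neighboring frames do not exchange messages.
    """
    idx = torch.arange(num_frames)
    return (idx[None, :] - idx[:, None]).abs() > radius

T, d = 8, 16
frames = torch.randn(1, T, d)                      # (batch, frames, dim)
mask = out_of_neighboring_mask(T, radius=2)        # (T, T), True = allowed
scores = frames @ frames.transpose(1, 2) / d ** 0.5
scores = scores.masked_fill(~mask, float("-inf"))  # block neighboring frames
attn = scores.softmax(dim=-1)
out = attn @ frames                                # messages from distant frames only
print(attn[0, 0])                                  # zeros within the +/-2 window
```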
Deep Conversational Recommender in Travel
Liao, Lizi, Takanobu, Ryuichi, Ma, Yunshan, Yang, Xun, Huang, Minlie, Chua, Tat-Seng
When traveling to a foreign country, we are often in dire need of an intelligent conversational agent that provides instant and informative responses to our various queries. However, building such a travel agent is non-trivial. First of all, travel naturally involves several sub-tasks such as hotel reservation, restaurant recommendation, and taxi booking, etc., which calls for global topic control. Secondly, the agent should consider various constraints like price or distance given by the user to recommend an appropriate venue. In this paper, we present a Deep Conversational Recommender (DCR) and apply it to the travel domain. It augments the sequence-to-sequence (seq2seq) models with a neural latent topic component to better guide response generation and make the training easier. To consider the various constraints for venue recommendation, we leverage a graph convolutional network (GCN) based approach to capture the relationships between different venues and the match between venue and dialog context. For response generation, we combine the topic-based component with the idea of pointer networks, which allows us to effectively incorporate recommendation results. We perform extensive evaluation on a multi-turn task-oriented dialog dataset in the travel domain, and the results show that our method achieves superior performance compared to a wide range of baselines.
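A minimal sketch of the GCN-based matching step follows, assuming a toy venue graph, random features, and a single graph-convolution layer with symmetric normalization (in the style of Kipf & Welling); the actual DCR model's graph construction and venue scoring are more involved.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer with symmetric normalization and ReLU."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],                           # toy venue graph: 3 venues,
              [1, 0, 0],                           # edges = relations such as
              [1, 0, 0]], dtype=float)             # "near each other"
X = rng.normal(size=(3, 8))                        # venue features (price, type, ...)
W = rng.normal(size=(8, 8))
H = gcn_layer(A, X, W)                             # relation-aware venue embeddings

context = rng.normal(size=8)                       # dialog-context encoding
scores = H @ context                               # match each venue vs. context
print("recommended venue:", int(scores.argmax()))
```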
A Multi-Agent Reinforcement Learning Method for Impression Allocation in Online Display Advertising
Wu, Di, Chen, Cheng, Yang, Xun, Chen, Xiujun, Tan, Qing, Xu, Jian, Gai, Kun
In online display advertising, guaranteed contracts and real-time bidding (RTB) are two major ways for a publisher to sell impressions. Despite the increasing popularity of RTB, roughly half of online display advertising revenue is still generated from guaranteed contracts. Therefore, simultaneously selling impressions through both guaranteed contracts and RTB is a straightforward choice for a publisher to maximize its yield. However, deriving the optimal strategy to allocate impressions is not a trivial task, especially when the environment is unstable in real-world applications. In this paper, we formulate the impression allocation problem as an auction problem where each contract can submit virtual bids for individual impressions. With this formulation, we derive the optimal impression allocation strategy by solving for the optimal bidding functions of the contracts. Since the bids from contracts are decided by the publisher, we propose a multi-agent reinforcement learning (MARL) approach to derive cooperative policies for the publisher to maximize its yield in an unstable environment. The proposed approach also resolves common challenges in MARL such as input dimension explosion, reward credit assignment, and non-stationary environments. Experimental evaluations on large-scale real datasets demonstrate the effectiveness of our approach.
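The auction-style formulation can be illustrated with a simple simulation: each guaranteed contract submits a virtual bid per impression and competes against the RTB bid, with the impression going to the highest bidder. The fixed bidding policies and RTB price below are stand-ins; in the paper, the contracts' bids come from learned cooperative MARL policies.

```python
import random

random.seed(0)

def allocate(impressions, contract_policies, rtb_bid_fn):
    """Auction-style allocation (illustrative): each guaranteed contract
    submits a virtual bid per impression and competes with the RTB bid;
    the impression goes to the highest bidder."""
    allocation = {cid: 0 for cid in contract_policies}
    rtb_wins = 0
    for imp in impressions:
        bids = {cid: policy(imp) for cid, policy in contract_policies.items()}
        best_cid, best_bid = max(bids.items(), key=lambda kv: kv[1])
        if best_bid >= rtb_bid_fn(imp):
            allocation[best_cid] += 1               # guaranteed contract wins
        else:
            rtb_wins += 1                           # sold through RTB instead
    return allocation, rtb_wins

# Hypothetical fixed policies; in the paper these virtual bids would come
# from learned cooperative MARL agents, not hand-written functions.
policies = {"contract_A": lambda imp: 0.8 * imp["quality"],
            "contract_B": lambda imp: 0.5 + 0.1 * imp["quality"]}
imps = [{"quality": random.random()} for _ in range(1000)]
print(allocate(imps, policies, rtb_bid_fn=lambda imp: 0.6))
```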
Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising
Wu, Di, Chen, Xiujun, Yang, Xun, Wang, Hao, Tan, Qing, Zhang, Xiaoxun, Xu, Jian, Gai, Kun
Real-time bidding (RTB) is an important mechanism in online display advertising, where a proper bid for each page view plays an essential role in good marketing results. Budget constrained bidding is a typical scenario in RTB where the advertisers hope to maximize the total value of the winning impressions under a pre-set budget constraint. However, the optimal bidding strategy is hard to derive due to the complexity and volatility of the auction environment. To address these challenges, in this paper, we formulate budget constrained bidding as a Markov Decision Process and propose a model-free reinforcement learning framework to resolve the optimization problem. Our analysis shows that the immediate reward from the environment is misleading under a critical resource constraint. Therefore, we propose a reward-function design methodology for reinforcement learning problems with constraints. Based on the new reward design, we employ a deep neural network to learn the appropriate reward so that the optimal policy can be learned effectively. Different from prior model-based work, which suffers from scalability problems, our framework is easy to deploy in large-scale industrial applications. The experimental evaluations demonstrate the effectiveness of our framework on large-scale real datasets.
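A toy version of the MDP formulation is sketched below: the state is a coarse (remaining budget, elapsed time) bucket, the action nudges a bid-scaling ratio, and a tabular Q-learning agent learns from a simulated auction. The naive immediate reward used here is exactly the kind the paper argues can be misleading under a budget constraint; the paper's learned reward design is not reproduced, and all constants are illustrative.

```python
import random

random.seed(0)

ACTIONS = [-0.1, 0.0, 0.1]                      # lower / keep / raise bid ratio
Q = {}                                          # tabular Q: (state, action) -> value

def bucket(budget, t, horizon):
    """Discretize (remaining budget, elapsed time) into a coarse state."""
    return (int(10 * max(budget, 0.0)), int(10 * t / horizon))

def run_episode(horizon=100, budget=1.0, eps=0.1, lr=0.1):
    ratio = 1.0
    for t in range(horizon):
        s = bucket(budget, t, horizon)
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))  # explore
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
        ratio = max(0.1, ratio + ACTIONS[a])
        value = random.random()                 # simulated impression value
        cost = 0.02 * ratio                     # simulated auction cost
        win = budget >= cost and ratio * value > 0.5   # bid = ratio * value
        reward = value if win else 0.0          # naive immediate reward
        if win:
            budget -= cost
        s_next = bucket(budget, t + 1, horizon)
        best_next = max(Q.get((s_next, i), 0.0) for i in range(len(ACTIONS)))
        old = Q.get((s, a), 0.0)
        Q[(s, a)] = old + lr * (reward + best_next - old)   # Q-learning update

for _ in range(200):
    run_episode()
print("states explored:", len({s for s, _ in Q}))
```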