AITopics | He, Bo

Collaborating Authors

He, Bo


OIPR: Evaluation for Time-series Anomaly Detection Inspired by Operator Interest

Jing, Yuhan, Wang, Jingyu, Zhang, Lei, Sun, Haifeng, He, Bo, Zhuang, Zirui, Wang, Chengsen, Qi, Qi, Liao, Jianxin

arXiv.org Artificial Intelligence, Mar-3-2025

With the growing adoption of time-series anomaly detection (TAD) technology, numerous studies have employed deep learning-based detectors for analyzing time-series data in the fields of Internet services, industrial systems, and sensors. The selection and optimization of anomaly detectors strongly rely on the availability of an effective performance evaluation method for TAD. Since anomalies in time-series data often manifest as a sequence of points, conventional metrics that solely consider the detection of individual points are inadequate. Existing evaluation methods for TAD typically employ point-based or event-based metrics to capture the temporal context. However, point-based metrics tend to overestimate detectors that excel only in detecting long anomalies, while event-based metrics are susceptible to being misled by fragmented detection results. To address these limitations, we propose OIPR, a novel set of TAD evaluation metrics. It models the process of operators receiving detector alarms and handling faults, using the area under the operator interest curve to evaluate the performance of TAD algorithms. Furthermore, we build a special scenario dataset to compare the characteristics of different evaluation methods. Through experiments conducted on the special scenario dataset and five real-world datasets, we demonstrate the remarkable performance of OIPR in extreme and complex scenarios. It achieves a balance between point and event perspectives, overcoming their primary limitations and offering applicability to broader situations.
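The tension between point-based and event-based metrics described in the abstract can be illustrated with a toy example. The sketch below is not OIPR and is not taken from the paper; the label layout, the two hypothetical detectors, and the helper names (`mark`, `events`, `point_recall`, `event_recall`) are all assumptions made for illustration.

```python
def mark(length, spans):
    """Build a 0/1 label list with 1s on each [start, end) span."""
    labels = [0] * length
    for start, end in spans:
        for i in range(start, end):
            labels[i] = 1
    return labels


def events(labels):
    """Return [start, end) index pairs for each consecutive run of 1s."""
    runs, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(labels)))
    return runs


def point_recall(truth, pred):
    """Fraction of anomalous points that were flagged."""
    hits = sum(1 for t, p in zip(truth, pred) if t and p)
    return hits / sum(truth)


def event_recall(truth, pred):
    """Fraction of anomalous events with at least one flagged point."""
    spans = events(truth)
    detected = sum(1 for start, end in spans if any(pred[start:end]))
    return detected / len(spans)


# Assumed toy ground truth: one long anomaly (20 points) and three short ones.
truth = mark(60, [(5, 25), (30, 32), (40, 42), (50, 52)])

# Hypothetical detector A: finds only the long anomaly.
long_only = mark(60, [(5, 25)])
# Hypothetical detector B: flags a single point inside every anomaly.
one_hit = mark(60, [(5, 6), (30, 31), (40, 41), (50, 51)])

for name, pred in [("long_only", long_only), ("one_hit", one_hit)]:
    print(name,
          f"point={point_recall(truth, pred):.2f}",
          f"event={event_recall(truth, pred):.2f}")
# long_only point=0.77 event=0.25
# one_hit point=0.15 event=1.00
```

Under these assumptions, the point view rewards detector A although it misses three of four events, while the event view gives detector B a perfect score although it covers only a small share of the anomalous points.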

anomaly event, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2503.0126

Country:

North America > United States (0.46)
Europe > France (0.28)
North America > Canada > Ontario (0.14)

Genre: Research Report (0.63)

Industry:

Information Technology > Security & Privacy (0.92)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning

You, Zuyao, Wang, Junke, Kong, Lingyu, He, Bo, Wu, Zuxuan

arXiv.org Artificial Intelligence, Jan-23-2025

We present Pix2Cap-COCO, the first panoptic pixel-level caption dataset designed to advance fine-grained visual understanding. To achieve this, we carefully design an automated annotation pipeline that prompts GPT-4V to generate pixel-aligned, instance-specific captions for individual objects within images, enabling models to learn more granular relationships between objects and their contexts. This approach results in 167,254 detailed captions, with an average of 22.94 words per caption. Building on Pix2Cap-COCO, we introduce a novel task, panoptic segmentation-captioning, which challenges models to recognize instances in an image and provide detailed descriptions for each simultaneously. To benchmark this task, we design a robust baseline based on X-Decoder. The experimental results demonstrate that Pix2Cap-COCO is a particularly challenging dataset, as it requires models to excel in both fine-grained visual understanding and detailed language generation. Furthermore, we leverage Pix2Cap-COCO for Supervised Fine-Tuning (SFT) on large multimodal models (LMMs) to enhance their performance. For example, training with Pix2Cap-COCO significantly improves the performance of GPT4RoI, yielding gains of +1.4% in CIDEr, +0.4% in ROUGE, and +0.5% in SPICE on the Visual Genome dataset, and strengthens its region understanding ability on ViP-BENCH, with an overall improvement of +5.1%, including notable increases of +11.2% in recognition accuracy and +22.2% in language generation quality.

caption, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.13893

Country: North America > United States > Maryland (0.14)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Sports > Tennis (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)


Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning

Fang, Ying, He, Bo, Liu, Zhi, Liu, Sannyuya, Yan, Zhonghua, Sun, Jianwen

arXiv.org Artificial Intelligence, Dec-22-2024

Xiaomai is an intelligent tutoring system (ITS) designed to help Chinese college students learn advanced mathematics and prepare for the graduate school math entrance exam. This study investigates two distinctive features within Xiaomai: the incorporation of free-response questions with automatic feedback and the metacognitive element of reflecting on self-made errors. An experiment was conducted to evaluate the impact of these features on mathematics learning. One hundred and twenty college students were recruited and randomly assigned to four conditions: (1) multiple-choice questions without reflection, (2) multiple-choice questions with reflection, (3) free-response questions without reflection, and (4) free-response questions with reflection. Students in the multiple-choice conditions demonstrated better practice performance and learning outcomes compared to their counterparts in the free-response conditions. Additionally, the incorporation of error reflection did not yield a significant impact on students' practice performance or learning outcomes. These findings indicate that the current design of free-response questions and the metacognitive feature of error reflection do not enhance the efficacy of the math ITS. This study highlights the need for redesign or enhancement of Xiaomai to optimize its effectiveness in facilitating advanced mathematics learning.
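The four conditions form a 2 x 2 design (question format crossed with error reflection). A minimal sketch of one way to randomize 120 participants across such a design follows. The equal group size of 30, the fixed seed, and the `assign` helper are assumptions for illustration; the abstract does not state how the randomization was carried out.

```python
import random
from collections import Counter
from itertools import product

# The two factors named in the abstract, crossed into four conditions.
QUESTION_FORMATS = ("multiple-choice", "free-response")
REFLECTION = ("without reflection", "with reflection")


def assign(participants, seed=0):
    """Shuffle participants, then deal them round-robin into the conditions.

    Balanced groups are an assumption of this sketch, not a detail
    reported in the abstract.
    """
    conditions = list(product(QUESTION_FORMATS, REFLECTION))
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}


assignment = assign(range(1, 121))  # hypothetical participant IDs 1..120
counts = Counter(assignment.values())
for condition, n in sorted(counts.items()):
    print(condition, n)
```

Dealing from a shuffled list keeps the groups equal in size; independent per-participant draws would also be random but would usually produce unequal groups.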

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-64302-6_24

2412.17265

Country: Asia > China (0.15)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.88)
Information Technology > Artificial Intelligence > Cognitive Science (0.68)
Information Technology > Artificial Intelligence > Natural Language > Understanding (0.61)


Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection

Zhou, Xin, Kang, Le, Cheng, Zhiyu, He, Bo, Xin, Jingyu

arXiv.org Artificial Intelligence, Jun-28-2021

With rapidly evolving internet technologies and emerging tools, sports-related videos generated online are increasing at an unprecedented pace. To automate the sports video editing/highlight generation process, a key task is to precisely recognize and locate the events in long untrimmed videos. In this tech report, we present a two-stage paradigm to detect what events happen in soccer broadcast videos and when. Specifically, we fine-tune multiple action recognition models on soccer data to extract high-level semantic features, and design a transformer-based temporal detection module to locate the target events. This approach achieved state-of-the-art performance in both tasks, i.e., action spotting and replay grounding, in the SoccerNet-v2 Challenge, under the CVPR 2021 ActivityNet workshop. Our soccer embedding features are released at https://github.com/baidu-research/vidpress-sports. By sharing these features with the broader community, we hope to accelerate research into soccer video understanding.

baidu soccer embedding and transformer, feature combination meet attention, temporal detection

arXiv.org Artificial Intelligence

2106.14447

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)
Information Technology > Artificial Intelligence > Vision (0.53)


GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

Huang, Jie, Juan, Rongshun, Gomez, Randy, Nakamura, Keisuke, Sha, Qixin, He, Bo, Li, Guangliang

arXiv.org Artificial Intelligence, Apr-13-2021

Deep reinforcement learning (DRL) has achieved great success in many simulated tasks. The sample inefficiency problem makes applying traditional DRL methods to real-world robots a great challenge. Generative Adversarial Imitation Learning (GAIL), a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large environments. However, GAIL shares the limitation of other imitation learning methods: it can seldom surpass the performance of the demonstrations. In this paper, to address this limitation of GAIL, we propose GAN-Based Interactive Reinforcement Learning (GAIRL) from demonstration and human evaluative feedback by combining the advantages of GAIL and interactive reinforcement learning. We tested our proposed method in six physics-based control tasks, ranging from simple low-dimensional control tasks (Cart Pole and Mountain Car) to difficult high-dimensional tasks (Inverted Double Pendulum, Lunar Lander, Hopper and HalfCheetah). Our results suggest that with both optimal and suboptimal demonstrations, a GAIRL agent can always learn a more stable policy with optimal or close to optimal performance, while the performance of the GAIL agent is upper bounded by the performance of the demonstrations or is even worse. In addition, our results indicate that GAIRL outperforms GAIL because of the complementary effect of demonstrations and human evaluative feedback.

agent, artificial intelligence, educational setting, (17 more...)

arXiv.org Artificial Intelligence

2104.066

Country:

Asia > China (0.29)
Asia > Japan > Honshū (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Educational Setting (0.93)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Improving Interactive Reinforcement Agent Planning with Human Demonstration

Li, Guangliang, Gomez, Randy, Nakamura, Keisuke, Lin, Jinying, Zhang, Qilei, He, Bo

arXiv.org Artificial Intelligence, Apr-18-2019

TAMER has proven to be a powerful interactive reinforcement learning method for allowing ordinary people to teach and personalize autonomous agents' behavior by providing evaluative feedback. However, a TAMER agent planning with UCT, a Monte Carlo Tree Search strategy, can only update states along its path and may incur a high learning cost, especially for a physical robot. In this paper, we propose to drive the agent's exploration along the optimal path and reduce the learning cost by initializing the agent's reward function via inverse reinforcement learning from demonstration. We test our proposed method in the RL benchmark domain Grid World with different discounts on human reward. Our results show that learning from demonstration can allow a TAMER agent to learn a roughly optimal policy up to the deepest search and encourage the agent to explore along the optimal path. In addition, we find that learning from demonstration can improve learning efficiency by reducing the total feedback and the number of incorrect actions and by increasing the ratio of correct actions needed to obtain an optimal policy, allowing a TAMER agent to converge faster.
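For readers unfamiliar with UCT, its selection step is commonly implemented with the UCB1 rule, which scores each child action by its mean value plus an exploration bonus. The sketch below shows only that generic rule; it is not the paper's TAMER planner, and the node layout (`visits`, `total_value`), the function name `uct_select`, and the default exploration constant are assumptions for illustration.

```python
import math


def uct_select(children, c=math.sqrt(2)):
    """Return the index of the child chosen by the UCB1 rule.

    children: list of dicts with 'visits' (int) and 'total_value' (float).
    This layout is a hypothetical one chosen for the sketch.
    """
    parent_visits = sum(child["visits"] for child in children)
    best_index, best_score = None, -math.inf
    for index, child in enumerate(children):
        if child["visits"] == 0:
            return index  # unvisited actions are tried first
        mean_value = child["total_value"] / child["visits"]
        bonus = c * math.sqrt(math.log(parent_visits) / child["visits"])
        score = mean_value + bonus
        if score > best_score:
            best_index, best_score = index, score
    return best_index


often_tried = {"visits": 10, "total_value": 6.0}  # mean 0.6
rarely_tried = {"visits": 2, "total_value": 1.0}  # mean 0.5

print(uct_select([often_tried, rarely_tried]))         # 1: bonus favours the rarely tried child
print(uct_select([often_tried, rarely_tried], c=0.0))  # 0: pure exploitation picks the higher mean
```

In a TAMER-style agent the values being averaged would come from a learned model of human reward rather than from environment reward; that substitution is not shown here.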

agent, artificial intelligence, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

1904.08621

Country: Asia (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data

Han, Bo, He, Bo, Nian, Rui, Ma, Mengmeng, Zhang, Shujing, Li, Minghui, Lendasse, Amaury

arXiv.org Machine Learning, Aug-26-2014

The extreme learning machine (ELM), a neural network algorithm, has shown good performance, including high speed and a simple structure, but weak robustness on blended data is an unavoidable defect of the original ELM. We present a new machine learning framework called LARSEN-ELM to overcome this problem. We describe the two key steps of LARSEN-ELM. In the first step, preprocessing, we select the input variables highly related to the output using least angle regression (LARS). In the second step, training, we employ a Genetic Algorithm (GA) based selective ensemble and the original ELM. In the experiments, we apply a sum of two sines and four datasets from the UCI repository to verify the robustness of our approach. The experimental results show that, compared with the original ELM and other methods such as OP-ELM, GASEN-ELM and LSBoost, LARSEN-ELM significantly improves robustness while keeping a relatively high speed.
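As background, a basic ELM fixes random hidden-layer weights and fits only the output weights by least squares. The sketch below illustrates that idea on a sum of two sines using only the standard library. It is not LARSEN-ELM: the LARS variable selection and the GA-based selective ensemble are omitted, and the specific target function, the tanh activation, the small ridge term, the hidden-layer size, and all helper names are assumptions for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def train_elm(xs, ys, hidden=10, ridge=1e-3, seed=0):
    """Fit a single-input ELM: random hidden layer, least-squares output weights."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(hidden)]  # random input weights, never trained
    b = [rng.uniform(-1, 1) for _ in range(hidden)]  # random biases, never trained
    h = [[math.tanh(w[j] * x + b[j]) for j in range(hidden)] for x in xs]
    # Normal equations (H^T H + ridge * I) beta = H^T y; the ridge term is an
    # assumption added for numerical stability, not part of the original ELM.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(hidden)] for i in range(hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(hidden)]
    return w, b, solve(hth, hty)


def predict(model, x):
    """Evaluate the trained network at a single input x."""
    w, b, beta = model
    return sum(bj * math.tanh(wj * x + cj) for wj, cj, bj in zip(w, b, beta))


# Assumed toy target: a sum of two sines sampled on [-3, 3].
xs = [i * 0.1 for i in range(-30, 31)]
ys = [math.sin(x) + math.sin(2 * x) for x in xs]

model = train_elm(xs, ys)
mse = sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline = sum(y ** 2 for y in ys) / len(ys)  # error of always predicting 0
print(f"training MSE {mse:.4f} vs. zero-predictor MSE {baseline:.4f}")
```

Because only the output weights are solved for, training reduces to one linear solve, which is the source of the speed the abstract mentions.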

artificial intelligence, elm, neural network, (20 more...)

arXiv.org Machine Learning

doi: 10.1016/j.neucom.2014.01.069

1408.2003

Country:

Europe (0.46)
Asia > China (0.29)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
