AITopics | Wang, Hongbo

Collaborating Authors

Wang, Hongbo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Comprehensive Detection of Chinese Harmful Memes

Lu, Junyu, Xu, Bo, Zhang, Xiaokun, Wang, Hongbo, Zhu, Haohao, Zhang, Dongyu, Yang, Liang, Lin, Hongfei

arXiv.org Artificial IntelligenceOct-3-2024

This paper has been accepted in the NeurIPS 2024 D & B Track. Harmful memes have proliferated on the Chinese Internet, while research on detecting Chinese harmful memes significantly lags behind due to the absence of reliable datasets and effective detectors. To this end, we focus on the comprehensive detection of Chinese harmful memes. We construct ToxiCN MM, the first Chinese harmful meme dataset, which consists of 12,000 samples with fine-grained annotations for various meme types. Additionally, we propose a baseline detector, Multimodal Knowledge Enhancement (MKE), incorporating contextual information of meme content generated by the LLM to enhance the understanding of Chinese memes. During the evaluation phase, we conduct extensive quantitative experiments and qualitative analyses on multiple baselines, including LLMs and our MKE. The experimental results indicate that detecting Chinese harmful memes is challenging for existing models while demonstrating the effectiveness of MKE. The resources for this paper are available at https://github.com/DUT-lujunyu/ToxiCN_MM.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.02378

Country:

Asia > Middle East > UAE (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California (0.14)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Law (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

PclGPT: A Large Language Model for Patronizing and Condescending Language Detection

Wang, Hongbo, Li, Mingda, Lu, Junyu, Xia, Hebin, Yang, Liang, Xu, Bo, Liu, Ruizhu, Lin, Hongfei

arXiv.org Artificial IntelligenceSep-30-2024

Disclaimer: Samples in this paper may be harmful and cause discomfort! Patronizing and condescending language (PCL) is a form of speech directed at vulnerable groups. As an essential branch of toxic language, this type of language exacerbates conflicts and confrontations among Internet communities and detrimentally impacts disadvantaged groups. Traditional pre-trained language models (PLMs) perform poorly in detecting PCL due to its implicit toxicity traits like hypocrisy and false sympathy. With the rise of large language models (LLMs), we can harness their rich emotional semantics to establish a paradigm for exploring implicit toxicity. In this paper, we introduce PclGPT, a comprehensive LLM benchmark designed specifically for PCL. We collect, annotate, and integrate the Pcl-PT/SFT dataset, and then develop a bilingual PclGPT-EN/CN model group through a comprehensive pre-training and supervised fine-tuning staircase process to facilitate implicit toxic detection. Group detection results and fine-grained detection from PclGPT and other models reveal significant variations in the degree of bias in PCL towards different vulnerable groups, necessitating increased societal attention to protect them.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.00361

Country:

Asia > China (0.14)
North America > Canada (0.14)
Europe (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhanced User Interaction in Operating Systems through Machine Learning Language Models

Zhang, Chenwei, Lu, Wenran, Ni, Chunhe, Wang, Hongbo, Wu, Jiang

arXiv.org Artificial IntelligenceFeb-24-2024

With the large language model showing human-like logical reasoning and understanding ability, whether agents based on the large language model can simulate the interaction behavior of real users, so as to build a reliable virtual recommendation A/B test scene to help the application of recommendation research is an urgent, important and economic value problem. The combination of interaction design and machine learning can provide a more efficient and personalized user experience for products and services. This personalized service can meet the specific needs of users and improve user satisfaction and loyalty. Second, the interactive system can understand the user's views and needs for the product by providing a good user interface and interactive experience, and then use machine learning algorithms to improve and optimize the product. This iterative optimization process can continuously improve the quality and performance of the product to meet the changing needs of users. At the same time, designers need to consider how these algorithms and tools can be combined with interactive systems to provide a good user experience. This paper explores the potential applications of large language models, machine learning and interaction design for user interaction in recommendation systems and operating systems. By integrating these technologies, more intelligent and personalized services can be provided to meet user needs and promote continuous improvement and optimization of products. This is of great value for both recommendation research and user experience applications.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.00806

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.96)
Energy (0.94)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Cloud-Based Large Language Model Processing with Elasticsearch and Transformer Models

Ni, Chunhe, Wu, Jiang, Wang, Hongbo, Lu, Wenran, Zhang, Chenwei

arXiv.org Artificial IntelligenceFeb-24-2024

Large Language Models (LLMs) are a class of generative AI models built using the Transformer network, capable of leveraging vast datasets to identify, summarize, translate, predict, and generate language. LLMs promise to revolutionize society, yet training these foundational models poses immense challenges. Semantic/vector search within large language models is a potent technique that can significantly enhance search result accuracy and relevance. Unlike traditional keyword-based search methods, semantic search utilizes the meaning and context of words to grasp the intent behind queries and deliver more precise outcomes. Elasticsearch emerges as one of the most popular tools for implementing semantic search -- an exceptionally scalable and robust search engine designed for indexing and searching extensive datasets. In this article, we delve into the fundamentals of semantic search and explore how to harness Elasticsearch and Transformer models to bolster large language model processing paradigms. We gain a comprehensive understanding of semantic search principles and acquire practical skills for implementing semantic search in real-world model application scenarios.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2403.00807

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.15)
North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.41)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Data Pipeline Training: Integrating AutoML to Optimize the Data Flow of Machine Learning Models

Wu, Jiang, Wang, Hongbo, Ni, Chunhe, Zhang, Chenwei, Lu, Wenran

arXiv.org Artificial IntelligenceFeb-20-2024

Data Pipeline plays an indispensable role in tasks such as modeling machine learning and developing data products. With the increasing diversification and complexity of Data sources, as well as the rapid growth of data volumes, building an efficient Data Pipeline has become crucial for improving work efficiency and solving complex problems. This paper focuses on exploring how to optimize data flow through automated machine learning methods by integrating AutoML with Data Pipeline. We will discuss how to leverage AutoML technology to enhance the intelligence of Data Pipeline, thereby achieving better results in machine learning tasks. By delving into the automation and optimization of Data flows, we uncover key strategies for constructing efficient data pipelines that can adapt to the ever-changing data landscape. This not only accelerates the modeling process but also provides innovative solutions to complex problems, enabling more significant outcomes in increasingly intricate data domains. Keywords- Data Pipeline Training;AutoML; Data environment; Machine learning

artificial intelligence, data pipeline, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2402.12916

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

AutoFed: Heterogeneity-Aware Federated Multimodal Learning for Robust Autonomous Driving

Zheng, Tianyue, Li, Ang, Chen, Zhe, Wang, Hongbo, Luo, Jun

arXiv.org Artificial IntelligenceMar-30-2023

Object detection with on-board sensors (e.g., lidar, radar, and camera) play a crucial role in autonomous driving (AD), and these sensors complement each other in modalities. While crowdsensing may potentially exploit these sensors (of huge quantity) to derive more comprehensive knowledge, \textit{federated learning} (FL) appears to be the necessary tool to reach this potential: it enables autonomous vehicles (AVs) to train machine learning models without explicitly sharing raw sensory data. However, the multimodal sensors introduce various data heterogeneity across distributed AVs (e.g., label quantity skews and varied modalities), posing critical challenges to effective FL. To this end, we present AutoFed as a heterogeneity-aware FL framework to fully exploit multimodal sensory data on AVs and thus enable robust AD. Specifically, we first propose a novel model leveraging pseudo-labeling to avoid mistakenly treating unlabeled objects as the background. We also propose an autoencoder-based data imputation method to fill missing data modality (of certain AVs) with the available ones. To further reconcile the heterogeneity, we finally present a client selection mechanism exploiting the similarities among client models to improve both training stability and convergence rate. Our experiments on benchmark dataset confirm that AutoFed substantially improves over status quo approaches in both precision and recall, while demonstrating strong robustness to adverse weather conditions.

artificial intelligence, autofed, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.08646

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

DocRED-FE: A Document-Level Fine-Grained Entity And Relation Extraction Dataset

Wang, Hongbo, Xiong, Weimin, Song, Yifan, Zhu, Dawei, Xia, Yu, Li, Sujian

arXiv.org Artificial IntelligenceMar-21-2023

Joint entity and relation extraction (JERE) is one of the most important tasks in information extraction. However, most existing works focus on sentence-level coarse-grained JERE, which have limitations in real-world scenarios. In this paper, we construct a large-scale document-level fine-grained JERE dataset DocRED-FE, which improves DocRED with Fine-Grained Entity Type. Specifically, we redesign a hierarchical entity type schema including 11 coarse-grained types and 119 fine-grained types, and then re-annotate DocRED manually according to this schema. Through comprehensive experiments we find that: (1) DocRED-FE is challenging to existing JERE models; (2) Our fine-grained entity types promote relation classification. We make DocRED-FE with instruction and the code for our baselines publicly available at https://github.com/PKU-TANGENT/DOCRED-FE.

artificial intelligence, docred-fe, natural language, (16 more...)

arXiv.org Artificial Intelligence

2303.11141

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Add feedback

Dynamic Interactional And Cooperative Network For Shield Machine

Gao, Dazhi, Li, Rongyang, Wang, Hongbo, Mao, Lingfeng, Ning, Huansheng

arXiv.org Artificial IntelligenceNov-17-2022

The shield machine (SM) is a complex mechanical device used for tunneling. However, the monitoring and deciding were mainly done by artificial experience during traditional construction, which brought some limitations, such as hidden mechanical failures, human operator error, and sensor anomalies. To deal with these challenges, many scholars have studied SM intelligent methods. Most of these methods only take SM into account but do not consider the SM operating environment. So, this paper discussed the relationship among SM, geological information, and control terminals. Then, according to the relationship, models were established for the control terminal, including SM rate prediction and SM anomaly detection. The experimental results show that compared with baseline models, the proposed models in this paper perform better. In the proposed model, the R2 and MSE of rate prediction can reach 92.2\%, and 0.0064 respectively. The abnormal detection rate of anomaly detection is up to 98.2\%.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2211.10473

Country:

North America > United States (0.28)
Asia > China (0.17)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback