AITopics | Haiphong

Collaborating Authors

Haiphong

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Wang, Yuxia, Xing, Rui, Mansurov, Jonibek, Puccetti, Giovanni, Xie, Zhuohan, Ta, Minh Ngoc, Geng, Jiahui, Su, Jinyan, Abassy, Mervat, Ahmed, Saad El Dine, Elozeiri, Kareem, Laiyk, Nurkhan, Goloburda, Maiya, Mahmoud, Tarek, Tomar, Raj Vardhan, Aziz, Alexander, Koike, Ryuto, Kaneko, Masahiro, Shelmanov, Artem, Artemova, Ekaterina, Mikhailov, Vladislav, Tsvigun, Akim, Aji, Alham Fikri, Habash, Nizar, Gurevych, Iryna, Nakov, Preslav

arXiv.org Artificial IntelligenceFeb-17-2025

Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written one is highly challenging, and often no better than random guessing. To verify the generalizability of this finding across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, thus challenging previous conclusions. We find that major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Prompting by explicitly explaining the distinctions in the prompts can partially bridge the gaps in over 50% of the cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.

annotator, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2502.11614

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Vietnam > Haiphong > Haiphong (0.04)
(14 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.67)

Industry:

Media > News (1.00)
Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Empathetic Conversational Agents: Utilizing Neural and Physiological Signals for Enhanced Empathetic Interactions

Saffaryazdi, Nastaran, Gunasekaran, Tamil Selvan, Laveys, Kate, Broadbent, Elizabeth, Billinghurst, Mark

arXiv.org Artificial IntelligenceJan-14-2025

Conversational agents (CAs) are revolutionizing human-computer interaction by evolving from text-based chatbots to empathetic digital humans (DHs) capable of rich emotional expressions. This paper explores the integration of neural and physiological signals into the perception module of CAs to enhance empathetic interactions. By leveraging these cues, the study aims to detect emotions in real-time and generate empathetic responses and expressions. We conducted a user study where participants engaged in conversations with a DH about emotional topics. The DH responded and displayed expressions by mirroring detected emotions in real-time using neural and physiological cues. The results indicate that participants experienced stronger emotions and greater engagement during interactions with the Empathetic DH, demonstrating the effectiveness of incorporating neural and physiological signals for real-time emotion recognition. However, several challenges were identified, including recognition accuracy, emotional transition speeds, individual personality effects, and limitations in voice tone modulation. Addressing these challenges is crucial for further refining Empathetic DHs and fostering meaningful connections between humans and artificial entities. Overall, this research advances human-agent interaction and highlights the potential of real-time neural and physiological emotion recognition in creating empathetic DHs.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.08393

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.06)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Zhang, Qintong, Huang, Victor Shea-Jay, Wang, Bin, Zhang, Junyuan, Wang, Zhengren, Liang, Hao, Wang, Shawn, Lin, Matthieu, He, Conghui, Zhang, Wentao

arXiv.org Artificial IntelligenceNov-5-2024

Document parsing is essential for converting unstructured and semi-structured documents--such as contracts, academic papers, and invoices--into structured, machine-readable data. Document parsing extract reliable structured data from unstructured inputs, providing huge convenience for numerous applications. Especially with recent achievements in Large Language Models, document parsing plays an indispensable role in both knowledge base construction and training data generation. This survey presents a comprehensive review of the current state of document parsing, covering key methodologies, from modular pipeline systems to end-to-end models driven by large vision-language models. Core components such as layout detection, content extraction (including text, tables, and mathematical expressions), and multi-modal data integration are examined in detail. Additionally, this paper discusses the challenges faced by modular document parsing systems and vision-language models in handling complex layouts, integrating multiple modules, and recognizing high-density text. It emphasizes the importance of developing larger and more diverse datasets and outlines future research directions.

detection, proceedings, recognition, (11 more...)

arXiv.org Artificial Intelligence

2410.21169

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Vietnam > Haiphong > Haiphong (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry:

Media (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Add feedback

Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges

Van Dinh, Nguyen, Dang, Thanh Chi, Nguyen, Luan Thanh, Van Nguyen, Kiet

arXiv.org Artificial IntelligenceOct-4-2024

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) Dialect identification and (2) Speech recognition. The empirical results suggest two implications including the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.

dataset, dialect, experiment, (17 more...)

arXiv.org Artificial Intelligence

2410.03458

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Thanh Hóa Province > Thanh Hóa (0.04)
Asia > Vietnam > Hưng Yên Province > Hưng Yên (0.04)
(65 more...)

Genre: Research Report > New Finding (0.66)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective

Hu, Guimin, Xin, Yi, Lyu, Weimin, Huang, Haojian, Sun, Chang, Zhu, Zhihong, Gui, Lin, Cai, Ruichu

arXiv.org Artificial IntelligenceSep-11-2024

Multimodal affective computing (MAC) has garnered increasing attention due to its broad applications in analyzing human behaviors and intentions, especially in text-dominated multimodal affective computing field. This survey presents the recent trends of multimodal affective computing from NLP perspective through four hot tasks: multimodal sentiment analysis, multimodal emotion recognition in conversation, multimodal aspect-based sentiment analysis and multimodal multi-label emotion recognition. The goal of this survey is to explore the current landscape of multimodal affective research, identify development trends, and highlight the similarities and differences across various tasks, offering a comprehensive report on the recent progress in multimodal affective computing from an NLP perspective. This survey covers the formalization of tasks, provides an overview of relevant works, describes benchmark datasets, and details the evaluation metrics for each task. Additionally, it briefly discusses research in multimodal affective computing involving facial expressions, acoustic signals, physiological signals, and emotion causes. Additionally, we discuss the technical approaches, challenges, and future directions in multimodal affective computing. To support further research, we released a repository that compiles related works in multimodal affective computing, providing detailed resources and references for the community.

emotion recognition, recognition, sentiment analysis, (14 more...)

arXiv.org Artificial Intelligence

2409.07388

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Dominican Republic (0.04)
(57 more...)

Genre: Overview (1.00)

Industry:

Leisure & Entertainment (0.92)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Add feedback

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

Lovenia, Holy, Mahendra, Rahmad, Akbar, Salsabil Maulana, Miranda, Lester James V., Santoso, Jennifer, Aco, Elyanah, Fadhilah, Akhdan, Mansurov, Jonibek, Imperial, Joseph Marvin, Kampman, Onno P., Moniz, Joel Ruben Antony, Habibi, Muhammad Ravi Shulthan, Hudi, Frederikus, Montalan, Railey, Ignatius, Ryan, Lopo, Joanito Agili, Nixon, William, Karlsson, Börje F., Jaya, James, Diandaru, Ryandito, Gao, Yuze, Amadeus, Patrick, Wang, Bin, Cruz, Jan Christian Blaise, Whitehouse, Chenxi, Parmonangan, Ivan Halim, Khelli, Maria, Zhang, Wenyu, Susanto, Lucky, Ryanda, Reynard Adha, Hermawan, Sonny Lazuardi, Velasco, Dan John, Kautsar, Muhammad Dehan Al, Hendria, Willy Fitra, Moslem, Yasmin, Flynn, Noah, Adilazuarda, Muhammad Farid, Li, Haochen, Lee, Johanes, Damanhuri, R., Sun, Shuo, Qorib, Muhammad Reza, Djanibekov, Amirbek, Leong, Wei Qi, Do, Quyet V., Muennighoff, Niklas, Pansuwan, Tanrada, Putra, Ilham Firdausi, Xu, Yan, Tai, Ngee Chia, Purwarianti, Ayu, Ruder, Sebastian, Tjhi, William, Limkonchotiwat, Peerat, Aji, Alham Fikri, Keh, Sedrick, Winata, Genta Indra, Zhang, Ruochen, Koto, Fajri, Yong, Zheng-Xin, Cahyawijaya, Samuel

arXiv.org Artificial IntelligenceJul-8-2024

Southeast Asia (SEA) is a region rich in linguistic diversity and cultural variety, with over 1,300 indigenous languages and a population of 671 million people. However, prevailing AI models suffer from a significant lack of representation of texts, images, and audio datasets from SEA, compromising the quality of AI models for SEA languages. Evaluating models for SEA languages is challenging due to the scarcity of high-quality datasets, compounded by the dominance of English training data, raising concerns about potential cultural misrepresentation. To address these challenges, we introduce SEACrowd, a collaborative initiative that consolidates a comprehensive resource hub that fills the resource gap by providing standardized corpora in nearly 1,000 SEA languages across three modalities. Through our SEACrowd benchmarks, we assess the quality of AI models on 36 indigenous languages across 13 tasks, offering valuable insights into the current AI landscape in SEA. Furthermore, we propose strategies to facilitate greater AI advancements, maximizing potential utility and resource equity for the future of AI in SEA.

computational linguistic, dataset, sea language, (14 more...)

arXiv.org Artificial Intelligence

2406.10118

Country:

Asia > Southeast Asia (0.24)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Laos (0.06)
(59 more...)

Genre: Research Report (0.81)

Industry:

Education (0.68)
Information Technology (0.67)
Energy (0.45)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Add feedback

VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain

Le-Duc, Khai

arXiv.org Artificial IntelligenceMay-28-2024

In this work, we present VietMed - a Vietnamese speech recognition dataset in the medical domain comprising 16h of labeled medical speech, 1000h of unlabeled medical speech and 1200h of unlabeled general-domain speech. To our best knowledge, VietMed is by far the world's largest public medical speech recognition dataset in 7 aspects: total duration, number of speakers, diseases, recording conditions, speaker roles, unique medical terms and accents. VietMed is also by far the largest public Vietnamese speech dataset in terms of total duration. Additionally, we are the first to present a medical ASR dataset covering all ICD-10 disease groups and all accents within a country. Moreover, we release the first public large-scale pre-trained models for Vietnamese ASR, w2v2-Viet and XLSR-53-Viet, along with the first public large-scale fine-tuned models for medical ASR. Even without any medical data in unsupervised pre-training, our best pre-trained model XLSR-53-Viet generalizes very well to the medical domain by outperforming state-of-the-art XLSR-53, from 51.8% to 29.6% WER on test set (a relative reduction of more than 40%). All code, data and models are made publicly available here.

dataset, recognition, vietmed, (15 more...)

arXiv.org Artificial Intelligence

2404.05659

Country:

North America > United States (0.14)
Europe > Germany (0.14)
North America > Canada > Ontario > Toronto (0.14)
(15 more...)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

Add feedback

ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images

Van Nguyen, Quan, Tran, Dan Quang, Pham, Huy Quang, Nguyen, Thang Kien-Bao, Nguyen, Nghia Hieu, Van Nguyen, Kiet, Nguyen, Ngan Luu-Thuy

arXiv.org Artificial IntelligenceApr-16-2024

Visual Question Answering (VQA) is a complicated task that requires the capability of simultaneously processing natural language and images. Initially, this task was researched, focusing on methods to help machines understand objects and scene contexts in images. However, some text appearing in the image that carries explicit information about the full content of the image is not mentioned. Along with the continuous development of the AI era, there have been many studies on the reading comprehension ability of VQA models in the world. As a developing country, conditions are still limited, and this task is still open in Vietnam. Therefore, we introduce the first large-scale dataset in Vietnamese specializing in the ability to understand text appearing in images, we call it ViTextVQA (\textbf{Vi}etnamese \textbf{Text}-based \textbf{V}isual \textbf{Q}uestion \textbf{A}nswering dataset) which contains \textbf{over 16,000} images and \textbf{over 50,000} questions with answers. Through meticulous experiments with various state-of-the-art models, we uncover the significance of the order in which tokens in OCR text are processed and selected to formulate answers. This finding helped us significantly improve the performance of the baseline models on the ViTextVQA dataset. Our dataset is available at this \href{https://github.com/minhquan6203/ViTextVQA-Dataset}{link} for research purposes.

blip-2, dataset, prestu, (15 more...)

arXiv.org Artificial Intelligence

2404.10652

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Vĩnh Long Province > Vĩnh Long (0.04)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
(7 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Education (0.66)
Health & Medicine (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Machine learning and Empirical Bayesian Approach for Predictive Buying in B2B E-commerce

De, Tuhin Subhra, Singh, Pranjal, Patel, Alok

arXiv.org Artificial IntelligenceMar-12-2024

In the context of developing nations like India, traditional business to business (B2B) commerce heavily relies on the establishment of robust relationships, trust, and credit arrangements between buyers and sellers. Consequently, ecommerce enterprises frequently. Established in 2016 with a vision to revolutionize trade in India through technology, Udaan is the countrys largest business to business ecommerce platform. Udaan operates across diverse product categories, including lifestyle, electronics, home and employ telecallers to cultivate buyer relationships, streamline order placement procedures, and promote special promotions. The accurate anticipation of buyer order placement behavior emerges as a pivotal factor for attaining sustainable growth, heightening competitiveness, and optimizing the efficiency of these telecallers. To address this challenge, we have employed an ensemble approach comprising XGBoost and a modified version of Poisson Gamma model to predict customer order patterns with precision. This paper provides an in-depth exploration of the strategic fusion of machine learning and an empirical Bayesian approach, bolstered by the judicious selection of pertinent features. This innovative approach has yielded a remarkable 3 times increase in customer order rates, show casing its potential for transformative impact in the ecommerce industry.

customer, prediction, probability, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3647750.3647754

2403.07843

Country:

Asia > Singapore > Central Region > Singapore (0.05)
North America > United States > New York > New York County > New York City (0.05)
Asia > Vietnam > Haiphong > Haiphong (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.94)

Add feedback

Traffic Prediction using Artificial Intelligence: Review of Recent Advances and Emerging Opportunities

Shaygan, Maryam, Meese, Collin, Li, Wanxin, Zhao, Xiaolong, Nejad, Mark

arXiv.org Artificial IntelligenceJun-4-2023

Traffic prediction plays a crucial role in alleviating traffic congestion which represents a critical problem globally, resulting in negative consequences such as lost hours of additional travel time and increased fuel consumption. Integrating emerging technologies into transportation systems provides opportunities for improving traffic prediction significantly and brings about new research problems. In order to lay the foundation for understanding the open research challenges in traffic prediction, this survey aims to provide a comprehensive overview of traffic prediction methodologies. Specifically, we focus on the recent advances and emerging research opportunities in Artificial Intelligence (AI)-based traffic prediction methods, due to their recent success and potential in traffic prediction, with an emphasis on multivariate traffic time series modeling. We first provide a list and explanation of the various data types and resources used in the literature. Next, the essential data preprocessing methods within the traffic prediction context are categorized, and the prediction methods and applications are subsequently summarized. Lastly, we present primary research challenges in traffic prediction and discuss some directions for future research.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.trc.2022.103921

2305.19591

Country:

North America > United States > Delaware > New Castle County > Newark (0.13)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Asia > China > Sichuan Province > Chengdu (0.04)
(27 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(8 more...)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(7 more...)

Add feedback