AITopics | Chen, Yulong

Chen, Yulong

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Chen, Liang, Wang, Zekun, Ren, Shuhuai, Li, Lei, Zhao, Haozhe, Li, Yunshui, Cai, Zefan, Guo, Hongcheng, Zhang, Lei, Xiong, Yizhe, Zhang, Yichi, Wu, Ruoyu, Dong, Qingxiu, Zhang, Ge, Yang, Jian, Meng, Lingwei, Hu, Shujie, Chen, Yulong, Lin, Junyang, Bai, Shuai, Vlachos, Andreas, Tan, Xu, Zhang, Minjia, Xiao, Wen, Yee, Aaron, Liu, Tianyu, Chang, Baobao

arXiv.org Artificial Intelligence Dec-29-2024

Building on the foundations of language modeling in natural language processing, Next Token Prediction (NTP) has evolved into a versatile training objective for machine learning tasks across various modalities, achieving considerable success. As Large Language Models (LLMs) have advanced to unify understanding and generation tasks within the textual modality, recent research has shown that tasks from different modalities can also be effectively encapsulated within the NTP framework, transforming the multimodal information into tokens and predicting the next one given the context. This survey introduces a comprehensive taxonomy that unifies both understanding and generation within multimodal learning through the lens of NTP. The proposed taxonomy covers five key aspects: multimodal tokenization, MMNTP model architectures, unified task representation, datasets & evaluation, and open challenges. This new taxonomy aims to aid researchers in their exploration of multimodal intelligence. An associated GitHub repository collecting the latest papers and repos is available at https://github.com/LMM101/Awesome-Multimodal-Next-Token-Prediction

large language model, machine learning, natural language, (25 more...)

arXiv.org Artificial Intelligence

2412.18619

Country:

North America > United States (0.67)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Leisure & Entertainment (0.92)
Information Technology (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)
Media > Music (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Interacted Object Grounding in Spatio-Temporal Human-Object Interactions

Liu, Xiaoyang, Wen, Boran, Liu, Xinpeng, Zhou, Zizheng, Fan, Hongwei, Lu, Cewu, Ma, Lizhuang, Chen, Yulong, Li, Yong-Lu

arXiv.org Artificial Intelligence Dec-27-2024

Spatio-temporal Human-Object Interaction (ST-HOI) understanding aims at detecting HOIs from videos, which is crucial for activity understanding. However, existing whole-body-object interaction video benchmarks overlook the fact that open-world objects are diverse: they usually provide only a limited set of predefined object classes. Therefore, we introduce a new open-world benchmark, Grounding Interacted Objects (GIO), which includes 1,098 interacted object classes and 290K interacted object box annotations. Accordingly, we propose an object grounding task that requires vision systems to discover interacted objects. Even though today's detectors and grounding methods have succeeded greatly, they perform unsatisfactorily in localizing diverse and rare objects in GIO. This reveals the limitations of current vision systems and poses a great challenge. Thus, we explore leveraging spatio-temporal cues to address object grounding and propose a 4D question-answering framework (4D-QA) to discover interacted objects from diverse videos. Our method demonstrates significant superiority in extensive experiments compared to current baselines. Data and code will be publicly available at https://github.com/DirtyHarryLYL/HAKE-AVA.

detection, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.19542

Country: Asia > China (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels

Yan, Jianhao, Yan, Pingchuan, Chen, Yulong, Li, Jing, Zhu, Xianchao, Zhang, Yue

arXiv.org Artificial Intelligence Nov-20-2024

This study presents a comprehensive evaluation of GPT-4's translation capabilities compared to human translators of varying expertise levels. Through systematic human evaluation using the MQM schema, we assess translations across three language pairs (Chinese↔English, Russian↔English, and Chinese↔Hindi) and three domains (News, Technology, and Biomedical). Our findings reveal that GPT-4 achieves performance comparable to junior-level translators in terms of total errors, while still lagging behind senior translators. Unlike traditional Neural Machine Translation systems, which show significant performance degradation in resource-poor language directions, GPT-4 maintains consistent translation quality across all evaluated language pairs. Through qualitative analysis, we identify distinctive patterns in translation approaches: GPT-4 tends toward overly literal translations and exhibits lexical inconsistency, while human translators sometimes over-interpret context and introduce hallucinations. This study represents the first systematic comparison between LLM and human translators across different proficiency levels, providing valuable insights into the current capabilities and limitations of LLM-based translation systems.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.13775

Country:

Europe (1.00)
Asia (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

The Automated Verification of Textual Claims (AVeriTeC) Shared Task

Schlichtkrull, Michael, Chen, Yulong, Whitehouse, Chenxi, Deng, Zhenyun, Akhtar, Mubashara, Aly, Rami, Guo, Zhijiang, Christodoulopoulos, Christos, Cocarascu, Oana, Mittal, Arpit, Thorne, James, Vlachos, Andreas

arXiv.org Artificial Intelligence Oct-31-2024

The Automated Verification of Textual Claims (AVeriTeC) shared task asks participants to retrieve evidence and predict veracity for real-world claims checked by fact-checkers. Evidence can be found either via a search engine, or via a knowledge store provided by the organisers. Submissions are evaluated using AVeriTeC score, which considers a claim to be accurately verified if and only if both the verdict is correct and retrieved evidence is considered to meet a certain quality threshold. The shared task received 21 submissions, 18 of which surpassed our baseline. The winning team was TUDA_MAI with an AVeriTeC score of 63%. In this paper we describe the shared task, present the full results, and highlight key takeaways from the shared task.
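The scoring rule described in the abstract above can be illustrated with a minimal sketch. It assumes that each prediction carries a single numeric evidence-quality value and that the threshold is supplied by the caller; the names `Prediction`, `evidence_quality`, and `averitec_style_score` are hypothetical, and how the shared task actually measures evidence quality is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One system output for one claim (hypothetical structure)."""
    predicted_verdict: str
    gold_verdict: str
    evidence_quality: float  # assumed: higher means better retrieved evidence


def averitec_style_score(predictions, threshold):
    """Fraction of claims counted as accurately verified.

    A claim counts only if the verdict is correct AND the evidence
    quality meets the threshold, mirroring the "if and only if both"
    condition in the abstract. The threshold value is an assumption
    left to the caller.
    """
    if not predictions:
        return 0.0
    verified = sum(
        1
        for p in predictions
        if p.predicted_verdict == p.gold_verdict
        and p.evidence_quality >= threshold
    )
    return verified / len(predictions)


examples = [
    Prediction("Supported", "Supported", 0.9),  # correct verdict, good evidence
    Prediction("Refuted", "Refuted", 0.1),      # correct verdict, weak evidence
    Prediction("Refuted", "Supported", 0.9),    # wrong verdict
    Prediction("Refuted", "Refuted", 0.5),      # correct verdict, good evidence
]
print(averitec_style_score(examples, threshold=0.3))
```

Because both conditions must hold, a system that guesses verdicts correctly without retrieving adequate evidence gets no credit for those claims.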

computational linguistic, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2410.23850

Country:

Asia > China (0.93)
Asia > Middle East > UAE (0.14)
North America > United States > Louisiana (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Industry:

Media (0.68)
Government > Regional Government > Asia Government > China Government (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

Yan, Jianhao, Yan, Pingchuan, Chen, Yulong, Li, Judy, Zhu, Xianchao, Zhang, Yue

arXiv.org Artificial Intelligence Jul-4-2024

This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We also observe imbalanced performance across different languages and domains, with GPT-4's translation capability gradually weakening from resource-rich to resource-poor directions. In addition, we qualitatively study the translations given by GPT-4 and human translators, and find that GPT-4 suffers from literal translations, whereas human translators sometimes overthink the background information. To our knowledge, this study is the first to evaluate LLMs against human translators and analyze the systematic differences between their outputs, providing valuable insights into the current state of LLM-based translation and its potential limitations.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.03658

Country:

Europe > Portugal (0.14)
Europe > Bulgaria (0.14)
Europe > Belgium (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

When Swarm Learning meets energy series data: A decentralized collaborative learning design based on blockchain

Xu, Lei, Chen, Yulong, Chen, Yuntian, Nie, Longfeng, Wei, Xuetao, Xue, Liang, Zhang, Dongxiao

arXiv.org Artificial Intelligence Jun-7-2024

Machine learning models offer the capability to forecast future energy production or consumption and infer essential unknown variables from existing data. However, legal and policy constraints within specific energy sectors render the data sensitive, presenting technical hurdles in utilizing data from diverse sources. Therefore, we propose adopting a Swarm Learning (SL) scheme, which replaces the centralized server with a blockchain-based distributed network to address the security and privacy issues inherent in the centralized architecture of Federated Learning (FL). Within this distributed Collaborative Learning framework, each participating organization governs nodes for inter-organizational communication. Devices from various organizations utilize smart contracts for parameter uploading and retrieval. A consensus mechanism ensures distributed consistency throughout the learning process and guarantees the transparent trustworthiness and immutability of parameters on-chain. The efficacy of the proposed framework is substantiated across three real-world energy series modeling scenarios, with superior performance compared to Local Learning approaches and enhanced data security and privacy compared to Centralized Learning and FL methods. Notably, as the data volume and the number of local epochs increase within a threshold, model performance improves and the variance of performance errors decreases. Consequently, the outcomes produced by the model become more stable and reliable.

artificial intelligence, decentralized collaborative learning design, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2406.04743

Country:

Asia > China > Zhejiang Province (0.14)
Asia > China > Guangdong Province (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs

Deng, Naihao, Sun, Zhenjie, He, Ruiqi, Sikka, Aman, Chen, Yulong, Ma, Lin, Zhang, Yue, Mihalcea, Rada

arXiv.org Artificial Intelligence Jun-5-2024

Recent years have witnessed an explosion of Large Language Models (LLMs), with impressive performance on various Natural Language Processing (NLP) tasks (Brown et al., 2020; Touvron et al., 2023; Team et al., 2023). Research to date has examined the performance of LLMs for various aspects and abilities (Bang et al., 2023b; Bubeck et al., 2023; Akter et al., 2023), but their effectiveness on structured data such as tables is less explored. Unlike unstructured text, tables are systematically organized structures of a large amount of information. This characteristic makes tabular ...

Specifically, we investigate several research questions, including the effectiveness of image-based representation of tabular data and how different text-based or image-based prompt methods affect LLMs' performance on table-related tasks. In addition, we provide analysis and hypothesis of LLMs' behaviors. Our findings include: LLMs maintain decent performance when we use image-based table representations. Sometimes, image-based table representations can make LLMs perform better.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2402.12424

Country:

North America > United States (1.00)
Europe (0.67)
Asia > Middle East > UAE (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.66)

Industry: Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Constituency Parsing using LLMs

Bai, Xuefeng, Wu, Jialong, Chen, Yulong, Wang, Zhongqing, Zhang, Yue

arXiv.org Artificial Intelligence Oct-31-2023

Constituency parsing is a fundamental yet unsolved natural language processing task. In this paper, we explore the potential of recent large language models (LLMs) that have exhibited remarkable performance across various domains and tasks to tackle this task. We employ three linearization strategies to transform output trees into symbol sequences, such that LLMs can solve constituency parsing by generating linearized trees. We conduct experiments using a diverse range of LLMs, including ChatGPT, GPT-4, OPT, LLaMA, and Alpaca, comparing their performance against the state-of-the-art constituency parsers. Our experiments encompass zero-shot, few-shot, and full-training learning settings, and we evaluate the models on one in-domain and five out-of-domain test datasets. Our findings reveal insights into LLMs' performance, generalization abilities, and challenges in constituency parsing.
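The idea of turning a parse tree into a symbol sequence can be illustrated with a minimal sketch. It assumes a bracketed, Penn Treebank-style output and a hypothetical `(label, children)` tuple representation; the abstract does not say which three linearization strategies the paper uses, so this is not necessarily one of them.

```python
def linearize(tree):
    """Flatten a constituency tree into a bracketed string.

    `tree` is a (label, children) pair, where `children` is either a
    word (str) for a pre-terminal node or a list of subtrees. This
    representation is an assumption made for illustration.
    """
    label, children = tree
    if isinstance(children, str):
        # Pre-terminal: part-of-speech tag followed by the word.
        return f"({label} {children})"
    # Internal node: label followed by its linearized children.
    return f"({label} " + " ".join(linearize(c) for c in children) + ")"


example_tree = (
    "S",
    [
        ("NP", [("DT", "The"), ("NN", "cat")]),
        ("VP", [("VBD", "slept")]),
    ],
)
print(linearize(example_tree))
# prints: (S (NP (DT The) (NN cat)) (VP (VBD slept)))
```

A sequence of this kind is something a language model can generate token by token, after which the brackets can be parsed back into a tree for evaluation.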

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2310.19462

Country:

Europe (1.00)
Asia > Middle East > UAE (0.14)
North America > United States > Texas (0.14)
(4 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models

He, Yinghui, Wu, Yufan, Jia, Yilin, Mihalcea, Rada, Chen, Yulong, Deng, Naihao

arXiv.org Artificial Intelligence Oct-25-2023

Theory of Mind (ToM) is the ability to reason about one's own and others' mental states. ToM plays a critical role in the development of intelligence, language understanding, and cognitive processes. While previous work has primarily focused on first and second-order ToM, we explore higher-order ToM, which involves recursive reasoning on others' beliefs. We introduce HI-TOM, a Higher Order Theory of Mind benchmark. Our experimental evaluation using various Large Language Models (LLMs) indicates a decline in performance on higher-order ToM tasks, demonstrating the limitations of current LLMs. We conduct a thorough analysis of different failure cases of LLMs, and share our thoughts on the implications of our findings on the future of NLP.

higher-order theory, large language model, natural language, (3 more...)

arXiv.org Artificial Intelligence

2310.16755

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

Zhang, Yue, Li, Yafu, Cui, Leyang, Cai, Deng, Liu, Lemao, Fu, Tingchen, Huang, Xinting, Zhao, Enbo, Zhang, Yu, Chen, Yulong, Wang, Longyue, Luu, Anh Tuan, Bi, Wei, Shi, Freda, Shi, Shuming

arXiv.org Artificial Intelligence Sep-24-2023

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to the reliability of LLMs in real-world scenarios. In this paper, we survey recent efforts on the detection, explanation, and mitigation of hallucination, with an emphasis on the unique challenges posed by LLMs. We present taxonomies of the LLM hallucination phenomena and evaluation benchmarks, analyze existing approaches aiming at mitigating LLM hallucination, and discuss potential directions for future research.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2309.01219

Country:

Europe (0.92)
Asia (0.67)
North America > United States > Colorado (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine (1.00)
Media (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
