AITopics | Wu, Di

Wu, Di

End-to-end Graph Learning Approach for Cognitive Diagnosis of Student Tutorial

Yang, Fulai, Wu, Di, He, Yi, Tao, Li, Luo, Xin

arXiv.org Artificial Intelligence, Oct-30-2024

Cognitive diagnosis (CD) uses students' existing study records to estimate their mastery of unknown knowledge concepts, which is vital for evaluating their learning abilities. Accurate CD is extremely challenging because it involves complex relationships and mechanisms among students, knowledge concepts, study records, etc. However, existing approaches consider these relationships and mechanisms only loosely, through non-end-to-end learning frameworks, resulting in sub-optimal feature extraction and fusion for CD. In contrast, this paper proposes an End-to-end Graph Neural Networks-based Cognitive Diagnosis (EGNN-CD) model. EGNN-CD consists of three main parts: a knowledge concept network (KCN), graph neural networks-based feature extraction (GNNFE), and cognitive ability prediction (CAP). First, KCN constructs CD-related interactions by comprehensively extracting physical information from students, exercises, and knowledge concepts. Second, a four-channel GNNFE is designed to extract high-order and individual features from the constructed KCN. Finally, CAP employs a multi-layer perceptron to fuse the extracted features and predict students' learning abilities in an end-to-end manner. With these designs, feature extraction and fusion are guaranteed to be comprehensive and optimal for CD. Extensive experiments on three real datasets demonstrate that EGNN-CD achieves significantly higher accuracy than state-of-the-art models in CD.

artificial intelligence, ieee transaction, machine learning, (12 more...)

arXiv: 2411.00845

Country: Asia (0.29)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

A Unified Solution to Diverse Heterogeneities in One-shot Federated Learning

Bai, Jun, Song, Yiliao, Wu, Di, Sajjanhar, Atul, Xiang, Yong, Zhou, Wei, Tao, Xiaohui, Li, Yan

arXiv.org Artificial Intelligence, Oct-28-2024

One-shot federated learning (FL) limits the communication between the server and clients to a single round, which greatly reduces the privacy leakage risks present in traditional FL with its multiple communication rounds. However, we find that existing one-shot FL frameworks are vulnerable to distributional heterogeneity because they concentrate predominantly on model heterogeneity and pay insufficient attention to data heterogeneity. To fill this gap, we propose a unified, data-free, one-shot federated learning framework (FedHydra) that can effectively address both model and data heterogeneity. Rather than applying existing value-only learning mechanisms, FedHydra uses a structure-value learning mechanism. Specifically, a new stratified learning structure is proposed to cover data heterogeneity, and the value of each item during computation reflects model heterogeneity. With this design, the data and model heterogeneity issues are simultaneously monitored from different aspects during learning. Consequently, FedHydra can effectively mitigate both issues by minimizing their inherent conflicts. We compare FedHydra with three SOTA baselines on four benchmark datasets. Experimental results show that our method outperforms previous one-shot FL methods in both homogeneous and heterogeneous settings.

artificial intelligence, heterogeneity, machine learning, (16 more...)

arXiv: 2410.21119

Country: Oceania > Australia (0.47)

Genre: Research Report > New Finding (0.66)

Industry:

Education (0.93)
Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression

Li, Yuankai, Gu, Jia-Chen, Wu, Di, Chang, Kai-Wei, Peng, Nanyun

arXiv.org Artificial Intelligence, Oct-20-2024

Retrieval-augmented generation (RAG) can supplement large language models (LLMs) by integrating external knowledge. However, as the number of retrieved documents increases, the input length to LLMs grows linearly, causing a dramatic increase in latency and a degradation in long-context understanding. This is particularly serious for multi-hop questions that require a chain of reasoning across documents. To accelerate inference, reduce costs, and minimize distractions, this paper presents BRIEF (Bridging Retrieval and Inference through Evidence Fusion), a lightweight approach that performs query-aware multi-hop reasoning by compressing retrieved documents into highly dense textual summaries to integrate into in-context learning. To enable learning compression for multi-hop reasoning, we curate synthetic data by extracting atomic proposition expressions that encapsulate distinct factoids from the source documents to compose synthetic summaries. Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries and enables a range of LLMs to achieve exceptional open-domain question answering (QA) performance. For example, on HotpotQA, BRIEF improves the compression rate by 2 times compared to the state-of-the-art baseline, while outperforming it by 3.00% EM and 4.16% F1 with Flan-UL2 as the reader LM. It also generates more concise summaries than proprietary GPT-3.5, while demonstrating nearly identical QA performance.
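
EM and F1 scores for open-domain QA, such as those quoted above, are conventionally computed per question after SQuAD-style answer normalization. The abstract does not give the scoring script, so the following is only a minimal sketch of that common convention. The function names are illustrative and are not taken from the BRIEF code.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Convention assumed here: two empty answers count as a match.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level EM and F1 are then the averages of these per-question scores, usually taking the maximum over all gold answers when a question has several.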

large language model, machine learning, natural language, (19 more...)

arXiv: 2410.15277

Country:

North America > United States (1.00)
Europe (1.00)
North America > Canada (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Consumer Products & Services > Hotels (1.00)
Government (0.93)
Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

DMGNN: Detecting and Mitigating Backdoor Attacks in Graph Neural Networks

Sui, Hao, Chen, Bing, Zhang, Jiale, Zhu, Chengcheng, Wu, Di, Lu, Qinghua, Long, Guodong

arXiv.org Artificial Intelligence, Oct-17-2024

Recent studies have revealed that graph neural networks (GNNs) are highly susceptible to multiple adversarial attacks. Among these, graph backdoor attacks pose one of the most prominent threats: attackers cause models to misclassify by making them learn backdoored features through injected triggers and modified target labels during the training phase. Based on the features of the triggers, these attacks can be categorized into out-of-distribution (OOD) and in-distribution (ID) graph backdoor attacks. Triggers with notable differences from the clean sample feature distributions constitute OOD backdoor attacks, whereas the triggers in ID backdoor attacks are nearly identical to the clean sample feature distributions. Existing methods can successfully defend against OOD backdoor attacks by comparing the feature distributions of triggers and clean samples, but fail to mitigate stealthy ID backdoor attacks. Due to the lack of proper supervision signals, the main task accuracy is negatively affected when defending against ID backdoor attacks. To bridge this gap, we propose DMGNN, a defense against both OOD and ID graph backdoor attacks that eliminates stealthiness to guarantee defense effectiveness and improve model performance. Specifically, DMGNN identifies the hidden ID and OOD triggers by predicting label transitions based on counterfactual explanation. To further filter the diversity of generated explainable graphs and erase the influence of the trigger features, we present a reverse sampling pruning method to screen and discard the triggers directly at the data level. Extensive experimental evaluations on open graph datasets demonstrate that DMGNN far outperforms the state-of-the-art (SOTA) defense methods, reducing the attack success rate to 5% with almost negligible degradation in model performance (within 3.5%).

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2410.14105

Country:

Asia (0.28)
Oceania > Australia (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.90)

Trajectory Prediction for Autonomous Driving using Agent-Interaction Graph Embedding

Samiuddin, Jilan, Boulet, Benoit, Wu, Di

arXiv.org Artificial Intelligence, Oct-15-2024

The trajectory prediction module in an autonomous driving system is crucial for the decision-making and safety of the autonomous agent car and its surroundings. This work presents a novel scheme called AiGem (Agent-Interaction Graph Embedding) to predict traffic vehicle trajectories around the autonomous car. AiGem tackles this problem in four steps. First, AiGem formulates the historical traffic interaction with the autonomous agent as a graph in two stages: (1) at each time step of the history frames, agent interactions are captured using spatial edges between the agents (nodes of the graph), and (2) the spatial graphs are connected in chronological order using temporal edges. Then, AiGem applies a depthwise graph encoder network to the spatial-temporal graph to generate the graph embedding, i.e., the embedding of all the nodes in the graph. Next, a sequential Gated Recurrent Unit decoder network uses the embedding of the current timestamp to obtain the decoded states. Finally, an output network comprising a Multilayer Perceptron predicts the trajectories using the decoded states as its inputs. Results show that AiGem outperforms state-of-the-art deep learning algorithms for longer prediction horizons.
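
The graph formulation in the first step (spatial edges within each time step, temporal edges across consecutive steps) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how interacting agents are selected or which nodes the temporal edges join, so the distance threshold `radius` and the rule that a temporal edge links an agent to itself in the next frame are both hypothetical choices.

```python
from itertools import combinations
from math import hypot


def build_spatiotemporal_graph(frames, radius=30.0):
    """Return (nodes, spatial_edges, temporal_edges) for a history of frames.

    frames: list of dicts mapping agent_id -> (x, y), one dict per time step.
    radius: assumed interaction distance (hypothetical, not from the paper).
    Each node is a (time_step, agent_id) pair.
    """
    nodes = [(t, agent) for t, frame in enumerate(frames) for agent in frame]
    spatial_edges = []
    temporal_edges = []
    for t, frame in enumerate(frames):
        # Stage 1: spatial edges between nearby agents in the same time step.
        for a, b in combinations(sorted(frame), 2):
            (xa, ya), (xb, yb) = frame[a], frame[b]
            if hypot(xa - xb, ya - yb) <= radius:
                spatial_edges.append(((t, a), (t, b)))
        # Stage 2: temporal edges join consecutive frames in chronological
        # order; here each agent is linked to itself in the next frame.
        if t + 1 < len(frames):
            for agent in frame:
                if agent in frames[t + 1]:
                    temporal_edges.append(((t, agent), (t + 1, agent)))
    return nodes, spatial_edges, temporal_edges


history = [
    {"ego": (0.0, 0.0), "car1": (10.0, 0.0), "car2": (100.0, 0.0)},
    {"ego": (1.0, 0.0), "car1": (11.0, 0.0)},
]
nodes, spatial, temporal = build_spatiotemporal_graph(history)
```

In this toy history, `car2` is too far away to receive a spatial edge and leaves the scene after the first frame, so it also has no temporal edge. An encoder network would then operate on the resulting node and edge lists.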

actor, artificial intelligence, machine learning, (19 more...)

arXiv: 2410.23298

Country:

North America > United States (0.68)
North America > Canada > Quebec (0.28)

Genre: Research Report (0.84)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.91)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An Online Self-learning Graph-based Lateral Controller for Self-Driving Cars

Samiuddin, Jilan, Boulet, Benoit, Wu, Di

arXiv.org Artificial Intelligence, Oct-15-2024

The hype around self-driving cars has been growing in recent years and has sparked much research. Several modules in self-driving cars are thoroughly investigated to ensure safety, comfort, and efficiency, among which the controller is crucial. The controller module can be divided into longitudinal and lateral controllers: the former follows the reference velocity, while the latter reduces the lateral displacement error from the reference path. Generally, a controller tuned once is not sufficient to perform well in all environments. Thus, a controller that can adapt to changing conditions is necessary for autonomous driving. Furthermore, these controllers often depend on vehicle models that also need to adapt over time due to varying environments. This paper uses graphs to present novel techniques to learn the vehicle model and the lateral controller online. First, a heterogeneous graph is presented depicting the current states of and inputs to the vehicle. The vehicle model is then learned online using known physical constraints in conjunction with the processing of the graph through a Graph Neural Network structure. Next, another heterogeneous graph, depicting the transition from current to desired states, is processed through another Graph Neural Network structure to generate the steering command on the fly. Finally, the performance of this self-learning model-based lateral controller is evaluated and shown to be satisfactory on an open-source autonomous driving platform called CARLA.
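
The lateral displacement error that a lateral controller tries to reduce can be illustrated with elementary geometry. The abstract does not define the exact error used in the paper, so the following is a sketch of one common choice: the signed perpendicular distance from the vehicle position to the line through the nearest reference-path segment. The function name and the sign convention are assumptions.

```python
from math import hypot


def lateral_error(position, seg_start, seg_end):
    """Signed perpendicular distance from position to the line through a segment.

    Positive when the vehicle lies to the left of the path direction
    (seg_start -> seg_end), negative when it lies to the right.
    """
    px, py = position
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    length = hypot(dx, dy)
    if length == 0.0:
        raise ValueError("segment endpoints must differ")
    # 2-D cross product of the path direction with the vector to the vehicle,
    # divided by the segment length to obtain a distance.
    return (dx * (py - ay) - dy * (px - ax)) / length


left_of_path = lateral_error((1.0, 2.0), (0.0, 0.0), (10.0, 0.0))
right_of_path = lateral_error((1.0, -3.0), (0.0, 0.0), (10.0, 0.0))
```

With the path running along the positive x-axis, the first call yields 2.0 and the second yields -3.0. A steering command would aim to drive this value toward zero.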

artificial intelligence, controller, machine learning, (16 more...)

doi: 10.1109/TIV.2024.3478052

arXiv: 2410.11979

Country: North America > Canada > Quebec > Montreal (0.15)

Genre:

Research Report > Promising Solution (0.48)
Instructional Material > Online (0.40)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Wu, Di, Wang, Hongwei, Yu, Wenhao, Zhang, Yuwei, Chang, Kai-Wei, Yu, Dong

arXiv.org Artificial Intelligence, Oct-14-2024

Recent large language model (LLM)-driven chat assistant systems have integrated memory components to track user-assistant chat histories, enabling more accurate and personalized responses. However, their long-term memory capabilities in sustained interactions remain underexplored. This paper introduces LongMemEval, a comprehensive benchmark designed to evaluate five core long-term memory abilities of chat assistants: information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and abstention. With 500 meticulously curated questions embedded within freely scalable user-assistant chat histories, LongMemEval presents a significant challenge to existing long-term memory systems, with commercial chat assistants and long-context LLMs showing a 30% accuracy drop when memorizing information across sustained interactions. We then present a unified framework that breaks down the long-term memory design into four design choices across the indexing, retrieval, and reading stages. Building on key experimental insights, we propose several memory designs, including session decomposition for optimizing value granularity, fact-augmented key expansion for enhancing the index structure, and time-aware query expansion for refining the search scope. Experimental results show that these optimizations greatly improve both memory recall and downstream question answering on LongMemEval. Overall, our study provides valuable resources and guidance for advancing the long-term memory capabilities of LLM-based chat assistants, paving the way toward more personalized and reliable conversational AI.
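
The three memory designs named above (session decomposition, fact-augmented key expansion, and a time-restricted search scope) can be pictured with a toy key-value index. This is a deliberately simplified sketch, not the LongMemEval implementation: facts are supplied by hand rather than extracted by a model, retrieval uses plain token overlap instead of a learned retriever, and the date range is passed explicitly instead of being inferred from the query. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MemoryItem:
    key_tokens: frozenset  # tokens used for matching (round text plus facts)
    value: str             # text that would be handed to the reader model
    day: date              # date of the source session


def tokenize(text):
    return frozenset(text.lower().split())


def index_session(rounds, day, facts_per_round):
    """Session decomposition: store one item per round, not per session.

    Fact-augmented key expansion: each key also contains the tokens of the
    facts associated with that round.
    """
    items = []
    for text, facts in zip(rounds, facts_per_round):
        key = tokenize(text) | tokenize(" ".join(facts))
        items.append(MemoryItem(key, text, day))
    return items


def retrieve(items, query, start=None, end=None, top_k=1):
    """Rank items by token overlap, optionally restricted to a date range."""
    q = tokenize(query)
    candidates = [
        it for it in items
        if (start is None or it.day >= start) and (end is None or it.day <= end)
    ]
    ranked = sorted(candidates, key=lambda it: len(q & it.key_tokens), reverse=True)
    return [it.value for it in ranked[:top_k]]


memory = index_session(
    ["I just got back from the vet with Max"],
    date(2024, 3, 1),
    [["user owns a dog named Max"]],
)
memory += index_session(
    ["The weather was rainy all week"],
    date(2024, 5, 1),
    [[]],
)
by_fact = retrieve(memory, "what is my dog named")
by_time = retrieve(memory, "dog", start=date(2024, 4, 1))
```

The first query matches the vet round only through the attached fact, since the round text itself never mentions a dog. The second query shows the date filter narrowing the search scope before ranking.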

information, large language model, machine learning, (19 more...)

arXiv: 2410.10813

Country:

North America > United States (1.00)
Asia (1.00)
Europe (0.93)

Genre: Research Report > New Finding (0.87)

Industry:

Information Technology (0.68)
Health & Medicine (0.67)
Leisure & Entertainment > Sports (0.67)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings

Wu, Di, Li, Siyuan, Feng, Chen, Cao, Lu, Zhang, Yue, Yang, Jie, Sawan, Mohamad

arXiv.org Artificial Intelligence, Oct-13-2024

Recent advancements in brain-computer interfaces (BCIs) have enabled the decoding of lexical tones from intracranial recordings, offering the potential to restore the communication abilities of speech-impaired tonal language speakers. However, data heterogeneity induced by both physiological and instrumental factors poses a significant challenge for unified invasive brain tone decoding. Traditional subject-specific models, which operate under a heterogeneous decoding paradigm, fail to capture generalized neural representations and cannot effectively leverage data across subjects. To address these limitations, we introduce Homogeneity-Heterogeneity Disentangled Learning for neural Representations (H2DiLR), a novel framework that disentangles and learns both the homogeneity and heterogeneity from intracranial recordings across multiple subjects. To evaluate H2DiLR, we collected stereoelectroencephalography (sEEG) data from multiple participants reading Mandarin materials comprising 407 syllables, representing nearly all Mandarin characters. Extensive experiments demonstrate that H2DiLR, as a unified decoding paradigm, significantly outperforms the conventional heterogeneous decoding approach. Furthermore, we empirically confirm that H2DiLR effectively captures both homogeneity and heterogeneity during neural representation learning.

The human language system, with its intricate and expansive syntactic structure, enables rich and complex communication. Decoding spoken language from within human brains has emerged as a significant topic of interest in neuroscience (Anumanchipalli et al., 2019; Willett et al., 2023; Feng et al., 2023; Lu et al., 2023; Liu et al., 2023). The decoding of vocal tone from brain measurements (Lu et al., 2023; Liu et al., 2023) is of particular research interest, due to the prominence of tonal languages, which make up over 60% of the world's languages (Yip, 2002) and are spoken by approximately one-third of the global population (Dryer & Haspelmath, 2013). In these languages, tone plays a critical role in distinguishing lexical meaning at the syllable level. Mandarin, for instance, is a widely spoken tonal language that has an extensive inventory of over 50,000 characters, with each associated with a syllable composed of an initial, a final, and a tone (Duanmu, 2007).

artificial intelligence, machine learning, natural language, (17 more...)

arXiv: 2410.12866

Country: Asia (0.46)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.66)

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder

Chen, Junzhou, Wen, Xuan, Zhang, Ronghui, Ren, Bingtao, Wu, Di, Xu, Zhigang, Wang, Danwei

arXiv.org Artificial Intelligence, Oct-10-2024

Unsupervised Domain Adaptation (UDA) aims to adapt a model trained on a labeled source domain to an unlabeled target domain by addressing the domain shift. Existing UDA methods often fall short of fully leveraging contextual information from the target domain, leading to suboptimal decision boundary separation during source and target domain alignment. To address this, we introduce GrabDAE, an innovative UDA framework designed to tackle domain shift in visual classification tasks. GrabDAE incorporates two key innovations: the Grab-Mask module, which blurs background information in target domain images, enabling the model to focus on essential, domain-relevant features through contrastive learning; and the Denoising Auto-Encoder (DAE), which enhances feature alignment by reconstructing features and filtering noise, ensuring a more robust adaptation to the target domain. These components enable GrabDAE to handle unlabeled target domain data effectively, significantly improving both classification accuracy and robustness. Extensive experiments on benchmark datasets, including VisDA-2017, Office-Home, and Office-31, demonstrate that GrabDAE consistently surpasses state-of-the-art UDA methods, setting new performance benchmarks. By tackling UDA's critical challenges with its novel feature masking and denoising approach, GrabDAE offers both significant theoretical and practical advancements in domain adaptation.

adaptation, artificial intelligence, machine learning, (15 more...)

arXiv: 2410.08023

Country: Asia > China > Guangdong Province (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?

Aycock, Seth, Stap, David, Wu, Di, Monz, Christof, Sima'an, Khalil

arXiv.org Artificial Intelligence, Sep-27-2024

Extremely low-resource (XLR) languages lack substantial corpora for training NLP models, motivating the use of all available resources such as dictionaries and grammar books. Machine Translation from One Book (Tanzer et al., 2024) suggests that prompting long-context LLMs with one grammar book enables translation between English and Kalamang, an unseen XLR language, which would be a noteworthy case of linguistic knowledge helping an NLP task. We investigate whether the book's grammatical explanations or its parallel examples are more effective for learning XLR translation, finding that almost all improvement stems from the parallel examples. Further, we find similar results for Nepali, a seen low-resource language, and achieve performance comparable to an LLM with a grammar book by simply fine-tuning an encoder-decoder translation model. We then investigate where grammar books help by testing two linguistic tasks, grammaticality judgment and gloss prediction, and we explore what kind of grammatical knowledge helps by introducing a typological feature prompt that achieves leading results on these more relevant tasks. We thus emphasise the importance of task-appropriate data for XLR languages: parallel examples for translation, and grammatical data for linguistic tasks. As we find no evidence that long-context LLMs can make effective use of grammatical explanations for XLR translation, we suggest that data collection for multilingual XLR tasks such as translation is best focused on parallel data over linguistic description.

large language model, machine learning, natural language, (17 more...)

arXiv: 2409.19151

Country:

North America > United States (0.46)
Europe > Germany (0.46)
Asia > Indonesia (0.46)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
