AITopics | Wu, Tingting

Collaborating Authors

Wu, Tingting

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Investigating Large Language Models for Code Vulnerability Detection: An Experimental Study

Jiang, Xuefeng, Wu, Lvhua, Sun, Sheng, Li, Jia, Xue, Jingjing, Wang, Yuwei, Wu, Tingting, Liu, Min

arXiv.org Artificial IntelligenceJan-5-2025

Code vulnerability detection (CVD) is essential for addressing and preventing system security issues, playing a crucial role in ensuring software security. Previous learning-based vulnerability detection methods rely on either fine-tuning medium-size sequence models or training smaller neural networks from scratch. Recent advancements in large pre-trained language models (LLMs) have showcased remarkable capabilities in various code intelligence tasks including code understanding and generation. However, the effectiveness of LLMs in detecting code vulnerabilities is largely under-explored. This work aims to investigate the gap by fine-tuning LLMs for the CVD task, involving four widely-used open-source LLMs. We also implement other five previous graph-based or medium-size sequence models for comparison. Experiments are conducted on five commonly-used CVD datasets, including both the part of short samples and long samples. In addition, we conduct quantitative experiments to investigate the class imbalance issue and the model's performance on samples of different lengths, which are rarely studied in previous works. To better facilitate communities, we open-source all codes and resources of this study in https://github.com/SakiRinn/LLM4CVD and https://huggingface.co/datasets/xuefen/VulResource.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2412.1826

Country:

Asia > China (0.47)
North America > United States > California > Santa Clara County (0.14)

Genre:

Research Report > New Finding (0.50)
Research Report > Experimental Study (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CC-FedAvg: Computationally Customized Federated Averaging

Zhang, Hao, Wu, Tingting, Cheng, Siyao, Liu, Jie

arXiv.org Artificial IntelligenceJul-1-2023

Federated learning (FL) is an emerging paradigm to train model with distributed data from numerous Internet of Things (IoT) devices. It inherently assumes a uniform capacity among participants. However, due to different conditions such as differing energy budgets or executing parallel unrelated tasks, participants have diverse computational resources in practice. Participants with insufficient computation budgets must plan for the use of restricted computational resources appropriately, otherwise they would be unable to complete the entire training procedure, resulting in model performance decline. To address this issue, we propose a strategy for estimating local models without computationally intensive iterations. Based on it, we propose Computationally Customized Federated Averaging (CC-FedAvg), which allows participants to determine whether to perform traditional local training or model estimation in each round based on their current computational budgets. Both theoretical analysis and exhaustive experiments indicate that CC-FedAvg has the same convergence rate and comparable performance as FedAvg without resource constraints. Furthermore, CC-FedAvg can be viewed as a computation-efficient version of FedAvg that retains model performance while considerably lowering computation overhead.

artificial intelligence, cc-fedavg, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2212.13679

Country: Asia > China > Heilongjiang Province (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (0.66)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing

Wu, Tingting, Ding, Xiao, Tang, Minji, Zhang, Hao, Qin, Bing, Liu, Ting

arXiv.org Artificial IntelligenceMay-18-2023

Large-scale datasets in the real world inevitably involve label noise. Deep models can gradually overfit noisy labels and thus degrade model generalization. To mitigate the effects of label noise, learning with noisy labels (LNL) methods are designed to achieve better generalization performance. Due to the lack of suitable datasets, previous studies have frequently employed synthetic label noise to mimic real-world label noise. However, synthetic noise is not instance-dependent, making this approximation not always effective in practice. Recent research has proposed benchmarks for learning with real-world noisy labels. However, the noise sources within may be single or fuzzy, making benchmarks different from data with heterogeneous label noises in the real world. To tackle these issues, we contribute NoisywikiHow, the largest NLP benchmark built with minimal supervision. Specifically, inspired by human cognition, we explicitly construct multiple sources of label noise to imitate human errors throughout the annotation, replicating real-world noise, whose corruption is affected by both ground-truth labels and instances. Moreover, we provide a variety of noise levels to support controlled experiments on noisy data, enabling us to evaluate LNL methods systematically and comprehensively. After that, we conduct extensive multi-dimensional experiments on a broad range of LNL methods, obtaining new and intriguing findings.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.10709

Country: Asia (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback

FedCos: A Scene-adaptive Federated Optimization Enhancement for Performance Improvement

Zhang, Hao, Wu, Tingting, Cheng, Siyao, Liu, Jie

arXiv.org Artificial IntelligenceApr-6-2022

As an emerging technology, federated learning (FL) involves training machine learning models over distributed edge devices, which attracts sustained attention and has been extensively studied. However, the heterogeneity of client data severely degrades the performance of FL compared with that in centralized training. It causes the locally trained models of clients to move in different directions. On the one hand, it slows down or even stalls the global updates, leading to inefficient communication. On the other hand, it enlarges the distances between local models, resulting in an aggregated global model with poor performance. Fortunately, these shortcomings can be mitigated by reducing the angle between the directions that local models move in. Based on this fact, we propose FedCos, which reduces the directional inconsistency of local models by introducing a cosine-similarity penalty. It promotes the local model iterations towards an auxiliary global direction. Moreover, our approach is auto-adapt to various non-IID settings without an elaborate selection of hyperparameters. The experimental results show that FedCos outperforms the well-known baselines and can enhance them under a variety of FL scenes, including varying degrees of data heterogeneity, different number of participants, and cross-silo and cross-device settings. Besides, FedCos improves communication efficiency by 2 to 5 times. With the help of FedCos, multiple FL methods require significantly fewer communication rounds than before to obtain a model with comparable performance.

artificial intelligence, fedco, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/JIOT.2022.3218315

2204.03174

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback