Peng, Hanyu
Faster Algorithms for Generalized Mean Densest Subgraph Problem
Fan, Chenglin, Li, Ping, Peng, Hanyu
The densest subgraph of a large graph usually refers to a subgraph with the highest average degree, a notion that has been extended to the family of $p$-mean dense subgraph objectives by~\citet{veldt2021generalized}. The $p$-mean densest subgraph problem seeks a subgraph with the highest average $p$-th-power degree, whereas the standard densest subgraph problem seeks a subgraph with simply the highest average degree. It was shown that the standard peeling algorithm can perform arbitrarily poorly on the generalized objective when $p>1$, while its behavior remained unclear for $0<p<1$.
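For intuition, here is a minimal Python sketch of the $p$-mean objective and the standard greedy peeling baseline discussed above; the adjacency-set graph format, function names, and tie-breaking rule are illustrative assumptions, and this is not the faster algorithm proposed in the paper.

```python
# Sketch only: the p-mean objective (average p-th-power degree) and standard peeling.
def p_mean_density(adj, S, p):
    """Average p-th-power degree of the subgraph induced by vertex set S."""
    if not S:
        return 0.0
    return sum(len(adj[v] & S) ** p for v in S) / len(S)

def standard_peeling(adj, p):
    """Repeatedly remove a minimum-degree vertex; return the best prefix seen."""
    S = set(adj)
    best_val, best_S = p_mean_density(adj, S, p), set(S)
    deg = {v: len(adj[v] & S) for v in S}
    while S:
        v = min(S, key=deg.get)        # peel a currently minimum-degree vertex
        S.remove(v)
        for u in adj[v] & S:           # update degrees of its remaining neighbors
            deg[u] -= 1
        val = p_mean_density(adj, S, p)
        if val > best_val:
            best_val, best_S = val, set(S)
    return best_S, best_val

# Toy usage: a triangle plus a pendant vertex.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(standard_peeling(adj, p=2))      # here the whole graph maximizes the p = 2 objective
```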
Copula for Instance-wise Feature Selection and Ranking
Peng, Hanyu, Fang, Guanhua, Li, Ping
Instance-wise feature selection and ranking methods can achieve a good selection of task-friendly features for each sample in the context of neural networks. However, existing approaches that assume feature subsets to be independent are imperfect when considering the dependency between features. To address this limitation, we propose to incorporate the copula, a mathematical tool for capturing correlations between variables, into instance-wise feature selection and ranking.
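As a rough illustration of the kind of dependency structure independence-based selectors miss (not the paper's model), the Python sketch below estimates a Gaussian copula correlation among features by mapping each feature through its empirical CDF and then to normal scores; the function name and toy data are assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata

def gaussian_copula_correlation(X):
    """Estimate the copula correlation matrix of features in X (n_samples x n_features)."""
    n = X.shape[0]
    # 1) Map each feature to (0, 1) via its empirical CDF (scaled ranks).
    U = rankdata(X, axis=0) / (n + 1)
    # 2) Map to standard-normal scores; their correlation defines the Gaussian copula.
    Z = norm.ppf(U)
    return np.corrcoef(Z, rowvar=False)

# Toy usage: two strongly dependent features and one independent feature.
rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
X = np.column_stack([x1, np.exp(x1) + 0.1 * rng.normal(size=1000), rng.normal(size=1000)])
print(np.round(gaussian_copula_correlation(X), 2))
```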
Dataset Pruning: Reducing Training Data by Examining Generalization Influence
Yang, Shuo, Xie, Zeke, Peng, Hanyu, Xu, Min, Sun, Mingming, Li, Ping
The great success of deep learning heavily relies on increasingly larger training data, which comes at the price of huge computational and infrastructural costs. This raises crucial questions: do all training data contribute to the model's performance? How much does each individual training sample or sub-training-set affect the model's generalization, and how can we construct the smallest subset of the entire training data as a proxy training set without significantly sacrificing the model's performance? To answer these questions, we propose dataset pruning, an optimization-based sample selection method that can (1) examine the influence of removing a particular set of training samples on the model's generalization ability with theoretical guarantees, and (2) construct the smallest subset of training data that yields a strictly constrained generalization gap. The empirically observed generalization gap of dataset pruning is substantially consistent with our theoretical expectations. Furthermore, the proposed method prunes 40% of the training examples on the CIFAR-10 dataset and halves the convergence time with only a 1.3% decrease in test accuracy, which is superior to previous score-based sample selection methods.
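As a loose illustration of budget-constrained sample selection (not the paper's actual optimization), the sketch below assumes per-sample influence scores on the generalization gap are already available and greedily prunes the largest set whose summed influence stays within a tolerance; the function name and the `epsilon` parameter are hypothetical.

```python
import numpy as np

def prune_by_influence(influence, epsilon):
    """Return indices of training samples to remove.

    influence: non-negative scores, influence[i] approximates how much removing
               sample i would change the generalization (test) loss.
    epsilon:   allowed total generalization-gap budget for the pruned set.
    """
    order = np.argsort(influence)          # cheapest-to-remove samples first
    cum = np.cumsum(influence[order])
    k = int(np.searchsorted(cum, epsilon, side="right"))
    return order[:k]                       # largest prefix within the budget

# Toy usage: prune as many samples as possible while summed influence stays <= 0.05.
rng = np.random.default_rng(0)
scores = np.abs(rng.normal(scale=0.01, size=1000))
pruned = prune_by_influence(scores, epsilon=0.05)
print(f"pruned {pruned.size} of {scores.size} samples")
```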
MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning
Pian, Weiguo, Peng, Hanyu, Tang, Xunzhu, Sun, Tiezhu, Tian, Haoye, Habib, Andrew, Klein, Jacques, Bissyandé, Tegawendé F.
Representation learning of source code is essential for applying machine learning to software engineering tasks. Learning code representations from a multilingual source code dataset has been shown to be more effective than learning from single-language datasets separately, since more training data from a multilingual dataset improves the model's ability to extract language-agnostic information from source code. However, existing multilingual training focuses only on learning a unified model with parameters shared across languages for language-agnostic information modeling, and overlooks the language-specific information that is crucial for modeling source code across different programming languages. To address this problem, we propose MetaTPTrans, a meta-learning approach for multilingual code representation learning. MetaTPTrans generates different parameters for the feature extractor according to the programming language of the input code snippet, enabling the model to learn both language-agnostic and language-specific information with dynamic parameters in the feature extractor. We conduct experiments on the code summarization and code completion tasks to verify the effectiveness of our approach. The results demonstrate the superiority of our approach, with significant improvements over state-of-the-art baselines.
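To illustrate the general idea of language-conditioned parameter generation (a hypernetwork-style sketch, not the actual MetaTPTrans architecture; layer sizes and names are assumptions), a small module can map a programming-language id to the weights of a projection layer applied on top of shared token features.

```python
import torch
import torch.nn as nn

class LanguageConditionedProjection(nn.Module):
    def __init__(self, num_languages, d_model):
        super().__init__()
        self.d_model = d_model
        self.lang_embedding = nn.Embedding(num_languages, 64)
        # Hypernetwork: language embedding -> flattened weight and bias of a linear layer.
        self.hyper = nn.Linear(64, d_model * d_model + d_model)

    def forward(self, x, lang_id):
        # x: (batch, seq_len, d_model) shared token features; lang_id: (batch,) language indices.
        params = self.hyper(self.lang_embedding(lang_id))
        w = params[:, : self.d_model * self.d_model].view(-1, self.d_model, self.d_model)
        b = params[:, self.d_model * self.d_model :]
        # Apply the per-sample generated linear layer at every token position.
        return torch.einsum("bsd,bde->bse", x, w) + b.unsqueeze(1)

# Toy usage: 4 languages, a batch of 2 code snippets with 10 tokens each.
layer = LanguageConditionedProjection(num_languages=4, d_model=32)
out = layer(torch.randn(2, 10, 32), torch.tensor([0, 3]))
print(out.shape)  # torch.Size([2, 10, 32])
```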