AITopics | Jiang, Lin

Jiang, Lin

Deep Learning-Based Identification of Inconsistent Method Names: How Far Are We?

Wang, Taiming, Zhang, Yuxia, Jiang, Lin, Tang, Yi, Li, Guangjie, Liu, Hui

arXiv.org Artificial Intelligence, Jan-21-2025

Concise and meaningful method names are crucial for program comprehension and maintenance. However, method names may become inconsistent with their corresponding implementations, causing confusion and errors. Several deep learning (DL)-based approaches have been proposed to identify such inconsistencies, with initial evaluations showing promising results. However, these evaluations typically use a balanced dataset, where the numbers of inconsistent and consistent names are equal. This setup, along with flawed dataset construction, leads to false positives, making reported performance less reliable in real-world scenarios, where most method names are consistent. In this paper, we present an empirical study that evaluates state-of-the-art DL-based methods for identifying inconsistent method names. We create a new benchmark by combining automatic identification from commit histories with manual developer inspections, reducing false positives. We evaluate five representative DL approaches (one retrieval-based and four generation-based) on this benchmark. Our results show that performance drops substantially when moving from the balanced dataset to the new benchmark. We further conduct quantitative and qualitative analyses to understand the strengths and weaknesses of the approaches. Retrieval-based methods perform well on simple methods and those with popular name sub-tokens but fail due to inefficient representation techniques. Generation-based methods struggle with inaccurate similarity calculations and immature name generation. Based on these findings, we propose improvements using contrastive learning and large language models (LLMs). Our study suggests that significant improvements are needed before these DL approaches can be effectively applied to real-world software systems.
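The abstract's concern about balanced evaluation sets can be illustrated with a small worked calculation. The sketch below assumes a hypothetical detector with a fixed true-positive rate and false-positive rate (the values 0.8 and 0.1 are illustrative assumptions, not figures from the paper) and shows how its precision changes as the share of truly inconsistent names shrinks.

```python
def precision(tpr: float, fpr: float, inconsistent_share: float) -> float:
    """Precision of a detector when `inconsistent_share` of names are truly inconsistent.

    tpr: fraction of inconsistent names that the detector flags.
    fpr: fraction of consistent names that the detector wrongly flags.
    """
    true_pos = tpr * inconsistent_share
    false_pos = fpr * (1.0 - inconsistent_share)
    return true_pos / (true_pos + false_pos)


# Hypothetical detector quality; these rates are NOT taken from the paper.
TPR, FPR = 0.8, 0.1

# Compare a balanced set (50%) with increasingly realistic, imbalanced ones.
for share in (0.5, 0.1, 0.01):
    print(f"inconsistent share {share:.2f}: precision {precision(TPR, FPR, share):.3f}")
```

Under these assumed rates, precision is roughly 0.89 on a balanced set but falls below 0.1 when only 1% of names are inconsistent, even though the detector itself has not changed.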

large language model, machine learning, natural language, (16 more...)

doi: 10.1007/s10664-024-10592-z

2501.12617

Country:

Asia (1.00)
North America > United States (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

Wen, Wei, Liu, Kuang-Hung, Fedorov, Igor, Zhang, Xin, Yin, Hang, Chu, Weiwei, Hassani, Kaveh, Sun, Mengying, Liu, Jiang, Wang, Xu, Jiang, Lin, Chen, Yuxin, Zhang, Buyun, Liu, Xi, Cheng, Dehua, Chen, Zhengxing, Zhao, Guang, Han, Fangqiu, Yang, Jiyan, Hao, Yuchen, Xiong, Liang, Chen, Wen-Yen

arXiv.org Artificial Intelligence, Nov-13-2023

Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry systems, such as the ranking systems at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1) scale - Meta ranking systems serve billions of users, (2) strong baselines - the baselines are production models optimized by hundreds to thousands of world-class engineers for years since the rise of deep learning, (3) dynamic baselines - engineers may have established new and stronger baselines during the NAS search, and (4) efficiency - the search pipeline must yield results quickly in alignment with the productionization life cycle. In this paper, we present Rankitect, a NAS software framework for ranking systems at Meta. Rankitect seeks to build brand-new architectures by composing low-level building blocks from scratch. Rankitect implements and improves state-of-the-art (SOTA) NAS methods for comprehensive and fair comparison under the same search space, including sampling-based NAS, one-shot NAS, and Differentiable NAS (DNAS). We evaluate Rankitect by comparing it to multiple production ranking models at Meta. We find that Rankitect can discover new models from scratch that achieve a competitive tradeoff between Normalized Entropy loss and FLOPs. When utilizing a search space designed by engineers, Rankitect can generate better models than engineers, achieving positive results in offline evaluation and online A/B tests at Meta scale.
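The abstract names Normalized Entropy (NE) as the quality metric but does not define it. A minimal sketch follows, assuming the definition commonly used for click-prediction models: average log loss divided by the entropy of the empirical positive rate. The function name `normalized_entropy` and the sample data are illustrative assumptions, not taken from Rankitect.

```python
import math
from typing import Sequence


def normalized_entropy(labels: Sequence[int], preds: Sequence[float]) -> float:
    """Average log loss divided by the entropy of the empirical positive rate.

    Assumes labels are 0 or 1, predictions lie strictly between 0 and 1,
    and the data contains at least one positive and one negative label.
    """
    n = len(labels)
    # Average binary cross-entropy of the model's predictions.
    log_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(labels, preds)
    ) / n
    # Entropy of a baseline that always predicts the empirical positive rate.
    base_rate = sum(labels) / n
    background = -(
        base_rate * math.log(base_rate)
        + (1 - base_rate) * math.log(1 - base_rate)
    )
    return log_loss / background


# Illustrative data only: one positive among four examples.
labels = [1, 0, 0, 0]
print(normalized_entropy(labels, [0.25, 0.25, 0.25, 0.25]))  # base-rate predictor
print(normalized_entropy(labels, [0.9, 0.1, 0.1, 0.1]))      # more informative predictor
```

Under this definition, a model that always predicts the base rate scores an NE of 1, and lower values indicate better predictions, which is why NE is traded off against compute cost such as FLOPs.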

artificial intelligence, machine learning, supernet, (14 more...)

2311.0843

Country: North America > United States > Virginia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Sudowoodo: a Chinese Lyric Imitation System with Source Lyrics

Chang, Yongzhu, Zhang, Rongsheng, Jiang, Lin, Chen, Qihang, Zhang, Le, Pu, Jiashu

arXiv.org Artificial Intelligence, Aug-8-2023

Lyrics generation is a well-known application in natural language generation research, with several previous studies focusing on generating accurate lyrics using precise controls such as keywords and rhymes. However, lyrics imitation, which involves writing new lyrics by imitating the style and content of the source lyrics, remains a challenging task due to the lack of a parallel corpus. In this paper, we introduce Sudowoodo, a Chinese lyrics imitation system that can generate new lyrics based on the text of source lyrics. To address the lack of a parallel training corpus for lyrics imitation, we propose a novel framework that constructs a parallel corpus from source lyrics using a keyword-based lyrics model. The resulting (new lyrics, source lyrics) pairs are then used to train the lyrics imitation model. During inference, we use a post-processing module to filter and rank the generated lyrics, selecting the highest-quality ones. As a bonus, we incorporate audio information and align the lyrics with the audio to form songs. The human evaluation results show that our framework can perform better lyric imitation. The Sudowoodo system and a demo video of the system are available at https://Sudowoodo.apps-hp.danlu.netease.com/ and https://youtu.be/u5BBT_j1L5M.

artificial intelligence, chinese lyric imitation system, natural language, (2 more...)

2308.04665

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Generation (0.53)
