AITopics | Lu, Xuan

Collaborating Authors

Lu, Xuan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

DeepSeek-AI, null, Liu, Aixin, Feng, Bei, Wang, Bin, Wang, Bingxuan, Liu, Bo, Zhao, Chenggang, Dengr, Chengqi, Ruan, Chong, Dai, Damai, Guo, Daya, Yang, Dejian, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Xu, Hanwei, Yang, Hao, Zhang, Haowei, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Li, Hui, Qu, Hui, Cai, J. L., Liang, Jian, Guo, Jianzhong, Ni, Jiaqi, Li, Jiashi, Chen, Jin, Yuan, Jingyang, Qiu, Junjie, Song, Junxiao, Dong, Kai, Gao, Kaige, Guan, Kang, Wang, Lean, Zhang, Lecong, Xu, Lei, Xia, Leyi, Zhao, Liang, Zhang, Liyue, Li, Meng, Wang, Miaojun, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Mingming, Tian, Ning, Huang, Panpan, Wang, Peiyi, Zhang, Peng, Zhu, Qihao, Chen, Qinyu, Du, Qiushi, Chen, R. J., Jin, R. L., Ge, Ruiqi, Pan, Ruizhe, Xu, Runxin, Chen, Ruyi, Li, S. S., Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Wu, Shaoqing, Ye, Shengfeng, Ma, Shirong, Wang, Shiyu, Zhou, Shuang, Yu, Shuiping, Zhou, Shunfeng, Zheng, Size, Wang, T., Pei, Tian, Yuan, Tian, Sun, Tianyu, Xiao, W. L., Zeng, Wangding, An, Wei, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Zhang, Wentao, Li, X. Q., Jin, Xiangyue, Wang, Xianzu, Bi, Xiao, Liu, Xiaodong, Wang, Xiaohan, Shen, Xiaojin, Chen, Xiaokang, Chen, Xiaosha, Nie, Xiaotao, Sun, Xiaowen, Wang, Xiaoxiang, Liu, Xin, Xie, Xin, Yu, Xingkai, Song, Xinnan, Zhou, Xinyi, Yang, Xinyu, Lu, Xuan, Su, Xuecheng, Wu, Y., Li, Y. K., Wei, Y. X., Zhu, Y. X., Xu, Yanhong, Huang, Yanping, Li, Yao, Zhao, Yao, Sun, Yaofeng, Li, Yaohui, Wang, Yaohui, Zheng, Yi, Zhang, Yichao, Xiong, Yiliang, Zhao, Yilong, He, Ying, Tang, Ying, Piao, Yishi, Dong, Yixin, Tan, Yixuan, Liu, Yiyuan, Wang, Yongji, Guo, Yongqiang, Zhu, Yuchen, Wang, Yuduan, Zou, Yuheng, Zha, Yukun, Ma, Yunxian, Yan, Yuting, You, Yuxiang, Liu, Yuxuan, Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Huang, Zhen, Zhang, Zhen, Xie, Zhenda, Hao, Zhewen, Shao, Zhihong, Wen, Zhiniu, Xu, Zhipeng, Zhang, Zhongyu, Li, Zhuoshu, Wang, Zihan, Gu, Zihui, Li, Zilin, Xie, Ziwei

arXiv.org Artificial IntelligenceJun-19-2024

We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.

deepseek-v2 chat, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2405.04434

Country:

Europe (1.00)
Asia (0.92)
North America > United States > New York (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Education (0.68)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

DeepSeek-AI, null, Zhu, Qihao, Guo, Daya, Shao, Zhihong, Yang, Dejian, Wang, Peiyi, Xu, Runxin, Wu, Y., Li, Yukun, Gao, Huazuo, Ma, Shirong, Zeng, Wangding, Bi, Xiao, Gu, Zihui, Xu, Hanwei, Dai, Damai, Dong, Kai, Zhang, Liyue, Piao, Yishi, Gou, Zhibin, Xie, Zhenda, Hao, Zhewen, Wang, Bingxuan, Song, Junxiao, Chen, Deli, Xie, Xin, Guan, Kang, You, Yuxiang, Liu, Aixin, Du, Qiushi, Gao, Wenjun, Lu, Xuan, Chen, Qinyu, Wang, Yaohui, Deng, Chengqi, Li, Jiashi, Zhao, Chenggang, Ruan, Chong, Luo, Fuli, Liang, Wenfeng

arXiv.org Artificial IntelligenceJun-17-2024

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks.

benchmark, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2406.11931

Country:

North America > Canada (0.14)
Europe > Spain (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

From Adoption to Adaption: Tracing the Diffusion of New Emojis on Twitter

Zhou, Yuhang, Lu, Xuan, Ai, Wei

arXiv.org Artificial IntelligenceFeb-21-2024

In the rapidly evolving landscape of social media, the introduction of new emojis in Unicode release versions presents a structured opportunity to explore digital language evolution. Analyzing a large dataset of sampled English tweets, we examine how newly released emojis gain traction and evolve in meaning. We find that community size of early adopters and emoji semantics are crucial in determining their popularity. Certain emojis experienced notable shifts in the meanings and sentiment associations during the diffusion process. Additionally, we propose a novel framework utilizing language models to extract words and pre-existing emojis with semantically similar contexts, which enhances interpretation of new emojis. The framework demonstrates its effectiveness in improving sentiment classification performance by substituting unknown new emojis with familiar ones. This study offers a new perspective in understanding how new language units are adopted, adapted, and integrated into the fabric of online communication.

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2402.14187

Country: North America > United States > Maryland (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.49)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)
(3 more...)

Add feedback

Emojis Decoded: Leveraging ChatGPT for Enhanced Understanding in Social Media Communications

Zhou, Yuhang, Xu, Paiheng, Wang, Xiyao, Lu, Xuan, Gao, Ge, Ai, Wei

arXiv.org Artificial IntelligenceJan-22-2024

Emojis, which encapsulate semantics beyond mere words or phrases, have become prevalent in social network communications. This has spurred increasing scholarly interest in exploring their attributes and functionalities. However, emoji-related research and application face two primary challenges. First, researchers typically rely on crowd-sourcing to annotate emojis in order to understand their sentiments, usage intentions, and semantic meanings. Second, subjective interpretations by users can often lead to misunderstandings of emojis and cause the communication barrier. Large Language Models (LLMs) have achieved significant success in various annotation tasks, with ChatGPT demonstrating expertise across multiple domains. In our study, we assess ChatGPT's effectiveness in handling previously annotated and downstream tasks. Our objective is to validate the hypothesis that ChatGPT can serve as a viable alternative to human annotators in emoji research and that its ability to explain emoji meanings can enhance clarity and transparency in online communications. Our findings indicate that ChatGPT has extensive knowledge of emojis. It is adept at elucidating the meaning of emojis across various application scenarios and demonstrates the potential to replace human annotators in a range of tasks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.01681

Country:

North America > United States > Maryland (0.14)
North America > United States > Arizona (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Emoji Promotes Developer Participation and Issue Resolution on GitHub

Zhou, Yuhang, Lu, Xuan, Gao, Ge, Mei, Qiaozhu, Ai, Wei

arXiv.org Artificial IntelligenceSep-7-2023

Although remote working is increasingly adopted during the pandemic, many are concerned by the low-efficiency in the remote working. Missing in text-based communication are non-verbal cues such as facial expressions and body language, which hinders the effective communication and negatively impacts the work outcomes. Prevalent on social media platforms, emojis, as alternative non-verbal cues, are gaining popularity in the virtual workspaces well. In this paper, we study how emoji usage influences developer participation and issue resolution in virtual workspaces. To this end, we collect GitHub issues for a one-year period and apply causal inference techniques to measure the causal effect of emojis on the outcome of issues, controlling for confounders such as issue content, repository, and author information. We find that emojis can significantly reduce the resolution time of issues and attract more user participation. We also compare the heterogeneous effect on different types of issues. These findings deepen our understanding of the developer communities, and they provide design implications on how to facilitate interactions and broaden developer participation.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2308.1636

Country:

North America > United States > Michigan (0.14)
North America > United States > Maryland (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Communications > Collaboration (0.86)

Add feedback

Team Resilience under Shock: An Empirical Analysis of GitHub Repositories during Early COVID-19 Pandemic

Lu, Xuan, Ai, Wei, Wang, Yixin, Mei, Qiaozhu

arXiv.org Artificial IntelligenceJan-28-2023

While many organizations have shifted to working remotely during the COVID-19 pandemic, how the remote workforce and the remote teams are influenced by and would respond to this and future shocks remain largely unknown. Software developers have relied on remote collaborations long before the pandemic, working in virtual teams (GitHub repositories). The dynamics of these repositories through the pandemic provide a unique opportunity to understand how remote teams react under shock. This work presents a systematic analysis. We measure the overall effect of the early pandemic on public GitHub repositories by comparing their sizes and productivity with the counterfactual outcomes forecasted as if there were no pandemic. We find that the productivity level and the number of active members of these teams vary significantly during different periods of the pandemic. We then conduct a finer-grained investigation and study the heterogeneous effects of the shock on individual teams. We find that the resilience of a team is highly correlated to certain properties of the team before the pandemic. Through a bootstrapped regression analysis, we reveal which types of teams are robust or fragile to the shock.

artificial intelligence, machine learning, repository, (18 more...)

arXiv.org Artificial Intelligence

2301.12326

Country:

North America > United States > Michigan (0.28)
North America > United States > Maryland (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.84)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.70)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Untangling Emoji Popularity Through Semantic Embeddings

AAAI ConferencesMay-11-2017

Emojis have gone viral on the Internet across platforms and devices. Interwoven into our daily communications, they have become a ubiquitous new language. However, little has been done to analyze the usage of emojis at scale and in depth. Why do some emojis become especially popular while others don't? How are people using them among the words? In this work, we take the initiative to study the collective usage and behavior of emojis, and specifically, how emojis interact with their context. We base our analysis on a very large corpus collected from a popular emoji keyboard, which contains a full month of inputs from millions of users. Our analysis is empowered by a state-of-the-art machine learning tool that computes the embeddings of emojis and words in a semantic space. We find that emojis with clear semantic meanings are more likely to be adopted. While entity-related emojis are more likely to be used as alternatives to words, sentiment-related emojis often play a complementary role in a message. Overall, emojis are significantly more prevalent in a sentimental context.

semantic embedding, untangling emoji popularity

AAAI Conferences

Eleventh International AAAI Conference on Web and Social Media

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Add feedback