AITopics | Hu, Wenjie

Hu, Wenjie

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

DeepSeek-AI: Bi, Xiao, Chen, Deli, Chen, Guanting, Chen, Shanhuang, Dai, Damai, Deng, Chengqi, Ding, Honghui, Dong, Kai, Du, Qiushi, Fu, Zhe, Gao, Huazuo, Gao, Kaige, Gao, Wenjun, Ge, Ruiqi, Guan, Kang, Guo, Daya, Guo, Jianzhong, Hao, Guangbo, Hao, Zhewen, He, Ying, Hu, Wenjie, Huang, Panpan, Li, Erhang, Li, Guowei, Li, Jiashi, Li, Yao, Li, Y. K., Liang, Wenfeng, Lin, Fangyun, Liu, A. X., Liu, Bo, Liu, Wen, Liu, Xiaodong, Liu, Xin, Liu, Yiyuan, Lu, Haoyu, Lu, Shanghao, Luo, Fuli, Ma, Shirong, Nie, Xiaotao, Pei, Tian, Piao, Yishi, Qiu, Junjie, Qu, Hui, Ren, Tongzheng, Ren, Zehui, Ruan, Chong, Sha, Zhangli, Shao, Zhihong, Song, Junxiao, Su, Xuecheng, Sun, Jingxiang, Sun, Yaofeng, Tang, Minghui, Wang, Bingxuan, Wang, Peiyi, Wang, Shiyu, Wang, Yaohui, Wang, Yongji, Wu, Tong, Wu, Y., Xie, Xin, Xie, Zhenda, Xie, Ziwei, Xiong, Yiliang, Xu, Hanwei, Xu, R. X., Xu, Yanhong, Yang, Dejian, You, Yuxiang, Yu, Shuiping, Yu, Xingkai, Zhang, B., Zhang, Haowei, Zhang, Lecong, Zhang, Liyue, Zhang, Mingchuan, Zhang, Minghua, Zhang, Wentao, Zhang, Yichao, Zhao, Chenggang, Zhao, Yao, Zhou, Shangyan, Zhou, Shunfeng, Zhu, Qihao, Zou, Yuheng

arXiv.org Artificial Intelligence, Jan-5-2024

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

large language model, machine learning, natural language, (14 more...)

2401.02954

Country:

Europe (1.00)
North America > United States > New York (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Law (1.00)
Education (1.00)
Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Breaking the Curse of Quality Saturation with User-Centric Ranking

Zhao, Zhuokai, Yang, Yang, Wang, Wenyu, Liu, Chihuang, Shi, Yu, Hu, Wenjie, Zhang, Haotian, Yang, Shuang

arXiv.org Artificial Intelligence, May-24-2023

A key puzzle in search, ads, and recommendation is that the ranking model can only utilize a small portion of the vastly available user interaction data. As a result, increasing data volume, model size, or computation FLOPs will quickly suffer from diminishing returns. We examined this problem and found that one of the root causes may lie in the so-called "item-centric" formulation, which has an unbounded vocabulary and thus uncontrolled model complexity. To mitigate quality saturation, we introduce an alternative formulation named "user-centric ranking", which is based on a transposed view of the dyadic user-item interaction data. We show that this formulation has a promising scaling property, enabling us to train better-converged models on substantially larger data sets.
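The "transposed view" of dyadic user-item data mentioned in the abstract can be illustrated with a minimal sketch. It assumes interactions arrive as (user, item) pairs; the function names and sample data are hypothetical, and the paper's actual formulation may differ in detail.

```python
from collections import defaultdict

def item_centric(interactions):
    """Group by user: each user is described by the items they touched."""
    by_user = defaultdict(set)
    for user, item in interactions:
        by_user[user].add(item)
    return dict(by_user)

def user_centric(interactions):
    """Transposed view: each item is described by the users who touched it."""
    by_item = defaultdict(set)
    for user, item in interactions:
        by_item[item].add(user)
    return dict(by_item)

# Hypothetical interaction log, for illustration only.
interactions = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "c")]
by_user = item_centric(interactions)
by_item = user_centric(interactions)
```

Both views hold the same pairs. What changes is which side acts as the vocabulary: item identifiers in the first view, user identifiers in the second.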

artificial intelligence, machine learning, natural language, (16 more...)

2305.15333

Country: North America > United States (0.47)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Communications > Social Media (0.69)
(2 more...)

Students' Voices on Generative AI: Perceptions, Benefits, and Challenges in Higher Education

Chan, Cecilia Ka Yuk, Hu, Wenjie

arXiv.org Artificial Intelligence, Apr-29-2023

This study explores university students' perceptions of generative AI (GenAI) technologies, such as ChatGPT, in higher education, focusing on familiarity, their willingness to engage, potential benefits and challenges, and effective integration. A survey of 399 undergraduate and postgraduate students from various disciplines in Hong Kong revealed a generally positive attitude towards GenAI in teaching and learning. Students recognized the potential for personalized learning support, writing and brainstorming assistance, and research and analysis capabilities. However, concerns about accuracy, privacy, ethical issues, and the impact on personal development, career prospects, and societal values were also expressed. According to John Biggs' 3P model, student perceptions significantly influence learning approaches and outcomes. By understanding students' perceptions, educators and policymakers can tailor GenAI technologies to address needs and concerns while promoting effective learning outcomes. Insights from this study can inform policy development around the integration of GenAI technologies into higher education. By understanding students' perceptions and addressing their concerns, policymakers can create well-informed guidelines and strategies for the responsible and effective implementation of GenAI tools, ultimately enhancing teaching and learning experiences in higher education.

artificial intelligence, machine learning, natural language, (15 more...)

2305.0029

Country: Asia > China > Hong Kong (0.25)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.93)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Modeling Combinatorial Evolution in Time Series Prediction

Hu, Wenjie, Yang, Yang, You, Zilong, Liu, Zongtao, Ren, Xiang

arXiv.org Machine Learning, May-20-2019

Time series modeling aims to capture the intrinsic factors underpinning observed data and its evolution. However, most existing studies ignore the evolutionary relations among these factors, which are what cause the combinatorial evolution of a given time series. For example, personal interests are intrinsic factors hidden behind users' observable online shopping behaviors; consequently, a precise item recommendation depends not only on discovering the item-interest relationship, but also on an understanding of how user interests shift over time. In this paper, we propose to represent complex and dynamic relations among intrinsic factors of time series data by means of an evolutionary state graph structure.

For instance, earthquake wave is the observation of crustal movements, while different actions like running and walking will cause differences in observations of a fitness-tracking device. Moreover, in practice, we often observe the combinatorial evolution of data; that is, the observed time series being covered by the influence of multiple factors, and especially the relations among these factors. For example, an earthquake is the result of quick transitions from smooth movements in the Earth's crust to intense ones, which cause a sudden release of energy in the Earth's crust. Meanwhile, observing one's online shopping logs, precise item recommendations rely on tracing and understanding the shift ...

deep learning, neural network, time series, (19 more...)

1905.05006

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Capturing Evolution Genes for Time Series Data

Hu, Wenjie, Yang, Yang, Wu, Liang, Liu, Zongtao, Sun, Zhanlin, Yao, Bingshen

arXiv.org Machine Learning, May-10-2019

The modeling of time series is becoming increasingly critical in a wide variety of applications. Overall, data evolves by following different patterns, which are generally caused by different user behaviors. Given a time series, we define the evolution gene to capture the latent user behaviors and to describe how the behaviors lead to the generation of time series. In particular, we propose a uniform framework that recognizes different evolution genes of segments by learning a classifier, and adopts an adversarial generator to implement the evolution gene by estimating the segments' distribution. Experimental results based on a synthetic dataset and five real-world datasets show that our approach can not only achieve good prediction results (e.g., an average gain of 10.56% in terms of F1), but is also able to provide explanations of the results.

dataset, deep learning, neural network, (20 more...)

1905.05004

Country: North America > United States (0.30)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Representation Learning for Scale-Free Networks

Feng, Rui (Zhejiang University) | Yang, Yang (Zhejiang University) | Hu, Wenjie (Zhejiang University) | Wu, Fei (Zhejiang University) | Zhang, Yueting (Zhejiang University)

AAAI Conferences, Feb-8-2018

Network embedding aims to learn low-dimensional representations of the vertexes in a network while preserving the network's structure and inherent properties. Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored. The scale-free property describes the fact that vertex degrees follow a heavy-tailed distribution (i.e., only a few vertexes have high degrees) and is a critical property of real-world networks, such as social networks. In this paper, we study the problem of learning representations for scale-free networks. We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in the Euclidean space, by converting our problem to the sphere packing problem. Then, we propose the "degree penalty" principle for designing network embedding algorithms that preserve the scale-free property: penalizing the proximity between high-degree vertexes. We introduce two implementations of our principle, utilizing spectral techniques and a skip-gram model respectively. Extensive experiments on six datasets show that our algorithms are able to not only reconstruct heavy-tailed degree distributions, but also outperform state-of-the-art embedding models in various network mining tasks, such as vertex classification and link prediction.
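As a rough illustration of the "degree penalty" idea, the sketch below down-weights the proximity of an edge according to the degrees of its endpoints. The weighting formula, the exponent `beta`, and the toy graph are assumptions made for illustration only. This is not the paper's spectral or skip-gram implementation.

```python
from collections import defaultdict

def degrees(edges):
    """Count the degree of every vertex in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def penalized_proximity(edges, beta=0.5):
    """Assumed penalty: proximity of edge (u, v) is 1 / (deg_u * deg_v) ** beta,
    so an edge between two high-degree vertexes gets a smaller value."""
    deg = degrees(edges)
    return {(u, v): 1.0 / (deg[u] * deg[v]) ** beta for u, v in edges}

# Toy graph: two hubs (h1, h2) linked to each other, plus low-degree vertexes.
edges = [
    ("h1", "a"), ("h1", "b"), ("h1", "c"),
    ("h2", "d"), ("h2", "e"), ("h2", "f"),
    ("h1", "h2"), ("a", "b"),
]
deg = degrees(edges)
prox = penalized_proximity(edges)
```

With `beta=0.5`, the hub-to-hub edge receives a lower proximity than the edge between the two low-degree vertexes `a` and `b`, which matches the direction of the penalty described in the abstract.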

artificial intelligence, information technology services, representation, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.48)

Technology: