AITopics | Dou, Fei

Collaborating Authors

Dou, Fei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization

Zhao, Huaqin, Li, Jiaxi, Pan, Yi, Liang, Shizhe, Yang, Xiaofeng, Liu, Wei, Li, Xiang, Dou, Fei, Liu, Tianming, Lu, Jin

arXiv.org Artificial IntelligenceNov-15-2024

Fine-tuning large language models (LLMs) poses significant memory challenges, as the back-propagation process demands extensive resources, especially with growing model sizes. Recent work, MeZO, addresses this issue using a zerothorder (ZO) optimization method, which reduces memory consumption by matching the usage to the inference phase. To overcome this limitation, we introduce HELENE, a novel scalable and memory-efficient optimizer that integrates annealed A-GNB gradients with a diagonal Hessian estimation and layerwise clipping, serving as a second-order pre-conditioner. This combination allows for faster and more stable convergence. Our theoretical analysis demonstrates that HELENE improves convergence rates, particularly for models with heterogeneous layer dimensions, by reducing the dependency on the total parameter space dimension. Furthermore, HELENE remains compatible with both full parameter tuning and parameter-efficient fine-tuning (PEFT), outperforming several state-of-the-art optimizers. The codes will be released after reviewing. LLMs have demonstrated remarkable capabilities across various downstream tasks. Fine-tuning these models has become the standard approach for improving task-specific performance, in which the firstorder optimizers like Stochastic Gradient Descent (SGD) (Robbins & Monro, 1951), Adam (Diederik, 2014) and AdamW (Hutter & Loshchilov, 2017) are widely used.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.10696

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

Add feedback

A Systematic Assessment of OpenAI o1-Preview for Higher Order Thinking in Education

Latif, Ehsan, Zhou, Yifan, Guo, Shuchen, Gao, Yizhu, Shi, Lehong, Nayaaba, Matthew, Lee, Gyeonggeon, Zhang, Liang, Bewersdorff, Arne, Fang, Luyang, Yang, Xiantong, Zhao, Huaqin, Jiang, Hanqi, Lu, Haoran, Li, Jiaxi, Yu, Jichao, You, Weihang, Liu, Zhengliang, Liu, Vincent Shung, Wang, Hui, Wu, Zihao, Lu, Jin, Dou, Fei, Ma, Ping, Liu, Ninghao, Liu, Tianming, Zhai, Xiaoming

arXiv.org Artificial IntelligenceOct-11-2024

As artificial intelligence (AI) continues to advance, it demonstrates capabilities comparable to human intelligence, with significant potential to transform education and workforce development. This study evaluates OpenAI o1-preview's ability to perform higher-order cognitive tasks across 14 dimensions, including critical thinking, systems thinking, computational thinking, design thinking, metacognition, data literacy, creative thinking, abstract reasoning, quantitative reasoning, logical reasoning, analogical reasoning, and scientific reasoning. We used validated instruments like the Ennis-Weir Critical Thinking Essay Test and the Biological Systems Thinking Test to compare the o1-preview's performance with human performance systematically. Our findings reveal that o1-preview outperforms humans in most categories, achieving 150% better results in systems thinking, computational thinking, data literacy, creative thinking, scientific reasoning, and abstract reasoning. However, compared to humans, it underperforms by around 25% in logical reasoning, critical thinking, and quantitative reasoning. In analogical reasoning, both o1-preview and humans achieved perfect scores. Despite these strengths, the o1-preview shows limitations in abstract reasoning, where human psychology students outperform it, highlighting the continued importance of human oversight in tasks requiring high-level abstraction. These results have significant educational implications, suggesting a shift toward developing human skills that complement AI, such as creativity, abstract reasoning, and critical thinking. This study emphasizes the transformative potential of AI in education and calls for a recalibration of educational goals, teaching methods, and curricula to align with an AI-driven world.

large language model, machine learning, natural language, (24 more...)

arXiv.org Artificial Intelligence

2410.21287

Country:

Europe (1.00)
Asia > China (1.00)
North America > Canada > Alberta (0.45)
North America > United States > Arizona (0.27)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(4 more...)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(22 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Add feedback

Mobilizing Personalized Federated Learning in Infrastructure-Less and Heterogeneous Environments via Random Walk Stochastic ADMM

Parsons, Ziba, Dou, Fei, Du, Houyi, Song, Zheng, Lu, Jin

arXiv.org Artificial IntelligenceSep-26-2023

This paper explores the challenges of implementing Federated Learning (FL) in practical scenarios featuring isolated nodes with data heterogeneity, which can only be connected to the server through wireless links in an infrastructure-less environment. To overcome these challenges, we propose a novel mobilizing personalized FL approach, which aims to facilitate mobility and resilience. Specifically, we develop a novel optimization algorithm called Random Walk Stochastic Alternating Direction Method of Multipliers (RWSADMM). RWSADMM capitalizes on the server's random movement toward clients and formulates local proximity among their adjacent clients based on hard inequality constraints rather than requiring consensus updates or introducing bias via regularization methods. To mitigate the computational burden on the clients, an efficient stochastic solver of the approximated optimization problem is designed in RWSADMM, which provably converges to the stationary point almost surely in expectation. Our theoretical and empirical results demonstrate the provable fast convergence and substantial accuracy improvements achieved by RWSADMM compared to baseline methods, along with its benefits of reduced communication costs and enhanced scalability.

artificial intelligence, infrastructure-less and heterogeneous environment, optimization problem, (3 more...)

arXiv.org Artificial Intelligence

2304.12534

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)

Add feedback

Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges

Dou, Fei, Ye, Jin, Yuan, Geng, Lu, Qin, Niu, Wei, Sun, Haijian, Guan, Le, Lu, Guoyu, Mai, Gengchen, Liu, Ninghao, Lu, Jin, Liu, Zhengliang, Wu, Zihao, Tan, Chenjiao, Xu, Shaochen, Wang, Xianqiao, Li, Guoming, Chai, Lilong, Li, Sheng, Sun, Jin, Sun, Hongyue, Shao, Yunli, Li, Changying, Liu, Tianming, Song, Wenzhan

arXiv.org Artificial IntelligenceSep-14-2023

Artificial General Intelligence (AGI), possessing the capacity to comprehend, learn, and execute tasks with human cognitive abilities, engenders significant anticipation and intrigue across scientific, commercial, and societal arenas. This fascination extends particularly to the Internet of Things (IoT), a landscape characterized by the interconnection of countless devices, sensors, and systems, collectively gathering and sharing data to enable intelligent decision-making and automation. This research embarks on an exploration of the opportunities and challenges towards achieving AGI in the context of the IoT. Specifically, it starts by outlining the fundamental principles of IoT and the critical role of Artificial Intelligence (AI) in IoT systems. Subsequently, it delves into AGI fundamentals, culminating in the formulation of a conceptual framework for AGI's seamless integration within IoT. The application spectrum for AGI-infused IoT is broad, encompassing domains ranging from smart grids, residential environments, manufacturing, and transportation to environmental monitoring, agriculture, healthcare, and education. However, adapting AGI to resource-constrained IoT settings necessitates dedicated research efforts. Furthermore, the paper addresses constraints imposed by limited computing resources, intricacies associated with large-scale IoT communication, as well as the critical concerns pertaining to security and privacy.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2309.07438

Country:

Europe (1.00)
North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)
North America > United States > California > Santa Clara County (0.14)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Smart Houses & Appliances (1.00)
Information Technology > Security & Privacy (1.00)
(11 more...)

Technology:

Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(7 more...)

Add feedback