Huang, Ting
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs
Ling Team, Zeng, Binwei, Huang, Chao, Zhang, Chao, Tian, Changxin, Chen, Cong, Jin, Dingnan, Yu, Feng, Zhu, Feng, Yuan, Feng, Wang, Fakang, Wang, Gangshan, Zhai, Guangyao, Zhang, Haitao, Li, Huizhong, Zhou, Jun, Liu, Jia, Fang, Junpeng, Ou, Junjie, Hu, Jun, Luo, Ji, Zhang, Ji, Liu, Jian, Sha, Jian, Qian, Jianxue, Wu, Jiewei, Zhao, Junping, Li, Jianguo, Feng, Jubao, Di, Jingchao, Xu, Junming, Yao, Jinghua, Xu, Kuan, Du, Kewei, Li, Longfei, Liang, Lei, Yu, Lu, Tang, Li, Ju, Lin, Xu, Peng, Cui, Qing, Liu, Song, Li, Shicheng, Song, Shun, Yan, Song, Cai, Tengwei, Chen, Tianyi, Guo, Ting, Huang, Ting, Feng, Tao, Wu, Tao, Wu, Wei, Zhang, Xiaolu, Yang, Xueming, Zhao, Xin, Hu, Xiaobo, Lin, Xin, Zhao, Yao, Wang, Yilong, Guo, Yongzhen, Wang, Yuanyuan, Yang, Yue, Cao, Yang, Fu, Yuhao, Xiong, Yi, Li, Yanzhe, Li, Zhe, Zhang, Zhiqiang, Liu, Ziqi, Huan, Zhaoxin, Wen, Zujie, Sun, Zhenhang, Du, Zhuoxuan, He, Zhengyu
In this technical report, we tackle the challenges of training large-scale Mixture of Experts (MoE) models, focusing on overcoming the cost inefficiency and resource limitations prevalent in such systems. To address these issues, we present two MoE large language models (LLMs) of different sizes, Ling-Lite and Ling-Plus (known as "Bailing" in Chinese, Bǎilíng in Pinyin). Ling-Lite contains 16.8 billion parameters with 2.75 billion activated parameters, while Ling-Plus contains 290 billion parameters with 28.8 billion activated parameters. Both models achieve performance comparable to leading industry models. This report offers actionable insights for improving the efficiency and accessibility of AI development in resource-constrained settings, promoting more scalable and sustainable technologies. Specifically, to reduce training costs for large-scale MoE models, we propose innovative methods for (1) optimizing the model architecture and training process, (2) refining the handling of training anomalies, and (3) improving model-evaluation efficiency. Additionally, leveraging high-quality data generated from knowledge graphs, our models demonstrate superior tool-use capabilities compared to other models. Ultimately, our experiments show that a 300B MoE LLM can be effectively trained on lower-performance devices while achieving performance comparable to models of a similar scale, both dense and MoE. Relative to high-performance devices, using lower-specification hardware during the pre-training phase yields significant savings, cutting computing costs by approximately 20%. The models can be accessed at https://huggingface.co/inclusionAI.
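The headline figures above (16.8B total vs. 2.75B activated parameters for Ling-Lite) follow from sparse expert routing: each token passes through only a few experts, so only a fraction of the parameters is active per forward pass. Below is a minimal sketch of top-k gating that illustrates this; the expert count, layer sizes, and routing details are illustrative assumptions, not Ling's published configuration.

```python
# Minimal sketch of a top-k gated MoE layer (illustrative sizes, not Ling's).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.gate(x)                       # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # route each token to k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out
```

With n_experts=64 and k=2, only 2 of 64 expert FFNs run per token, which is why a model's activated parameter count can be roughly an order of magnitude smaller than its total.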
Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction
Yue, Shengbin, Huang, Ting, Jia, Zheng, Wang, Siyuan, Liu, Shujun, Song, Yun, Huang, Xuanjing, Wei, Zhongyu
Large Language Models (LLMs) have significantly advanced legal intelligence, but the scarcity of scenario data impedes progress toward interactive legal scenarios. This paper introduces a Multi-agent Legal Simulation Driver (MASER) to scalably generate synthetic data by simulating interactive legal scenarios. Leveraging real legal case sources, MASER ensures the consistency of legal attributes between participants and introduces a supervisory mechanism to align participants' characters and behaviors and to address distractions. A Multi-stage Interactive Legal Evaluation (MILE) benchmark is further constructed to evaluate LLMs' performance in dynamic legal scenarios. Extensive experiments confirm the effectiveness of our framework.
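For a concrete picture of the kind of driver the abstract describes, a minimal sketch of a role-playing simulation loop with a supervisory check follows; all class and function names (LLMAgent, supervise, simulate) are hypothetical, since the paper's actual interfaces are not given here.

```python
# Hypothetical sketch of a multi-agent legal simulation loop with a
# supervisory mechanism; not MASER's actual implementation.
from dataclasses import dataclass, field

@dataclass
class LLMAgent:
    role: str                       # e.g. "client", "lawyer"
    profile: str                    # legal attributes drawn from a real case
    history: list = field(default_factory=list)

    def respond(self, message: str) -> str:
        # Placeholder for an LLM call conditioned on role, profile, history.
        self.history.append(message)
        return f"[{self.role}] reply to: {message[:40]}"

def supervise(agent: LLMAgent, reply: str) -> str:
    # Placeholder consistency check: re-prompt if the reply drifts from the
    # agent's assigned character (the supervisory mechanism's job).
    if agent.role not in reply:
        return agent.respond("Stay in character and answer again.")
    return reply

def simulate(case: str, turns: int = 4) -> list:
    client = LLMAgent("client", profile=case)
    lawyer = LLMAgent("lawyer", profile=case)
    transcript, msg = [], case
    for _ in range(turns):
        for agent in (lawyer, client):
            msg = supervise(agent, agent.respond(msg))
            transcript.append((agent.role, msg))
    return transcript                # synthetic dialogue data
```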
RLEMMO: Evolutionary Multimodal Optimization Assisted By Deep Reinforcement Learning
Lian, Hongqiao, Ma, Zeyuan, Guo, Hongshu, Huang, Ting, Gong, Yue-Jiao
Solving multimodal optimization problems (MMOPs) requires finding all optimal solutions, which is challenging under a limited budget of function evaluations. Although existing works balance exploration and exploitation through hand-crafted adaptive strategies, they require expert knowledge and are therefore inflexible when dealing with MMOPs of differing properties. In this paper, we propose RLEMMO, a Meta-Black-Box Optimization framework that maintains a population of solutions and incorporates a reinforcement learning agent to flexibly adjust individual-level search strategies to match the current optimization status, thereby boosting search performance on MMOPs. Concretely, we encode landscape properties and evolution-path information into each individual and leverage attention networks to promote information sharing across the population. With a novel reward mechanism that encourages both quality and diversity, RLEMMO can be effectively trained using a policy-gradient algorithm. Experimental results on the CEC2013 MMOP benchmark underscore RLEMMO's competitive optimization performance against several strong baselines.
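The reward mechanism the abstract mentions, encouraging both quality and diversity, can be pictured as below. The paper's exact formulation is not reproduced here, so the linear weighting and the mean-pairwise-distance diversity measure are assumptions for illustration.

```python
# Sketch of a quality-plus-diversity reward for an RL-driven optimizer;
# the weighting and diversity measure are assumptions, not RLEMMO's exact reward.
import numpy as np

def mean_pairwise_distance(pop: np.ndarray) -> float:
    """Average distance between distinct individuals, a simple diversity proxy."""
    d = np.linalg.norm(pop[:, None, :] - pop[None, :, :], axis=-1)
    n = len(pop)
    return d.sum() / (n * (n - 1))

def reward(pop, fit, new_pop, new_fit, w=0.5) -> float:
    """Reward improvement in best fitness (quality) and in population
    spread (diversity), so the policy is not pushed toward a single peak."""
    quality_gain = new_fit.max() - fit.max()
    diversity_gain = mean_pairwise_distance(new_pop) - mean_pairwise_distance(pop)
    return w * quality_gain + (1 - w) * diversity_gain
```

Rewarding diversity alongside quality matters for MMOPs specifically because the goal is to locate all optima, not just the global best.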
InternLM2 Technical Report
Cai, Zheng, Cao, Maosong, Chen, Haojiong, Chen, Kai, Chen, Keyu, Chen, Xin, Chen, Xun, Chen, Zehui, Chen, Zhi, Chu, Pei, Dong, Xiaoyi, Duan, Haodong, Fan, Qi, Fei, Zhaoye, Gao, Yang, Ge, Jiaye, Gu, Chenya, Gu, Yuzhe, Gui, Tao, Guo, Aijia, Guo, Qipeng, He, Conghui, Hu, Yingfan, Huang, Ting, Jiang, Tao, Jiao, Penglong, Jin, Zhenjiang, Lei, Zhikai, Li, Jiaxing, Li, Jingwen, Li, Linyang, Li, Shuaibin, Li, Wei, Li, Yining, Liu, Hongwei, Liu, Jiangning, Hong, Jiawei, Liu, Kaiwen, Liu, Kuikun, Liu, Xiaoran, Lv, Chengqi, Lv, Haijun, Lv, Kai, Ma, Li, Ma, Runyuan, Ma, Zerun, Ning, Wenchang, Ouyang, Linke, Qiu, Jiantao, Qu, Yuan, Shang, Fukai, Shao, Yunfan, Song, Demin, Song, Zifan, Sui, Zhihao, Sun, Peng, Sun, Yu, Tang, Huanze, Wang, Bin, Wang, Guoteng, Wang, Jiaqi, Wang, Jiayu, Wang, Rui, Wang, Yudong, Wang, Ziyi, Wei, Xingjian, Weng, Qizhen, Wu, Fan, Xiong, Yingtong, Xu, Chao, Xu, Ruiliang, Yan, Hang, Yan, Yirong, Yang, Xiaogui, Ye, Haochen, Ying, Huaiyuan, Yu, Jia, Yu, Jing, Zang, Yuhang, Zhang, Chuyu, Zhang, Li, Zhang, Pan, Zhang, Peng, Zhang, Ruijie, Zhang, Shuo, Zhang, Songyang, Zhang, Wenjian, Zhang, Wenwei, Zhang, Xingcheng, Zhang, Xinyue, Zhao, Hui, Zhao, Qian, Zhao, Xiaomeng, Zhou, Fengzhe, Zhou, Zaida, Zhuo, Jingming, Zou, Yicheng, Qiu, Xipeng, Qiao, Yu, Lin, Dahua
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, in long-context modeling, and in open-ended subjective evaluations, through innovative pre-training and optimization techniques. The pre-training process of InternLM2 is meticulously detailed, highlighting the preparation of diverse data types including text, code, and long-context data. InternLM2 efficiently captures long-term dependencies: it is initially trained on 4k tokens before advancing to 32k tokens in the pre-training and fine-tuning stages, and it exhibits remarkable performance on the 200k "Needle-in-a-Haystack" test. InternLM2 is further aligned using Supervised Fine-Tuning (SFT) and a novel Conditional Online Reinforcement Learning from Human Feedback (COOL RLHF) strategy that addresses conflicting human preferences and reward hacking. By releasing InternLM2 models at different training stages and model sizes, we provide the community with insights into the model's evolution.
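On the staged context extension (4k then 32k), a standard recipe for such long-context training is to enlarge the rotary position embedding (RoPE) base when moving to the longer stage, so that distant positions remain distinguishable. The sketch below shows that adjustment; the specific base values are illustrative assumptions rather than InternLM2's exact reported configuration.

```python
# Sketch of RoPE frequency tables for a two-stage context extension;
# base values are illustrative, not InternLM2's confirmed settings.
import torch

def rope_frequencies(dim: int, max_len: int, base: float = 10000.0):
    """Precompute cos/sin tables for rotary position embeddings."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    t = torch.arange(max_len).float()
    freqs = torch.outer(t, inv_freq)        # (max_len, dim // 2)
    return torch.cos(freqs), torch.sin(freqs)

# Stage 1: pre-train at 4k context with the conventional base.
cos4k, sin4k = rope_frequencies(dim=128, max_len=4096, base=10_000.0)

# Stage 2: continue training at 32k with a larger base, a common
# long-context recipe that slows the rotation of low-frequency dimensions.
cos32k, sin32k = rope_frequencies(dim=128, max_len=32768, base=1_000_000.0)
```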