Lu, Su
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
Xie, Congkai, Cai, Shuo, Wang, Wenjun, Li, Pengxiang, Sang, Zhijie, Yang, Kejing, Zhang, Yiming, Li, Zhen, Zhu, Guanghao, Liu, Zeyu, Yu, Yang, Liu, Yuhang, Lu, Su, He, Baoyi, Zhou, Qi, Han, Xiaotian, Yuan, Jianbo, Zhang, Shengyu, Wu, Fei, Yang, Hongxia
Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have made significant advances in reasoning capabilities. However, they still face challenges such as high computational demands and privacy concerns. This paper focuses on developing efficient Small Language Models (SLMs) and Multimodal Small Language Models (MSLMs) that retain competitive reasoning abilities. We introduce a novel training pipeline that enhances reasoning capabilities and facilitates deployment on edge devices, achieving state-of-the-art performance while minimizing development costs. InfiR aims to advance AI systems by improving reasoning, reducing adoption barriers, and addressing privacy concerns through smaller model sizes. Resources are available at https://github.com/Reallm-Labs/InfiR.
Support-Target Protocol for Meta-Learning
Lu, Su, Ye, Han-Jia, Zhan, De-Chuan
The support/query (S/Q) training protocol is widely used in meta-learning. The S/Q protocol trains a task-specific model on the support set S and then evaluates it on the query set Q, optimizing the meta-model with the query loss, which depends on the size and quality of Q. In this paper, we study a new support/target (S/T) protocol for meta-learning. Assuming we have access to the theoretically optimal model T for a task, we can directly match the task-specific model trained on S to T. The S/T protocol offers a more accurate evaluation since it does not rely on possibly biased and noisy query instances. Two challenges arise in putting the S/T protocol into practice. First, we must determine how to match the task-specific model to T. To this end, we minimize the discrepancy between them on a fictitious dataset generated by adversarial learning, distilling the prediction ability of T into the task-specific model. Second, ready-made optimal models are usually unavailable. As an alternative, we construct surrogate target models by fine-tuning the globally pre-trained meta-model on local tasks, maintaining both efficiency and veracity.
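The matching step above can be illustrated with a minimal linear-regression sketch: a surrogate target T is obtained by fitting the full local task (standing in for fine-tuning the meta-model), a task-specific student is fit on the small support set S, and the student is then matched to T by minimizing their prediction discrepancy on a probe set. All function names here are hypothetical, and fixed random probes replace the adversarially generated fictitious dataset for brevity; this is a toy analogue, not the paper's implementation.

```python
import numpy as np

def fit_ridge(x, y, lam=1e-3):
    """Ridge least-squares fit; stands in for training or fine-tuning a model."""
    d = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(d), x.T @ y)

def distill_to_target(w_student, w_target, probe_x, lr=0.1, steps=200):
    """Match the task-specific (student) model to the target T by gradient
    descent on the mean squared prediction gap over a probe set.
    (The paper generates this set adversarially; random probes here.)"""
    w = w_student.copy()
    n = len(probe_x)
    for _ in range(steps):
        gap = probe_x @ (w - w_target)        # prediction differences on probes
        w -= lr * 2 * probe_x.T @ gap / n     # gradient of the mean squared gap
    return w

# Toy demo: T = model fit on the full local task; student = fit on S only.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
x_task = rng.normal(size=(100, 5))
y_task = x_task @ w_true
x_support, y_support = x_task[:3], y_task[:3]   # tiny support set S

w_target = fit_ridge(x_task, y_task)            # surrogate optimal model T
w_student = fit_ridge(x_support, y_support)     # task-specific model on S
probe = rng.normal(size=(64, 5))                # fictitious matching inputs
w_matched = distill_to_target(w_student, w_target, probe)
```

Because the probe matrix is full rank, the discrepancy objective is strongly convex and the matched student converges toward T's predictions.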
Few-Shot Action Recognition with Compromised Metric via Optimal Transport
Lu, Su, Ye, Han-Jia, Zhan, De-Chuan
Although vital to computer vision systems, few-shot action recognition remains immature despite extensive research on few-shot image classification. Popular few-shot learning algorithms extract a transferable embedding from seen classes and reuse it on unseen classes by constructing a metric-based classifier. One main obstacle to applying these algorithms to action recognition is the complex structure of videos. Some existing solutions sample frames from a video and aggregate their embeddings into a video-level representation, neglecting important temporal relations. Others perform an explicit sequence matching between two videos and define their distance as the matching cost, imposing overly strong restrictions on sequence ordering. In this paper, we propose Compromised Metric via Optimal Transport (CMOT) to combine the advantages of these two solutions. CMOT simultaneously considers semantic and temporal information in videos under the optimal transport framework, and is discriminative for both content-sensitive and ordering-sensitive tasks. In detail, given two videos, we sample segments from them and cast the calculation of their distance as an optimal transport problem between the two segment sequences. To preserve the inherent temporal ordering information, we additionally amend the ground cost matrix by penalizing it with the positional distance between each pair of segments. Empirical results on benchmark datasets demonstrate the superiority of CMOT.
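The distance described above can be sketched as follows: build a ground cost from the semantic (cosine) distance between segment embeddings, add a positional penalty on each segment pair, and solve the resulting transport problem. This is a minimal illustration, not the paper's implementation; the weight `lam` on the positional term and the use of entropic (Sinkhorn) regularization with uniform marginals are assumptions made for brevity.

```python
import numpy as np

def cmot_distance(x, y, lam=0.5, reg=0.1, n_iter=200):
    """CMOT-style distance between two videos given as segment embeddings.

    x: (n, d) segment embeddings of the first video
    y: (m, d) segment embeddings of the second video
    lam: weight of the positional (temporal-order) penalty -- assumed knob
    reg, n_iter: entropic regularization and Sinkhorn iterations -- assumed
    """
    n, m = len(x), len(y)
    # Semantic ground cost: cosine distance between segment embeddings.
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - xn @ yn.T
    # Amend the ground cost with the normalized positional distance,
    # penalizing matches that violate temporal ordering.
    pos = np.abs(np.arange(n)[:, None] / n - np.arange(m)[None, :] / m)
    cost = cost + lam * pos
    # Entropic optimal transport via Sinkhorn iterations, uniform marginals.
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]   # transport plan between segments
    return float((plan * cost).sum())    # transported ground cost
```

With the positional term, a video compared against a temporally reversed copy of itself incurs a larger distance than against an identically ordered copy, which is the ordering sensitivity the abstract describes.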