Diao, Muxi
From Simple to Professional: A Combinatorial Controllable Image Captioning Agent
Wang, Xinran, Diao, Muxi, Li, Baoteng, Zhang, Haiwen, Liang, Kongming, Ma, Zhanyu
The Controllable Image Captioning Agent (CapAgent) is an innovative system designed to bridge the gap between simple user input and professional-level output in image captioning tasks. CapAgent automatically transforms simple user-provided instructions into detailed, professional instructions, enabling precise and context-aware caption generation. By leveraging multimodal large language models (MLLMs) and external tools such as object detection models and search engines, the system ensures that captions adhere to specified guidelines, including sentiment, keywords, focus, and formatting. CapAgent also exposes each step of the captioning process, showcasing its reasoning and tool usage to foster user trust and engagement.
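To make the described pipeline concrete, here is a minimal Python sketch of a CapAgent-style agent loop: a simple instruction is expanded into a professional one, tool calls ground the caption, and every step is logged for transparency. All function names and outputs (expand_instruction, detect_objects, generate_caption) are illustrative stand-ins, not the paper's actual implementation.

```python
# Hypothetical sketch of a CapAgent-style pipeline; the tool and MLLM
# calls below are placeholder stubs, not the paper's real API.
from dataclasses import dataclass, field

@dataclass
class CaptionRequest:
    image_path: str
    simple_instruction: str                    # e.g. "make it sound nice"
    trace: list = field(default_factory=list)  # reasoning/tool log shown to the user

def expand_instruction(req: CaptionRequest) -> str:
    """Stand-in for the MLLM call that turns a simple instruction into a
    professional one (sentiment, keywords, focus, formatting constraints)."""
    detailed = (f"Write an upbeat two-sentence caption for {req.image_path}; "
                "focus on the main landmark and include location keywords.")
    req.trace.append(("expand_instruction", detailed))
    return detailed

def detect_objects(req: CaptionRequest) -> list:
    """Stand-in for the external object-detection tool used for grounding."""
    objects = ["bridge", "river", "sunset"]    # placeholder detector output
    req.trace.append(("detect_objects", objects))
    return objects

def generate_caption(req: CaptionRequest, instruction: str, objects: list) -> str:
    """Stand-in for the final MLLM captioning call, conditioned on the
    professional instruction and the detected objects."""
    caption = f"A {', '.join(objects)} scene, captioned per: {instruction[:40]}..."
    req.trace.append(("generate_caption", caption))
    return caption

def cap_agent(image_path: str, simple_instruction: str) -> str:
    req = CaptionRequest(image_path, simple_instruction)
    instruction = expand_instruction(req)      # step 1: professionalize the request
    objects = detect_objects(req)              # step 2: ground with external tools
    caption = generate_caption(req, instruction, objects)  # step 3: caption
    for step, result in req.trace:             # step 4: transparent step-by-step trace
        print(f"[{step}] {result}")
    return caption

if __name__ == "__main__":
    print(cap_agent("bridge.jpg", "make it sound nice"))
```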
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
Song, Xiaoshuai, Diao, Muxi, Dong, Guanting, Wang, Zhengyang, Fu, Yujia, Qiao, Runqi, Wang, Zhexu, Fu, Dayuan, Wu, Huangxuan, Liang, Bin, Zeng, Weihao, Wang, Yejie, GongQue, Zhuoma, Yu, Jianing, Tan, Qiuna, Xu, Weiran
Computer Science (CS) stands as a testament to the intricacies of human intelligence, profoundly advancing the development of artificial intelligence and modern society. However, the current community of large language models (LLMs) overly focuses on benchmarks for analyzing specific foundational skills (e.g., mathematics and code generation), neglecting an all-round evaluation of the computer science field. To bridge this gap, we introduce CS-Bench, the first bilingual (Chinese-English) benchmark dedicated to evaluating the performance of LLMs in computer science. CS-Bench comprises approximately 5K meticulously curated test samples, covering 26 subfields across 4 key areas of computer science and encompassing a variety of task forms as well as divisions of knowledge and reasoning. Utilizing CS-Bench, we conduct a comprehensive evaluation of over 30 mainstream LLMs, revealing the relationship between CS performance and model scale. We also quantitatively analyze the causes of failures in existing LLMs and highlight directions for improvement, including knowledge supplementation and CS-specific reasoning. Further cross-capability experiments show a high correlation between LLMs' capabilities in computer science and their abilities in mathematics and coding. Moreover, expert LLMs specialized in mathematics and coding also demonstrate strong performance in several CS subfields. Looking ahead, we envision CS-Bench serving as a cornerstone for LLM applications in the CS field and paving new avenues in assessing LLMs' diverse reasoning capabilities. The CS-Bench data and evaluation code are available at https://github.com/csbench/csbench.
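As an illustration of how a benchmark structured this way might be consumed, below is a minimal, hypothetical evaluation-harness sketch: samples tagged with their CS area are scored and accuracy is aggregated per area. The field names and the tiny sample set are assumptions for illustration only; see https://github.com/csbench/csbench for the actual data format and evaluation code.

```python
# Hypothetical harness for a CS-Bench-style benchmark; field names
# ("area", "question", "answer") are illustrative assumptions.
from collections import defaultdict

samples = [  # tiny stand-in for the ~5K curated test samples
    {"area": "Data Structures and Algorithms",
     "question": "Worst-case time complexity of quicksort?", "answer": "O(n^2)"},
    {"area": "Computer Network",
     "question": "Which layer does TCP belong to?", "answer": "Transport"},
]

def model_predict(question: str) -> str:
    """Stand-in for the LLM under evaluation."""
    return "O(n^2)"  # placeholder prediction

def evaluate(samples):
    """Score exact-match accuracy, aggregated per CS area."""
    correct, total = defaultdict(int), defaultdict(int)
    for s in samples:
        total[s["area"]] += 1
        if model_predict(s["question"]).strip() == s["answer"]:
            correct[s["area"]] += 1
    return {area: correct[area] / total[area] for area in total}

print(evaluate(samples))
```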
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
Wang, Yejie, He, Keqing, Dong, Guanting, Wang, Pei, Zeng, Weihao, Diao, Muxi, Mou, Yutao, Zhang, Mengdi, Wang, Jingang, Cai, Xunliang, Xu, Weiran
Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this paper, we introduce a diverse instruction model with self-evaluation (DolphCoder) for code generation. It learns diverse instruction targets and combines a code evaluation objective to enhance its code generation ability. Our model achieves superior performance on the HumanEval and MBPP benchmarks, offering new insights for future code instruction tuning work. Our key findings are: (1) Augmenting training data with more diverse responses that follow distinct reasoning paths increases the code capability of LLMs. (2) Improving a model's ability to evaluate the correctness of code solutions also enhances its ability to generate them.
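As a rough illustration of the multi-objective idea, the sketch below combines a standard next-token generation loss with a code-evaluation (correct/incorrect) classification loss under a mixing weight. The tensor shapes, the binary framing of the evaluation objective, and the weight alpha are assumptions made for illustration, not the paper's reported training configuration.

```python
# Hypothetical sketch of combining a code-generation objective with a
# code-evaluation objective; shapes and the alpha weight are assumptions.
import torch
import torch.nn.functional as F

def multi_objective_loss(gen_logits, gen_targets, eval_logits, eval_targets,
                         alpha: float = 0.5):
    """gen_*: next-token prediction on code-generation samples.
    eval_*: correct/incorrect classification on code-evaluation samples."""
    gen_loss = F.cross_entropy(
        gen_logits.view(-1, gen_logits.size(-1)),
        gen_targets.view(-1),
        ignore_index=-100)  # ignore_index masks prompt tokens in practice
    eval_loss = F.cross_entropy(eval_logits, eval_targets)
    return gen_loss + alpha * eval_loss

# Toy shapes: batch of 2, sequence of 4, vocab of 10; 2 eval samples, 2 classes.
gen_logits = torch.randn(2, 4, 10)
gen_targets = torch.randint(0, 10, (2, 4))
eval_logits = torch.randn(2, 2)
eval_targets = torch.randint(0, 2, (2,))
print(multi_objective_loss(gen_logits, gen_targets, eval_logits, eval_targets))
```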