AITopics

Large Language Models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs. We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale. Conducted on a production-class cluster (48xGH200, 8xH200 GPUs), our study systematically explores three key axes: (1) architecture pretraining (efficient attention variants: MQA, GQA, MLA, NSA; sparse Mixture-of-Experts (MoE)), (2) fine-tuning (parameter-efficient methods: LoRA, RSLoRA, DoRA), and (3) inference (quantization methods: int4, float16). We define six fine-grained metrics (Memory Utilization, Compute Utilization, Latency, Throughput, Energy Consumption, Compression Rate) to capture hardware saturation, latency-throughput balance, and carbon cost. Evaluating over 100 model-technique pairs (0.5B-72B parameters), we derive three core insights: (i) Efficiency involves quantifiable trade-offs: no single method is universally optimal; e.g., MoE reduces FLOPs and improves accuracy but increases VRAM by 40%, while int4 quantization cuts memory/energy by up to 3.9x at a 3-5% accuracy drop. (ii) Optima are task- and scale-dependent: MQA offers optimal memory-latency trade-offs for constrained devices, MLA achieves lowest perplexity for quality-critical tasks, and RSLoRA surpasses LoRA efficiency only beyond 14B parameters. (iii) Techniques generalize across modalities: we extend evaluations to Large Vision Models (Stable Diffusion 3.5, Wan 2.1) and Vision-Language Models (Qwen2.5-VL), confirming effective transferability. By open-sourcing datasets, evaluation pipelines, and leaderboards, EfficientLLM provides essential guidance for researchers and engineers navigating the efficiency-performance landscape of next-generation foundation models.

large language model, machine learning, natural language, (22 more...)

2505.1384

Country:

Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Energy > Power Industry (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Fang, Haoyang, Han, Boran, Erickson, Nick, Zhang, Xiyuan, Zhou, Su, Dagar, Anirudh, Zhang, Jiani, Turkmen, Ali Caner, Hu, Cuixiong, Rangwala, Huzefa, Wu, Ying Nian, Wang, Bernie, Karypis, George

Existing AutoML systems have advanced the automation of machine learning (ML); however, they still require substantial manual configuration and expert input, particularly when handling multimodal data. We introduce MLZero, a novel multi-agent framework powered by Large Language Models (LLMs) that enables end-to-end ML automation across diverse data modalities with minimal human intervention. A cognitive perception module is first employed, transforming raw multimodal inputs into perceptual context that effectively guides the subsequent workflow. To address key limitations of LLMs, such as hallucinated code generation and outdated API knowledge, we enhance the iterative code generation process with semantic and episodic memory. MLZero demonstrates superior performance on MLE-Bench Lite, outperforming all competitors in both success rate and solution quality, securing six gold medals. Additionally, when evaluated on our Multimodal AutoML Agent Benchmark, which includes 25 more challenging tasks spanning diverse data modalities, MLZero outperforms the competing methods by a large margin with a success rate of 0.92 (+263.6\%) and an average rank of 2.28. Our approach maintains its robust effectiveness even with a compact 8B LLM, outperforming full-size systems from existing solutions.

large language model, machine learning, natural language, (13 more...)

2505.13941

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Transportation > Air (1.00)
Information Technology (1.00)
Education (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Non-Euclidean Foundation Models: Advancing AI Beyond Euclidean Frameworks

Yang, Menglin, Zhang, Yifei, Chen, Jialin, Weber, Melanie, Ying, Rex

In the era of foundation models and Large Language Models (LLMs), Euclidean space is the de facto geometric setting of our machine learning architectures. However, recent literature has demonstrated that this choice comes with fundamental limitations. To that end, non-Euclidean learning is quickly gaining traction, particularly in web-related applications where complex relationships and structures are prevalent. Non-Euclidean spaces, such as hyperbolic, spherical, and mixed-curvature spaces, have been shown to provide more efficient and effective representations for data with intrinsic geometric properties, including web-related data like social network topology, query-document relationships, and user-item interactions. Integrating foundation models with non-Euclidean geometries has great potential to enhance their ability to capture and model the underlying structures, leading to better performance in search, recommendations, and content understanding. This workshop focuses on the intersection of Non-Euclidean Foundation Models and Geometric Learning (NEGEL), exploring its potential benefits, including the potential benefits for advancing web-related technologies, challenges, and future directions. Workshop page: [https://hyperboliclearning.github.io/events/www2025workshop](https://hyperboliclearning.github.io/events/www2025workshop)

data mining, large language model, machine learning, (19 more...)

doi: 10.1145/3701716.3717806

2505.14417

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
North America > United States > Connecticut > New Haven County > New Haven (0.05)
Asia > China > Hong Kong (0.04)
(6 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Contrastive Cross-Course Knowledge Tracing via Concept Graph Guided Knowledge Transfer

Han, Wenkang, Lin, Wang, Hu, Liya, Dai, Zhenlong, Zhou, Yiyun, Li, Mengze, Liu, Zemin, Yao, Chang, Chen, Jingyuan

Knowledge tracing (KT) aims to predict learners' future performance based on historical learning interactions. However, existing KT models predominantly focus on data from a single course, limiting their ability to capture a comprehensive understanding of learners' knowledge states. In this paper, we propose TransKT, a contrastive cross-course knowledge tracing method that leverages concept graph guided knowledge transfer to model the relationships between learning behaviors across different courses, thereby enhancing knowledge state estimation. Specifically, TransKT constructs a cross-course concept graph by leveraging zero-shot Large Language Model (LLM) prompts to establish implicit links between related concepts across different courses. This graph serves as the foundation for knowledge transfer, enabling the model to integrate and enhance the semantic features of learners' interactions across courses. Furthermore, TransKT includes an LLM-to-LM pipeline for incorporating summarized semantic features, which significantly improves the performance of Graph Convolutional Networks (GCNs) used for knowledge transfer. Additionally, TransKT employs a contrastive objective that aligns single-course and cross-course knowledge states, thereby refining the model's ability to provide a more robust and accurate representation of learners' overall knowledge states.

knowledge management, large language model, machine learning, (19 more...)

2505.13489

Country:

North America > United States (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education > Educational Setting > Online (0.68)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Assessing GPT Performance in a Proof-Based University-Level Course Under Blind Grading

Ding, Ming, Kyng, Rasmus, Solda, Federico, Yuan, Weixuan

While LLMs have demonstrated impressive capabilities, their true level of intelligence and reasoning remains a subject of debate. The classical Turing Test proposes that a machine demonstrating human-like responses in conversation could be considered intelligent. Over the past few years, substantial efforts have been devoted to evaluating LLMs from various angles [Cha+24]. For example, LLMs can generate essays with their quality rated higher than those produced by humans [Her+23]; pass questions involving communication skills, ethics, empathy, and professionalism in a United States Medical Licensing Examination (USMLE) [Bri+23]; achieve passing scores on the reading comprehension test of the Program for International Student Assessment (PISA), a global standardized student assessment [V az+23]; and demonstrate strong performance in solving middle school-level math word problems, with multiple LLMs achieving passing scores and some exceeding 90% accuracy [Vid24]. However, existing evaluation protocols may fall short of comprehensively assessing their reasoning and problem-solving capabilities.

large language model, machine learning, natural language, (20 more...)

2505.13664

Country:

North America > United States (0.34)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > United Kingdom (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Assessment & Standards > Student Performance (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hashemifar, Soroush, Sahebi, Sherry

Personalized Student Knowledge Modeling for Future Learning Resource Prediction

Despite advances in deep learning for education, student knowledge tracing and behavior modeling face persistent challenges: limited personalization, inadequate modeling of diverse learning activities (especially non-assessed materials), and overlooking the interplay between knowledge acquisition and behavioral patterns. Practical limitations, such as fixed-size sequence segmentation, frequently lead to the loss of contextual information vital for personalized learning. Moreover, reliance on student performance on assessed materials limits the modeling scope, excluding non-assessed interactions like lectures. To overcome these shortcomings, we propose Knowledge Modeling and Material Prediction (KMaP), a stateful multi-task approach designed for personalized and simultaneous modeling of student knowledge and behavior. KMaP employs clustering-based student profiling to create personalized student representations, improving predictions of future learning resource preferences. Extensive experiments on two real-world datasets confirm significant behavioral differences across student clusters and validate the efficacy of the KMaP model.

artificial intelligence, machine learning, student, (17 more...)

2505.14072

Country: North America > United States (0.46)

Genre:

Research Report (1.00)
Instructional Material (0.91)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AIHubMay-20-2025, 08:45:37 GMT

Interview with Filippos Gouidis: Object state classification

Filippos's PhD dissertation focuses on developing a method for recognizing object states without visual training data. By leveraging semantic knowledge from online sources and Large Language Models, structured as Knowledge Graphs, Graph Neural Networks learn representations for accurate state classification. In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we met with Filippos Gouidis, who has recently completed his PhD, and found out more about his research on object state classification.

classification, knowledge, state classification, (14 more...)

AIHub

Country: Europe > Greece (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMay-20-2025

MatPredict: a dataset and benchmark for learning material properties of diverse indoor objects

Chen, Yuzhen, Son, Hojun, Kusari, Arpan

Determining material properties from camera images can expand the ability to identify complex objects in indoor environments, which is valuable for consumer robotics applications. To support this, we introduce MatPredict, a dataset that combines the high-quality synthetic objects from Replica dataset with MatSynth dataset's material properties classes - to create objects with diverse material properties. We select 3D meshes of specific foreground objects and render them with different material properties. In total, we generate \textbf{18} commonly occurring objects with \textbf{14} different materials. We showcase how we provide variability in terms of lighting and camera placement for these objects. Next, we provide a benchmark for inferring material properties from visual images using these perturbed models in the scene, discussing the specific neural network models involved and their performance based on different image comparison metrics. By accurately simulating light interactions with different materials, we can enhance realism, which is crucial for training models effectively through large-scale simulations. This research aims to revolutionize perception in consumer robotics. The dataset is provided \href{https://huggingface.co/datasets/UMTRI/MatPredict}{here} and the code is provided \href{https://github.com/arpan-kusari/MatPredict}{here}.

artificial intelligence, machine learning, material property, (18 more...)

2505.13201

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.50)

Industry: Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

arXiv.org Artificial IntelligenceMay-20-2025

Towards Comprehensive Argument Analysis in Education: Dataset, Tasks, and Method

Ren, Yupei, Zhou, Xinyi, Zhang, Ning, Zhao, Shangqing, Lan, Man, Bai, Xiaopeng

Argument mining has garnered increasing attention over the years, with the recent advancement of Large Language Models (LLMs) further propelling this trend. However, current argument relations remain relatively simplistic and foundational, struggling to capture the full scope of argument information, particularly when it comes to representing complex argument structures in real-world scenarios. To address this limitation, we propose 14 fine-grained relation types from both vertical and horizontal dimensions, thereby capturing the intricate interplay between argument components for a thorough understanding of argument structure. On this basis, we conducted extensive experiments on three tasks: argument component detection, relation prediction, and automated essay grading. Additionally, we explored the impact of writing quality on argument component detection and relation prediction, as well as the connections between discourse relations and argumentative features. The findings highlight the importance of fine-grained argumentative annotations for argumentative writing quality assessment and encourage multi-dimensional argument analysis.

large language model, machine learning, relation, (19 more...)

2505.12028

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Italy (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Instructional Material (0.68)
Research Report (0.64)

Industry:

Education > Educational Setting (0.68)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Srisuchinnawong, Arthicha, Manoonpong, Poramate

Growable and Interpretable Neural Control with Online Continual Learning for Autonomous Lifelong Locomotion Learning Machines

arXiv.org Artificial IntelligenceMay-20-2025

Continual locomotion learning faces four challenges: incomprehensibility, sample inefficiency, lack of knowledge exploitation, and catastrophic forgetting. Thus, this work introduces Growable Online Locomotion Learning Under Multicondition (GOLLUM), which exploits the interpretability feature to address the aforementioned challenges. GOLLUM has two dimensions of interpretability: layer-wise interpretability for neural control function encoding and column-wise interpretability for robot skill encoding. With this interpretable control structure, GOLLUM utilizes neurogenesis to unsupervisely increment columns (ring-like networks); each column is trained separately to encode and maintain a specific primary robot skill. GOLLUM also transfers the parameters to new skills and supplements the learned combination of acquired skills through another neural mapping layer added (layer-wise) with online supplementary learning. On a physical hexapod robot, GOLLUM successfully acquired multiple locomotion skills (e.g., walking, slope climbing, and bouncing) autonomously and continuously within an hour using a simple reward function. Furthermore, it demonstrated the capability of combining previous learned skills to facilitate the learning process of new skills while preventing catastrophic forgetting. Compared to state-of-the-art locomotion learning approaches, GOLLUM is the only approach that addresses the four challenges above mentioned without human intervention. It also emphasizes the potential exploitation of interpretability to achieve autonomous lifelong learning machines.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

doi: 10.1177/02783649251336385

2505.12029

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Bristol (0.04)
Europe > Spain (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material > Online (0.80)

Industry:

Media > Television (0.92)
Leisure & Entertainment (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.46)
Education > Educational Setting > Continuing Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)