Best Practices for Machine Learning Experimentation in Scientific Applications

Michelucci, Umberto, Venturini, Francesca

arXiv.org Artificial Intelligence

Machine learning (ML) is increasingly adopted in scientific research, yet the quality and reliability of results often depend on how experiments are designed and documented. Poor baselines, inconsistent preprocessing, or insufficient validation can lead to misleading conclusions about model performance. This paper presents a practical and structured guide for conducting ML experiments in scientific applications, focusing on reproducibility, fair comparison, and transparent reporting. We outline a step-by-step workflow, from dataset preparation to model selection and evaluation, and propose metrics that account for overfitting and instability across validation folds, including the Logarithmic Overfitting Ratio (LOR) and the Composite Overfitting Score (COS). Through recommended practices and example reporting formats, this work aims to support researchers in establishing robust baselines and drawing valid, evidence-based insights from ML models applied to scientific problems.
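The abstract does not give the formulas for LOR or COS, so the sketch below is only an illustrative assumption of how such metrics could be computed from per-fold scores: a log of the train/validation gap for the ratio, plus fold instability for the composite.

# Hedged sketch: the LOR and COS forms assumed here are NOT the
# paper's definitions, only one plausible reading of the abstract.
import numpy as np

def logarithmic_overfitting_ratio(train_score, val_score, eps=1e-12):
    # Assumed form: log of the train/validation score ratio;
    # 0 means no gap, larger values mean stronger overfitting.
    return np.log((train_score + eps) / (val_score + eps))

def composite_overfitting_score(train_scores, val_scores):
    # Assumed form: mean per-fold LOR plus the instability
    # (standard deviation) of the validation scores.
    lors = [logarithmic_overfitting_ratio(t, v)
            for t, v in zip(train_scores, val_scores)]
    return float(np.mean(lors) + np.std(val_scores))

# Example: five cross-validation folds with a visible train/val gap.
train = [0.99, 0.98, 0.99, 0.97, 0.99]
val = [0.90, 0.85, 0.88, 0.91, 0.80]
print(composite_overfitting_score(train, val))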


Does Model Size Matter? A Comparison of Small and Large Language Models for Requirements Classification

Zadenoori, Mohammad Amin, De Martino, Vincenzo, Dabrowski, Jacek, Franch, Xavier, Ferrari, Alessio

arXiv.org Artificial Intelligence

[Context and motivation] Large language models (LLMs) show notable results in natural language processing (NLP) tasks for requirements engineering (RE). However, their use is hindered by high computational cost, data sharing risks, and dependence on external services. In contrast, small language models (SLMs) offer a lightweight, locally deployable alternative. [Question/problem] It remains unclear how well SLMs perform compared to LLMs on RE tasks in terms of accuracy. [Results] Our preliminary study compares eight models, including three LLMs and five SLMs, on requirements classification tasks using the PROMISE, PROMISE Reclass, and SecReq datasets. Our results show that although LLMs achieve an average F1 score 2% higher than SLMs, this difference is not statistically significant. SLMs nearly match LLM performance across all datasets and even outperform them in recall on the PROMISE Reclass dataset, despite being up to 300 times smaller. We also found that dataset characteristics play a more significant role in performance than model size. [Contribution] Our study contributes evidence that SLMs are a valid alternative to LLMs for requirements classification, offering advantages in privacy, cost, and local deployability.
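As an illustration of the kind of comparison reported above, the sketch below runs a paired significance test on per-fold F1 scores of two models; the numbers and the choice of the Wilcoxon signed-rank test are assumptions, since the abstract does not name the test used.

# Hedged sketch: hypothetical per-fold F1 scores, not the study's data.
from scipy.stats import wilcoxon

llm_f1 = [0.86, 0.81, 0.90, 0.84, 0.88]  # hypothetical LLM folds
slm_f1 = [0.84, 0.80, 0.88, 0.85, 0.86]  # hypothetical SLM folds

stat, p = wilcoxon(llm_f1, slm_f1)
print(f"p = {p:.3f}")  # p > 0.05: the small F1 gap is not significant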


Model Shapley: Equitable Model Valuation with Black-box Access

Xu, Xinyi, Lam, Thanh

Neural Information Processing Systems

ML models call for an equitable model valuation method to price them. In particular, we investigate the black-box access setting, which allows querying a model (to observe predictions) without disclosing model-specific information (e.g., architecture and parameters). By exploiting a Dirichlet abstraction of a model's predictions, we propose a novel and equitable model valuation method called model Shapley.
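The sketch below illustrates only the Shapley attribution step over a toy set of black-box models; the majority-vote utility is an assumption, and the paper's Dirichlet abstraction of each model's predictive distribution is omitted, so treat this as a toy analogue rather than the proposed method.

# Hedged sketch: exact Shapley values for a handful of models, with a
# hypothetical ensemble-accuracy utility (not the paper's construction).
import itertools
import numpy as np

def shapley_values(n_models, utility):
    # Enumerate all permutations (fine for a few models) and average
    # each model's marginal contribution to the coalition utility.
    phi = np.zeros(n_models)
    perms = list(itertools.permutations(range(n_models)))
    for perm in perms:
        coalition = []
        for i in perm:
            before = utility(coalition)
            coalition.append(i)
            phi[i] += utility(coalition) - before
    return phi / len(perms)

# Queried (black-box) label predictions of three models.
preds = np.array([[0, 1, 1, 0],
                  [0, 1, 0, 0],
                  [1, 1, 1, 0]])
y_true = np.array([0, 1, 1, 0])

def utility(coalition):
    # Accuracy of a majority-vote ensemble of the coalition's models.
    if not coalition:
        return 0.0
    votes = preds[coalition].mean(axis=0) >= 0.5
    return float((votes == y_true).mean())

print(shapley_values(3, utility))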




Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing

Hehli, Justin, Heiniger, Marco, Rezayati, Maryam, van de Venn, Hans Wernher

arXiv.org Artificial Intelligence

In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, and several preprocessing strategies for time-series analysis were explored. LSTM, GRU, and Transformer models were trained on these datasets. The best-performing model achieved 91.11% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
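Since the abstract reports the sliding window as the best preprocessing strategy, a minimal sketch of that step is shown below; the window length, stride, and 7-channel torque signal are placeholder assumptions, not the study's settings.

# Hedged sketch: segment a proprioceptive time series into overlapping
# windows suitable for LSTM/GRU/Transformer classifiers.
import numpy as np

def sliding_windows(signal, window=200, stride=50):
    # signal: (timesteps, channels) array; returns an array of shape
    # (num_windows, window, channels).
    starts = range(0, len(signal) - window + 1, stride)
    return np.stack([signal[s:s + window] for s in starts])

# Example: 10 s of hypothetical 7-joint torque data sampled at 1 kHz.
x = np.random.randn(10_000, 7)
print(sliding_windows(x).shape)  # (197, 200, 7)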


Large Language Model-Empowered Interactive Load Forecasting

Zuo, Yu, Qin, Dalin, Wang, Yi

arXiv.org Artificial Intelligence

The growing complexity of power systems has made accurate load forecasting more important than ever. An increasing number of advanced load forecasting methods have been developed. However, the static design of current methods offers no mechanism for human-model interaction. As the primary users of forecasting models, system operators often find it difficult to understand and apply these advanced models, which typically require expertise in artificial intelligence (AI). This also prevents them from incorporating their experience and real-world contextual understanding into the forecasting process. Recent breakthroughs in large language models (LLMs) offer a new opportunity to address this issue. By leveraging their natural language understanding and reasoning capabilities, we propose an LLM-based multi-agent collaboration framework to bridge the gap between human operators and forecasting models. A set of specialized agents is designed to perform different tasks in the forecasting workflow and to collaborate via a dedicated communication mechanism. Our experiments demonstrate that interactive load forecasting accuracy can be significantly improved when users provide proper insight at key stages. Our cost analysis shows that the framework remains affordable, making it practical for real-world deployment.
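A minimal sketch of the agent-collaboration idea is given below; the agent roles and the ask_llm placeholder are hypothetical, standing in for whatever LLM API and communication mechanism the framework actually uses.

# Hedged sketch: specialized agents pass text messages along the
# forecasting workflow; ask_llm is a stub, not a real API call.
def ask_llm(prompt: str) -> str:
    # Replace with a call to any chat-completion endpoint.
    return f"[LLM answer to: {prompt[:40]}...]"

def interactive_forecast(history_summary: str, operator_note: str) -> str:
    plan = ask_llm("Propose features and a model for load forecasting "
                   "given: " + history_summary)
    # The operator's real-world insight is injected at a key stage.
    revised = ask_llm("Revise this plan using the operator's note '"
                      + operator_note + "': " + plan)
    return ask_llm("Write the final forecast report from: " + revised)

print(interactive_forecast("two years of hourly city load",
                           "public holiday next Monday"))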