
Bias Fitting to Mitigate Length Bias of Reward Model in RLHF

Zhao, Kangwen, Cai, Jianfeng, Zhu, Jinhua, Sun, Ruopei, Xue, Dongyun, Zhou, Wengang, Li, Li, Li, Houqiang

arXiv.org Artificial Intelligence

Reinforcement Learning from Human Feedback (RLHF) relies on reward models to align large language models with human preferences. However, RLHF often suffers from reward hacking, wherein policy learning exploits flaws in the trained reward model to maximize reward scores without genuinely aligning with human preferences. A significant example of such reward hacking is length bias, where reward models usually favor longer responses irrespective of actual response quality. Previous works on length bias have notable limitations: these approaches either mitigate bias without characterizing its form, or simply assume a linear length-reward relation. To accurately model the intricate nature of length bias and facilitate more effective bias mitigation, we propose FiMi-RM (Bias Fitting to Mitigate Length Bias of Reward Model in RLHF), a framework that autonomously learns and corrects underlying bias patterns. Our approach consists of three stages: first, we train a standard reward model, which inherently contains length bias; next, we deploy a lightweight fitting model to explicitly capture the non-linear relation between length and reward; finally, we incorporate this learned relation into the reward model to debias it. Experimental results demonstrate that FiMi-RM achieves a more balanced length-reward distribution. Furthermore, when applied to alignment algorithms, our debiased reward model improves the length-controlled win rate and reduces verbosity without compromising performance.
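
The fit-then-subtract idea behind this three-stage recipe can be illustrated with a toy numerical sketch: fit a lightweight nonlinear model to the length-reward relation on synthetic rewards, then subtract the fitted bias. The synthetic data, the logarithmic bias shape, and the choice of a cubic polynomial as the fitting model are all illustrative assumptions here, not the authors' actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: observed reward = true quality + a nonlinear length bias
# (assumption: logarithmic bias, purely for illustration).
lengths = rng.integers(20, 500, size=1000).astype(float)
quality = rng.normal(0.0, 1.0, size=1000)
rewards = quality + 0.8 * np.log(lengths)

# Stage-2 analogue: fit a lightweight nonlinear model (a cubic polynomial
# standing in for the paper's fitting model) mapping length to reward.
coeffs = np.polyfit(lengths, rewards, deg=3)
bias_estimate = np.polyval(coeffs, lengths)

# Stage-3 analogue: subtract the fitted bias to debias the reward scores.
debiased = rewards - bias_estimate

corr_before = np.corrcoef(lengths, rewards)[0, 1]
corr_after = np.corrcoef(lengths, debiased)[0, 1]
print(f"length-reward correlation before: {corr_before:.2f}, after: {corr_after:.2f}")
```

After subtraction, the length-reward correlation collapses toward zero while the quality signal is untouched, which is the "more balanced length-reward distribution" the abstract refers to.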


A Semantic-Loss Function Modeling Framework With Task-Oriented Machine Learning Perspectives

Nguyen, Ti Ti, Le, Thanh-Dung, Ha, Vu Nguyen, Chou, Hong-fu, Eappen, Geoffrey, Tran, Duc-Dung, Nguyen-Kha, Hung, Thiruvasagam, Prabhu, Garces-Socarras, Luis M., Gonzalez-Rios, Jorge L., Merlano-Duncan, Juan C., Chatzinotas, Symeon

arXiv.org Artificial Intelligence

The integration of machine learning (ML) has significantly enhanced the capabilities of Earth Observation (EO) systems by enabling the extraction of actionable insights from complex datasets. However, the performance of data-driven EO applications is heavily influenced by the data collection and transmission processes, where limited satellite bandwidth and latency constraints can hinder the full transmission of original data to the receivers. To address this issue, adopting the concepts of Semantic Communication (SC) offers a promising solution by prioritizing the transmission of essential data semantics over raw information. Implementing SC for EO systems requires a thorough understanding of the impact of data processing and communication channel conditions on semantic loss at the processing center. This work proposes a novel data-fitting framework to empirically model the semantic loss using real-world EO datasets and domain-specific insights. The framework quantifies two primary types of semantic loss: (1) source coding loss, assessed via a data quality indicator measuring the impact of processing on raw source data, and (2) transmission loss, evaluated by comparing practical transmission performance against the Shannon limit. Semantic losses are estimated by evaluating the accuracy of EO applications using four task-oriented ML models, EfficientViT, MobileViT, ResNet50-DINO, and ResNet8-KD, on lossy image datasets under varying channel conditions and compression ratios. These results underpin a framework for efficient semantic-loss modeling in bandwidth-constrained EO scenarios, enabling more reliable and effective operations.
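
The transmission-loss component compares practical link performance against the Shannon limit, C = B log2(1 + SNR). A minimal sketch of that comparison (the gap indicator, the bandwidth, and the rate figures are hypothetical illustrations, not the paper's exact quality metric):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon limit for an AWGN channel: C = B * log2(1 + SNR)."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

def transmission_gap(practical_rate_bps: float, bandwidth_hz: float, snr_db: float) -> float:
    """Hypothetical transmission-loss indicator: the fraction by which the
    practical rate falls short of the Shannon limit (0 means at the limit)."""
    return 1.0 - practical_rate_bps / shannon_capacity(bandwidth_hz, snr_db)

# Example: a 1 MHz satellite downlink at 10 dB SNR delivering 2.5 Mbit/s.
gap = transmission_gap(2.5e6, 1e6, 10.0)
print(f"gap to Shannon limit: {gap:.1%}")
```

A larger gap indicates more room for semantic loss to accumulate during transmission, which is what the framework evaluates across varying channel conditions.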


A Model-based Multi-Agent Personalized Short-Video Recommender System

Zhou, Peilun, Xu, Xiaoxiao, Hu, Lantao, Li, Han, Jiang, Peng

arXiv.org Artificial Intelligence

A recommender selects and presents top-K items to the user at each online request, and a recommendation session consists of several sequential requests. Formulating a recommendation session as a Markov decision process and solving it within a reinforcement learning (RL) framework has attracted increasing attention from both the academic and industry communities. In this paper, we propose an RL-based industrial short-video recommender ranking framework, which models and maximizes user watch time in an environment of multi-aspect user preferences via a collaborative multi-agent formulation. Moreover, our proposed framework adopts a model-based learning approach to alleviate sample selection bias, a crucial but intractable problem in industrial recommender systems. Extensive offline evaluations and live experiments confirm the effectiveness of our proposed method over alternatives. Our approach has been deployed on our real large-scale short-video sharing platform, successfully serving hundreds of millions of users.
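
As a loose illustration of combining multiple preference aspects into one top-K ranking, here is a minimal sketch in which each "agent" scores one aspect and a weighted sum decides the ranking. The aspect names, weights, and scores are hypothetical, and the paper's collaborative multi-agent formulation is considerably richer than a fixed weighted sum.

```python
import heapq

def rank_top_k(items, agent_scores, weights, k=3):
    """Combine per-aspect agent scores into a single ranking score per item
    and return the top-K items (a simplified stand-in for collaborative
    multi-agent ranking; the aspect weights are assumed, not learned)."""
    def combined(item):
        return sum(w * agent_scores[aspect][item] for aspect, w in weights.items())
    return heapq.nlargest(k, items, key=combined)

# Hypothetical example: four candidate videos scored by two aspect agents.
items = ["v1", "v2", "v3", "v4"]
agent_scores = {
    "watch_time": {"v1": 0.9, "v2": 0.4, "v3": 0.7, "v4": 0.2},
    "engagement": {"v1": 0.1, "v2": 0.8, "v3": 0.6, "v4": 0.3},
}
weights = {"watch_time": 0.7, "engagement": 0.3}
top = rank_top_k(items, agent_scores, weights, k=3)
print(top)
```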


Two-Stage Hybrid Day-Ahead Solar Forecasting

Alanazi, Mohana, Mahoor, Mohsen, Khodaei, Amin

arXiv.org Machine Learning

Power supply from renewable resources is on a global rise, and renewable generation is forecast to surpass other types of generation in the foreseeable future. Increased generation from renewable resources, mainly solar and wind, exposes the power grid to more vulnerabilities, conceivably due to their variable generation, thus highlighting the importance of accurate forecasting methods. This paper proposes a two-stage day-ahead solar forecasting method that breaks down the forecasting into linear and nonlinear parts, determines subsequent forecasts, and accordingly improves the accuracy of the obtained results. To further reduce the error resulting from nonstationarity of the historical solar radiation data, a data processing approach, including pre-process and post-process levels, is integrated with the proposed method. Numerical simulations on three test days with different weather conditions exhibit the effectiveness of the proposed two-stage model.

Figure 1: New added U.S. electric generation from 2010 to Q1 2016 [2].
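
The linear/nonlinear decomposition can be sketched on synthetic data: fit the linear part (here an ordinary least-squares trend), then model the residual's nonlinear structure (here an hourly-mean daily profile), and sum the two for the day-ahead forecast. The trend-plus-sinusoid series and both component models are simplifying assumptions; the paper's stages are more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic hourly solar-radiation history over 30 days (assumption: a daily
# half-sinusoid with a small linear trend plus noise, purely for illustration).
t = np.arange(24 * 30, dtype=float)
series = 0.05 * t + 50 * np.maximum(0.0, np.sin(2 * np.pi * t / 24)) + rng.normal(0, 2, t.size)

# Stage 1: linear part -- an ordinary least-squares trend fit.
A = np.column_stack([t, np.ones_like(t)])
slope, intercept = np.linalg.lstsq(A, series, rcond=None)[0]
linear_fit = slope * t + intercept

# Stage 2: nonlinear part -- capture the residual's daily shape by averaging
# residuals per hour of day (a simple stand-in for a nonlinear model).
residual = series - linear_fit
hourly_profile = residual.reshape(-1, 24).mean(axis=0)

# Day-ahead forecast = extrapolated linear trend + learned nonlinear profile.
t_next = np.arange(t[-1] + 1, t[-1] + 25)
forecast = slope * t_next + intercept + hourly_profile

# Compare against the known synthetic ground truth for the next day.
truth = 0.05 * t_next + 50 * np.maximum(0.0, np.sin(2 * np.pi * t_next / 24))
rmse = float(np.sqrt(np.mean((forecast - truth) ** 2)))
print(f"day-ahead RMSE on synthetic data: {rmse:.2f}")
```

Because the linear stage removes the trend before the nonlinear stage learns the daily shape, neither component has to explain the other's structure, which is the rationale for the two-stage split.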