AITopics | cold start

DropoutNet: Addressing Cold Start in Recommender Systems

Neural Information Processing SystemsMar-17-2026, 17:49:25 GMT

Latent models have become the default choice for recommender systems due to their performance and scalability. However, research in this area has primarily focused on modeling user-item interactions, and few latent models have been developed for cold start. Deep learning has recently achieved remarkable success showing excellent results for diverse input types. Inspired by these results we propose a neural network based latent model called DropoutNet to address the cold start problem in recommender systems. Unlike existing approaches that incorporate additional content-based objective terms, we instead focus on the optimization and show that neural network models can be explicitly trained for cold start through dropout. Our model can be applied on top of any existing latent model effectively providing cold start capabilities, and full power of deep architectures. Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Add feedback

DropoutNet: Addressing Cold Start in Recommender Systems

Neural Information Processing SystemsNov-21-2025, 16:07:56 GMT

Latent models have become the default choice for recommender systems due to their performance and scalability. However, research in this area has primarily focused on modeling user-item interactions, and few latent models have been developed for cold start. Deep learning has recently achieved remarkable success showing excellent results for diverse input types. Inspired by these results we propose a neural network based latent model called DropoutNet to address the cold start problem in recommender systems. Unlike existing approaches that incorporate additional content-based objective terms, we instead focus on the optimization and show that neural network models can be explicitly trained for cold start through dropout. Our model can be applied on top of any existing latent model effectively providing cold start capabilities, and full power of deep architectures. Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks.

cold start, dropoutnet, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Add feedback

DropoutNet: Addressing Cold Start in Recommender Systems

Maksims Volkovs, Guangwei Yu, Tomi Poutanen

Neural Information Processing SystemsNov-21-2025, 13:13:03 GMT

Latent models have become the default choice for recommender systems due to their performance and scalability. However, research in this area has primarily focused on modeling user-item interactions, and few latent models have been developed for cold start. Deep learning has recently achieved remarkable success showing excellent results for diverse input types. Inspired by these results we propose a neural network based latent model called DropoutNet to address the cold start problem in recommender systems. Unlike existing approaches that incorporate additional content-based objective terms, we instead focus on the optimization and show that neural network models can be explicitly trained for cold start through dropout. Our model can be applied on top of any existing latent model effectively providing cold start capabilities, and full power of deep architectures. Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks.

artificial intelligence, machine learning, representation, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Solving cold start in news recommendations: a RippleNet-based system for large scale media outlet

Radziszewski, Karol, Szpunar, Michał, Ociepka, Piotr, Buczyński, Mateusz

arXiv.org Artificial IntelligenceNov-5-2025

We present a scalable recommender system implementation based on RippleNet, tailored for the media domain with a production deployment in Onet.pl, one of Poland's largest online media platforms. Our solution addresses the cold-start problem for newly published content by integrating content-based item embeddings into the knowledge propagation mechanism of RippleNet, enabling effective scoring of previously unseen items. The system architecture leverages Amazon SageMaker for distributed training and inference, and Apache Airflow for orchestrating data pipelines and model retraining workflows. To ensure high-quality training data, we constructed a comprehensive golden dataset consisting of user and item features and a separate interaction table, all enabling flexible extensions and integration of new signals.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.02052

Country: Europe > Poland (0.52)

Genre: Research Report (0.64)

Industry: Media > News (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.92)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

Add feedback

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Wei, Yana, Zhao, Liang, Sun, Jianjian, Lin, Kangheng, Yin, Jisheng, Hu, Jingcheng, Zhang, Yinmin, Yu, En, Lv, Haoran, Weng, Zejia, Wang, Jia, Han, Chunrui, Peng, Yuang, Han, Qi, Ge, Zheng, Zhang, Xiangyu, Jiang, Daxin, Patel, Vishal M.

arXiv.org Artificial IntelligenceSep-23-2025

The remarkable reasoning capability of large language models (LLMs) stems from cognitive behaviors that emerge through reinforcement with verifiable rewards. This work investigates how to transfer this principle to Multimodal LLMs (MLLMs) to unlock advanced visual reasoning. We introduce a two-stage paradigm built on Qwen2.5-VL-7B: a massive linguistic cold-start fine-tuning, followed by multimodal reinforcement learning (RL) spanning nearly 1,000 steps, surpassing all previous open-source efforts in scale. This pioneering work reveals three fundamental insights: 1) Behavior transfer emerges surprisingly early in cold start due to linguistic mental imagery. 2) Cold start broadly memorizes visual behaviors, while RL critically discerns and scales up effective patterns. 3) Transfer strategically favors high-utility behaviors such as visual reflection. Our resulting model, Open-Vision-Reasoner (OVR), achieves state-of-the-art performance on a suite of reasoning benchmarks, including 95.3% on MATH500, 51.8% on MathVision and 54.6% on MathVerse. We release our model, data, and training dynamics to catalyze the development of more capable, behavior-aligned multimodal reasoners.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.05255

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Revealing Potential Biases in LLM-Based Recommender Systems in the Cold Start Setting

Andre, Alexandre, Roy, Gauthier, Dyer, Eva, Wang, Kai

arXiv.org Artificial IntelligenceSep-9-2025

Large Language Models (LLMs) are increasingly used for recommendation tasks due to their general-purpose capabilities. While LLMs perform well in rich-context settings, their behavior in cold-start scenarios, where only limited signals such as age, gender, or language are available, raises fairness concerns because they may rely on societal biases encoded during pretraining. We introduce a benchmark specifically designed to evaluate fairness in zero-context recommendation. Our modular pipeline supports configurable recommendation domains and sensitive attributes, enabling systematic and flexible audits of any open-source LLM. Through evaluations of state-of-the-art models (Gemma 3 and Llama 3.2), we uncover consistent biases across recommendation domains (music, movies, and colleges) including gendered and cultural stereotypes. We also reveal a non-linear relationship between model size and fairness, highlighting the need for nuanced analysis.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.20401

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report (0.84)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Privacy Preserving Inference of Personalized Content for Out of Matrix Users

Sun, Michael, Vu, Tai, Wang, Andrew

arXiv.org Artificial IntelligenceAug-22-2025

Recommender systems for niche and dynamic communities face persistent challenges from data sparsity, cold start users and items, and privacy constraints. Traditional collaborative filtering and content-based approaches underperform in these settings, either requiring invasive user data or failing when preference histories are absent. We present DeepNaniNet, a deep neural recommendation framework that addresses these challenges through an inductive graph-based architecture combining user-item interactions, item-item relations, and rich textual review embeddings derived from BERT. Our design enables cold start recommendations without profile mining, using a novel "content basket" user representation and an autoencoder-based generalization strategy for unseen users. We introduce AnimeULike, a new dataset of 10,000 anime titles and 13,000 users, to evaluate performance in realistic scenarios with high proportions of guest or low-activity users. DeepNaniNet achieves state-of-the-art cold start results on the CiteULike benchmark, matches DropoutNet in user recall without performance degradation for out-of-matrix users, and outperforms Weighted Matrix Factorization (WMF) and DropoutNet on AnimeULike warm start by up to 7x and 1.5x in Recall@100, respectively. Our findings demonstrate that DeepNaniNet delivers high-quality, privacy-preserving recommendations in data-sparse, cold start-heavy environments while effectively integrating heterogeneous content sources.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.14905

Country:

Europe > Netherlands (0.17)
North America > United States (0.14)

Genre: Research Report > New Finding (0.53)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems

Liu, Xuran, Xue, Nan, Bao, Rui, Sun, Yaping, Chen, Zhiyong, Tao, Meixia, Xu, Xiaodong, Cui, Shuguang

arXiv.org Artificial IntelligenceAug-18-2025

While deploying large language models on edge devices promises low-latency and privacy-preserving AI services, it is hindered by limited device resources. Although pipeline parallelism facilitates distributed inference, existing approaches often ignore the cold-start latency caused by on-demand model loading. In this paper, we propose a latency-aware scheduling framework that overlaps model loading with computation and communication to minimize total inference latency. Based on device and model parameters, the framework dynamically adjusts layer partitioning and allocation to effectively hide loading time, thereby eliminating as many idle periods as possible. We formulate the problem as a Mixed-Integer Non-Linear Program and design an efficient dynamic programming algorithm to optimize model partitioning and device assignment. Experimental results show that the proposed method significantly reduces cold-start latency compared to baseline strategies.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.11287

Country: Asia > China (1.00)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.68)
Telecommunications (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)

Add feedback

Addressing Cold Start For next-article Recommendation

Elgohary, Omar, Jorgenson, Nathan, Marple, Trenton

arXiv.org Artificial IntelligenceAug-5-2025

This replication study modifies ALMM, the Adaptive Linear Mapping Model constructed for the next song recommendation, to the news recommendation problem on the MIND dataset. The original version of ALMM computes latent representations for users, last-time items, and current items in a tensor factorization structure and learns a linear mapping from content features to latent item vectors. Our replication aims to improve recommendation performance in cold-start scenarios by restructuring this model to sequential news click behavior, viewing consecutively read articles as (last news, next news) tuples. Instead of the original audio features, we apply BERT and a TF-IDF (Term Frequency-Inverse Document Frequency) to news titles and abstracts to extract token contextualized representations and align them with triplet-based user reading patterns. We also propose a reproducibly thorough pre-processing pipeline combining news filtering and feature integrity validation. Our implementation of ALMM with TF-IDF shows relatively improved recommendation accuracy and robustness over Forbes and Oord baseline models in the cold-start scenario. We demonstrate that ALMM in a minimally modified state is not suitable for next news recommendation.

artificial intelligence, machine learning, recommendation, (17 more...)

arXiv.org Artificial Intelligence

2508.01036

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)

Genre: Research Report (0.82)

Industry:

Media > Music (0.89)
Leisure & Entertainment (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

Wei, Lai, Li, Yuting, Zheng, Kaipeng, Wang, Chen, Wang, Yue, Kong, Linghe, Sun, Lichao, Huang, Weiran

arXiv.org Artificial IntelligenceJul-24-2025

Recent advancements in large language models (LLMs) have demonstrated impressive chain-of-thought reasoning capabilities, with reinforcement learning (RL) playing a crucial role in this progress. While "aha moment" patterns--where models exhibit self-correction through reflection--are often attributed to emergent properties from RL, we first demonstrate that these patterns exist in multimodal LLMs (MLLMs) prior to RL training but may not necessarily correlate with improved reasoning performance. Building on these insights, we present a comprehensive study on enhancing multimodal reasoning through a two-stage approach: (1) supervised fine-tuning (SFT) as a cold start with structured chain-of-thought reasoning patterns, followed by (2) reinforcement learning via GRPO to further refine these capabilities. Our extensive experiments show that this combined approach consistently outperforms both SFT-only and RL-only methods across challenging multimodal reasoning benchmarks. The resulting models achieve state-of-the-art performance among open-source MLLMs at both 3B and 7B scales, with our 7B model showing substantial improvements over base models (e.g., 66.3 %$\rightarrow$73.4 % on MathVista, 62.9 %$\rightarrow$70.4 % on We-Math) and our 3B model achieving performance competitive with several 7B models. Overall, this work provides practical guidance for building advanced multimodal reasoning models. Our code is available at https://github.com/waltonfuture/RL-with-Cold-Start.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.22334

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Filters

Collaborating Authors

cold start

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

DropoutNet: Addressing Cold Start in Recommender Systems

DropoutNet: Addressing Cold Start in Recommender Systems

DropoutNet: Addressing Cold Start in Recommender Systems

Solving cold start in news recommendations: a RippleNet-based system for large scale media outlet

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Revealing Potential Biases in LLM-Based Recommender Systems in the Cold Start Setting

Privacy Preserving Inference of Personalized Content for Out of Matrix Users

CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems

Addressing Cold Start For next-article Recommendation

Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start