AITopics

Country:

Asia > China > Hong Kong (0.04)
North America > Canada (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Neural Information Processing Systems, Dec-26-2025, 03:57:55 GMT

MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

learning, metaquant, penetrate non-differentiable quantization, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)

Neural Information Processing Systems, Aug-16-2025, 13:49:35 GMT

d072677d210ac4c03ba046120f0802ec-AuthorFeedback.pdf

We respond to the concerns point by point below. Why does distilling prioritized paths improve architecture rating? More sufficient training of subnets leads to a more accurate architecture rating [6] (Sec. 4.3). The set used to train the matching network? We will revise the manuscript to make this point clearer.

prioritized path, reviewer, top-1 acc, (16 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Neural Information Processing Systems, Oct-11-2024, 07:07:59 GMT

MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

The tremendous number of parameters makes deep neural networks impractical to deploy in edge-device-based real-world applications, owing to limits on computational power and storage space. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights (r's) into discrete values (q's) in a supervised training manner. However, the training process for quantization is non-differentiable, which leads to either infinite or zero gradients (g_r) w.r.t. r. To address this problem, most training-based quantization methods use the gradient w.r.t. q to approximate g_r. However, these methods only heuristically make training-based quantization applicable, without further analysis of how the approximated gradients can assist the training of a quantized network.

learning, metaquant, penetrate non-differentiable quantization, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)

arXiv.org Artificial Intelligence, Jun-18-2024

LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation

Wang, Yuhao, Wang, Yichao, Fu, Zichuan, Li, Xiangyang, Zhao, Xiangyu, Guo, Huifeng, Tang, Ruiming

As the demand for more personalized recommendation grows and commercial scenarios boom, multi-scenario recommendation (MSR), which uses the data from all scenarios to simultaneously improve their recommendation performance, has attracted much attention. However, existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cross-scenario preferences, leading to suboptimal performance and inadequate interpretability. Meanwhile, although large language models (LLMs) have shown great capability in reasoning and capturing semantic information, their high inference latency and the high computation cost of tuning hinder their use in industrial recommender systems. To fill these gaps, we propose LLM4MSR, an effective, efficient, and interpretable LLM-enhanced paradigm. Specifically, we first leverage an LLM to uncover multi-level knowledge, including scenario correlations and users' cross-scenario interests, from designed scenario- and user-level prompts without fine-tuning the LLM, then adopt hierarchical meta networks to generate multi-level meta layers that explicitly improve the scenario-aware and personalized recommendation capability. Our experiments on the KuaiSAR-small, KuaiSAR, and Amazon datasets validate three significant advantages of LLM4MSR: (i) effectiveness and compatibility with different multi-scenario backbone models (achieving 1.5%, 1%, and 40% AUC improvement on the three datasets), (ii) high efficiency and deployability on industrial recommender systems, and (iii) improved interpretability. The implemented code and data are available to ease reproduction.

artificial intelligence, large language model, natural language, (18 more...)

2406.12529

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Jan-11-2024

Early Warning Prediction with Automatic Labeling in Epilepsy Patients

Zhang, Peng, Gao, Ting, Guo, Jin, Duan, Jinqiao, Nikolenko, Sergey

Early warning for epilepsy patients is crucial for their safety and well-being, in particular to prevent or minimize the severity of seizures. Using the patients' EEG data, we propose a meta learning framework to improve the prediction of early ictal signals. The proposed bi-level optimization framework can help automatically label noisy data at the early ictal stage, as well as optimize the training accuracy of the backbone model. To validate our approach, we conduct a series of experiments to predict seizure onset in various long-term windows, with LSTM and ResNet implemented as the baseline models. Our study demonstrates that not only is the ictal prediction accuracy obtained by meta learning significantly improved, but the resulting model also captures some intrinsic patterns of the noisy data that a single backbone model could not learn. As a result, the predicted probability generated by the meta network serves as a highly effective early warning indicator.

indicator, seizure, transition, (16 more...)

2310.06059

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Epilepsy (1.00)
Health & Medicine > Therapeutic Area > Genetic Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Mar-4-2023

Meta Matrix Factorization for Federated Rating Predictions

Lin, Yujie, Ren, Pengjie, Chen, Zhumin, Ren, Zhaochun, Yu, Dongxiao, Ma, Jun, de Rijke, Maarten, Cheng, Xiuzhen

Federated recommender systems have distinct advantages in terms of privacy protection over traditional recommender systems that are centralized at a data center. However, previous work on federated recommender systems does not fully consider the limitations of storage, RAM, energy and communication bandwidth in a mobile environment. The proposed models are too large to run easily on mobile devices. Moreover, existing federated recommender systems need to fine-tune recommendation models on each device, making it hard to effectively exploit collaborative filtering information among users/devices. Our goal in this paper is to design a novel federated learning framework for rating prediction (RP) for mobile environments. We introduce a federated matrix factorization (MF) framework, named meta matrix factorization (MetaMF). Given a user, we first obtain a collaborative vector by collecting useful information with a collaborative memory module. Then, we employ a meta recommender module to generate private item embeddings and an RP model based on the collaborative vector in the server. To address the challenge of generating a large number of high-dimensional item embeddings, we devise a rise-dimensional generation strategy that first generates a low-dimensional item embedding matrix and a rise-dimensional matrix, and then multiplies them to obtain high-dimensional embeddings. We use the generated model to produce private RPs for the given user on her device. MetaMF shows a high capacity even with a small RP model, which can adapt to the limitations of a mobile environment. We conduct extensive experiments on four benchmark datasets to compare MetaMF with existing MF methods and find that MetaMF can achieve competitive performance. Moreover, we find that MetaMF achieves higher RP performance than existing federated methods by better exploiting collaborative filtering among users/devices.
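The rise-dimensional generation strategy described in this abstract can be illustrated with a minimal sketch. The function names, the dimensions, and the random values are assumptions made for illustration only: in MetaMF the two factor matrices are produced by the meta recommender module from the collaborative vector, whereas random numbers stand in for them here.

```python
import random

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def rise_dimensional_embeddings(num_items, low_dim, high_dim, seed=0):
    """Build high-dimensional item embeddings as (low-dim matrix) x (rise matrix).

    Random values are placeholders (an assumption) for what a generator
    network would output.
    """
    rng = random.Random(seed)
    low = [[rng.gauss(0.0, 1.0) for _ in range(low_dim)] for _ in range(num_items)]
    rise = [[rng.gauss(0.0, 1.0) for _ in range(high_dim)] for _ in range(low_dim)]
    return matmul(low, rise)

def generated_value_counts(num_items, low_dim, high_dim):
    """Return (values generated with factoring, values generated directly)."""
    factored = num_items * low_dim + low_dim * high_dim
    direct = num_items * high_dim
    return factored, direct

emb = rise_dimensional_embeddings(num_items=5, low_dim=2, high_dim=4)
print(len(emb), len(emb[0]))               # 5 items, each with a 4-dimensional embedding
print(generated_value_counts(1000, 8, 64)) # factored vs. direct number of generated values
```

With these illustrative sizes, the generator only has to output 8,512 values instead of 64,000, which is the motivation the abstract gives for factoring the embedding matrix.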

artificial intelligence, machine learning, metamf, (16 more...)

1910.10086

Country:

Asia > China > Shandong Province > Qingdao (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Lebanon (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, May-11-2021

Transfer-Meta Framework for Cross-domain Recommendation to Cold-Start Users

Zhu, Yongchun, Ge, Kaikai, Zhuang, Fuzhen, Xie, Ruobing, Xi, Dongbo, Zhang, Xu, Lin, Leyu, He, Qing

Cold-start problems are enormous challenges in practical recommender systems. One promising solution is cross-domain recommendation (CDR), which leverages rich information from an auxiliary (source) domain to improve the performance of the recommender system in the target domain. Among CDR approaches, the family of Embedding and Mapping methods for CDR (EMCDR) is very effective; it explicitly learns a mapping function from source embeddings to target embeddings using overlapping users. However, these approaches suffer from one serious problem: the mapping function is learned only on a limited set of overlapping users and is therefore biased toward them, which leads to unsatisfactory generalization ability and degrades performance on cold-start users in the target domain. Taking advantage of meta learning, which generalizes well to novel tasks, we propose a transfer-meta framework for CDR (TMCDR), which has a transfer stage and a meta stage. In the transfer (pre-training) stage, a source model and a target model are trained on the source and target domains, respectively. In the meta stage, a task-oriented meta network is learned to implicitly transform the user embedding in the source domain to the target feature space. In addition, TMCDR is a general framework that can be applied to various base models, e.g., MF, BPR, CML. Using data from Amazon and Douban, we conduct extensive experiments on 6 cross-domain tasks to demonstrate the superior performance and compatibility of TMCDR.
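The explicit mapping idea behind EMCDR-style methods, as summarized in this abstract, can be sketched as follows. This is a toy under stated assumptions: embeddings are one-dimensional, the mapping is a plain affine function fitted by least squares, and all names and numbers are hypothetical. It is not the TMCDR meta network, which the abstract describes as a learned, task-oriented replacement for such a mapping.

```python
def fit_affine_map(source, target):
    """Least-squares fit of target ~ w * source + b over overlapping users.

    source/target: lists of 1-D user embeddings (an illustrative simplification).
    """
    n = len(source)
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    var = sum((s - mean_s) ** 2 for s in source)
    w = cov / var
    b = mean_t - w * mean_s
    return w, b

# Hypothetical embeddings of users that appear in both domains.
overlap_source = [0.0, 1.0, 2.0, 3.0]
overlap_target = [1.0, 3.0, 5.0, 7.0]
w, b = fit_affine_map(overlap_source, overlap_target)

# A cold-start user has only a source-domain embedding; the fitted map
# supplies an estimated target-domain embedding.
cold_start_source = 10.0
predicted_target = w * cold_start_source + b
print(w, b, predicted_target)
```

Because the map is fitted only on the few overlapping users, a cold-start user far from them (as in this example) relies entirely on extrapolation, which is the generalization weakness the abstract points out.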

cold-start user, meta network, target domain, (12 more...)

doi: 10.1145/3404835.3463010

2105.04785

Country:

Asia > China > Beijing > Beijing (0.05)
North America > Canada (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Fujian Province > Xiamen (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chen, Shangyu, Wang, Wenya, Pan, Sinno Jialin

MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

Neural Information Processing Systems, Mar-18-2020, 22:02:00 GMT

The tremendous number of parameters makes deep neural networks impractical to deploy in edge-device-based real-world applications, owing to limits on computational power and storage space. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights ($r$'s) into discrete values ($q$'s) in a supervised training manner. However, the training process for quantization is non-differentiable, which leads to either infinite or zero gradients ($g_r$) w.r.t. $r$. To address this problem, most training-based quantization methods use the gradient w.r.t. $q$ to approximate $g_r$. However, these methods only heuristically make training-based quantization applicable, without further analysis of how the approximated gradients can assist the training of a quantized network. In this paper, we propose to learn $g_r$ by a neural network.
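A minimal sketch may help show why the gradient through a quantizer is either zero or unbounded, and what a hand-designed approximation looks like. Everything here is an illustrative assumption: the sign quantizer, the squared-error loss, and the clipped pass-through rule (commonly known as a straight-through estimator) are stand-ins and are not taken from this listing. MetaQuant, per the abstract, instead learns $g_r$ with a neural network, which this sketch does not implement.

```python
def quantize(r):
    """Map a full-precision weight r to a discrete value q in {-1.0, +1.0}."""
    return 1.0 if r >= 0.0 else -1.0

def finite_difference(f, x, eps=1e-4):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def passthrough_grad(g_q, r, clip=1.0):
    """Heuristic approximation of g_r: copy g_q, zeroed where |r| exceeds clip."""
    return g_q if abs(r) <= clip else 0.0

def train_step(r, x, y, lr=0.1):
    """One update of r for the toy loss L = (q * x - y) ** 2 with q = quantize(r)."""
    q = quantize(r)
    g_q = 2.0 * (q * x - y) * x       # exact gradient of L w.r.t. q
    g_r = passthrough_grad(g_q, r)    # approximated gradient w.r.t. r
    return r - lr * g_r

print(finite_difference(quantize, 0.3))  # flat region: the estimate is zero
print(finite_difference(quantize, 0.0))  # at the jump: the estimate grows as eps shrinks
print(quantize(train_step(0.3, x=1.0, y=-1.0)))  # the update flips q toward the target
```

The first two prints correspond to the "zero or infinite" behavior the abstract describes; the last shows that the approximated gradient can still move $r$ across the quantization threshold.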

learning, metaquant, penetrate non-differentiable quantization, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.84)

arXiv.org Machine Learning, Jun-12-2018

Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning

Pang, Kunkun, Dong, Mingzhi, Wu, Yang, Hospedales, Timothy

Active learning (AL) aims to enable training high-performance classifiers with low annotation cost by predicting which subset of unlabelled instances would be most beneficial to label. The importance of AL has motivated extensive research, proposing a wide variety of manually designed AL algorithms with diverse theoretical and intuitive motivations. In contrast to this body of research, we propose to treat active learning algorithm design as a meta-learning problem and learn the best criterion from data. We model an active learning algorithm as a deep neural network that takes the base learner state and the unlabelled point set as input and predicts the best point to annotate next. Training this active query policy network with reinforcement learning produces the best non-myopic policy for a given dataset. The key challenge in achieving a general solution to AL then becomes that of learner generalisation, particularly across heterogeneous datasets. We propose a multi-task dataset-embedding approach that allows dataset-agnostic active learners to be trained. Our evaluation shows that AL algorithms trained in this way can directly generalise across diverse problems.
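For contrast with the learned query policy this abstract proposes, the sketch below shows one classic manually designed criterion, uncertainty sampling for a binary classifier. It is an illustrative assumption, not the paper's method: the function names and the probabilities are hypothetical, and the learned policy would replace the fixed `uncertainty` score with the output of a trained network.

```python
def uncertainty(prob_positive):
    """Score in [0, 1]: 1 when the classifier is maximally unsure (p = 0.5)."""
    return 1.0 - 2.0 * abs(prob_positive - 0.5)

def select_query(probs):
    """Return the index of the unlabelled point with the highest uncertainty."""
    return max(range(len(probs)), key=lambda i: uncertainty(probs[i]))

# Hypothetical base-learner predictions on three unlabelled points.
unlabelled_probs = [0.9, 0.55, 0.1]
chosen = select_query(unlabelled_probs)
print(chosen)  # index of the point the annotator would be asked to label next
```

A fixed rule like this is myopic and dataset-agnostic only by construction, which is the limitation that motivates learning the criterion from data.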

artificial intelligence, machine learning, reinforcement learning, (13 more...)

1806.04798

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)