Modeling Dynamic Missingness of Implicit Feedback for Recommendation
Implicit feedback is widely used in collaborative filtering methods for recommendation. It is well known that implicit feedback contains a large number of values that are \emph{missing not at random} (MNAR), and that the missing data are a mixture of negative and unknown feedback, making it difficult to learn users' negative preferences. Recent studies modeled \emph{exposure}, a latent missingness variable indicating whether an item is missing to a user, to give each missing entry a confidence of being negative feedback. However, these studies use static models and ignore the information in temporal dependencies among items, which appears to be an essential factor underlying subsequent missingness. To model and exploit the dynamics of missingness, we propose a latent variable named ``\emph{user intent}'' to govern the temporal changes of item missingness, and a hidden Markov model to represent such a process. The resulting framework captures dynamic item missingness and incorporates it into matrix factorization (MF) for recommendation. We also explore two types of constraints to achieve a more compact and interpretable representation of \emph{user intents}. Experiments on real-world datasets demonstrate the superiority of our method against state-of-the-art recommender systems.
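The exposure idea this abstract builds on can be sketched as a confidence-weighted matrix factorization: observed interactions are trusted fully, while each missing entry is treated as a negative with some confidence. Below is a minimal static (WMF-style) sketch of that weighting, not the paper's HMM-based model; the toy matrix, the constant confidence, and all hyperparameters are illustrative placeholders.

```python
import numpy as np

# Toy implicit-feedback matrix: 1 = observed interaction, 0 = missing entry.
R = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [1, 1, 0, 0, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0]], dtype=float)

# Hypothetical exposure confidence: missing entries get a small, constant
# confidence of being true negatives (the paper infers this dynamically).
C = np.where(R > 0, 1.0, 0.2)

rng = np.random.default_rng(0)
k, lam, lr = 3, 0.1, 0.05
U = 0.1 * rng.standard_normal((R.shape[0], k))
V = 0.1 * rng.standard_normal((R.shape[1], k))

# Confidence-weighted SGD on the squared-error (WMF-style) objective.
for _ in range(200):
    for i in range(R.shape[0]):
        for j in range(R.shape[1]):
            err = R[i, j] - U[i] @ V[j]
            g = C[i, j] * err
            U[i], V[j] = (U[i] + lr * (g * V[j] - lam * U[i]),
                          V[j] + lr * (g * U[i] - lam * V[j]))

pred = U @ V.T  # scores for all entries; missing ones were down-weighted
```

Because missing entries carry low confidence, the factorization is pulled toward observed interactions while still treating unobserved entries as weak negatives, which is the behavior the dynamic model refines per user and time step.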
ac01e21bb14609416760f790dd8966ae-Supplemental-Datasets_and_Benchmarks.pdf
In the hospital, patients may be in the ICU with ECG/PPG sensors to monitor their already-poor health condition. ML methods must rely on learning to impute missing signals based on the signal that is present, rather than learning to create a general-purpose imputation template that mimics standard healthy behavior. Likewise, participant movement in both contexts can result in artifacts (e.g. ...). In a broader context, we want to match the high quality level of other datasets such as PTB-XL, in which 77.01% of the signal data are of the highest assessed quality [18]. See below for examples of ECG signals with their associated periodogram.
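The referenced figure is not reproduced here, but the periodogram it shows can be sketched directly. The snippet below computes a one-sided periodogram with numpy's FFT; the signal is a synthetic sinusoid-plus-noise stand-in (not data from the dataset), and the sampling rate is an assumed value.

```python
import numpy as np

fs = 250.0                     # assumed sampling rate in Hz (illustrative)
t = np.arange(0, 4, 1 / fs)    # 4 s synthetic "ECG-like" signal
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)

# One-sided periodogram: squared rFFT magnitude scaled to power density.
n = x.size
freqs = np.fft.rfftfreq(n, d=1 / fs)
pxx = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
pxx[1:-1] *= 2                 # fold in the discarded negative frequencies

peak = freqs[np.argmax(pxx[1:]) + 1]  # dominant frequency, skipping DC
```

For a real ECG trace, the dominant low-frequency peak of such a periodogram tracks the heart rate, which is one way signal quality can be assessed.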
Missing At Random as Covariate Shift: Correcting Bias in Iterative Imputation
Luke Shannon, Song Liu, Katarzyna Reluga
Accurate imputation of missing data is critical to downstream machine learning performance. We formulate missing data imputation as a risk minimisation problem, which highlights a covariate shift between the observed and unobserved data distributions. This covariate-shift-induced bias is not accounted for by popular imputation methods and leads to suboptimal performance. In this paper, we derive theoretically valid importance weights that correct for the induced distributional bias. Furthermore, we propose a novel imputation algorithm that jointly estimates both the importance weights and imputation models, enabling bias correction throughout the imputation process. Empirical results across benchmark datasets show reductions in root mean squared error and Wasserstein distance of up to 7% and 20%, respectively, compared to otherwise identical unweighted methods.
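The core correction the abstract describes can be sketched in three steps: estimate the missingness probability, turn it into importance weights, and fit the imputation model with those weights. The snippet below is a generic illustration of that idea on a one-covariate linear imputation, not the paper's joint estimation algorithm; the data-generating process, learning rates, and clipping threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.standard_normal(n)                    # fully observed covariate
y = 2.0 * x + 0.3 * rng.standard_normal(n)    # column to impute (true slope 2)

# MAR mechanism: y is more often missing where x is large, so observed rows
# and rows needing imputation have different x-distributions (covariate shift).
miss = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))
obs = ~miss

# 1) Estimate P(miss | x) with a hand-rolled logistic regression.
a, b = 0.0, 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(a * x + b)))
    a -= 0.1 * np.mean((p - miss) * x)
    b -= 0.1 * np.mean(p - miss)

# 2) Importance weights on observed rows: P(miss|x) / P(obs|x),
#    clipped to tame the heavy right tail of the weight distribution.
p_obs = 1.0 / (1.0 + np.exp(-(a * x[obs] + b)))
w = np.minimum(p_obs / (1.0 - p_obs), 50.0)

# 3) Weighted least squares: the imputation model is now fit under the
#    covariate distribution of the rows that actually need imputing.
X = np.column_stack([x[obs], np.ones(obs.sum())])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y[obs]))
y_imputed = beta[0] * x[miss] + beta[1]
```

Unweighted least squares would fit the observed-row distribution instead; when the imputation model is misspecified, that mismatch is exactly the bias the importance weights correct.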
16009ce3d8a6872d79f056c75618911d-Paper-Conference.pdf
Many important datasets contain samples that are missing one or more feature values. Maintaining the interpretability of machine learning models in the presence of such missing data is challenging. Singly or multiply imputing missing values complicates the model's mapping from features to labels. On the other hand, reasoning on indicator variables that represent missingness introduces a potentially large number of additional terms, sacrificing sparsity.
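The two options the excerpt contrasts can be made concrete: single imputation changes the feature values a model reasons over, while indicator variables double the number of columns a sparse model must consider. A minimal sketch of the indicator-augmented design matrix (toy data, mean imputation chosen only for illustration):

```python
import numpy as np

# Toy feature matrix with missing values encoded as NaN.
X = np.array([[1.0, np.nan],
              [2.0, 3.0],
              [np.nan, 4.0]])

# Missingness indicators: one extra binary column per original feature.
M = np.isnan(X).astype(float)

# Single (mean) imputation of the original columns.
col_means = np.nanmean(X, axis=0)
X_imp = np.where(np.isnan(X), col_means, X)

# Augmented design: imputed values plus indicators.  A sparse model now
# needs terms for both blocks, which is the interpretability cost noted above.
X_aug = np.hstack([X_imp, M])
```

A model fit on `X_aug` can distinguish "value was 1.5" from "value was imputed as 1.5", but at the price of twice as many candidate terms.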