AITopics | rating data


Urban Incident Prediction with Graph Neural Networks: Integrating Government Ratings and Crowdsourced Reports

Balachandar, Sidhika, Sadhuka, Shuvom, Berger, Bonnie, Pierson, Emma, Garg, Nikhil

arXiv.org Artificial Intelligence, Nov-17-2025

Graph neural networks (GNNs) are widely used in urban spatiotemporal forecasting, such as predicting infrastructure problems. In this setting, government officials wish to know in which neighborhoods incidents like potholes or rodent issues occur. The true state of incidents (e.g., street conditions) for each neighborhood is observed via government inspection ratings. However, these ratings are only conducted for a sparse set of neighborhoods and incident types. We also observe the state of incidents via crowdsourced reports, which are more densely observed but may be biased due to heterogeneous reporting behavior. First, for such settings, we propose a multiview, multioutput GNN-based model that uses both unbiased rating data and biased reporting data to predict the true latent state of incidents. Second, we investigate a case study of New York City urban incidents and collect, standardize, and make publicly available a dataset of 9,615,863 crowdsourced reports and 1,041,415 government inspection ratings over 3 years and across 139 types of incidents. Finally, we show on both real and semi-synthetic data that our model can better predict the latent state compared to models that use only reporting data or models that use only rating data, especially when rating data is sparse and reports are predictive of ratings. We also quantify demographic biases in crowdsourced reporting, e.g., higher-income neighborhoods report problems at higher rates. Our analysis showcases a widely applicable approach for latent state prediction using heterogeneous, sparse, and biased data.

full model, inspection, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.0874

Country: North America > United States > New York (0.34)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Transportation > Ground > Road (0.67)
Transportation > Infrastructure & Services (0.67)
(2 more...)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Discovering Semantic Subdimensions through Disentangled Conceptual Representations

Zhang, Yunhao, Wang, Shaonan, Lin, Nan, Dong, Xinyi, Li, Chong, Zong, Chengqing

arXiv.org Artificial Intelligence, Sep-22-2025

Understanding the core dimensions of conceptual semantics is fundamental to uncovering how meaning is organized in language and the brain. Existing approaches often rely on predefined semantic dimensions that offer only broad representations, overlooking finer conceptual distinctions. This paper proposes a novel framework to investigate the subdimensions underlying coarse-grained semantic dimensions. Specifically, we introduce a Disentangled Continuous Semantic Representation Model (DCSRM) that decomposes word embeddings from large language models into multiple sub-embeddings, each encoding specific semantic information. Using these sub-embeddings, we identify a set of interpretable semantic subdimensions. To assess their neural plausibility, we apply voxel-wise encoding models to map these subdimensions to brain activation. Our work offers more fine-grained interpretable semantic subdimensions of conceptual meaning. Further analyses reveal that semantic dimensions are structured according to distinct principles, with polarity emerging as a key factor driving their decomposition into subdimensions. The neural correlates of the identified subdimensions support their cognitive and neuroscientific plausibility.

dimension, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.21436

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Crowdsourcing with Difficulty: A Bayesian Rating Model for Heterogeneous Items

Han, Seong Woo, Adıgüzel, Ozan, Carpenter, Bob

arXiv.org Machine Learning, May-29-2024

In applied statistics and machine learning, the "gold standards" used for training are often biased and almost always noisy. Dawid and Skene's justifiably popular crowdsourcing model adjusts for rater (coder, annotator) sensitivity and specificity, but fails to capture distributional properties of rating data gathered for training, which in turn biases training. In this study, we introduce a general purpose measurement-error model with which we can infer consensus categories by adding item-level effects for difficulty, discriminativeness, and guessability. We further show how to constrain the bimodal posterior of these models to avoid (or if necessary, allow) adversarial raters. We validate our model's goodness of fit with posterior predictive checks, the Bayesian analogue of $\chi^2$ tests. Dawid and Skene's model is rejected by goodness of fit tests, whereas our new model, which adjusts for item heterogeneity, is not rejected. We illustrate our new model with two well-studied data sets, binary rating data for caries in dental X-rays and implication in natural language.

logit 1, probability, rater, (15 more...)

arXiv.org Machine Learning

2405.19521

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report > New Finding (0.49)

Industry:

Health & Medicine (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.87)
(3 more...)
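
The abstract above builds on Dawid and Skene's model, which describes each rater by a sensitivity and a specificity. As a minimal sketch of that baseline only (not the paper's extended model with item-level difficulty, discriminativeness, and guessability), the following computes the posterior probability that an item's true category is positive, assuming the rater parameters and the prevalence are already known. The function name, the rater names, and all numbers are hypothetical and chosen for illustration.

```python
def posterior_positive(ratings, sensitivity, specificity, prevalence):
    """Return P(true category = 1 | ratings) for one binary item.

    ratings:     dict mapping rater name -> observed rating (0 or 1)
    sensitivity: dict mapping rater name -> P(rating = 1 | truth = 1)
    specificity: dict mapping rater name -> P(rating = 0 | truth = 0)
    prevalence:  prior P(truth = 1)
    """
    like_pos = prevalence          # joint weight for truth = 1
    like_neg = 1.0 - prevalence    # joint weight for truth = 0
    for rater, y in ratings.items():
        sens = sensitivity[rater]
        spec = specificity[rater]
        # Raters are assumed conditionally independent given the truth.
        like_pos *= sens if y == 1 else (1.0 - sens)
        like_neg *= (1.0 - spec) if y == 1 else spec
    return like_pos / (like_pos + like_neg)


# Hypothetical raters: two vote positive, one votes negative.
example = posterior_positive(
    ratings={"a": 1, "b": 1, "c": 0},
    sensitivity={"a": 0.9, "b": 0.8, "c": 0.6},
    specificity={"a": 0.95, "b": 0.7, "c": 0.6},
    prevalence=0.3,
)
print(round(example, 3))
```

In a full crowdsourcing model the sensitivities, specificities, and prevalence are themselves inferred from the rating data rather than supplied by hand.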


Discordance Minimization-based Imputation Algorithms for Missing Values in Rating Data

Park, Young Woong, Kim, Jinhak, Zhu, Dan

arXiv.org Machine Learning, Nov-7-2023

Ratings are frequently used to evaluate and compare subjects in various applications, from education to healthcare, because ratings provide succinct yet credible measures for comparing subjects. However, when multiple rating lists are combined or considered together, subjects often have missing ratings, because most rating lists do not rate every subject in the combined list. In this study, we propose analyses on missing value patterns using six real-world data sets in various applications, as well as the conditions for applicability of imputation algorithms. Based on the special structures and properties derived from the analyses, we propose optimization models and algorithms that minimize the total rating discordance across rating providers to impute missing ratings in the combined rating lists, using only the known rating information. The total rating discordance is defined as the sum of the pairwise discordance metric, which can be written as a quadratic function. Computational experiments based on real-world and synthetic rating data sets show that the proposed methods outperform the state-of-the-art general imputation methods in the literature in terms of imputation accuracy.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1007/s10994-023-06452-4

2311.04035

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Iowa > Story County > Ames (0.04)
Asia > South Korea > Gyeonggi-do > Suwon (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Media (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Applications of K Nearest Neighbor algorithm part 1 (Artificial Intelligence)

#artificialintelligence, Oct-25-2022, 06:20:22 GMT

Abstract: Demands for minimum parameter setup in machine learning models are desirable to avoid time-consuming optimization processes. The k-Nearest Neighbors is one of the most effective and straightforward models employed in numerous problems. Despite its well-known performance, it requires the value of k for specific data distribution, thus demanding expensive computational efforts. This paper proposes a k-Nearest Neighbors classifier that bypasses the need to define the value of k. The model computes the k value adaptively considering the data distribution of the training set.

differential privacy, nearest neighbor algorithm part1, neighbor, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
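
For context on the entry above, here is a minimal sketch of a standard k-Nearest Neighbors classifier with a fixed, user-supplied k. The abstract does not say how the proposed model chooses k adaptively, so that part is not reproduced; the function name and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each Euclidean distance with its label, then sort by distance only.
    neighbors = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5), k=3))
print(knn_predict(points, labels, (5.5, 5.5), k=3))
```

The prediction for a query between the clusters can change with k, which is the sensitivity that motivates choosing k from the data instead of by hand.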


Make your Own Book and Movie Recommender System using Surprise

#artificialintelligence, Aug-10-2021, 00:05:23 GMT

Surprise (stands for Simple Python RecommendatIon System Engine) is a Python library for building and analyzing recommender systems that deal with explicit rating data. It provides various ready-to-use prediction algorithms such as baseline algorithms, neighborhood methods, matrix factorization-based (SVD, PMF, SVD++, NMF), and many others. Also, various similarity measures (cosine, MSD, Pearson…) are built-in. In both, I will use collaborative filtering techniques and content-based techniques to filter items, and no worries, I will explain the differences between them. I use here the MovieLens dataset.

book and movie recommender system, dataset, user profile, (9 more...)

#artificialintelligence

Industry:

Media > Film (0.34)
Leisure & Entertainment (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)


Leveraging Cross Feedback of User and Item Embeddings for Variational Autoencoder based Collaborative Filtering

Jin, Yuan, Zhao, He, Liu, Ming, Du, Lan, Li, Yunfeng, Xu, Ruohua, Gao, Longxiang

arXiv.org Machine Learning, Feb-21-2020

Matrix factorization (MF) has been widely applied to collaborative filtering in recommendation systems. Its Bayesian variants can derive posterior distributions of user and item embeddings, and are more robust to sparse ratings. However, the Bayesian methods are restricted by their update rules for the posterior parameters due to the conjugacy of the priors and the likelihood. Neural networks can potentially address this issue by capturing complex mappings between the posterior parameters and the data. In this paper, we propose a variational auto-encoder based Bayesian MF framework. It leverages not only the data but also the information from the embeddings to approximate their joint posterior distribution. The approximation is an iterative procedure with cross feedback of user and item embeddings to the others' encoders. More specifically, user embeddings sampled in the previous iteration, alongside their ratings, are fed back into the item-side encoders to compute the posterior parameters for the item embeddings in the current iteration, and vice versa. The decoder network then reconstructs the data using the MF with the currently re-sampled user and item embeddings. We show the effectiveness of our framework in terms of reconstruction errors across five real-world datasets. We also perform ablation studies to illustrate the importance of the cross feedback component of our framework in lowering the reconstruction errors and accelerating the convergence.

dataset, information, user and item, (14 more...)

arXiv.org Machine Learning

2002.09145

Country: North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
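
The abstract above starts from plain matrix factorization (MF) for collaborative filtering. The sketch below illustrates only that starting point: a non-Bayesian MF fitted with stochastic gradient descent on observed ratings. It is not the paper's variational auto-encoder framework and has no cross feedback between encoders. The function names, hyperparameters, and toy ratings are all assumptions.

```python
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's embeddings."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared reconstruction error over the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


def init_factors(n_users, n_items, rank, seed=0):
    """Small random positive starting embeddings for users (P) and items (Q)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    return P, Q


def fit(ratings, P, Q, lr=0.02, reg=0.02, epochs=500):
    """Update P and Q in place with SGD on squared error plus L2 penalty."""
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Toy (user, item, rating) triples; most user-item pairs are unobserved.
data = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
        (2, 1, 2), (2, 2, 5), (3, 0, 5), (3, 2, 2)]
P, Q = init_factors(n_users=4, n_items=3, rank=2)
before = rmse(data, P, Q)
fit(data, P, Q)
after = rmse(data, P, Q)
print(after < before)
```

Bayesian variants replace these point estimates of the embeddings with posterior distributions, which is where the approach described in the abstract departs from this sketch.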


Explainable Restricted Boltzmann Machines for Collaborative Filtering

Abdollahi, Behnoush, Nasraoui, Olfa

arXiv.org Machine Learning, Jun-22-2016

Most accurate recommender systems are black-box models, hiding the reasoning behind their recommendations. Yet explanations have been shown to increase the user's trust in the system in addition to providing other benefits such as scrutability, meaning the ability to verify the validity of recommendations. This gap between accuracy and transparency or explainability has generated an interest in automated explanation generation methods. Restricted Boltzmann Machines (RBM) are accurate models for CF that also lack interpretability. In this paper, we focus on RBM based collaborative filtering recommendations, and further assume the absence of any additional data source, such as item content or user attributes. We thus propose a new Explainable RBM technique that computes the top-n recommendation list from items that are explainable. Experimental results show that our method is effective in generating accurate and explainable recommendations.

artificial intelligence, machine learning, recommendation, (17 more...)

arXiv.org Machine Learning

1606.07129

Country: North America > United States (0.30)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.63)


Hierarchical Compound Poisson Factorization

Basbug, Mehmet E., Engelhardt, Barbara E.

arXiv.org Machine Learning, May-26-2016

Non-negative matrix factorization models based on a hierarchical Gamma-Poisson structure capture user and item behavior effectively in extremely sparse data sets, making them the ideal choice for collaborative filtering applications. Hierarchical Poisson factorization (HPF) in particular has proved successful for scalable recommendation systems with extreme sparsity. HPF, however, suffers from a tight coupling of sparsity model (absence of a rating) and response model (the value of the rating), which limits the expressiveness of the latter. Here, we introduce hierarchical compound Poisson factorization (HCPF) that has the favorable Gamma-Poisson structure and scalability of HPF to high-dimensional extremely sparse matrices. More importantly, HCPF decouples the sparsity model from the response model, allowing us to choose the most suitable distribution for the response. HCPF can capture binary, non-negative discrete, non-negative continuous, and zero-inflated continuous responses. We compare HCPF with HPF on nine discrete and three continuous data sets and conclude that HCPF captures the relationship between sparsity and response better than HPF.

artificial intelligence, machine learning, response model, (18 more...)

arXiv.org Machine Learning

1604.03853

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Media (0.71)
Leisure & Entertainment (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)


Collaborative Filtering Tutorials Across Languages

@machinelearnbot, Mar-28-2016, 00:35:07 GMT

Collaborative filtering is the process of filtering for information using techniques involving collaboration among multiple agents. Applications of collaborative filtering typically involve very large data sets. This article covers some good tutorials regarding collaborative filtering we came across in Python, Java and R. Crab engine aims to provide a rich set of components from which you can construct a customized recommender system from a set of algorithms. The tutorial is from official documentation of Crab. This article presents an implementation of the collaborative filtering algorithm, that filters information for a user based on a collection of user profiles.

algorithm, artificial intelligence, collaborative filtering tutorial, (6 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (0.38)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
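
The entry above describes collaborative filtering as filtering information for a user based on a collection of user profiles. One common way to do that, shown here as a minimal sketch and not as the implementation from any of the linked tutorials, is user-based filtering: predict a missing rating as a similarity-weighted average of other users' ratings. The function names and the toy profiles are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two rating dicts over their common items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(profiles, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user with nonzero similarity rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, ratings in profiles.items():
        if other == user or item not in ratings:
            continue
        sim = cosine_similarity(profiles[user], ratings)
        weighted_sum += sim * ratings[item]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None


# Toy user profiles: item -> rating (hypothetical values).
profiles = {
    "ann": {"A": 5, "B": 3},
    "bob": {"A": 5, "B": 3, "C": 4},
    "cat": {"A": 1, "B": 5, "C": 2},
}
print(predict_rating(profiles, "ann", "C"))
```

Because "ann" rates items A and B the same way "bob" does, the prediction for item C is pulled toward bob's rating more than toward cat's.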
