Do LLMs Memorize Recommendation Datasets? A Preliminary Study on MovieLens-1M

Di Palma, Dario, Merra, Felice Antonio, Sfilio, Maurizio, Anelli, Vito Walter, Narducci, Fedelucio, Di Noia, Tommaso

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have become increasingly central to recommendation scenarios due to their remarkable natural language understanding and generation capabilities. Although significant research has explored the use of LLMs for various recommendation tasks, little effort has been dedicated to verifying whether they have memorized public recommendation datasets as part of their training data. This is undesirable because memorization reduces the generalizability of research findings: benchmarking on memorized datasets does not guarantee generalization to unseen datasets. Furthermore, memorization can amplify biases; for example, some popular items may be recommended more frequently than others. In this work, we investigate whether LLMs have memorized public recommendation datasets. Specifically, we examine two model families (GPT and Llama) across multiple sizes, focusing on one of the most widely used datasets in recommender systems: MovieLens-1M. First, we define dataset memorization as the extent to which item attributes, user profiles, and user-item interactions can be retrieved by prompting the LLMs. Second, we analyze the impact of memorization on recommendation performance. Lastly, we examine whether memorization varies across model families and model sizes. Our results reveal that all models exhibit some degree of memorization of MovieLens-1M and that recommendation performance is related to the extent of memorization. We have made all the code publicly available at: https://github.com/sisinflab/LLM-MemoryInspector
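As a loose illustration of the probing idea described in the abstract (the paper's actual prompts and scoring live in the linked repository; the helper and records below are hypothetical), one can score how many attributes of a MovieLens-1M item a model reproduces:

```python
def attribute_recall(ground_truth: dict, model_answer: dict) -> float:
    """Fraction of known item attributes the model reproduced exactly.

    ground_truth: attributes from the dataset, e.g. {"title": ..., "genres": ...}
    model_answer: attributes parsed from the LLM's reply to a probe prompt
    """
    if not ground_truth:
        return 0.0
    hits = sum(1 for key, value in ground_truth.items()
               if model_answer.get(key) == value)
    return hits / len(ground_truth)

# Toy example with a MovieLens-1M-style record: the model recalls the
# title exactly but gets the genre string wrong, so recall is 0.5.
truth = {"title": "Toy Story (1995)", "genres": "Animation|Children's|Comedy"}
guess = {"title": "Toy Story (1995)", "genres": "Animation|Comedy"}
print(attribute_recall(truth, guess))  # 0.5
```

Averaging such per-item scores over the catalog gives one simple, interpretable notion of "extent of memorization" for item attributes; analogous checks apply to user profiles and interactions.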


PlanAlyzer

Communications of the ACM

We did not expect to see any real causal sufficiency errors, given the expert nature of the authors of PLANOUT-A. Rather, we expected to see some false positives, because PLANALYZER is aggressive about flagging potential causal sufficiency errors. We made this design choice because the cost of unrecorded confounders can be very high. PLANOUT scripts in deployment at Facebook represent a range of experimental designs: in the scripts we examined, we observed factorial designs, conditional assignment, within-subjects experiments, cluster random assignment, and bandit experiments. Real-world PLANOUT scripts unsurprisingly contained few errors, because they were primarily written and overseen by experts in experimental design. Therefore, to test how well PLANALYZER finds errors, we selected a subset of fifty scripts from PLANOUT-A and mutated them. We then validated a subset of the contrasts PLANALYZER produced against a corpus of hand-selected contrasts monitored and compared by an automated tool used at Facebook. Finally, we reported on PLANALYZER's performance, because its effectiveness requires accurately identifying meaningful contrasts within a reasonable amount of time.


Model Size Reduction Using Frequency Based Double Hashing for Recommender Systems

Zhang, Caojin, Liu, Yicun, Xie, Yuanpu, Ktena, Sofia Ira, Tejani, Alykhan, Gupta, Akshay, Myana, Pranay Kumar, Dilipkumar, Deepak, Paul, Suvadip, Ihara, Ikuhiro, Upadhyaya, Prasang, Huszar, Ferenc, Shi, Wenzhe

arXiv.org Machine Learning

Deep Neural Networks (DNNs) with sparse input features are widely used in recommender systems in industry. These models have large memory requirements and need huge amounts of training data. The large model size usually entails a cost, in the range of millions of dollars, for storage and communication with the inference services. In this paper, we propose a hybrid hashing method that combines frequency hashing and double hashing techniques for model size reduction without compromising performance. We evaluate the proposed models on two product surfaces. In both cases, experimental results demonstrate that we can reduce the model size by around 90% while keeping performance on par with the original baselines.
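A minimal sketch of the hybrid idea, as I read the abstract (the function names, the frequency cutoff, and the use of MD5 are illustrative assumptions, not the paper's implementation): frequent features keep a dedicated embedding row, while long-tail features share a smaller table addressed by two independent hashes, whose embeddings would typically be combined (e.g. summed) at lookup time.

```python
import hashlib

def _bucket(feature: str, salt: int, num_buckets: int) -> int:
    # Deterministic hash of the feature string into a fixed bucket range;
    # different salts give (approximately) independent hash functions.
    digest = hashlib.md5(f"{salt}:{feature}".encode()).hexdigest()
    return int(digest, 16) % num_buckets

def hybrid_id(feature: str, frequent: dict, num_buckets: int):
    """Map a sparse feature to embedding indices.

    frequent: top-k features by frequency -> dedicated row index
    returns ("dedicated", row) for head features, or
            ("shared", row_a, row_b) for tail features (double hashing).
    """
    if feature in frequent:
        return ("dedicated", frequent[feature])
    return ("shared",
            _bucket(feature, 0, num_buckets),
            _bucket(feature, 1, num_buckets))

frequent_ids = {"user_42": 0, "user_7": 1}  # hypothetical head of the frequency table
print(hybrid_id("user_42", frequent_ids, 1000))      # dedicated row
print(hybrid_id("user_999999", frequent_ids, 1000))  # pair of shared rows
```

The memory saving comes from sizing the shared table far below the raw vocabulary: collisions among rare features are tolerated because the pair of hashed rows still gives each feature a near-unique combined representation.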


Now available: Batch Recommendations in Amazon Personalize Amazon Web Services

#artificialintelligence

Today, we're very happy to announce that Amazon Personalize now supports batch recommendations. Launched at AWS re:Invent 2018, Personalize is a fully managed service that allows you to create private, customized recommendations for your applications, with little to no machine learning experience required. With Personalize, you provide the unique signals in your activity data (page views, sign-ups, purchases, and so forth) along with optional customer demographic information (age, location, etc.). You then provide the inventory of the items you want to recommend, such as articles, products, videos, or music. As explained in previous blog posts, you can use both historical data stored in Amazon Simple Storage Service (S3) and streaming data sent in real time from a JavaScript tracker or server side. Then, entirely under the covers, Personalize processes and examines the data, identifies what is meaningful, selects the right algorithms, and trains and optimizes a personalization model that is customized for your data and is accessible via an API that can be easily invoked by your business application. However, some customers have told us that batch recommendations would be a better fit for their use cases.
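For a rough sense of what a batch job looks like, the sketch below builds the request for the Personalize CreateBatchInferenceJob API (field names follow the public API; the ARNs, bucket paths, and job name are placeholders, and the actual call via boto3 is left commented out since it requires AWS credentials):

```python
def batch_job_request(job_name: str, solution_version_arn: str,
                      input_s3_path: str, output_s3_path: str,
                      role_arn: str) -> dict:
    """Assemble the parameters for CreateBatchInferenceJob: a trained
    solution version, an S3 file of input users/items, an S3 prefix for
    the output recommendations, and an IAM role with S3 access."""
    return {
        "jobName": job_name,
        "solutionVersionArn": solution_version_arn,
        "jobInput": {"s3DataSource": {"path": input_s3_path}},
        "jobOutput": {"s3DataDestination": {"path": output_s3_path}},
        "roleArn": role_arn,
    }

request = batch_job_request(
    "nightly-recs",
    "arn:aws:personalize:us-east-1:123456789012:solution/demo/version",
    "s3://my-bucket/input/users.json",
    "s3://my-bucket/output/",
    "arn:aws:iam::123456789012:role/PersonalizeS3Access",
)
# boto3.client("personalize").create_batch_inference_job(**request)
print(request["jobName"])
```

Personalize then scores every user or item in the input file offline and writes the recommendations back to the output S3 location, which suits use cases like nightly email campaigns better than per-request API calls.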