Collaborating Authors

1.5 TB dataset of anonymized user interactions released by Yahoo


The Yahoo News Feed dataset is a collection based on a sample of anonymized user interactions on the news feeds of several Yahoo properties, including the Yahoo homepage, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Movies, and Yahoo Real Estate. The dataset stands at a massive 110B lines (1.5TB bzipped) of user-news item interaction data, collected by recording the user- news item interaction of about 20M users from February 2015 to May 2015. In addition to the interaction data, we are providing the demographic information (age segment and gender) and the city in which the user is based for a subset of the anonymized users. On the item side, we are releasing the title, summary, and key-phrases of the pertinent news article. The interaction data is timestamped with the user's local time and also contains partial information of the device on which the user accessed the news feeds, which allows for interesting work in contextual recommendation and temporal data mining.


AAAI Conferences

Context personalisation is a flourishing area of research with many applications. Context personalisation systems usually employ a user model to predict the appeal of the context to a particular user given a history of interactions. Most of the models used are context-dependent and their applicability is usually limited to the system and the data used for model construction. Establishing models of user experience that are highly scalable while maintaing the performance constitutes an important research direction. In this paper, we propose generic models of user experience in the computer games domain.

Interaction-aware Factorization Machines for Recommender Systems Machine Learning

Factorization Machine (FM) is a widely used supervised learning approach by effectively modeling of feature interactions. Despite the successful application of FM and its many deep learning variants, treating every feature interaction fairly may degrade the performance. For example, the interactions of a useless feature may introduce noises; the importance of a feature may also differ when interacting with different features. In this work, we propose a novel model named \emph{Interaction-aware Factorization Machine} (IFM) by introducing Interaction-Aware Mechanism (IAM), which comprises the \emph{feature aspect} and the \emph{field aspect}, to learn flexible interactions on two levels. The feature aspect learns feature interaction importance via an attention network while the field aspect learns the feature interaction effect as a parametric similarity of the feature interaction vector and the corresponding field interaction prototype. IFM introduces more structured control and learns feature interaction importance in a stratified manner, which allows for more leverage in tweaking the interactions on both feature-wise and field-wise levels. Besides, we give a more generalized architecture and propose Interaction-aware Neural Network (INN) and DeepIFM to capture higher-order interactions. To further improve both the performance and efficiency of IFM, a sampling scheme is developed to select interactions based on the field aspect importance. The experimental results from two well-known datasets show the superiority of the proposed models over the state-of-the-art methods.

SPACE: A Simulator for Physical Interactions and Causal Learning in 3D Environments Artificial Intelligence

Recent advancements in deep learning, computer vision, and embodied AI have given rise to synthetic causal reasoning video datasets. These datasets facilitate the development of AI algorithms that can reason about physical interactions between objects. However, datasets thus far have primarily focused on elementary physical events such as rolling or falling. There is currently a scarcity of datasets that focus on the physical interactions that humans perform daily with objects in the real world. To address this scarcity, we introduce SPACE: A Simulator for Physical Interactions and Causal Learning in 3D Environments. The SPACE simulator allows us to generate the SPACE dataset, a synthetic video dataset in a 3D environment, to systematically evaluate physics-based models on a range of physical causal reasoning tasks. Inspired by daily object interactions, the SPACE dataset comprises videos depicting three types of physical events: containment, stability and contact. These events make up the vast majority of the basic physical interactions between objects. We then further evaluate it with a state-of-the-art physics-based deep model and show that the SPACE dataset improves the learning of intuitive physics with an approach inspired by curriculum learning. Repository:

SAS Python Interaction


The objective of this article is to understand Python 3.x interaction with SAS 9.4 university edition. Read SAS datasets using python pandas library and manipulate datasets and write the result back to SAS. SAS University Edition is free SAS software that can be used for teaching and learning statistics and quantitative methods. The scope of this article is however limited to ETL operations with SAS and Python. A SAS library is a collection of one or more SAS files/datasets that are recognized by SAS and that are referenced and stored as a unit. Whenever a new session is created SAS automatically creates two libraries Work, temporary library, and SASUSER, permanent library.