Liu, Chenghao
DualNet: Continual Learning, Fast and Slow
Pham, Quang, Liu, Chenghao, Hoi, Steven
According to Complementary Learning Systems (CLS) theory~\citep{mcclelland1995there} in neuroscience, humans achieve effective \emph{continual learning} through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of specifics and individual experiences, and a slow learning system located in the neocortex for the gradual acquisition of structured knowledge about the environment. Motivated by this theory, we propose a novel continual learning framework named "DualNet", which comprises a fast learning system for supervised learning of pattern-separated representations from specific tasks and a slow learning system for learning task-agnostic, general representations via Self-Supervised Learning (SSL). The fast and slow learning systems are complementary and work seamlessly in a holistic continual learning framework. Our extensive experiments on two challenging continual learning benchmarks, CORe50 and miniImageNet, show that DualNet outperforms state-of-the-art continual learning methods by a large margin. We further conduct ablation studies of different SSL objectives to validate DualNet's efficacy, robustness, and scalability. Code will be made available upon acceptance.
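As a rough illustration of the fast/slow split, here is a minimal sketch: the module names, layer sizes, and the way the fast head consumes slow features are placeholders for exposition, not the paper's exact design, and the SSL objective for the slow system is omitted.

```python
import torch
import torch.nn as nn

class SlowLearner(nn.Module):
    """Task-agnostic representation learner, trained with an SSL objective."""
    def __init__(self, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x):
        return self.encoder(x)

class FastLearner(nn.Module):
    """Supervised head that rapidly adapts the slow features to the current task."""
    def __init__(self, dim=128, n_classes=10):
        super().__init__()
        self.head = nn.Linear(dim, n_classes)

    def forward(self, z):
        return self.head(z)

slow, fast = SlowLearner(), FastLearner()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits = fast(slow(x))                                 # fast system consumes slow features
sup_loss = nn.functional.cross_entropy(logits, y)      # supervised loss updates the fast system
# The slow system would additionally be updated with an SSL loss computed on
# augmented views of x; that objective is omitted here for brevity.
```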
Modeling Dynamic Attributes for Next Basket Recommendation
Chen, Yongjun, Li, Jia, Liu, Chenghao, Li, Chenxi, Anderle, Markus, McAuley, Julian, Xiong, Caiming
Traditional approaches to next-item and next-basket recommendation typically extract users' interests from their past interactions and associated static contextual information (e.g., a user id or item category). However, the extracted interests can be inaccurate and become obsolete. Dynamic attributes, such as user income and item prices, change over time, and such dynamics can intrinsically reflect the evolution of users' interests. We argue that modeling these dynamic attributes can boost recommendation performance. However, properly integrating them into user interest models is challenging: attribute dynamics are diverse (e.g., time-interval-aware or periodic patterns), and they represent users' behaviors from different perspectives, possibly asynchronously with interactions. Beyond dynamic attributes, the items in each basket contain complex interdependencies that can be beneficial but are nontrivial to capture effectively. To address these challenges, we propose a novel Attentive network to model Dynamic attributes (named AnDa). AnDa encodes dynamic attributes and basket item sequences separately. We design a periodic-aware encoder that allows the model to capture various temporal patterns from dynamic attributes, and we propose an intra-basket attention module to effectively learn useful item relationships. Experimental results on three real-world datasets demonstrate that our method consistently outperforms the state of the art.
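To make the intra-basket idea concrete, a minimal sketch of self-attention over the items of a single basket follows; the function name and the plain scaled dot-product form are illustrative assumptions, not necessarily the paper's exact module.

```python
import torch
import torch.nn.functional as F

def intra_basket_attention(item_emb):
    """Self-attention over the items of one basket.

    item_emb: (n_items, d) embeddings of the items in a single basket.
    Returns contextualized item embeddings of the same shape.
    """
    d = item_emb.size(-1)
    scores = item_emb @ item_emb.T / d ** 0.5    # pairwise item affinities
    weights = F.softmax(scores, dim=-1)          # attention over co-occurring items
    return weights @ item_emb

basket = torch.randn(5, 64)   # a basket of 5 items with 64-dim embeddings
contextualized = intra_basket_attention(basket)
```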
Merlion: A Machine Learning Library for Time Series
Bhatnagar, Aadyot, Kassianik, Paul, Liu, Chenghao, Lan, Tian, Yang, Wenzhuo, Cassius, Rowan, Sahoo, Doyen, Arpit, Devansh, Subramanian, Sri, Woo, Gerald, Saha, Amrita, Jagota, Arun Kumar, Gopalakrishnan, Gokulakrishnan, Singh, Manpreet, Krithika, K C, Maddineni, Sukumar, Cho, Daeki, Zong, Bo, Zhou, Yingbo, Xiong, Caiming, Savarese, Silvio, Hoi, Steven, Wang, Huan
We introduce Merlion, an open-source machine learning library for time series. It features a unified interface for many commonly used models and datasets for anomaly detection and forecasting on both univariate and multivariate time series, along with standard pre/post-processing layers. It includes several modules to improve ease of use: visualization, anomaly score calibration to improve interpretability, AutoML for hyperparameter tuning and model selection, and model ensembling. Merlion also provides a unique evaluation framework that simulates the live deployment and re-training of a model in production. This library aims to provide engineers and researchers with a one-stop solution to rapidly develop models for their specific time series needs and benchmark them across multiple time series datasets. In this technical report, we highlight Merlion's architecture and major functionalities, and we report benchmark numbers across different baseline models and ensembles.
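A short usage sketch, following the quick-start interface shown in Merlion's public README (the CSV file name and the train/test split point are placeholders):

```python
import pandas as pd
from merlion.utils import TimeSeries
from merlion.models.defaults import DefaultDetector, DefaultDetectorConfig

# Load a time series into Merlion's TimeSeries container
df = pd.read_csv("my_metric.csv", index_col=0, parse_dates=True)
train, test = TimeSeries.from_pd(df[:1000]), TimeSeries.from_pd(df[1000:])

# Train a default anomaly detector and score the held-out data
model = DefaultDetector(DefaultDetectorConfig())
model.train(train_data=train)
anomaly_scores = model.get_anomaly_score(test)
```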
Learning Transferable Parameters for Long-tailed Sequential User Behavior Modeling
Yin, Jianwen, Liu, Chenghao, Wang, Weiqing, Sun, Jianling, Hoi, Steven C. H.
Sequential user behavior modeling plays a crucial role in online user-oriented services, such as product purchasing, news feed consumption, and online advertising. The performance of sequential modeling heavily depends on the scale and quality of historical behaviors, yet the number of user behaviors inherently follows a long-tailed distribution, which has seldom been explored. In this work, we argue that focusing on tail users can bring more benefits, and we address the long-tail issue by learning transferable parameters from both the optimization and feature perspectives. Specifically, we propose a gradient alignment optimizer and adopt an adversarial training scheme to facilitate knowledge transfer from head to tail users. These methods can also handle the cold-start problem for new users, and the framework can be directly applied to various well-established sequential models. Extensive experiments on four real-world datasets verify the superiority of our framework over state-of-the-art baselines.
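One common way to operationalize gradient alignment between two objectives is conflict-aware projection, sketched below; this is a generic scheme in the spirit of the paper's optimizer, not its exact update rule, and the function name is hypothetical.

```python
import torch

def align_gradients(g_tail, g_head):
    """Project the tail-user gradient so it does not conflict with the head.

    Both inputs are flattened 1-D parameter gradients. If they point in
    opposing directions (negative inner product), the conflicting component
    is removed, so tail updates do not undo knowledge learned from head users.
    """
    dot = torch.dot(g_tail, g_head)
    if dot < 0:
        g_tail = g_tail - dot / g_head.norm().pow(2) * g_head
    return g_tail

g_t, g_h = torch.randn(10), torch.randn(10)
g_t_aligned = align_gradients(g_t, g_h)
```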
Decentralized Knowledge Graph Representation Learning
Guo, Lingbing, Wang, Weiqing, Sun, Zequn, Liu, Chenghao, Hu, Wei
Knowledge graph (KG) representation learning methods have achieved competitive performance in many KG-oriented tasks. The best among them are usually based on graph neural networks (GNNs), a powerful family of networks that learn the representation of an entity by aggregating the features of its neighbors and itself. However, many KG representation learning scenarios only provide structure information describing the relationships among entities, so that entities have no input features. In this case, existing aggregation mechanisms cannot induce embeddings for unseen entities, as these entities have no pre-defined features to aggregate. In this paper, we present a decentralized KG representation learning approach, decentRL, which encodes each entity from, and only from, the embeddings of its neighbors. For optimization, we design an algorithm that distills knowledge from the model itself, so that the output embeddings continuously gain knowledge from the corresponding original embeddings. Extensive experiments show that the proposed approach performs better than many cutting-edge models on the entity alignment task and achieves competitive performance on the entity prediction task. Furthermore, under the inductive setting, it significantly outperforms all baselines on both tasks.
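A minimal sketch of the decentralized idea, where an entity's representation is built only from its neighbors' embeddings (mean pooling here is a placeholder aggregator; the paper's attention-style aggregator and self-distillation loss are omitted):

```python
import torch

def decentralized_embedding(neighbors, emb):
    """Represent an entity from (and only from) its neighbors' embeddings.

    neighbors: list of neighbor entity ids
    emb:       (n_entities, d) embedding table
    Note: the entity's own embedding never enters the aggregation, so unseen
    entities can still be encoded as long as their neighbors are known.
    """
    return emb[torch.tensor(neighbors)].mean(dim=0)

emb = torch.randn(100, 32)
z = decentralized_embedding(neighbors=[3, 15, 42], emb=emb)
```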
Graph Prototypical Networks for Few-shot Learning on Attributed Networks
Ding, Kaize, Wang, Jianling, Li, Jundong, Shu, Kai, Liu, Chenghao, Liu, Huan
Attributed networks are now ubiquitous in a myriad of high-impact applications, such as social network analysis, financial fraud detection, and drug discovery. As a central analytical task on attributed networks, node classification has received much attention in the research community. In real-world attributed networks, a large portion of node classes contain only limited labeled instances, rendering a long-tailed node class distribution, and existing node classification algorithms are unequipped to handle these \textit{few-shot} node classes. As a remedy, few-shot learning has attracted a surge of attention in the research community. Yet few-shot node classification remains a challenging problem, as we need to address the following questions: (i) How do we extract meta-knowledge from an attributed network for few-shot node classification? (ii) How do we identify the informativeness of each labeled instance for building a robust and effective model? To answer these questions, in this paper, we propose a graph meta-learning framework -- Graph Prototypical Networks (GPN). By constructing a pool of semi-supervised node classification tasks to mimic the real test environment, GPN is able to perform \textit{meta-learning} on an attributed network and derive a highly generalizable model for handling the target classification task. Extensive experiments demonstrate the superior capability of GPN in few-shot node classification.
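For reference, the standard prototypical-network computation that underlies this family of methods, sketched below: class prototypes are mean support embeddings and queries are classified by distance. GPN additionally weights support instances by informativeness, which this generic sketch omits.

```python
import torch

def prototypes(support_emb, support_y, n_classes):
    """Class prototype = mean embedding of that class's labeled support nodes."""
    return torch.stack([support_emb[support_y == c].mean(dim=0)
                        for c in range(n_classes)])

def classify(query_emb, protos):
    """Assign each query node via softmax over negative prototype distances."""
    dists = torch.cdist(query_emb, protos)   # (n_query, n_classes)
    return (-dists).softmax(dim=-1)          # class probabilities

sup = torch.randn(10, 16)
y = torch.tensor([0] * 5 + [1] * 5)          # 5 support nodes per class
probs = classify(torch.randn(4, 16), prototypes(sup, y, n_classes=2))
```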
Bilevel Continual Learning
Pham, Quang, Sahoo, Doyen, Liu, Chenghao, Hoi, Steven C. H.
Continual learning aims to learn continuously from a stream of tasks and data in an online fashion, exploiting what was learned previously to improve current and future tasks while still performing well on past tasks. One common limitation of many existing continual learning methods is that they often train a model directly on all available training data without validation, due to the nature of continual learning, and thus suffer poor generalization at test time. In this work, we present a novel continual learning framework named "Bilevel Continual Learning" (BCL), which unifies a bilevel optimization objective with a dual memory management strategy comprising both an episodic memory and a generalization memory, to achieve effective knowledge transfer to future tasks while alleviating catastrophic forgetting on old tasks. Our extensive experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
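Schematically, a bilevel objective of this kind takes the following generic form (a sketch of the general structure, not BCL's exact objective): the outer level optimizes against a validation/generalization memory, subject to the inner level fitting the current task and episodic memory.

```latex
\begin{aligned}
\min_{\theta} \;\; & \mathcal{L}_{\text{val}}\big(\phi^{*}(\theta)\big) \\
\text{s.t.} \;\; & \phi^{*}(\theta) \in \arg\min_{\phi} \; \mathcal{L}_{\text{train}}(\phi; \theta)
\end{aligned}
```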
Graph Neural Networks with High-order Feature Interactions
Ding, Kaize, Li, Yichuan, Li, Jundong, Liu, Chenghao, Liu, Huan
Network representation learning, a fundamental research problem that aims to learn low-dimensional node representations on graph-structured data, has been extensively studied in the research community. By generalizing the power of neural networks to graph-structured data, graph neural networks (GNNs) achieve superior capability in network representation learning. However, the node features of many real-world graphs can be high-dimensional and sparse, rendering the node representations learned by existing GNN architectures less expressive. The main reason is that these models directly use the raw node features as input for message passing and have limited power to capture sophisticated interactions between features. In this paper, we propose a novel GNN framework for learning node representations that incorporate high-order feature interactions on feature-sparse graphs. Specifically, the proposed message aggregator and feature factorizer extract two channels of embeddings from the feature-sparse graph, characterizing the aggregated node features and the high-order feature interactions, respectively. Furthermore, we develop an attentive fusion network to seamlessly combine the information from the two channels and learn feature-interaction-aware node representations. Extensive experiments on various datasets demonstrate the effectiveness of the proposed framework on a variety of graph learning tasks.
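A feature factorizer of this kind is reminiscent of a factorization machine; the sketch below shows the standard FM pairwise-interaction identity that makes second-order interactions tractable on sparse features (an illustrative building block, not necessarily the paper's exact factorizer).

```python
import torch

def fm_pairwise_interactions(x, V):
    """Second-order factorized feature interactions in O(k*d) via the FM identity:
    sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * (||sum_i x_i v_i||^2 - sum_i x_i^2 ||v_i||^2).

    x: (d,) feature vector;  V: (d, k) feature factor matrix.
    """
    xv = x.unsqueeze(-1) * V                                  # (d, k) scaled factors
    return 0.5 * (xv.sum(0).pow(2) - xv.pow(2).sum(0)).sum()  # scalar interaction score

x, V = torch.randn(1000), torch.randn(1000, 8)
interaction_score = fm_pairwise_interactions(x, V)
```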
SL$^2$MF: Predicting Synthetic Lethality in Human Cancers via Logistic Matrix Factorization
Liu, Yong, Wu, Min, Liu, Chenghao, Li, Xiao-Li, Zheng, Jie
Synthetic lethality (SL) is a promising concept for the discovery of novel anti-cancer drug targets. However, wet-lab experiments for detecting SLs face various challenges, such as high cost and low consistency across platforms or cell lines; computational prediction methods are therefore needed. This paper proposes a novel SL prediction method, named SL2MF, which employs logistic matrix factorization to learn latent representations of genes from observed SL data. The probability that two genes form an SL pair is modeled via a linear combination of the gene latent vectors. As known SL pairs are more trustworthy than unknown pairs, we design importance weighting schemes that assign higher importance weights to known SL pairs and lower weights to unknown pairs in SL2MF. Moreover, we incorporate biological knowledge about genes from protein-protein interaction (PPI) data and the Gene Ontology (GO). In particular, we calculate the similarity between genes based on their GO annotations and their topological properties in the PPI network. Extensive experiments on SL interaction data from the SynLethDB database demonstrate the effectiveness of SL2MF.
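For intuition, the standard logistic matrix factorization form consistent with this description is sketched below, where $\mathbf{u}_i, \mathbf{u}_j$ are gene latent vectors, $y_{ij}$ the observed SL label, and $c_{ij}$ the importance weight (a generic formulation, not necessarily SL2MF's exact objective):

```latex
p_{ij} = \sigma\!\big(\mathbf{u}_i^{\top}\mathbf{u}_j\big)
       = \frac{1}{1 + \exp(-\mathbf{u}_i^{\top}\mathbf{u}_j)},
\qquad
\max_{U} \; \sum_{i<j} c_{ij}\Big[\, y_{ij}\log p_{ij} + (1-y_{ij})\log(1-p_{ij}) \Big]
```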
Unified Locally Linear Classifiers With Diversity-Promoting Anchor Points
Liu, Chenghao, Zhang, Teng, Zhao, Peilin, Sun, Jianling, Hoi, Steven C. H.
Locally Linear Support Vector Machines (LLSVM) have been actively used in classification tasks due to their capability of classifying nonlinear patterns. However, existing LLSVM suffers from two drawbacks: (1) a particular and appropriate regularization for LLSVM has not yet been addressed; (2) it usually adopts a three-stage learning scheme: learning anchor points by clustering, learning local coding coordinates by a predefined coding scheme, and finally training the classifiers. We argue that this decoupled approach oversimplifies the original optimization problem, resulting in a large deviation due to the disparate purpose of each step. To address the first issue, we propose a novel diversified regularization that captures infrequent patterns and reduces the model size without sacrificing representation power. Based on this regularization, we develop a joint optimization algorithm over anchor points, local coding coordinates, and classifiers to simultaneously minimize the overall classification risk, termed Diversified and Unified Locally Linear Support Vector Machine (DU-LLSVM for short). To the best of our knowledge, DU-LLSVM is the first principled method that directly learns sparse local coding, and it can be easily generalized to other supervised learning models. Extensive experiments show that DU-LLSVM consistently surpasses several state-of-the-art methods with a predefined local coding scheme (e.g., LLSVM) or supervised anchor point learning (e.g., SAPL-LLSVM).
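For reference, the standard locally linear SVM decision function that DU-LLSVM learns jointly is sketched below, where $\gamma_k(\mathbf{x})$ are the local coding coordinates of $\mathbf{x}$ with respect to the anchor points and $(\mathbf{w}_k, b_k)$ is the local classifier at anchor $k$ (the general LLSVM form, hedged as background rather than the paper's full joint objective):

```latex
f(\mathbf{x}) = \sum_{k=1}^{K} \gamma_k(\mathbf{x})\,\big(\mathbf{w}_k^{\top}\mathbf{x} + b_k\big),
\qquad \sum_{k=1}^{K} \gamma_k(\mathbf{x}) = 1
```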