Sequence-Aware Factorization Machines for Temporal Predictive Analytics

arXiv.org Machine Learning

In various web applications like targeted advertising and recommender systems, the available categorical features (e.g., product type) are often of great importance but sparse. As a widely adopted solution, models based on Factorization Machines (FMs) are capable of modelling high-order interactions among features for effective sparse predictive analytics. As the volume of web-scale data grows exponentially over time, sparse predictive analytics inevitably involves dynamic and sequential features. However, existing FM-based models assume no temporal order in the data, and are unable to capture the sequential dependencies or patterns within the dynamic features, impeding the performance and adaptivity of these methods. Hence, in this paper, we propose a novel Sequence-Aware Factorization Machine (SeqFM) for temporal predictive analytics, which models feature interactions by fully investigating the effect of sequential dependencies. As static features (e.g., user gender) and dynamic features (e.g., user interacted items) express different semantics, we innovatively devise a multi-view self-attention scheme that separately models the effect of static features, dynamic features and the mutual interactions between static and dynamic features in three different views. In SeqFM, we further map the learned representations of feature interactions to the desired output with a shared residual network. To showcase the versatility and generalizability of SeqFM, we test SeqFM in three popular application scenarios for FM-based models, namely ranking, classification and regression tasks. Extensive experimental results on six large-scale datasets demonstrate the superior effectiveness and efficiency of SeqFM. As an important supervised learning scheme, predictive analytics plays a pivotal role in various applications, ranging from recommender systems [1], [2] to financial analysis [3] and online advertising [4], [5]. 
In practice, the goal of predictive analytics is to learn a mapping function from the observed variables (i.e., features) to the desired output. When dealing with categorical features in predictive analytics, a common approach is to convert such features into one-hot encodings [6]-[8] so that standard regressors like logistic regression [9] and support vector machines [10] can be directly applied. Due to the large number of possible category variables, the converted one-hot features are usually of high dimensionality but sparse [11], and simply using raw features rarely provides optimal results. The interactions among multiple raw features are usually termed as cross features [7] (a.k.a.
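To make the notion of feature interactions over sparse one-hot inputs concrete, here is a minimal sketch of the classical second-order Factorization Machine score. It is not the SeqFM model described above (no self-attention, no sequence awareness), and the names `fm_predict`, `w0`, `w`, and `V` are illustrative assumptions rather than anything taken from the paper.

```python
from typing import Sequence


def fm_predict(
    x: Sequence[float],
    w0: float,
    w: Sequence[float],
    V: Sequence[Sequence[float]],
) -> float:
    """Score one feature vector with a classical second-order FM.

    x  : feature vector (typically sparse one-hot encodings)
    w0 : global bias
    w  : one linear weight per feature
    V  : one k-dimensional latent factor per feature (hypothetical values)
    """
    n = len(x)
    k = len(V[0]) if V else 0

    # First-order (linear) term.
    linear = sum(w[i] * x[i] for i in range(n))

    # Pairwise term sum_{i<j} <V[i], V[j]> * x[i] * x[j], computed per latent
    # dimension as 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2).
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)

    return w0 + linear + pairwise


# Two active one-hot features (indices 0 and 2) interact through their factors.
score = fm_predict(
    x=[1.0, 0.0, 1.0],
    w0=0.1,
    w=[0.2, 0.3, 0.4],
    V=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
)
```

Because each feature carries its own latent vector, two features that never co-occur in the training data can still receive a nonzero interaction weight, which is why this family of models suits sparse categorical inputs.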


Network Revenue Management with Limited Switches: Known and Unknown Demand Distributions

arXiv.org Machine Learning

This work is motivated by a practical concern from our retail partner. While they respect the advantages of dynamic pricing, they must limit the number of price changes to be within some constant. We study the classical price-based network revenue management problem, where a retailer has finite initial inventory of multiple resources to sell over a finite time horizon. We consider both known and unknown distribution settings, and derive policies that have the best-possible asymptotic performance in both settings. Our results suggest an intrinsic difference in expected revenue depending on how many switches are allowed, which further depends on the number of resources. Our results are also the first to show a separation between the regret bounds associated with different numbers of resources.


BottleNet++: An End-to-End Approach for Feature Compression in Device-Edge Co-Inference Systems

arXiv.org Machine Learning

The emergence of various intelligent mobile applications demands the deployment of powerful deep learning models at resource-constrained mobile devices. The device-edge co-inference framework provides a promising solution by splitting a neural network at a mobile device and an edge computing server. In order to balance the on-device computation and the communication overhead, the splitting point needs to be carefully picked, while the intermediate feature needs to be compressed before transmission. Existing studies decoupled the design of model splitting, feature compression, and communication, which may lead to excessive resource consumption of the mobile device. In this paper, we introduce an end-to-end architecture, named BottleNet++, that consists of an encoder, a non-trainable channel layer, and a decoder for more efficient feature compression and transmission. The encoder and decoder essentially implement joint source-channel coding via convolutional neural networks (CNNs), while explicitly considering the effect of channel noise. By exploiting the strong sparsity and the fault-tolerant property of the intermediate feature in a deep neural network (DNN), BottleNet++ achieves a much higher compression ratio than existing methods. Furthermore, by providing the channel condition to the encoder as an input, our method enjoys a strong generalization ability in different channel conditions. Compared with merely transmitting intermediate data without feature compression, BottleNet++ achieves up to 64x bandwidth reduction over the additive white Gaussian noise channel and up to 256x bit compression ratio in the binary erasure channel, with less than 2% reduction in accuracy. With a higher compression ratio, BottleNet++ enables splitting a DNN at earlier layers, which leads to up to 3x reduction in on-device computation compared with other compression methods.
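The two channel models named above can be illustrated with a small simulation. This is only a sketch of what a non-trainable channel layer might do to transmitted values, under the assumption that it is a fixed stochastic function with no learned parameters. The function names and the use of `None` to mark an erased bit are hypothetical choices, not BottleNet++ code.

```python
import random
from typing import List, Optional, Sequence


def awgn_channel(
    symbols: Sequence[float], noise_std: float, rng: random.Random
) -> List[float]:
    """Additive white Gaussian noise: add independent N(0, noise_std^2) noise."""
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def binary_erasure_channel(
    bits: Sequence[int], erasure_prob: float, rng: random.Random
) -> List[Optional[int]]:
    """Binary erasure channel: each bit is independently lost with erasure_prob.

    An erased bit is represented as None; surviving bits arrive unchanged.
    """
    return [None if rng.random() < erasure_prob else b for b in bits]


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
noisy = awgn_channel([0.5, -1.0, 2.0], noise_std=0.1, rng=rng)
received = binary_erasure_channel([1, 0, 1, 1], erasure_prob=0.25, rng=rng)
```

In an end-to-end setup, an encoder would produce `symbols` or `bits`, and a decoder would be trained to reconstruct the intermediate feature from the channel output.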


Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems

arXiv.org Artificial Intelligence

Question Answering (QA) systems are used to provide proper responses to users' questions automatically. Sentence matching is an essential task in the QA systems and is usually reformulated as a Paraphrase Identification (PI) problem. Given a question, the aim of the task is to find the most similar question from a QA knowledge base. In this paper, we propose a Multi-task Sentence Encoding Model (MSEM) for the PI problem, wherein a connected graph is employed to depict the relation between sentences, and a multi-task learning model is applied to address both the sentence matching and sentence intent classification problem. In addition, we implement a general semantic retrieval framework that combines our proposed model and the Approximate Nearest Neighbor (ANN) technology, which enables us to find the most similar question from all available candidates very quickly during online serving. The experiments show the superiority of our proposed method as compared with the existing sentence matching models.
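The retrieval step can be pictured as a nearest-neighbor lookup over sentence vectors. The sketch below uses exact brute-force cosine similarity, which is the baseline that an ANN index approximates at much lower query cost. It is not the MSEM encoder or the authors' retrieval framework, and the two-dimensional vectors and the names `cosine` and `most_similar` are hypothetical stand-ins for real sentence embeddings.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_vec: Sequence[float], kb: Dict[str, Sequence[float]]) -> str:
    """Return the knowledge-base question whose vector is closest to the query."""
    return max(kb, key=lambda question: cosine(query_vec, kb[question]))


# Toy knowledge base: question text mapped to a made-up embedding.
knowledge_base = {
    "How do I reset my password?": [1.0, 0.0],
    "What are the opening hours?": [0.0, 1.0],
}
best = most_similar([0.9, 0.1], knowledge_base)
```

The exact search scans every candidate, so its cost grows linearly with the size of the knowledge base. An ANN index trades a small amount of recall for sublinear lookups, which is what makes online serving over all candidates practical.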


Towards Efficient Anytime Computation and Execution of Decoupled Robustness Envelopes for Temporal Plans

arXiv.org Artificial Intelligence

Robustness Envelopes characterize the set of possible contingencies that a plan is able to address without re-planning, but their exact computation is extremely expensive; furthermore, general robustness envelopes are not amenable to efficient execution. In this paper, we present a novel, anytime algorithm to approximate Robustness Envelopes, making them scalable and executable. This is proven by an experimental analysis showing the efficiency of the algorithm, and by a concrete case study where the execution of robustness envelopes significantly reduces the number of re-plannings.

1 Introduction

When planning and scheduling techniques are employed in practical applications, one of the major problems is the need for online re-planning when the observed contingencies are not aligned with the ones that were considered at planning time. These situations are common, because it is arguably impossible to predict the entire range of situations an autonomous system can encounter, especially when the planning domain encompasses time and temporal constraints. Unfortunately, re-planning can be costly in terms of time, and computational resources can be scarce on-board, so limiting the use of re-planning is very important for practical purposes. In principle, it is also possible to continue with the execution of a plan even when the observed contingencies are unexpected, optimistically hoping for a successful completion. However, this approach offers no formal guarantee, and is prone to the risk of continuing execution of a plan that is bound to fail. Several approaches have been proposed in the literature to address this problem (see (Ingrand and Ghallab 2017) for a survey focused on robotics).


IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks

arXiv.org Artificial Intelligence

The IKEA Furniture Assembly Environment is one of the first benchmarks for testing and accelerating the automation of complex manipulation tasks. The environment is designed to advance reinforcement learning from simple toy tasks to complex tasks requiring both long-term planning and sophisticated low-level control. Our environment supports over 80 different furniture models, Sawyer and Baxter robot simulation, and domain randomization. The IKEA Furniture Assembly Environment is a testbed for methods aiming to solve complex manipulation tasks. The environment is publicly available at https://clvrai.com/furniture


A Joint Model for Definition Extraction with Syntactic Connection and Semantic Consistency

arXiv.org Artificial Intelligence

Definition Extraction (DE) is one of the well-known topics in Information Extraction that aims to identify terms and their corresponding definitions in unstructured texts. This task can be formalized either as a sentence classification task (i.e., containing term-definition pairs or not) or a sequential labeling task (i.e., identifying the boundaries of the terms and definitions). The previous works for DE have only focused on one of the two approaches, failing to model the interdependencies between the two tasks. In this work, we propose a novel model for DE that simultaneously performs the two tasks in a single framework to benefit from their interdependencies. Our model features deep learning architectures to exploit the global structures of the input sentences as well as the semantic consistencies between the terms and the definitions, thereby improving the quality of the representation vectors for DE. Besides the joint inference between sentence classification and sequential labeling, the proposed model is fundamentally different from the prior work for DE in that the prior work has only employed the local structures of the input sentences (i.e., word-to-word relations), and not yet considered the semantic consistencies between terms and definitions. In order to implement these novel ideas, our model presents a multi-task learning framework that employs graph convolutional neural networks and predicts the dependency paths between the terms and the definitions. We also seek to enforce the consistency between the representations of the terms and definitions both globally (i.e., increasing semantic consistency between the representations of the entire sentences and the terms/definitions) and locally (i.e., promoting the similarity between the representations of the terms and the definitions). The extensive experiments on three benchmark datasets demonstrate the effectiveness of our approach.


Professional Services: Collaboration and the Future of Work

#artificialintelligence

The bigger your company, the more important it is that every team member is on the same page. When you're as big as Genpact, with 90,000 employees and twice as many partners, collaboration is a top priority. Sanjay Srivastava is well aware of the challenges. As Genpact's Chief Digital Officer, he is front and center in the effort to make sure the disparate teams and employees within the company are working successfully in a collaborative organizational culture, as well as offering a satisfying customer experience. For Sanjay, there are three main factors that call for a strong collaboration platform within a company. It starts with the idea of the business as a connected ecosystem that drives a collective intelligence. Then there's the concept of continuous learning and innovation that requires a collaborative framework to be successful. Finally, there's the convergence of domains, the ability to pull people together from different disciplines, with different experiences, and across ...


Deep Learning in Genomics

#artificialintelligence

You are invited to attend our event next Monday, Nov 18th @6:00 pm at Venture X. Come and join us as Dr. Huang gives a talk on how Deep Learning is used in Genomics. If you are curious about Artificial Intelligence & Data Science in Genomics and want to learn more, then this talk is for you. Dr. Huang's expertise is in the areas of Computational Biology, Computational Neuroergonomics, Brain-Computer Interface, Statistical Modeling, and Bayesian Methods. Dr. Yufei Huang is a Professor and Associate Chair in Research at the Department of Electrical and Computer Engineering at UTSA. He is also an adjunct professor at the Dept. of Epidemiology and Biostatistics at the University of Texas Health Science Center at San Antonio.


How we can use Deep Learning with Small Data? – Thought Leaders

#artificialintelligence

When it comes to keeping up with emerging cybersecurity trends, staying on top of recent developments can get quite tedious, since there's a lot of news to keep up with. These days, however, the situation has changed dramatically, since the cybersecurity realm seems to revolve around two words: deep learning. Although we were initially taken aback by the massive coverage that deep learning was receiving, it quickly became apparent that the buzz generated by deep learning was well-earned. In a fashion similar to the human brain, deep learning enables an AI model to achieve highly accurate results by performing tasks directly from text, images, and audio cues. Until now, it was widely believed that deep learning relies on a huge set of data, similar in magnitude to the data housed by Silicon Valley giants Google and Facebook, to solve the most complicated problems within an organization.