Reluctant additive modeling

arXiv.org Machine Learning

Sparse generalized additive models (GAMs) are an extension of sparse generalized linear models which allow a model's prediction to vary non-linearly with an input variable. This enables the data analyst to build more accurate models, especially when the linearity assumption is known to be a poor approximation of reality. Motivated by reluctant interaction modeling (Yu et al. 2019), we propose a multi-stage algorithm, called $\textit{reluctant additive modeling (RAM)}$, that can fit sparse generalized additive models at scale. It is guided by the principle that, if all else is equal, one should prefer a linear feature over a non-linear feature. Unlike existing methods for sparse GAMs, RAM can be extended easily to binary, count and survival data. We demonstrate the method's effectiveness on real and simulated examples.


Event Ticket Price Prediction with Deep Neural Network on Spatial-Temporal Sparse Data

arXiv.org Machine Learning

Event ticket price prediction is important to the marketing strategy of any sports team or musical ensemble. An accurate prediction model can help the marketing team plan promotions more effectively and efficiently. However, given all the historical transaction records, it is challenging to predict the sale price of the remaining seats at any future timestamp, not only because the sale price depends on many features (seat location, date-to-event of the transaction, event date, team performance, etc.), but also because of the temporal and spatial sparsity of the dataset. For a game or concert, the selling price of a seat is observable only once, at the time of sale. Furthermore, some seats may never be purchased, so no record is available for them. In fact, data sparsity is commonly encountered in many prediction problems. Here, we propose a bi-level optimizing deep neural network to address the curse of spatio-temporal sparsity. Specifically, we introduce coarsening and refining layers, and design a bi-level loss function that integrates losses at different levels for better prediction accuracy. Our model can discover the interrelations among ticket sale price, seat locations, selling time, event information, etc. Experiments show that our proposed model outperforms other benchmark methods in real-world ticket selling price prediction.


Mixing autoencoder with classifier: conceptual data visualization

arXiv.org Machine Learning

In this short paper, a neural network that is able to form a low-dimensional topological hidden representation is explained. The neural network can be trained as an autoencoder, a classifier, or a mix of both, and produces a different low-dimensional topological map for each of them. When it is trained as an autoencoder, the inherent topological structure of the data can be visualized, while when it is trained as a classifier, the topological structure is further constrained by the concept, for example the labels of the data; hence the visualization is not only structural but also conceptual. The proposed neural network differs significantly from many dimensionality reduction models, primarily in its ability to perform both supervised and unsupervised dimensionality reduction. The neural network allows multi-perspective visualization of the data, thus giving more flexibility in data analysis. This paper is supported by preliminary but intuitive visualization experiments.


DeepFPC: Deep Unfolding of a Fixed-Point Continuation Algorithm for Sparse Signal Recovery from Quantized Measurements

arXiv.org Machine Learning

We present DeepFPC, a novel deep neural network designed by unfolding the iterations of the fixed-point continuation algorithm with one-sided l1-norm (FPC-l1), which has been proposed for solving the 1-bit compressed sensing problem. The network architecture resembles that of deep residual learning and incorporates prior knowledge about the signal structure (i.e., sparsity), thereby offering interpretability by design. Once DeepFPC is properly trained, a sparse signal can be recovered quickly and accurately from quantized measurements. The proposed model is evaluated on the task of direction-of-arrival (DOA) estimation and is shown to outperform state-of-the-art algorithms, namely, the iterative FPC-l1 algorithm and the 1-bit MUSIC method.


Identifying the number of clusters for K-Means: A hypersphere density based approach

arXiv.org Machine Learning

Application of the K-Means algorithm is restricted by the fact that the number of clusters must be known beforehand. Previously suggested methods to solve this problem are either ad hoc or require parametric assumptions and complicated calculations. The proposed method aims to solve this conundrum by using cluster hypersphere density as the factor that determines the number of clusters in a given dataset. The density is calculated by assuming a hypersphere around each cluster centroid, for n different candidate numbers of clusters. The calculated values are plotted against their corresponding number of clusters, and the optimum number of clusters is then obtained by examining the elbow region of the graph. The method is simple, easy to comprehend, and provides robust and reliable results.
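
The procedure described above can be illustrated with a minimal Python sketch. The abstract does not define the density precisely, so several choices here are assumptions made for illustration only: the hypersphere radius is taken as the distance from the centroid to its farthest member, the density is the member count divided by the ball volume, and the per-cluster densities are averaged for each candidate k. The function names (`kmeans`, `hypersphere_density`, `density_curve`) are hypothetical and not taken from the paper.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Final assignment against the last centroids.
    labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
    return centroids, labels


def hypersphere_density(members, centroid):
    """Assumed definition: member count / volume of the centroid-centred
    ball whose radius reaches the farthest member."""
    d = len(centroid)
    radius = max(dist(p, centroid) for p in members)
    if radius == 0.0:
        return float("inf")  # degenerate: every member sits on the centroid
    unit_ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return len(members) / (unit_ball * radius ** d)


def density_curve(points, k_values, seed=0):
    """Mean cluster density for each candidate k (assumed aggregation)."""
    curve = []
    for k in k_values:
        centroids, labels = kmeans(points, k, seed=seed)
        densities = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                densities.append(hypersphere_density(members, centroids[j]))
        curve.append(sum(densities) / len(densities))
    return curve
```

Plotting the values returned by `density_curve` against `k_values` gives the kind of graph whose elbow region would then be inspected; the elbow detection itself is left to the reader, since the abstract does not say how it is performed.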


Pyramid Convolutional RNN for MRI Reconstruction

arXiv.org Machine Learning

Fast and accurate MRI image reconstruction from undersampled data is critically important in clinical practice. Compressed sensing based methods are widely used in image reconstruction, but they are slow because they rely on iterative algorithms. Deep learning based methods have shown promising advances in recent years. However, recovering fine details from highly undersampled data is still challenging. In this paper, we introduce a novel deep learning-based method, Pyramid Convolutional RNN (PC-RNN), to reconstruct the image from multiple scales. We evaluated our model on the fastMRI dataset, and the results show that the proposed model achieves significant improvements over other methods and can recover more fine details.


Topic-aware chatbot using Recurrent Neural Networks and Nonnegative Matrix Factorization

arXiv.org Machine Learning

After learning topic vectors from an auxiliary text corpus via NMF, the decoder is trained so that it is more likely to sample response words from the most correlated topic vectors. One of the main advantages of our architecture is that the user can easily switch the NMF-learned topic vectors so that the chatbot obtains the desired topic-awareness. We demonstrate our model by training on a single conversational data set which is then augmented with topic matrices learned from different auxiliary data sets. We show not only that our topic-aware chatbot outperforms its non-topic counterpart, but also that each topic-aware model qualitatively and contextually gives the most relevant answer depending on the topic of the question. Another area where deep learning algorithms have been successfully applied is sequence learning, which aims at understanding the structure of sequential data such as language, musical notes, and videos. One example of an application of deep learning in language modeling is conversational chatbots. A chatbot is a program that conducts a conversation with a user by simulating one side of it. Chatbots receive inputs from a user one message, or question, at a time, and then form a response that is sent back to the user. One of the most widely used machine learning techniques for sequence learning is Recurrent Neural Networks (RNN).


AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue

arXiv.org Artificial Intelligence

The problem of building a coherent and non-monotonous conversational agent with proper discourse and coverage is still an area of open research. Current architectures only take care of semantic and contextual information for a given query and fail to completely account for syntactic and external knowledge, which are crucial for generating responses in a chit-chat system. To overcome this problem, we propose an end-to-end multi-stream deep learning architecture which learns unified embeddings for query-response pairs by leveraging contextual information from memory networks and syntactic information by incorporating Graph Convolution Networks (GCN) over their dependency parse. A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract a semantic representation for each input sentence, and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (KB). We benchmark these embeddings on the next sentence prediction task and significantly improve upon the existing techniques. Furthermore, we use AMUSED to represent queries and responses along with their context to develop a retrieval-based conversational agent which has been validated by expert linguists to have comprehensive engagement with humans.


Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving

arXiv.org Artificial Intelligence

We propose prioritized unit propagation with periodic resetting, which is a simple but surprisingly effective algorithm for solving random SAT instances that are meant to be hard. In particular, an evaluation on the Random Track of the 2017 and 2018 SAT competitions shows that a basic prototype of this simple idea already ranks in second place in both years. We share this observation in the hope that it helps the SAT community better understand the hardness of random instances used in competitions and inspires other interesting ideas on SAT solving.
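
For readers unfamiliar with the building block named above, the following is a minimal Python sketch of plain unit propagation on a CNF formula. It deliberately omits the prioritization and the periodic resetting, which the abstract does not specify, so it should not be read as the authors' algorithm. The clause encoding (lists of non-zero integers, with a negative integer for a negated variable) and the function name `unit_propagate` are assumptions made for illustration.

```python
def unit_propagate(clauses):
    """Repeatedly assign the literals forced by unit clauses.

    clauses: list of lists of non-zero ints (a negative int is a negated
    variable). Returns (assignment, remaining_clauses), or (None, None)
    if propagation derives an empty clause (a conflict).
    """
    assignment = {}
    clauses = [list(c) for c in clauses]
    while True:
        # Pick the literal of any clause that has exactly one literal left.
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return assignment, clauses
        assignment[abs(unit)] = unit > 0
        simplified = []
        for clause in clauses:
            if unit in clause:
                continue  # clause is satisfied by the new assignment
            reduced = [lit for lit in clause if lit != -unit]
            if not reduced:
                return None, None  # every literal falsified: conflict
            simplified.append(reduced)
        clauses = simplified


# Usage: x1 is forced, which forces x2, which shortens the last clause.
assignment, remaining = unit_propagate([[1], [-1, 2], [-2, 3, 4]])
```

In this usage example, propagation fixes variables 1 and 2 to true and leaves the single clause `[3, 4]` undecided; a complete solver would need a further decision or search step at that point.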


Inter-Level Cooperation in Hierarchical Reinforcement Learning

arXiv.org Artificial Intelligence

This article presents a novel algorithm for promoting cooperation between internal actors in a goal-conditioned hierarchical reinforcement learning (HRL) policy. Current techniques for HRL policy optimization treat the higher and lower level policies as separate entities which are trained to maximize different objective functions, rendering the HRL problem formulation more similar to a general-sum game than a single-agent task. Within this setting, we hypothesize that improved cooperation between the internal agents of a hierarchy can simplify the credit assignment problem from the perspective of the high-level policies, thereby leading to significant improvements in training in situations where intricate sets of action primitives must be performed to yield improvements in performance. In order to promote cooperation within this setting, we propose the inclusion of a connected gradient term in the gradient computations of the higher level policies. Our method is demonstrated to achieve results superior to existing techniques on a set of difficult long-horizon tasks.