Goto

Collaborating Authors

 Country


Matrix Normal PCA for Interpretable Dimension Reduction and Graphical Noise Modeling

arXiv.org Machine Learning

Principal component analysis (PCA) is one of the most widely used dimension reduction and multivariate statistical techniques. From a probabilistic perspective, PCA seeks a low-dimensional representation of data in the presence of independent identical Gaussian noise. Probabilistic PCA (PPCA) and its variants have been extensively studied for decades. Most of them assume the underlying noise follows a certain independent identical distribution. However, the noise in the real world is usually complicated and structured. To address this challenge, some non-linear variants of PPCA have been proposed. But those methods are generally difficult to interpret. To this end, we propose a powerful and intuitive PCA method (MN-PCA) through modeling the graphical noise by the matrix normal distribution, which enables us to explore the structure of noise in both the feature space and the sample space. MN-PCA obtains a low-rank representation of data and the structure of noise simultaneously. And it can be explained as approximating data over the generalized Mahalanobis distance. We develop two algorithms to solve this model: one maximizes the regularized likelihood, the other exploits the Wasserstein distance, which is more robust. Extensive experiments on various data demonstrate their effectiveness.


Deep Ordinal Classification with Inequality Constraints

arXiv.org Machine Learning

This study investigates a new constrained-optimization formulation for deep ordinal classification. We impose uni-modality of the label distribution implicitly via a set of inequality constraints over pairs of adjacent labels. To tackle the ensuing challenging optimization problem, we solve a sequence of unconstrained losses based on a powerful extension of the log-barrier method. This accommodates standard SGD for deep networks, and avoids computationally expensive Lagrangian dual steps and projections, while outperforming substantially penalty methods. Our non-parametric model is more flexible than the existing deep ordinal classification techniques: it does not restrict the learned representation to a specific parametric model, allowing the training to explore larger spaces of solutions and removing the need for ad hoc choices, while scaling up to large numbers of labels. It can be used in conjunction with any standard classification loss and any deep architecture. We also propose a new performance metric for ordinal classification, as a proxy to measure a distribution uni-modality, referred to as the Sides Order Index (SOI). We report comprehensive evaluations and comparisons to state-of-the-art methods on benchmark public datasets for several ordinal classification tasks, showing the merits of our approach in terms of label consistency and scalability. A public reproducible PyTorch implementation is provided (https://github.com/sbelharbi/Deep-Ordinal-Classification-with-Inequality-Constraints).


Survival and Neural Models for Private Equity Exit Prediction

arXiv.org Machine Learning

Within the Private Equity (PE) market, the event of a private company undertaking an Initial Public Offering (IPO) is usually a very high-return one for the investors in the company. For this reason, an effective predictive model for the IPO event is considered as a valuable tool in the PE market, an endeavor in which publicly available quantitative information is generally scarce. In this paper, we describe a data-analytic procedure for predicting the probability with which a company will go public in a given forward period of time. The proposed method is based on the interplay of a neural network (NN) model for estimating the overall event probability, and Survival Analysis (SA) for further modeling the probability of the IPO event in any given interval of time. The proposed neuro-survival model is tuned and tested across nine industrial sectors using real data from the Thomson Reuters Eikon PE database.


Attack on Grid Event Cause Analysis: An Adversarial Machine Learning Approach

arXiv.org Machine Learning

With the ever-increasing reliance on data for data-driven applications in power grids, such as event cause analysis, the authenticity of data streams has become crucially important. The data can be prone to adversarial stealthy attacks aiming to manipulate the data such that residual-based bad data detectors cannot detect them, and the perception of system operators or event classifiers changes about the actual event. This paper investigates the impact of adversarial attacks on convolutional neural network-based event cause analysis frameworks. We have successfully verified the ability of adversaries to maliciously misclassify events through stealthy data manipulations. The vulnerability assessment is studied with respect to the number of compromised measurements. Furthermore, a defense mechanism to robustify the performance of the event cause analysis is proposed. The effectiveness of adversarial attacks on changing the output of the framework is studied using the data generated by real-time digital simulator (RTDS) under different scenarios such as type of attacks and level of access to data.


Patch augmentation: Towards efficient decision boundaries for neural networks

arXiv.org Machine Learning

In this paper we propose a new augmentation technique, called patch augmentation, that, in our experiments, improves model accuracy and makes networks more robust to adversarial attacks. In brief, this data-independent approach creates new image data based on image/label pairs, where a patch from one of the two images in the pair is superimposed on to the other image, creating a new augmented sample. The new image's label is a linear combination of the image pair's corresponding labels. Initial experiments show a several percentage point increase in accuracy on CIFAR-10, from a baseline of approximately 81% to 89%. CIFAR-100 sees larger improvements still, from a baseline of 52% to 68% accuracy. Networks trained using patch augmentation are also more robust to adversarial attacks, which we demonstrate using the Fast Gradient Sign Method.


Sparse $\ell_1$ and $\ell_2$ Center Classifiers

arXiv.org Machine Learning

The nearest-centroid classifier is a simple linear-time classifier based on computing the centroids of the data classes in the training phase, and then assigning a new datum to the class corresponding to its nearest centroid. Thanks to its very low computational cost, the nearest-centroid classifier is still widely used in machine learning, despite the development of many other more sophisticated classification methods. In this paper, we propose two sparse variants of the nearest-centroid classifier, based respectively on $\ell_1$ and $\ell_2$ distance criteria. The proposed sparse classifiers perform simultaneous classification and feature selection, by detecting the features that are most relevant for the classification purpose. We show that training of the proposed sparse models, with both distance criteria, can be performed exactly (i.e., the globally optimal set of features is selected) and at a quasi-linear computational cost. The experimental results show that the proposed methods are competitive in accuracy with state-of-the-art feature selection techniques, while having a significantly lower computational cost.


Automated Peer-to-peer Negotiation for Energy Contract Settlements in Residential Cooperatives

arXiv.org Artificial Intelligence

This paper presents an automated peer-to-peer negotiation strategy for settling energy contracts among prosumers in a Residential Energy Cooperative considering heterogeneity prosumer preferences. The heterogeneity arises from prosumers' evaluation of energy contracts through multiple societal and environmental criteria and the prosumers' private preferences over those criteria. The prosumers engage in bilateral negotiations with peers to mutually agree on periodical energy contracts/loans consisting of the energy volume to be exchanged at that period and the return time of the exchanged energy. The negotiating prosumers navigate through a common negotiation domain consisting of potential energy contracts and evaluate those contracts from their valuations on the entailed criteria against a utility function that is robust against generation and demand uncertainty. From the repeated interactions, a prosumer gradually learns about the compatibility of its peers in reaching energy contracts that are closer to Nash solutions. Empirical evaluation on real demand, generation and storage profiles -- in multiple system scales -- illustrates that the proposed negotiation based strategy can increase the system efficiency (measured by utilitarian social welfare) and fairness (measured by Nash social welfare) over a baseline strategy and an individual flexibility control strategy representing the status quo strategy. We thus elicit system benefits from peer-to-peer flexibility exchange already without any central coordination and market operator, providing a simple yet flexible and effective paradigm that complements existing markets.


Bridging the Gap between Semantics and Multimedia Processing

arXiv.org Artificial Intelligence

--In this paper, we give an overview of the semantic gap problem in multimedia and discuss how machine learning and symbolic AI can be combined to narrow this gap. We describe the gap in terms of a classical architecture for multimedia processing and discuss a structured approach to bridge it. This approach combines machine learning (for mapping signals to objects) and symbolic AI (for linking objects to meanings). Our main goal is to raise awareness and discuss the challenges involved in this structured approach to multimedia understanding, especially in the view of the latest developments in machine learning and symbolic AI. A classic problem in multimedia representation and understanding is the semantic gap problem [1].


Few-Shot Knowledge Graph Completion

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) serve as useful resources for various natural language processing applications. Previous KG completion approaches require a large number of training instances (i.e., head-tail entity pairs) for every relation. The real case is that for most of the relations, very few entity pairs are available. Existing work of one-shot learning limits method generalizability for few-shot scenarios and does not fully use the supervisory information; however, few-shot KG completion has not been well studied yet. In this work, we propose a novel few-shot relation learning model (FSRL) that aims at discovering facts of new relations with few-shot references. FSRL can effectively capture knowledge from heterogeneous graph structure, aggregate representations of few-shot references, and match similar entity pairs of reference set for every relation. Extensive experiments on two public datasets demonstrate that FSRL outperforms the state-of-the-art. Introduction Large-scale knowledge graphs (KGs) such as Y AGO (Suchanek, Kasneci, and Weikum 2007), NELL (Carlson et al. 2010), and Wikidata (Vrande ˇ ci c and Kr otzsch 2014) usually represent facts in the form of relations (edges) between (head-tail) entity pairs (nodes). This kind of graph-structured knowledge is essential for many downstream applications such as search, question answering, and semantic web.


Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

arXiv.org Artificial Intelligence

--Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace. Handcrafting heuristic solutions that account for the dynamics in these resource allocation problems is difficult, and may be better handled by an end-to-end machine learning method. Previous works have explored machine learning methods to the problem from a high-level perspective, where the learning method is responsible for either repositioning the drivers or dispatching orders, and as a further simplification, the drivers are considered independent agents maximizing their own reward functions. In this paper we present a deep reinforcement learning approach for tackling the full fleet management and dispatching problems. In addition to treating the drivers as individual agents, we consider the problem from a system-centric perspective, where a central fleet management agent is responsible for decision-making for all drivers. I NTRODUCTION The order dispatching and fleet management system at a ride-sharing company must make decisions both for assigning available drivers to nearby passengers (hereby called orders) and for repositioning drivers who have no nearby orders. These decisions have short-term effects on the revenue generated by the drivers and driver availability. In the long term they change the distribution of drivers across the city, which in turn has a critical impact on how well future orders can be served. Provident algorithmic solutions, which account for the short term and long-term consequences of their decisions can improve the quality of service of the ride-sharing platforms and are thus an important area of research. Recent works [1], [2] have successfully applied Deep Reinforcement Learning (RL) techniques to dispatching problems, such as the Traveling Salesman Problem (TSP) and the more general V ehicle Routing Problem (VRP) [3], however they have primarily focused on static ( i. e. those where all orders are known up front) and/or single-driver dispatching problems. In contrast to these problems, the fleet management and order dispatching problem ride-sharing platforms face has multiple drivers and dynamically changing supply and demand conditions. We refer to this dynamic dispatching and fleet management problem as the Multi-Driver V ehicle Dispatching and Repositioning Problem (MDVDRP). VRPs and other problems similar to the MDVDRP are studied in the field of combinatorial optimization. Exactly solving instances of these problems at the scale of real-world environment is computationally intractable [4].