Spectral clustering via adaptive layer aggregation for multi-layer networks Machine Learning

One of the fundamental problems in network analysis is detecting community structure in multi-layer networks, in which each layer represents one type of edge information among the nodes. We propose integrative spectral clustering approaches based on effective convex layer aggregations. Our aggregation methods are strongly motivated by a delicate asymptotic analysis of the spectral embedding of weighted adjacency matrices and the downstream $k$-means clustering, in a challenging regime where community detection consistency is impossible. In fact, the methods are shown to estimate the optimal convex aggregation, which minimizes the mis-clustering error under some specialized multi-layer network models. Our analysis further suggests that clustering using Gaussian mixture models is generally superior to the commonly used $k$-means in spectral clustering. Extensive numerical studies demonstrate that our adaptive aggregation techniques, together with Gaussian mixture model clustering, make the new spectral clustering remarkably competitive compared to several popularly used methods.
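To make the idea of convex layer aggregation concrete, here is a minimal sketch, not the authors' method. It assumes NumPy, hand-picked weights (the abstract's adaptive weight estimation is not reproduced), and a toy two-layer network. For brevity it splits nodes by the sign of one eigenvector instead of running $k$-means or a Gaussian mixture model on the embedding. The function names `aggregate_layers`, `two_community_labels` and `adjacency_from_edges` are hypothetical.

```python
import numpy as np

def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

def aggregate_layers(layers, weights):
    """Convex combination of layers: weights are non-negative and sum to 1."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(wl * np.asarray(A, dtype=float) for wl, A in zip(w, layers))

def two_community_labels(adjacency):
    """Split nodes by the sign of the eigenvector of the second-largest eigenvalue.

    This sign split is a simplification for two communities; it stands in
    for the k-means / Gaussian mixture step mentioned in the abstract.
    """
    _, eigenvectors = np.linalg.eigh(adjacency)  # eigenvalues in ascending order
    second = eigenvectors[:, -2]
    return (second > 0).astype(int)

# Toy data (assumed): layer 1 carries the community signal (two triangles),
# layer 2 is a ring that adds edges between the two communities.
layer_1 = adjacency_from_edges(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)])
layer_2 = adjacency_from_edges(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])

aggregate = aggregate_layers([layer_1, layer_2], [0.8, 0.2])
labels = two_community_labels(aggregate)
print(labels)  # nodes 0-2 share one label, nodes 3-5 the other
```

Which community receives label 0 depends on the arbitrary sign of the eigenvector, so only the partition itself is meaningful.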

A Survey on Data Pricing: from Economics to Data Science Artificial Intelligence

How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.

Sr. Machine Learning- Software Engineer VP at JPMorgan Chase Bank, N.A.


The Corporate & Investment Bank is a global leader across investment banking, wholesale payments, markets and securities services. The world's most important corporations, governments and institutions entrust us with their business in more than 100 countries. We provide strategic advice, raise capital, manage risk and extend liquidity in markets around the world. J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world's most prominent corporations, governments, wealthy individuals and institutional investors. Our "first-class business in a first-class way" approach to serving clients drives everything we do.

C3, machine learning startup backed by software pioneer Tom Siebel, files for IPO


Tom Siebel, an early employee of database giant Oracle and later a billionaire after selling his eponymous software firm to Oracle, says his new venture, C3, is bigger than either of those. C3, the artificial intelligence services company founded by software pioneer Tom Siebel, filed Friday evening for an initial public offering of $100 million worth of its shares, led by investment banks Morgan Stanley, JP Morgan, and Bank of America. C3 plans to list under the ticker "AI" on the New York Stock Exchange. The number of shares to be offered and the price range for the proposed offering have not yet been determined, C3 said. Siebel, who was recruited to database giant Oracle in 1983, later founded the eponymous enterprise customer relationship management software firm in 1993.

Outlier Detection with RNN Autoencoders


Anomalies, often referred to as outliers, are data points, data sequences or patterns in data which do not conform to the overarching behaviour of the data series. As such, anomaly detection is the task of detecting data points or sequences which don't conform to patterns present in the broader data. The effective detection and removal of anomalous data can provide highly useful insights across a number of business functions, such as detecting broken links embedded within a website, spikes in internet traffic, or dramatic changes in stock prices. Flagging these phenomena as outliers, or enacting a pre-planned response, can save businesses both time and money. Anomalous data can typically be separated into three distinct categories: Additive Outliers, Temporal Changes, and Level Shifts. Additive Outliers are characterised by sudden large increases or decreases in value, which can be driven by exogenous or endogenous factors.
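As a rough illustration of flagging Additive Outliers, the sketch below scores each point by its distance from the series median, scaled by the median absolute deviation (a modified z-score). This is a simple statistical stand-in, not the RNN autoencoder approach named in the title. The function name `flag_additive_outliers`, the threshold of 3.5 and the sample traffic values are all assumptions made for the example.

```python
from statistics import median

def flag_additive_outliers(series, threshold=3.5):
    """Return the indices of points whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. The threshold of 3.5 is a common rule of
    thumb, used here as an assumption.
    """
    center = median(series)
    mad = median(abs(x - center) for x in series)
    if mad == 0:
        # More than half the points are identical, so the score is undefined;
        # this sketch simply reports no outliers in that case.
        return []
    return [
        i for i, x in enumerate(series)
        if 0.6745 * abs(x - center) / mad > threshold
    ]

# Hypothetical hourly traffic counts with one sudden spike at index 5.
traffic = [10, 11, 10, 12, 11, 50, 10, 11, 9, 10]
print(flag_additive_outliers(traffic))  # prints [5]
```

Because the score uses medians rather than the mean and standard deviation, a single large spike does not inflate the scale it is measured against. Temporal Changes and Level Shifts would need a method that looks at neighbouring points, which this sketch does not attempt.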

Deep Learning for NLP and Speech Recognition: Kamath, Uday, Liu, John, Whitaker, James: 9783030145989: Books


Uday Kamath has more than 20 years of experience architecting and building analytics-based commercial solutions. He currently works as the Chief Analytics Officer at Digital Reasoning, one of the leading companies in AI for NLP and Speech Recognition, heading the Applied Machine Learning research group. Most recently, Uday served as the Chief Data Scientist at BAE Systems Applied Intelligence, building machine learning products and solutions for the financial industry, focused on fraud, compliance, and cybersecurity. Uday has previously authored many books on machine learning, such as "Machine Learning: End-to-End guide for Java developers: Data Analysis, Machine Learning, and Neural Networks simplified" and "Mastering Java Machine Learning: A Java developer's guide to implementing machine learning and big data architectures". Uday has published many academic papers in different machine learning journals and conferences.

The insideBIGDATA IMPACT 50 List for Q4 2020 - insideBIGDATA


The team here at insideBIGDATA is deeply entrenched in following the big data ecosystem of companies from around the globe. Our inbox is filled each day with new announcements, commentaries, and insights about what's driving the success of our industry, so we're in a unique position to publish our quarterly IMPACT 50 List of the most important movers and shakers in our industry. These companies have proven their relevance by the way they're impacting the enterprise through leading-edge products and services. We're happy to publish this evolving list of the industry's most impactful companies! The selected companies come from our massive data set of vendors and industry metrics.

Blogs about Big Data, Blockchain, IoT, Drones, Artificial Intelligence and Machine Learning.


Find numerous blogs on big data, blockchain, IoT, drones, artificial intelligence, machine learning, deep learning and augmented reality. "Google will fulfill its mission only when its search engine is AI-complete. You guys know what that means?" "Deep learning will revolutionize supply chain automation." "The first to fully integrate the following technologies will create a near autonomous supply chain: IoT, Big Data, Blockchain, 3D Printing, Artificial Intelligence, Machine Learning and Deep Learning." "Artificial intelligence is the future and the future is here." "Integration of the following technologies will revolutionize supply chain: IoT, Big Data, Blockchain, 3D Printing, Artificial Intelligence, and Augmented Reality." "Artificial intelligence will disrupt all industries."

Parsimonious Feature Extraction Methods: Extending Robust Probabilistic Projections with Generalized Skew-t Machine Learning

The study focuses on an extension of Principal Component Analysis (PCA), as defined in [1], [2] or [3]. PCA and related matrix factorisation methodologies are widely used in data-rich environments for dimensionality reduction, data compression, feature extraction or data de-noising. The methodologies identify a lower-dimensional linear subspace to represent the data, which captures the dominant second-order information contained in high-dimensional data sets. PCA can be viewed as a matrix factorisation problem which aims to learn the lower-dimensional representation of the data, preserving its Euclidean structure. However, when the data-generating distribution is non-Gaussian, or when outliers corrupt the data, the standard PCA methodology provides biased information about the lower-rank representation. In many applications, the stochastic noise or observation errors in the data set are assumed to be, in some sense, "well-behaved"; for instance, additive, light-tailed, symmetric and zero-mean. When non-robust feature extraction methods are naively utilised in the presence of violations of these implicit statistical assumptions, the information contained in the extracted features cannot be relied upon, resulting in misleading inference. Therefore, it is critical to ensure that the feature extraction captures information about the correct characteristics of the process generating the data. In the following study, we relax the inherent assumption of "well-behaved" observation noise by developing a class of robust estimators that can withstand violations of such assumptions, which routinely arise in real data sets.

Machine learning based forecasting of significant daily returns in foreign exchange markets Machine Learning

Asset value forecasting has always attracted an enormous amount of interest among researchers in quantitative analysis. The advent of modern machine learning models has introduced new tools to tackle this classical problem. In this paper, we apply machine learning algorithms to the hitherto unexplored question of forecasting instances of significant fluctuations in currency exchange rates. We analyse nine modern machine learning algorithms using data on four major currency pairs over a 10-year period. A key contribution is the novel use of outlier detection methods for this purpose. Numerical experiments show that outlier detection methods substantially outperform traditional machine learning and finance techniques. In addition, we show that a recently proposed outlier detection method, PKDE, produces the best overall results. Our findings hold across different currency pairs, significance levels, and time horizons, indicating the robustness of the proposed method.