AITopics

doi: 10.1007/s10618-018-0561-2

1805.02102

Country: North America (0.46)

Genre: Research Report > Promising Solution (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.67)

Alexander, Clark, Shi, Luke, Akhmametyeva, Sofya

Using Quantum Mechanics to Cluster Time Series

arXiv.org Machine Learning May-4-2018

In this article we present a method by which we can reduce a time series into a single point in $\mathbb{R}^{13}$. We have chosen 13 dimensions so as to prevent too many points from being labeled as "noise." When using a Euclidean (or Mahalanobis) metric, a simple clustering algorithm will with near certainty label the majority of points as "noise." On pure physical considerations, this is not possible. Included in our 13 dimensions are four parameters which describe the coefficients of a cubic polynomial attached to a Gaussian picking up a general trend, four parameters picking up periodicity in a time series (two each for the amplitude and the period of a wave), and a final five which report the "leftover" noise of the detrended and aperiodic time series. Of the final five parameters, four are the centralized probabilistic moments, and the last gives the relative size of the series. The first main contribution of this work is to apply a theorem of quantum mechanics about the completeness of the solutions to the quantum harmonic oscillator on $L^2(\mathbb{R})$ to estimating trends in time series. The second main contribution is the method of fitting parameters. After many numerical trials, we realized that methods such as Newton-Raphson and Levenberg-Marquardt converge extremely fast if the initial guess is good. Thus we guessed many initial points in our parameter space and computed only a few iterations, a technique common in Keogh's work on time series clustering. Finally, we have produced a model which gives incredibly accurate results quickly. We acknowledge that there are faster methods as well as more accurate methods, but this work shows that we can still increase computation speed with little, if any, cost to accuracy in the sense of data clustering.
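The multi-start strategy described in this abstract (many initial guesses, only a few iterations each) can be illustrated with a small sketch. This is not the authors' implementation: the single-sinusoid model, the synthetic noise-free data, the grid of starting periods, and the helper names `model`, `sse`, and `lm_few_steps` are all assumptions made for illustration, and the optimizer is a basic two-parameter Levenberg-Marquardt step written in plain Python.

```python
import math

def model(t, amp, period):
    """Hypothetical periodic component: a single sine wave."""
    return amp * math.sin(2 * math.pi * t / period)

def sse(ts, ys, amp, period):
    """Sum of squared residuals between the model and the data."""
    return sum((model(t, amp, period) - y) ** 2 for t, y in zip(ts, ys))

def lm_few_steps(ts, ys, amp, period, steps=6, damping=1e-3):
    """Run a handful of Levenberg-Marquardt steps from one starting guess."""
    best = sse(ts, ys, amp, period)
    for _ in range(steps):
        # Accumulate J^T J (2x2) and J^T r (2-vector) for parameters (amp, period).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for t, y in zip(ts, ys):
            phase = 2 * math.pi * t / period
            r = amp * math.sin(phase) - y
            j1 = math.sin(phase)                          # d model / d amp
            j2 = -amp * math.cos(phase) * phase / period  # d model / d period
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        # Solve (J^T J + damping * diag(J^T J)) * delta = -J^T r by Cramer's rule.
        m11 = a11 * (1 + damping)
        m22 = a22 * (1 + damping)
        det = m11 * m22 - a12 * a12
        if det == 0.0:
            break
        d_amp = (-g1 * m22 + g2 * a12) / det
        d_per = (-g2 * m11 + g1 * a12) / det
        trial = sse(ts, ys, amp + d_amp, period + d_per)
        if trial < best:
            # Accept the step and reduce the damping.
            amp, period, best = amp + d_amp, period + d_per, trial
            damping /= 10
        else:
            # Reject the step and damp harder on the next attempt.
            damping *= 10
    return best, amp, period

# Synthetic, noise-free series (assumed amplitude 2.0 and period 12.3).
ts = list(range(40))
ys = [model(t, 2.0, 12.3) for t in ts]

# Many cheap starts: one amplitude guess, periods 5.0, 5.5, ..., 30.0.
amp0 = max(abs(y) for y in ys)
starts = [(amp0, 5.0 + 0.5 * i) for i in range(51)]
best_sse, best_amp, best_period = min(lm_few_steps(ts, ys, a, p) for a, p in starts)
print(best_sse, best_amp, best_period)
```

Each start is cheap because it runs only six steps; the starts that happen to lie near a good solution converge quickly, and the rest are simply discarded when the lowest residual is selected.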

artificial intelligence, data mining, machine learning, (15 more...)

1805.01711

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Economy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.87)

arXiv.org Artificial Intelligence May-2-2018

Residential Transformer Overloading Risk Assessment Using Clustering Analysis

Dong, Ming, Li, Benzhe, Nassif, Alex

The residential transformer population is a critical type of asset that many electric utility companies have been attempting to manage proactively and effectively to reduce unexpected failures and life losses that are often caused by transformer overloading. Within the typical power asset portfolio, the residential transformer asset is often large in population, has the lowest-reliability design, lacks transformer loading data, and is susceptible to customer loading behaviors such as the adoption of distributed energy resources and electric vehicles. On the bright side, the availability of more residential operation data, along with the advancement of data analytics techniques, has provided a new path to further our statistical understanding of local residential transformer overloading risks. This research developed a new data-driven method that combines clustering analysis with the simulation of transformer temperature rise and insulation life loss to quantitatively and statistically assess the overloading risk of the residential transformer population in one area and to suggest proper risk management measures according to the assessment results. Case studies from an actual Canadian utility company are presented and discussed in detail to demonstrate the applicability and usefulness of the proposed method.
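One common way to turn a simulated temperature profile into insulation life loss is the Arrhenius-type aging acceleration factor given in the IEEE C57.91 loading guide, which equals 1 at a hottest-spot temperature of 110 °C. Whether the authors use exactly this relation is an assumption here; the sketch below only shows the general idea, and the function names and the hourly temperature profile are hypothetical.

```python
import math

def aging_acceleration(hot_spot_c):
    """Aging acceleration factor relative to a 110 degC reference hottest-spot temperature."""
    return math.exp(15000.0 / 383.0 - 15000.0 / (hot_spot_c + 273.0))

def equivalent_aging(hot_spot_profile_c):
    """Average acceleration factor over equally spaced hottest-spot samples."""
    factors = [aging_acceleration(temp) for temp in hot_spot_profile_c]
    return sum(factors) / len(factors)

# Hypothetical hourly hottest-spot temperatures (degC) for one transformer over a day.
profile = [80.0] * 16 + [110.0] * 4 + [125.0] * 4
daily_factor = equivalent_aging(profile)
print(f"equivalent aging factor over the day: {daily_factor:.2f}")
```

A factor above 1 means the insulation aged faster over the period than it would at the reference temperature, so a few hours of overloading can dominate the daily result.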

artificial intelligence, data mining, machine learning, (15 more...)

1805.0063

Country:

North America > Canada (0.94)
North America > United States (0.94)

Genre: Research Report (1.00)

Industry: Energy > Power Industry > Utilities (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Van Lierde, H., Chow, T. W. S., Delvenne, J. -C.

Spectral clustering algorithms for the detection of clusters in block-cyclic and block-acyclic graphs

arXiv.org Machine Learning May-2-2018

We propose two spectral algorithms for partitioning nodes in directed graphs with, respectively, a cyclic and an acyclic pattern of connection between groups of nodes. Our methods are based on the computation of extremal eigenvalues of the transition matrix associated with the directed graph. The two algorithms outperform state-of-the-art methods for directed graph clustering on synthetic datasets, including methods based on blockmodels, bibliometric symmetrization and random walks. Our algorithms have the same space complexity as classical spectral clustering algorithms for undirected graphs and their time complexity is also linear in the number of edges in the graph. One of our methods is applied to a trophic network based on predator-prey relationships. It successfully extracts common categories of prey and predators encountered in food chains. The same method is also applied to highlight the hierarchical structure of a worldwide network of Autonomous Systems depicting business agreements between Internet Service Providers.
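The role of the transition matrix's extremal eigenvalues can be seen on a toy example. For a perfectly block-cyclic graph with k blocks, exp(2πi/k) is an eigenvalue of the random-walk transition matrix, and the complex phase of the corresponding eigenvector's entries identifies each node's block. The sketch below demonstrates only this spectral property with NumPy; it is not the authors' algorithm, and the six-node graph and variable names are assumptions.

```python
import numpy as np

# Hypothetical toy graph: 3 blocks of 2 nodes, edges only from block b to block (b + 1) % 3.
blocks = np.array([0, 0, 1, 1, 2, 2])
k = 3
adjacency = (blocks[None, :] == (blocks[:, None] + 1) % k).astype(float)

# Row-normalise to get the random-walk transition matrix P = D^{-1} A.
transition = adjacency / adjacency.sum(axis=1, keepdims=True)

eigenvalues, eigenvectors = np.linalg.eig(transition)

# Pick the eigenvalue closest to exp(2*pi*i/k), one of the extremal (unit-modulus) eigenvalues.
target = np.exp(2j * np.pi / k)
idx = np.argmin(np.abs(eigenvalues - target))
vector = eigenvectors[:, idx]

# The phase of each eigenvector entry, relative to node 0, encodes the block index.
phases = np.angle(vector / vector[0])
labels = np.rint(phases * k / (2 * np.pi)).astype(int) % k
print(labels)
```

On real, noisy graphs the phases would no longer fall exactly on the k-th roots of unity, so a practical method would need a clustering step on the eigenvector entries rather than simple rounding.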

artificial intelligence, data mining, machine learning, (18 more...)

doi: 10.1093/comnet/cny011

1805.00862

Country:

North America > United States (0.46)
Asia > China (0.28)
Europe > Belgium (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Van Craenendonck, Toon, Meert, Wannes, Dumancic, Sebastijan, Blockeel, Hendrik

COBRAS-TS: A new approach to Semi-Supervised Clustering of Time Series

arXiv.org Machine Learning May-2-2018

Clustering is ubiquitous in data analysis, including analysis of time series. It is inherently subjective: different users may prefer different clusterings for a particular dataset. Semi-supervised clustering addresses this by allowing the user to provide examples of instances that should (not) be in the same cluster. This paper studies semi-supervised clustering in the context of time series. We show that COBRAS, a state-of-the-art semi-supervised clustering method, can be adapted to this setting. We refer to this approach as COBRAS-TS. An extensive experimental evaluation supports the following claims: (1) COBRAS-TS far outperforms the current state of the art in semi-supervised clustering for time series, and thus presents a new baseline for the field; (2) COBRAS-TS can identify clusters with separated components; (3) COBRAS-TS can identify clusters that are characterized by small local patterns; (4) a small amount of semi-supervision can greatly improve clustering quality for time series; (5) the choice of the clustering algorithm matters (contrary to earlier claims in the literature).

artificial intelligence, data mining, machine learning, (18 more...)

1805.00779

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

#artificialintelligence May-1-2018, 10:18:15 GMT

80. Grouping unlabelled data with k-means clustering

Sometimes we may have prior knowledge that we want to group the data into a given number of clusters. Other times we may wish to investigate what may be a good number of clusters. In the example below we look at changing the number of clusters between 1 and 100 and measure the average distance of points from their closest cluster centre (kmeans.transform). Looking at the results we may decide that up to about 10 clusters may be useful, but after that there are diminishing returns from adding further clusters.
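The idea of scanning the number of clusters and tracking the average distance to the closest centre can be sketched without any external library. The code below is an illustrative stand-in, not the original post's example: it uses a plain-Python Lloyd's algorithm with deterministic farthest-point seeding, a hypothetical toy dataset of three tight groups, and cluster counts from 1 to 6 rather than 1 to 100. With scikit-learn, the same measure can be obtained from a fitted `KMeans` model as `kmeans.transform(X).min(axis=1).mean()`.

```python
import math

def nearest(point, centres):
    """Return (index, distance) of the centre closest to point."""
    dists = [math.dist(point, c) for c in centres]
    idx = min(range(len(centres)), key=dists.__getitem__)
    return idx, dists[idx]

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centres = [points[0]]
    while len(centres) < k:
        # Seed the next centre at the point farthest from all current centres.
        centres.append(max(points, key=lambda p: nearest(p, centres)[1]))
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            groups[nearest(p, centres)[0]].append(p)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres

def mean_distance_to_centre(points, centres):
    """Average distance from each point to its closest centre."""
    return sum(nearest(p, centres)[1] for p in points) / len(points)

# Toy data: three tight groups of four points each (hypothetical values).
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]

scores = {k: mean_distance_to_centre(points, kmeans(points, k)) for k in range(1, 7)}
for k, score in scores.items():
    print(k, round(score, 3))
```

On this toy data the average distance falls sharply up to three clusters and only slightly afterwards, which is the "diminishing returns" pattern used to choose a cluster count.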

artificial intelligence, grouping unlabelled data, machine learning, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

@machinelearnbot Apr-30-2018, 00:50:12 GMT

The Structure and Function of Complex Networks SIAM Review Vol. 45, No. 2

Journal of Parallel and Distributed Computing 104, 19-35.

constraint-based reasoning, information technology and artificial intelligence conference, vascular disease, (76 more...)

@machinelearnbot

Country:

Asia > China (1.00)
Asia > Middle East (0.45)
Oceania > Australia (0.27)
(15 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Water & Waste Management > Water Management (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(48 more...)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
(45 more...)

arXiv.org Artificial Intelligence Apr-30-2018

Non-Intrusive Signature Extraction for Major Residential Loads

Dong, M., Meira, P. C. M., Xu, W., Chung, C. Y.

The data collected by smart meters contain a lot of useful information. One potential use of the data is to track the energy consumption and operating status of major home appliances. The results will enable homeowners to make sound decisions on how to save energy and how to participate in demand response programs. This paper presents a new method to break down the total power demand measured by a smart meter into the demand of individual appliances. A unique feature of the proposed method is that it utilizes diverse signatures associated with the entire operating window of an appliance for identification. As a result, appliances with a complicated middle process can be tracked. A novel appliance registration device and scheme is also proposed to automate the creation of the appliance signature database and to eliminate the need for massive training before identification. The software and system have been developed and deployed to real houses in order to verify the proposed method.

artificial intelligence, data mining, machine learning, (19 more...)

doi: 10.1109/TSG.2013.2245926

1804.11049

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)

Genre: Research Report (0.64)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Locatello, Francesco, Vincent, Damien, Tolstikhin, Ilya, Rätsch, Gunnar, Gelly, Sylvain, Schölkopf, Bernhard

Clustering Meets Implicit Generative Models

arXiv.org Machine Learning Apr-30-2018

Clustering is a cornerstone of unsupervised learning which can be thought of as disentangling multiple generative mechanisms underlying the data. In this paper we introduce an algorithmic framework to train mixtures of implicit generative models which we particularize for variational autoencoders. Relying on an additional set of discriminators, we propose a competitive procedure in which the models only need to approximate the portion of the data distribution from which they can produce realistic samples. As a byproduct, each model is simpler to train, and a clustering interpretation arises naturally from the partitioning of the training points among the models. We empirically show that our approach splits the training distribution in a reasonable way and increases the quality of the generated samples.

artificial intelligence, data distribution, machine learning, (18 more...)

1804.1113

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Cavallo, Marco, Demiralp, Çağatay

Clustrophile 2: Guided Visual Clustering Analysis

arXiv.org Artificial Intelligence Apr-28-2018

Data clustering is a common unsupervised learning method frequently used in exploratory data analysis. However, identifying relevant structures in unlabeled, high-dimensional data is nontrivial, requiring iterative experimentation with clustering parameters as well as data features and instances. The space of possible clusterings for a typical dataset is vast, and navigating in this vast space is also challenging. The absence of ground-truth labels makes it impossible to define an optimal solution, thus requiring user judgment to establish what can be considered a satisfactory clustering result. Data scientists need adequate interactive tools to effectively explore and navigate the large space of clusterings so as to improve the effectiveness of exploratory clustering analysis. We introduce Clustrophile 2, a new interactive tool for guided clustering analysis. Clustrophile 2 guides users in clustering-based exploratory analysis, adapts user feedback to improve user guidance, facilitates the interpretation of clusters, and helps quickly reason about differences between clusterings. To this end, Clustrophile 2 contributes a novel feature, the clustering tour, to help users choose clustering parameters and assess the quality of different clustering results in relation to current analysis goals and user expectations. We evaluate Clustrophile 2 through a user study with 12 data scientists, who used our tool to explore and interpret sub-cohorts in a dataset of Parkinson's disease patients. Results suggest that Clustrophile 2 improves the speed and effectiveness of exploratory clustering analysis for both experts and non-experts.

clustrophile 2, data mining, machine learning, (19 more...)

1804.03048

Genre:

Questionnaire & Opinion Survey (0.69)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.69)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)