AITopics | Clustering

Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics. (Wikipedia)

Markov models for ocular fixation locations in the presence and absence of colour

Kashlak, Adam B., Devane, Eoin, Dietert, Helge, Jackson, Henry

arXiv.org Machine Learning Apr-21-2016

We propose to model the fixation locations of the human eye when observing a still image by a Markovian point process in R^2. Our approach is data driven, using k-means clustering of the fixation locations to identify distinct salient regions of the image, which in turn correspond to the states of our Markov chain. Bayes factors are computed as a model selection criterion to determine the number of clusters. Furthermore, we demonstrate that the behaviour of the human eye differs from this model when colour information is removed from the given image.
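The pipeline in this abstract (cluster fixation locations, then treat the clusters as Markov chain states) can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: the toy coordinates, the helper names `kmeans` and `transition_matrix`, and the seeded random initialisation are all assumptions made for the example, and the Bayes-factor model selection step is omitted.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialise from k data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                            + (p[1] - centroids[j][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels

def transition_matrix(labels, k):
    """Row-normalised counts of consecutive state pairs (empirical estimate)."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical fixation sequence (in viewing order) over two salient regions.
fixations = [(0.1, 0.2), (0.0, 0.1), (5.0, 5.1), (5.2, 4.9), (0.2, 0.0), (4.9, 5.0)]
centroids, states = kmeans(fixations, k=2)
P = transition_matrix(states, k=2)
```

Here `P[i][j]` estimates the probability that a fixation in region `i` is followed by one in region `j`. Which region receives index 0 depends on the initialisation.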

artificial intelligence, fixation, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1111/rssc.12223

1604.06335

Genre: Research Report > Experimental Study (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)

The DARPA Twitter Bot Challenge

Subrahmanian, V. S., Azaria, Amos, Durst, Skylar, Kagan, Vadim, Galstyan, Aram, Lerman, Kristina, Zhu, Linhong, Ferrara, Emilio, Flammini, Alessandro, Menczer, Filippo, Stevens, Andrew, Dekhtyar, Alexander, Gao, Shuyang, Hogg, Tad, Kooti, Farshad, Liu, Yan, Varol, Onur, Shiralkar, Prashant, Vydiswaran, Vinod, Mei, Qiaozhu, Hwang, Tim

arXiv.org Artificial Intelligence Apr-21-2016

A number of organizations ranging from terrorist groups such as ISIS to politicians and nation states reportedly conduct explicit campaigns to influence opinion on social media, posing a risk to democratic processes. There is thus a growing need to identify and eliminate "influence bots" - realistic, automated identities that illicitly shape discussion on sites like Twitter and Facebook - before they get too influential. Spurred by such events, DARPA held a 4-week competition in February/March 2015 in which multiple teams supported by the DARPA Social Media in Strategic Communications program competed to identify a set of previously identified "influence bots" serving as ground truth on a specific topic within Twitter. Past work regarding influence bots often has difficulty supporting claims about accuracy, since there is limited ground truth (though some exceptions do exist [3,7]). However, with the exception of [3], no past work has looked specifically at identifying influence bots on a specific topic. This paper describes the DARPA Challenge and the methods used by the three top-ranked teams.

artificial intelligence, machine learning, social media, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/MC.2016.183

1601.0514

Country: North America > United States > Maryland (0.28)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Law Enforcement & Public Safety > Terrorism (0.76)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Michael Lane's Homepage

#artificialintelligence Apr-19-2016, 02:25:36 GMT

The final homework assignment for CS545 Machine Learning was to implement a K-means clustering algorithm to cluster and classify the OptDigits data. The raw data looks something like the figures to the left. These instances are fields of 0's in which some 0's have been flipped to 1's such that the image is recognizable (to humans) as a handwritten digit. For the K-means classifier, we ran 2 different experiments. The first experiment used 10 centroids (one per digit); the second used 30 centroids to see whether it could find clusters in which the handwritten digits were different enough to notice differences.
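One common way to turn K-means clusters into a classifier is to give each centroid the majority class of the training instances assigned to it, then classify a new instance by its nearest centroid. The sketch below assumes that scheme, since the post does not say how the homework mapped clusters to digits. The 4-element toy vectors, the pre-fitted centroids, and the function names are hypothetical stand-ins for the OptDigits images.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))

def label_centroids(centroids, train_x, train_y):
    """Give each centroid the most frequent class among its assigned instances."""
    votes = [Counter() for _ in centroids]
    for x, y in zip(train_x, train_y):
        votes[nearest(x, centroids)][y] += 1
    # A centroid with no assigned instances gets no label (None).
    return [v.most_common(1)[0][0] if v else None for v in votes]

def classify(x, centroids, centroid_labels):
    """Predict the class of x as the label of its nearest centroid."""
    return centroid_labels[nearest(x, centroids)]

# Toy stand-in for flattened binary images; centroids assumed already fitted.
train_x = [(1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1)]
train_y = ["a", "a", "b", "b"]
centroids = [(1.0, 1.0, 0.5, 0.0), (0.0, 0.5, 1.0, 1.0)]

centroid_labels = label_centroids(centroids, train_x, train_y)
prediction = classify((1, 0, 0, 0), centroids, centroid_labels)
```

With more centroids than classes, as in the 30-centroid experiment, several centroids can share a label, which lets one digit be represented by more than one handwriting style.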

artificial intelligence, centroid, machine learning, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.56)

SAND: Semi-Supervised Adaptive Novel Class Detection and Classification over Data Stream

Haque, Ahsanul (The University of Texas at Dallas) | Khan, Latifur (The University of Texas at Dallas) | Baron, Michael (The University of Texas at Dallas)

AAAI Conferences Apr-19-2016

Most approaches to classifying data streams either divide the stream into fixed-size chunks or use gradual forgetting. Due to the evolving nature of data streams, finding a proper size or choosing a forgetting rate without prior knowledge about the time-scale of change is not a trivial task. These approaches hence suffer from a trade-off between performance and sensitivity. Existing dynamic sliding window based approaches address this problem by tracking changes in classifier error rate, but are supervised in nature. We propose an efficient semi-supervised framework in this paper which uses change detection on classifier confidence to detect concept drifts, and to determine chunk boundaries dynamically. It also addresses the concept evolution problem by detecting outliers having strong cohesion among themselves. Experimental results on benchmark and synthetic data sets show the effectiveness of the proposed approach.

classifier, concept drift, data stream, (15 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > Wisconsin (0.04)

Technology:

Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Creating Images by Learning Image Semantics Using Vector Space Models

Heath, Derrall (Brigham Young University) | Ventura, Dan (Brigham Young University)

AAAI Conferences Apr-19-2016

When dealing with images and semantics, most computational systems attempt to automatically extract meaning from images. Here we attempt to go the other direction and autonomously create images that communicate concepts. We present an enhanced semantic model that is used to generate novel images that convey meaning. We employ a vector space model and a large corpus to learn vector representations of words and then train the semantic model to predict word vectors that could describe a given image. Once trained, the model autonomously guides the process of rendering images that convey particular concepts. A significant contribution is that, because of the semantic associations encoded in these word vectors, we can also render images that convey concepts on which the model was not explicitly trained. We evaluate the semantic model with an image clustering technique and demonstrate that the model is successful in creating images that communicate semantic relationships.

adjective, darci, vector, (17 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Alaska (0.04)
North America > United States > Utah > Utah County > Provo (0.04)
North America > United States > New York > New York County > New York City (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.73)
(2 more...)

Intrinsic and Extrinsic Evaluations of Word Embeddings

Zhai, Michael (Emory University) | Tan, Johnny (Emory University) | Choi, Jinho D. (Emory University)

AAAI Conferences Apr-19-2016

In this paper, we first analyze the semantic composition of word embeddings by cross-referencing their clusters with the manual lexical database, WordNet. We then evaluate a variety of word embedding approaches by comparing their contributions to two NLP tasks. Our experiments show that the word embedding clusters give high correlations to the synonym and hyponym sets in WordNet, and give 0.88% and 0.17% absolute improvements in accuracy to named entity recognition and part-of-speech tagging, respectively.

artificial intelligence, machine learning, natural language, (16 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Structure Aware L1 Graph for Data Clustering

Han, Shuchu (Stony Brook University) | Qin, Hong (Stony Brook University)

AAAI Conferences Apr-19-2016

In graph-oriented machine learning research, the L1 graph is an efficient way to represent the connections of input data samples. Its construction algorithm is based on a numerical optimization motivated by Compressive Sensing theory. As a result, it is a nonparametric method, which is in high demand. However, information about the data, such as geometric structure and density distribution, is ignored. In this paper, we propose a Structure Aware (SA) L1 graph to improve the data clustering performance by capturing the manifold structure of input data. We use a local dictionary for each datum while calculating its sparse coefficients. The SA-L1 graph not only preserves the locality of data but also captures the geometric structure of data. The experimental results show that our new algorithm has better clustering performance than the L1 graph.

artificial intelligence, graph, machine learning, (16 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.15)

Genre: Research Report > New Finding (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.73)

Teaching Big Data Analytics Skills with Intelligent Workflow Systems

Gil, Yolanda (University of Southern California)

AAAI Conferences Apr-19-2016

We have designed an open and modular course for data science and big data analytics using a workflow paradigm that allows students to easily experience big data through a sophisticated yet easy-to-use instrument: an intelligent workflow system. A key aspect of this work is the use of semantic workflows to capture and reuse end-to-end analytic methods that experts would use to analyze big data, and the use of an intelligent workflow system to elaborate the workflow and manage its execution and resulting datasets. Through the exposure of big data analytics in a workflow framework, students will be able to gain first-hand experience with a breadth of big data topics, including multi-step data analytic and statistical methods, software reuse and composition, parallel distributed programming, and high-end computing. In addition, students learn about a range of topics in AI, including semantic representations and ontologies, machine learning, natural language processing, and image analysis.

artificial intelligence, data mining, machine learning, (18 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.46)
Europe (0.28)

Genre:

Workflow (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Video Semantic Clustering with Sparse and Incomplete Tags

Wang, Jingya (Queen Mary University of London) | Zhu, Xiatian (Queen Mary University of London) | Gong, Shaogang (Queen Mary University of London)

AAAI Conferences Apr-19-2016

Clustering tagged videos into semantic groups is important but challenging due to the need for jointly learning correlations between heterogeneous visual and tag data. The task is made more difficult by inherently sparse and incomplete tag labels. In this work, we develop a method for accurately clustering tagged videos based on a novel Hierarchical-Multi-Label Random Forest model capable of correlating structured visual and tag information. Specifically, our model exploits hierarchically structured tags of different abstractness of semantics and multiple tag statistical correlations, and thus discovers more accurate semantic correlations among different video data, even with highly sparse/incomplete tags.

correlation, data mining, machine learning, (19 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Decentralized Robust Subspace Clustering

Liu, Bo (Rutgers, The State University of New Jersey) | Yuan, Xiao-Tong (Nanjing University of Information Science and Technology) | Yu, Yang (Rutgers, The State University of New Jersey) | Liu, Qingshan (Nanjing University of Information Science and Technology) | Metaxas, Dimitris N. (Rutgers, The State University of New Jersey)

AAAI Conferences Apr-19-2016

We consider the problem of subspace clustering using the SSC (Sparse Subspace Clustering) approach, which has several desirable theoretical properties and has been shown to be effective in various computer vision applications. We develop a large scale distributed framework for the computation of SSC via an alternating direction method of multipliers (ADMM) algorithm. The proposed framework solves SSC in column blocks and only involves parallel multivariate Lasso regression subproblems and sample-wise operations. This appealing property allows us to allocate multiple cores/machines for the processing of individual column blocks. We evaluate our algorithm on a shared-memory architecture. Experimental results on real-world datasets confirm that the proposed block-wise ADMM framework is substantially more efficient than its matrix counterpart used by SSC, without sacrificing accuracy. Moreover, our approach is directly applicable to decentralized neighborhood selection for Gaussian graphical models structure estimation.
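To make the Lasso subproblem mentioned above concrete, the following is a minimal single-machine sketch of the textbook ADMM iteration for minimising 0.5 * ||Ax - b||^2 + lam * ||x||_1. It is not the paper's block-wise distributed framework. The function names, the penalty parameter `rho`, the fixed iteration count, and the small Gaussian-elimination helper are all assumptions chosen to keep the example dependency-free.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    return max(v - t, 0.0) - max(-v - t, 0.0)

def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def lasso_admm(A, b, lam, rho=1.0, iters=200):
    """ADMM for min_x 0.5*||Ax - b||^2 + lam*||x||_1; returns the sparse iterate z."""
    m, n = len(A), len(A[0])
    # Precompute A^T A + rho*I and A^T b, which stay fixed across iterations.
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) + (rho if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(iters):
        # x-update: ridge-like linear solve.
        x = solve(M, [Atb[i] + rho * (z[i] - u[i]) for i in range(n)])
        # z-update: element-wise soft thresholding.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # u-update: scaled dual ascent on the constraint x = z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z

# With A = I the Lasso solution is soft_threshold(b, lam) element-wise.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.5, -2.0]
z = lasso_admm(A, b, lam=1.0)
```

In a sparse subspace clustering setting, each data point would be regressed on the other points with a Lasso of this kind. The identity-matrix example is used here only because its solution is known in closed form.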

artificial intelligence, data mining, machine learning, (16 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.28)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
