AITopics

1906.10921

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Thosar, Madhura, Mueller, Christian A., Jaeger, Georg, Schleiss, Johannes, Pulugu, Narender, Chennaboina, Ravi Mallikarjun, Jeevangekar, Sai Vivek, Birk, Andreas, Pfingsthorn, Max, Zug, Sebastian

From Multi-modal Property Dataset to Robot-centric Conceptual Knowledge About Household Objects

arXiv.org Artificial IntelligenceJun-26-2019

Tool-use applications in robotics require conceptual knowledge about objects for informed decision making and object interactions. State-of-the-art methods employ hand-crafted symbolic knowledge which is defined from a human perspective and grounded into sensory data afterwards. However, due to different sensing and acting capabilities of robots, their conceptual understanding of objects must be generated from a robot's perspective entirely, which asks for robot-centric conceptual knowledge about objects. With this goal in mind, this article motivates that such knowledge should be based on physical and functional properties of objects. Consequently, a selection of ten properties is defined and corresponding extraction methods are proposed. This multi-modal property extraction forms the basis on which our second contribution, a robot-centric knowledge generation is build on. It employs unsupervised clustering methods to transform numerical property data into symbols, and Bivariate Joint Frequency Distributions and Sample Proportion to generate conceptual knowledge about objects using the robot-centric symbols. A preliminary implementation of the proposed framework is employed to acquire a dataset comprising physical and functional property data of 110 houshold objects. This Robot-Centric dataSet (RoCS) is used to evaluate the framework regarding the property extraction methods, the semantics of the considered properties within the dataset and its usefulness in real-world applications such as tool substitution.

artificial intelligence, knowledge, machine learning, (19 more...)

1906.11114

Country:

Europe (0.93)
North America > United States (0.92)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(2 more...)

#artificialintelligenceJun-24-2019, 13:27:38 GMT

Versatile linkage: a family of space-conserving strategies for agglomerative hierarchical clustering

Agglomerative hierarchical clustering can be implemented with several strategies that differ in the way elements of a collection are grouped together to build a hierarchy of clusters. Here we introduce versatile linkage, a new infinite system of agglomerative hierarchical clustering strategies based on generalized means, which go from single linkage to complete linkage, passing through arithmetic average linkage and other clustering methods yet unexplored such as geometric linkage and harmonic linkage. We compare the different clustering strategies in terms of cophenetic correlation, mean absolute error, and also tree balance and space distortion, two new measures proposed to describe hierarchical trees. Unlike the β-flexible clustering system, we show that the versatile linkage family is space-conserving.

artificial intelligence, linkage, machine learning, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Machine LearningJun-24-2019

Data-driven prediction of vortex-induced vibration response of marine risers subjected to three-dimensional current

Riemer-Sørensen, Signe, Wu, Jie, Lie, Halvor, Sævik, Svein, Kim, Sang-Woo

Slender marine structures such as deep-water marine risers are subjected to currents and will normally experience Vortex Induced Vibrations (VIV), which can cause fast accumulation of fatigue damage. The ocean current is often three-dimensional (3D), i.e., the direction and magnitude of the current vary throughout the water column. Today, semi-empirical tools are used by the industry to predict VIV induced fatigue on risers. The load model and hydrodynamic parameters in present VIV prediction tools are developed based on two-dimensional (2D) flow conditions, as it is challenging to consider the effect of 3D flow along the risers. Accordingly, the current profiles must be purposely made 2D during the design process, which leads to significant uncertainty in the prediction results. Further, due to the limitations in the laboratory, VIV model tests are mostly carried out under 2D flow conditions and thus little experimental data exist to document VIV response of riser subjected to varying directions of the current. However, a few experiments have been conducted with 3D current. We have used results from one of these experiments to investigate how well 1) traditional and 2) an alternative method based on a data driven prediction can describe VIV in 3D currents. Data driven modelling is particularly suited for complicated problems with many parameters and non-linear relationships. We have applied a data clustering algorithm to the experimental 3D flow data in order to identify measurable parameters that can influence responses. The riser responses are grouped based on their statistical characteristics, which relate to the direction of the flow. Furthermore we fit a random forest regression model to the measured VIV response and compare its performance with the predictions of existing VIV prediction tools (VIVANA-FD).

artificial intelligence, prediction, upstream oil & gas, (19 more...)

1906.11177

Country:

North America > United States > Texas (0.46)
Europe > Norway (0.29)
Europe > Switzerland (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Hang, Hanyuan, Cai, Yuchao, Yang, Hanfang

Density-based Clustering with Best-scored Random Forest

arXiv.org Machine LearningJun-24-2019

Regarded as one of the most basic tools to investigate statistical properties of unsupervised data, clustering aims to group a set of objects in such a way that objects in the same cluster are more similar in some sense to each other than to those in other clusters. Typical application possibilities are to be found reaching from categorization of tissues in medical imaging to grouping internet searching results. For instance, on PET scans, cluster analysis can distinguish between different types of tissue in a three-dimensional image for many different purposes (Filipovych et al., 2011) while in the process of intelligent grouping of the files and websites, clustering algorithms create a more relevant set of search results (Marco and Navigli, 2013). Because of their wide applications, more urgent requirements for clustering algorithms that not only maintain desirable prediction accuracy but also have high computational efficiency are raised.

algorithm, artificial intelligence, machine learning, (18 more...)

1906.10094

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.88)
Health & Medicine > Health Care Technology (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Zafar, Muhammad Rehman, Khan, Naimul Mefraz

DLIME: A Deterministic Local Interpretable Model-Agnostic Explanations Approach for Computer-Aided Diagnosis Systems

arXiv.org Artificial IntelligenceJun-24-2019

Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique used to increase the interpretability and explainability of black box Machine Learning (ML) algorithms. LIME typically generates an explanation for a single prediction by any ML model by learning a simpler interpretable model (e.g. linear classifier) around the prediction through generating simulated data around the instance by random perturbation, and obtaining feature importance through applying some form of feature selection. While LIME and similar local algorithms have gained popularity due to their simplicity, the random perturbation and feature selection methods result in "instability" in the generated explanations, where for the same prediction, different explanations can be generated. This is a critical issue that can prevent deployment of LIME in a Computer-Aided Diagnosis (CAD) system, where stability is of utmost importance to earn the trust of medical professionals. In this paper, we propose a deterministic version of LIME. Instead of random perturbation, we utilize agglomerative Hierarchical Clustering (HC) to group the training data together and K-Nearest Neighbour (KNN) to select the relevant cluster of the new instance that is being explained. After finding the relevant cluster, a linear model is trained over the selected cluster to generate the explanations. Experimental results on three different medical datasets show the superiority for Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where we quantitatively determine the stability of DLIME compared to LIME utilizing the Jaccard similarity among multiple generated explanations.

artificial intelligence, explanation, machine learning, (12 more...)

1906.10263

Country: North America > United States (0.69)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.35)

Van Lierde, Hadrien, Chow, Tommy W. S.

Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

arXiv.org Artificial IntelligenceJun-22-2019

Existing graph-based methods for extractive document summarization represent sentences of a corpus as the nodes of a graph or a hypergraph in which edges depict relationships of lexical similarity between sentences. Such approaches fail to capture semantic similarities between sentences when they express a similar information but have few words in common and are thus lexically dissimilar. To overcome this issue, we propose to extract semantic similarities based on topical representations of sentences. Inspired by the Hierarchical Dirichlet Process, we propose a probabilistic topic model in order to infer topic distributions of sentences. As each topic defines a semantic connection among a group of sentences with a certain degree of membership for each sentence, we propose a fuzzy hypergraph model in which nodes are sentences and fuzzy hyperedges are topics. To produce an informative summary, we extract a set of sentences from the corpus by simultaneously maximizing their relevance to a user-defined query, their centrality in the fuzzy hypergraph and their coverage of topics present in the corpus. We formulate a polynomial time algorithm building on the theory of submodular functions to solve the associated optimization problem. A thorough comparative analysis with other graph-based summarization systems is included in the paper. Our obtained results show the superiority of our method in terms of content coverage of the summaries.

artificial intelligence, machine learning, natural language, (21 more...)

doi: 10.1016/j.ins.2019.05.020

1906.09445

Country:

North America > United States > District of Columbia > Washington (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(27 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government (1.00)
Government > Immigration & Customs (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Gentile, Claudio, Vitale, Fabio, Rajagopalan, Anand

Flattening a Hierarchical Clustering through Active Learning

arXiv.org Machine LearningJun-22-2019

We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known important-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to linear-time implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.

algorithm, artificial intelligence, machine learning, (17 more...)

1906.09458

Country: Europe (0.45)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Fernández, Alberto, Gómez, Sergio

Versatile linkage: a family of space-conserving strategies for agglomerative hierarchical clustering

arXiv.org Machine LearningJun-21-2019

Abstract: Agglomerative hierarchical clustering can be implemented with several strategies that differ in the way elements of a collection are grouped together to build a hierarchy of clusters. Here we introduce versatile linkage, a new infinite system of agglomerative hierarchical clustering strategies based on generalized means, which go from single linkage to complete linkage, passing through arithmetic average linkage and other clustering methods yet unexplored such as geometric linkage and harmonic linkage. We compare the different clustering strategies in terms of cophenetic correlation, mean absolute error, and also tree balance and space distortion, two new measures proposed to describe hierarchical trees. Unlike the β-flexible clustering system, we show that the versatile linkage family is space-conserving.

artificial intelligence, linkage, machine learning, (16 more...)

1906.09222

Country: Europe > Spain (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Karlsen, Matthew R, Moschoyiannis, Sotiris K.

Customer Segmentation of Wireless Trajectory Data

arXiv.org Artificial IntelligenceJun-20-2019

Wireless trajectory data consists of a number of (time, point) entries where each point is associated with a particular wireless device (WAP or BLE beacon) tied to a location identifier, such as a place name. A trajectory relates to a particular mobile device. Such data can be clustered `semantically' to identify similar trajectories, where similarity relates to non-geographic characteristics such as the type of location visited. Here we present a new approach to semantic trajectory clustering for such data. The approach is applicable to interpreting data that does not contain geographical coordinates, and thus contributes to the current literature on semantic trajectory clustering. The literature does not appear to provide such an approach, instead focusing on trajectory data where latitude and longitude data is available. We apply the techniques developed above in the context of the Onward Journey Planner Application, with the motivation of providing on-line recommendations for onward journey options in a context-specific manner. The trajectories analysed indicate commute patterns on the London Underground. Points are only recorded for communication with WAP and BLE beacons within the rail network. This context presents additional challenge since the trajectories are `truncated', with no true origin and destination details. In the above context we find that there are a range of travel patterns in the data, without the existence of distinct clusters. Suggestions are made concerning how to approach the problem of provision of on-line recommendations with such a data set. Thoughts concerning the related problem of prediction of journey route and destination are also provided.

data mining, machine learning, trajectory, (18 more...)

1906.08874

Country:

Europe > United Kingdom > England > Greater London > London (0.24)
Europe > United Kingdom > England > Leicestershire > Leicester (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Rail (1.00)
Transportation > Passenger (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Communications > Mobile (0.87)