AITopics

2211.02396

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin (0.05)
North America > United States > Texas (0.05)
(11 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

arXiv.org Artificial IntelligenceNov-27-2022

Dual Information Enhanced Multi-view Attributed Graph Clustering

Lin, Jia-Qi, Chen, Man-Sheng, Zhu, Xi-Ran, Wang, Chang-Dong, Zhang, Haizhang

Multi-view attributed graph clustering is an important approach to partition multi-view data based on the attribute feature and adjacent matrices from different views. Some attempts have been made in utilizing Graph Neural Network (GNN), which have achieved promising clustering performance. Despite this, few of them pay attention to the inherent specific information embedded in multiple views. Meanwhile, they are incapable of recovering the latent high-level representation from the low-level ones, greatly limiting the downstream clustering performance. To fill these gaps, a novel Dual Information enhanced multi-view Attributed Graph Clustering (DIAGC) method is proposed in this paper. Specifically, the proposed method introduces the Specific Information Reconstruction (SIR) module to disentangle the explorations of the consensus and specific information from multiple views, which enables GCN to capture the more essential low-level representations. Besides, the Mutual Information Maximization (MIM) module maximizes the agreement between the latent high-level representation and low-level ones, and enables the high-level representation to satisfy the desired clustering structure with the help of the Self-supervised Clustering (SC) module. Extensive experiments on several real-world benchmarks demonstrate the effectiveness of the proposed DIAGC method compared with the state-of-the-art baselines.

artificial intelligence, machine learning, representation, (17 more...)

2211.14987

Country:

Asia > China > Guangdong Province > Zhuhai (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.64)

Industry: Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceNov-27-2022

Simplifying Clustering with Graph Neural Networks

Bianchi, Filippo Maria

The objective functions used in spectral clustering are usually composed of two terms: i) a term that minimizes the local quadratic variation of the cluster assignments on the graph and; ii) a term that balances the clustering partition and helps avoiding degenerate solutions. This paper shows that a graph neural network, equipped with suitable message passing layers, can generate good cluster assignments by optimizing only a balancing term. Results on attributed graph datasets show the effectiveness of the proposed approach in terms of clustering performance and computation time.

artificial intelligence, machine learning, node, (14 more...)

doi: 10.7557/18.6790

2207.08779

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Freire, Pedro J., Napoli, Antonio, Ron, Diego Arguello, Spinnler, Bernhard, Anderson, Michael, Schairer, Wolfgang, Bex, Thomas, Costa, Nelson, Turitsyn, Sergei K., Prilepsky, Jaroslaw E.

Reducing Computational Complexity of Neural Networks in Optical Channel Equalization: From Concepts to Implementation

arXiv.org Artificial IntelligenceNov-26-2022

In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN) based equalizers for the mitigation of impairments in high-speed coherent optical transmission systems. In this work, we provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feed-forward and recurrent NN designs. Additionally, we evaluate the influence these strategies have on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are taken into consideration. In this work, we propose and evaluate a Bayesian optimization-assisted compression, in which the hyperparameters of the compression are chosen to simultaneously reduce complexity and improve performance. In conclusion, the trade-off between the complexity of each compression approach and its performance is evaluated by utilizing both simulated and experimental data in order to complete the analysis. By utilizing optimal compression approaches, we show that it is possible to design an NN-based equalizer that is simpler to implement and has better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span. This is accomplished by reducing the number of multipliers used in the NN equalizer after applying the weighted clustering and pruning algorithms. Furthermore, we demonstrate that an equalizer based on NN can also achieve superior performance while still maintaining the same degree of complexity as the full electronic chromatic dispersion compensation block. We conclude our analysis by highlighting open questions and existing challenges, as well as possible future research directions.

artificial intelligence, machine learning, quantization, (18 more...)

doi: 10.1109/JLT.2023.3234327

2208.12866

Country:

North America > United States > New York (0.04)
Europe > United Kingdom (0.04)
Europe > Portugal (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (0.92)

Industry:

Information Technology (1.00)
Telecommunications > Networks (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Echegoyen, Carlos, Pérez, Aritz, Santafé, Guzmán, Pérez-Goya, Unai, Ugarte, María Dolores

Unsupervised Semantic Analysis of a Region from Satellite Image Time Series

arXiv.org Artificial IntelligenceNov-23-2022

Temporal sequences of satellite images constitute a highly valuable and abundant resource to analyze a given region. However, the labeled data needed to train most machine learning models are scarce and difficult to obtain. In this context, the current work investigates a fully unsupervised methodology that, given a sequence of images, learns a semantic embedding and then, creates a partition of the ground according to its semantic properties and its evolution over time. We illustrate the methodology by conducting the semantic analysis of a sequence of satellite images of a region of Navarre (Spain). The proposed approach reveals a novel broad perspective of the land, where potentially large areas that share both a similar semantic and a similar temporal evolution are connected in a compact and well-structured manner. The results also show a close relationship between the allocation of the clusters in the geographic space and their allocation in the embedded spaces. The semantic analysis is completed by obtaining the representative sequence of tiles corresponding to each cluster, the linear interpolation between related areas, and a graph that shows the relationships between the clusters, providing a concise semantic summary of the whole region.

artificial intelligence, machine learning, representation, (17 more...)

2208.13504

Country:

Oceania > New Caledonia (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona > Maricopa County > Scottsdale (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.48)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Evaluating COVID-19 Sequence Data Using Nearest-Neighbors Based Network Model

Ali, Sarwan

The SARS-CoV-2 coronavirus is the cause of the COVID-19 disease in humans. Like many coronaviruses, it can adapt to different hosts and evolve into different lineages. It is well-known that the major SARS-CoV-2 lineages are characterized by mutations that happen predominantly in the spike protein. Understanding the spike protein structure and how it can be perturbed is vital for understanding and determining if a lineage is of concern. These are crucial to identifying and controlling current outbreaks and preventing future pandemics. Machine learning (ML) methods are a viable solution to this effort, given the volume of available sequencing data, much of which is unaligned or even unassembled. However, such ML methods require fixed-length numerical feature vectors in Euclidean space to be applicable. Similarly, euclidean space is not considered the best choice when working with the classification and clustering tasks for biological sequences. For this purpose, we design a method that converts the protein (spike) sequences into the sequence similarity network (SSN). We can then use SSN as an input for the classical algorithms from the graph mining domain for the typical tasks such as classification and clustering to understand the data. We show that the proposed alignment-free method is able to outperform the current SOTA method in terms of clustering results. Similarly, we are able to achieve higher classification accuracy using well-known Node2Vec-based embedding compared to other baseline embedding approaches.

bioinformatics, data mining, machine learning, (21 more...)

2211.10546

Country:

Asia > India (0.04)
North America > United States > California (0.04)
Asia > Japan (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Scalable and Effective Conductance-based Graph Clustering

Lin, Longlong, Li, Rong-Hua, Jia, Tao

Conductance-based graph clustering has been recognized as a fundamental operator in numerous graph analysis applications. Despite the significant success of conductance-based graph clustering, existing algorithms are either hard to obtain satisfactory clustering qualities, or have high time and space complexity to achieve provable clustering qualities. To overcome these limitations, we devise a powerful \textit{peeling}-based graph clustering framework \textit{PCon}. We show that many existing solutions can be reduced to our framework. Namely, they first define a score function for each vertex, then iteratively remove the vertex with the smallest score. Finally, they output the result with the smallest conductance during the peeling process. Based on our framework, we propose two novel algorithms \textit{PCon\_core} and \emph{PCon\_de} with linear time and space complexity, which can efficiently and effectively identify clusters from massive graphs with more than a few billion edges. Surprisingly, we prove that \emph{PCon\_de} can identify clusters with near-constant approximation ratio, resulting in an important theoretical improvement over the well-known quadratic Cheeger bound. Empirical results on real-life and synthetic datasets show that our algorithms can achieve 5$\sim$42 times speedup with a high clustering accuracy, while using 1.4$\sim$7.8 times less memory than the baseline algorithms.

artificial intelligence, graph, machine learning, (16 more...)

2211.12511

Country:

Asia > China > Chongqing Province > Chongqing (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Fichera, Bernardo, Billard, Aude

Linearization and Identification of Multiple-Attractor Dynamical Systems through Laplacian Eigenmaps

Dynamical Systems (DS) are fundamental to the modeling and understanding time evolving phenomena, and have application in physics, biology and control. As determining an analytical description of the dynamics is often difficult, data-driven approaches are preferred for identifying and controlling nonlinear DS with multiple equilibrium points. Identification of such DS has been treated largely as a supervised learning problem. Instead, we focus on an unsupervised learning scenario where we know neither the number nor the type of dynamics. We propose a Graph-based spectral clustering method that takes advantage of a velocity-augmented kernel to connect data points belonging to the same dynamics, while preserving the natural temporal evolution. We study the eigenvectors and eigenvalues of the Graph Laplacian and show that they form a set of orthogonal embedding spaces, one for each sub-dynamics. We prove that there always exist a set of 2-dimensional embedding spaces in which the sub-dynamics are linear and n-dimensional embedding spaces where they are quasi-linear. We compare the clustering performance of our algorithm to Kernel K-Means, Spectral Clustering and Gaussian Mixtures and show that, even when these algorithms are provided with the correct number of sub-dynamics, they fail to cluster them correctly. We learn a diffeomorphism from the Laplacian embedding space to the original space and show that the Laplacian embedding leads to good reconstruction accuracy and a faster training time through an exponential decaying loss compared to the state-of-the-art diffeomorphism-based approaches.

artificial intelligence, eigenvector, machine learning, (17 more...)

2202.09171

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Oceania > New Zealand (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.63)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Sitokonstantinou, Vasileios

Big Earth Data and Machine Learning for Sustainable and Resilient Agriculture

Big streams of Earth images from satellites or other platforms (e.g., drones and mobile phones) are becoming increasingly available at low or no cost and with enhanced spatial and temporal resolution. This thesis recognizes the unprecedented opportunities offered by the high quality and open access Earth observation data of our times and introduces novel machine learning and big data methods to properly exploit them towards developing applications for sustainable and resilient agriculture. The thesis addresses three distinct thematic areas, i.e., the monitoring of the Common Agricultural Policy (CAP), the monitoring of food security and applications for smart and resilient agriculture. The methodological innovations of the developments related to the three thematic areas address the following issues: i) the processing of big Earth Observation (EO) data, ii) the scarcity of annotated data for machine learning model training and iii) the gap between machine learning outputs and actionable advice. This thesis demonstrated how big data technologies such as data cubes, distributed learning, linked open data and semantic enrichment can be used to exploit the data deluge and extract knowledge to address real user needs. Furthermore, this thesis argues for the importance of semi-supervised and unsupervised machine learning models that circumvent the ever-present challenge of scarce annotations and thus allow for model generalization in space and time. Specifically, it is shown how merely few ground truth data are needed to generate high quality crop type maps and crop phenology estimations. Finally, this thesis argues there is considerable distance in value between model inferences and decision making in real-world scenarios and thereby showcases the power of causal and interpretable machine learning in bridging this gap.

data mining, machine learning, natural language, (27 more...)

2211.12584

Country:

Asia > South Korea (0.14)
Oceania > Australia (0.14)
Asia > Vietnam (0.13)
(45 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)
(2 more...)

Industry:

Food & Agriculture > Agriculture (1.00)
Government > Regional Government > Europe Government (0.45)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
(8 more...)

#artificialintelligenceNov-21-2022, 19:00:41 GMT

Hierarchical Clustering in Machine Learning - Javatpoint

Hierarchical Clustering in Machine Learning with Machine Learning Tutorial, Machine Learning Introduction, What is Machine Learning, Data Machine Learning, Machine Learning vs Artificial Intelligence etc.

dendrogram, euclidean distance, hierarchical clustering, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.60)