AITopics

doi: 10.1109/ICECET55527.2022.9872611

2206.06165

Country:

Africa > South Africa > Western Cape > Cape Town (0.05)
Europe > Italy > Lazio (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial Intelligence Sep-9-2022

Affinity-VAE for disentanglement, clustering and classification of objects in multidimensional image data

Mirecka, Jola, Famili, Marjan, Kotańska, Anna, Juraschko, Nikolai, Costa-Gomes, Beatriz, Palmer, Colin M., Thiyagalingam, Jeyan, Burnley, Tom, Basham, Mark, Lowe, Alan R.

In this work we present affinity-VAE: a framework for automatic clustering and classification of objects in multidimensional image data based on their similarity. The method expands on the concept of $\beta$-VAEs with an informed similarity-based loss component driven by an affinity matrix. The affinity-VAE is able to create rotationally-invariant, morphologically homogeneous clusters in the latent representation, with improved cluster separation compared with a standard $\beta$-VAE. We explore the extent of latent disentanglement and continuity of the latent spaces on both 2D and 3D image data, including simulated biological electron cryo-tomography (cryo-ET) volumes as an example of a scientific application.

artificial intelligence, latent space, machine learning, (17 more...)

2209.04517

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.90)

#artificialintelligence Sep-8-2022, 10:00:26 GMT

Color Quantization -- Using K Means Clustering

In simpler terms, color quantization is the quantization of color spaces. A color space describes the color channels whose values give each pixel in an image its precise hue. Quantization is a useful image compression technique for devices that can display only a limited number of colors because of memory restrictions. Each pixel can be represented by three features: its R, G and B values. Since each channel value ranges from 0 to 255, a pixel can take any of 256 * 256 * 256 (about 16.7 million) colors. Our goal now is to reduce the number of colors to a manageable number.
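As a rough illustration of the idea, the sketch below clusters the RGB values of an image with a plain Lloyd's k-means loop and replaces every pixel by its cluster centroid. The function name `quantize_colors`, the random test image, and the choice of k = 8 are assumptions made for this example and do not come from the article.

```python
import numpy as np

def quantize_colors(pixels, k, n_iter=20, seed=0):
    """Map an (n, 3) array of RGB pixels onto at most k colors (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    # Start from k pixels picked at random (distinct positions, not necessarily distinct colors).
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every pixel to every centroid, shape (n, k).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    palette = np.rint(centroids).astype(np.uint8)
    return palette[labels], palette

# Synthetic 16x16 RGB image standing in for a real photo.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
flat = image.reshape(-1, 3)
quantized_flat, palette = quantize_colors(flat, k=8)
quantized = quantized_flat.reshape(image.shape)
```

Storing the 8-entry palette plus one small index per pixel, rather than three bytes per pixel, is where the compression comes from.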

color quantization, color space, means clustering, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Machine Learning Sep-8-2022

Quantum Sparse Coding

Romano, Yaniv, Primack, Harel, Vaknin, Talya, Meirzada, Idan, Karpas, Ilan, Furman, Dov, Tradonsky, Chene, Shlomi, Ruti Ben

A ubiquitous problem in machine learning, statistics, and signal processing is to accurately estimate an unknown sparse vector from a few noisy linear measurements. This estimation problem, which we refer to as sparse coding, is at the heart of the field of compressed sensing, revealing that under sparsity assumptions it is possible to successfully recover a signal that was sampled significantly below the Nyquist rate [1, 2]. This, in turn, led to a dramatic increase in magnetic resonance imaging (MRI) scanning session speed [3]. Another exciting application that also builds on the sparsity assumption is unsupervised representation learning, i.e., given high-dimensional input data, such as an image, finding a low-dimensional representation that captures the intrinsic underlying structure in the input [4, 5, 6]. These representations are often used in image restoration tasks to effectively remove noise (denoising) [7, 8], fill in missing pixels (inpainting) [9, 10, 11], and achieve high-quality digital zoom (super-resolution) [10, 12, 13, 14]. Sparsity also plays a key role in linear regression when given a large pool of features, to form a predictive rule that estimates an unknown response using a smaller, interpretable subset of features that manifests the strongest effects [15, 16, 17, 18]. To formalize the sparse coding problem, which is central for tackling the aforementioned applications, we consider the following linear model: $b = Ax + v$, where $A$ is a matrix of size $M \times N$, the vector $x$ is of length $N$, and $v$ is a noise vector of length $M$. In this paper, we focus on a challenging setting in which $M \ll N$, where a crucial assumption we make is that the vector $x$ is $k$-sparse, i.e., it contains only $k$ non-zero elements with $k \ll N$ [2, 1, 19].
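To make the linear model b = Ax + v concrete, here is a minimal sketch that generates a sparse x and estimates it with ISTA (iterative soft-thresholding), a classical solver for the l1-regularised least-squares problem. This is not the quantum approach the paper proposes; the problem sizes, noise level, and regularisation weight `lam` are assumptions chosen only for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink each entry of z toward zero by t (proximal operator of the l1 norm)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def objective(A, b, x, lam):
    """Lasso objective: 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def ista(A, b, lam, n_iter=2000):
    """Minimise the lasso objective with a fixed step size of 1 / L."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient of the smooth term
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy instance with far fewer measurements than unknowns and a k-sparse x.
rng = np.random.default_rng(0)
M, N, k = 40, 100, 3
A = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
support = rng.choice(N, size=k, replace=False)
x_true[support] = 2.0 * rng.choice([-1.0, 1.0], size=k)
v = 0.01 * rng.standard_normal(M)
b = A @ x_true + v

lam = 0.05
x_hat = ista(A, b, lam)
```

Comparing the largest entries of `np.abs(x_hat)` with `support` is a quick way to see whether the sparse pattern was recovered on a given instance.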

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Machine Learning

2209.03788

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Africa > Comoros > Grande Comore > Moroni (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Artificial Intelligence Sep-7-2022

Grouping-matrix based Graph Pooling with Adaptive Number of Clusters

Ko, Sung Moon, Cho, Sungjun, Jeong, Dae-Woong, Han, Sehui, Lee, Moontae, Lee, Honglak

Graph pooling is a crucial operation for encoding hierarchical structures within graphs. Most existing graph pooling approaches formulate the problem as a node clustering task which effectively captures the graph topology. Conventional methods ask users to specify an appropriate number of clusters as a hyperparameter, then assume that all input graphs share the same number of clusters. In inductive settings where the number of clusters can vary, however, the model should be able to represent this variation in its pooling layers in order to learn suitable clusters. Thus we propose GMPool, a novel differentiable graph pooling architecture that automatically determines the appropriate number of clusters based on the input data. The main intuition involves a grouping matrix defined as a quadratic form of the pooling operator, which induces use of binary classification probabilities of pairwise combinations of nodes. GMPool obtains the pooling operator by first computing the grouping matrix, then decomposing it. Extensive evaluations on molecular property prediction tasks demonstrate that our method outperforms conventional methods.
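The relationship between a pooling operator and its grouping matrix can be sketched with a toy example. Assuming a hard one-hot assignment matrix S (nodes by clusters), the quadratic form S Sᵀ has entry (i, j) equal to 1 exactly when nodes i and j share a cluster, and an eigendecomposition recovers both the number of clusters and a factor that reproduces the grouping matrix. The function names and the hard-assignment setup are assumptions for illustration; the learned, differentiable model in the paper is not reproduced here.

```python
import numpy as np

def grouping_matrix(S):
    """Quadratic form of an assignment matrix: entry (i, j) is 1 when i and j share a cluster."""
    return S @ S.T

def decompose(M, tol=1e-8):
    """Factor a symmetric PSD grouping matrix as S_rec @ S_rec.T via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(M)
    keep = eigvals > tol  # the count of non-zero eigenvalues gives the number of clusters
    S_rec = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return S_rec, int(keep.sum())

# Six nodes in three clusters of sizes 2, 1 and 3.
labels = np.array([0, 0, 1, 2, 2, 2])
S = np.eye(3)[labels]          # one-hot assignment, shape (6, 3)
M = grouping_matrix(S)
S_rec, n_clusters = decompose(M)
```

The number of clusters is read off the spectrum rather than fixed in advance, which is the property the abstract emphasises.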

adaptive number, artificial intelligence, grouping-matrix, (2 more...)

doi: 10.1609/aaai.v37i7.26005

2209.02939

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Haug, Johannes, Braun, Alexander, Zürn, Stefan, Kasneci, Gjergji

Change Detection for Local Explainability in Evolving Data Streams

As complex machine learning models are increasingly used in sensitive applications like banking, trading or credit scoring, there is a growing demand for reliable explanation mechanisms. Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations. However, attribution methods typically assume a stationary environment in which the predictive model has been trained and remains stable. As a result, it is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications. In this paper, we discuss the impact of temporal change on local feature attributions. In particular, we show that local attributions can become obsolete each time the predictive model is updated or concept drift alters the data generating distribution. Consequently, local feature attributions in data streams provide high explanatory power only when combined with a mechanism that allows us to detect and respond to local changes over time. To this end, we present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift. CDLEEDS serves as an intuitive extension of attribution-based explanation techniques to identify outdated local attributions and enable more targeted recalculations. In experiments, we also show that the proposed framework can reliably detect both local and global concept drift. Accordingly, our work contributes to a more meaningful and robust explainability in online machine learning.

attribution, cdleed, concept drift, (13 more...)

2209.02764

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
Oceania > Australia > New South Wales (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Semi-Supervised Clustering via Dynamic Graph Structure Learning

Ling, Huaming, Bao, Chenglong, Liang, Xin, Shi, Zuoqiang

Most existing semi-supervised graph-based clustering methods exploit the supervisory information by either refining the affinity matrix or directly constraining the low-dimensional representations of data points. The affinity matrix represents the graph structure and is vital to the performance of semi-supervised graph-based clustering. However, existing methods adopt a static affinity matrix to learn the low-dimensional representations of data points and do not optimize the affinity matrix during the learning process. In this paper, we propose a novel dynamic graph structure learning method for semi-supervised clustering. In this method, we simultaneously optimize the affinity matrix and the low-dimensional representations of data points by leveraging the given pairwise constraints. Moreover, we propose an alternating minimization approach with proven convergence to solve the proposed nonconvex model. During the iteration process, our method cyclically updates the low-dimensional representations of data points and refines the affinity matrix, leading to a dynamic affinity matrix (graph structure). Specifically, for the update of the affinity matrix, we enforce the data points with remarkably different low-dimensional representations to have an affinity value of 0. Furthermore, we construct the initial affinity matrix by integrating the local distance and global self-representation among data points. Experimental results on eight benchmark datasets under different settings show the advantages of the proposed approach.

affinity matrix, pairwise constraint, representation, (13 more...)

2209.02513

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

An, Sungeun, Rugaber, Spencer, Hammock, Jennifer, Goel, Ashok K.

Understanding Self-Directed Learning in an Online Laboratory

We describe a study on the use of an online laboratory for self-directed learning by constructing and simulating conceptual models of ecological systems. In this study, we could observe only the modeling behaviors and outcomes; the learning goals and outcomes were unknown. We used machine learning techniques to analyze the modeling behaviors of 315 learners and 822 conceptual models they generated. We derive three main conclusions from the results. First, learners manifest three types of modeling behaviors: observation (simulation focused), construction (construction focused), and full exploration (model construction, evaluation and revision). Second, while observation was the most common behavior among all learners, construction without evaluation was more common for less engaged learners and full exploration occurred mostly for more engaged learners. Third, learners who explored the full cycle of model construction, evaluation and revision generated models of higher quality. These modeling behaviors provide insights into self-directed learning at large.

construction, learner, simulation, (12 more...)

2206.02742

Country:

South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > District of Columbia > Washington (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.70)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)
Education > Educational Setting > K-12 Education (0.69)
Education > Curriculum > Subject-Specific Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Merged-GHCIDR: Geometrical Approach to Reduce Image Data

Joshi, Devvrat, Thakkar, Janvi, Soni, Siddharth, Mody, Shril, Patil, Rohan, Batra, Nipun

The computational resources required to train a model have been increasing since the inception of deep networks. Training neural networks on massive datasets has become a challenging and time-consuming task. So, there arises a need to reduce the dataset without compromising the accuracy. In this paper, we present novel variations of an earlier approach called reduction through homogeneous clustering for reducing dataset size. The proposed methods are based on the idea of partitioning the dataset into homogeneous clusters and selecting images that contribute significantly to the accuracy. We propose two variations, Geometrical Homogeneous Clustering for Image Data Reduction (GHCIDR) and Merged-GHCIDR, upon the baseline algorithm, Reduction through Homogeneous Clustering (RHC), to achieve better accuracy and training time. The intuition behind GHCIDR involves selecting data points by cluster weights and geometrical distribution of the training set. Merged-GHCIDR involves merging clusters having the same labels using complete linkage clustering. We used three deep learning models: Fully Connected Networks (FCN), VGG1, and VGG16. We experimented with the two variants on four datasets: MNIST, CIFAR10, Fashion-MNIST, and Tiny-Imagenet. Merged-GHCIDR with the same percentage reduction as RHC showed an increase of 2.8%, 8.9%, 7.6% and 3.5% accuracy on MNIST, Fashion-MNIST, CIFAR10, and Tiny-Imagenet, respectively.
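The idea of partitioning a dataset into homogeneous clusters can be sketched as a recursive split: any cluster that still mixes labels is divided again by k-means seeded with its per-class means, and each single-label cluster is reduced to one prototype. This is a simplified reading of the baseline idea; the function name `rhc`, the stopping details, and the toy data are assumptions, and the weighting and cluster merging of GHCIDR and Merged-GHCIDR are not implemented.

```python
import numpy as np

def rhc(X, y, n_iter=10):
    """Reduce (X, y) to one mean prototype per homogeneous cluster (hypothetical helper)."""
    prototypes, proto_labels = [], []
    stack = [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        pts, lab = X[idx], y[idx]
        classes = np.unique(lab)
        if len(classes) == 1:  # homogeneous cluster: keep its mean as the prototype
            prototypes.append(pts.mean(axis=0))
            proto_labels.append(classes[0])
            continue
        # Seed one centroid per class present, then run a few k-means updates.
        centroids = np.stack([pts[lab == c].mean(axis=0) for c in classes])
        for _ in range(n_iter):
            d = ((pts[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            assign = d.argmin(axis=1)
            for j in range(len(classes)):
                if (assign == j).any():
                    centroids[j] = pts[assign == j].mean(axis=0)
        d = ((pts[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        parts = [idx[assign == j] for j in range(len(classes)) if (assign == j).any()]
        if len(parts) == 1:
            # k-means could not separate the points, so fall back to splitting by label.
            parts = [idx[lab == c] for c in classes]
        stack.extend(parts)
    return np.array(prototypes), np.array(proto_labels)

# Two Gaussian blobs standing in for image features of two classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
prototypes, proto_labels = rhc(X, y)
```

Every split produces strictly smaller index sets, so the recursion always terminates, and the reduced set contains at least one prototype per class.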

algorithm, dataset, merged-ghcidr, (12 more...)

2209.02609

Country: Asia > India > Gujarat > Gandhinagar (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Zdybał, Kamila, D'Alessio, Giuseppe, Aversano, Gianmarco, Malik, Mohammad Rafi, Coussement, Axel, Sutherland, James C., Parente, Alessandro

Advancing Reacting Flow Simulations with Data-Driven Models

arXiv.org Artificial Intelligence Sep-5-2022

The use of machine learning algorithms to predict behaviors of complex systems is booming. However, the key to an effective use of machine learning tools in multi-physics problems, including combustion, is to couple them to physical and computer models. The performance of these tools is enhanced if all the prior knowledge and the physical constraints are embodied. In other words, the scientific method must be adapted to bring machine learning into the picture, and make the best use of the massive amount of data we have produced, thanks to the advances in numerical computing. The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems. Examples of feature extraction in turbulent combustion data, empirical low-dimensional manifold (ELDM) identification, classification, regression, and reduced-order modeling are provided.

artificial intelligence, machine learning, simulation, (16 more...)

2209.02051

Country:

North America > United States (0.93)
Europe (0.93)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)