 elbow method


Application of machine learning in grain-related clustering of Laue spots in a polycrystalline energy dispersive Laue pattern

Tosson, Amir, Shokr, Mohammad, Humaidi, Mahmoud Al, Mikayelyan, Eduard, Gutt, Christian, Pietsch, Ulrich

arXiv.org Artificial Intelligence

We address the identification of grain-corresponding Laue reflections in energy dispersive Laue diffraction (EDLD) experiments by formulating it as a clustering problem solvable through unsupervised machine learning (ML). To achieve reliable and efficient identification of grains in a Laue pattern, we employ a combination of clustering algorithms, namely hierarchical clustering (HC) and K-means. These algorithms allow us to group together similar Laue reflections, revealing the underlying grain structure in the diffraction pattern. Additionally, we utilise the elbow method to determine the optimal number of clusters, ensuring accurate results. To evaluate the performance of our proposed method, we conducted experiments using both simulated and experimental datasets obtained from nickel wires. The simulated datasets were generated to mimic the characteristics of real-world EDLD experiments, while the experimental datasets were obtained from actual measurements.
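The elbow method mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it uses only K-means (no hierarchical clustering), and both the deterministic farthest-first initialisation and the "farthest point from the chord" elbow rule are assumptions chosen for the sketch. The names `kmeans_wcss` and `elbow_index` are hypothetical, and the toy data are three synthetic blobs, not Laue reflections.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from its nearest
    already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_wcss(points, k, n_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_first_init(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_index(values):
    """Index of the curve point farthest from the straight line joining the
    first and last points (one common elbow heuristic among several)."""
    last = len(values) - 1
    y0, y1 = values[0], values[-1]
    return max(
        range(len(values)),
        key=lambda i: abs((y1 - y0) * i - last * (values[i] - y0)),
    )


# Toy data: three well-separated blobs of four points each.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

ks = list(range(1, 7))
curve = [kmeans_wcss(points, k) for k in ks]
best_k = ks[elbow_index(curve)]
print(best_k)
```

On this toy data the curve drops steeply up to three clusters and flattens afterwards, so the heuristic picks three. On real data the bend is often less pronounced, which is why the choice of elbow rule matters.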


Exploring Cluster Analysis in Nelore Cattle Visual Score Attribution

Bezerra, Alexandre de Oliveira, Mateus, Rodrigo Goncalves, Weber, Vanessa Ap. de Moraes, Weber, Fabricio de Lima, de Arruda, Yasmin Alves, Gomes, Rodrigo da Costa, Higa, Gabriel Toshio Hirokawa, Pistori, Hemerson

arXiv.org Artificial Intelligence

Although there is no ideal biotype for all production systems, the adequate biotype should be determined according to the objectives that have been established for the herd, along with the production system being practiced [9]. This is not without consequences. For instance, larger animals usually have higher nutritional and general maintenance requirements [7]. Among the methods used to evaluate beef cattle, the EPMURAS methodology synthesized by Koury Filho [11] and Koury Filho et al. [13] is one of the most widely used in Brazil. It consists of a visual assessment of body structure, precocity, muscularity, sheath, racial aspects, angulation and sexuality.


Exploring Unsupervised Learning Metrics - KDnuggets

#artificialintelligence

Unsupervised learning is a branch of machine learning where the models learn patterns from the available data rather than being provided with the actual labels. We let the algorithm come up with the answers. In unsupervised learning, there are two main techniques: clustering and dimensionality reduction. The clustering technique uses an algorithm to learn patterns that segment the data. In contrast, the dimensionality reduction technique tries to reduce the number of features while keeping the actual information intact as much as possible.


K-means Clustering and Principal Component Analysis in 10 Minutes

#artificialintelligence

There are 2 major kinds of machine learning models: supervised and unsupervised. In supervised learning, you have input data X and output data y, then the model finds a map from X to y. In unsupervised learning, you only have input data X. The goal of unsupervised learning varies: clustering observations in X, reducing the dimensionality of X, anomaly detection in X, etc. As supervised learning has been discussed extensively in Part 1 and Part 2 of the series, this story is focused on unsupervised learning.


Customer Segmentation With Clustering

#artificialintelligence

Let's say that you work with the sales and marketing team to reach your company's pre-set goals. While your company is doing well in terms of generating revenue and retaining customers, you cannot help but think that it can do better. As things stand, the advertisements, promotions, and special offers are homogeneous across all customers, which is a serious issue. Engaging with customers in a manner that they won't be receptive to is tantamount to wasting your advertising budget. After all, you don't want your company to spend its limited budget sending diaper coupons to college students or advertising gaming consoles to elderly women.


Stop using the Elbow Method

#artificialintelligence

A common challenge we face when performing clustering with K-Means is finding the optimal number of clusters. Naturally, the celebrated and popular Elbow method is the technique that most data scientists use to solve this particular problem. In this post, we are going to learn a more precise and less subjective approach to help us find the optimal number of clusters: silhouette score analysis. In another post, I provide a thorough explanation of the K-Means algorithm, its subtleties (centroid initialization, data standardization, and the number of clusters), and some pros and cons. There, I also explain when and how to use the Elbow Method.
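As a rough illustration of the silhouette score, here is a minimal sketch using only the Python standard library. For each point it compares `a`, the mean distance to the other members of its own cluster, with `b`, the lowest mean distance to the members of any other cluster, and averages `(b - a) / max(a, b)` over all points. The function name `silhouette_score` and the toy data are assumptions for this example, and giving singleton clusters a score of 0 is a common convention adopted here.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (values lie in [-1, 1])."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # Mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so divide by len - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Lowest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups nearby points
bad = silhouette_score(points, [0, 1, 0, 1])   # pairs distant points
print(good, bad)
```

A score near 1 indicates compact, well-separated clusters, while a negative score suggests that points sit closer to another cluster than to their own. To choose the number of clusters, one would compute this score for each candidate value and keep the highest.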


Understanding KMeans Clustering for Data Science Beginners

#artificialintelligence

Clustering is an unsupervised learning method whose job is to separate the population or data points into several groups, such that data points in a group are more similar to each other and dissimilar to the data points of other groups. It is nothing but a collection of objects grouped on the basis of the similarity and dissimilarity between them. KMeans clustering is an Unsupervised Machine Learning algorithm that performs the clustering task. In this method, the 'n' observations are grouped into 'K' clusters based on distance. The algorithm tries to minimize the within-cluster variance (so that similar observations fall in the same cluster).
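The procedure described above can be sketched with Lloyd's algorithm, the usual iterative scheme for K-means: assign each observation to its nearest centroid, move each centroid to the mean of its members, and repeat until the assignments stop changing. This is a minimal standard-library sketch rather than a production implementation; the function name `kmeans`, the seeded random initialisation, and the toy points are assumptions for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Group points into k clusters; return (labels, centroids, wcss)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct observations as seeds
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    # Within-cluster sum of squares: the quantity K-means tries to minimise.
    wcss = sum(
        squared_distance(p, centroids[lab]) for p, lab in zip(points, labels)
    )
    return labels, centroids, wcss


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids, wcss = kmeans(points, k=2)
print(labels, wcss)
```

Each step can only lower or preserve the within-cluster sum of squares, so the loop converges, but only to a local minimum. In practice the algorithm is usually run from several initialisations.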


K-Means Clustering: Techniques to Find the Optimal Clusters

#artificialintelligence

As the points are uniformly distributed, the KMeans algorithm splits them evenly into K clusters even though there is no separation between them. The gap statistic gives the optimal number of clusters as 10, based on the maximum gap between the cluster inertia on the data and on the null reference data.
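A minimal sketch of the gap statistic follows, assuming reference datasets drawn uniformly from the bounding box of the data. For each k it computes the mean of log(inertia) over the reference sets minus log(inertia) on the real data, then picks the k with the largest gap, following the "maximum gap" wording above. The original formulation by Tibshirani et al. additionally uses a standard-error term in its selection rule, which is omitted here. The names `kmeans_wcss` and `gap_values`, the farthest-first initialisation, and the toy data are assumptions for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points, k, n_iter=100):
    """Inertia (within-cluster sum of squares) after Lloyd's algorithm with
    deterministic farthest-first seeding (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def gap_values(points, ks, n_refs=10, seed=0):
    """Gap(k) = mean over reference sets of log(inertia) - log(data inertia).
    Assumes every inertia is strictly positive (k below the number of
    distinct points), otherwise the logarithm is undefined."""
    rng = random.Random(seed)
    lows = [min(axis) for axis in zip(*points)]
    highs = [max(axis) for axis in zip(*points)]
    # Null reference data: uniform samples inside the data's bounding box.
    refs = [
        [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
         for _ in points]
        for _ in range(n_refs)
    ]
    gaps = []
    for k in ks:
        ref_log = sum(math.log(kmeans_wcss(ref, k)) for ref in refs) / n_refs
        gaps.append(ref_log - math.log(kmeans_wcss(points, k)))
    return gaps


# Toy data: three well-separated blobs of four points each.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

ks = list(range(1, 7))
gaps = gap_values(points, ks)
best_k = ks[max(range(len(gaps)), key=lambda i: gaps[i])]
print(best_k)
```

A large gap means the data cluster much more tightly at that k than structureless data would. On uniformly distributed input, as in the passage above, the gaps tend to stay small for every k, so the selected value should be read with caution.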


Automated Timeline Length Selection for Flexible Timeline Summarization

Li, Xi, Mao, Qianren, Peng, Hao, Zhu, Hongdong, Li, Jianxin, Wang, Zheng

arXiv.org Artificial Intelligence

By producing summaries for long-running events, timeline summarization (TLS) underpins many information retrieval tasks. Successful TLS requires identifying an appropriate set of key dates (the timeline length) to cover. However, doing so is challenging as the right length can change from one topic to another. Existing TLS solutions either rely on an event-agnostic fixed length or an expert-supplied setting. Neither strategy is desirable for real-life TLS scenarios. A fixed, event-agnostic setting ignores the diversity of events and their development and hence can lead to low-quality TLS. Relying on expert-crafted settings is neither scalable nor sustainable for processing many dynamically changing events. This paper presents a better TLS approach for automatically and dynamically determining the TLS timeline length. We achieve this by employing the established elbow method from the machine learning community to automatically find the minimum number of dates within the time series to generate concise and informative summaries. We applied our approach to four TLS datasets of English and Chinese and compared it against three prior methods. Experimental results show that our approach delivers comparable or even better summaries than state-of-the-art TLS methods, and it achieves this without expert involvement.