AITopics

2404.17922

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

Bored to Death: Artificial Intelligence Research Reveals the Role of Boredom in Suicide Behavior

Lissak, Shir, Ophir, Yaakov, Tikochinski, Refael, Klomek, Anat Brunstein, Sisso, Itay, Fruchter, Eyal, Reichart, Roi

Background: Recent advancements in Artificial Intelligence (AI) contributed significantly to suicide assessment, however, our theoretical understanding of this complex behavior is still limited. Objective: This study aimed to harness AI methodologies to uncover hidden risk factors that trigger or aggravate suicide behaviors. Method: The primary dataset included 228,052 Facebook postings by 1,006 users who completed the gold-standard Columbia Suicide Severity Rating Scale. This dataset was analyzed using a bottom-up research pipeline without a-priory hypotheses and its findings were validated using a top-down analysis of a new dataset. This secondary dataset included responses by 1,062 participants to the same suicide scale as well as to well-validated scales measuring depression and boredom. Results: An almost fully automated, AI-guided research pipeline resulted in four Facebook topics that predicted the risk of suicide, of which the strongest predictor was boredom. A comprehensive literature review using APA PsycInfo revealed that boredom is rarely perceived as a unique risk factor of suicide. A complementing top-down path analysis of the secondary dataset uncovered an indirect relationship between boredom and suicide, which was mediated by depression. An equivalent mediated relationship was observed in the primary Facebook dataset as well. However, here, a direct relationship between boredom and suicide risk was also observed. Conclusions: Integrating AI methods allowed the discovery of an under-researched risk factor of suicide. The study signals boredom as a maladaptive 'ingredient' that might trigger suicide behaviors, regardless of depression. Further studies are recommended to direct clinicians' attention to this burdening, and sometimes existential experience.

boredom, depression, participant, (14 more...)

2404.14057

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Zelch, Christoph, Peters, Jan, von Stryk, Oskar

Clustering of Motion Trajectories by a Distance Measure Based on Semantic Features

Clustering of motion trajectories is highly relevant for human-robot interactions as it allows the anticipation of human motions, fast reaction to those, as well as the recognition of explicit gestures. Further, it allows automated analysis of recorded motion data. Many clustering algorithms for trajectories build upon distance metrics that are based on pointwise Euclidean distances. However, our work indicates that focusing on salient characteristics is often sufficient. We present a novel distance measure for motion plans consisting of state and control trajectories that is based on a compressed representation built from their main features. This approach allows a flexible choice of feature classes relevant to the respective task. The distance measure is used in agglomerative hierarchical clustering. We compare our method with the widely used dynamic time warping algorithm on test sets of motion plans for the Furuta pendulum and the Manutec robot arm and on real-world data from a human motion dataset. The proposed method demonstrates slight advantages in clustering and strong advantages in runtime, especially for long trajectories.

distance measure, sequence, trajectory, (13 more...)

doi: 10.1109/Humanoids57100.2023.10375228

2404.17269

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Wang, Zhengxiang, Rambow, Owen

Clustering Document Parts: Detecting and Characterizing Influence Campaigns from Documents

We propose a novel clustering pipeline to detect and characterize influence campaigns from documents. This approach clusters parts of document, detects clusters that likely reflect an influence campaign, and then identifies documents linked to an influence campaign via their association with the high-influence clusters. Our approach outperforms both the direct document-level classification and the direct document-level clustering approach in predicting if a document is part of an influence campaign. We propose various novel techniques to enhance our pipeline, including using an existing event factuality prediction system to obtain document parts, and aggregating multiple clustering experiments to improve the performance of both cluster and document classification. Classifying documents after clustering not only accurately extracts the parts of the documents that are relevant to influence campaigns, but also captures influence campaigns as a coordinated and holistic phenomenon. Our approach makes possible more fine-grained and interpretable characterizations of influence campaigns from documents.

dataset, high-influence cluster, influence campaign, (16 more...)

2402.17151

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Government > Voting & Elections (0.93)
Media > News (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Robertson, Sawyer, Wan, Zhengchao, Cloninger, Alexander

All You Need is Resistance: On the Equivalence of Effective Resistance and Certain Optimal Transport Problems on Graphs

The fields of effective resistance and optimal transport on graphs are filled with rich connections to combinatorics, geometry, machine learning, and beyond. In this article we put forth a bold claim: that the two fields should be understood as one and the same, up to a choice of $p$. We make this claim precise by introducing the parameterized family of $p$-Beckmann distances for probability measures on graphs and relate them sharply to certain Wasserstein distances. Then, we break open a suite of results including explicit connections to optimal stopping times and random walks on graphs, graph Sobolev spaces, and a Benamou-Brenier type formula for $2$-Beckmann distance. We further explore empirical implications in the world of unsupervised learning for graph data and propose further study of the usage of these metrics where Wasserstein distance may produce computational bottlenecks.

effective resistance, graph, theorem 3, (15 more...)

2404.15261

Country:

Europe > United Kingdom (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Russia (0.04)
(3 more...)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.92)

Block-Diagonal Guided DBSCAN Clustering

Zhao, Weibing

Cluster analysis plays a crucial role in database mining, and one of the most widely used algorithms in this field is DBSCAN. However, DBSCAN has several limitations, such as difficulty in handling high-dimensional large-scale data, sensitivity to input parameters, and lack of robustness in producing clustering results. This paper introduces an improved version of DBSCAN that leverages the block-diagonal property of the similarity graph to guide the clustering procedure of DBSCAN. The key idea is to construct a graph that measures the similarity between high-dimensional large-scale data points and has the potential to be transformed into a block-diagonal form through an unknown permutation, followed by a cluster-ordering procedure to generate the desired permutation. The clustering structure can be easily determined by identifying the diagonal blocks in the permuted graph. We propose a gradient descent-based method to solve the proposed problem. Additionally, we develop a DBSCAN-based points traversal algorithm that identifies clusters with high densities in the graph and generates an augmented ordering of clusters. The block-diagonal structure of the graph is then achieved through permutation based on the traversal order, providing a flexible foundation for both automatic and interactive cluster analysis. We introduce a split-and-refine algorithm to automatically search for all diagonal blocks in the permuted graph with theoretically optimal guarantees under specific cases. We extensively evaluate our proposed approach on twelve challenging real-world benchmark clustering datasets and demonstrate its superior performance compared to the state-of-the-art clustering method on every dataset.

algorithm, graph, representation, (16 more...)

2404.01341

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > Dominican Republic > Samaná > Samaná (0.04)
Asia (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Wagner, Theresa, Nestler, Franziska, Stoll, Martin

Fast Evaluation of Additive Kernels: Feature Arrangement, Fourier Methods, and Kernel Derivatives

One of the main computational bottlenecks when working with kernel based learning is dealing with the large and typically dense kernel matrix. Techniques dealing with fast approximations of the matrix vector product for these kernel matrices typically deteriorate in their performance if the feature vectors reside in higher-dimensional feature spaces. We here present a technique based on the non-equispaced fast Fourier transform (NFFT) with rigorous error analysis. We show that this approach is also well suited to allow the approximation of the matrix that arises when the kernel is differentiated with respect to the kernel hyperparameters; a problem often found in the training phase of methods such as Gaussian processes. We also provide an error analysis for this case. We illustrate the performance of the additive kernel scheme with fast matrix vector products on a number of data sets.

approximation, fourier coefficient, kernel, (14 more...)

2404.17344

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Indiana > Hamilton County > Fishers (0.04)
Europe > Germany (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre:

Overview (0.67)
Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Pandey, Anurag Kumar, Goyal, Anil, Sikka, Nikhil

RE-RFME: Real-Estate RFME Model for customer segmentation

Marketing is one of the high-cost activities for any online platform. With the increase in the number of customers, it is crucial to understand customers based on their dynamic behaviors to design effective marketing strategies. Customer segmentation is a widely used approach to group customers into different categories and design the marketing strategy targeting each group individually. Therefore, in this paper, we propose an end-to-end pipeline RE-RFME for segmenting customers into 4 groups: high value, promising, need attention, and need activation. Concretely, we propose a novel RFME (Recency, Frequency, Monetary and Engagement) model to track behavioral features of customers and segment them into different categories. Finally, we train the K-means clustering algorithm to cluster the user into one of the 4 categories. We show the effectiveness of the proposed approach on real-world Housing.com datasets for both website and mobile application users.

customer, platform, segmentation, (14 more...)

2404.17177

Country:

Asia > India (0.06)
North America > United States > District of Columbia > Washington (0.05)
Asia > Indonesia > Java > Jakarta > Jakarta (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Real Estate (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Turfah, Ali, Wen, Xiaoquan

Interpretable Clustering with the Distinguishability Criterion

arXiv.org Machine LearningApr-25-2024

Cluster analysis is a popular unsupervised learning tool used in many disciplines to identify heterogeneous sub-populations within a sample. However, validating cluster analysis results and determining the number of clusters in a data set remains an outstanding problem. In this work, we present a global criterion called the Distinguishability criterion to quantify the separability of identified clusters and validate inferred cluster configurations. Our computational implementation of the Distinguishability criterion corresponds to the Bayes risk of a randomized classifier under the 0-1 loss. We propose a combined loss function-based computational framework that integrates the Distinguishability criterion with many commonly used clustering procedures, such as hierarchical clustering, k-means, and finite mixture models. We present these new algorithms as well as the results from comprehensive data analysis based on simulation studies and real data applications.

algorithm, distinguishability criterion, interpretable clustering, (11 more...)

arXiv.org Machine Learning

2404.15967

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > Middle East (0.04)
Asia > Middle East (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial IntelligenceApr-24-2024

MoDE: CLIP Data Experts via Clustering

Ma, Jiawei, Huang, Po-Yao, Xie, Saining, Li, Shang-Wen, Zettlemoyer, Luke, Chang, Shih-Fu, Yih, Wen-Tau, Xu, Hu

The success of contrastive language-image pretraining (CLIP) relies on the supervision from the pairing between images and captions, which tends to be noisy in web-crawled data. We present Mixture of Data Experts (MoDE) and learn a system of CLIP data experts via clustering. Each data expert is trained on one data cluster, being less sensitive to false negative noises in other clusters. At inference time, we ensemble their outputs by applying weights determined through the correlation between task metadata and cluster conditions. To estimate the correlation precisely, the samples in one cluster should be semantically similar, but the number of data experts should still be reasonable for training and inference. As such, we consider the ontology in human language and propose to use fine-grained cluster centers to represent each data expert at a coarse-grained level. Experimental studies show that four CLIP data experts on ViT-B/16 outperform the ViT-L/14 by OpenAI CLIP and OpenCLIP on zero-shot image classification but with less ($<$35\%) training cost. Meanwhile, MoDE can train all data expert asynchronously and can flexibly include new data experts. The code is available at https://github.com/facebookresearch/MetaCLIP/tree/main/mode.

data expert, image-caption pair, vit-b 32, (14 more...)

2404.1603

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)