AITopics

2303.04134

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > India > Karnataka > Bengaluru (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.95)

Fachada, Nuno, de Andrade, Diogo

Generating Multidimensional Clusters With Support Lines

arXiv.org Artificial IntelligenceJul-30-2023

Synthetic data is essential for assessing clustering techniques, complementing and extending real data, and allowing for more complete coverage of a given problem's space. In turn, synthetic data generators have the potential of creating vast amounts of data -- a crucial activity when real-world data is at premium -- while providing a well-understood generation procedure and an interpretable instrument for methodically investigating cluster analysis algorithms. Here, we present Clugen, a modular procedure for synthetic data generation, capable of creating multidimensional clusters supported by line segments using arbitrary distributions. Clugen is open source, comprehensively unit tested and documented, and is available for the Python, R, Julia, and MATLAB/Octave ecosystems. We demonstrate that our proposal can produce rich and varied results in various dimensions, is fit for use in the assessment of clustering algorithms, and has the potential to be a widely used framework in diverse clustering-related research tasks.

artificial intelligence, data mining, machine learning, (18 more...)

doi: 10.1016/j.knosys.2023.110836

2301.10327

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
South America > Argentina > Pampas > Buenos Aires Province > La Plata (0.04)
(6 more...)

Genre:

Research Report (1.00)
Workflow (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial IntelligenceJul-29-2023

MUSE: Multi-View Contrastive Learning for Heterophilic Graphs

Yuan, Mengyi, Chen, Minjie, Li, Xiang

In recent years, self-supervised learning has emerged as a promising approach in addressing the issues of label dependency and poor generalization performance in traditional GNNs. However, existing self-supervised methods have limited effectiveness on heterophilic graphs, due to the homophily assumption that results in similar node representations for connected nodes. In this work, we propose a multi-view contrastive learning model for heterophilic graphs, namely, MUSE. Specifically, we construct two views to capture the information of the ego node and its neighborhood by GNNs enhanced with contrastive learning, respectively. Then we integrate the information from these two views to fuse the node representations. Fusion contrast is utilized to enhance the effectiveness of fused node representations. Further, considering that the influence of neighboring contextual information on information fusion may vary across different ego nodes, we employ an information fusion controller to model the diversity of node-neighborhood similarity at both the local and global levels. Finally, an alternating training scheme is adopted to ensure that unsupervised node representation learning and information fusion controller can mutually reinforce each other. We conduct extensive experiments to evaluate the performance of MUSE on 9 benchmark datasets. Our results show the effectiveness of MUSE on both node classification and clustering tasks.

artificial intelligence, machine learning, representation, (15 more...)

2307.16026

Country:

North America > United States > Texas (0.05)
North America > United States > Wisconsin (0.05)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Artificial IntelligenceJul-29-2023

Clustering with Neural Network and Index

Liu, Gangli

A new model called Clustering with Neural Network and Index (CNNI) is introduced. CNNI uses a Neural Network to cluster data points. Training of the Neural Network mimics supervised learning, with an internal clustering evaluation index acting as the loss function. An experiment is conducted to test the feasibility of the new model, and compared with results of other clustering models like K-means and Gaussian Mixture Model (GMM). The result shows CNNI can work properly for clustering data; CNNI equipped with MMJ-SC, achieves the first parametric (inductive) clustering model that can deal with non-convex shaped (non-flat geometry) data.

artificial intelligence, machine learning, neural network, (17 more...)

2212.03853

Country:

Asia > China > Beijing > Beijing (0.06)
Oceania > New Zealand (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceJul-28-2023

GraphDAC: A Graph-Analytic Approach to Dynamic Airspace Configuration

Feng, Ke, Liu, Dahai, Liu, Yongxin, Liu, Hong, Song, Houbing

The current National Airspace System (NAS) is reaching capacity due to increased air traffic, and is based on outdated pre-tactical planning. This study proposes a more dynamic airspace configuration (DAC) approach that could increase throughput and accommodate fluctuating traffic, ideal for emergencies. The proposed approach constructs the airspace as a constraints-embedded graph, compresses its dimensions, and applies a spectral clustering-enabled adaptive algorithm to generate collaborative airport groups and evenly distribute workloads among them. Under various traffic conditions, our experiments demonstrate a 50\% reduction in workload imbalances. This research could ultimately form the basis for a recommendation system for optimized airspace configuration. Code available at https://github.com/KeFenge2022/GraphDAC.git

airport, data mining, machine learning, (17 more...)

2307.15876

Country:

North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Florida > Sarasota County > Sarasota (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
Transportation > Infrastructure & Services > Airport (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

arXiv.org Artificial IntelligenceJul-28-2023

Towards a Fully Unsupervised Framework for Intent Induction in Customer Support Dialogues

Costa, Rita, Martins, Bruno, Viana, Sérgio, Coheur, Luisa

The evolution of technology has allowed the automation of several processes across diversified engineering industry fields, such as customer support services, which have drastically evolved with the advances in Natural Language Processing and Machine Learning. One of the major challenges of these systems is to identify users intentions, a complex Natural Language Understanding task, that vary across domains. With the evolution of Deep Learning architectures, recent works focused on modelling intentions and creating a taxonomy of intents, so they can be fed to powerful supervised clustering algorithms (Haponchyk et al., 2020; Chatterjee and Sengupta, 2021). However, these systems have the bottleneck of requiring the existence of labelled data to be trained and deployed, and, thus, they can not be easily transferred to real world customer support services, where the available data for a commercial chatbot usually consists in no more than a dataset of interactions between clients and operators. As labeling hundreds of utterances with intent labels can be time-consuming, laborious, expensive and, sometimes, even requires someone with expertise, it is not straightforward to apply current state of the art supervised models to new domains (Chatterjee and Sengupta, 2020).

machine learning, natural language, utterance, (18 more...)

2307.1541

Country:

Europe > Portugal > Lisbon > Lisbon (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Overview (0.66)
Research Report (0.50)

Industry: Consumer Products & Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Sturm, Dominik, Maddu, Suryanarayana, Sbalzarini, Ivo F.

Learning locally dominant force balances in active particle systems

We use a combination of unsupervised clustering and sparsity-promoting inference algorithms to learn locally dominant force balances that explain macroscopic pattern formation in self-organized active particle systems. The self-organized emergence of macroscopic patterns from microscopic interactions between self-propelled particles can be widely observed nature. Although hydrodynamic theories help us better understand the physical basis of this phenomenon, identifying a sufficient set of local interactions that shape, regulate, and sustain self-organized structures in active particle systems remains challenging. We investigate a classic hydrodynamic model of self-propelled particles that produces a wide variety of patterns, like asters and moving density bands. Our data-driven analysis shows that propagating bands are formed by local alignment interactions driven by density gradients, while steady-state asters are shaped by a mechanism of splay-induced negative compressibility arising from strong particle interactions. Our method also reveals analogous physical principles of pattern formation in a system where the speed of the particle is influenced by local density. This demonstrates the ability of our method to reveal physical commonalities across models. The physical mechanisms inferred from the data are in excellent agreement with analytical scaling arguments and experimental observations.

artificial intelligence, latexit sha1, machine learning, (17 more...)

2307.1497

Country:

Europe > Germany > Saxony (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Data Science (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Zhang, Heng, Vargas, Danilo Vasconcellos

A Survey on Reservoir Computing and its Interdisciplinary Applications Beyond Traditional Machine Learning

Reservoir computing (RC), first applied to temporal signal processing, is a recurrent neural network in which neurons are randomly connected. Once initialized, the connection strengths remain unchanged. Such a simple structure turns RC into a non-linear dynamical system that maps low-dimensional inputs into a high-dimensional space. The model's rich dynamics, linear separability, and memory capacity then enable a simple linear readout to generate adequate responses for various applications. RC spans areas far beyond machine learning, since it has been shown that the complex dynamics can be realized in various physical hardware implementations and biological devices. This yields greater flexibility and shorter computation time. Moreover, the neuronal responses triggered by the model's dynamics shed light on understanding brain mechanisms that also exploit similar dynamical processes. While the literature on RC is vast and fragmented, here we conduct a unified review of RC's recent developments from machine learning to physics, biology, and neuroscience. We first review the early RC models, and then survey the state-of-the-art models and their applications. We further introduce studies on modeling the brain's mechanisms by RC. Finally, we offer new perspectives on RC development, including reservoir design, coding frameworks unification, physical RC implementations, and interaction between RC, cognitive neuroscience and evolution.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

doi: 10.1109/ACCESS.2023.3299296

2307.15092

Country:

Asia > Japan > Honshū (0.14)
Europe > France (0.14)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.14)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Promising Solution (0.87)

Industry:

Semiconductors & Electronics (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (1.00)

Napravnik, Mateja, Hržić, Franko, Tschauner, Sebastian, Štajduhar, Ivan

Building RadiologyNET: Unsupervised annotation of a large-scale multimodal medical database

Background and objective: The usage of machine learning in medical diagnosis and treatment has witnessed significant growth in recent years through the development of computer-aided diagnosis systems that are often relying on annotated medical radiology images. However, the availability of large annotated image datasets remains a major obstacle since the process of annotation is time-consuming and costly. This paper explores how to automatically annotate a database of medical radiology images with regard to their semantic similarity. Material and methods: An automated, unsupervised approach is used to construct a large annotated dataset of medical radiology images originating from Clinical Hospital Centre Rijeka, Croatia, utilising multimodal sources, including images, DICOM metadata, and narrative diagnoses. Several appropriate feature extractors are tested for each of the data sources, and their utility is evaluated using k-means and k-medoids clustering on a representative data subset. Results: The optimal feature extractors are then integrated into a multimodal representation, which is then clustered to create an automated pipeline for labelling a precursor dataset of 1,337,926 medical images into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation. Conclusion: The results suggest that fusing the embeddings of all three data sources together works best for the task of unsupervised clustering of large-scale medical data, resulting in the most concise clusters. Hence, this work is the first step towards building a much larger and more fine-grained annotated dataset of medical radiology images.

data mining, machine learning, natural language, (21 more...)

2308.08517

Country:

Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.25)
Europe > Austria > Styria > Graz (0.04)
Europe > Switzerland > Neuchâtel > Neuchâtel (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Identifying acute illness phenotypes via deep temporal interpolation and clustering network on physiologic signatures

Ren, Yuanfang, Li, Yanjun, Loftus, Tyler J., Balch, Jeremy, Abbott, Kenneth L., Datta, Shounak, Ruppert, Matthew M., Guan, Ziyuan, Shickel, Benjamin, Rashidi, Parisa, Ozrazgat-Baslanti, Tezcan, Bihorac, Azra

Initial hours of hospital admission impact clinical trajectory, but early clinical decisions often suffer due to data paucity. With clustering analysis for vital signs within six hours of admission, patient phenotypes with distinct pathophysiological signatures and outcomes may support early clinical decisions. We created a single-center, longitudinal EHR dataset for 75,762 adults admitted to a tertiary care center for 6+ hours. We proposed a deep temporal interpolation and clustering network to extract latent representations from sparse, irregularly sampled vital sign data and derived distinct patient phenotypes in a training cohort (n=41,502). Model and hyper-parameters were chosen based on a validation cohort (n=17,415). Test cohort (n=16,845) was used to analyze reproducibility and correlation with biomarkers. The training, validation, and testing cohorts had similar distributions of age (54-55 yrs), sex (55% female), race, comorbidities, and illness severity. Four clusters were identified. Phenotype A (18%) had most comorbid disease with higher rate of prolonged respiratory insufficiency, acute kidney injury, sepsis, and three-year mortality. Phenotypes B (33%) and C (31%) had diffuse patterns of mild organ dysfunction. Phenotype B had favorable short-term outcomes but second-highest three-year mortality. Phenotype C had favorable clinical outcomes. Phenotype D (17%) had early/persistent hypotension, high rate of early surgery, and substantial biomarker rate of inflammation but second-lowest three-year mortality. After comparing phenotypes' SOFA scores, clustering results did not simply repeat other acuity assessments. In a heterogeneous cohort, four phenotypes with distinct categories of disease and outcomes were identified by a deep temporal interpolation and clustering network. This tool may impact triage decisions and clinical decision-support under time constraints.

data mining, machine learning, natural language, (18 more...)

2307.15719

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Europe > United Kingdom > England (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language (0.67)