AITopics

Federated Learning (FL) shows promise in preserving privacy and enabling collaborative learning. However, most current solutions focus on private data collected from a single domain. A significant challenge arises when client data comes from diverse domains (i.e., domain shift), leading to poor performance on unseen domains. Existing Federated Domain Generalization approaches address this problem but assume each client holds data for an entire domain, limiting their practicality in real-world scenarios with domain-based heterogeneity and client sampling. To overcome this, we introduce FISC, a novel FL domain generalization paradigm that handles more complex domain distributions across clients. FISC enables learning across domains by extracting an interpolative style from local styles and employing contrastive learning. This strategy gives clients multi-domain representations and unbiased convergent targets. Empirical results on multiple datasets, including PACS, Office-Home, and IWildCam, show FISC outperforms state-of-the-art (SOTA) methods. Our method achieves accuracy improvements ranging from 3.64% to 57.22% on unseen domains. Our code is available at https://anonymous.4open.science/r/FISC-AAAI-16107.

artificial intelligence, dataset, machine learning, (17 more...)

2410.22622

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Virginia (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

A Walsh Hadamard Derived Linear Vector Symbolic Architecture

Alam, Mohammad Mahmudul, Oberle, Alexander, Raff, Edward, Biderman, Stella, Oates, Tim, Holt, James

Vector Symbolic Architectures (VSAs) are one approach to developing Neuro-symbolic AI, where two vectors in $\mathbb{R}^d$ are 'bound' together to produce a new vector in the same space. VSAs support the commutativity and associativity of this binding operation, along with an inverse operation, allowing one to construct symbolic-style manipulations over real-valued vectors. Most VSAs were developed before deep learning and automatic differentiation became popular and instead focused on efficacy in hand-designed systems. In this work, we introduce the Hadamard-derived linear Binding (HLB), which is designed to have favorable computational efficiency, to be effective in classic VSA tasks, and to perform well in differentiable systems. Code is available at https://github.com/FutureComputing4AI/Hadamard-derived-Linear-Binding
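The binding properties named in this abstract (commutativity, associativity, an inverse) can be illustrated with a minimal sketch. It does not implement HLB, whose definition is not given here. It assumes the classic multiply-style binding over vectors with entries in {-1, +1}, a simplification of the real-valued setting; the function names are hypothetical.

```python
import random


def random_bipolar(d, rng):
    """Draw a random vector with entries in {-1, +1} (assumed toy setting)."""
    return [rng.choice((-1, 1)) for _ in range(d)]


def bind(x, y):
    """Bind two vectors by element-wise multiplication."""
    return [a * b for a, b in zip(x, y)]


def unbind(bound, y):
    """Recover x from bind(x, y); with +/-1 entries each vector is its own inverse."""
    return bind(bound, y)


rng = random.Random(0)
d = 1024
role, filler, other = (random_bipolar(d, rng) for _ in range(3))

pair = bind(role, filler)
recovered = unbind(pair, role)

print(bind(role, filler) == bind(filler, role))                             # commutative
print(bind(bind(role, filler), other) == bind(role, bind(filler, other)))  # associative
print(recovered == filler)                                                  # exact inverse
```

The bound vector has the same dimension as its inputs, which is what lets bindings be composed and later undone inside a larger model.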

artificial intelligence, machine learning, natural language, (19 more...)

2410.22669

Country:

North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

An Efficient Approach to Generate Safe Drivable Space by LiDAR-Camera-HDmap Fusion

Ning, Minghao, Alghooneh, Ahmad Reza, Sun, Chen, Zhang, Ruihe, Panahandeh, Pouya, Tuer, Steven, Hashemi, Ehsan, Khajepour, Amir

In this paper, we propose an accurate and robust perception module for Autonomous Vehicles (AVs) for drivable space extraction. Perception is crucial in autonomous driving, where many deep learning-based methods, while accurate on benchmark datasets, fail to generalize effectively, especially in diverse and unpredictable environments. Our work introduces a robust easy-to-generalize perception module that leverages LiDAR, camera, and HD map data fusion to deliver a safe and reliable drivable space in all weather conditions. We present an adaptive ground removal and curb detection method integrated with HD map data for enhanced obstacle detection reliability. Additionally, we propose an adaptive DBSCAN clustering algorithm optimized for precipitation noise, and a cost-effective LiDAR-camera frustum association that is resilient to calibration discrepancies. Our comprehensive drivable space representation incorporates all perception data, ensuring compatibility with vehicle dimensions and road regulations. This approach not only improves generalization and efficiency, but also significantly enhances safety in autonomous vehicle operations. Our approach is tested on a real dataset and its reliability is verified during the daily (including harsh snowy weather) operation of our autonomous shuttle, WATonoBus.
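For readers unfamiliar with the clustering step, the following is a minimal sketch of plain DBSCAN on 2D points. It is not the adaptive variant proposed in the paper, whose details are not given in this abstract. The fixed `eps` and `min_pts` values and the toy points are assumptions for illustration only.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE for sparse points."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Two dense groups and one isolated return (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (2.5, 9.0)]
labels = dbscan(points, eps=0.3, min_pts=3)
print(labels)
```

Isolated returns fall below the density threshold and are labeled as noise, which is the property that makes density-based clustering a natural fit for filtering sparse spurious points.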

artificial intelligence, detection, machine learning, (17 more...)

2410.22314

Country:

North America > United States (0.46)
North America > Canada > Alberta (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (0.70)
Government > Regional Government (0.46)
Transportation > Ground > Road (0.36)
Information Technology > Robotics & Automation (0.36)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Roy, Krishna Chandra, Chen, Qian

LogSHIELD: A Graph-based Real-time Anomaly Detection Framework using Frequency Analysis

Anomaly-based cyber threat detection using deep learning is steadily growing in popularity for novel cyber-attack detection and forensics. A robust, efficient, and real-time threat detector in a large-scale operational enterprise network requires a model with high accuracy, high fidelity, and high throughput to detect malicious activities. Traditional anomaly-based detection models, however, suffer from high computational overhead and low detection accuracy, making them unsuitable for real-time threat detection. In this work, we propose LogSHIELD, a highly effective graph-based anomaly detection model for host data. We present a real-time threat detection approach using frequency-domain analysis of provenance graphs. To demonstrate the significance of graph-based frequency analysis, we propose two approaches. Approach-I uses a Graph Neural Network (GNN), LogGNN, and Approach-II performs frequency-domain analysis on graph node samples for graph embedding. Both approaches use a statistical clustering algorithm for anomaly detection. The proposed models are evaluated using a large host log dataset consisting of 774M benign logs and 375K malware logs. LogSHIELD explores the provenance graph to extract contextual and causal relationships among logs, exposing abnormal activities. It can detect stealthy and sophisticated attacks with over 98% average AUC and F1 scores. It significantly improves throughput, achieves an average detection latency of 0.13 seconds, and outperforms state-of-the-art models in detection time.

detection, graph, logshield, (17 more...)

2410.21936

Country:

North America > United States > Texas > Bexar County > San Antonio (0.04)
North America > United States > New Mexico > Socorro County > Socorro (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.72)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization

Ji, Zhong, Yang, Shuo, Liu, Jingren, Pang, Yanwei, Han, Jungong

Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data. However, current approaches inadequately address the intrinsic optimization of the co-occurrence matrix $\bar{A}$ based on cosine similarity, failing to achieve zero base-novel regions and adequate sparsity in base and novel domains. To address these deficiencies, we propose a Non-Negative Generalized Category Discovery (NN-GCD) framework. It employs Symmetric Non-negative Matrix Factorization (SNMF) as a mathematical medium to prove the equivalence of optimal K-means with optimal SNMF, and the equivalence of SNMF solver with non-negative contrastive learning (NCL) optimization. Utilizing these theoretical equivalences, it reframes the optimization of $\bar{A}$ and K-means clustering as an NCL optimization problem. Moreover, to satisfy the non-negative constraints and make a GCD model converge to a near-optimal region, we propose a GELU activation function and an NMF NCE loss. To transition $\bar{A}$ from a suboptimal state to the desired $\bar{A}^*$, we introduce a hybrid sparse regularization approach to impose sparsity constraints. Experimental results show NN-GCD outperforms state-of-the-art methods on GCD benchmarks, achieving an average accuracy of 66.1% on the Semantic Shift Benchmark, surpassing prior counterparts by 4.7%.
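For reference, the standard SNMF problem can be written as below. This is the generic textbook form, not the paper's exact formulation. The symbols $A$, $H$, $n$, and $k$ are assumed notation, and how the paper relates $A$ to its co-occurrence matrix $\bar{A}$ is not stated in this abstract.

```latex
% Generic symmetric NMF: approximate a symmetric non-negative similarity
% matrix A by a non-negative low-rank factor H (assumed notation).
\min_{H \in \mathbb{R}^{n \times k},\; H \ge 0} \; \bigl\| A - H H^{\top} \bigr\|_F^2,
\qquad A \in \mathbb{R}^{n \times n},\; A = A^{\top},\; A \ge 0 .
```

Here each row of $H$ can be read as a soft assignment of one sample to $k$ groups, which is what connects the factorization to clustering.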

category, dataset, international conference, (13 more...)

2410.21807

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Tianjin Province > Tianjin (0.05)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth

Elias, Noel

Sonar-based audio classification techniques are a growing area of research in the field of underwater acoustics. Usually, underwater noise picked up by passive sonar transducers contains all types of signals that travel through the ocean and is transformed into spectrographic images. As a result, the corresponding spectrograms intended to display the temporal-frequency data of a certain object often include the tonal regions of abundant extraneous noise that can effectively interfere with a 'contact'. So, a majority of spectrographic samples extracted from underwater audio signals are rendered unusable due to their clutter and lack the required distinguishability between different objects. With limited clean true data for supervised training, creating classification models for these audio signals is severely bottlenecked. This paper derives several new techniques to combat this problem by developing a novel Score-CAM based denoiser to extract an object's signature from noisy spectrographic data without being given any ground truth data. In particular, this paper proposes a novel generative adversarial network architecture for learning and producing spectrographic training data in similar distributions to low-feature spectrogram inputs. In addition, this paper also proposes a generalizable class activation mapping based denoiser for different distributions of acoustic data, even real-world data distributions. Utilizing these novel architectures and proposed denoising techniques, these experiments demonstrate state-of-the-art noise reduction accuracy and higher classification accuracy than current audio classification standards. As such, this approach has applications not only to audio data but also to countless data distributions used all around the world for machine learning.

signature, spectrogram, target class, (17 more...)

doi: 10.1109/IJCNN54540.2023.10191897

2410.21557

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Self-Supervised Graph Embedding Clustering

Li, Fangfang, Gao, Quanxue, Deng, Cheng, Xia, Wei

The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. However, it combines the K-means clustering and dimensionality reduction processes for optimization, leading to limitations in the clustering effect due to the introduced hyperparameters and the initialization of clustering centers. Moreover, maintaining class balance during clustering remains challenging. To overcome these issues, we propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework. Specifically, we establish a connection between K-means and the manifold structure, allowing us to perform K-means without explicitly defining centroids. Additionally, we use this centroid-free K-means to generate labels in low-dimensional space and subsequently utilize the label information to determine the similarity between samples. This approach ensures consistency between the manifold structure and the labels. Our model effectively achieves one-step clustering without the need for redundant balancing hyperparameters. Notably, we have discovered that maximizing the $\ell_{2,1}$-norm naturally maintains class balance during clustering, a result that we have theoretically proven. Finally, experiments on multiple datasets demonstrate that the clustering results of Our-LPP and Our-MFA exhibit excellent and reliable performance.
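The claim that maximizing the $\ell_{2,1}$-norm favors balanced clusters can be checked numerically with a small sketch. The abstract does not say which matrix the norm is applied to. The code below assumes a hard assignment written as one 0/1 indicator row per cluster, so the norm reduces to the sum of the square roots of the cluster sizes. The helper names and toy labels are hypothetical.

```python
import math


def l21_norm(rows):
    """Sum of the Euclidean norms of the rows of a matrix."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in rows)


def cluster_rows(labels, k):
    """One row per cluster: the 0/1 membership indicator over all samples."""
    return [[1.0 if lab == c else 0.0 for lab in labels] for c in range(k)]


# Twelve samples split evenly versus very unevenly across three clusters.
balanced = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
skewed = [0] * 10 + [1, 2]

balanced_score = l21_norm(cluster_rows(balanced, 3))  # sqrt(4) * 3
skewed_score = l21_norm(cluster_rows(skewed, 3))      # sqrt(10) + 1 + 1

print(balanced_score, skewed_score)
```

Because the square root is concave, the sum of square roots of cluster sizes with a fixed total is largest when the sizes are equal, so under this assumed setup a larger norm corresponds to a more balanced partition.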

dimensionality reduction, k-means, wang, (14 more...)

2409.15887

Country: North America > United States (0.15)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Aragam, Bryon, Yang, Ruiyi

Model-free Estimation of Latent Structure via Multiscale Nonparametric Maximum Likelihood

arXiv.org Machine Learning, Oct-29-2024

Multivariate distributions often carry latent structures that are difficult to identify and estimate, and which better reflect the data generating mechanism than extrinsic structures exhibited simply by the raw data. In this paper, we propose a model-free approach for estimating such latent structures whenever they are present, without assuming they exist a priori. Given an arbitrary density $p_0$, we construct a multiscale representation of the density and propose data-driven methods for selecting representative models that capture meaningful discrete structure. Our approach uses a nonparametric maximum likelihood estimator to estimate the latent structure at different scales and we further characterize their asymptotic limits. By carrying out such a multiscale analysis, we obtain coarse-to-fine structures inherent in the original distribution, which are integrated via a model selection procedure to yield an interpretable discrete representation of it. As an application, we design a clustering algorithm based on the proposed procedure and demonstrate its effectiveness in capturing a wide range of latent structures.

algorithm, latent structure, statistics, (16 more...)

2410.22248

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Li, Xuetong, Zhang, Xiao-Dong

Multi-view clustering integrating anchor attribute and structural information

arXiv.org Artificial Intelligence, Oct-28-2024

Multisource data has spurred the development of advanced clustering algorithms, such as multi-view clustering, which critically relies on constructing similarity matrices. Traditional algorithms typically generate these matrices from sample attributes alone. However, real-world networks often include pairwise directed topological structures critical for clustering. This paper introduces a novel multi-view clustering algorithm, AAS. It utilizes a two-step proximity approach via anchors in each view, integrating attribute and directed structural information. This approach enhances the clarity of category characteristics in the similarity matrices. The anchor structural similarity matrix leverages strongly connected components of directed graphs. The entire process, from similarity matrix construction to clustering, is consolidated into a unified optimization framework. Comparative experiments on the modified Attribute SBM dataset against eight algorithms affirm the effectiveness and superiority of AAS.
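The strongly connected components mentioned in this abstract can be computed with a standard graph algorithm. The sketch below uses Tarjan's algorithm on a small hypothetical directed graph. It shows only that building block, not how AAS turns the components into an anchor structural similarity matrix, which the abstract does not describe.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of its successors."""
    index = {}
    lowlink = {}
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop everything above it.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


# Hypothetical directed graph: a 3-cycle, a 2-cycle, and an isolated node.
edges = {"a": ["b"], "b": ["c"], "c": ["a", "d"],
         "d": ["e"], "e": ["d"], "f": []}
components = strongly_connected_components(edges)
print(components)
```

Nodes in the same component can reach each other along directed edges in both directions, which is the kind of structural signal that attribute-only similarity matrices miss. The recursive form is fine for small graphs; large graphs would need an iterative version.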

artificial intelligence, data mining, machine learning, (19 more...)

2410.21711

Country:

North America > United States (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

arXiv.org Artificial Intelligence, Oct-27-2024

Multiple kernel concept factorization algorithm based on global fusion

Li, Fei, Du, Liang, Ren, Chaohong

Abstract: The Non-negative Matrix Factorization (NMF) algorithm can only be used to find a low-rank approximation of the original non-negative data, while the Concept Factorization (CF) algorithm extends matrix factorization to a single non-linear kernel space, improving the learning ability and adaptability of matrix factorization. To address the problem of designing or selecting a proper kernel function for a specific dataset in an unsupervised setting, a new algorithm called Globalized Multiple Kernel CF (GMKCF) was proposed. Multiple candidate kernel functions were input at the same time and learned in the CF framework based on global linear fusion, obtaining a clustering result with high quality and stability and solving the kernel function selection problem faced by CF. The convergence of the proposed algorithm was verified by solving the model with alternating iteration. The experimental results on several real databases show that the proposed algorithm outperforms the comparison algorithms in data clustering, such as Kernel K-Means (KKM), Spectral Clustering (SC), Kernel CF (KCF), Co-regularized multi-view spectral clustering (Coreg), and Robust Multiple KKM (RMKKM).
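The idea of linearly fusing several candidate kernels can be sketched as a weighted sum of kernel matrices. This is an illustration only. In GMKCF the fusion weights are learned inside the CF framework, whereas here they are fixed by hand. The non-negative, sum-to-one constraint, the kernel choices, and all names are assumptions.

```python
import math


def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]


def rbf_kernel(X, gamma):
    """Gaussian kernel matrix exp(-gamma * squared distance)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in X] for x in X]


def fuse_kernels(kernels, weights):
    """Weighted sum K = sum_i w_i K_i with w_i >= 0 and sum_i w_i = 1 (assumed)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [[sum(w * K[r][c] for w, K in zip(weights, kernels))
             for c in range(n)] for r in range(n)]


# Three toy samples and three candidate kernels (hypothetical values).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
candidates = [linear_kernel(X), rbf_kernel(X, 0.5), rbf_kernel(X, 2.0)]
fused = fuse_kernels(candidates, [0.2, 0.5, 0.3])

for row in fused:
    print([round(v, 4) for v in row])
```

A non-negative combination of valid kernel matrices is again a valid kernel matrix, so the fused matrix can be passed to any kernel-based clustering step in place of a single hand-picked kernel.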

artificial intelligence, factorization, machine learning, (9 more...)

doi: 10.11772/j.issn.1001-9081.2018081817

2410.20383

Country:

Asia > China (0.05)
North America > United States > New York (0.05)
North America > United States > California > San Mateo County > Menlo Park (0.05)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)