 individual component



HeteroJIVE: Joint Subspace Estimation for Heterogeneous Multi-View Data

Li, Jingyang, Lyu, Zhongyuan

arXiv.org Machine Learning

Many modern datasets consist of multiple related matrices measured on a common set of units, where the goal is to recover the shared low-dimensional subspace. While the Angle-based Joint and Individual Variation Explained (AJIVE) framework provides a solution, it relies on equal-weight aggregation, which can be strictly suboptimal when views exhibit significant statistical heterogeneity (arising from varying SNR and dimensions) and structural heterogeneity (arising from individual components). In this paper, we propose HeteroJIVE, a weighted two-stage spectral algorithm tailored to such heterogeneity. Theoretically, we first revisit the "non-diminishing" error barrier with respect to the number of views $K$ identified in recent literature for the equal-weight case. We demonstrate that this barrier is not universal: under generic geometric conditions, the bias term vanishes and our estimator achieves the $O(K^{-1/2})$ rate without the need for iterative refinement. Extending this to the general-weight case, we establish error bounds that explicitly disentangle the two layers of heterogeneity. Based on this, we derive an oracle-optimal weighting scheme implemented via a data-driven procedure. Extensive simulations corroborate our theoretical findings, and an application to TCGA-BRCA multi-omics data validates the superiority of HeteroJIVE in practice.
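As a rough illustration of the kind of weighted two-stage spectral aggregation the abstract describes, the sketch below first takes the leading left singular vectors of each view, then extracts the top eigenvectors of a weighted average of the per-view projection matrices. This is not the authors' implementation: the function names, the simulated data, the assumption that all ranks are known, and the example weights are hypothetical, and the weights shown are not the paper's oracle-optimal scheme.

```python
import numpy as np


def top_left_singular_vectors(X, rank):
    """Stage 1: orthonormal basis of the leading left singular subspace of one view."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]


def weighted_joint_subspace(views, view_ranks, joint_rank, weights=None):
    """Stage 2: top eigenvectors of the weighted average of per-view projectors."""
    if weights is None:
        weights = np.ones(len(views))      # equal-weight aggregation
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()      # normalize so the weights sum to one

    n_units = views[0].shape[0]
    M = np.zeros((n_units, n_units))
    for X, rank, w in zip(views, view_ranks, weights):
        U = top_left_singular_vectors(X, rank)
        M += w * (U @ U.T)

    # eigh returns eigenvalues in ascending order, so the leading
    # eigenvectors are the last columns.
    _, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, -joint_rank:]


def projection_distance(U, V):
    """Frobenius distance between the projectors onto span(U) and span(V)."""
    return np.linalg.norm(U @ U.T - V @ V.T)


def simulate_views(seed=0, n_units=200, dims=(50, 80, 120),
                   noise_sds=(0.05, 0.1, 0.3)):
    """Toy data (hypothetical): rank-2 joint part, rank-1 individual part per view."""
    rng = np.random.default_rng(seed)
    U_joint, _ = np.linalg.qr(rng.standard_normal((n_units, 2)))
    views = []
    for p, sd in zip(dims, noise_sds):
        U_ind, _ = np.linalg.qr(rng.standard_normal((n_units, 1)))
        signal = 5.0 * (U_joint @ rng.standard_normal((2, p))
                        + U_ind @ rng.standard_normal((1, p)))
        views.append(signal + sd * rng.standard_normal((n_units, p)))
    return views, U_joint


if __name__ == "__main__":
    views, U_true = simulate_views()
    ranks = [3, 3, 3]  # joint rank 2 plus individual rank 1 in every view
    U_equal = weighted_joint_subspace(views, ranks, joint_rank=2)
    # Hypothetical weights that favor the less noisy views.
    U_weighted = weighted_joint_subspace(views, ranks, joint_rank=2,
                                         weights=[4.0, 2.0, 1.0])
    print("equal weights:   ", projection_distance(U_equal, U_true))
    print("unequal weights: ", projection_distance(U_weighted, U_true))
```

Passing `weights=None` reproduces equal-weight aggregation, so the same function can be used to compare the two settings on a given dataset.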


Image Segmentation and Classification of E-waste for Training Robots for Waste Segregation

Tripathi, Prakriti

arXiv.org Artificial Intelligence

Industry partners provided a problem statement that involves classifying electronic waste using machine learning models, which will be utilized by pick-and-place robots for waste segregation. This was achieved by taking common electronic waste items, such as a mouse and a charger, unsoldering them, and taking pictures to create a custom dataset. The state-of-the-art YOLOv11 model was trained and run to achieve 70 mAP in real-time. The Mask R-CNN model was also trained and achieved 41 mAP. The model can be integrated with pick-and-place robots to perform segregation of e-waste. Electronic waste (e-waste) is one of the fastest-growing solid waste streams globally [2].


Enhancing Non-Intrusive Load Monitoring with Features Extracted by Independent Component Analysis

Hoosh, Sahar Moghimian, Kamyshev, Ilia, Ouerdane, Henni

arXiv.org Artificial Intelligence

In this paper, a novel neural network architecture is proposed to address the challenges in energy disaggregation algorithms. These challenges include the limited availability of data and the complexity of disaggregating a large number of appliances operating simultaneously. The proposed model utilizes independent component analysis as the backbone of the neural network and is evaluated using the F1-score for varying numbers of appliances working concurrently. Our results demonstrate that the model is less prone to overfitting, exhibits low complexity, and effectively decomposes signals with many individual components. Furthermore, we show that the proposed model outperforms existing algorithms when applied to real-world data.
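For readers unfamiliar with the evaluation metric mentioned above, the following minimal sketch computes the F1-score on per-appliance on/off states and averages it across appliances. The appliance names, the on/off sequences, and the use of a macro average are illustrative assumptions, not details taken from the paper.

```python
def f1_score_binary(y_true, y_pred):
    """F1 = 2PR / (P + R) for binary on/off labels (1 = appliance on)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_states, pred_states):
    """Unweighted mean of the per-appliance F1-scores."""
    scores = [f1_score_binary(true_states[name], pred_states[name])
              for name in true_states]
    return sum(scores) / len(scores)


# Hypothetical on/off states for two appliances over six time steps.
true_states = {"kettle": [1, 1, 0, 0, 1, 0], "fridge": [1, 1, 1, 0, 0, 1]}
pred_states = {"kettle": [1, 0, 0, 1, 1, 0], "fridge": [1, 1, 1, 0, 0, 1]}

if __name__ == "__main__":
    for name in true_states:
        print(name, f1_score_binary(true_states[name], pred_states[name]))
    print("macro F1:", macro_f1(true_states, pred_states))
```

In this toy case the kettle has two true positives, one false positive, and one false negative, so its precision and recall are both 2/3.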


Supervised Multi-Modal Fission Learning

Mao, Lingchao, Wang, Qi, Su, Yi, Lure, Fleming, Li, Jing

arXiv.org Artificial Intelligence

Learning from multimodal datasets can leverage complementary information and improve performance in prediction tasks. A commonly used strategy to account for feature correlations in high-dimensional datasets is the latent variable approach. Several latent variable methods have been proposed for multimodal datasets. However, these methods either focus on extracting the shared component across all modalities or on extracting both a shared component and individual components specific to each modality. To address this gap, we propose a Multi-Modal Fission Learning (MMFL) model that simultaneously identifies globally joint, partially joint, and individual components underlying the features of multimodal datasets. Unlike existing latent variable methods, MMFL uses supervision from the response variable to identify predictive latent components and has a natural extension for incorporating incomplete multimodal data. Through simulation studies, we demonstrate that MMFL outperforms various existing multimodal algorithms in both complete and incomplete modality settings. We applied MMFL to a real-world case study for early prediction of Alzheimer's Disease using multimodal neuroimaging and genomics data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. MMFL provided more accurate predictions and better insights into within- and across-modality correlations compared to existing methods.


Surgical Triplet Recognition via Diffusion Model

Liu, Daochang, Hu, Axel, Shah, Mubarak, Xu, Chang

arXiv.org Artificial Intelligence

Surgical triplet recognition is an essential building block to enable next-generation context-aware operating rooms. The goal is to identify the combinations of instruments, verbs, and targets presented in surgical video frames. In this paper, we propose DiffTriplet, a new generative framework for surgical triplet recognition employing the diffusion model, which predicts surgical triplets via iterative denoising. To handle the challenge of triplet association, two unique designs are proposed in our diffusion framework, i.e., association learning and association guidance. During training, we optimize the model in the joint space of triplets and individual components to capture the dependencies among them. At inference, we integrate association constraints into each update of the iterative denoising process, which refines the triplet prediction using the information of individual components. Experiments on the CholecT45 and CholecT50 datasets show the superiority of the proposed method in achieving a new state-of-the-art performance for surgical triplet recognition. Our codes will be released.


Learning Joint and Individual Structure in Network Data with Covariates

James, Carson, Yuan, Dongbang, Gaynanova, Irina, Arroyo, Jesús

arXiv.org Machine Learning

Network data is ubiquitous in many disciplines and application domains, including computer science, statistics, biology, and physics. These data, encoding relationships between units represented as nodes, are often accompanied by additional information about the nodes, usually referred to as node covariates, attributes, or metadata (Newman and Clauset, 2016; Liu, 2019; Chunaev, 2020). In these situations, a common goal is to understand the associations between the network connectivity and the node covariates. In our example, we consider international food commodity trade data represented as a network, where the nodes correspond to different countries and edge weights encode food commodity trade volumes between corresponding countries. The covariates at each node consist of economic and geographic information for each country, such as gross domestic product (GDP) per capita, birth rate and region. We wish to exploit that both datasets contain information about the nodes in order to better understand the structure of the network, node covariates and their relationship. Specifically, we seek to understand how economic and geographic factors explain the observed trade between countries, and identify additional information in the network that cannot be explained solely by these variables. There has been substantial work that incorporates network and node covariate information. Some examples include methods that use node covariates to improve community detection (Binkiewicz et al., 2017; Huang et al., 2023), dimensionality reduction (Zhao et al., 2022), regression with network information (Li et al., 2019) and mixed effect models for network edges (Hoff, 2005).


sJIVE: Supervised Joint and Individual Variation Explained

Palzer, Elise F., Wendt, Christine, Bowler, Russell, Hersh, Craig P., Safo, Sandra E., Lock, Eric F.

arXiv.org Machine Learning

Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Recent methods have sought to uncover underlying structure and relationships within and/or between the data sources, and other methods have sought to build a predictive model for an outcome using all sources. However, existing methods that do both are presently limited because they either (1) consider only the data structure shared by all datasets while ignoring structures unique to each source, or (2) extract underlying structures first without considering the outcome. We propose a method called supervised joint and individual variation explained (sJIVE) that can simultaneously (1) identify shared (joint) and source-specific (individual) underlying structure and (2) build a linear prediction model for an outcome using these structures. These two components are weighted to compromise between explaining variation in the multi-source data and in the outcome. Simulations show sJIVE to outperform existing methods when large amounts of noise are present in the multi-source data. An application to data from the COPDGene study reveals gene expression and proteomic patterns that are predictive of lung function. Functions to perform sJIVE are included in the R.JIVE package, available online at http://github.com/lockEF/r.jive.


System-Level Predictive Maintenance: Review of Research Literature and Gap Analysis

Miller, Kyle, Dubrawski, Artur

arXiv.org Artificial Intelligence

This paper reviews current literature in the field of predictive maintenance from the system point of view. We differentiate the existing capabilities of condition estimation and failure risk forecasting, as currently applied to simple components, from the capabilities needed to solve the same tasks for complex assets. System-level analysis faces more complex latent degradation states. It has to comprehensively account for active maintenance programs at each component level and consider coupling between different maintenance actions, while reflecting increased monetary and safety costs for system failures. As a result, methods that are effective for forecasting risk and informing maintenance decisions regarding individual components do not readily scale to provide reliable sub-system or system level insights. A novel holistic modeling approach is needed to incorporate available structural and physical knowledge and naturally handle the complexities of actively fielded and maintained assets.


Scientists Engineer First Particle Robots That Mimic Cells

#artificialintelligence

Researchers at Columbia Engineering and MIT Computer Science & Artificial Intelligence Lab (CSAIL) have engineered for the first time a particle robotic swarm with individual components that function as a whole. The novel kind of robot has never been seen before. "You can think of our new robot as the proverbial 'Gray Goo,'" said Hod Lipson, professor of mechanical engineering at Columbia Engineering. "Our robot has no single point of failure and no centralized control. It's still fairly primitive, but now we know that this fundamental robot paradigm is actually possible."