 sle


Scalable Structure Learning of Bayesian Networks by Learning Algorithm Ensembles

Liu, Shengcai, Ou-yang, Hui, Wang, Zhiyuan, Chen, Cheng, Cai, Qijun, Ong, Yew-Soon, Tang, Ke

arXiv.org Artificial Intelligence

Learning the structure of Bayesian networks (BNs) from data is challenging, especially for datasets involving a large number of variables. The recently proposed divide-and-conquer (D&D) strategies present a promising approach for learning large BNs. However, they still suffer from unstable learning accuracy across subproblems. In this work, we introduce the idea of employing a structure learning ensemble (SLE), which combines multiple BN structure learning algorithms, to consistently achieve high learning accuracy. We further propose an automatic approach called Auto-SLE for learning near-optimal SLEs, addressing the challenge of manually designing high-quality SLEs. The learned SLE is then integrated into a D&D method. Extensive experiments show the superiority of our method over D&D methods that use a single BN structure learning algorithm in learning large BNs, with accuracy improvements usually in the range of 30% to 225% on datasets involving 10,000 variables. These results indicate the significant potential of employing (automatic learning of) SLEs for scalable BN structure learning. Learning the structure of Bayesian networks (BNs) [1] from data has attracted much research interest, due to its wide applications in machine learning, statistical modeling, and causal inference [2]-[4].


Subjective Logic Encodings

Vasilakes, Jake

arXiv.org Artificial Intelligence

Many existing approaches for learning from labeled data assume the existence of gold-standard labels. According to these approaches, inter-annotator disagreement is seen as noise to be removed, either through refinement of annotation guidelines, label adjudication, or label filtering. However, annotator disagreement can rarely be totally eradicated, especially on more subjective tasks such as sentiment analysis or hate speech detection where disagreement is natural. Therefore, a new approach to learning from labeled data, called data perspectivism, seeks to leverage inter-annotator disagreement to learn models that stay true to the inherent uncertainty of the task by treating annotations as opinions of the annotators, rather than gold-standard facts. Despite this conceptual grounding, existing methods under data perspectivism are limited to using disagreement as the sole source of annotation uncertainty. To expand the possibilities of data perspectivism, we introduce Subjective Logic Encodings (SLEs), a flexible framework for constructing classification targets that explicitly encodes annotations as opinions of the annotators. Based on Subjective Logic Theory, SLEs encode labels as Dirichlet distributions and provide principled methods for encoding and aggregating various types of annotation uncertainty -- annotator confidence, reliability, and disagreement -- into the targets. We show that SLEs are a generalization of other types of label encodings as well as how to estimate models to predict SLEs using a distribution matching objective.
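As a rough illustration of how an annotation treated as an opinion can become a Dirichlet target, the sketch below applies the standard Subjective Logic mapping from a multinomial opinion (belief masses, uncertainty, base rates) to Dirichlet parameters. It is not taken from the paper: the function name `opinion_to_dirichlet`, the prior weight `W = 2`, and the example numbers are assumptions made for illustration only.

```python
def opinion_to_dirichlet(beliefs, uncertainty, base_rates, prior_weight=2.0):
    """Map a multinomial opinion to Dirichlet parameters.

    Uses the usual Subjective Logic relations (assumed here, not quoted
    from the paper):
        evidence_k = W * belief_k / uncertainty
        alpha_k    = evidence_k + W * base_rate_k
    where W is the non-informative prior weight.
    """
    if uncertainty <= 0.0:
        raise ValueError("uncertainty must be positive for a finite Dirichlet")
    if abs(sum(beliefs) + uncertainty - 1.0) > 1e-9:
        raise ValueError("beliefs and uncertainty must sum to 1")
    return [
        prior_weight * b / uncertainty + prior_weight * a
        for b, a in zip(beliefs, base_rates)
    ]


# Hypothetical annotator opinion over two classes: fairly confident in class 0.
alpha = opinion_to_dirichlet([0.5, 0.25], 0.25, [0.5, 0.5])
print(alpha)
```

Lower uncertainty yields larger parameters and therefore a more peaked target distribution, which is one way an annotator's confidence could be reflected in the target.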


A data-driven approach to discover and quantify systemic lupus erythematosus etiological heterogeneity from electronic health records

Mota, Marco Barbero, Still, John M., Gamboa, Jorge L., Strobl, Eric V., Stein, Charles M., Kawai, Vivian K., Lasko, Thomas A.

arXiv.org Artificial Intelligence

Systemic lupus erythematosus (SLE) is a complex heterogeneous disease with many manifestational facets. We propose a data-driven approach to discover probabilistic independent sources from multimodal imperfect EHR data. These sources represent exogenous variables in the data generation process causal graph that estimate latent root causes of the presence of SLE in the health record. We objectively evaluated the sources against the original variables from which they were discovered by training supervised models to discriminate SLE from negative health records using a reduced set of labelled instances. We found 19 predictive sources with high clinical validity and whose EHR signatures define independent factors of SLE heterogeneity. Using the sources as the input patient data representation enables models to provide rich explanations that better capture the clinical reasons why a particular record is (not) an SLE case. Providers may be willing to trade patient-level interpretability for discrimination, especially in challenging cases.

Introduction: Systemic lupus erythematosus (SLE) is a complex relapsing disease that manifests through various combinations of symptoms and clinical signs. SLE's heterogeneity makes its recognition in the health record slow and subjective.


Phenome-wide causal proteomics enhance systemic lupus erythematosus flare prediction: A study in Asian populations

Chen, Liying, Deng, Ou, Fang, Ting, Chen, Mei, Zhang, Xvfeng, Cong, Ruichen, Lu, Dingqi, Zhang, Runrun, Jin, Qun, Wang, Xinchang

arXiv.org Artificial Intelligence

Objective: Systemic lupus erythematosus (SLE) is a complex autoimmune disease characterized by unpredictable flares. This study aimed to develop a novel proteomics-based risk prediction model specifically for Asian SLE populations to enhance personalized disease management and early intervention. Methods: A longitudinal cohort study was conducted over 48 weeks, including 139 SLE patients monitored every 12 weeks. Patients were classified into flare (n = 53) and non-flare (n = 86) groups. Baseline plasma samples underwent data-independent acquisition (DIA) proteomics analysis, and phenome-wide Mendelian randomization (PheWAS) was performed to evaluate causal relationships between proteins and clinical predictors. Logistic regression (LR) and random forest (RF) models were used to integrate proteomic and clinical data for flare risk prediction. Results: Five proteins (SAA1, B4GALT5, GIT2, NAA15, and RPIA) were significantly associated with SLE Disease Activity Index-2K (SLEDAI-2K) scores and 1-year flare risk, implicating key pathways such as B-cell receptor signaling and platelet degranulation. SAA1 demonstrated causal effects on flare-related clinical markers, including hemoglobin and red blood cell counts. A combined model integrating clinical and proteomic data achieved the highest predictive accuracy (AUC = 0.769), surpassing individual models. SAA1 was highlighted as a priority biomarker for rapid flare discrimination. Conclusion: The integration of proteomic and clinical data significantly improves flare prediction in Asian SLE patients. The identification of key proteins and their causal relationships with flare-related clinical markers provides valuable insights for proactive SLE management and personalized therapeutic approaches.


Simplicity Level Estimate (SLE): A Learned Reference-Less Metric for Sentence Simplification

Cripwell, Liam, Legrand, Joël, Gardent, Claire

arXiv.org Artificial Intelligence

Automatic evaluation for sentence simplification remains a challenging problem. Most popular evaluation metrics require multiple high-quality references -- something not readily available for simplification -- which makes it difficult to test performance on unseen domains. Furthermore, most existing metrics conflate simplicity with correlated attributes such as fluency or meaning preservation. We propose a new learned evaluation metric (SLE) which focuses on simplicity, outperforming almost all existing metrics in terms of correlation with human judgements.


Scalable and Adaptive Graph Neural Networks with Self-Label-Enhanced training

Sun, Chuxiong, Wu, Guoshi

arXiv.org Artificial Intelligence

It is hard to directly apply Graph Neural Networks (GNNs) to large-scale graphs. Besides existing neighbor sampling techniques, scalable methods that decouple graph convolutions and other learnable transformations into a preprocessing step and a post-classifier allow normal minibatch training. By replacing the redundant concatenation operation in SIGN with an attention mechanism, we propose Scalable and Adaptive Graph Neural Networks (SAGN). SAGN can adaptively gather neighborhood information across different hops. To further improve scalable models on semi-supervised learning tasks, we propose the Self-Label-Enhance (SLE) framework, which combines self-training and label propagation in depth. We augment the base model with a scalable node label module. Then we iteratively train models and enhance the training set over several stages. To generate the input of the node label module, we directly apply label propagation to one-hot encoded label vectors without inner random masking. We find empirically that label leakage is effectively alleviated after graph convolutions. The hard pseudo labels in the enhanced training set participate in label propagation together with the true labels. Experiments on both inductive and transductive datasets demonstrate that, compared with other sampling-based and sampling-free methods, SAGN achieves better or comparable results and SLE can further improve performance.
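The label propagation step described in this abstract can be sketched as repeated neighbor averaging of one-hot label vectors. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `propagate_labels`, the mean-over-neighbors normalization, and the toy path graph are all hypothetical.

```python
def propagate_labels(neighbors, train_labels, num_classes, hops=1):
    """Propagate one-hot training labels over a graph.

    neighbors:    list where neighbors[i] is the list of nodes adjacent to i
    train_labels: dict mapping a labelled node index to its class index
    Each hop replaces a node's vector with the mean of its neighbors' vectors
    (an assumed normalization; other schemes are possible).
    """
    n = len(neighbors)
    # Unlabelled nodes start as all-zero rows; labelled nodes are one-hot.
    current = [[0.0] * num_classes for _ in range(n)]
    for node, cls in train_labels.items():
        current[node][cls] = 1.0

    for _ in range(hops):
        updated = [[0.0] * num_classes for _ in range(n)]
        for i in range(n):
            if not neighbors[i]:
                continue  # isolated node keeps an all-zero vector
            for j in neighbors[i]:
                for c in range(num_classes):
                    updated[i][c] += current[j][c] / len(neighbors[i])
        current = updated
    return current


# Toy path graph 0 - 1 - 2 with the two end nodes labelled.
features = propagate_labels([[1], [0, 2], [1]], {0: 0, 2: 1}, num_classes=2)
print(features[1])
```

Because there are no self-loops in this sketch, a node's own label does not appear in its propagated vector after a single hop. That is only a simplified view of why leakage may be limited, and it should not be read as the paper's argument.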


Predictors of Treatment Response Among New Data on Stelara in SLE

#artificialintelligence

Higher expression of nine genes may help identify people with systemic lupus erythematosus (SLE) who will respond to treatment with Stelara (ustekinumab) -- an approved therapy in inflammatory disorders but not in SLE. At the 2019 American College of Rheumatology (ACR)/Association for Rheumatology Health Professionals (ARHP) Annual Meeting, being held in Atlanta Nov. 8-13, Janssen is presenting evidence of reduced SLE disease activity with Stelara, as well as a tool to predict benefits in clinical trials. Stelara works by blocking interleukin (IL)-12 and IL-23, two pro-inflammatory molecules. It is approved in the U.S. for the treatment of psoriasis and psoriatic arthritis, as well as Crohn's disease and ulcerative colitis, which are two forms of inflammatory bowel disease. Results from a Phase 2 trial (NCT02349061) showed that Stelara reduced SLE disease activity and severe flares, among other benefits, compared with a placebo.


Signed Laplacian Embedding for Supervised Dimension Reduction

Gong, Chen (Shanghai Jiao Tong University and University of Technology Sydney) | Tao, Dacheng (University of Technology Sydney) | Yang, Jie (Shanghai Jiao Tong University) | Fu, Keren (Shanghai Jiao Tong University)

AAAI Conferences

Manifold learning is a powerful tool for solving nonlinear dimension reduction problems. By assuming that the high-dimensional data usually lie on a low-dimensional manifold, many algorithms have been proposed. However, most algorithms simply adopt the traditional graph Laplacian to encode the data locality, so the discriminative ability is limited and the embedding results are not always suitable for the subsequent classification. Instead, this paper deploys the signed graph Laplacian and proposes Signed Laplacian Embedding (SLE) for supervised dimension reduction. By exploring the label information, SLE comprehensively transfers the discrimination carried by the original data to the embedded low-dimensional space. Without perturbing the discrimination structure, SLE also retains the locality. Theoretically, we prove the immersion property by computing the rank of projection, and relate SLE to existing algorithms in the frame of patch alignment. Thorough empirical studies on synthetic and real datasets demonstrate the effectiveness of SLE.
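For readers unfamiliar with the signed graph Laplacian mentioned above, the sketch below builds one from a signed weight matrix using the common definition L = D - W, where D is diagonal and holds the sum of absolute edge weights per node. The label-based weighting (+1 for same-label pairs, -1 for different-label pairs) and both function names are assumptions for illustration; the paper's actual graph construction may differ.

```python
def label_signed_weights(labels):
    """Hypothetical signed graph: +1 between same-label points, -1 otherwise."""
    n = len(labels)
    return [
        [0.0 if i == j else (1.0 if labels[i] == labels[j] else -1.0)
         for j in range(n)]
        for i in range(n)
    ]


def signed_laplacian(weights):
    """Return L = D - W, with D[i][i] = sum over j of |W[i][j]|."""
    n = len(weights)
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        degree = sum(abs(weights[i][j]) for j in range(n) if j != i)
        for j in range(n):
            laplacian[i][j] = degree if i == j else -weights[i][j]
    return laplacian


# Three points: the first two share a label, the third does not.
L = signed_laplacian(label_signed_weights([0, 0, 1]))
for row in L:
    print(row)
```

Using absolute values in the degree keeps the matrix positive semidefinite even with negative edges, which is why this form is preferred over the ordinary Laplacian when edges carry signs.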