pht
Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery
In machine learning and compressed sensing, it is of central importance to understand when a tractable algorithm recovers the support of a sparse signal from its compressed measurements. In this paper, we present a principled analysis on the support recovery performance for a family of hard thresholding algorithms. To this end, we appeal to the partial hard thresholding (PHT) operator proposed recently by Jain et al. [IEEE Trans.
- North America > United States > New Jersey (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
Federated Timeline Synthesis: Scalable and Private Methodology For Model Training and Deployment
Renc, Pawel, Grzeszczyk, Michal K., Qian, Linglong, Oufattole, Nassim, Rasley, Jeff, Sitek, Arkadiusz
We present Federated Timeline Synthesis (FTS), a novel framework for training generative foundation models across distributed timeseries data applied to electronic health records (EHR). At its core, FTS represents patient history as tokenized Patient Health Timelines (PHTs), language-agnostic sequences encoding temporal, categorical, and continuous clinical information. Each institution trains an autoregressive transformer on its local PHTs and transmits only model weights to a central server. The server uses the generators to synthesize a large corpus of trajectories and train a Global Generator (GG), enabling zero-shot inference via Monte Carlo simulation of future PHTs. We evaluate FTS on five clinically meaningful prediction tasks using MIMIC-IV data, showing that models trained on synthetic data generated by GG perform comparably to those trained on real data. FTS offers strong privacy guarantees, scalability across institutions, and extensibility to diverse prediction and simulation tasks especially in healthcare, including counterfactual inference, early warning detection, and synthetic trial design.
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
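The abstract above mentions zero-shot inference via Monte Carlo simulation of future PHTs. A minimal sketch of that idea follows, assuming a generator that can sample a future token sequence given a patient history. The names `estimate_event_probability`, `toy_generator`, and the token vocabulary are hypothetical and for illustration only; they are not taken from the FTS paper, and the toy generator samples tokens uniformly at random rather than from a trained transformer.

```python
import random


def estimate_event_probability(sample_future, history, event_token,
                               n_samples=1000, seed=0):
    """Monte Carlo estimate of the chance that event_token appears in a
    simulated future timeline.

    sample_future(history, rng) is an assumed interface: it returns one
    sampled list of future tokens given the observed history.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        future = sample_future(history, rng)
        if event_token in future:
            hits += 1
    # Fraction of simulated futures that contain the event.
    return hits / n_samples


def toy_generator(history, rng, horizon=5):
    """Hypothetical stand-in for a trained generator: draws tokens
    uniformly from a tiny made-up vocabulary and ignores the history."""
    vocab = ["LAB_TEST", "MEDICATION", "DISCHARGE", "READMISSION"]
    return [rng.choice(vocab) for _ in range(horizon)]


if __name__ == "__main__":
    history = ["ADMISSION", "LAB_TEST"]
    p = estimate_event_probability(toy_generator, history, "READMISSION")
    print(f"Estimated probability of READMISSION: {p:.3f}")
```

In this sketch the prediction task is defined only by which token is counted, so the same sampled timelines could in principle answer several questions without task-specific training, which is the sense of "zero-shot" assumed here.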
Time to Retrain? Detecting Concept Drifts in Machine Learning Systems
Pham, Tri Minh Triet, Premkumar, Karthikeyan, Naili, Mohamed, Yang, Jinqiu
With the boom of machine learning (ML) techniques, software practitioners build ML systems to process the massive volume of streaming data for diverse software engineering tasks such as failure prediction in AIOps. Trained using historical data, such ML models encounter performance degradation caused by concept drift, i.e., data and inter-relationship (concept) changes between training and production. It is essential to use concept drift detection to monitor the deployed ML models and re-train the ML models when needed. In this work, we explore applying state-of-the-art (SOTA) concept drift detection techniques on synthetic and real-world datasets in an industrial setting. Such an industrial setting requires minimal manual effort in labeling and maximal generality in ML model architecture. We find that current SOTA semi-supervised methods not only require significant labeling effort but also only work for certain types of ML models. To overcome such limitations, we propose a novel model-agnostic technique (CDSeer) for detecting concept drift. Our evaluation shows that CDSeer has better precision and recall compared to the state-of-the-art while requiring significantly less manual labeling. We demonstrate the effectiveness of CDSeer at concept drift detection by evaluating it on eight datasets from different domains and use cases. Results from internal deployment of CDSeer on an industrial proprietary dataset show a 57.1% improvement in precision while using 99% fewer labels compared to the SOTA concept drift detection method. The performance is also comparable to the supervised concept drift detection method, which requires 100% of the data to be labeled. The improved performance and ease of adoption of CDSeer are valuable in making ML systems more reliable.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- Information Technology (1.00)
- Telecommunications (0.67)
Zero Shot Health Trajectory Prediction Using Transformer
Renc, Pawel, Jia, Yugang, Samir, Anthony E., Was, Jaroslaw, Li, Quanzheng, Bates, David W., Sitek, Arkadiusz
Integrating modern machine learning and clinical decision-making has great promise for mitigating healthcare's increasing cost and complexity. We introduce the Enhanced Transformer for Health Outcome Simulation (ETHOS), a novel application of the transformer deep-learning architecture for analyzing high-dimensional, heterogeneous, and episodic health data. ETHOS is trained using Patient Health Timelines (PHTs), which are detailed, tokenized records of health events, to predict future health trajectories, leveraging a zero-shot learning approach. ETHOS represents a significant advancement in foundation model development for healthcare analytics, eliminating the need for labeled data and model fine-tuning. Its ability to simulate various treatment pathways and consider patient-specific factors positions ETHOS as a tool for care optimization and addressing biases in healthcare delivery. Future developments will expand ETHOS' capabilities to incorporate a wider range of data types and data sources. Our work demonstrates a pathway toward accelerated AI development and deployment in healthcare.
- Oceania > New Zealand (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.94)
- Overview > Innovation (0.66)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Health Care Technology > Medical Record (0.93)
- Health & Medicine > Therapeutic Area > Endocrinology (0.92)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)
M2pht: Mixed Models with Preferences and Hybrid Transitions for Next-Basket Recommendation
Peng, Bo, Ren, Zhiyun, Parthasarathy, Srinivasan, Ning, Xia
Next-basket recommendation considers the problem of recommending a set of items into the next basket that users will purchase as a whole. In this paper, we develop a new mixed model with preferences and hybrid transitions for the next-basket recommendation problem. This method explicitly models three important factors: 1) users' general preferences; 2) transition patterns among items and 3) transition patterns among baskets. We compared this method with 5 state-of-the-art next-basket recommendation methods on 4 public benchmark datasets. Our experimental results demonstrate that our method significantly outperforms the state-of-the-art methods on all the datasets. We also conducted a comprehensive ablation study to verify the effectiveness of the different factors.
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery
In machine learning and compressed sensing, it is of central importance to understand when a tractable algorithm recovers the support of a sparse signal from its compressed measurements. In this paper, we present a principled analysis on the support recovery performance for a family of hard thresholding algorithms. To this end, we appeal to the partial hard thresholding (PHT) operator proposed recently by Jain et al. [IEEE Trans. Information Theory, 2017]. We show that under proper conditions, PHT recovers an arbitrary s-sparse signal within O(s κ log κ) iterations, where κ is an appropriate condition number. Specifying the PHT operator, we obtain the best known result for hard thresholding pursuit and orthogonal matching pursuit with replacement. Experiments on the simulated data complement our theoretical findings and also illustrate the effectiveness of PHT compared to other popular recovery methods.
- North America > United States > New Jersey (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
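For readers unfamiliar with the family of hard thresholding algorithms named in the abstract above, the sketch below shows the plain hard thresholding operator (keep the s largest-magnitude entries) and a basic iterative hard thresholding loop. This is not the partial hard thresholding operator of Jain et al., whose definition is not given in this listing; it is a simpler, well-known relative, written in pure Python. The function names, step size, and iteration count are illustrative assumptions.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    if s <= 0:
        return [0.0] * len(x)
    # Indices ordered by decreasing magnitude; ties go to the lower index.
    order = sorted(range(len(x)), key=lambda i: (-abs(x[i]), i))
    keep = set(order[:s])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def support(x):
    """Set of indices where x is nonzero."""
    return {i for i, v in enumerate(x) if v != 0.0}


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def iterative_hard_thresholding(A, y, s, step=1.0, iters=50):
    """Basic iterative hard thresholding (assumed illustrative variant):
    x <- H_s(x + step * A^T (y - A x)), starting from x = 0."""
    n = len(A[0])
    At = transpose(A)
    x = [0.0] * n
    for _ in range(iters):
        residual = [yi - ri for yi, ri in zip(y, matvec(A, x))]
        correction = matvec(At, residual)
        x = hard_threshold(
            [xi + step * ci for xi, ci in zip(x, correction)], s
        )
    return x


if __name__ == "__main__":
    # Trivial sanity case: identity measurements of a 2-sparse signal.
    A = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    y = [0.0, 3.0, 0.0, -2.0]
    x_hat = iterative_hard_thresholding(A, y, s=2)
    print(sorted(support(x_hat)))
```

Support recovery in the sense of the abstract asks whether `support(x_hat)` equals the support of the true signal; the identity-matrix case here is only a sanity check and says nothing about the conditions or iteration bounds analyzed in the paper.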