Clinical Informatics
A Consensus Privacy Metrics Framework for Synthetic Data
Pilgram, Lisa, Dankar, Fida K., Drechsler, Jorg, Elliot, Mark, Domingo-Ferrer, Josep, Francis, Paul, Kantarcioglu, Murat, Kong, Linglong, Malin, Bradley, Muralidhar, Krishnamurty, Myles, Puja, Prasser, Fabian, Raisaro, Jean Louis, Yan, Chao, El Emam, Khaled
Synthetic data generation is one approach for sharing individual-level data. However, to meet legislative requirements, it is necessary to demonstrate that the individuals' privacy is adequately protected. There is no consolidated standard for measuring privacy in synthetic data. Through an expert panel and consensus process, we developed a framework for evaluating privacy in synthetic data. Our findings indicate that current similarity metrics fail to measure identity disclosure, and their use is discouraged. For differentially private synthetic data, a privacy budget other than close to zero was not considered interpretable. There was consensus on the importance of membership and attribute disclosure, both of which involve inferring personal information about an individual without necessarily revealing their identity. The resultant framework provides precise recommendations for metrics that address these types of disclosures effectively. Our findings further present specific opportunities for future research that can help with widespread adoption of synthetic data.
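The membership disclosure the panel emphasises can be illustrated with a toy distance-based attack: if a generator leaks, training records sit unusually close to the synthetic records compared with holdout records. The sketch below is illustrative only, not one of the framework's recommended metrics; the function name, threshold rule, and data are all assumptions.

```python
import numpy as np

def membership_inference_accuracy(members, non_members, synthetic):
    """Toy membership disclosure test: records whose nearest synthetic
    neighbour is unusually close are guessed to be training members.
    Returns attack accuracy (0.5 means no measurable leakage)."""
    def nn_dist(records):
        # Euclidean distance from each record to its closest synthetic record
        diffs = records[:, None, :] - synthetic[None, :, :]
        return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

    d_mem, d_non = nn_dist(members), nn_dist(non_members)
    # attacker labels "member" when the distance falls below the pooled median
    threshold = np.median(np.concatenate([d_mem, d_non]))
    correct = (d_mem < threshold).sum() + (d_non >= threshold).sum()
    return correct / (len(members) + len(non_members))

rng = np.random.default_rng(0)
members = rng.normal(0, 1, (200, 5))
synthetic = members + rng.normal(0, 0.05, members.shape)  # deliberately leaky generator
non_members = rng.normal(0, 1, (200, 5))
print(membership_inference_accuracy(members, non_members, synthetic))
```

Because the hypothetical generator here is nearly a copy of its training set, the attack accuracy comes out far above 0.5, which is exactly the signal a membership disclosure metric is meant to surface.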
How AI could supercharge your glucose monitor - and catch other health issues
Researchers at Stanford have been using artificial intelligence (AI) to dive deeper into diabetes diagnosis -- and the results could mean better, more accessible care. We commonly understand diabetes as being either Type 1 or Type 2. But in recent years, scientists have discovered important variations, or subtypes, within Type 2 -- which makes up 95% of diagnoses -- that shed light on the risk of developing related conditions, such as kidney, heart, or liver problems. "Understanding the physiology behind [diabetes] requires metabolic tests done in a research setting, but the tests are cumbersome and expensive and not practical for use in the clinic," explained Tracey McLaughlin, MD, an endocrinology professor at Stanford. Using data collected by glucose monitors, researchers developed an algorithm identifying three of the four most common subtypes of Type 2 diabetes. Compared to clinical data, the algorithm "predicted metabolic subtypes, such as insulin resistance and beta-cell deficiency, with greater accuracy than the traditional metabolic tests" -- roughly 90% of the time.
A Global Cybersecurity Standardization Framework for Healthcare Informatics
Gupta, Kishu, Mishra, Vinaytosh, Makkar, Aaisha
Healthcare has witnessed increased digitalization in the post-COVID world. Technologies such as the medical internet of things and wearable devices are generating a plethora of data available on the cloud anytime from anywhere. This data can be analyzed using advanced artificial intelligence techniques for diagnosis, prognosis, or even treatment of disease. This advancement comes with a major risk to protecting and securing protected health information (PHI). The prevailing regulations for preserving PHI are neither comprehensive nor easy to implement. The study first identifies twenty activities crucial for privacy and security, then categorizes them into five homogeneous categories, namely $C_1$ (Policy and Compliance Management), $C_2$ (Employee Training and Awareness), $C_3$ (Data Protection and Privacy Control), $C_4$ (Monitoring and Response), and $C_5$ (Technology and Infrastructure Security), and prioritizes these categories to provide a framework for a staged implementation of privacy and security. The framework utilized the Delphi method to identify the activities and the criteria for categorization and prioritization. Categorization is based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and prioritization is performed using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). The outcomes indicate that $C_3$ activities should be given first preference in implementation, followed by $C_1$ and $C_2$ activities, and finally $C_4$ and $C_5$. This prioritized view of the clustered healthcare security and privacy activities is useful for healthcare policymakers and healthcare informatics professionals.
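As a rough illustration of the TOPSIS prioritization step described above, a minimal implementation ranks alternatives by their closeness to an ideal solution. The category scores and criterion weights below are hypothetical placeholders, not the study's Delphi-derived data.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Minimal TOPSIS: rows = alternatives (e.g. activity categories C1..C5),
    columns = criteria. benefit[j] is True when higher values are better."""
    m = np.asarray(matrix, dtype=float)
    # vector-normalise each criterion column, then apply the weights
    v = m / np.linalg.norm(m, axis=0) * np.asarray(weights)
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))   # best on each criterion
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))    # worst on each criterion
    d_best = np.linalg.norm(v - ideal, axis=1)
    d_worst = np.linalg.norm(v - anti, axis=1)
    closeness = d_worst / (d_best + d_worst)                  # 1.0 = ideal alternative
    return closeness, np.argsort(-closeness)                  # scores, rank order

# hypothetical scores for C1..C5 on three benefit criteria
scores = [[7, 6, 5], [6, 7, 4], [9, 8, 8], [5, 6, 6], [6, 5, 7]]
closeness, order = topsis(scores, weights=[0.4, 0.3, 0.3],
                          benefit=[True, True, True])
```

With these made-up scores, the third row ($C_3$) dominates every criterion and so ranks first, mirroring the study's conclusion in shape though not in substance.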
Implementing a Nordic-Baltic Federated Health Data Network: a case report
Chomutare, Taridzo, Babic, Aleksandar, Peltonen, Laura-Maria, Elunurm, Silja, Lundberg, Peter, Jönsson, Arne, Eneling, Emma, Gerstenberger, Ciprian-Virgil, Siggaard, Troels, Kolde, Raivo, Jerdhaf, Oskar, Hansson, Martin, Makhlysheva, Alexandra, Muzny, Miroslav, Ylipää, Erik, Brunak, Søren, Dalianis, Hercules
Background: Centralized collection and processing of healthcare data across national borders pose significant challenges, including privacy concerns, data heterogeneity and legal barriers. To address some of these challenges, we formed an interdisciplinary consortium to develop a federated health data network, comprised of six institutions across five countries, to facilitate Nordic-Baltic cooperation on secondary use of health data. The objective of this report is to offer early insights into our experiences developing this network. Methods: We used a mixed-method approach, combining both experimental design and implementation science to evaluate the factors affecting the implementation of our network. Results: Technically, our experiments indicate that the network functions without significant performance degradation compared to centralized simulation. Conclusion: While the use of interdisciplinary approaches holds potential to solve challenges associated with establishing such collaborative networks, our findings turn the spotlight on an uncertain regulatory landscape playing catch-up and the significant operational costs.
Enhancing Antibiotic Stewardship using a Natural Language Approach for Better Feature Representation
Lee, Simon A., Brokowski, Trevor, Chiang, Jeffrey N.
The rapid emergence of antibiotic-resistant bacteria is recognized as a global healthcare crisis, undermining the efficacy of life-saving antibiotics. This crisis is driven by the improper use and overuse of antibiotics, which escalates bacterial resistance. In response, this study explores the use of clinical decision support systems, enhanced through the integration of electronic health records (EHRs), to improve antibiotic stewardship. However, EHR systems present numerous data-level challenges, complicating the effective synthesis and utilization of data. In this work, we transform EHR data into a serialized textual representation and employ pretrained foundation models to demonstrate how this enhanced feature representation can aid in antibiotic susceptibility predictions. Our results suggest that this text representation, combined with foundation models, provides a valuable tool to increase interpretability and support antibiotic stewardship efforts.
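A minimal sketch of the serialization idea: flatten a structured EHR row into readable text that a pretrained language model can consume. The field names and record below are hypothetical; the paper's actual schema and prompt format are not shown here.

```python
def serialize_record(record: dict) -> str:
    """Flatten an EHR dictionary into a sentence-like string suitable
    as input to a pretrained language model."""
    parts = []
    for field, value in record.items():
        if isinstance(value, list):
            value = ", ".join(map(str, value))
        # turn snake_case field names into plain words
        parts.append(f"{field.replace('_', ' ')}: {value}")
    return ". ".join(parts) + "."

# hypothetical microbiology-style record
row = {
    "age": 67,
    "specimen_source": "urine",
    "organism": "Escherichia coli",
    "prior_antibiotics": ["ciprofloxacin", "amoxicillin"],
}
print(serialize_record(row))
# → age: 67. specimen source: urine. organism: Escherichia coli. prior antibiotics: ciprofloxacin, amoxicillin.
```

The resulting string can then be embedded by any pretrained text encoder, sidestepping the sparse, inconsistently coded tabular form of raw EHR data.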
Effects of Added Emphasis and Pause in Audio Delivery of Health Information
Ahmed, Arif, Leroy, Gondy, Rains, Stephen A., Harber, Philip, Kauchak, David, Barai, Prosanta
Health literacy is crucial to supporting good health and is a major national goal. Audio delivery is an increasingly popular way of consuming health information. In this study, we evaluate the effect of audio enhancements in the form of information emphasis and pauses with health texts of varying difficulty, and we measure health information comprehension and retention. We produced audio snippets from difficult and easy text and conducted the study on Amazon Mechanical Turk (AMT). Our findings suggest that emphasis matters for both information comprehension and retention. When there is no added pause, emphasizing significant information can lower the perceived difficulty of both difficult and easy texts. Comprehension is higher (54%) with correctly placed emphasis for the difficult texts compared to not adding emphasis (50%). Adding a pause lowers perceived difficulty and can improve retention, but adversely affects information comprehension.
A primer on synthetic health data
Bartell, Jennifer Anne, Valentin, Sander Boisen, Krogh, Anders, Langberg, Henning, Bøgsted, Martin
Recent advances in deep generative models have greatly expanded the potential to create realistic synthetic health datasets. These synthetic datasets aim to preserve the characteristics, patterns, and overall scientific conclusions derived from sensitive health datasets without disclosing patient identity or sensitive information. Thus, synthetic data can facilitate safe data sharing that supports a range of initiatives including the development of new predictive models, advanced health IT platforms, and general project ideation and hypothesis development. However, many questions and challenges remain, including how to consistently evaluate a synthetic dataset's similarity and predictive utility in comparison to the original real dataset and risk to privacy when shared. Additional regulatory and governance issues have not been widely addressed. In this primer, we map the state of synthetic health data, including generation and evaluation methods and tools, existing examples of deployment, the regulatory and ethical landscape, access and governance options, and opportunities for further development.
Optimal service resource management strategy for IoT-based health information system considering value co-creation of users
Fang, Ji, Lee, Vincent CS, Wang, Haiyan
This paper explores an optimal service resource management strategy, a continuing challenge for health information services seeking to enhance service performance, optimise service resource utilisation and deliver interactive health information services. An adaptive optimal service resource management strategy was developed based on a value co-creation model for health information services, with a focus on collaboration and interaction with users. A deep reinforcement learning algorithm was embedded in the Internet of Things (IoT)-based health information service system (I-HISS) to allocate service resources by controlling service provision and service adaptation based on user engagement behaviour. Simulation experiments were conducted to evaluate the significance of the proposed algorithm under different user reactions to the health information service.
Medical records condensation: a roadmap towards healthcare data democratisation
Wang, Yujiang, Thakur, Anshul, Dong, Mingzhi, Ma, Pingchuan, Petridis, Stavros, Shang, Li, Zhu, Tingting, Clifton, David A.
The prevalence of artificial intelligence (AI) has envisioned an era of healthcare democratisation that promises every stakeholder a new and better way of life. However, the advancement of clinical AI research is significantly hindered by the dearth of data democratisation in healthcare. To truly democratise data for AI studies, the challenges are two-fold: 1. the sensitive information in clinical data should be anonymised appropriately, and 2. AI-oriented clinical knowledge should flow freely across organisations. This paper considers a recent deep-learning advance, dataset condensation (DC), as a way to kill both birds with one stone in democratising healthcare data. The condensed data after DC, which can be viewed as statistical metadata, abstracts original clinical records and irreversibly conceals sensitive information at the individual level; nevertheless, it still preserves adequate knowledge for learning deep neural networks (DNNs). More favourably, the compressed volume and accelerated model training of condensed data make for a more efficient clinical knowledge sharing and flowing system, as necessitated by data democratisation. We underline DC's prospects for democratising clinical data, specifically electronic health records (EHRs), for AI research through experimental results and analysis across three healthcare datasets of varying data types.
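For intuition, here is a heavily simplified, distribution-matching-style condensation toy: it learns a handful of synthetic points per class whose mean feature vector matches the real class mean. This is not the paper's DC algorithm (which typically matches training gradients or embeddings of deep networks); all data and parameter choices below are assumptions.

```python
import numpy as np

def condense_distribution_matching(X, y, per_class=10, steps=500, lr=0.05):
    """Learn per_class synthetic points per label whose mean matches the
    real class mean -- a bare-bones stand-in for dataset condensation."""
    rng = np.random.default_rng(0)
    Xs, ys = [], []
    for c in np.unique(y):
        target = X[y == c].mean(axis=0)
        synth = rng.normal(0.0, 1.0, (per_class, X.shape[1]))
        for _ in range(steps):
            # gradient-style step (up to a constant factor) pulling the
            # synthetic class mean toward the real class mean
            synth -= lr * (synth.mean(axis=0) - target)
        Xs.append(synth)
        ys.extend([c] * per_class)
    return np.vstack(Xs), np.array(ys)

# two hypothetical "patient" clusters standing in for EHR feature vectors
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (500, 8)), rng.normal(3, 1, (500, 8))])
y = np.array([0] * 500 + [1] * 500)
Xc, yc = condense_distribution_matching(X, y)   # 1000 records -> 20
```

The 20 condensed points summarise class-level statistics of the 1000 originals without being copies of any individual record, which is the intuition behind DC's privacy argument, although real DC objectives are considerably richer than a mean match.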
Data-Centric Foundation Models in Computational Healthcare: A Survey
Zhang, Yunkun, Gao, Jin, Tan, Zheling, Zhou, Lingfeng, Ding, Kexin, Zhou, Mu, Zhang, Shaoting, Wang, Dequan
In computational healthcare [3, 72], FMs can handle a variety of clinical data thanks to their appealing capabilities in logical reasoning and semantic understanding. Examples span fields in medical conversation [241, 316], patient health profiling [48], and treatment planning [192]. Moreover, given their strength in large-scale data processing, FMs offer a paradigm shift for assessing real-world clinical data in the healthcare workflow rapidly and effectively [208, 261]. FM research places a sharp focus on the data-centric perspective [318]. First, FMs demonstrate the power of scale: enlarged model and data sizes permit FMs to capture vast amounts of information, intensifying the need for large quantities of training data [272]. Second, FMs encourage homogenization [21], as evidenced by their extensive adaptability to downstream tasks. High-quality data for FM training thus becomes critical, since it can impact the performance of both the pre-trained FM and downstream models. Therefore, addressing key data challenges is increasingly recognized as a research priority.