AITopics | uzuner

uzuner

LongBoX: Evaluating Transformers on Long-Sequence Clinical Tasks

Parmar, Mihir, Naik, Aakanksha, Gupta, Himanshu, Agrawal, Disha, Baral, Chitta

arXiv.org Artificial Intelligence, Nov-15-2023

Many large language models (LLMs) for medicine have largely been evaluated on short texts, and their ability to handle longer sequences such as a complete electronic health record (EHR) has not been systematically explored. Assessing these models on long sequences is crucial since prior work in the general domain has demonstrated performance degradation of LLMs on longer texts. Motivated by this, we introduce LongBoX, a collection of seven medical datasets in text-to-text format, designed to investigate model performance on long sequences. Preliminary experiments reveal that both medical LLMs (e.g., BioGPT) and strong general domain LLMs (e.g., FLAN-T5) struggle on this benchmark. We further evaluate two techniques designed for long-sequence handling: (i) local-global attention, and (ii) Fusion-in-Decoder (FiD). Our results for long-sequence handling are mixed: while scores on some datasets increase, there is substantial room for improvement. We hope that LongBoX facilitates the development of more effective long-sequence techniques for the medical domain. Data and source code are available at https://github.com/Mihir3009/LongBoX.
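
As a rough illustration of why long records need special handling, the sketch below splits a token sequence into fixed-size, optionally overlapping windows so that each window could be encoded on its own, which is the general idea behind passage-wise approaches such as FiD. It is not taken from the LongBoX code; the function name `split_into_chunks` and its parameters are assumptions made for this example.

```python
from typing import List


def split_into_chunks(tokens: List[str], chunk_size: int, overlap: int = 0) -> List[List[str]]:
    """Split a token list into windows of at most `chunk_size` tokens.

    Hypothetical helper for illustration only. Consecutive windows share
    `overlap` tokens so that context at the boundaries is not lost.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the sequence.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    record = "patient admitted with chest pain and discharged after two days".split()
    for window in split_into_chunks(record, chunk_size=4, overlap=1):
        print(window)
```

In a passage-wise setup, each window would be passed to the encoder separately and the decoder would attend over all encoded windows together; that step depends on the model library and is left out here.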

dataset, language model, sequence, (11 more...)

arXiv.org Artificial Intelligence

2311.09564

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Machine Learning Application in Health

Alshabana, Ghadah, Sadati, Marjn, Tran, Thao, Thompson, Michael, Chitimalla, Ashritha

arXiv.org Artificial Intelligence, Jun-9-2022

Coronavirus can be transmitted through the air by close proximity to infected persons. Commercial aircraft are a likely way to both transmit the virus among passengers and move the virus between locations. Learning where and how coronavirus has entered the United States will help further our understanding of the disease. Air travelers can come from countries or areas with a high rate of infection and may very well be at risk of exposure to the virus. Therefore, as they reach the United States, the virus could easily spread. In our analysis, we used machine learning to determine whether the number of flights into the Washington DC Metro Area had an effect on the number of cases and deaths reported in the city and surrounding area.

correlation, flight, heidari, (15 more...)

arXiv.org Artificial Intelligence

2207.06228

Country:

North America > United States > District of Columbia > Washington (0.36)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia > Loudoun County (0.05)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Automatic end-to-end De-identification: Is high accuracy the only metric?

Yogarajan, Vithya, Pfahringer, Bernhard, Mayo, Michael

arXiv.org Machine Learning, Jan-27-2019

De-identification of electronic health records (EHR) is a vital step towards advancing health informatics research and maximising the use of available data. It is a two-step process where step one is the identification of protected health information (PHI), and step two is replacing such PHI with surrogates. Despite the recent advances in automatic de-identification of EHR, significant obstacles remain if the abundant health data available are to be used to their full potential. Accuracy in de-identification could be considered a necessary, but not sufficient, condition for the use of EHR without individual patient consent. We present here a comprehensive review of the progress to date, covering both the impressive successes in achieving high accuracy and the significant risks and challenges that remain. To the best of our knowledge, this is the first paper to present a complete picture of end-to-end automatic de-identification. We review 18 recently published automatic de-identification systems, designed to de-identify EHR in the form of free text, to show the advancements made in improving the overall accuracy of the system and in identifying individual PHI. We argue that despite the improvements in accuracy there remain challenges in surrogate generation and replacement of identified PHI, as well as risks to patient protection and privacy.
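
The two-step process described in this abstract can be sketched as follows. This is a minimal rule-based toy, not one of the reviewed systems: the three regular expressions, the fixed surrogate values, and the function names `identify_phi` and `replace_phi` are all assumptions chosen for illustration, and real systems cover many more PHI categories and generate more realistic surrogates.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical, deliberately tiny rule set (assumption for this example).
PHI_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN\s*\d+\b"),
}

# Fixed placeholder surrogates (assumption); one value per PHI category.
SURROGATES: Dict[str, str] = {
    "DATE": "01/01/2000",
    "PHONE": "555-000-0000",
    "MRN": "MRN 0000000",
}

Span = Tuple[int, int, str]  # (start offset, end offset, PHI label)


def identify_phi(text: str) -> List[Span]:
    """Step one: locate PHI spans, returned in order of appearance."""
    spans: List[Span] = []
    for label, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def replace_phi(text: str, spans: List[Span]) -> str:
    """Step two: swap each span for a surrogate.

    Works right to left so earlier offsets stay valid. Assumes the spans
    do not overlap, which holds for the toy patterns above.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + SURROGATES[label] + text[end:]
    return text


if __name__ == "__main__":
    note = "Seen on 03/14/2015, MRN 483920. Call 617-555-0198."
    print(replace_phi(note, identify_phi(note)))
```

Keeping the two steps as separate functions mirrors the abstract's point that high accuracy in step one does not settle the quality of step two: the surrogate table here is trivially simple, and that is where the remaining risks the authors discuss would arise.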

de-identification system, f-measure, uzuner, (14 more...)

arXiv.org Machine Learning

1901.10583

Country:

North America > United States (0.68)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

A survey of automatic de-identification of longitudinal clinical narratives

Yogarajan, Vithya, Mayo, Michael, Pfahringer, Bernhard

arXiv.org Artificial Intelligence, Oct-15-2018

Use of medical data, also known as electronic health records, in research helps develop and advance medical science. However, protecting patient confidentiality and identity while using medical data for analysis is crucial. Medical data can be in the form of tabular structures (i.e., tables), free-form narratives, and images. This study focuses on medical data in the form of free-form longitudinal text. De-identification of electronic health records provides the opportunity to use such data for research without affecting patient privacy, and avoids the need for individual patient consent. In recent years there has been increasing interest in developing an accurate, robust and adaptable automatic de-identification system for electronic health records. This is mainly due to the dilemma between the availability of an abundance of health data and the inability to use such data in research due to legal and ethical restrictions. De-identification tracks in competitions such as the 2014 i2b2/UTHealth and the 2016 CEGS N-GRID shared tasks have provided a great platform to advance this area. The primary reasons for this include the open source nature of the dataset and the fact that raw psychiatric data were used for the 2016 competition. This study focuses on notable changes in the techniques used in the development of automatic de-identification for longitudinal clinical narratives. More specifically, it examines the shift from systems based only on conditional random fields (CRF) or only on rules (regular expressions, dictionaries, or combinations), to hybrid models (combining CRF and rules), and more recently to deep learning based systems. We review the literature and results that arose from the 2014 and the 2016 competitions and discuss the outcomes of these systems. We also provide a list of research questions that emerged from this survey.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1810.06765

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
