AITopics | Bozkurt, Selen

HILGEN: Hierarchically-Informed Data Generation for Biomedical NER Using Knowledgebases and Large Language Models

Ge, Yao, Guo, Yuting, Das, Sudeshna, Rajwal, Swati, Bozkurt, Selen, Sarker, Abeed

arXiv.org Artificial Intelligence, Mar-6-2025

We present HILGEN, a Hierarchically-Informed Data Generation approach that combines domain knowledge from the Unified Medical Language System (UMLS) with synthetic data generated by large language models (LLMs), specifically GPT-3.5. Our approach leverages UMLS's hierarchical structure to expand training data with related concepts, while incorporating contextual information from LLMs through targeted prompts aimed at automatically generating synthetic examples for sparsely occurring named entities. The performance of the HILGEN approach was evaluated across four biomedical NER datasets (MIMIC III, BC5CDR, NCBI-Disease, and Med-Mentions) using BERT-Large and DANN (Data Augmentation with Nearest Neighbor Classifier) models, applying various data generation strategies, including UMLS, GPT-3.5, and their best ensemble. For the BERT-Large model, incorporating UMLS led to an average F1 score improvement of 40.36%, while using GPT-3.5 resulted in a comparable average increase of 40.52%. The Best-Ensemble approach using BERT-Large achieved the highest improvement, with an average increase of 42.29%. The DANN model's F1 score improved by 22.74% on average using the UMLS-only approach. The GPT-3.5-based method resulted in a 21.53% increase, and the Best-Ensemble DANN model showed a more notable improvement, with an average increase of 25.03%. Our proposed HILGEN approach improves NER performance in few-shot settings without requiring additional manually annotated data. Our experiments demonstrate that an effective strategy for optimizing biomedical NER is to combine previously curated biomedical knowledge, such as the UMLS, with generative LLMs to create synthetic training instances. Our future research will focus on exploring additional synthetic data generation strategies for further improving NER performance.
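The hierarchy-based expansion described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hierarchy, the `siblings` and `expand` helpers, and the restriction to single-token entities are all assumptions made for illustration, and no real UMLS content or API is used.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy: parent concept -> child concepts.
# This is illustrative data, not real UMLS content.
TOY_HIERARCHY: Dict[str, List[str]] = {
    "analgesic": ["ibuprofen", "naproxen", "acetaminophen"],
}


def siblings(term: str, hierarchy: Dict[str, List[str]]) -> List[str]:
    """Return concepts that share a parent with `term` (excluding `term`)."""
    for children in hierarchy.values():
        if term in children:
            return [c for c in children if c != term]
    return []


def expand(
    tokens: List[str], labels: List[str], hierarchy: Dict[str, List[str]]
) -> List[Tuple[List[str], List[str]]]:
    """Create new labeled sentences by swapping each single-token entity
    for its sibling concepts. Labels are copied unchanged because the
    replacement keeps the same entity type and token count."""
    augmented = []
    for i, (token, label) in enumerate(zip(tokens, labels)):
        if label == "O":
            continue  # only annotated entity tokens are replaced
        for sibling in siblings(token.lower(), hierarchy):
            new_tokens = tokens[:i] + [sibling] + tokens[i + 1:]
            augmented.append((new_tokens, list(labels)))
    return augmented


tokens = ["Patient", "took", "ibuprofen", "daily"]
labels = ["O", "O", "B-Chemical", "O"]
examples = expand(tokens, labels, TOY_HIERARCHY)
```

Here one annotated sentence yields two additional training sentences, one per sibling concept. A fuller version would also need to handle multi-token entities and their BIO label spans.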

large language model, machine learning, natural language, (18 more...)

2503.0493

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cerebral microbleeds: Association with cognitive decline and pathology build-up

Rathore, Saima, Chaudhary, Jatin, Tong, Boning, Bozkurt, Selen

arXiv.org Artificial Intelligence, Sep-30-2024

Cerebral microbleeds, markers of brain damage from vascular and amyloid pathologies, are linked to cognitive decline in aging, but their role in Alzheimer's disease (AD) onset and progression remains unclear. This study aimed to explore whether the presence and location of lobar microbleeds are associated with amyloid-$\beta$ (A$\beta$)-PET, tau tangle formation (tau-PET), and longitudinal cognitive decline. We analyzed 1,573 ADNI participants with MR imaging data and information on the number and location of microbleeds. Associations between lobar microbleeds and pathology, cerebrospinal fluid (CSF), genetics, and cognition were examined, focusing on regional microbleeds and domain-specific cognitive decline using ordinary least-squares regression while adjusting for covariates. Cognitive decline was assessed with ADAS-Cog11 and its domain-specific sub-scores. Participants underwent neuropsychological testing at least twice, with a minimum two-year interval between assessments. Among the 1,573 participants (692 women, mean age 71.23 years), 373 participants had microbleeds. The presence of microbleeds was linked to cognitive decline, particularly in the semantic, language, and praxis domains for those with temporal lobe microbleeds. Microbleeds in the overall cortex were associated with language decline. Pathologically, temporal lobe microbleeds were associated with increased tau in the overall cortex, while cortical microbleeds were linked to elevated A$\beta$ in the temporal, parietal, and frontal regions. In this mixed population, microbleeds were connected to longitudinal cognitive decline, especially in semantic and language domains, and were associated with higher baseline A$\beta$ and tau pathology. These findings suggest that lobar microbleeds should be included in AD diagnostic and prognostic evaluations.

artificial intelligence, coefficient, machine learning, (15 more...)

2410.12809

Country: North America > United States > Pennsylvania (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.47)
Information Technology > Artificial Intelligence > Cognitive Science (0.46)

Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data

Das, Sudeshna, Ge, Yao, Guo, Yuting, Rajwal, Swati, Hairston, JaMor, Powell, Jeanne, Walker, Drew, Peddireddy, Snigdha, Lakamana, Sahithi, Bozkurt, Selen, Reyna, Matthew, Sameni, Reza, Xiao, Yunyu, Kim, Sangmi, Chandler, Rasheeta, Hernandez, Natalie, Mowery, Danielle, Wightman, Rachel, Love, Jennifer, Spadaro, Anthony, Perrone, Jeanmarie, Sarker, Abeed

arXiv.org Artificial Intelligence, May-29-2024

Retrieval augmented generation (RAG) makes it possible to constrain generative model outputs and mitigate the possibility of hallucination by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, which limits the volume of knowledge from which an answer can be generated. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource-constrained settings, enabling researchers to obtain near real-time data from users.
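One plausible reading of the two-layer idea, given the context-length limit mentioned in the abstract above, is sketched below. It is an assumption rather than the paper's implementation: the `chunk` and `two_layer_answer` helpers, the character-based context budget, and the `toy_generate` stand-in for an LLM call are all hypothetical.

```python
from typing import Callable, List


def chunk(items: List[str], max_chars: int) -> List[List[str]]:
    """Greedily pack texts into groups whose total length fits a budget.
    A single text longer than the budget still gets its own group."""
    groups: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in items:
        if current and used + len(text) > max_chars:
            groups.append(current)
            current, used = [], 0
        current.append(text)
        used += len(text)
    if current:
        groups.append(current)
    return groups


def two_layer_answer(
    query: str,
    posts: List[str],
    generate: Callable[[str, List[str]], str],
    max_chars: int = 2000,
) -> str:
    # Layer 1: one query-focused summary per group of retrieved posts.
    partial = [generate(query, group) for group in chunk(posts, max_chars)]
    # Layer 2: a single answer generated from the layer-1 summaries only.
    return generate(query, partial)


def toy_generate(query: str, contexts: List[str]) -> str:
    """Hypothetical stand-in for an LLM call: keeps contexts that
    mention any query term and joins them."""
    terms = {w.lower() for w in query.split()}
    kept = [
        c for c in contexts
        if terms & {w.lower().strip(".,") for w in c.split()}
    ]
    return " ".join(kept)


posts = [
    "xylazine causes wounds",
    "nice weather today",
    "xylazine is mixed with fentanyl",
]
answer = two_layer_answer("xylazine", posts, toy_generate, max_chars=30)
```

Because each layer-1 call sees only one group, no single call has to fit all retrieved posts into its context. Only the much shorter summaries are passed to the second layer.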

large language model, machine learning, xylazine, (21 more...)

2405.19519

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.48)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Social Media as a Sensor: Analyzing Twitter Data for Breast Cancer Medication Effects Using Natural Language Processing

Kobara, Seibi, Rafiei, Alireza, Nateghi, Masoud, Bozkurt, Selen, Kamaleswaran, Rishikesan, Sarker, Abeed

arXiv.org Artificial Intelligence, Feb-26-2024

Breast cancer is a significant public health concern and is the leading cause of cancer-related deaths among women. Despite advances in breast cancer treatments, medication non-adherence remains a major problem. As electronic health records do not typically capture patient-reported outcomes that may reveal information about medication-related experiences, social media presents an attractive resource for enhancing our understanding of patients' treatment experiences. In this paper, we developed natural language processing (NLP) based methodologies to study information posted by an automatically curated breast cancer cohort from social media. We employed a transformer-based classifier to identify breast cancer patients/survivors on X (Twitter) based on their self-reported information, and we collected longitudinal data from their profiles. We then designed a multi-layer rule-based model to develop a breast cancer therapy-associated side effect lexicon and detect patterns of medication usage and associated side effects among breast cancer patients. A total of 1,454,637 posts were available from 583,962 unique users, of which 62,042 were detected as breast cancer members using our transformer-based model. Of these, 198 cohort members mentioned breast cancer medications, with tamoxifen as the most common. Our side effect lexicon identified well-known side effects of hormone therapy and chemotherapy. Furthermore, it discovered subjective feelings towards cancer and medications, which may suggest a pre-clinical phase of side effects or emotional distress. This analysis highlighted not only the utility of NLP techniques in unstructured social media data to identify self-reported breast cancer posts, medication usage patterns, and treatment side effects but also the richness of social data on such clinical questions.

large language model, machine learning, natural language, (18 more...)

2403.00821

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)