Jahan, Israt
Uncertainty Quantification of Wind Gust Predictions in the Northeast US: An Evidential Neural Network and Explainable Artificial Intelligence Approach
Jahan, Israt, Schreck, John S., Gagne, David John, Becker, Charlie, Astitha, Marina
Machine learning has shown promise in reducing bias in numerical weather model predictions of wind gusts. Yet, these models underperform in predicting high gusts, even with additional observations, due to the right-skewed distribution of gusts. Uncertainty quantification (UQ) addresses this by identifying when predictions are reliable and when they need cautious interpretation. Using data from 61 extratropical storms in the Northeastern USA, we introduce the evidential neural network (ENN) as a novel approach for UQ in gust predictions, leveraging atmospheric variables from the Weather Research and Forecasting (WRF) model as features and gust observations as targets. Explainable artificial intelligence (XAI) techniques demonstrated that the key predictive features also contributed to higher uncertainty. Estimated uncertainty correlated with storm intensity and spatial gust gradients. The ENN allowed the construction of gust prediction intervals without requiring an ensemble. From an operational perspective, providing gust forecasts with quantified uncertainty enhances stakeholders' confidence in risk assessment and response planning for extreme gust events.
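The abstract does not spell out which evidential formulation is used; a common choice for regression is the Normal-Inverse-Gamma (NIG) head of deep evidential regression (Amini et al., 2020). The Python/PyTorch sketch below, written under that assumption, shows how such a head yields a point prediction, an aleatoric/epistemic uncertainty split, and a prediction interval from a single forward pass, with no ensemble; the paper's actual architecture, features, and training loss (the NIG negative log-likelihood plus an evidence regularizer) are not reproduced here.

import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import t as student_t

class EvidentialHead(nn.Module):
    """Maps hidden features to the four NIG parameters (gamma, nu, alpha, beta)."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, 4)

    def forward(self, h):
        gamma, log_nu, log_alpha, log_beta = self.linear(h).chunk(4, dim=-1)
        nu = F.softplus(log_nu)            # nu > 0: virtual observations of the mean
        alpha = F.softplus(log_alpha) + 1  # alpha > 1 keeps the predicted variance finite
        beta = F.softplus(log_beta)        # beta > 0
        return gamma, nu, alpha, beta

def decompose_uncertainty(gamma, nu, alpha, beta):
    # Point prediction plus the two uncertainty components implied by the NIG prior.
    mean = gamma                            # predicted gust
    aleatoric = beta / (alpha - 1)          # irreducible data noise
    epistemic = beta / (nu * (alpha - 1))   # model (knowledge) uncertainty
    return mean, aleatoric, epistemic

def prediction_interval(gamma, nu, alpha, beta, level=0.9):
    # The NIG marginal predictive is Student-t with 2*alpha degrees of freedom,
    # so an interval comes directly from one forward pass, ensemble-free.
    df = (2 * alpha).detach().numpy()
    scale = torch.sqrt(beta * (1 + nu) / (nu * alpha))
    q = torch.as_tensor(student_t.ppf(0.5 + level / 2, df), dtype=gamma.dtype)
    return gamma - q * scale, gamma + q * scale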
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Laskar, Md Tahmid Rahman, Alqahtani, Sawsan, Bari, M Saiful, Rahman, Mizanur, Khan, Mohammad Abdullah Matin, Khan, Haidar, Jahan, Israt, Bhuiyan, Amran, Tan, Chee Wei, Parvez, Md Rizwan, Hoque, Enamul, Joty, Shafiq, Huang, Jimmy
Large Language Models (LLMs) have recently gained significant attention due to their remarkable capabilities in performing diverse tasks across various domains. However, a thorough evaluation of these models is crucial before deploying them in real-world applications, to ensure that they perform reliably. Despite the well-established importance of LLM evaluation in the community, the complexity of the evaluation process has led to varied evaluation setups, causing inconsistencies in findings and interpretations. To address this, we systematically review the primary challenges and limitations that cause these inconsistencies and unreliable evaluations at various steps of LLM evaluation. Based on our critical review, we present our perspectives and recommendations to ensure that LLM evaluations are reproducible, reliable, and robust.
A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks
Jahan, Israt, Laskar, Md Tahmid Rahman, Peng, Chun, Huang, Jimmy
Recently, Large Language Models (LLMs) have demonstrated impressive capability in solving a wide range of tasks. However, despite their success across various tasks, no prior work has investigated their capability in the biomedical domain. To this end, this paper aims to evaluate the performance of LLMs on benchmark biomedical tasks. For this purpose, we conduct a comprehensive evaluation of 4 popular LLMs on 6 diverse biomedical tasks across 26 datasets. To the best of our knowledge, this is the first work to conduct an extensive evaluation and comparison of various LLMs in the biomedical domain. Interestingly, based on our evaluation, we find that on biomedical datasets with smaller training sets, zero-shot LLMs even outperform the current state-of-the-art fine-tuned biomedical models. This suggests that pretraining on large text corpora makes LLMs quite specialized even in the biomedical domain. We also find that no single LLM outperforms all others across every task; the performance of different LLMs varies depending on the task. While their performance is still quite poor compared to the biomedical models fine-tuned on large training sets, our findings demonstrate that LLMs have the potential to be a valuable tool for biomedical tasks that lack large annotated data.
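As an illustration of the zero-shot setup this abstract describes, the following sketch evaluates a single instance of a biomedical classification task through the OpenAI Python SDK; the model name, prompt wording, label set, and example text are illustrative assumptions, not the paper's exact protocol.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def zero_shot_classify(text: str, labels: list[str], model: str = "gpt-3.5-turbo") -> str:
    """Ask the model to pick one label for a biomedical text, with no in-context examples."""
    prompt = (
        "Classify the following biomedical text into exactly one of these "
        f"categories: {', '.join(labels)}.\n\nText: {text}\nCategory:"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic decoding for a reproducible evaluation
    )
    return response.choices[0].message.content.strip()

# Hypothetical document-classification instance:
print(zero_shot_classify(
    "BRCA1 mutations increase the risk of breast and ovarian cancer.",
    labels=["gene-disease association", "drug interaction", "other"],
))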
Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers
Jahan, Israt, Laskar, Md Tahmid Rahman, Peng, Chun, Huang, Jimmy
ChatGPT is a large language model developed by OpenAI. Despite its impressive performance across various tasks, no prior work has investigated its capability in the biomedical domain. To this end, this paper aims to evaluate the performance of ChatGPT on various benchmark biomedical tasks, such as relation extraction, document classification, question answering, and summarization. To the best of our knowledge, this is the first work to conduct an extensive evaluation of ChatGPT in the biomedical domain. Interestingly, based on our evaluation, we find that on biomedical datasets with smaller training sets, zero-shot ChatGPT even outperforms state-of-the-art fine-tuned generative transformer models such as BioGPT and BioBART. This suggests that ChatGPT's pre-training on large text corpora makes it quite specialized even in the biomedical domain. Our findings demonstrate that ChatGPT has the potential to be a valuable tool for tasks in the biomedical domain that lack large annotated data.
CQSumDP: A ChatGPT-Annotated Resource for Query-Focused Abstractive Summarization Based on Debatepedia
Laskar, Md Tahmid Rahman, Rahman, Mizanur, Jahan, Israt, Hoque, Enamul, Huang, Jimmy
Debatepedia is a publicly available dataset of arguments and counter-arguments on controversial topics that has been widely used for the single-document query-focused abstractive summarization task in recent years. However, it has recently been found that this dataset is limited by noise, with most of its queries having no relevance to the respective documents. In this paper, we present a methodology for cleaning the Debatepedia dataset by leveraging the generative power of large language models, making it suitable for query-focused abstractive summarization. More specifically, we harness the language generation capabilities of ChatGPT to regenerate its queries. We evaluate the effectiveness of the proposed ChatGPT-annotated version of the Debatepedia dataset using several benchmark summarization models, and demonstrate that the newly annotated version outperforms the original dataset in terms of both query relevance and summary generation quality. We will make this annotated and cleaned version of the dataset publicly available.
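A minimal sketch of the query-regeneration step described above, assuming a simple instruction-style prompt; the actual prompt template and model settings used to build the annotated dataset are not specified in the abstract.

from openai import OpenAI

client = OpenAI()

def regenerate_query(document: str, summary: str, model: str = "gpt-3.5-turbo") -> str:
    """Produce a query that the given summary answers and that is relevant to the document."""
    prompt = (
        "Given the document and its summary below, write one short question that "
        "the summary answers and that is clearly relevant to the document.\n\n"
        f"Document: {document}\n\nSummary: {summary}\n\nQuestion:"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()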