AITopics | Ofli, Ferda

Collaborating Authors

Ofli, Ferda

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Evaluating Robustness of LLMs on Crisis-Related Microblogs across Events, Information Types, and Linguistic Features

Imran, Muhammad, Ziaullah, Abdul Wahab, Chen, Kai, Ofli, Ferda

arXiv.org Artificial IntelligenceDec-8-2024

The widespread use of microblogging platforms like X (formerly Twitter) during disasters provides real-time information to governments and response authorities. However, the data from these platforms is often noisy, requiring automated methods to filter relevant information. Traditionally, supervised machine learning models have been used, but they lack generalizability. In contrast, Large Language Models (LLMs) show better capabilities in understanding and processing natural language out of the box. This paper provides a detailed analysis of the performance of six well-known LLMs in processing disaster-related social media data from a large-set of real-world events. Our findings indicate that while LLMs, particularly GPT-4o and GPT-4, offer better generalizability across different disasters and information types, most LLMs face challenges in processing flood-related data, show minimal improvement despite the provision of examples (i.e., shots), and struggle to identify critical information categories like urgent requests and needs. Additionally, we examine how various linguistic features affect model performance and highlight LLMs' vulnerabilities against certain features like typos. Lastly, we provide benchmarking results for all events across both zero- and few-shot settings and observe that proprietary models outperform open-source ones in all tasks.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2412.10413

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Information Technology (0.68)
Government (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Monitoring Critical Infrastructure Facilities During Disasters Using Large Language Models

Ziaullah, Abdul Wahab, Ofli, Ferda, Imran, Muhammad

arXiv.org Artificial IntelligenceApr-18-2024

Critical Infrastructure Facilities (CIFs), such as healthcare and transportation facilities, are vital for the functioning of a community, especially during large-scale emergencies. In this paper, we explore a potential application of Large Language Models (LLMs) to monitor the status of CIFs affected by natural disasters through information disseminated in social media networks. To this end, we analyze social media data from two disaster events in two different countries to identify reported impacts to CIFs as well as their impact severity and operational status. We employ state-of-the-art open-source LLMs to perform computational tasks including retrieval, classification, and inference, all in a zero-shot setting. Through extensive experimentation, we report the results of these tasks using standard evaluation metrics and reveal insights into the strengths and weaknesses of LLMs. We note that although LLMs perform well in classification tasks, they encounter challenges with inference tasks, especially when the context/prompt is complex and lengthy. Additionally, we outline various potential directions for future exploration that can be beneficial during the initial adoption phase of LLMs for disaster response tasks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2404.14432

Country:

North America > United States > Florida (0.17)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (0.46)
Energy > Power Industry (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bias-Aware Face Mask Detection Dataset

Kantarcı, Alperen, Ofli, Ferda, Imran, Muhammad, Ekenel, Hazım Kemal

arXiv.org Artificial IntelligenceJan-10-2023

In December 2019, a novel coronavirus (COVID-19) spread so quickly around the world that many countries had to set mandatory face mask rules in public areas to reduce the transmission of the virus. To monitor public adherence, researchers aimed to rapidly develop efficient systems that can detect faces with masks automatically. However, the lack of representative and novel datasets proved to be the biggest challenge. Early attempts to collect face mask datasets did not account for potential race, gender, and age biases. Therefore, the resulting models show inherent biases toward specific race groups, such as Asian or Caucasian. In this work, we present a novel face mask detection dataset that contains images posted on Twitter during the pandemic from around the world. Unlike previous datasets, the proposed Bias-Aware Face Mask Detection (BAFMD) dataset contains more images from underrepresented race and age groups to mitigate the problem for the face mask detection task. We perform experiments to investigate potential biases in widely used face mask detection datasets and illustrate that the BAFMD dataset yields models with better performance and generalization ability. The dataset is publicly available at https://github.com/Alpkant/BAFMD.

artificial intelligence, machine learning, social media, (12 more...)

arXiv.org Artificial Intelligence

2211.01207

Country:

Asia > Middle East (0.47)
North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Fine-grained Population Mapping from Coarse Census Counts and Open Geodata

Metzger, Nando, Vargas-Muñoz, John E., Daudt, Rodrigo C., Kellenberger, Benjamin, Whelan, Thao Ton-That, Ofli, Ferda, Imran, Muhammad, Schindler, Konrad, Tuia, Devis

arXiv.org Artificial IntelligenceNov-8-2022

Fine-grained population maps are needed in several domains, like urban planning, environmental monitoring, public health, and humanitarian operations. Unfortunately, in many countries only aggregate census counts over large spatial units are collected, moreover, these are not always up-to-date. We present POMELO, a deep learning model that employs coarse census counts and open geodata to estimate fine-grained population maps with 100m ground sampling distance. Moreover, the model can also estimate population numbers when no census counts at all are available, by generalizing across countries. In a series of experiments for several countries in sub-Saharan Africa, the maps produced with POMELOare in good agreement with the most detailed available reference counts: disaggregation of coarse census counts reaches R2 values of 85-89%; unconstrained prediction in the absence of any counts reaches 48-69%.

artificial intelligence, machine learning, occupancy rate, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s41598-022-24495-w

2211.04039

Country:

Africa > Tanzania (0.72)
Europe > Switzerland (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

MEDIC: A Multi-Task Learning Dataset for Disaster Image Classification

Alam, Firoj, Alam, Tanvirul, Hasan, Md. Arid, Hasnat, Abul, Imran, Muhammad, Ofli, Ferda

arXiv.org Artificial IntelligenceJun-8-2022

Recent research in disaster informatics demonstrates a practical and important use case of artificial intelligence to save human lives and suffering during natural disasters based on social media contents (text and images). While notable progress has been made using texts, research on exploiting the images remains relatively under-explored. To advance image-based approaches, we propose MEDIC (Available at: https://crisisnlp.qcri.org/medic/index.html), which is the largest social media image classification dataset for humanitarian response consisting of 71,198 images to address four different tasks in a multi-task learning setup. This is the first dataset of its kind: social media images, disaster response, and multi-task learning research. An important property of this dataset is its high potential to facilitate research on multi-task learning, which recently receives much interest from the machine learning community and has shown remarkable results in terms of memory, inference speed, performance, and generalization capability. Therefore, the proposed dataset is an important resource for advancing image-based disaster management and multi-task machine learning research. We experiment with different deep learning architectures and report promising results, which are above the majority baselines for all tasks. Along with the dataset, we also release all relevant scripts (https://github.com/firojalam/medic).

artificial intelligence, disaster image classification, machine learning, (2 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s00521-022-07717-0

2108.12828

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.60)

Add feedback

TBCOV: Two Billion Multilingual COVID-19 Tweets with Sentiment, Entity, Geo, and Gender Labels

Imran, Muhammad, Qazi, Umair, Ofli, Ferda

arXiv.org Artificial IntelligenceOct-4-2021

The widespread usage of social networks during mass convergence events, such as health emergencies and disease outbreaks, provides instant access to citizen-generated data that carry rich information about public opinions, sentiments, urgent needs, and situational reports. Such information can help authorities understand the emergent situation and react accordingly. Moreover, social media plays a vital role in tackling misinformation and disinformation. This work presents TBCOV, a large-scale Twitter dataset comprising more than two billion multilingual tweets related to the COVID-19 pandemic collected worldwide over a continuous period of more than one year. More importantly, several state-of-the-art deep learning models are used to enrich the data with important attributes, including sentiment labels, named-entities (e.g., mentions of persons, organizations, locations), user types, and gender information. Last but not least, a geotagging method is proposed to assign country, state, county, and city information to tweets, enabling a myriad of data analysis tasks to understand real-world issues. Our sentiment and trend analyses reveal interesting insights and confirm TBCOV's broad coverage of important topics.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/data7010008

2110.03664

Country:

North America > United States (1.00)
Europe (1.00)
Africa (1.00)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks

Alam, Firoj, Qazi, Umair, Imran, Muhammad, Ofli, Ferda

arXiv.org Artificial IntelligenceApr-8-2021

Social networks are widely used for information consumption and dissemination, especially during time-critical events such as natural disasters. Despite its significantly large volume, social media content is often too noisy for direct use in any application. Therefore, it is important to filter, categorize, and concisely summarize the available content to facilitate effective consumption and decision-making. To address such issues automatic classification systems have been developed using supervised modeling approaches, thanks to the earlier efforts on creating labeled datasets. However, existing datasets are limited in different aspects (e.g., size, contains duplicates) and less suitable to support more advanced and data-hungry deep learning models. In this paper, we present a new large-scale dataset with ~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19 disaster events that happened between 2016 and 2019. Moreover, we propose a data collection and sampling pipeline, which is important for social media data sampling for human annotation. We report multiclass classification results using classic and deep learning (fastText and transformer) based models to set the ground for future studies. The dataset and associated resources are publicly available. https://crisisnlp.qcri.org/humaid_dataset.html

deep learning, neural network, tweet, (21 more...)

arXiv.org Artificial Intelligence

2104.0309

Country:

Asia > Middle East (0.46)
North America > United States > Missouri (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response

Alam, Firoj, Ofli, Ferda, Imran, Muhammad, Alam, Tanvirul, Qazi, Umair

arXiv.org Artificial IntelligenceNov-17-2020

During a disaster event, images shared on social media helps crisis managers gain situational awareness and assess incurred damages, among other response tasks. Recent advances in computer vision and deep neural networks have enabled the development of models for real-time image classification for a number of tasks, including detecting crisis incidents, filtering irrelevant images, classifying images into specific humanitarian categories, and assessing the severity of damage. Despite several efforts, past works mainly suffer from limited resources (i.e., labeled images) available to train more robust deep learning models. In this study, we propose new datasets for disaster type detection, and informativeness classification, and damage severity assessment. Moreover, we relabel existing publicly available datasets for new tasks. We identify exact- and near-duplicates to form non-overlapping data splits, and finally consolidate them to create larger datasets. In our extensive experiments, we benchmark several state-of-the-art deep learning models and achieve promising results. We release our datasets and models publicly, aiming to provide proper baselines as well as to spur further research in the crisis informatics community.

dataset, deep learning, neural network, (22 more...)

arXiv.org Artificial Intelligence

2011.08916

Country:

Asia > Middle East > Qatar (0.14)
Asia > Bangladesh (0.14)

Genre: Research Report > New Finding (0.49)

Industry:

Transportation (0.93)
Information Technology (0.68)
Government > Military (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information

Qazi, Umair, Imran, Muhammad, Ofli, Ferda

arXiv.org Artificial IntelligenceMay-22-2020

The past several years have witnessed a huge surge in the use of social media platforms during mass convergence events such as health emergencies, natural or human-induced disasters. These non-traditional data sources are becoming vital for disease forecasts and surveillance when preparing for epidemic and pandemic outbreaks. In this paper, we present GeoCoV19, a large-scale Twitter dataset containing more than 524 million multilingual tweets posted over a period of 90 days since February 1, 2020. Moreover, we employ a gazetteer-based approach to infer the geolocation of tweets. We postulate that this large-scale, multilingual, geolocated social media data can empower the research communities to evaluate how societies are collectively coping with this unprecedented global crisis as well as to develop computational methods to address challenges such as identifying fake news, understanding communities' knowledge gaps, building disease forecast and surveillance models, among others.

artificial intelligence, natural language, tweet, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3404820.3404823

2005.11177

Country:

Europe (1.00)
North America > United States (0.70)
Asia > Middle East > Qatar (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.46)

Add feedback

Face-to-BMI: Using Computer Vision to Infer Body Mass Index on Social Media

Kocabey, Enes (Massachusetts Institute of Technology) | Camurcu, Mustafa (Northeastern University) | Ofli, Ferda (Hamad Bin Khalifa University) | Aytar, Yusuf (Massachusetts Institute of Technology) | Marin, Javier (Massachusetts Institute of Technology) | Torralba, Antonio (Massachusetts Institute of Technology) | Weber, Ingmar (Hamad Bin Khalifa University)

AAAI ConferencesMay-11-2017

A person's weight status can have profound implications on their life, ranging from mental health, to longevity, to financial income. At the societal level, "fat shaming'" and other forms of "sizeism'' are a growing concern, while increasing obesity rates are linked to ever raising healthcare costs. For these reasons, researchers from a variety of backgrounds are interested in studying obesity from all angles. To obtain data, traditionally, a person would have to accurately self-report their body-mass index (BMI) or would have to see a doctor to have it measured. In this paper, we show how computer vision can be used to infer a person's BMI from social media images. We hope that our tool, which we release, helps to advance the study of social aspects related to body weight.

computer vision, infer body mass index, social media, (1 more...)

AAAI Conferences

Eleventh International AAAI Conference on Web and Social Media

Industry: Health & Medicine > Consumer Health (0.89)

Technology:

Information Technology > Communications > Social Media (0.60)
Information Technology > Artificial Intelligence > Vision (0.60)

Add feedback