AITopics | Asmara

Collaborating Authors

Asmara

A Benchmark Dataset and a Framework for Urdu Multimodal Named Entity Recognition

Ahmad, Hussain, Zeng, Qingyang, Wan, Jing

arXiv.org Artificial IntelligenceMay-9-2025

The emergence of multimodal content, particularly text and images on social media, has positioned Multimodal Named Entity Recognition (MNER) as an increasingly important area of research within Natural Language Processing. Despite progress in high-resource languages such as English, MNER remains underexplored for low-resource languages like Urdu. The primary challenges include the scarcity of annotated multimodal datasets and the lack of standardized baselines. To address these challenges, we introduce the U-MNER framework and release the Twitter2015-Urdu dataset, a pioneering resource for Urdu MNER. Adapted from the widely used Twitter2015 dataset, it is annotated with Urdu-specific grammar rules. We establish benchmark baselines by evaluating both text-based and multimodal models on this dataset, providing comparative analyses to support future research on Urdu MNER. The U-MNER framework integrates textual and visual context using Urdu-BERT for text embeddings and ResNet for visual feature extraction, with a Cross-Modal Fusion Module to align and fuse information. Our model achieves state-of-the-art performance on the Twitter2015-Urdu dataset, laying the groundwork for further MNER research in low-resource languages.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.05148

Country:

Asia > China > Beijing > Beijing (0.05)
Africa > Eritrea > Maekel > Asmara (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs

Sauder, Jonathan, Domazetoski, Viktor, Banc-Prandi, Guilhem, Perna, Gabriela, Meibom, Anders, Tuia, Devis

arXiv.org Artificial IntelligenceMar-25-2025

Coral reefs are declining worldwide due to climate change and local stressors. To inform effective conservation or restoration, monitoring at the highest possible spatial and temporal resolution is necessary. Conventional coral reef surveying methods are limited in scalability due to their reliance on expert labor time, motivating the use of computer vision tools to automate the identification and abundance estimation of live corals from images. However, the design and evaluation of such tools has been impeded by the lack of large high quality datasets. We release the Coralscapes dataset, the first general-purpose dense semantic segmentation dataset for coral reefs, covering 2075 images, 39 benthic classes, and 174k segmentation masks annotated by experts. Coralscapes has a similar scope and the same structure as the widely used Cityscapes dataset for urban scene segmentation, allowing benchmarking of semantic segmentation models in a new challenging domain which requires expert knowledge to annotate. We benchmark a wide range of semantic segmentation models, and find that transfer learning from Coralscapes to existing smaller datasets consistently leads to state-of-the-art performance. Coralscapes will catalyze research on efficient, scalable, and standardized coral reef surveying methods based on computer vision, and holds the potential to streamline the development of underwater ecological robotics.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2503.2

Country:

Asia > Middle East > Jordan (0.28)
Africa > Middle East > Djibouti (0.15)
Africa > Middle East > Egypt (0.14)
(13 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.68)
Government (0.46)
Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpretable LLM-based Table Question Answering

Giang, null, Nguyen, null, Brugere, Ivan, Sharma, Shubham, Kariyappa, Sanjay, Nguyen, Anh Totti, Lecue, Freddy

arXiv.org Artificial IntelligenceDec-16-2024

Interpretability for Table Question Answering (Table QA) is critical, particularly in high-stakes industries like finance or healthcare. Although recent approaches using Large Language Models (LLMs) have significantly improved Table QA performance, their explanations for how the answers are generated are ambiguous. To fill this gap, we introduce Plan-of-SQLs ( or POS), an interpretable, effective, and efficient approach to Table QA that answers an input query solely with SQL executions. Through qualitative and quantitative evaluations with human and LLM judges, we show that POS is most preferred among explanation methods, helps human users understand model decision boundaries, and facilitates model success and error identification. Furthermore, when evaluated in standard benchmarks (TabFact, WikiTQ, and FetaQA), POS achieves competitive or superior accuracy compared to existing methods, while maintaining greater efficiency by requiring significantly fewer LLM calls and database queries.

explanation, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2412.12386

Country:

North America > United States > Iowa (0.07)
North America > United States > Michigan (0.05)
North America > United States > Tennessee (0.05)
(39 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Banking & Finance (1.00)
Leisure & Entertainment > Sports > Tennis (0.46)
Leisure & Entertainment > Sports > Golf (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Beware of Metacognitive Laziness: Effects of Generative Artificial Intelligence on Learning Motivation, Processes, and Performance

Fan, Yizhou, Tang, Luzhen, Le, Huixiao, Shen, Kejie, Tan, Shufang, Zhao, Yueying, Shen, Yuan, Li, Xinyu, Gašević, Dragan

arXiv.org Artificial IntelligenceDec-12-2024

With the continuous development of technological and educational innovation, learners nowadays can obtain a variety of support from agents such as teachers, peers, education technologies, and recently, generative artificial intelligence such as ChatGPT. The concept of hybrid intelligence is still at a nascent stage, and how learners can benefit from a symbiotic relationship with various agents such as AI, human experts and intelligent learning systems is still unknown. The emerging concept of hybrid intelligence also lacks deep insights and understanding of the mechanisms and consequences of hybrid human-AI learning based on strong empirical research. In order to address this gap, we conducted a randomised experimental study and compared learners' motivations, self-regulated learning processes and learning performances on a writing task among different groups who had support from different agents (ChatGPT, human expert, writing analytics tools, and no extra tool). A total of 117 university students were recruited, and their multi-channel learning, performance and motivation data were collected and analysed. The results revealed that: learners who received different learning support showed no difference in post-task intrinsic motivation; there were significant differences in the frequency and sequences of the self-regulated learning processes among groups; ChatGPT group outperformed in the essay score improvement but their knowledge gain and transfer were not significantly different. Our research found that in the absence of differences in motivation, learners with different supports still exhibited different self-regulated learning processes, ultimately leading to differentiated performance. What is particularly noteworthy is that AI technologies such as ChatGPT may promote learners' dependence on technology and potentially trigger metacognitive laziness.

large language model, learner, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1111/bjet.13544

2412.09315

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Beijing > Beijing (0.04)
Africa > Eritrea > Maekel > Asmara (0.04)
(6 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education > Assessment & Standards (0.93)
Education > Curriculum > Subject-Specific Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.84)

Add feedback

Learning Algorithms Made Simple

Golilarz, Noorbakhsh Amiri, Hossain, Elias, Addeh, Abdoljalil, Rahimi, Keyan Alexander

arXiv.org Artificial IntelligenceOct-11-2024

In this paper, we discuss learning algorithms and their importance in different types of applications which includes training to identify important patterns and features in a straightforward, easy-to-understand manner. We will review the main concepts of artificial intelligence (AI), machine learning (ML), deep learning (DL), and hybrid models. Some important subsets of Machine Learning algorithms such as supervised, unsupervised, and reinforcement learning are also discussed in this paper. These techniques can be used for some important tasks like prediction, classification, and segmentation. Convolutional Neural Networks (CNNs) are used for image and video processing and many more applications. We dive into the architecture of CNNs and how to integrate CNNs with ML algorithms to build hybrid models. This paper explores the vulnerability of learning algorithms to noise, leading to misclassification. We further discuss the integration of learning algorithms with Large Language Models (LLM) to generate coherent responses applicable to many domains such as healthcare, marketing, and finance by learning important patterns from large volumes of data. Furthermore, we discuss the next generation of learning algorithms and how we may have an unified Adaptive and Dynamic Network to perform important tasks. Overall, this article provides brief overview of learning algorithms, exploring their current state, applications and future direction.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.09186

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
North America > United States > Mississippi (0.04)
Asia > Taiwan (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.68)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Cleaning Robots in Public Spaces: A Survey and Proposal for Benchmarking Based on Stakeholders Interviews

Memmesheimer, Raphael, Overbeck, Martina, Kral, Bjoern, Steffen, Lea, Behnke, Sven, Gersch, Martin, Roennau, Arne

arXiv.org Artificial IntelligenceJul-23-2024

Autonomous cleaning robots for public spaces have potential for addressing current societal challenges, such as labor shortages and cleanliness in public spaces. Other application domains like autonomous driving, bin picking, or search and rescue have shown that benchmarking platforms and approaches in competitive settings can advance their respective research fields, resulting in more applicable systems under real-world conditions. For this paper, we analyzed seven semi-structured, qualitative stakeholder interviews about outdoor cleaning, identified current needs as well as limitations, and considered those results for the development of a benchmarking scenario based on the previous observations.

competition, public space, robot, (12 more...)

arXiv.org Artificial Intelligence

2407.16393

Country:

North America > United States (0.16)
Asia > Japan (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry:

Transportation > Ground (0.49)
Information Technology > Robotics & Automation (0.48)
Energy > Renewable > Solar (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.47)

Add feedback

PEDANTS (Precise Evaluations of Diverse Answer Nominee Text for Skinflints): Efficient Evaluation Analysis and Benchmarking for Open-Domain Question Answering

Li, Zongxia, Mondal, Ishani, Liang, Yijun, Nghiem, Huy, Boyd-Graber, Jordan Lee

arXiv.org Artificial IntelligenceJul-6-2024

Question answering (QA) can only make progress if we know if an answer is correct, but for many of the most challenging and interesting QA examples, current efficient answer correctness (AC) metrics do not align with human judgments, particularly verbose, free-form answers from large language models (LLMs). There are two challenges: a lack of diverse evaluation data and that models are too big and non-transparent; LLM-based scorers correlate better with humans, but this expensive task has only been tested on limited QA datasets. We rectify these issues by providing guidelines and datasets for evaluating machine QA adopted from human QA community. We also propose an efficient, low-resource, and interpretable QA evaluation method more stable than an exact match and neural methods.

dataset, evaluation, pedant, (16 more...)

arXiv.org Artificial Intelligence

2402.11161

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)
(12 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Leisure & Entertainment > Games > Jeopardy! (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SecurePose: Automated Face Blurring and Human Movement Kinematics Extraction from Videos Recorded in Clinical Settings

Bajpai, Rishabh, Aravamuthan, Bhooma

arXiv.org Artificial IntelligenceFeb-21-2024

Movement disorders are typically diagnosed by consensus-based expert evaluation of clinically acquired patient videos. However, such broad sharing of patient videos poses risks to patient privacy. Face blurring can be used to de-identify videos, but this process is often manual and time-consuming. Available automated face blurring techniques are subject to either excessive, inconsistent, or insufficient facial blurring - all of which can be disastrous for video assessment and patient privacy. Furthermore, assessing movement disorders in these videos is often subjective. The extraction of quantifiable kinematic features can help inform movement disorder assessment in these videos, but existing methods to do this are prone to errors if using pre-blurred videos. We have developed an open-source software called SecurePose that can both achieve reliable face blurring and automated kinematic extraction in patient videos recorded in a clinic setting using an iPad. SecurePose, extracts kinematics using a pose estimation method (OpenPose), tracks and uniquely identifies all individuals in the video, identifies the patient, and performs face blurring. The software was validated on gait videos recorded in outpatient clinic visits of 116 children with cerebral palsy. The validation involved assessing intermediate steps of kinematics extraction and face blurring with manual blurring (ground truth). Moreover, when SecurePose was compared with six selected existing methods, it outperformed other methods in automated face detection and achieved ceiling accuracy in 91.08% less time than a robust manual face blurring method. Furthermore, ten experienced researchers found SecurePose easy to learn and use, as evidenced by the System Usability Scale. The results of this work validated the performance and usability of SecurePose on clinically recorded gait videos for face blurring and kinematics extraction.

algorithm, securepose, video, (16 more...)

arXiv.org Artificial Intelligence

2402.14143

Country:

North America > United States (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > India (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.88)
Health & Medicine > Health Care Providers & Services (0.86)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(4 more...)

Add feedback

CFMatch: Aligning Automated Answer Equivalence Evaluation with Expert Judgments For Open-Domain Question Answering

Li, Zongxia, Mondal, Ishani, Liang, Yijun, Nghiem, Huy, Boyd-Graber, Jordan

arXiv.org Artificial IntelligenceJan-23-2024

Question answering (QA) can only make progress if we know if an answer is correct, but for many of the most challenging and interesting QA examples, current evaluation metrics to determine answer equivalence (AE) often do not align with human judgments, particularly more verbose, free-form answers from large language models (LLM). There are two challenges: a lack of data and that models are too big: LLM-based scorers can correlate better with human judges, but this task has only been tested on limited QA datasets, and even when available, update of the model is limited because LLMs are large and often expensive. We rectify both of these issues by providing clear and consistent guidelines for evaluating AE in machine QA adopted from professional human QA contests. We also introduce a combination of standard evaluation and a more efficient, robust, and lightweight discriminate AE classifier-based matching method (CFMatch, smaller than 1 MB), trained and validated to more accurately evaluate answer correctness in accordance with adopted expert AE rules that are more aligned with human judgments.

candidate answer, dataset, evaluation method, (15 more...)

arXiv.org Artificial Intelligence

2401.1317

Country:

Asia > Middle East > Jordan (0.04)
Asia > South Korea (0.04)
Africa > Eritrea > Maekel > Asmara (0.04)
(15 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Education (0.67)
Transportation > Air (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

The Challenges of Machine Learning for Trust and Safety: A Case Study on Misinformation Detection

Xiao, Madelyne, Mayer, Jonathan

arXiv.org Artificial IntelligenceAug-23-2023

We examine the disconnect between scholarship and practice in applying machine learning to trust and safety problems, using misinformation detection as a case study. We systematize literature on automated detection of misinformation across a corpus of 270 well-cited papers in the field. We then examine subsets of papers for data and code availability, design missteps, reproducibility, and generalizability. We find significant shortcomings in the literature that call into question claimed performance and practicality. Detection tasks are often meaningfully distinct from the challenges that online services actually face. Datasets and model evaluation are often non-representative of real-world contexts, and evaluation frequently is not independent of model training. Data and code availability is poor. Models do not generalize well to out-of-domain data. Based on these results, we offer recommendations for evaluating machine learning applications to trust and safety problems. Our aim is for future work to avoid the pitfalls that we identify.

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2308.12215

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Russia (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(21 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.45)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback