Goto

Collaborating Authors

 Overview


Assessing wildfire susceptibility in Iran: Leveraging machine learning for geospatial analysis of climatic and anthropogenic factors

arXiv.org Artificial Intelligence

This study investigates the multifaceted factors influencing wildfire risk in Iran, focusing on the interplay between climatic conditions and human activities. Utilizing advanced remote sensing, geospatial information system (GIS) processing techniques such as cloud computing, and machine learning algorithms, this research analyzed the impact of climatic parameters, topographic features, and human-related factors on wildfire susceptibility assessment and prediction in Iran. Multiple scenarios were developed for this purpose based on the data sampling strategy. The findings revealed that climatic elements such as soil moisture, temperature, and humidity significantly contribute to wildfire susceptibility, while human activities-particularly population density and proximity to powerlines-also played a crucial role. Furthermore, the seasonal impact of each parameter was separately assessed during warm and cold seasons. The results indicated that human-related factors, rather than climatic variables, had a more prominent influence during the seasonal analyses. This research provided new insights into wildfire dynamics in Iran by generating high-resolution wildfire susceptibility maps using advanced machine learning classifiers. The generated maps identified high risk areas, particularly in the central Zagros region, the northeastern Hyrcanian Forest, and the northern Arasbaran forest, highlighting the urgent need for effective fire management strategies.


FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review

arXiv.org Artificial Intelligence

Convolutional Neural Networks (CNNs) are fundamental to deep learning, driving applications across various domains. However, their growing complexity has significantly increased computational demands, necessitating efficient hardware accelerators. Field-Programmable Gate Arrays (FPGAs) have emerged as a leading solution, offering reconfigurability, parallelism, and energy efficiency. This paper provides a comprehensive review of FPGA-based hardware accelerators specifically designed for CNNs. It presents and summarizes the performance evaluation framework grounded in existing studies and explores key optimization strategies, such as parallel computing, dataflow optimization, and hardware-software co-design. It also compares various FPGA architectures in terms of latency, throughput, compute efficiency, power consumption, and resource utilization. Finally, the paper highlights future challenges and opportunities, emphasizing the potential for continued innovation in this field.


Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey

arXiv.org Artificial Intelligence

Plane geometry problem solving (PGPS) has recently gained significant attention as a benchmark to assess the multi-modal reasoning capabilities of large vision-language models. Despite the growing interest in PGPS, the research community still lacks a comprehensive overview that systematically synthesizes recent work in PGPS. To fill this gap, we present a survey of existing PGPS studies. We first categorize PGPS methods into an encoder-decoder framework and summarize the corresponding output formats used by their encoders and decoders. Subsequently, we classify and analyze these encoders and decoders according to their architectural designs. Finally, we outline major challenges and promising directions for future research. In particular, we discuss the hallucination issues arising during the encoding phase within encoder-decoder architectures, as well as the problem of data leakage in current PGPS benchmarks.


Toward Embodied AGI: A Review of Embodied AI and the Road Ahead

arXiv.org Artificial Intelligence

Artificial General Intelligence (AGI) is often envisioned as inherently embodied. With recent advances in robotics and foundational AI models, we stand at the threshold of a new era-one marked by increasingly generalized embodied AI systems. This paper contributes to the discourse by introducing a systematic taxonomy of Embodied AGI spanning five levels (L1-L5). We review existing research and challenges at the foundational stages (L1-L2) and outline the key components required to achieve higher-level capabilities (L3-L5). Building on these insights and existing technologies, we propose a conceptual framework for an L3+ robotic brain, offering both a technical outlook and a foundation for future exploration.


Enhancing Abstractive Summarization of Scientific Papers Using Structure Information

arXiv.org Artificial Intelligence

The code and dataset can be accessed at https://github.com/tongbao96/code - for - SFR - AS 1. Introduction W ith the rapid growth of scientific research and the academic community, numerous scientific papers are published daily. This notable increase in publications has led to information overload and requiring schol a r s to spend considerable time in reading and comprehending a large volume of articles . The goal of automatic summarization is to employ algorithms to extract key information and reorganize it into shorter, concise summaries (El - Kassas et al., 2021) . Automatic summarization holds significant research value in fields such as information retrieval (Spina et al., 2017), question and answer system (Y ulianti et al., 2018), and content review (Hu et al., 2017) . Existing a utomatic summarization methods are broadly divided into two categories: extractive method s and abstractive methods . Extractive methods generate summaries by selecting sentences directly from the original document, resulting in summaries that are more accurate and semantically consistent but may lack coherence. In contrast, abstractive methods generate summaries based on an understanding of the text, rather than extracting sentences directly from the original document . Therefore, summaries produced by this approach are typically more coherent and better aligned with human reading preferences (El - Kassas et al., 2021; Ghadimi & Beigy, 2022) . In this paper, we focus on abstractive summarization .


Enhancing Keyphrase Extraction from Academic Articles Using Section Structure Information

arXiv.org Artificial Intelligence

The exponential increase in academic papers has significantly increased the time required for researchers to access relevant literature. Keyphrase Extraction (KPE) offers a solution to this situation by enabling researchers to efficiently retrieve relevant literature. The current study on KPE from academic articles aims to improve the performance of extraction models through innovative approaches using Title and Abstract as input corpora. However, the semantic richness of keywords is significantly constrained by the length of the abstract. While full-text-based KPE can address this issue, it simultaneously introduces noise, which significantly diminishes KPE performance. To address this issue, this paper utilized the structural features and section texts obtained from the section structure information of academic articles to extract keyphrase from academic papers. The approach consists of two main parts: (1) exploring the effect of seven structural features on KPE models, and (2) integrating the extraction results from all section texts used as input corpora for KPE models via a keyphrase integration algorithm to obtain the keyphrase integration result. Furthermore, this paper also examined the effect of the classification quality of section structure on the KPE performance. The results show that incorporating structural features improves KPE performance, though different features have varying effects on model efficacy. The keyphrase integration approach yields the best performance, and the classification quality of section structure can affect KPE performance. These findings indicate that using the section structure information of academic articles contributes to effective KPE from academic articles. The code and dataset supporting this study are available at https://github.com/yan-xinyi/SSB_KPE.


CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring

arXiv.org Artificial Intelligence

Automated Essay Scoring (AES) is crucial for modern education, particularly with the increasing prevalence of multimodal assessments. However, traditional AES methods struggle with evaluation generalizability and multimodal perception, while even recent Multimodal Large Language Model (MLLM)-based approaches can produce hallucinated justifications and scores misaligned with human judgment. To address the limitations, we introduce CAFES, the first collaborative multi-agent framework specifically designed for AES. It orchestrates three specialized agents: an Initial Scorer for rapid, trait-specific evaluations; a Feedback Pool Manager to aggregate detailed, evidence-grounded strengths; and a Reflective Scorer that iteratively refines scores based on this feedback to enhance human alignment. Extensive experiments, using state-of-the-art MLLMs, achieve an average relative improvement of 21% in Quadratic Weighted Kappa (QWK) against ground truth, especially for grammatical and lexical diversity. Our proposed CAFES framework paves the way for an intelligent multimodal AES system. The code will be available upon acceptance.


OMGPT: A Sequence Modeling Framework for Data-driven Operational Decision Making

arXiv.org Artificial Intelligence

We build a Generative Pre-trained Transformer (GPT) model from scratch to solve sequential decision making tasks arising in contexts of operations research and management science which we call OMGPT. We first propose a general sequence modeling framework to cover several operational decision making tasks as special cases, such as dynamic pricing, inventory management, resource allocation, and queueing control. Under the framework, all these tasks can be viewed as a sequential prediction problem where the goal is to predict the optimal future action given all the historical information. Then we train a transformer-based neural network model (OMGPT) as a natural and powerful architecture for sequential modeling. This marks a paradigm shift compared to the existing methods for these OR/OM tasks in that (i) the OMGPT model can take advantage of the huge amount of pre-trained data; (ii) when tackling these problems, OMGPT does not assume any analytical model structure and enables a direct and rich mapping from the history to the future actions. Either of these two aspects, to the best of our knowledge, is not achieved by any existing method. We establish a Bayesian perspective to theoretically understand the working mechanism of the OMGPT on these tasks, which relates its performance with the pre-training task diversity and the divergence between the testing task and pre-training tasks. Numerically, we observe a surprising performance of the proposed model across all the above tasks.


Zero-Shot Forecasting Mortality Rates: A Global Study

arXiv.org Artificial Intelligence

This study explores the potential of zero-shot time series forecasting, an innovative approach leveraging pre-trained foundation models, to forecast mortality rates without task-specific fine-tuning. We evaluate two state-of-the-art foundation models, TimesFM and CHRONOS, alongside traditional and machine learning-based methods across three forecasting horizons (5, 10, and 20 years) using data from 50 countries and 111 age groups. In our investigations, zero-shot models showed varying results: while CHRONOS delivered competitive shorter-term forecasts, outperforming traditional methods like ARIMA and the Lee-Carter model, TimesFM consistently underperformed. Fine-tuning CHRONOS on mortality data significantly improved long-term accuracy. A Random Forest model, trained on mortality data, achieved the best overall performance. These findings underscore the potential of zero-shot forecasting while highlighting the need for careful model selection and domain-specific adaptation.


IG Parser: A Software Package for the Encoding of Institutional Statements using the Institutional Grammar

arXiv.org Artificial Intelligence

This article provides an overview of IG Parser, a software that facilitates qualitative content analysis of formal (e.g., legal) rules or informal (e.g., social) norms, and strategies (such as conventions) -- referred to as institutions -- that govern social systems and operate configurally to describe institutional systems. To this end, the IG Parser employs a distinctive syntax that ensures rigorous encoding of natural language, while automating the transformation into various formats that support the downstream analysis using diverse analytical techniques. The conceptual core of the IG Parser is an associated syntax, IG Script, that operationalizes the conceptual foundations of the Institutional Grammar, and more specifically the Institutional Grammar 2.0, an analytical paradigm for institutional analysis. This article presents the IG Parser, including its conceptual foundations, the syntax specification of IG Script, and its architectural principles. This overview is augmented with selective illustrative examples that highlight its use and the associated benefits.