AITopics

2503.04502

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Water & Waste Management > Water Management (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Nguyen, Tin, Bolton, Logan, Taesiri, Mohammad Reza, Nguyen, Anh Totti

HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

arXiv.org Artificial IntelligenceMar-4-2025

An Achilles heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response mixed of factual and non-factual statements poses a challenge for humans to verify and accurately base their decisions on. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, LLMs would first re-format the question to add XML tags highlighting key facts, and then, generate a response with highlights over the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain of thought prompting (CoT) on a wide range of 17 tasks from arithmetic, reading comprehension to logical reasoning. When asking humans to verify LLM responses, highlights help time-limited participants to more accurately and efficiently recognize when LLMs are correct. Yet, surprisingly, when LLMs are wrong, HoTs tend to make users believe that an answer is correct.

accuracy, highlighted chain, reformatted question, (16 more...)

2503.02003

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Alberta (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(22 more...)

Genre:

Research Report (1.00)
Overview > Fact Book (0.34)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cromp, Sonia, GNVV, Satya Sai Srinath Namburi, Alkhudhayri, Mohammed, Cao, Catherine, Guo, Samuel, Roberts, Nicholas, Sala, Frederic

Tabby: Tabular Data Synthesis with Language Models

arXiv.org Artificial IntelligenceMar-3-2025

While advances in large language models (LLMs) have greatly improved the quality of synthetic text data in recent years, synthesizing tabular data has received relatively less attention. We address this disparity with Tabby, a simple but powerful post-training modification to the standard Transformer language model architecture, enabling its use for tabular dataset synthesis. Tabby enables the representation of differences across columns using Gated Mixture-of-Experts, with column-specific sets of parameters. Empirically, Tabby results in data quality near or equal to that of real data. By pairing our novel LLM table training technique, Plain, with Tabby, we observe up to a 44% improvement in quality over previous methods. We also show that Tabby extends beyond tables to more general structured data, reaching parity with real data on a nested JSON dataset as well.

dataset, section 3, tabular data, (17 more...)

2503.02152

Country:

North America > United States > California (0.04)
Asia > Bangladesh (0.04)
Oceania > Australia > Tasmania (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.49)
Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Ranjbar, Niloofar, Baghbani, Hamed

Lotus at SemEval-2025 Task 11: RoBERTa with Llama-3 Generated Explanations for Multi-Label Emotion Classification

arXiv.org Artificial IntelligenceFeb-28-2025

This paper presents a novel approach for multi-label emotion detection, where Llama-3 is used to generate explanatory content that clarifies ambiguous emotional expressions, thereby enhancing RoBERTa's emotion classification performance. By incorporating explanatory context, our method improves F1-scores, particularly for emotions like fear, joy, and sadness, and outperforms text-only models. The addition of explanatory content helps resolve ambiguity, addresses challenges like overlapping emotional cues, and enhances multi-label classification, marking a significant advancement in emotion detection tasks.

emotion, llama-3, roberta, (15 more...)

2502.19935

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Indian Ocean > Arabian Gulf (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts

Wu, Mingyan, Liu, Zhenghao, Yan, Yukun, Li, Xinze, Yu, Shi, Zeng, Zheni, Gu, Yu, Yu, Ge

Retrieval-Augmented Generation (RAG) enhances the performance of Large Language Models (LLMs) by incorporating external knowledge. However, LLMs still encounter challenges in effectively utilizing the knowledge from retrieved documents, often being misled by irrelevant or noisy information. To address this issue, we introduce RankCoT, a knowledge refinement method that incorporates reranking signals in generating CoT-based summarization for knowledge refinement based on given query and all retrieval documents. During training, RankCoT prompts the LLM to generate Chain-of-Thought (CoT) candidates based on the query and individual documents. It then fine-tunes the LLM to directly reproduce the best CoT from these candidate outputs based on all retrieved documents, which requires LLM to filter out irrelevant documents during generating CoT-style summarization. Additionally, RankCoT incorporates a self-reflection mechanism that further refines the CoT outputs, resulting in higher-quality training data. Our experiments demonstrate the effectiveness of RankCoT, showing its superior performance over other knowledge refinement models. Further analysis reveals that RankCoT can provide shorter but effective refinement results, enabling the generator to produce more accurate answers. All code and data are available at https://github.com/NEUIR/RankCoT.

large language model, machine learning, natural language, (15 more...)

2502.17888

Country:

Oceania > Australia > Western Australia (0.14)
Indian Ocean (0.05)
Pacific Ocean > South Pacific Ocean > Coral Sea (0.04)
(13 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

Poznanski, Jake, Borchardt, Jon, Dunkelberger, Jason, Huff, Regan, Lin, Daniel, Rangapur, Aman, Wilhelm, Christopher, Lo, Kyle, Soldaini, Luca

PDF documents have the potential to provide trillions of novel, high-quality tokens for training language models. However, these documents come in a diversity of types with differing formats and visual layouts that pose a challenge when attempting to extract and faithfully represent the underlying content for language model use. We present olmOCR, an open-source Python toolkit for processing PDFs into clean, linearized plain text in natural reading order while preserving structured content like sections, tables, lists, equations, and more. Our toolkit runs a fine-tuned 7B vision language model (VLM) trained on a sample of 260,000 pages from over 100,000 crawled PDFs with diverse properties, including graphics, handwritten text and poor quality scans. olmOCR is optimized for large-scale batch processing, able to scale flexibly to different hardware setups and convert a million PDF pages for only $190 USD. We release all components of olmOCR including VLM weights, data and training code, as well as inference code built on serving frameworks including vLLM and SGLang.

olmocr, wang, zhang, (16 more...)

2502.18443

Country:

Indian Ocean (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Subudhi, Sonalika, Pati, Alok Kumar, Bose, Sephali, Sahoo, Subhasmita, Pattanaik, Avipsa, Acharya, Biswa Mohan

Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha

Groundwater is eventually undermined by human exercises, such as fast industrialization, urbanization, over-extraction, and contamination from agrarian and urban sources. From among the different contaminants, the presence of heavy metals like cadmium (Cd), chromium (Cr), arsenic (As), and lead (Pb) proves to have serious dangers when present in huge concentrations in groundwater. Long-term usage of these poisonous components may lead to neurological disorders, kidney failure and different sorts of cancer. To address these issues, this study developed a machine learning-based predictive model to evaluate the Groundwater Quality Index (GWQI) and identify the main contaminants which are affecting the water quality. It has been achieved with the help of a hybrid machine learning model i.e. LCBoost Fusion . The model has undergone several processes like data preprocessing, hyperparameter tuning using Differential Evolution (DE) optimization, and evaluation through cross-validation. The LCBoost Fusion model outperforms individual models (CatBoost and LightGBM), by achieving low RMSE (0.6829), MSE (0.5102), MAE (0.3147) and a high R$^2$ score of 0.9809. Feature importance analysis highlights Potassium (K), Fluoride (F) and Total Hardness (TH) as the most influential indicators of groundwater contamination. This research successfully demonstrates the application of machine learning in assessing groundwater quality risks in Odisha. The proposed LCBoost Fusion model offers a reliable and efficient approach for real-time groundwater monitoring and risk mitigation. These findings will help the environmental organizations and the policy makers to map out targeted places for sustainable groundwater management. Future work will focus on using remote sensing data and developing an interactive decision-making system for groundwater quality assessment.

lcboost fusion, prediction, quality indicator, (13 more...)

2502.17929

Country:

North America > United States (0.69)
Africa > South Africa (0.04)
Oceania > New Zealand (0.04)
(14 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.88)

Industry:

Water & Waste Management > Water Management > Water Supplies & Services (1.00)
Law (1.00)
Health & Medicine (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

From underwater to aerial: a novel multi-scale knowledge distillation approach for coral reef monitoring

Contini, Matteo, Illien, Victor, Barde, Julien, Poulain, Sylvain, Bernard, Serge, Joly, Alexis, Bonhommeau, Sylvain

Drone-based remote sensing combined with AI-driven methodologies has shown great potential for accurate mapping and monitoring of coral reef ecosystems. This study presents a novel multi-scale approach to coral reef monitoring, integrating fine-scale underwater imagery with medium-scale aerial imagery. Underwater images are captured using an Autonomous Surface Vehicle (ASV), while aerial images are acquired with an aerial drone. A transformer-based deep-learning model is trained on underwater images to detect the presence of 31 classes covering various coral morphotypes, associated fauna, and habitats. These predictions serve as annotations for training a second model applied to aerial images. The transfer of information across scales is achieved through a weighted footprint method that accounts for partial overlaps between underwater image footprints and aerial image tiles. The results show that the multi-scale methodology successfully extends fine-scale classification to larger reef areas, achieving a high degree of accuracy in predicting coral morphotypes and associated habitats. The method showed a strong alignment between underwater-derived annotations and ground truth data, reflected by an AUC (Area Under the Curve) score of 0.9251. This shows that the integration of underwater and aerial imagery, supported by deep-learning models, can facilitate scalable and accurate reef assessments. This study demonstrates the potential of combining multi-scale imaging and AI to facilitate the monitoring and conservation of coral reefs. Our approach leverages the strengths of underwater and aerial imagery, ensuring the precision of fine-scale analysis while extending it to cover a broader reef area.

annotation, orthophoto, underwater image, (16 more...)

2502.17883

Country:

Africa > La Réunion (0.15)
Europe > France > Occitanie > Hérault > Montpellier (0.05)
North America > Canada > Quebec (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Food & Agriculture (0.46)
Information Technology (0.46)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Schreyer, W. Max, Anderson, Christopher, Thompson, Reid F.

Generalization is not a universal guarantee: Estimating similarity to training data with an ensemble out-of-distribution metric

Failure of machine learning models to generalize to new data is a core problem limiting the reliability of AI systems, partly due to the lack of simple and robust methods for comparing new data to the original training dataset. We propose a standardized approach for assessing data similarity in a model-agnostic manner by constructing a supervised autoencoder for generalizability estimation (SAGE). We compare points in a low-dimensional embedded latent space, defining empirical probability measures for k -Nearest Neighbors (kNN) distance, reconstruction of inputs and task-based performance. As proof of concept for classification tasks, we use MNIST and CIFAR-10 to demonstrate how an ensemble output probability score can separate deformed images from a mixture of typical test examples, and how this SAGE score is robust to transformations of increasing severity. As further proof of concept, we extend this approach to a regression task using non-imaging data (UCI Abalone). In all cases, we show that out-of-the-box model performance increases after SAGE score filtering, even when applied to data from the model's own training and test datasets. Our out-of-distribution scoring method can be introduced during several steps of model construction and assessment, leading to future improvements in responsible deep learning implementation. 1 Background The presence of generalization gaps, where machine learning performance degrades when a trained model encounters previously-unseen data, represents a critical ongoing challenge in the implementation of AI systems.

dataset, reconstruction error, sage score, (16 more...)

2502.16329

Country:

North America > United States > Oregon > Multnomah County > Portland (0.05)
Oceania > Australia > Tasmania (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Indian Ocean > Bass Strait (0.04)

Genre: Research Report (0.82)

Industry:

Transportation (1.00)
Health & Medicine (1.00)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Ning, Ding, Vetrova, Varvara, Delaux, Sébastien, Tappenden, Rachael, Bryan, Karin R., Koh, Yun Sing

A Study on Monthly Marine Heatwave Forecasts in New Zealand: An Investigation of Imbalanced Regression Loss Functions with Neural Network Models

arXiv.org Artificial IntelligenceFeb-19-2025

Marine heatwaves (MHWs) are extreme ocean-temperature events with significant impacts on marine ecosystems and related industries. Accurate forecasts (one to six months ahead) of MHWs would aid in mitigating these impacts. However, forecasting MHWs presents a challenging imbalanced regression task due to the rarity of extreme temperature anomalies in comparison to more frequent moderate conditions. In this study, we examine monthly MHW forecasts for 12 locations around New Zealand. We use a fully-connected neural network and compare standard and specialized regression loss functions, including the mean squared error (MSE), the mean absolute error (MAE), the Huber, the weighted MSE, the focal-R, the balanced MSE, and a proposed scaling-weighted MSE. Results show that (i) short lead times (one month) are considerably more predictable than three- and six-month leads, (ii) models trained with the standard MSE or MAE losses excel at forecasting average conditions but struggle to capture extremes, and (iii) specialized loss functions such as the balanced MSE and our scaling-weighted MSE substantially improve forecasting of MHW and suspected MHW events. These findings underscore the importance of tailored loss functions for imbalanced regression, particularly in forecasting rare but impactful events such as MHWs.

forecast, loss function, mhw forecast, (9 more...)

2502.13495

Country:

Indian Ocean (0.04)
Asia (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)