AITopics

Cost-effective localization methods for Autonomous Underwater Vehicle (AUV) navigation are key for ocean monitoring and data collection at high resolution in time and space. Algorithmic solutions suitable for real-time processing that handle nonlinear measurement models and different forms of measurement uncertainty will accelerate the development of field-ready technology. This paper details a Bayesian estimation method for landmark-aided navigation using a Side-scan Sonar (SSS) sensor. The method bounds navigation filter error in the GPS-denied undersea environment and captures the highly nonlinear nature of slant range measurements while remaining computationally tractable. Combining a novel measurement model with the chosen statistical framework facilitates the efficient use of SSS data and, in the future, could be used in real time. The proposed filter has two primary steps: a prediction step using an unscented transform and an update step utilizing particles. The update step performs probabilistic association of sonar detections with known landmarks. We evaluate algorithm performance and tractability using synthetic data and real data collected field experiments. Field experiments were performed using two different marine robotic platforms with two different SSS and at two different sites. Finally, we discuss the computational requirements of the proposed method and how it extends to real-time applications.

artificial intelligence, data mining, machine learning, (21 more...)

2503.079

Country:

North America > United States > California > San Diego County (0.28)
Oceania > Australia (0.28)
North America > Canada > Quebec (0.14)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Oil & Gas > Upstream (0.71)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Porfirio, David, Roberts, Mark, Hiatt, Laura M.

Uncertainty Expression for Human-Robot Task Communication

An underlying assumption of many existing approaches to human-robot task communication is that the robot possesses a sufficient amount of environmental domain knowledge, including the locations of task-critical objects. This assumption is unrealistic if the locations of known objects change or have not yet been discovered by the robot. In this work, our key insight is that in many scenarios, robot end users possess more scene insight than the robot and need ways to express it. Presently, there is a lack of research on how solutions for collecting end-user scene insight should be designed. We thereby created an Uncertainty Expression System (UES) to investigate how best to elicit end-user scene insight. The UES allows end users to convey their knowledge of object uncertainty using either: (1) a precision interface that allows meticulous expression of scene insight; (2) a painting interface by which users create a heat map of possible object locations; and (3) a ranking interface by which end users express object locations via an ordered list. We then conducted a user study to compare the effectiveness of these approaches based on the accuracy of scene insight conveyed to the robot, the efficiency at which end users are able to express this scene insight, and both usability and task load. Results indicate that the rank interface is more user friendly and efficient than the precision interface, and that the paint interface is the least accurate.

interface, robot, scene insight, (15 more...)

2503.16493

Country:

North America > United States > New York > New York County > New York City (0.15)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
(15 more...)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.87)
Research Report > New Finding (0.70)

Industry:

Government > Military > Navy (0.69)
Government > Regional Government > North America Government > United States Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.62)

Analysis of Learning-based Offshore Wind Power Prediction Models with Various Feature Combinations

Fang, Linhan, Jiang, Fan, Toms, Ann Mary, Li, Xingpeng

Accurate wind speed prediction is crucial for designing and selecting sites for offshore wind farms. This paper investigates the effectiveness of various machine learning models in predicting offshore wind power for a site near the Gulf of Mexico by analyzing meteorological data. After collecting and preprocessing meteorological data, nine different input feature combinations were designed to assess their impact on wind power predictions at multiple heights. The results show that using wind speed as the output feature improves prediction accuracy by approximately 10% compared to using wind power as the output. In addition, the improvement of multi-feature input compared with single-feature input is not obvious mainly due to the poor correlation among key features and limited generalization ability of models. These findings underscore the importance of selecting appropriate output features and highlight considerations for using machine learning in wind power forecasting, offering insights that could guide future wind power prediction models and conversion techniques.

prediction, wind power, wind speed, (11 more...)

2503.13493

Country:

North America > Mexico (0.35)
Atlantic Ocean > Gulf of Mexico (0.25)
North America > United States > Texas > Harris County > Houston (0.05)
(9 more...)

Genre: Research Report > New Finding (0.69)

Industry: Energy > Renewable > Wind (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data

Rallapalli, Swati, Gallagher, Shannon, Mellinger, Andrew O., Ratchford, Jasmine, Sinha, Anusha, Brooks, Tyler, Nichols, William R., Winski, Nick, Brown, Bryan

We study the efficacy of fine-tuning Large Language Models (LLMs) for the specific task of report (government archives, news, intelligence reports) summarization. While this topic is being very actively researched - our specific application set-up faces two challenges: (i) ground-truth summaries maybe unavailable (e.g., for government archives), and (ii) availability of limited compute power - the sensitive nature of the application requires that computation is performed on-premise and for most of our experiments we use one or two A100 GPU cards. Under this set-up we conduct experiments to answer the following questions. First, given that fine-tuning the LLMs can be resource intensive, is it feasible to fine-tune them for improved report summarization capabilities on-premise? Second, what are the metrics we could leverage to assess the quality of these summaries? We conduct experiments on two different fine-tuning approaches in parallel and our findings reveal interesting trends regarding the utility of fine-tuning LLMs. Specifically, we find that in many cases, fine-tuning helps improve summary quality and in other cases it helps by reducing the number of invalid or garbage summaries.

dataset, invalid summary, summarization, (14 more...)

2503.10676

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.88)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Enhancing Retrieval for ESGLLM via ESG-CID -- A Disclosure Content Index Finetuning Dataset for Mapping GRI and ESRS

Ahmed, Shafiuddin Rehan, Shah, Ankit Parag, Tran, Quan Hung, Khetan, Vivek, Kang, Sukryool, Mehta, Ankit, Bao, Yujia, Wei, Wei

Climate change has intensified the need for transparency and accountability in organizational practices, making Environmental, Social, and Governance (ESG) reporting increasingly crucial. Frameworks like the Global Reporting Initiative (GRI) and the new European Sustainability Reporting Standards (ESRS) aim to standardize ESG reporting, yet generating comprehensive reports remains challenging due to the considerable length of ESG documents and variability in company reporting styles. To facilitate ESG report automation, Retrieval-Augmented Generation (RAG) systems can be employed, but their development is hindered by a lack of labeled data suitable for training retrieval models. In this paper, we leverage an underutilized source of weak supervision -- the disclosure content index found in past ESG reports -- to create a comprehensive dataset, ESG-CID, for both GRI and ESRS standards. By extracting mappings between specific disclosure requirements and corresponding report sections, and refining them using a Large Language Model as a judge, we generate a robust training and evaluation set. We benchmark popular embedding models on this dataset and show that fine-tuning BERT-based models can outperform commercial embeddings and leading public models, even under temporal data splits for cross-report style transfer from GRI to ESRS

content index, fine-tuning, requirement, (15 more...)

2503.10674

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(4 more...)

Genre:

Research Report (0.64)
Public Relations > Community Relations (0.35)

Industry:

Law (1.00)
Energy > Renewable (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Exploring Bias in over 100 Text-to-Image Generative Models

Vice, Jordan, Akhtar, Naveed, Hartley, Richard, Mian, Ajmal

We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. While these platforms democratize AI, they also facilitate the spread of inherently biased models, often shaped by task-specific fine-tuning. Ensuring ethical and transparent AI deployment requires robust evaluation frameworks and quantifiable bias metrics. To this end, we assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Analyzing over 100 models, we reveal how bias patterns evolve over time and across generative tasks. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased. By identifying these systemic trends, we contribute a large-scale evaluation corpus to inform bias research and mitigation strategies, fostering more responsible AI development.

eulerdiscretescheduler 1024, photo realism pndmscheduler 512, pndmscheduler 512, (13 more...)

2503.08012

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Tibet Autonomous Region (0.04)
Oceania > Australia > Western Australia (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.86)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models

Zhao, Han, Song, Wenxuan, Wang, Donglin, Tong, Xinyang, Ding, Pengxiang, Cheng, Xuelian, Ge, Zongyuan

Developing versatile quadruped robots that can smoothly perform various actions and tasks in real-world environments remains a significant challenge. This paper introduces a novel vision-language-action (VLA) model, mixture of robotic experts (MoRE), for quadruped robots that aim to introduce reinforcement learning (RL) for fine-tuning large-scale VLA models with a large amount of mixed-quality data. MoRE integrates multiple low-rank adaptation modules as distinct experts within a dense multi-modal large language model (MLLM), forming a sparse-activated mixture-of-experts model. This design enables the model to effectively adapt to a wide array of downstream tasks. Moreover, we employ a reinforcement learning-based training objective to train our model as a Q-function after deeply exploring the structural properties of our tasks. Effective learning from automatically collected mixed-quality data enhances data efficiency and model performance. Extensive experiments demonstrate that MoRE outperforms all baselines across six different skills and exhibits superior generalization capabilities in out-of-distribution scenarios. We further validate our method in real-world scenarios, confirming the practicality of our approach and laying a solid foundation for future research on multi-task learning in quadruped robots.

arxiv, language model, robot, (14 more...)

2503.08007

Country:

Asia > China (0.05)
Oceania > Australia (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach

Zhao, Taoxu, Li, Meisi, Chen, Kehao, Wang, Liye, Zhou, Xucheng, Chaturvedi, Kunal, Prasad, Mukesh, Anaissi, Ali, Braytee, Ali

Multimodal sentiment analysis enhances conventional sentiment analysis, which traditionally relies solely on text, by incorporating information from different modalities such as images, text, and audio. This paper proposes a novel multimodal sentiment analysis architecture that integrates text and image data to provide a more comprehensive understanding of sentiments. For text feature extraction, we utilize BERT, a natural language processing model. For image feature extraction, we employ DINOv2, a vision-transformer-based model. The textual and visual latent features are integrated using proposed fusion techniques, namely the Basic Fusion Model, Self-Attention Fusion Model, and Dual-Attention Fusion Model. Experiments on three datasets--the Memotion 7k dataset, MVSA-single dataset, and MVSA-multi dataset--demonstrate the viability and practicality of the proposed multimodal architecture.

dataset, fusion model, sentiment analysis, (17 more...)

2503.07943

Country:

Oceania > Australia (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LLM-based Corroborating and Refuting Evidence Retrieval for Scientific Claim Verification

Wang, Siyuan, Foulds, James R., Gani, Md Osman, Pan, Shimei

In this paper, we introduce CIBER (Claim Investigation Based on Evidence Retrieval), an extension of the Retrieval-Augmented Generation (RAG) framework designed to identify corroborating and refuting documents as evidence for scientific claim verification. CIBER addresses the inherent uncertainty in Large Language Models (LLMs) by evaluating response consistency across diverse interrogation probes. By focusing on the behavioral analysis of LLMs without requiring access to their internal information, CIBER is applicable to both white-box and black-box models. Furthermore, CIBER operates in an unsupervised manner, enabling easy generalization across various scientific domains. Comprehensive evaluations conducted using LLMs with varying levels of linguistic proficiency reveal CIBER's superior performance compared to conventional RAG approaches. These findings not only highlight the effectiveness of CIBER but also provide valuable insights for future advancements in LLM-based scientific claim verification.

ciber, dataset, llm response, (11 more...)

2503.07937

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.95)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.94)
Health & Medicine > Therapeutic Area > Vaccines (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Visual and Text Prompt Segmentation: A Novel Multi-Model Framework for Remote Sensing

Zi, Xing, Jin, Kairui, Tao, Xian, Li, Jun, Braytee, Ali, Shah, Rajiv Ratn, Prasad, Mukesh

Pixel-level segmentation is essential in remote sensing, where foundational vision models like CLIP and Segment Anything Model(SAM) have demonstrated significant capabilities in zero-shot segmentation tasks. Despite their advances, challenges specific to remote sensing remain substantial. Firstly, The SAM without clear prompt constraints, often generates redundant masks, and making post-processing more complex. Secondly, the CLIP model, mainly designed for global feature alignment in foundational models, often overlooks local objects crucial to remote sensing. This oversight leads to inaccurate recognition or misplaced focus in multi-target remote sensing imagery. Thirdly, both models have not been pre-trained on multi-scale aerial views, increasing the likelihood of detection failures. To tackle these challenges, we introduce the innovative VTPSeg pipeline, utilizing the strengths of Grounding DINO, CLIP, and SAM for enhanced open-vocabulary image segmentation. The Grounding DINO+(GD+) module generates initial candidate bounding boxes, while the CLIP Filter++(CLIP++) module uses a combination of visual and textual prompts to refine and filter out irrelevant object bounding boxes, ensuring that only pertinent objects are considered. Subsequently, these refined bounding boxes serve as specific prompts for the FastSAM model, which executes precise segmentation. Our VTPSeg is validated by experimental and ablation study results on five popular remote sensing image segmentation datasets.

dataset, segmentation, vtpseg, (12 more...)

2503.07911

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
(2 more...)

Genre: Research Report (0.51)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)