AITopics | Pacific Ocean

Pacific Ocean

Waymo aims to offer paid robotaxi rides in Washington DC next year

Engadget, Mar-25-2025, 16:15:23 GMT

Waymo is continuing to expand its foothold across the US, having recently started offering paid robotaxi services in more parts of the San Francisco Bay Area. Next up are Atlanta and Miami, and now the company has revealed plans to offer its driverless Waymo One service in the nation's capital in 2026. Before that can happen, though, Waymo will need to get approval from regulators. The company says it will "continue to work closely with policymakers to formalize the regulations needed to operate without a human behind the wheel in the District." DC currently requires autonomous vehicles to have a human at the wheel, ready to take control if necessary.

artificial intelligence, robotaxi ride, washington dc, (3 more...)

Engadget

Country:

North America > United States > District of Columbia > Washington (0.40)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.28)
North America > United States > California > San Francisco County > San Francisco (0.28)
(2 more...)

Industry:

Government (0.85)
Transportation > Ground > Road (0.77)
Information Technology > Robotics & Automation (0.77)
Automobiles & Trucks (0.77)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Bigger But Not Better: Small Neural Language Models Outperform Large Language Models in Detection of Thought Disorder

Li, Changye, Xu, Weizhe, Pakhomov, Serguei, Bradley, Ellen, Ben-Zeev, Dror, Cohen, Trevor

arXiv.org Artificial Intelligence, Mar-25-2025

Disorganized thinking is a key diagnostic indicator of schizophrenia-spectrum disorders. Recently, clinical estimates of the severity of disorganized thinking have been shown to correlate with measures of how difficult speech transcripts would be for large language models (LLMs) to predict. However, LLMs' deployment challenges -- including privacy concerns, computational and financial costs, and lack of transparency of training data -- limit their clinical utility. We investigate whether smaller neural language models can serve as effective alternatives for detecting positive formal thought disorder, using the same sliding window based perplexity measurements that proved effective with larger models. Surprisingly, our results show that smaller models are more sensitive to linguistic differences associated with formal thought disorder than their larger counterparts. Detection capability declines beyond a certain model size and context length, challenging the common assumption of "bigger is better" for LLM-based applications. Our findings generalize across audio diaries and clinical interview speech samples from individuals with psychotic symptoms, suggesting a promising direction for developing efficient, cost-effective, and privacy-preserving screening tools that can be deployed in both clinical and naturalistic settings.
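The sliding-window perplexity measurement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the `logprob_fn` callable, and the default stride are hypothetical, and a toy uniform model stands in for a real neural language model. Each window is scored using only the context inside that window.

```python
import math
from typing import Callable, List, Sequence


def sliding_window_perplexity(
    tokens: Sequence[str],
    logprob_fn: Callable[[Sequence[str], str], float],
    window: int,
    stride: int = 1,
) -> List[float]:
    """Return one perplexity score per window of `window` tokens.

    `logprob_fn(context, token)` is a hypothetical stand-in for a language
    model: it returns the natural-log probability of `token` given `context`.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    scores = []
    last_start = max(len(tokens) - window, 0)
    for start in range(0, last_start + 1, stride):
        chunk = tokens[start:start + window]
        if not chunk:
            break
        # Sum log-probabilities of each token given the earlier tokens in the window.
        total = sum(logprob_fn(chunk[:i], chunk[i]) for i in range(len(chunk)))
        # Perplexity is the exponential of the mean negative log-likelihood.
        scores.append(math.exp(-total / len(chunk)))
    return scores


# Toy model (assumption): every token has probability 1/4 regardless of context.
def uniform_model(context: Sequence[str], token: str) -> float:
    return math.log(1.0 / 4.0)


window_scores = sliding_window_perplexity(list("abcdefghij"), uniform_model, window=4, stride=2)
```

With a real model, higher window scores would mark stretches of transcript that the model finds harder to predict; how those scores are aggregated for screening is left open here.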

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.20103

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Why Representation Engineering Works: A Theoretical and Empirical Study in Vision-Language Models

Tian, Bowei, Lyu, Xuntao, Liu, Meng, Wang, Hongyi, Li, Ang

arXiv.org Artificial Intelligence, Mar-25-2025

Representation Engineering (RepE) has emerged as a powerful paradigm for enhancing AI transparency by focusing on high-level representations rather than individual neurons or circuits. It has proven effective in improving interpretability and control, showing that representations can emerge, propagate, and shape final model outputs in large language models (LLMs). However, in Vision-Language Models (VLMs), visual input can override factual linguistic knowledge, leading to hallucinated responses that contradict reality. To address this challenge, we make the first attempt to extend RepE to VLMs, analyzing how multimodal representations are preserved and transformed. Building on our findings and drawing inspiration from successful RepE applications, we develop a theoretical framework that explains the stability of neural activity across layers using the principal eigenvector, uncovering the underlying mechanism of RepE. We empirically validate these intrinsic properties, demonstrating their broad applicability and significance. By bridging theoretical insights with empirical validation, this work transforms RepE from a descriptive tool into a structured theoretical framework, opening new directions for improving AI robustness, fairness, and transparency.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.2272

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
North America > United States > North Carolina (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Towards Long-Range ENSO Prediction with an Explainable Deep Learning Model

Chen, Qi, Cui, Yinghao, Hong, Guobin, Ashok, Karumuri, Pu, Yuchun, Zheng, Xiaogu, Zhang, Xuanze, Zhong, Wei, Zhan, Peng, Wang, Zhonglei

arXiv.org Artificial Intelligence, Mar-25-2025

El Niño-Southern Oscillation (ENSO) is one of the most prominent modes of inter-annual climate variability, characterized by shifts in sea surface temperatures (SST) across the tropical Pacific Ocean and the weakening of equatorial trade winds. Its evolution is governed by intricate air-sea interactions, posing significant challenges for long-term prediction. In this study, we introduce CTEFNet, a multivariate deep learning model that synergizes convolutional neural networks and transformers to enhance ENSO forecasting. By integrating multiple oceanic and atmospheric predictors, CTEFNet extends the effective forecast lead time to 20 months while mitigating the impact of the spring predictability barrier, outperforming both dynamical models and state-of-the-art deep learning approaches. Furthermore, CTEFNet offers physically meaningful and statistically significant insights through gradient-based sensitivity analysis, revealing the key precursor signals that govern ENSO dynamics, which align with well-established theories and reveal new insights about inter-basin interactions among the Pacific, Atlantic, and Indian Oceans. CTEFNet's superior predictive skill and interpretable sensitivity assessments underscore its potential for advancing climate prediction. Our findings highlight the importance of multivariate coupling in ENSO evolution and demonstrate the promise of deep learning in capturing complex climate dynamics with enhanced interpretability.

artificial intelligence, ctefnet, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.19502

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Fujian Province > Xiamen (0.05)
South America (0.04)
(18 more...)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering

Yang, Shuo, Luo, Siwen, Han, Soyeon Caren, Hovy, Eduard

arXiv.org Artificial Intelligence, Mar-24-2025

Visual Question Answering (VQA) requires reasoning across visual and textual modalities, yet Large Vision-Language Models (LVLMs) often lack integrated commonsense knowledge, limiting their robustness in real-world scenarios. To address this, we introduce MAGIC-VQA, a novel framework that enhances VQA by systematically integrating commonsense knowledge with LVLMs. MAGIC-VQA employs a three-stage process: (1) Explicit Knowledge Integration from external sources, (2) By-Type Post-Processing for contextual refinement, and (3) Implicit Knowledge Augmentation using a Graph Neural Network (GNN) for structured reasoning. GNNs bring greater depth to structured inference, enabling superior relational inference beyond LVLMs. MAGIC-VQA bridges a key gap by unifying commonsense knowledge with LVLM-driven reasoning, eliminating the need for extensive pre-training or complex prompt tuning. Our framework achieves state-of-the-art performance on benchmark datasets, significantly improving commonsense reasoning in VQA.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.18491

Country:

Atlantic Ocean (0.04)
North America > United States > Virginia (0.04)
Pacific Ocean (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

When is dataset cartography ineffective? Using training dynamics does not improve robustness against Adversarial SQuAD

Mandal, Paul K.

arXiv.org Artificial Intelligence, Mar-23-2025

In this paper, I investigate the effectiveness of dataset cartography for extractive question answering on the SQuAD dataset. I begin by analyzing annotation artifacts in SQuAD and evaluating the impact of two adversarial datasets, AddSent and AddOneSent, on an ELECTRA-small model. Using training dynamics, I partition SQuAD into easy-to-learn, ambiguous, and hard-to-learn subsets. I then compare the performance of models trained on these subsets to those trained on randomly selected samples of equal size. Results show that training on cartography-based subsets does not improve generalization to the SQuAD validation set or the AddSent adversarial set. While the hard-to-learn subset yields a slightly higher F1 score on the AddOneSent dataset, the overall gains are limited. These findings suggest that dataset cartography provides little benefit for adversarial robustness in SQuAD-style QA tasks. I conclude by comparing these results to prior findings on SNLI and discussing possible reasons for the observed differences.
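The partition into easy-to-learn, ambiguous, and hard-to-learn subsets can be sketched as follows. This is a minimal illustration of the usual dataset cartography recipe (mean and spread of the gold-answer probability across epochs), not the paper's code. The function names, the subset size `k`, and the toy probabilities are assumptions; for extractive QA the "gold probability" is taken here to be the model's probability of the gold answer span.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def training_dynamics(gold_probs: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Map each example id to (confidence, variability).

    confidence  = mean gold-answer probability across epochs
    variability = population standard deviation of that probability
    """
    return {ex: (mean(p), pstdev(p)) for ex, p in gold_probs.items()}


def cartography_subsets(gold_probs: Dict[str, List[float]], k: int) -> Dict[str, List[str]]:
    """Select the k most extreme examples for each cartography region."""
    stats = training_dynamics(gold_probs)
    by_confidence = sorted(stats, key=lambda ex: stats[ex][0])  # lowest confidence first
    by_variability = sorted(stats, key=lambda ex: stats[ex][1], reverse=True)
    return {
        "hard_to_learn": by_confidence[:k],        # lowest confidence
        "easy_to_learn": by_confidence[::-1][:k],  # highest confidence
        "ambiguous": by_variability[:k],           # highest variability
    }


# Toy per-epoch gold probabilities for four hypothetical examples.
toy_probs = {
    "a": [0.90, 0.95, 0.99],
    "b": [0.10, 0.05, 0.10],
    "c": [0.20, 0.80, 0.30],
    "d": [0.60, 0.70, 0.65],
}
subsets = cartography_subsets(toy_probs, k=1)
```

A random baseline of equal size, as used for comparison in the abstract, would simply sample `k` example ids uniformly from the same pool.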

dataset cartography, machine learning, question answering, (17 more...)

arXiv.org Artificial Intelligence

2503.1829

Country:

North America > United States > Texas > Travis County > Austin (0.28)
North America > United States > Colorado (0.05)
Europe > Italy > Tuscany > Florence (0.05)
(9 more...)

Genre: Research Report > New Finding (0.87)

Industry: Leisure & Entertainment > Sports > Football (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.37)

Embedding spatial context in urban traffic forecasting with contrastive pre-training

Low, Matthew, Prabowo, Arian, Xue, Hao, Salim, Flora

arXiv.org Artificial Intelligence, Mar-19-2025

Urban traffic forecasting is a commonly encountered problem, with wide-ranging applications in fields such as urban planning, civil engineering and transport. In this paper, we study the enhancement of traffic forecasting with pre-training, focusing on spatio-temporal graph methods. While various machine learning methods to solve traffic forecasting problems have been explored and extensively studied, a more contextual approach remains underexplored: studying how relevant non-traffic data can improve prediction performance on traffic forecasting problems. We call this data spatial context. We introduce a novel method of combining road and traffic information through the notion of a traffic quotient graph, a quotient graph formed from road geometry and traffic sensors. We also define a way to encode this relationship in the form of a geometric encoder, pre-trained using contrastive learning methods and enhanced with OpenStreetMap data. We introduce and discuss ways to integrate this geometric encoder with existing graph neural network (GNN)-based traffic forecasting models, using a contrastive pre-training paradigm. We demonstrate the potential for this hybrid model to improve generalisation and performance with zero additional traffic data. Code for this paper is available at https://github.com/mattchrlw/forecasting-on-new-roads.
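For readers unfamiliar with quotient graphs, the sketch below shows the generic construction: nodes are grouped into blocks, and two blocks are connected whenever any of their members are connected. It is not the paper's traffic quotient graph. The assignment of road segments to sensors, and all names used (`quotient_graph`, `segment_to_sensor`, the `r*` and `s*` ids), are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def quotient_graph(
    edges: Iterable[Tuple[str, str]],
    block_of: Dict[str, str],
) -> Set[Tuple[str, str]]:
    """Collapse an undirected graph onto the blocks of a node partition.

    `block_of` maps each original node to its block id. The result holds one
    undirected edge (stored as a sorted pair) per pair of distinct blocks that
    are linked by at least one original edge.
    """
    q_edges: Set[Tuple[str, str]] = set()
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:  # edges inside a single block disappear in the quotient
            q_edges.add((min(bu, bv), max(bu, bv)))
    return q_edges


# Toy road network: four road segments in a chain, two hypothetical sensors.
road_edges = [("r1", "r2"), ("r2", "r3"), ("r3", "r4")]
segment_to_sensor = {"r1": "s1", "r2": "s1", "r3": "s2", "r4": "s2"}
sensor_graph = quotient_graph(road_edges, segment_to_sensor)
```

How segments are actually grouped around sensors, and which geometric features are attached to each block, are design decisions the abstract does not specify.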

artificial intelligence, machine learning, node, (18 more...)

arXiv.org Artificial Intelligence

2503.1498

Country:

Oceania > Australia (0.14)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)

Genre: Research Report (1.00)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Iffy-Or-Not: Extending the Web to Support the Critical Evaluation of Fallacious Texts

Lim, Gionnieve, Kim, Juho, Perrault, Simon T.

arXiv.org Artificial Intelligence, Mar-18-2025

Social platforms have expanded opportunities for deliberation, with comments being used to inform one's opinion. However, using such information to form opinions is challenged by unsubstantiated or false content. To enhance the quality of opinion formation and potentially confer resistance to misinformation, we developed Iffy-Or-Not (ION), a browser extension that seeks to invoke critical thinking when reading texts. With three features guided by argumentation theory, ION highlights fallacious content, suggests diverse queries to probe it with, and offers deeper questions to consider and chat with others about. From a user study (N=18), we found that ION encourages users to be more attentive to the content, suggests queries that align with or are preferable to their own, and poses thought-provoking questions that expand their perspectives. However, some participants expressed aversion to ION due to misalignments with their information goals and thinking predispositions. Potential backfiring effects with ION are discussed.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2503.14412

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
(22 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (0.92)
Personal > Interview (0.46)
Research Report > Experimental Study (0.45)

Industry:

Media > News (1.00)
Information Technology > Services (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Social Media (1.00)
(6 more...)

Generative AI in Transportation Planning: A Survey

Da, Longchao, Chen, Tiejin, Li, Zhuoheng, Bachiraju, Shreyas, Yao, Huaiyuan, Li, Li, Dong, Yushun, Hu, Xiyang, Tu, Zhengzhong, Wang, Dongjie, Zhao, Yue, Zhou, Xuanyu, Pendyala, Ram, Stabler, Benjamin, Yang, Yezhou, Zhou, Xuesong, Wei, Hua

arXiv.org Artificial Intelligence, Mar-18-2025

The integration of generative artificial intelligence (GenAI) into transportation planning has the potential to revolutionize tasks such as demand forecasting, infrastructure design, policy evaluation, and traffic simulation. However, there is a critical need for a systematic framework to guide the adoption of GenAI in this interdisciplinary domain. In this survey, we, a multidisciplinary team of researchers spanning computer science and transportation engineering, present the first comprehensive framework for leveraging GenAI in transportation planning. Specifically, we introduce a new taxonomy that categorizes existing applications and methodologies into two perspectives: transportation planning tasks and computational techniques. From the transportation planning perspective, we examine the role of GenAI in automating descriptive, predictive, generative, simulation, and explainable tasks to enhance mobility systems. From the computational perspective, we detail advancements in data preparation, domain-specific fine-tuning, and inference strategies, such as retrieval-augmented generation and zero-shot learning tailored to transportation applications. Additionally, we address critical challenges, including data scarcity, explainability, bias mitigation, and the development of domain-specific evaluation frameworks that align with transportation goals like sustainability, equity, and system efficiency. This survey aims to bridge the gap between traditional transportation planning methodologies and modern AI techniques, fostering collaboration and innovation. By addressing these challenges and opportunities, we seek to inspire future research that ensures ethical, equitable, and impactful use of generative AI in transportation planning.

arxiv preprint arxiv, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.07158

Country:

North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
(20 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.97)

Where do Large Vision-Language Models Look at when Answering Questions?

Xing, Xiaoying, Kuo, Chia-Wen, Fuxin, Li, Niu, Yulei, Chen, Fan, Li, Ming, Wu, Ying, Wen, Longyin, Zhu, Sijie

arXiv.org Artificial Intelligence, Mar-18-2025

Large Vision-Language Models (LVLMs) have shown promising performance in vision-language understanding and reasoning tasks. However, their visual understanding behaviors remain underexplored. A fundamental question arises: to what extent do LVLMs rely on visual input, and which image regions contribute to their responses? It is non-trivial to interpret the free-form generation of LVLMs due to their complicated visual architecture (e.g., multiple encoders and multi-resolution) and variable-length outputs. In this paper, we extend existing heatmap visualization methods (e.g., iGOS++) to support LVLMs for open-ended visual question answering. We propose a method to select visually relevant tokens that reflect the relevance between generated answers and the input image. Furthermore, we conduct a comprehensive analysis of state-of-the-art LVLMs on benchmarks designed to require visual information to answer. Our findings offer several insights into LVLM behavior, including the relationship between focus region and answer correctness, differences in visual attention across architectures, and the impact of LLM scale on visual understanding. The code and data are available at https://github.com/bytedance/LVLM_Interpretation.

artificial intelligence, large language model, natural language, (12 more...)

arXiv.org Artificial Intelligence

2503.13891

Country:

Pacific Ocean > South Pacific Ocean (0.04)
Oceania > Nauru (0.04)
Oceania > Australia (0.04)
North America > United States > Oregon (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
