AITopics

2409.18529

Country:

Southern Ocean > Weddell Sea (0.04)
North America (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Chakraborty, Anirban, Katzfuss, Matthias

Learning non-Gaussian spatial distributions via Bayesian transport maps with parametric shrinkage

arXiv.org Machine Learning, Sep-27-2024

Many applications, including climate-model analysis and stochastic weather generators, require learning or emulating the distribution of a high-dimensional and non-Gaussian spatial field based on relatively few training samples. To address this challenge, a recently proposed Bayesian transport map (BTM) approach consists of a triangular transport map with nonparametric Gaussian-process (GP) components, which is trained to transform the distribution of interest to a Gaussian reference distribution. To improve the performance of this existing BTM, we propose to shrink the map components toward a "base" parametric Gaussian family combined with a Vecchia approximation for scalability. The resulting ShrinkTM approach is more accurate than the existing BTM, especially for small numbers of training samples. It can even outperform the "base" family when trained on a single sample of the spatial field. We demonstrate the advantage of ShrinkTM through numerical experiments on simulated data and on climate-model output.
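To make the idea of a triangular map to a Gaussian reference concrete, here is a minimal sketch of the simplest possible case: a linear lower-triangular map fitted to Gaussian samples. It is not the BTM or ShrinkTM method (there are no GP components, no shrinkage, and no Vecchia approximation); the function names, the toy covariance, and the jitter value are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_triangular_map(samples):
    """Estimate a mean and lower-triangular factor from samples of shape (n, d)."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    # A small jitter (illustrative value) keeps the Cholesky factorisation stable.
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(cov.shape[0]))
    return mean, chol

def apply_map(samples, mean, chol):
    """Map samples toward N(0, I) by solving chol @ z = (y - mean).

    Because chol is lower triangular, component i of z depends only on
    y_1, ..., y_i, which is what makes the map triangular.
    """
    return np.linalg.solve(chol, (samples - mean).T).T

def sample_from_map(mean, chol, n, rng):
    """Invert the map: push reference draws z ~ N(0, I) back to the data space."""
    z = rng.standard_normal((n, mean.shape[0]))
    return mean + z @ chol.T

rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.8, 0.5],
                     [0.8, 1.0, 0.8],
                     [0.5, 0.8, 1.0]])
y = rng.multivariate_normal(np.zeros(3), true_cov, size=2000)

mean, chol = fit_linear_triangular_map(y)
z = apply_map(y, mean, chol)           # approximately standard normal
new_fields = sample_from_map(mean, chol, 5, rng)
```

In this linear setting the fitted map whitens the training samples by construction; the nonparametric components described in the abstract are what allow non-Gaussian structure to be captured.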

shrinktm, transport map, vecchia approximation

2409.19208

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
South America (0.04)
Pacific Ocean (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

WIRED, Sep-26-2024, 12:00:00 GMT

Is AI More Sustainable if You Generate it Underwater?

AI data centers are so hot right now. Each time generative AI services churn through their large language models to make a chatbot answer one of your questions, it takes a great deal of processing power to sift through all that data. Doing so can use massive amounts of energy, which means the proliferation of AI is raising questions about how sustainable this tech actually is and how it affects the ecosystems around it. Some companies think they have a solution: running those data centers underwater, where they can use the surrounding seawater to cool and better control the temperature of the hard-working GPUs inside. But it turns out just plopping something into the ocean isn't always a foolproof plan for reducing its environmental impact.

paresh dave, underwater, underwater data

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.06)
North America > United States > New York (0.06)
North America > United States > California > San Francisco County > San Francisco (0.06)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Mobile (0.94)

Cavusoglu, Devrim, Sen, Secil, Sert, Ulas

DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking

arXiv.org Artificial Intelligence, Sep-26-2024

Recent advancements in Natural Language Processing (NLP) have impacted numerous sub-fields such as natural language generation, natural language inference, question answering, and more. However, in the field of question generation, the creation of distractors for multiple-choice questions (MCQ) remains a challenging task. In this work, we present a simple, generic framework for distractor generation using readily available Pre-trained Language Models (PLMs). Unlike previous methods, our framework relies solely on pre-trained language models and does not require additional training on specific datasets. Building upon previous research, we introduce a two-stage framework consisting of candidate generation and candidate selection. Our proposed distractor generation framework outperforms previous methods without the need for training or fine-tuning. Human evaluations confirm that our approach produces more effective and engaging distractors. The related codebase is publicly available at https://github.com/obss/disgem.

large language model, machine learning, natural language

2409.18263

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Croatia (0.05)
North America > United States > Colorado (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Sep-26-2024

Pixel-Space Post-Training of Latent Diffusion Models

Zhang, Christina, Motwani, Simran, Yu, Matthew, Hou, Ji, Juefei-Xu, Felix, Tsai, Sam, Vajda, Peter, He, Zijian, Wang, Jialiang

Latent diffusion models (LDMs) have made significant advancements in the field of image generation in recent years. One major advantage of LDMs is their ability to operate in a compressed latent space, allowing for more efficient training and deployment. However, despite these advantages, challenges with LDMs remain. For example, it has been observed that LDMs often generate high-frequency details and complex compositions imperfectly. We hypothesize that one reason for these flaws is that all pre-training and post-training of LDMs is done in latent space, which typically has $8 \times 8$ lower spatial resolution than the output images. To address this issue, we propose adding pixel-space supervision in the post-training process to better preserve high-frequency details. Experimentally, we show that adding a pixel-space objective improves both supervised quality fine-tuning and preference-based post-training by a large margin on state-of-the-art DiT and U-Net diffusion models in both visual quality and visual flaw metrics, while maintaining the same text alignment quality.

arxiv preprint arxiv, diffusion model, fine-tuning

2409.17565

Country:

Pacific Ocean (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, Sep-26-2024

PGN: The RNN's New Successor is Effective for Long-Range Time Series Forecasting

Jia, Yuxin, Lin, Youfang, Yu, Jing, Wang, Shuo, Liu, Tianhao, Wan, Huaiyu

Because of the recurrent structure of RNNs, the long information propagation path limits their ability to capture long-term dependencies, causes gradient explosion/vanishing issues, and forces inefficient sequential execution. Based on this, we propose a novel paradigm called Parallel Gated Network (PGN) as the new successor to RNN. PGN directly captures information from previous time steps through the designed Historical Information Extraction (HIE) layer and leverages gated mechanisms to select and fuse it with the current time step information. This reduces the information propagation path to $\mathcal{O}(1)$, effectively addressing the limitations of RNN. To enhance PGN's performance in long-range time series forecasting tasks, we propose a novel temporal modeling framework called Temporal PGN (TPGN). TPGN incorporates two branches to comprehensively capture the semantic information of time series. One branch utilizes PGN to capture long-term periodic patterns while preserving their local characteristics. The other branch employs patches to capture short-term information and aggregate the global representation of the series. TPGN achieves a theoretical complexity of $\mathcal{O}(\sqrt{L})$, ensuring efficiency in its operations. Experimental results on five benchmark datasets demonstrate the state-of-the-art (SOTA) performance and high efficiency of TPGN, further confirming the effectiveness of PGN as the new successor to RNN in long-range time series forecasting. The code is available in this repository: https://github.com/Water2sea/TPGN.
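The contrast with a recurrent update can be illustrated with a toy layer that reads a window of earlier values for every time step at once and then gates the result against the current input. This is only a sketch of the general idea stated in the abstract, not the authors' HIE layer or PGN architecture: the scalar features, the fixed window, the tanh/sigmoid choices, and all names below are assumptions. The repository linked above is the reference for the actual model.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def parallel_gated_layer(x, w_hist, w_gate_h, w_gate_x, window):
    """Fuse a summary of past values with the current value for every step.

    x        : (T,) univariate series
    w_hist   : (window,) weights that summarise the previous `window` values
    w_gate_h, w_gate_x : scalar gate weights (hypothetical parameterisation)
    """
    T = x.shape[0]
    # Left-pad with zeros so early steps also have a full-length history.
    padded = np.concatenate([np.zeros(window), x])
    # history[t] = x[t-window], ..., x[t-1], gathered for all t in one indexing
    # operation, so no step waits on the output of the step before it.
    idx = np.arange(T)[:, None] + np.arange(window)[None, :]
    history = padded[idx]                      # shape (T, window)
    h = np.tanh(history @ w_hist)              # historical summary per step
    gate = sigmoid(w_gate_h * h + w_gate_x * x)
    # The gate selects between historical information and the current input.
    return gate * h + (1.0 - gate) * x

rng = np.random.default_rng(0)
x = rng.standard_normal(32)
w_hist = 0.5 * rng.standard_normal(4)
out = parallel_gated_layer(x, w_hist, 1.0, 1.0, window=4)
```

Each output reaches its history through a single matrix product rather than a chain of hidden-state updates, which is the sense in which the propagation path is short in this toy version.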

information, information propagation path, tpgn

2409.17703

Country:

Asia > China > Beijing > Beijing (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.88)

A Visual-Analytical Approach for Automatic Detection of Cyclonic Events in Satellite Observations

Agrawal, Akash, Mohapatra, Mayesh, Raja, Abhinav, Tiwari, Paritosh, Pattanaik, Vishwajeet, Jaiswal, Neeru, Agarwal, Arpit, Rathore, Punit

Estimating the location and intensity of tropical cyclones is crucial for predicting catastrophic weather events. In this study, we approach this task as a detection and regression problem, specifically over the North Indian Ocean (NIO) region, where best-track location and wind speed information serve as the labels. The current process for cyclone detection and intensity estimation relies on physics-based simulation studies, which are time-consuming; using only image features would automate the process and allow significantly faster and more accurate predictions. While conventional methods typically require substantial prior knowledge for training, we explore alternative approaches to improve efficiency. This research focuses on cyclone detection, intensity estimation, and related aspects using only image input and data-driven approaches, which should shorten inference time and automate the process, in contrast to the NWP models currently used at SAC. For algorithm development, a novel two-stage detection and intensity estimation module is proposed. In the first stage, detection, we localize the cyclone over an entire image captured by INSAT3D over the NIO. For the intensity estimation task, we propose a CNN-LSTM network with a ResNet-18 backbone that operates on cyclone-centered images and captures both temporal and spatial characteristics.

artificial intelligence, cyclone, machine learning

2410.08218

Country:

Asia > India > Karnataka > Bengaluru (0.05)
Asia > Myanmar (0.04)
Asia > India > Andhra Pradesh (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Energy (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Building Real-time Awareness of Out-of-distribution in Trajectory Prediction for Autonomous Vehicles

Guo, Tongfei, Banerjee, Taposh, Liu, Rui, Su, Lili

Trajectory prediction describes the motions of surrounding moving obstacles for an autonomous vehicle; it plays a crucial role in enabling timely decision-making, such as collision avoidance and trajectory replanning. Accurate trajectory planning is the key to reliable vehicle deployments in open-world environments, where unstructured obstacles bring in uncertainties that are impossible to fully capture by training data. For traditional machine learning tasks, such uncertainties are often addressed reasonably well via methods such as continual learning. On the one hand, naively applying those methods to trajectory prediction can result in continuous data collection and frequent model updates, which can be resource-intensive. On the other hand, the predicted trajectories can be far away from the true trajectories, leading to unsafe decision-making. In this paper, we aim to establish real-time awareness of out-of-distribution in trajectory prediction for autonomous vehicles. We focus on the challenging and practically relevant setting where the out-of-distribution is deceptive, that is, not easily detectable by human intuition. Drawing on the well-established techniques of sequential analysis, we build real-time awareness of out-of-distribution by monitoring prediction errors using quickest change-point detection (QCD). Our solutions are lightweight and can handle the occurrence of out-of-distribution at any time during trajectory prediction inference. Experimental results on multiple real-world datasets using a benchmark trajectory prediction model demonstrate the effectiveness of our methods.
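Monitoring prediction errors for a change can be sketched with CUSUM, a classical quickest change-point detection procedure. The abstract does not say which QCD statistic or error model the authors use, so everything specific here is an assumption: Gaussian errors with known pre-change and post-change means, a shared standard deviation, and an illustrative threshold.

```python
import numpy as np

def cusum_alarm_time(errors, pre_mean, post_mean, sigma, threshold):
    """Return the first index where the CUSUM statistic reaches `threshold`.

    Assumes errors are N(pre_mean, sigma^2) before the change and
    N(post_mean, sigma^2) after it. Returns None if no alarm is raised.
    """
    stat = 0.0
    for t, e in enumerate(errors):
        # Log-likelihood ratio of the post-change vs pre-change Gaussian.
        llr = (post_mean - pre_mean) * (e - 0.5 * (pre_mean + post_mean)) / sigma**2
        # Resetting at zero discards evidence gathered before the change.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None

# Deterministic toy stream: in-distribution errors of 0.5 for 50 steps,
# then out-of-distribution errors of 2.0.
errors = np.array([0.5] * 50 + [2.0] * 50)
alarm = cusum_alarm_time(errors, pre_mean=0.5, post_mean=2.0, sigma=0.5, threshold=10.0)
print(alarm)  # 52: three post-change steps of +4.5 each are needed to pass 10
```

The threshold trades detection delay against false alarms: raising it makes spurious alarms rarer but lengthens the wait after a genuine shift.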

detection, prediction, trajectory prediction

2409.17277

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Health & Medicine (1.00)
Automobiles & Trucks (1.00)
Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

SEA-ViT: Sea Surface Currents Forecasting Using Vision Transformer and GRU-Based Spatio-Temporal Covariance Modeling

Panboonyuen, Teerapong

Forecasting sea surface currents is essential for applications such as maritime navigation, environmental monitoring, and climate analysis, particularly in regions like the Gulf of Thailand and the Andaman Sea. This paper introduces SEA-ViT, an advanced deep learning model that integrates Vision Transformer (ViT) with bidirectional Gated Recurrent Units (GRUs) to capture spatio-temporal covariance for predicting sea surface currents (U, V) using high-frequency (HF) radar data. The name SEA-ViT is derived from "Sea Surface Currents Forecasting using Vision Transformer," highlighting the model's emphasis on ocean dynamics and its use of the ViT architecture to enhance forecasting capabilities. SEA-ViT is designed to unravel complex dependencies by leveraging a rich dataset spanning over 30 years and incorporating ENSO indices (El Niño, La Niña, and neutral phases) to address the intricate relationship between geographic coordinates and climatic variations. This development enhances the predictive capabilities for sea surface currents, supporting the efforts of the Geo-Informatics and Space Technology Development Agency (GISTDA) in Thailand's maritime regions. The code and pretrained models are available at https://github.com/kaopanboonyuen/gistda-ai-sea-surface-currents.

sea surface, sea surface current, surface current

2409.16313

Country:

Asia > Thailand (0.45)
Pacific Ocean > North Pacific Ocean > Gulf of Thailand (0.24)
Indian Ocean > Bay of Bengal > Andaman Sea (0.24)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.91)

Goal-based Neural Physics Vehicle Trajectory Prediction Model

Gan, Rui, Shi, Haotian, Li, Pei, Wu, Keshu, An, Bocheng, Li, Linheng, Ma, Junyi, Ma, Chengyuan, Ran, Bin

Vehicle trajectory prediction plays a vital role in intelligent transportation systems and autonomous driving, as it significantly affects vehicle behavior planning and control, thereby influencing traffic safety and efficiency. Numerous studies have addressed short-term vehicle trajectory prediction. However, long-term trajectory prediction remains a major challenge due to accumulated errors and uncertainties. Balancing accuracy with interpretability is another challenge in vehicle trajectory prediction. To address these challenges, this paper proposes a Goal-based Neural Physics Vehicle Trajectory Prediction Model (GNP). The GNP model simplifies vehicle trajectory prediction into a two-stage process: determining the vehicle's goal and then choosing the appropriate trajectory to reach this goal. The GNP model contains two sub-modules to achieve this process. The first sub-module employs a multi-head attention mechanism to accurately predict goals. The second sub-module integrates a deep learning model with a physics-based social force model to progressively predict the complete trajectory using the generated goals. The GNP demonstrates state-of-the-art long-term prediction accuracy compared to four baseline models. We provide interpretable visualization results to highlight the multi-modality and inherent nature of our neural physics framework. Additionally, ablation studies are performed to validate the effectiveness of our key designs.

prediction, trajectory, vehicle

2409.15182

Country:

North America > United States > Wisconsin > Dane County > Madison (0.15)
Asia > China > Jiangsu Province > Nanjing (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Ground > Road (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)