Presto


Matching Ranks Over Probability Yields Truly Deep Safety Alignment

Vega, Jason, Singh, Gagandeep

arXiv.org Artificial Intelligence

A frustratingly easy technique known as the prefilling attack has been shown to effectively circumvent the safety alignment of frontier LLMs by simply prefilling the assistant response with an affirmative prefix before decoding. In response, recent work proposed a supervised fine-tuning (SFT) defense using data augmentation to achieve a "deep" safety alignment, allowing the model to generate natural language refusals immediately following harmful prefills. Unfortunately, we show in this work that the "deep" safety alignment produced by such an approach is in fact not very deep. A generalization of the prefilling attack, which we refer to as the Rank-Assisted Prefilling (RAP) attack, can effectively extract harmful content from models fine-tuned with the data augmentation defense by selecting low-probability "harmful" tokens from the top 20 predicted next tokens at each step (thus ignoring high-probability "refusal" tokens). We argue that this vulnerability is enabled by the "gaming" of the SFT objective when the target distribution entropies are low: low fine-tuning loss is achieved by shifting large probability mass to a small number of refusal tokens while neglecting the high ranks of harmful tokens. We then propose a new perspective on achieving deep safety alignment by matching the token ranks of the target distribution, rather than their probabilities. This perspective yields a surprisingly simple fix to the data augmentation defense based on regularizing the attention placed on harmful prefill tokens, an approach we call PRefill attEntion STOpping (PRESTO). Adding PRESTO yields up to a 4.7x improvement in the mean StrongREJECT score under RAP attacks across three popular open-source LLMs, with little impact on model utility.
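The rank-selection step the abstract describes can be illustrated with a toy sketch. This is not the authors' implementation: the logits, vocabulary, and the "preferred token" are all hypothetical stand-ins showing how a decoder can ignore a high-probability refusal token and still find a target token within the top-20 ranks.

```python
import numpy as np

def rap_select(logits, allowed_ranks=20, prefer=None):
    """Toy sketch of rank-assisted selection: instead of taking the
    argmax (which a defended model steers toward refusal tokens),
    search the top-`allowed_ranks` candidates and pick a preferred
    continuation token if it appears among them."""
    order = np.argsort(logits)[::-1]      # token ids sorted by logit, descending
    top_k = order[:allowed_ranks]         # the top-20 window the attack searches
    if prefer is not None and prefer in top_k:
        return int(prefer)                # low-probability but high-rank target token
    return int(top_k[0])                  # otherwise fall back to greedy decoding

# Toy vocabulary of 100 tokens; token 3 plays the high-probability
# "refusal" token, token 7 the hypothetical low-probability target
# that nonetheless ranks inside the top 20.
rng = np.random.default_rng(0)
logits = rng.normal(size=100)
logits[3] = 10.0
logits[7] = 2.5
```

Greedy decoding on these logits returns the refusal token (index 3), while the rank-assisted variant reaches the target token (index 7) despite its small probability mass, which is exactly the gap between matching probabilities and matching ranks.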


PRESTO: Preimage-Informed Instruction Optimization for Prompting Black-Box LLMs

Chu, Jaewon, Lee, Seunghun, Kim, Hyunwoo J.

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved remarkable success across diverse domains, due to their strong instruction-following capabilities. This has led to increasing interest in optimizing instructions for black-box LLMs, which are widely used for their strong performance even though their internal parameters are inaccessible. To optimize instructions for black-box LLMs, recent methods employ white-box LLMs to generate candidate instructions from optimized soft prompts. However, white-box LLMs often map different soft prompts to the same instruction, leading to redundant queries. While previous studies regarded this many-to-one mapping as a structure that hinders optimization efficiency, we reinterpret it as useful prior knowledge that can accelerate the optimization. To this end, we introduce PREimage-informed inSTruction Optimization (PRESTO), a novel framework that leverages the preimage structure of soft prompts for efficient optimization. PRESTO consists of three key components: (1) score sharing, which shares the evaluation score with all soft prompts in a preimage; (2) preimage-based initialization, which selects initial data points that maximize search space coverage using preimage information; and (3) score consistency regularization, which enforces prediction consistency within each preimage. By leveraging preimages, PRESTO effectively obtains 14 times more scored data under the same query budget, resulting in more efficient optimization. Experimental results on 33 instruction optimization tasks demonstrate the superior performance of PRESTO. Code is available at https://github.com/mlvlab/PRESTO
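The score-sharing component can be sketched in a few lines. Everything here is a placeholder: `decode` stands in for the white-box LLM's soft-prompt-to-instruction mapping and `evaluate` for the black-box scoring query; the point is only that grouping soft prompts by their decoded instruction (the preimage) lets one query score the whole group.

```python
from collections import defaultdict

def share_scores(soft_prompts, decode, evaluate):
    """Sketch of PRESTO-style score sharing: soft prompts that decode
    to the same instruction form a preimage, and a single evaluation
    score is shared by every member, saving black-box queries."""
    preimages = defaultdict(list)
    for p in soft_prompts:
        preimages[decode(p)].append(p)    # group prompts by decoded instruction
    scores = {}
    for instruction, members in preimages.items():
        s = evaluate(instruction)         # one black-box query per preimage
        for p in members:
            scores[p] = s                 # ...shared across the whole preimage
    return scores, len(preimages)

# Toy example: five soft prompts collapse into two instructions,
# so two queries yield five scored data points.
prompts = ["p1", "p2", "p3", "p4", "p5"]
decode = lambda p: "Summarize the text." if p in ("p1", "p2", "p3") else "Translate."
evaluate = len                            # dummy stand-in for the real scorer
scores, n_queries = share_scores(prompts, decode, evaluate)
```

Here five prompts are scored with only two evaluation calls, mirroring the paper's claim of obtaining many more scored data points under the same query budget.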


Sensor-Adaptive Flood Mapping with Pre-trained Multi-Modal Transformers across SAR and Multispectral Modalities

Tanaka, Tomohiro, Tsutsumida, Narumasa

arXiv.org Artificial Intelligence

Floods are increasingly frequent natural disasters causing extensive human and economic damage, highlighting the critical need for rapid and accurate flood inundation mapping. While remote sensing technologies have advanced flood monitoring capabilities, operational challenges persist: single-sensor approaches face weather-dependent data availability and limited revisit periods, while multi-sensor fusion methods require substantial computational resources and large-scale labeled datasets. To address these limitations, this study introduces a novel sensor-flexible flood detection methodology by fine-tuning Presto, a lightweight (~0.4M parameters) multi-modal pre-trained transformer that processes both Synthetic Aperture Radar (SAR) and multispectral (MS) data at the pixel level. Our approach uniquely enables flood mapping using SAR-only, MS-only, or combined SAR+MS inputs through a single model architecture, addressing the critical operational need for rapid response with whatever sensor data becomes available first during disasters. We evaluated our method on the Sen1Floods11 dataset against the large-scale Prithvi-100M baseline (~100M parameters) across three realistic data availability scenarios. The proposed model achieved superior performance with an F1 score of 0.896 and mIoU of 0.886 in the optimal sensor-fusion scenario, outperforming the established baseline. Crucially, the model demonstrated robustness by maintaining effective performance in MS-only scenarios (F1: 0.893) and functional capabilities in challenging SAR-only conditions (F1: 0.718), confirming the advantage of multi-modal pre-training for operational flood mapping. Our parameter-efficient, sensor-flexible approach offers an accessible and robust solution for real-world disaster scenarios requiring immediate flood extent assessment regardless of sensor availability constraints.


Deploying Geospatial Foundation Models in the Real World: Lessons from WorldCereal

Butsko, Christina, Van Tricht, Kristof, Tseng, Gabriel, Milli, Giorgia, Rolnick, David, Cartuyvels, Ruben, Reshef, Inbal Becker, Szantoi, Zoltan, Kerner, Hannah

arXiv.org Artificial Intelligence

The increasing availability of geospatial foundation models has the potential to transform remote sensing applications such as land cover classification, environmental monitoring, and change detection. Despite promising benchmark results, the deployment of these models in operational settings is challenging and rare. Standardized evaluation tasks often fail to capture real-world complexities relevant for end-user adoption, such as data heterogeneity, resource constraints, and application-specific requirements. This paper presents a structured approach to integrating geospatial foundation models into operational mapping systems. Our protocol has three key steps: defining application requirements, adapting the model to domain-specific data, and conducting rigorous empirical testing. Using the Presto model in a case study for crop mapping, we demonstrate that fine-tuning a pre-trained model significantly improves performance over conventional supervised methods. Our results highlight the model's strong spatial and temporal generalization capabilities. Our protocol provides a replicable blueprint for practitioners and lays the groundwork for future research to operationalize foundation models in diverse remote sensing applications. Application of the protocol to the WorldCereal global crop-mapping system showcases the framework's scalability.


Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Yan, Xin, Cai, Yuxuan, Wang, Qiuyue, Zhou, Yuan, Huang, Wenhao, Yang, Huan

arXiv.org Artificial Intelligence

We introduce Presto, a novel video diffusion model designed to generate 15-second videos with long-range coherence and rich content. Extending video generation methods to maintain scenario diversity over long durations presents significant challenges. To address this, we propose a Segmented Cross-Attention (SCA) strategy, which splits hidden states into segments along the temporal dimension, allowing each segment to cross-attend to a corresponding sub-caption. SCA requires no additional parameters, enabling seamless incorporation into current DiT-based architectures. To facilitate high-quality long video generation, we build the LongTake-HD dataset, consisting of 261k content-rich videos with scenario coherence, annotated with an overall video caption and five progressive sub-captions. Experiments show that our Presto achieves 78.5% on the VBench Semantic Score and 100% on the Dynamic Degree, outperforming existing state-of-the-art video generation methods. This demonstrates that our proposed Presto significantly enhances content richness, maintains long-range coherence, and captures intricate textual details. More details are available on our project page: https://presto-video.github.io/.
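The segment-and-attend idea can be sketched with single-head, unprojected attention in NumPy. Shapes, the equal-size split, and the absence of learned projections are all simplifications of the DiT setting described in the abstract, but the sketch does show the key property: each temporal segment attends only to the embedding of its own sub-caption, and no new parameters are introduced.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def segmented_cross_attention(hidden, sub_captions):
    """Toy sketch of SCA: split the hidden states into equal segments
    along the temporal axis and let each segment cross-attend only to
    its own sub-caption embedding."""
    segments = np.array_split(hidden, len(sub_captions), axis=0)
    outputs = []
    for seg, cap in zip(segments, sub_captions):
        # (t, d) queries against (L, d) keys/values from one sub-caption
        attn = softmax(seg @ cap.T / np.sqrt(seg.shape[-1]))
        outputs.append(attn @ cap)
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
hidden = rng.normal(size=(20, 8))                   # 20 temporal positions, dim 8
caps = [rng.normal(size=(5, 8)) for _ in range(5)]  # five sub-caption embeddings
out = segmented_cross_attention(hidden, caps)
```

Each of the five 4-step segments only ever sees its paired sub-caption, which is how progressive sub-captions can steer different parts of a long video without a full cross-attention over all captions.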


PRESTO: Fast motion planning using diffusion models based on key-configuration environment representation

Seo, Mingyo, Cho, Yoonyoung, Sung, Yoonchang, Stone, Peter, Zhu, Yuke, Kim, Beomjoon

arXiv.org Artificial Intelligence

We introduce a learning-guided motion planning framework that provides initial seed trajectories using a diffusion model for trajectory optimization. Given a workspace, our method approximates the configuration space (C-space) obstacles through a key-configuration representation that consists of a sparse set of task-related key configurations, and uses this as an input to the diffusion model. The diffusion model integrates regularization terms that encourage collision avoidance and smooth trajectories during training, and trajectory optimization refines the generated seed trajectories to further correct any colliding segments. Our experimental results demonstrate that using high-quality trajectory priors, learned through our C-space-grounded diffusion model, enables efficient generation of collision-free trajectories in narrow-passage environments, outperforming prior learning- and planning-based baselines. Videos and additional materials can be found on the project page: https://kiwi-sherbet.github.io/PRESTO.


PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes

Cao, He, Shao, Yanjun, Liu, Zhiyuan, Liu, Zijing, Tang, Xiangru, Yao, Yuan, Li, Yu

arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) have seen growing adoption across various scientific disciplines. These advancements encourage the investigation of molecule-text modeling within synthetic chemistry, a field dedicated to designing and conducting chemical reactions to synthesize new compounds with desired properties and applications. Current approaches, however, often neglect the critical role of interactions among multiple molecule graphs in understanding chemical reactions, leading to suboptimal performance in synthetic chemistry tasks. This study introduces PRESTO (Progressive Pretraining Enhances Synthetic Chemistry Outcomes), a new framework that bridges the molecule-text modality gap by integrating a comprehensive benchmark of pretraining strategies and dataset configurations. It progressively improves multimodal LLMs through cross-modal alignment and multi-graph understanding. Our extensive experiments demonstrate that PRESTO offers competitive results in downstream synthetic chemistry tasks. The code can be found at https://github.com/IDEA-XL/PRESTO.


PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models

Lee, Yunjae, Kim, Hyeseong, Rhu, Minsoo

arXiv.org Artificial Intelligence

Training recommendation systems (RecSys) faces several challenges, as the "data preprocessing" stage must preprocess an ample amount of raw data and feed it to the GPU for training in a seamless manner. To sustain high training throughput, state-of-the-art solutions reserve a large fleet of CPU servers for preprocessing, which incurs substantial deployment cost and power consumption. Our characterization reveals that prior CPU-centric preprocessing is bottlenecked on feature generation and feature normalization operations, as it fails to exploit the abundant inter-/intra-feature parallelism in RecSys preprocessing. PreSto is a storage-centric preprocessing system leveraging In-Storage Processing (ISP), which offloads the bottlenecked preprocessing operations to our ISP units. We show that PreSto outperforms the baseline CPU-centric system with a 9.6x speedup in end-to-end preprocessing time, 4.3x enhancement in cost-efficiency, and 11.3x improvement in energy efficiency on average for production-scale RecSys preprocessing.


Lightweight, Pre-trained Transformers for Remote Sensing Timeseries

Tseng, Gabriel, Cartuyvels, Ruben, Zvonkov, Ivan, Purohit, Mirali, Rolnick, David, Kerner, Hannah

arXiv.org Artificial Intelligence

Machine learning methods for satellite data have a range of societally relevant applications, but labels used to train models can be difficult or impossible to acquire. Self-supervision is a natural solution in settings with limited labeled data, but current self-supervised models for satellite data fail to take advantage of the characteristics of that data, including the temporal dimension (which is critical for many applications, such as monitoring crop growth) and availability of data from many complementary sensors (which can significantly improve a model's predictive performance). We present Presto (the Pretrained Remote Sensing Transformer), a model pre-trained on remote sensing pixel-timeseries data. By designing Presto specifically for remote sensing data, we can create a significantly smaller but performant model. Presto excels at a wide variety of globally distributed remote sensing tasks and performs competitively with much larger models while requiring far less compute. Presto can be used for transfer learning or as a feature extractor for simple models, enabling efficient deployment at scale.
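The feature-extractor usage the abstract describes can be sketched generically. The encoder below is purely a placeholder (a mean-pool over time standing in for Presto's forward pass), and the shapes are invented; the sketch only shows the workflow of freezing a pre-trained pixel-timeseries model and handing its embeddings to a simple downstream model.

```python
import numpy as np

def extract_features(timeseries, encoder):
    """Sketch of using a frozen pre-trained model as a feature
    extractor: each pixel-timeseries is mapped to one embedding,
    and the stacked embeddings feed a simple downstream classifier."""
    return np.stack([encoder(ts) for ts in timeseries])

# Toy data: 50 pixels, each a 12-step timeseries with 4 bands.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 12, 4))

# Placeholder encoder: mean over time -> a 4-dim "embedding".
placeholder_encoder = lambda ts: ts.mean(axis=0)
feats = extract_features(X, placeholder_encoder)
```

In practice the resulting feature matrix would be passed to something like a logistic regression or random forest, which is what makes the frozen-encoder route cheap enough for deployment at scale.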


Predicting Rare Events by Shrinking Towards Proportional Odds

Faletto, Gregory, Bien, Jacob

arXiv.org Machine Learning

Training classifiers is difficult with severe class imbalance, but many rare events are the culmination of a sequence with much more common intermediate outcomes. For example, in online marketing a user first sees an ad, then may click on it, and finally may make a purchase; estimating the probability of purchases is difficult because of their rarity. We show both theoretically and through data experiments that the more abundant data in earlier steps may be leveraged to improve estimation of probabilities of rare events. We present PRESTO, a relaxation of the proportional odds model for ordinal regression. Instead of estimating weights for one separating hyperplane that is shifted by separate intercepts for each of the estimated Bayes decision boundaries between adjacent pairs of categorical responses, we estimate separate weights for each of these transitions. We impose an L1 penalty on the differences between weights for the same feature in adjacent weight vectors in order to shrink towards the proportional odds model. We prove that PRESTO consistently estimates the decision boundary weights under a sparsity assumption. Synthetic and real data experiments show that our method can estimate rare probabilities in this setting better than both logistic regression on the rare category, which fails to borrow strength from more abundant categories, and the proportional odds model, which is too inflexible.
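The regularizer described above, an L1 penalty on differences between adjacent transition weight vectors, can be written down directly. The shapes and lambda below are illustrative, not taken from the paper; the sketch only shows that the penalty vanishes exactly when all transitions share one weight vector, i.e. the proportional odds model.

```python
import numpy as np

def presto_penalty(W, lam):
    """Sketch of the PRESTO regularizer: W holds one weight vector per
    decision boundary between adjacent ordinal categories, shape
    (K-1, p). An L1 penalty on differences between adjacent rows
    shrinks the fit toward proportional odds (identical rows)."""
    return lam * np.abs(np.diff(W, axis=0)).sum()

# Under proportional odds all transition weights coincide, so the
# penalty is zero; any departure is charged lambda per unit of
# absolute difference.
W_po = np.tile([[1.0, -2.0, 0.5]], (3, 1))  # 3 boundaries, identical weights
W_relaxed = W_po.copy()
W_relaxed[2, 0] += 0.4                      # last transition deviates slightly
```

Tuning lambda then interpolates between fully separate per-transition logistic fits (lambda = 0) and the rigid proportional odds model (lambda large), which is how strength is borrowed from the abundant early transitions when estimating the rare one.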