AITopics | Trabucco, Brandon

Collaborating Authors

Trabucco, Brandon

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Internet-Scale Training For Agents

Trabucco, Brandon, Sigurdsson, Gunnar, Piramuthu, Robinson, Salakhutdinov, Ruslan

arXiv.org Artificial IntelligenceFeb-10-2025

The predominant approach for training web navigation agents gathers human demonstrations for a set of popular websites and hand-written tasks, but it is becoming clear that human data are an inefficient resource. We develop a pipeline to facilitate Internet-scale training for agents without laborious human annotations. In the first stage, an LLM generates tasks for 150k diverse websites. In the next stage, LLM agents complete tasks and produce trajectories. In the final stage, an LLM reviews the trajectories and judges their success. Language models are competitive with human annotators, detecting and filtering out harmful content with an accuracy of 97%, generating feasible tasks with an 89% rate, and judging successful trajectories with an 82.6% accuracy. Scaling the pipeline, agents based on Llama 3.1 70B solve 16.7% of tasks for 150k sites. Training on the data generated by our pipeline is competitive with training on human demonstrations. In data-limited settings derived from Mind2Web and WebLINX, we improve Step Accuracy by up to +89.5% and +122.1% respectively for agents trained on mixtures of data from our pipeline, and human data. When training agents with all available human data from these benchmarks, agents fail to generalize to diverse real sites, and adding our data improves their generalization by +149.0% for WebLINX and +156.3% for Mind2Web. Code will be available at: data-for-agents.github.io.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.06776

Country:

North America > United States > California (0.14)
Europe > Middle East > Malta (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Government > Regional Government > North America Government > United States Government (0.67)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Understanding Visual Concepts Across Models

Trabucco, Brandon, Gurinas, Max, Doherty, Kyle, Salakhutdinov, Ruslan

arXiv.org Artificial IntelligenceJun-11-2024

Large multimodal models such as Stable Diffusion can generate, detect, and classify new visual concepts after fine-tuning just a single word embedding. Do models learn similar words for the same concepts (i.e. = orange + cat)? We conduct a large-scale analysis on three state-of-the-art models in text-to-image generation, open-set object detection, and zero-shot classification, and find that new word embeddings are model-specific and non-transferable. Across 4,800 new embeddings trained for 40 diverse visual concepts on four standard datasets, we find perturbations within an $\epsilon$-ball to any prior embedding that generate, detect, and classify an arbitrary concept. When these new embeddings are spliced into new models, fine-tuning that targets the original model is lost. We show popular soft prompt-tuning approaches find these perturbative solutions when applied to visual concept learning tasks, and embeddings for visual concepts are not transferable. Code for reproducing our work is available at: https://visual-words.github.io.

machine learning, natural language, visual concept, (19 more...)

arXiv.org Artificial Intelligence

2406.07506

Country:

North America > United States > Hawaii (0.14)
North America > United States > California (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(2 more...)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Leafy Spurge Dataset: Real-world Weed Classification Within Aerial Drone Imagery

Doherty, Kyle, Gurinas, Max, Samsoe, Erik, Casper, Charles, Larkin, Beau, Ramsey, Philip, Trabucco, Brandon, Salakhutdinov, Ruslan

arXiv.org Artificial IntelligenceMay-8-2024

Invasive plant species are detrimental to the ecology of both agricultural and wildland areas. Euphorbia esula, or leafy spurge, is one such plant that has spread through much of North America from Eastern Europe. When paired with contemporary computer vision systems, unmanned aerial vehicles, or drones, offer the means to track expansion of problem plants, such as leafy spurge, and improve chances of controlling these weeds. We gathered a dataset of leafy spurge presence and absence in grasslands of western Montana, USA, then surveyed these areas with a commercial drone. We trained image classifiers on these data, and our best performing model, a pre-trained DINOv2 vision transformer, identified leafy spurge with 0.84 accuracy (test set). This result indicates that classification of leafy spurge is tractable, but not solved. We release this unique dataset of labelled and unlabelled, aerial drone imagery for the machine learning community to explore. Improving classification performance of leafy spurge would benefit the fields of ecology, conservation, and remote sensing alike. Code and data are available at our website: leafy-spurge-dataset.github.io.

artificial intelligence, machine learning, spurge, (14 more...)

arXiv.org Artificial Intelligence

2405.03702

Country: North America > United States > Montana (0.69)

Genre: Research Report (0.51)

Industry: Food & Agriculture > Agriculture (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Stylus: Automatic Adapter Selection for Diffusion Models

Luo, Michael, Wong, Justin, Trabucco, Brandon, Huang, Yanping, Gonzalez, Joseph E., Chen, Zhifeng, Salakhutdinov, Ruslan, Stoica, Ion

arXiv.org Artificial IntelligenceApr-29-2024

Beyond scaling base models with more data or parameters, fine-tuned adapters provide an alternative way to generate high fidelity, custom images at reduced costs. As such, adapters have been widely adopted by open-source communities, accumulating a database of over 100K adapters-most of which are highly customized with insufficient descriptions. This paper explores the problem of matching the prompt to a set of relevant adapters, built on recent work that highlight the performance gains of composing adapters. We introduce Stylus, which efficiently selects and automatically composes task-specific adapters based on a prompt's keywords. Stylus outlines a three-stage approach that first summarizes adapters with improved descriptions and embeddings, retrieves relevant adapters, and then further assembles adapters based on prompts' keywords by checking how well they fit the prompt. To evaluate Stylus, we developed StylusDocs, a curated dataset featuring 75K adapters with pre-computed adapter embeddings. In our evaluation on popular Stable Diffusion checkpoints, Stylus achieves greater CLIP-FID Pareto efficiency and is twice as preferred, with humans and multimodal models as evaluators, over the base model. See stylus-diffusion.github.io for more.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2404.18928

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Effective Data Augmentation With Diffusion Models

Trabucco, Brandon, Doherty, Kyle, Gurinas, Max, Salakhutdinov, Ruslan

arXiv.org Artificial IntelligenceMay-25-2023

Data augmentation is one of the most prevalent tools in deep learning, underpinning many recent advances, including those from classification, generative models, and representation learning. The standard approach to data augmentation combines simple transformations like rotations and flips to generate new images from existing ones. However, these new images lack diversity along key semantic axes present in the data. Current augmentations cannot alter the high-level semantic attributes, such as animal species present in a scene, to enhance the diversity of data. We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models. Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples. We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.07944

Country:

North America > United States > Texas > Terry County (0.14)
North America > United States > Texas > Gaines County (0.14)
North America > United States > Montana (0.14)
(3 more...)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Li, Xuanlin, Trabucco, Brandon, Park, Dong Huk, Luo, Michael, Shen, Sheng, Darrell, Trevor, Gao, Yang

arXiv.org Artificial IntelligenceOct-27-2021

The predominant approach for language modeling is to process sequences from left to right, but this eliminates a source of information: the order by which the sequence was generated. One strategy to recover this information is to decode both the content and ordering of tokens. Existing approaches supervise content and ordering by designing problem-specific loss functions and pre-training with an ordering pre-selected. Other recent works use iterative search to discover problem-specific orderings for training, but suffer from high time complexity and cannot be efficiently parallelized. We address these limitations with an unsupervised parallelizable learner that discovers high-quality generation orders purely from training data -- no domain knowledge required. The learner contains an encoder network and decoder language model that perform variational inference with autoregressive orders (represented as permutation matrices) as latent variables. The corresponding ELBO is not differentiable, so we develop a practical algorithm for end-to-end optimization using policy gradients. We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass. Permutations then serve as target generation orders for training an insertion-based Transformer language model. Empirical results in language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2110.15797

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.46)

Genre: Research Report > New Finding (0.45)

Industry:

Government (0.70)
Leisure & Entertainment > Sports (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback