AITopics | Dimakis, Alex

Collaborating Authors

Dimakis, Alex

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models

Daras, Giannis, Nie, Weili, Kreis, Karsten, Dimakis, Alex, Mardani, Morteza, Kovachki, Nikola Borislavov, Vahdat, Arash

arXiv.org Artificial IntelligenceOct-21-2024

Using image models naively for solving inverse video problems often suffers from flickering, texture-sticking, and temporal inconsistency in generated videos. To tackle these problems, in this paper, we view frames as continuous functions in the 2D space, and videos as a sequence of continuous warping transformations between different frames. This perspective allows us to train function space diffusion models only on images and utilize them to solve temporally correlated inverse problems. The function space diffusion models need to be equivariant with respect to the underlying spatial transformations. To ensure temporal consistency, we introduce a simple post-hoc test-time guidance towards (self)-equivariant solutions. Our method allows us to deploy state-of-the-art latent diffusion models such as Stable Diffusion XL to solve video inverse problems. We demonstrate the effectiveness of our method for video inpainting and $8\times$ video super-resolution, outperforming existing techniques based on noise transformations. We provide generated video results: https://giannisdaras.github.io/warped_diffusion.github.io/.

artificial intelligence, diffusion model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.16152

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors

Lingam, Vijay, Tejaswi, Atula, Vavre, Aditya, Shetty, Aneesh, Gudur, Gautham Krishna, Ghosh, Joydeep, Dimakis, Alex, Choi, Eunsol, Bojchevski, Aleksandar, Sanghavi, Sujay

arXiv.org Artificial IntelligenceMay-29-2024

Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze pre-trained model weights $W$ and inject learnable matrices $\Delta W$. These $\Delta W$ matrices are structured for efficient parameterization, often using techniques like low-rank approximations or scaling vectors. However, these methods typically show a performance gap compared to full fine-tuning. Although recent PEFT methods have narrowed this gap, they do so at the cost of additional learnable parameters. We propose SVFT, a simple approach that fundamentally differs from existing methods: the structure imposed on $\Delta W$ depends on the specific weight matrix $W$. Specifically, SVFT updates $W$ as a sparse combination of outer products of its singular vectors, training only the coefficients (scales) of these sparse combinations. This approach allows fine-grained control over expressivity through the number of coefficients. Extensive experiments on language and vision benchmarks show that SVFT recovers up to 96% of full fine-tuning performance while training only 0.006 to 0.25% of parameters, outperforming existing methods that only recover up to 85% performance using 0.03 to 0.8% of the trainable parameter budget.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.19597

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

Add feedback

On a Foundation Model for Operating Systems

Saxena, Divyanshu, Sharma, Nihal, Kim, Donghyun, Dwivedula, Rohit, Chen, Jiayi, Yang, Chenxi, Ravula, Sriram, Hu, Zichao, Akella, Aditya, Angel, Sebastian, Biswas, Joydeep, Chaudhuri, Swarat, Dillig, Isil, Dimakis, Alex, Godfrey, P. Brighten, Kim, Daehyeok, Rossbach, Chris, Wang, Gang

arXiv.org Artificial IntelligenceDec-12-2023

This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and creating the next generation of operating systems for the evolving computing landscape.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2312.07813

Country:

North America > United States > Texas (0.14)
North America > United States > Hawaii (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.51)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
(2 more...)

Add feedback

DataComp: In search of the next generation of multimodal datasets

Gadre, Samir Yitzhak, Ilharco, Gabriel, Fang, Alex, Hayase, Jonathan, Smyrnis, Georgios, Nguyen, Thao, Marten, Ryan, Wortsman, Mitchell, Ghosh, Dhruba, Zhang, Jieyu, Orgad, Eyal, Entezari, Rahim, Daras, Giannis, Pratt, Sarah, Ramanujan, Vivek, Bitton, Yonatan, Marathe, Kalyani, Mussmann, Stephen, Vencu, Richard, Cherti, Mehdi, Krishna, Ranjay, Koh, Pang Wei, Saukh, Olga, Ratner, Alexander, Song, Shuran, Hajishirzi, Hannaneh, Farhadi, Ali, Beaumont, Romain, Oh, Sewoong, Dimakis, Alex, Jitsev, Jenia, Carmon, Yair, Shankar, Vaishaal, Schmidt, Ludwig

arXiv.org Artificial IntelligenceOct-20-2023

Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl. Participants in our benchmark design new filtering techniques or curate new data sources and then evaluate their new dataset by running our standardized CLIP training code and testing the resulting model on 38 downstream test sets. Our benchmark consists of multiple compute scales spanning four orders of magnitude, which enables the study of scaling trends and makes the benchmark accessible to researchers with varying resources. Our baseline experiments show that the DataComp workflow leads to better training sets. In particular, our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming OpenAI's CLIP ViT-L/14 by 3.7 percentage points while using the same training procedure and compute. We release DataComp and all accompanying code at www.datacomp.ai.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2304.14108

Country:

Asia (0.67)
Europe (0.67)
North America > United States (0.45)

Genre:

Research Report > New Finding (1.00)
Overview (0.92)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.92)
Health & Medicine (0.92)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback