Almahairi, Amjad
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Pramanick, Shraman, Han, Guangxing, Hou, Rui, Nag, Sayan, Lim, Ser-Nam, Ballas, Nicolas, Wang, Qifan, Chellappa, Rama, Almahairi, Amjad
The ability of large language models (LLMs) to process visual inputs has given rise to general-purpose vision systems, unifying various vision-language (VL) tasks by instruction tuning. However, due to the enormous diversity in input-output formats in the vision domain, existing general-purpose models fail to successfully integrate segmentation and multi-image inputs with coarse-level tasks into a single framework. In this work, we introduce VistaLLM, a powerful visual system that addresses coarse- and fine-grained VL tasks over single and multiple input images using a unified framework. VistaLLM utilizes an instruction-guided image tokenizer that filters global embeddings using task descriptions to extract compressed and refined features from numerous images. Moreover, VistaLLM employs a gradient-aware adaptive sampling technique to represent binary segmentation masks as sequences, significantly improving over previously used uniform sampling. To bolster the desired capability of VistaLLM, we curate CoinIt, a comprehensive coarse-to-fine instruction tuning dataset with 6.8M samples. We also address the lack of multi-image grounding datasets by introducing a novel task, AttCoSeg (Attribute-level Co-Segmentation), which boosts the model's reasoning and grounding capability over multiple input images. Extensive experiments on a wide range of vision (V) and vision-language (VL) tasks demonstrate the effectiveness of VistaLLM, which achieves consistent state-of-the-art performance over strong baselines across all downstream tasks. Our project page can be found at https://shramanpramanick.github.io/VistaLLM/.
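The "gradient-aware adaptive sampling" step can be pictured with a small sketch. This is an illustrative reconstruction, not VistaLLM's implementation: the curvature proxy, the cumulative-importance resampling, and the toy diamond contour are all assumptions made here to show how adaptive sampling can place more sequence points where a mask boundary bends than uniform sampling would.

```python
import numpy as np

def adaptive_contour_sample(contour, n_points=16, eps=1e-6):
    """Resample an ordered (K, 2) contour to n_points, placing more points
    where the boundary changes direction sharply (a rough stand-in for the
    'gradient-aware' sampling named in the abstract; the exact weighting
    used by VistaLLM is not reproduced here)."""
    d1 = np.gradient(contour, axis=0)                  # local tangent
    d2 = np.gradient(d1, axis=0)                       # change of tangent
    weight = np.linalg.norm(d2, axis=1) + eps          # curvature proxy
    cdf = np.cumsum(weight) / weight.sum()             # cumulative importance
    targets = (np.arange(n_points) + 0.5) / n_points   # evenly spaced in importance
    idx = np.searchsorted(cdf, targets)
    return contour[np.clip(idx, 0, len(contour) - 1)]

# Toy contour: a diamond traced densely; the corners attract more samples
s = np.linspace(0, 4, 400, endpoint=False)
diamond = np.stack([np.abs(s - 2) - 1,
                    np.abs((s + 1) % 4 - 2) - 1], axis=1)
print(adaptive_contour_sample(diamond, n_points=8))
```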
Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes
Huang, Kuan-Hao, Tan, Liang, Hou, Rui, Wang, Sinong, Almahairi, Amjad, Rinott, Ruty
Many real-world applications require making multiple predictions from the same text. Fine-tuning a large pre-trained language model for each downstream task incurs a heavy computational cost at inference time, since each task requires its own forward pass. A common way to amortize this cost is to freeze the language model and build lightweight models for downstream tasks on top of fixed text representations. The challenge then becomes learning fixed but general text representations that generalize well to unseen downstream tasks. Previous work has shown that the generalizability of representations can be improved by fine-tuning the pre-trained language model on a set of source tasks in a multi-task fashion. In this work, we propose a prefix-based method for learning fixed text representations from source tasks. We learn a task-specific prefix for each source task independently and combine them to obtain the final representations. Our experimental results show that prefix-based training performs better than multi-task training and can update the text representations at a smaller computational cost.
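As a rough sketch of how independently trained task-specific prefixes can be combined in front of a frozen encoder to produce fixed text representations (the module names, the mean-pooled readout, and the stand-in Transformer encoder below are assumptions, not the paper's code):

```python
import torch
import torch.nn as nn

class PrefixCombiner(nn.Module):
    """Minimal sketch: each source task contributes an independently trained
    prefix; the frozen encoder is run once with all prefixes prepended, and
    the resulting fixed representation can feed lightweight downstream heads."""
    def __init__(self, frozen_encoder, hidden_dim, prefix_len, num_source_tasks):
        super().__init__()
        self.encoder = frozen_encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad_(False)                      # keep the LM fixed
        # one learnable prefix per source task (trained independently)
        self.prefixes = nn.ParameterList(
            [nn.Parameter(torch.randn(prefix_len, hidden_dim) * 0.02)
             for _ in range(num_source_tasks)]
        )

    def forward(self, token_embeddings):                 # (batch, seq, hidden)
        batch = token_embeddings.size(0)
        prefix = torch.cat(list(self.prefixes), dim=0)   # (tasks*len, hidden)
        prefix = prefix.unsqueeze(0).expand(batch, -1, -1)
        hidden = self.encoder(torch.cat([prefix, token_embeddings], dim=1))
        return hidden.mean(dim=1)                        # fixed text representation

# Toy usage with a stand-in "frozen encoder"
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2)
model = PrefixCombiner(encoder, hidden_dim=64, prefix_len=4, num_source_tasks=3)
print(model(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 64])
```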
Llama 2: Open Foundation and Fine-Tuned Chat Models
Touvron, Hugo, Martin, Louis, Stone, Kevin, Albert, Peter, Almahairi, Amjad, Babaei, Yasmine, Bashlykov, Nikolay, Batra, Soumya, Bhargava, Prajjwal, Bhosale, Shruti, Bikel, Dan, Blecher, Lukas, Ferrer, Cristian Canton, Chen, Moya, Cucurull, Guillem, Esiobu, David, Fernandes, Jude, Fu, Jeremy, Fu, Wenyin, Fuller, Brian, Gao, Cynthia, Goswami, Vedanuj, Goyal, Naman, Hartshorn, Anthony, Hosseini, Saghar, Hou, Rui, Inan, Hakan, Kardas, Marcin, Kerkez, Viktor, Khabsa, Madian, Kloumann, Isabel, Korenev, Artem, Koura, Punit Singh, Lachaux, Marie-Anne, Lavril, Thibaut, Lee, Jenya, Liskovich, Diana, Lu, Yinghai, Mao, Yuning, Martinet, Xavier, Mihaylov, Todor, Mishra, Pushkar, Molybog, Igor, Nie, Yixin, Poulton, Andrew, Reizenstein, Jeremy, Rungta, Rashi, Saladi, Kalyan, Schelten, Alan, Silva, Ruan, Smith, Eric Michael, Subramanian, Ranjan, Tan, Xiaoqing Ellen, Tang, Binh, Taylor, Ross, Williams, Adina, Kuan, Jian Xiang, Xu, Puxin, Yan, Zheng, Zarov, Iliyan, Zhang, Yuchen, Fan, Angela, Kambadur, Melanie, Narang, Sharan, Rodriguez, Aurelien, Stojnic, Robert, Edunov, Sergey, Scialom, Thomas
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization
Razdaibiedina, Anastasia, Mao, Yuning, Hou, Rui, Khabsa, Madian, Lewis, Mike, Ba, Jimmy, Almahairi, Amjad
Prompt tuning is a successful approach for parameter-efficient tuning of pre-trained language models. Despite being arguably the most parameter-efficient method (tuned soft prompts constitute <0.1% of total parameters), it typically performs worse than other efficient tuning methods and is quite sensitive to hyper-parameters. In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning. We propose to reparameterize soft prompt embeddings using a shallow network with a residual connection. Our experiments show that Residual Prompt Tuning significantly outperforms prompt tuning on the SuperGLUE benchmark. Notably, our method achieves a +7 point improvement over prompt tuning with T5-Base and allows the prompt length to be reduced by 10x without hurting performance. In addition, we show that our approach is robust to the choice of learning rate and prompt initialization, and is effective in few-shot settings.
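A minimal sketch of the residual reparameterization described above, assuming a PyTorch setting; the bottleneck size, LayerNorm placement, and initialization are illustrative choices rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualPromptEncoder(nn.Module):
    """Soft prompt embeddings are passed through a shallow MLP and added back
    through a residual connection, instead of being tuned directly."""
    def __init__(self, prompt_len=10, hidden_dim=768, bottleneck=128):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_dim) * 0.02)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, hidden_dim),
            nn.LayerNorm(hidden_dim),
        )

    def forward(self):
        # residual reparameterization: prompt + MLP(prompt)
        return self.prompt + self.mlp(self.prompt)

enc = ResidualPromptEncoder()
print(enc().shape)  # torch.Size([10, 768]) -- prepended to the input embeddings
```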
Progressive Prompts: Continual Learning for Language Models
Razdaibiedina, Anastasia, Mao, Yuning, Hou, Rui, Khabsa, Madian, Lewis, Mike, Almahairi, Amjad
We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models. Our method allows forward transfer and resists catastrophic forgetting, without relying on data replay or a large number of task-specific parameters. Progressive Prompts learns a new soft prompt for each task and sequentially concatenates it with the previously learned prompts, while keeping the base model frozen. Experiments on standard continual learning benchmarks show that our approach outperforms state-of-the-art methods, with an improvement of >20% in average test accuracy over the previous best-performing method on the T5 model. We also explore a more challenging continual learning setup with longer sequences of tasks and show that Progressive Prompts significantly outperforms prior methods.
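A minimal sketch of the prompt-concatenation mechanic, assuming PyTorch; the prompt length, initialization, and concatenation order are illustrative choices, not the paper's exact setup:

```python
import torch
import torch.nn as nn

class ProgressivePrompts(nn.Module):
    """A fresh soft prompt is learned for each new task and concatenated with
    the (now frozen) prompts of earlier tasks; the base model never changes."""
    def __init__(self, hidden_dim=768, prompt_len=10):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.prompt_len = prompt_len
        self.prompts = nn.ParameterList()          # one prompt per seen task

    def start_new_task(self):
        for p in self.prompts:
            p.requires_grad_(False)                # freeze prompts of old tasks
        self.prompts.append(
            nn.Parameter(torch.randn(self.prompt_len, self.hidden_dim) * 0.02))

    def current_prompt(self):
        # concatenate all learned prompts (ordering here is an implementation detail)
        return torch.cat(list(self.prompts), dim=0)

pp = ProgressivePrompts()
pp.start_new_task()
pp.start_new_task()
print(pp.current_prompt().shape)  # torch.Size([20, 768])
```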
Uniform Masking Prevails in Vision-Language Pretraining
Verma, Siddharth, Lu, Yuchen, Hou, Rui, Yu, Hanchao, Ballas, Nicolas, Khabsa, Madian, Almahairi, Amjad
Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretraining. To implement MLM, the researcher must make two design choices: the masking strategy, which determines which tokens to mask, and the masking rate, which determines how many tokens to mask. Previous work has focused primarily on the masking strategy while setting the masking rate at a default of 15%. In this paper, we show that increasing this masking rate improves downstream performance while simultaneously reducing the performance gap among different masking strategies, rendering the uniform masking strategy competitive with other, more complex ones. Surprisingly, we also discover that increasing the masking rate leads to gains in Image-Text Matching (ITM) tasks, suggesting that the role of MLM goes beyond language modeling in VL pretraining.
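A minimal sketch of uniform masking with a configurable rate, assuming PyTorch; the special-token ids and the -100 ignore-index convention are placeholders, not tied to any particular tokenizer:

```python
import torch

def uniform_mask(token_ids, mask_token_id, mask_rate=0.40, special_ids=(0, 101, 102)):
    """Uniform MLM masking at a configurable rate. The paper's observation is
    that simply raising mask_rate above the conventional 15% already makes
    this strategy competitive with more elaborate ones."""
    token_ids = token_ids.clone()
    candidates = torch.ones_like(token_ids, dtype=torch.bool)
    for sid in special_ids:                      # never mask special tokens
        candidates &= token_ids != sid
    mask = (torch.rand(token_ids.shape) < mask_rate) & candidates
    labels = torch.where(mask, token_ids, torch.full_like(token_ids, -100))  # -100: ignored by the loss
    token_ids[mask] = mask_token_id
    return token_ids, labels

ids = torch.randint(1000, 2000, (2, 12))         # toy batch of token ids
masked, labels = uniform_mask(ids, mask_token_id=103, mask_rate=0.40)
print(masked)
print(labels)
```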
UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning
Mao, Yuning, Mathias, Lambert, Hou, Rui, Almahairi, Amjad, Ma, Hao, Han, Jiawei, Yih, Wen-tau, Khabsa, Madian
Conventional fine-tuning of pre-trained language models tunes all model parameters and stores a full model copy for each downstream task, which has become increasingly infeasible as model sizes grow. Recent parameter-efficient language model tuning (PELT) methods manage to match the performance of fine-tuning with far fewer trainable parameters and perform especially well when training data is limited. However, different PELT methods may perform rather differently on the same task, making it nontrivial to select the most appropriate method for a specific task, especially given the fast-growing number of new PELT methods and downstream tasks. In light of model diversity and the difficulty of model selection, we propose a unified framework, UniPELT, which incorporates different PELT methods as submodules and learns to activate the ones that best suit the current data or task setup. Remarkably, on the GLUE benchmark, UniPELT consistently achieves 1-3 point gains over the best individual PELT method that it incorporates and even outperforms fine-tuning under different setups. Moreover, UniPELT often surpasses the upper bound obtained by taking, for each task, the best performance of all its submodules used individually, indicating that a mixture of multiple PELT methods may be inherently more effective than any single method.
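An illustrative sketch of the gating idea, assuming PyTorch; the two stand-in submodules (a bottleneck adapter and a low-rank update) and the sequence-pooled sigmoid gates are simplifications, not UniPELT's exact design:

```python
import torch
import torch.nn as nn

class GatedPELTLayer(nn.Module):
    """Several parameter-efficient modules are attached to the same layer and
    small learned gates decide how strongly each one contributes."""
    def __init__(self, hidden_dim=768, bottleneck=64, rank=8):
        super().__init__()
        self.adapter = nn.Sequential(nn.Linear(hidden_dim, bottleneck), nn.ReLU(),
                                     nn.Linear(bottleneck, hidden_dim))
        self.lora_a = nn.Linear(hidden_dim, rank, bias=False)   # low-rank update
        self.lora_b = nn.Linear(rank, hidden_dim, bias=False)
        self.gate_adapter = nn.Linear(hidden_dim, 1)
        self.gate_lora = nn.Linear(hidden_dim, 1)

    def forward(self, hidden):                                  # (batch, seq, hidden)
        pooled = hidden.mean(dim=1, keepdim=True)
        g_a = torch.sigmoid(self.gate_adapter(pooled))          # gate for the adapter
        g_l = torch.sigmoid(self.gate_lora(pooled))             # gate for the low-rank path
        return hidden + g_a * self.adapter(hidden) + g_l * self.lora_b(self.lora_a(hidden))

layer = GatedPELTLayer()
print(layer(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```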
Adversarial Computation of Optimal Transport Maps
Leygonie, Jacob, She, Jennifer, Almahairi, Amjad, Rajeswar, Sai, Courville, Aaron
Computing optimal transport maps between high-dimensional and continuous distributions is a challenging problem in optimal transport (OT). Generative adversarial networks (GANs) are powerful generative models which have been successfully applied to learn maps across high-dimensional domains. However, little is known about the nature of the map learned with a GAN objective. To address this problem, we propose a generative adversarial model in which the discriminator's objective is the 2-Wasserstein metric. We show that during training, our generator follows the W_2-geodesic between the initial and the target distributions. As a consequence, it reproduces an optimal map at the end of training. We validate our approach empirically in both low-dimensional and high-dimensional continuous settings, and show that it outperforms prior methods on image data.
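For reference, the quantity the discriminator is trained to estimate can be written (in standard optimal-transport notation, not copied from the paper) as:

```latex
% Squared 2-Wasserstein distance between distributions P and Q:
% primal (Kantorovich) form, and the dual form that an adversarial critic can approximate.
W_2^2(P, Q)
  = \inf_{\gamma \in \Pi(P, Q)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert^2\right]
  = \sup_{\varphi(x) + \psi(y) \,\le\, \lVert x - y \rVert^2}
    \Big( \mathbb{E}_{x \sim P}[\varphi(x)] + \mathbb{E}_{y \sim Q}[\psi(y)] \Big)
```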
A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
Berard, Hugo, Gidel, Gauthier, Almahairi, Amjad, Vincent, Pascal, Lacoste-Julien, Simon
Generative adversarial networks have been very successful in generative modeling; however, they remain relatively hard to optimize compared to standard deep neural networks. In this paper, we try to gain insight into the optimization of GANs by looking at the game vector field resulting from the concatenation of the gradients of both players. Based on this point of view, we propose visualization techniques that allow us to make the following empirical observations. First, the training of GANs suffers from rotational behavior around locally stable stationary points, which, as we show, corresponds to the presence of imaginary components in the eigenvalues of the Jacobian of the game. Second, GAN training seems to converge to a stable stationary point which is a saddle point for the generator loss, not a minimum, while still achieving excellent performance. This counter-intuitive yet persistent observation raises the question of whether we actually need a Nash equilibrium to get good performance in GANs.
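The "game vector field" view can be made concrete on a toy bilinear game; this small numerical sketch (not from the paper) shows the purely imaginary Jacobian eigenvalues that produce the rotational behavior described above:

```python
import numpy as np

# For the bilinear game min_x max_y x*y, concatenating both players' gradients
# (each player descends its own loss) gives the game vector field v(x, y) = (y, -x).
def game_vector_field(x, y):
    return np.array([y, -x])

# Jacobian of v at the equilibrium (0, 0): eigenvalues are +i and -i,
# i.e. pure rotation around the stationary point.
jacobian = np.array([[0.0, 1.0],
                     [-1.0, 0.0]])
print(np.linalg.eigvals(jacobian))   # [0.+1.j  0.-1.j]

# Simultaneous gradient descent therefore circles (and slowly spirals away from)
# the equilibrium instead of converging to it:
z = np.array([1.0, 0.0])
for _ in range(5):
    z = z - 0.1 * game_vector_field(*z)
    print(z, np.linalg.norm(z))      # the norm grows by sqrt(1 + 0.1**2) each step
```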
Learning Distributed Representations from Reviews for Collaborative Filtering
Almahairi, Amjad, Kastner, Kyle, Cho, Kyunghyun, Courville, Aaron
Recent work has shown that collaborative filtering-based recommender systems can be improved by incorporating side information, such as natural language reviews, as a way of regularizing the derived product representations. Motivated by the success of this approach, we introduce two different models of reviews and study their effect on collaborative filtering performance. While the previous state-of-the-art approach is based on a latent Dirichlet allocation (LDA) model of reviews, the models we explore are neural network based: a bag-of-words product-of-experts model and a recurrent neural network. We demonstrate that the increased flexibility offered by the product-of-experts model allowed it to achieve state-of-the-art performance on the Amazon review dataset, outperforming the LDA-based approach. However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.
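A rough sketch of what a bag-of-words product-of-experts review model tied to product embeddings might look like; the parameterization below is an assumption for illustration, not the paper's model:

```python
import torch
import torch.nn as nn

class BoWProductOfExperts(nn.Module):
    """Each dimension of the product embedding acts as an expert over the
    vocabulary; the review's bag-of-words likelihood comes from the product of
    those experts, i.e. a softmax over the sum of expert logits. Maximizing
    this likelihood regularizes the product embeddings used for filtering."""
    def __init__(self, embed_dim=32, vocab_size=5000):
        super().__init__()
        self.expert_logits = nn.Parameter(torch.randn(embed_dim, vocab_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(vocab_size))

    def review_log_likelihood(self, product_emb, word_counts):
        # product_emb: (batch, embed_dim); word_counts: (batch, vocab_size)
        logits = product_emb @ self.expert_logits + self.bias
        log_probs = torch.log_softmax(logits, dim=-1)
        return (word_counts * log_probs).sum(dim=-1)      # per-review log-likelihood

poe = BoWProductOfExperts()
ll = poe.review_log_likelihood(torch.randn(4, 32),
                               torch.randint(0, 3, (4, 5000)).float())
print(ll.shape)  # torch.Size([4])
```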