
Collaborating Authors

 Choi, Matthew


Quantum linear algebra is all you need for Transformer architectures

arXiv.org Artificial Intelligence

Generative machine learning methods such as large language models are revolutionizing the creation of text and images. While these models are powerful, they also consume a large amount of computational resources. The transformer is a key component of large language models that aims to generate a suitable completion of a given partial sequence. In this work, we investigate transformer architectures through the lens of fault-tolerant quantum computing. The input model is one where trained weight matrices are given as block encodings, from which we construct the query, key, and value matrices for the transformer. We show how to prepare a block encoding of the self-attention matrix, with a new subroutine for the row-wise application of the softmax function. In addition, we combine quantum subroutines to construct important building blocks of the transformer: the residual connection and layer normalization, and the feed-forward neural network. Our subroutines prepare an amplitude encoding of the transformer output, which can be measured to obtain a prediction. Based on common open-source large language models, we provide insights into the behavior of important parameters determining the run time of the quantum algorithm. We discuss the potential and challenges for obtaining a quantum advantage.
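For reference, the classical quantities that these quantum subroutines block-encode can be written out in a few lines. The sketch below computes row-wise softmax self-attention, a residual connection with layer normalization, and a feed-forward network for a single transformer block. All weight names, shapes, and the ReLU activation are illustrative assumptions and are not taken from the paper, which works with block encodings of such matrices rather than explicit arrays.

import numpy as np

def softmax_rows(x):
    # Row-wise softmax: each row of the score matrix becomes a probability vector.
    x = x - x.max(axis=-1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance (no learned scale/shift).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def transformer_block(X, Wq, Wk, Wv, W1, W2):
    # X: (sequence_length, d) input embeddings; weights are illustrative dense arrays.
    d = Wq.shape[1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv             # query, key, value matrices
    A = softmax_rows(Q @ K.T / np.sqrt(d))       # self-attention matrix
    Y = layer_norm(X + A @ V)                    # residual connection + layer normalization
    F = np.maximum(Y @ W1, 0.0) @ W2             # feed-forward network (ReLU assumed)
    return layer_norm(Y + F)                     # second residual connection + layer norm

# Illustrative usage with random weights (shapes are arbitrary choices).
rng = np.random.default_rng(0)
n, d, d_ff = 8, 16, 32
X = rng.normal(size=(n, d))
out = transformer_block(
    X,
    rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, d)),
    rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d)),
)
print(out.shape)   # (8, 16)

Per the abstract, the quantum algorithm replaces these dense-matrix steps with block-encoding arithmetic and a quantum softmax subroutine, and returns an amplitude encoding of the output rather than an explicit vector.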


Large Language Models on Lexical Semantic Change Detection: An Evaluation

arXiv.org Artificial Intelligence

Lexical Semantic Change Detection stands out as one of the few areas where Large Language Models (LLMs) have not been extensively involved. Traditional methods like PPMI and SGNS remain prevalent in research, alongside newer BERT-based approaches. Despite the comprehensive coverage of various natural language processing domains by LLMs, there is a notable scarcity of literature concerning their application in this specific realm. In this work, we seek to bridge this gap by introducing LLMs into the domain of Lexical Semantic Change Detection. Our work presents novel prompting solutions and a comprehensive evaluation that spans all three generations of language models, contributing to the exploration of LLMs in this research area.
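For context on the traditional baselines mentioned above, the sketch below computes a simple PPMI-based change score: a word's PPMI context vector is built separately for two time-period corpora and compared by cosine distance. It only illustrates the PPMI family of methods; the function names, window size, and scoring choice are assumptions, and nothing here reproduces the paper's prompting solutions or evaluation.

import numpy as np

def ppmi_matrix(corpus, vocab, window=2):
    # Positive PMI over symmetric co-occurrence counts within +/- `window` tokens.
    idx = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    C[idx[w], idx[sent[j]]] += 1.0
    total = C.sum() or 1.0
    p_w = C.sum(axis=1, keepdims=True) / total   # row marginals
    p_c = C.sum(axis=0, keepdims=True) / total   # column marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((C / total) / (p_w * p_c))
    return np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

def semantic_change(corpus_old, corpus_new, target, window=2):
    # Cosine distance between the target word's PPMI vectors in the two periods.
    vocab = sorted({w for sent in corpus_old + corpus_new for w in sent})
    i = vocab.index(target)
    a = ppmi_matrix(corpus_old, vocab, window)[i]
    b = ppmi_matrix(corpus_new, vocab, window)[i]
    denom = (np.linalg.norm(a) * np.linalg.norm(b)) or 1.0
    return 1.0 - float(a @ b) / denom

# Tiny illustrative corpora standing in for two time periods.
old_corpus = [["the", "mouse", "ran", "under", "the", "table"]]
new_corpus = [["click", "the", "mouse", "button", "twice"]]
print(semantic_change(old_corpus, new_corpus, "mouse"))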


FlexModel: A Framework for Interpretability of Distributed Large Language Models

arXiv.org Artificial Intelligence

With the growth of large language models, now incorporating billions of parameters, the hardware prerequisites for their training and deployment have seen a corresponding increase. Although existing tools facilitate model parallelization and distributed training, deeper model interactions, crucial for interpretability and responsible AI techniques, still demand thorough knowledge of distributed computing. This often hinders contributions from researchers with machine learning expertise but limited distributed computing background. Addressing this challenge, we present FlexModel, a software package providing a streamlined interface for engaging with models distributed across multi-GPU and multi-node configurations. The library is compatible with existing model distribution libraries and encapsulates PyTorch models. It exposes user-registerable HookFunctions to facilitate straightforward interaction with distributed model internals, bridging the gap between distributed and single-device model paradigms. Primarily, FlexModel enhances accessibility by democratizing model interactions and promotes more inclusive research in the domain of large-scale neural networks.
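As a point of comparison for the single-device paradigm the library aims to preserve, the snippet below retrieves an intermediate activation with an ordinary PyTorch forward hook. This is deliberately not FlexModel's API (its HookFunction interface and distributed behavior are not reproduced here); it only shows the kind of model-internal access that becomes non-trivial once a model is sharded across GPUs or nodes.

import torch
import torch.nn as nn

# A toy stand-in model; the setting FlexModel targets is a model distributed
# across multiple GPUs or nodes, where this pattern no longer works directly.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
captured = {}

def save_activation(name):
    def hook(module, inputs, output):
        # Detach so the stored activation does not keep the autograd graph alive.
        captured[name] = output.detach().cpu()
    return hook

handle = model[0].register_forward_hook(save_activation("first_linear"))
_ = model(torch.randn(8, 16))
handle.remove()
print(captured["first_linear"].shape)   # torch.Size([8, 32])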


Learning quantum dynamics with latent neural ODEs

arXiv.org Artificial Intelligence

Deep learning and neural networks have recently become the powerhouse in machine learning (ML), and they have successfully been used to tackle complex problems in classical [1-3] and quantum mechanics [4-7] (see Refs. [8-12] for reviews). Machine-assisted scientific discovery is still in its infancy, but progress has been made, mostly by building the correct inductive bias, or structure, into the model or loss function. For example, physical conservation laws can be learned [1, 2]. Other work has made progress in a purely data-driven approach, learning relationships between quantum experiments and entanglement using generative models [13]. Recently, neural ordinary differential equations (ODEs) were introduced [14, 15], a neural network layer defined by differential equations. Neural ODEs provide the perfect model for physics, since many physical laws are governed by ODEs, and thus every neural ODE has the correct inductive bias built into the model itself. In general, the study of open quantum systems is important for quantum computing as well as many other areas of physics, from many-body phenomena [27, 28] and light-matter interaction [29-31] to non-equilibrium physics [32, 33]. Here, we demonstrate that latent ODEs can be trained to generate and extrapolate measurement data from dynamical quantum evolution in both closed and open quantum systems using only physical observations, without specifying the physics a priori. This is in line with treating the quantum system as a black box and the "shut up and calculate" philosophy [34], all the while ignoring ...
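To make the latent-ODE setup concrete, here is a minimal, self-contained sketch in plain PyTorch: a synthetic damped oscillation of a single observable stands in for the measurement record, an encoder maps the first few measurements to an initial latent state, a small network defines the latent dynamics, and a fixed-step Runge-Kutta integrator rolls the latent state forward before decoding back to the observable. The synthetic data, latent dimension, network sizes, and hand-written RK4 integrator are all assumptions for illustration and do not reproduce the paper's model, training procedure, or quantum data.

import torch
import torch.nn as nn

# Synthetic "measurement record": a damped oscillation of one observable,
# standing in for the physical observations described in the paper.
t = torch.linspace(0.0, 10.0, 200)
obs = (torch.exp(-0.1 * t) * torch.cos(2.0 * t)).unsqueeze(-1)   # shape (200, 1)

latent_dim = 2
f = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(), nn.Linear(64, latent_dim))  # dz/dt = f(z)
encoder = nn.Linear(10, latent_dim)   # first 10 measurements -> initial latent state z0
decoder = nn.Linear(latent_dim, 1)    # latent state -> predicted observable

def rk4_rollout(z, dt, steps):
    # Integrate dz/dt = f(z) with classical fixed-step Runge-Kutta (RK4).
    states = [z]
    for _ in range(steps - 1):
        k1 = f(z)
        k2 = f(z + 0.5 * dt * k1)
        k3 = f(z + 0.5 * dt * k2)
        k4 = f(z + dt * k3)
        z = z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        states.append(z)
    return torch.stack(states)        # shape (steps, latent_dim)

params = list(f.parameters()) + list(encoder.parameters()) + list(decoder.parameters())
opt = torch.optim.Adam(params, lr=1e-2)
dt = (t[1] - t[0]).item()

for epoch in range(300):
    opt.zero_grad()
    z0 = encoder(obs[:10].flatten())               # infer z0 from early measurements
    pred = decoder(rk4_rollout(z0, dt, len(t)))    # roll latent forward, decode observable
    loss = ((pred - obs) ** 2).mean()
    loss.backward()
    opt.step()

# After training, calling rk4_rollout with a larger `steps` extrapolates beyond
# the observed time window, mirroring the extrapolation task described above.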