AITopics | qlm

One Queue Is All You Need: Resolving Head-of-Line Blocking in Large Language Model Serving

Patke, Archit, Reddy, Dhemath, Jha, Saurabh, Qiu, Haoran, Pinto, Christian, Cui, Shengkun, Narayanaswami, Chandra, Kalbarczyk, Zbigniew, Iyer, Ravishankar

arXiv.org Artificial Intelligence, Jun-5-2024

Large language models (LLMs) have become an increasingly important workload for cloud providers catering to both enterprise and consumer applications. LLM inference requests from these applications have end-to-end latency SLOs that must be adhered to in production settings. However, existing LLM serving systems focus on optimization objectives such as request serving throughput or request execution latency rather than the end-to-end latency SLOs. Achieving end-to-end SLOs for latency-sensitive requests is challenging due to head-of-line (HOL) blocking in the request queue, which results from bursty arrival rates and insufficient resources. To address this challenge, we propose QLM, a multi-model queue management framework for LLM serving. QLM uses stochastic programming to orchestrate the actions of multiple LLM Serving Operations (LSOs) to reduce HOL blocking and maximize SLO attainment. Specifically, QLM uses the following LSOs: model swapping, request eviction, GPU-CPU state swapping, load balancing, and warm model start. Evaluation on heterogeneous GPU devices and models with a real-world LLM serving dataset shows that QLM improves SLO attainment by 40-90% and throughput by 20-400% while maintaining or improving device utilization compared to other state-of-the-art LLM serving systems.
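To make the head-of-line blocking problem in the abstract concrete, here is a minimal sketch of a single-device queue hit by a burst of requests. It is an illustration only: the `Request` fields, the request names, and the numbers are hypothetical, and the reordering shown is plain earliest-deadline-first, a much simpler stand-in for the stochastic programming and LSOs that QLM is described as using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    service_time: float  # seconds of exclusive device time (hypothetical value)
    slo: float           # end-to-end latency deadline, measured from arrival


def slo_attainment(queue):
    """Serve requests one at a time in list order and return the
    fraction that finish within their end-to-end SLO.
    All requests are assumed to arrive together at time 0 (a burst)."""
    if not queue:
        return 1.0
    clock = 0.0
    met = 0
    for req in queue:
        clock += req.service_time  # waiting time plus execution time
        if clock <= req.slo:
            met += 1
    return met / len(queue)


# A long, deadline-tolerant job arrives just ahead of two short interactive ones.
burst = [
    Request("batch-summarize", service_time=10.0, slo=30.0),
    Request("chat-1", service_time=1.0, slo=3.0),
    Request("chat-2", service_time=1.0, slo=4.0),
]

# First-come-first-served: the long job blocks the head of the line.
fcfs = slo_attainment(burst)

# Earliest-deadline-first reordering (an assumption, not QLM's algorithm).
edf = slo_attainment(sorted(burst, key=lambda r: r.slo))

print(f"FCFS attainment: {fcfs:.2f}, EDF attainment: {edf:.2f}")
```

In this toy burst the long job still meets its own deadline when served last, so reordering helps the short requests at no cost to it. A real serving system would also have to account for requests that keep arriving, multiple models, and the cost of actions such as model swapping.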

queue, request group, virtual queue, (14 more...)

2407.00047

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Energy > Power Industry (0.34)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Quantum Language Model with Entanglement Embedding for Question Answering

Chen, Yiwei, Pan, Yu, Dong, Daoyi

arXiv.org Artificial Intelligence, Aug-22-2020

Quantum Language Models (QLMs), in which words are modelled as a quantum superposition of sememes, have demonstrated a high level of model transparency and good post-hoc interpretability. Nevertheless, in the current literature word sequences are basically modelled as a classical mixture of word states, which cannot fully exploit the potential of a quantum probabilistic description. A full quantum model is yet to be developed to explicitly capture the non-classical correlations within the word sequences. We propose a neural network model with a novel Entanglement Embedding (EE) module, whose function is to transform the word sequences into entangled pure states of many-body quantum systems. Strong quantum entanglement, which is the central concept of quantum information and an indication of parallelized correlations among the words, is observed within the word sequences. Numerical experiments show that the proposed QLM with EE (QLM-EE) achieves superior performance compared with the classical deep neural network models and other QLMs on Question Answering (QA) datasets. In addition, the post-hoc interpretability of the model can be improved by quantifying the degree of entanglement among the words.
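As a rough illustration of what "degree of entanglement" can mean for a pure state, the sketch below computes the entanglement entropy of a two-qubit pure state. It assumes, purely for illustration, that each of two words is represented by a single qubit; the function name and the example states are hypothetical, and this is not the paper's Entanglement Embedding module.

```python
import math


def entanglement_entropy(amplitudes):
    """Entanglement entropy (in bits) of a two-qubit pure state
    a|00> + b|01> + c|10> + d|11>, given as a 4-tuple (a, b, c, d).

    Uses the closed form for two qubits: the reduced density matrix of
    either qubit has eigenvalues (1 +/- sqrt(1 - C^2)) / 2, where
    C = 2|ad - bc| is the concurrence of the normalized state."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in amplitudes))
    a, b, c, d = (x / norm for x in amplitudes)
    concurrence = 2 * abs(a * d - b * c)
    # Clamp to guard against tiny negative values from rounding.
    root = math.sqrt(max(0.0, 1.0 - concurrence ** 2))
    entropy = 0.0
    for lam in ((1 + root) / 2, (1 - root) / 2):
        if lam > 0:
            entropy -= lam * math.log2(lam)
    return entropy


# Product state |00>: the two "words" are uncorrelated.
product = entanglement_entropy((1.0, 0.0, 0.0, 0.0))

# Bell state (|00> + |11>) / sqrt(2): maximally entangled.
bell = entanglement_entropy((1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)))

print(product, bell)
```

The entropy ranges from 0 bits for a product state to 1 bit for a maximally entangled pair, which is the kind of scalar score that could be inspected per word pair.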

artificial intelligence, machine learning, natural language, (21 more...)

2008.09943

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Oceania > Australia > New South Wales (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

End-to-End Quantum-like Language Models with Application to Question Answering

AAAI Conferences, Feb-8-2018

Language Modeling (LM) is a fundamental research topic in a range of areas. Recently, inspired by quantum theory, a novel Quantum Language Model (QLM) has been proposed for Information Retrieval (IR). In this paper, we aim to broaden the theoretical and practical basis of QLM. We develop a Neural Network based Quantum-like Language Model (NNQLM) and apply it to Question Answering. Specifically, based on word embeddings, we design a new density matrix, which represents a sentence (e.g., a question or an answer) and encodes a mixture of semantic subspaces. Such a density matrix, together with a joint representation of the question and the answer, can be integrated into neural network architectures (e.g., 2-dimensional convolutional neural networks). Experiments on the TREC-QA and WIKIQA datasets have verified the effectiveness of our proposed models.
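The idea of representing a sentence as a density matrix built from word embeddings can be sketched as a weighted mixture of projectors onto normalized word vectors. This is a minimal sketch under stated assumptions: uniform mixture weights, a toy 3-dimensional embedding, and hypothetical helper names. The paper's actual construction and weighting may differ.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def density_matrix(embeddings, weights=None):
    """Return rho = sum_i p_i |w_i><w_i| for unit-normalized word
    vectors w_i. Weights default to a uniform mixture (an assumption)."""
    n = len(embeddings)
    if weights is None:
        weights = [1.0 / n] * n
    dim = len(embeddings[0])
    rho = [[0.0] * dim for _ in range(dim)]
    for vector, p in zip(embeddings, weights):
        unit = normalize(vector)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * unit[i] * unit[j]  # outer product term
    return rho


# Toy "sentence" of three words with made-up 3-dimensional embeddings.
sentence = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
rho = density_matrix(sentence)
trace = sum(rho[i][i] for i in range(len(rho)))

print(rho, trace)
```

Because each word vector is normalized and the weights sum to one, the result is symmetric with unit trace, the basic properties expected of a density matrix. Such a matrix can then be treated as a 2-dimensional input, for example to a convolutional layer.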

density matrix, joint representation, representation, (14 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China > Tianjin Province > Tianjin (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Modeling Quantum Entanglements in Quantum Language Models

AAAI Conferences, Jul-15-2015

Recently, a Quantum Language Model (QLM) was proposed to model term dependencies within the Quantum Theory (QT) framework and successfully applied in Information Retrieval (IR). Nevertheless, QLM's dependency is based on co-occurrences of terms and has not yet taken into account Quantum Entanglement (QE), which is a key quantum concept and has a significant cognitive implication. In QT, an entangled state can provide a more complete description of the nature of reality, and determine intrinsic correlations of the considered objects globally, rather than only the co-occurrences on the surface. It is, however, a real challenge to decide and measure QE using the classical statistics of texts in a post-measurement configuration. In order to circumvent this problem, we theoretically prove the connection between QE and statistically Unconditional Pure Dependence (UPD). Since UPD has an implementable deciding algorithm, we can in turn characterize QE by extracting the UPD patterns from texts. This leads to a measurable QE, based on which we further advance the existing QLM framework. We empirically compare our model with related models, and the results demonstrate the effectiveness of our model.

projector, qlm, upd pattern, (12 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)
