AITopics | language model

Unbounded cache model for online language modeling with open vocabulary

Neural Information Processing Systems | Mar-17-2026, 18:54:24 GMT

Recently, continuous cache models were proposed as extensions to recurrent neural network language models, to adapt their predictions to local changes in the data distribution. These models only capture the local context, of up to a few thousand tokens. In this paper, we propose an extension of continuous cache models that can scale to larger contexts. In particular, we use a large-scale non-parametric memory component that stores all the hidden activations seen in the past. We leverage recent advances in approximate nearest neighbor search and quantization algorithms to store millions of representations while searching them efficiently. We conduct extensive experiments showing that our approach significantly improves the perplexity of pre-trained language models on new distributions, and can scale efficiently to much larger contexts than previously proposed local cache models.
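A minimal sketch of the cache idea described in this abstract, not the authors' implementation. It assumes an exact brute-force search in place of approximate nearest neighbor search and quantization, an exponential kernel over squared distances, and a fixed interpolation weight `lam`; the names `CacheMemory`, `cache_distribution`, and `interpolate` are hypothetical.

```python
import math
from collections import defaultdict


class CacheMemory:
    """Stores (hidden state, next token) pairs seen so far (hypothetical helper)."""

    def __init__(self):
        self.keys = []    # hidden activations
        self.values = []  # token that followed each activation

    def add(self, hidden, next_token):
        self.keys.append(list(hidden))
        self.values.append(next_token)

    def cache_distribution(self, query, k=2, bandwidth=1.0):
        # Assumption: exact search over all stored keys. A large memory
        # would need an approximate nearest neighbor index instead.
        scored = [
            (sum((a - b) ** 2 for a, b in zip(key, query)), tok)
            for key, tok in zip(self.keys, self.values)
        ]
        scored.sort(key=lambda pair: pair[0])
        weights = defaultdict(float)
        for dist, tok in scored[:k]:
            # Assumption: exponential kernel on squared distance.
            weights[tok] += math.exp(-dist / bandwidth)
        total = sum(weights.values())
        if total == 0.0:
            return {}
        return {tok: w / total for tok, w in weights.items()}


def interpolate(p_model, p_cache, lam=0.5):
    """Mix the base model and the cache; tokens may come from either side."""
    vocab = set(p_model) | set(p_cache)
    return {
        w: (1 - lam) * p_model.get(w, 0.0) + lam * p_cache.get(w, 0.0)
        for w in vocab
    }


memory = CacheMemory()
memory.add([1.0, 0.0], "paris")
memory.add([0.0, 1.0], "tokyo")
memory.add([0.9, 0.1], "paris")

p_cache = memory.cache_distribution([1.0, 0.0], k=2)
p_model = {"the": 0.7, "city": 0.3}  # "paris" is outside the base vocabulary
mixed = interpolate(p_model, p_cache, lam=0.5)
```

Because the cache contributes its own tokens to the mixture, a word the base model never saw can still receive probability mass, which is one way to read the "open vocabulary" in the title.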

artificial intelligence, machine learning, natural language, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)

An AI image generator for non-English speakers

AIHub | Mar-17-2026, 11:49:45 GMT

Although text-to-image generation is rapidly advancing, these AI models are mostly English-centric. Researchers at the University of Amsterdam Faculty of Science have created NeoBabel, an AI image generator that can work in six different languages. By making all elements of their research open source, anyone can build on the model and help push inclusive AI research. When you generate an image with AI, the results are often better when your prompt is in English. This is because many AI models are English at their core: if you use another language, your prompt is translated into English before the image is created.

generator, machine learning, natural language, (19 more...)

AIHub

Country:

Europe > Netherlands > North Holland > Amsterdam (0.27)
Asia > Singapore (0.05)

Genre: Research Report (0.35)

Industry: Government (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.90)
Information Technology > Communications > Social Media (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Neural Information Processing Systems | Mar-17-2026, 11:05:59 GMT

Recurrent neural networks (RNNs) have achieved state-of-the-art performance in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model becomes very large (e.g., possibly beyond the memory capacity of a GPU device) and its training becomes very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use a 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, where each row is associated with one vector and each column with another vector.
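The table layout described above can be sketched as follows. This is an illustration under assumptions, not the paper's method: words are placed in the table by their integer id (the abstract does not say how the allocation is chosen), and `table_side`, `word_position`, and `lookup` are hypothetical names.

```python
import math


def table_side(vocab_size):
    """Smallest square table side that holds every word: ceil(sqrt(V))."""
    return math.isqrt(vocab_size - 1) + 1


def word_position(word_id, side):
    """Assumption: fixed row-major placement of word ids in the table."""
    return divmod(word_id, side)


def lookup(word_id, row_vectors, col_vectors):
    """A word is represented by its row vector and its column vector."""
    row, col = word_position(word_id, len(col_vectors))
    return row_vectors[row], col_vectors[col]


vocab_size = 10_000
side = table_side(vocab_size)

# Toy two-dimensional vectors, one per row and one per column.
row_vectors = [[float(r), 0.0] for r in range(side)]
col_vectors = [[0.0, float(c)] for c in range(side)]

shared_vector_count = len(row_vectors) + len(col_vectors)
row_part, col_part = lookup(4217, row_vectors, col_vectors)
```

With a vocabulary of 10,000 words the table has side 100, so 200 shared vectors stand in for 10,000 per-word vectors. Words in the same row share a row vector and words in the same column share a column vector.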

artificial intelligence, machine learning, natural language, (11 more...)

Neural Information Processing Systems

Genre: Research Report (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices

Neural Information Processing Systems | Mar-16-2026, 20:25:58 GMT

Real-time automatic speech recognition (ASR) on mobile and embedded devices has been of great interest for many years. We present real-time speech recognition on smartphones or embedded systems by employing recurrent neural network (RNN) based acoustic models, RNN based language models, and beam-search decoding. The acoustic model is trained end-to-end with connectionist temporal classification (CTC) loss. The RNN implementation on embedded devices can suffer from excessive DRAM accesses because the parameter size of a neural network usually exceeds that of the cache memory and the parameters are used only once for each time step. To remedy this problem, we employ a multi-time step parallelization approach that computes multiple output samples at a time with the parameters fetched from the DRAM.
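The memory-access argument can be illustrated with a toy simulation. This sketch only covers a time-independent input projection and counts simulated weight fetches; it does not model the recurrent part of the network or real cache behavior, and `project_blocked` is a hypothetical name.

```python
def project_blocked(weights, inputs, block):
    """Apply `weights` to every time step, reusing each row across a block.

    Returns the outputs and the number of simulated DRAM row fetches.
    """
    fetches = 0
    outputs = [[0.0] * len(weights) for _ in inputs]
    for start in range(0, len(inputs), block):
        steps = range(start, min(start + block, len(inputs)))
        for j, row in enumerate(weights):
            fetches += 1  # the row is fetched once, then reused for the block
            for t in steps:
                outputs[t][j] = sum(w * x for w, x in zip(row, inputs[t]))
    return outputs, fetches


weights = [[1.0, 2.0], [3.0, 4.0]]
inputs = [[float(t), 1.0] for t in range(8)]

one_step_out, one_step_fetches = project_blocked(weights, inputs, block=1)
four_step_out, four_step_fetches = project_blocked(weights, inputs, block=4)
```

In this toy setting, processing four time steps per fetch gives the same outputs with a quarter of the weight fetches, which is the effect the abstract attributes to multi-time step parallelization.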

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Unsupervised Text Style Transfer using Language Models as Discriminators

Neural Information Processing Systems | Mar-16-2026, 19:54:35 GMT

Binary classifiers are employed as discriminators in GAN-based unsupervised style transfer models to ensure that transferred sentences are similar to sentences in the target domain. One difficulty with the binary discriminator is that the error signal is sometimes insufficient to train the model to produce rich-structured language. In this paper, we propose a technique of using a target domain language model as the discriminator to provide richer, token-level feedback during the learning process. Because our language model scores sentences directly using a product of locally normalized probabilities, it offers a more stable and more useful training signal to the generator. We train the generator to minimize the negative log likelihood (NLL) of generated sentences evaluated by a language model. By using a continuous approximation of the discrete samples, our model can be trained using back-propagation in an end-to-end way. Moreover, we find empirically that, with a language model as a structured discriminator, it is possible to eliminate the adversarial training steps using negative samples, thus making training more stable. We compare our model with previous work using convolutional neural networks (CNNs) as discriminators and show that our model outperforms them significantly in three tasks: word substitution decipherment, sentiment modification, and related language translation.
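The scoring step described above, a product of locally normalized probabilities turned into a negative log likelihood, can be sketched with a toy bigram model. This is an illustration only: the bigram table, the floor probability for unseen pairs, and the name `token_nll` are assumptions, and the continuous approximation used for back-propagation is not shown.

```python
import math


def token_nll(sentence, bigram_probs, start="<s>", floor=1e-6):
    """Per-token negative log likelihood under a toy bigram language model."""
    losses = []
    previous = start
    for token in sentence:
        # Assumption: unseen bigrams get a small floor probability.
        p = bigram_probs.get((previous, token), floor)
        losses.append(-math.log(p))
        previous = token
    return losses


# Hypothetical target-domain language model.
target_lm = {
    ("<s>", "good"): 0.5,
    ("good", "food"): 0.25,
}

in_domain = token_nll(["good", "food"], target_lm)
off_domain = token_nll(["good", "grief"], target_lm)

sentence_nll = sum(in_domain)  # equals -log(0.5 * 0.25)
```

The per-token list is what makes the feedback token-level: here the second token of the off-domain sentence carries almost all of its loss, whereas a binary classifier would return a single score for the whole sentence.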

artificial intelligence, machine learning, natural language, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

AI and Theory of Mind: an interview with Nitay Alon

AIHub | Mar-16-2026, 09:32:21 GMT

How did you arrive at this research area?

machine learning, natural language, theory, (21 more...)

AIHub

Country:

Asia > Singapore (0.05)
North America > United States (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Personal > Interview (1.00)

Industry: Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.99)
Information Technology > Artificial Intelligence > Natural Language (0.72)
Information Technology > Communications > Social Media (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.69)

Reinforcement learning applied to autonomous vehicles: an interview with Oliver Chang

AIHub | Mar-12-2026, 23:37:38 GMT

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. We caught up with Oliver Chang whose research interests span deep reinforcement learning, autonomous vehicles, and explainable AI. We found out more about some of the projects he's worked on so far, what drew him to the field, and what future AI directions he's excited about. Could you give us a quick introduction to who you are, where you're studying, and the topic of your research? I'm specializing in reinforcement learning applied to autonomous vehicles and UAVs.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

AIHub

Country: Asia > Singapore (0.04)

Genre: Personal > Interview (1.00)

Industry:

Education (0.70)
Government (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Google built a flash-flood prediction tool using Gemini and old news reports

Engadget | Mar-12-2026, 15:45:42 GMT

It's the first time that the company has used language models for this kind of thing. Flash floods are [...], but Google might have a novel solution. The company built a prediction tool for flash floods that uses Gemini to source data from old news reports. This is the first time it has used a language model for this type of work. Flash flood prediction models need historical data and model training that often doesn't exist.

artificial intelligence, engadget, google, (13 more...)

Engadget

Country: North America > United States (0.30)

Industry:

Marketing (0.47)
Health & Medicine (0.32)
Government (0.30)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence (1.00)

Adaptation to Intrinsic Dependence in Diffusion Language Models

Yunxiao Zhao, Changxiao Cai

arXiv.org Machine Learning | Feb-24-2026

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive (AR) approaches, enabling parallel token generation beyond a rigid left-to-right order. Despite growing empirical success, the theoretical understanding of how unmasking schedules -- which specify the order and size of unmasked tokens during sampling -- affect generation quality remains limited. In this work, we introduce a distribution-agnostic unmasking schedule for DLMs that adapts to the (unknown) dependence structure of the target data distribution, without requiring any prior knowledge or hyperparameter tuning. In contrast to prior deterministic procedures that fix unmasking sizes, our method randomizes the number of tokens revealed at each iteration. We show that, for two specific parameter choices, the sampling convergence guarantees -- measured by Kullback-Leibler (KL) divergence -- scale as $\widetilde O(\mathsf{TC}/K)$ and $\widetilde O(\mathsf{DTC}/K)$ respectively. Here, $K$ is the number of iterations, and $\mathsf{TC}$ and $\mathsf{DTC}$ are the total correlation and dual total correlation of the target distribution, capturing the intrinsic dependence structure underlying the data. Importantly, our guarantees hold in the practically relevant parallel-sampling regime $K

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

2602.20126

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre:

Workflow (0.93)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
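The two dependence measures named in the diffusion language model abstract, total correlation (TC) and dual total correlation (DTC), can be computed directly for a small discrete distribution. This sketch uses the standard entropy-based definitions, TC = sum_i H(X_i) - H(X) and DTC = H(X) - sum_i H(X_i | X_{-i}), in nats. The example distribution and function names are hypothetical, and nothing here reproduces the paper's unmasking schedule.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in nats of a {outcome: probability} mapping."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def marginal(joint, keep):
    """Marginal distribution over the coordinates listed in `keep`."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out


def total_correlation(joint):
    n = len(next(iter(joint)))
    singles = sum(entropy(marginal(joint, [i])) for i in range(n))
    return singles - entropy(joint)


def dual_total_correlation(joint):
    # H(X_i | X_{-i}) = H(X) - H(X_{-i}), so the definition rearranges to
    # DTC = sum_i H(X_{-i}) - (n - 1) * H(X).
    n = len(next(iter(joint)))
    leave_one_out = sum(
        entropy(marginal(joint, [j for j in range(n) if j != i]))
        for i in range(n)
    )
    return leave_one_out - (n - 1) * entropy(joint)


# Three tokens that are always identical: maximal dependence.
copies = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
tc = total_correlation(copies)
dtc = dual_total_correlation(copies)
```

For three perfectly copied fair bits, TC is 2 log 2 while DTC is log 2, so the two measures can differ on the same distribution. Both are zero when the coordinates are independent.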

Content preserving text generation with attribute controls

Lajanugen Logeswaran, Honglak Lee, Samy Bengio

Neural Information Processing Systems | Feb-19-2026, 21:02:13 GMT

We focus on categorical attributes of language. Examples of such attributes include sentiment, language complexity, tense, voice, honorifics, mood, etc. Our approach draws inspiration from style transfer methods in the vision and language literature.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
