AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)

WIRED Jun-9-2026, 21:00:15 GMT

GM Wants Your Electric Car to Power Your House--and Your Neighborhood

The automaker today is turning on vehicle-to-grid charging for its GM Energy customers. Will people actually use it? Some 250,000 electric vehicles manufactured by General Motors are driving around the US today--right now!--with an oft-secret capability: Their big, powerful batteries can charge other things. Potentially appliances, homes, and now, thanks to a software update pushed by the automaker this week, an electrical grid. Twelve of GM's EVs have this "bidirectional charging" capability, way more than US competitors' battery-electrics.

artificial intelligence, bidirectional, promo code, (15 more...)

WIRED

Country: North America > United States > California (0.71)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Energy > Power Industry (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology:

Information Technology > Artificial Intelligence (0.71)
Information Technology > Software (0.49)

Neural Information Processing Systems Dec-24-2025, 16:00:58 GMT

Bounds all around: training energy-based models with bidirectional bounds

Energy-based models (EBMs) provide an elegant framework for density estimation, but they are notoriously difficult to train. Recent work has established links to generative adversarial networks, where the EBM is trained through a minimax game with a variational value function. We propose a bidirectional bound on the EBM log-likelihood, such that we maximize a lower bound and minimize an upper bound when solving the minimax game. We link one bound to a gradient penalty that stabilizes training, thereby providing grounding for best engineering practice. To evaluate the bounds we develop a new and efficient estimator of the Jacobi-determinant of the EBM generator. We demonstrate that these developments stabilize training and yield high-quality density estimation and sample generation.

bidirectional, name change, training energy-based model, (5 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Feldman, Yair, Artzi, Yoav

Simple Context Compression: Mean-Pooling and Multi-Ratio Training

arXiv.org Artificial Intelligence Oct-24-2025

A common strategy to reduce the computational costs of using long contexts in retrieval-augmented generation (RAG) with large language models (LLMs) is soft context compression, where the input sequence is transformed into a shorter continuous representation. We develop a lightweight and simple mean-pooling approach that consistently outperforms the widely used compression-tokens architecture, and study training the same compressor to output multiple compression ratios. We conduct extensive experiments across in-domain and out-of-domain QA datasets, as well as across model families, scales, and compression ratios. Overall, our simple mean-pooling approach achieves the strongest performance, with a relatively small drop when training for multiple compression ratios. More broadly though, across architectures and training regimes the trade-offs are more nuanced, illustrating the complex landscape of compression methods.
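The mean-pooling idea in this abstract can be illustrated with a minimal sketch. It assumes the compressor averages non-overlapping windows of `ratio` consecutive token vectors; the abstract does not state the window layout or where in the model pooling is applied, so the function name `mean_pool_compress` and the windowing scheme are hypothetical, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def mean_pool_compress(embeddings: List[Vector], ratio: int) -> List[Vector]:
    """Average each run of `ratio` consecutive vectors into one vector.

    Assumption: non-overlapping windows, with a shorter final window when
    the sequence length is not a multiple of `ratio`.
    """
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    compressed: List[Vector] = []
    for start in range(0, len(embeddings), ratio):
        window = embeddings[start:start + ratio]
        dim = len(window[0])
        # Component-wise mean over the vectors in this window.
        compressed.append(
            [sum(vec[d] for vec in window) / len(window) for d in range(dim)]
        )
    return compressed


# Five toy 2-dimensional "token embeddings".
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
print(mean_pool_compress(tokens, 2))  # [[2.0, 1.0], [6.0, 5.0], [9.0, 8.0]]
```

Because the pooling step itself has no ratio-specific parameters, the same function can be called with different ratios, which is one way to picture a single compressor serving multiple compression ratios.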

large language model, machine learning, natural language, (17 more...)

2510.20797

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems Aug-16-2025, 14:11:52 GMT

Bounds all around: training energy-based models with bidirectional bounds (Supplementary Material)

A.1 Proof of Theorem 1. The first inequality is derived by Hölder's inequality [...] Existence is ensured as long as the chosen activation functions have at least one derivative almost everywhere. Smooth activations naturally satisfy this assumption, but it is worth noting that e.g. the ReLU activation [...] We cannot guarantee that the Jacobian has full rank through clever choices of neural architectures. This is a natural requirement for the generator anyway. In our model, we aim to maximize the entropy of the generator, which encourages the generator to create as diverse samples as possible. In practice this ensures that the Jacobian has full rank as a degenerate Jacobian implies a reduction of entropy.

artificial intelligence, linear, machine learning, (18 more...)

Country:

Europe > Denmark (0.05)
Asia > China > Shanghai > Shanghai (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems Aug-16-2025, 14:11:48 GMT

Bounds all around: training energy-based models with bidirectional bounds

Recent work has established links to generative adversarial networks, where the EBM is trained through a minimax game with a variational value function.

artificial intelligence, international conference, machine learning, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Denmark (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence Aug-12-2025

Fractal Language Modelling by Universal Sequence Maps (USM)

Almeida, Jonas S, Russ, Daniel E, Vinga, Susana, Duarte, Ines, Mason, Lee, Bhawsar, Praphulla, Ge, Aaron, Oliveira, Arlindo, Balasubramanian, Jeya Balaji

Motivation: With the advent of Language Models using Transformers, popularized by ChatGPT, there is a renewed interest in exploring encoding procedures that numerically represent symbolic sequences at multiple scales and embedding dimensions. The challenge that encoding addresses is the need for mechanisms that uniquely retain contextual information about the succession of individual symbols, which can then be modeled by nonlinear formulations such as neural networks. Context: Universal Sequence Maps (USM) are iterated functions that bijectively encode symbolic sequences onto embedded numerical spaces. USM is composed of two Chaos Game Representations (CGR), iterated forwardly and backwardly, that can be projected into the frequency domain (FCGR). The corresponding USM coordinates can be used to compute a Chebyshev distance metric as well as k-mer frequencies, without having to recompute the embedded numeric coordinates, and, paradoxically, allowing for non-integer values of k. Results: This report advances the bijective fractal encoding by Universal Sequence Maps (USM) by resolving seeding biases affecting the iterated process. The resolution had two results, the first expected, the second an intriguing outcome: 1) full reconciliation of numeric positioning with sequence identity; and 2) uncovering the nature of USM as an efficient numeric process converging towards a steady state sequence embedding solution. We illustrate these results for genomic sequences because of the convenience of a planar representation defined by an alphabet with only 4 tokens (the 4 nucleotides). Nevertheless, the application to alphabets of arbitrary cardinality was found to be straightforward.
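To make the forward and backward iteration concrete, here is a minimal sketch of the classic Chaos Game Representation for the four nucleotides. The corner assignment, the midpoint seed `(0.5, 0.5)`, and the names `cgr`, `forward_backward` and `decode` are assumptions for illustration; in particular, the fixed midpoint seed is the traditional choice and not the seeding resolution this report proposes.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Assumed corner layout on the unit square (one common CGR convention).
CORNERS: Dict[str, Point] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr(sequence: str, seed: Point = (0.5, 0.5)) -> List[Point]:
    """Move halfway towards each symbol's corner; return one point per symbol."""
    x, y = seed
    points: List[Point] = []
    for symbol in sequence:
        cx, cy = CORNERS[symbol]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def forward_backward(sequence: str) -> Tuple[List[Point], List[Point]]:
    """Forward CGR plus a backward CGR, aligned to the original positions."""
    forward = cgr(sequence)
    backward = cgr(sequence[::-1])[::-1]
    return forward, backward


def decode(point: Point, length: int) -> str:
    """Recover the last `length` symbols from one forward coordinate."""
    inverse = {corner: symbol for symbol, corner in CORNERS.items()}
    x, y = point
    symbols: List[str] = []
    for _ in range(length):
        # The quadrant containing the point identifies the latest symbol.
        cx = 1.0 if x > 0.5 else 0.0
        cy = 1.0 if y > 0.5 else 0.0
        symbols.append(inverse[(cx, cy)])
        x, y = 2.0 * x - cx, 2.0 * y - cy  # undo one halving step
    return "".join(reversed(symbols))


forward, backward = forward_backward("GATTACA")
print(decode(forward[-1], 7))  # GATTACA
```

The `decode` round trip is what "bijective" means in practice here: a single coordinate retains the preceding symbols, up to floating-point precision for long sequences.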

large language model, machine learning, natural language, (20 more...)

2508.06641

Country:

North America > United States > Maryland (0.28)
Europe > Portugal > Lisbon > Lisbon (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Ratajczak, Martin, Robichaud, Jean-Philippe, Fox, Jennifer Drexler

Accurate, fast, cheap: Choose three. Replacing Multi-Head-Attention with Bidirectional Recurrent Attention for Long-Form ASR

arXiv.org Artificial Intelligence Jun-25-2025

Long-form speech recognition is an application area of increasing research focus. ASR models based on multi-head attention (MHA) are ill-suited to long-form ASR because of their quadratic complexity in sequence length. We build on recent work that has investigated linear complexity recurrent attention (RA) layers for ASR. We find that bidirectional RA layers can match the accuracy of MHA for both short- and long-form applications. We present a strong limited-context attention (LCA) baseline, and show that RA layers are just as accurate while being more efficient. We develop a long-form training paradigm which further improves RA performance, leading to better accuracy than LCA with 44% higher throughput. We also present Direction Dropout, a novel regularization method that improves accuracy, provides fine-grained control of the accuracy/throughput trade-off of bidirectional RA, and enables a new alternating directions decoding mode with even higher throughput.

accuracy, artificial intelligence, machine learning, (16 more...)

2506.19761

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)

arXiv.org Artificial Intelligence Jun-17-2025

An Exploration of Mamba for Speech Self-Supervised Models

Lin, Tzu-Quan, Kuo, Heng-Cheng, Wei, Tzu-Chieh, Cheng, Hsi-Chun, Chen, Chun-Wei, Hsiao, Hsien-Fu, Tsao, Yu, Lee, Hung-yi

While Mamba has demonstrated strong performance in language modeling, its potential as a speech self-supervised (SSL) model remains underexplored, with prior studies limited to isolated tasks. To address this, we explore Mamba-based HuBERT models as alternatives to Transformer-based SSL architectures. Leveraging the linear-time Selective State Space, these models enable fine-tuning on long-context ASR with significantly lower compute. Moreover, they show superior performance when fine-tuned for streaming ASR. Beyond fine-tuning, these models show competitive performance on SUPERB probing benchmarks, particularly in causal settings. Our analysis shows that they yield higher-quality quantized representations and capture speaker-related features more distinctly than Transformer-based models. In recent years, Transformer-based models and their multi-head self-attention mechanisms have achieved remarkable success across various domains [1]-[3].

artificial intelligence, machine learning, natural language, (18 more...)

2506.12606

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)

arXiv.org Artificial Intelligence Jun-6-2025

Exploring bidirectional bounds for minimax-training of Energy-based models

Geng, Cong, Wang, Jia, Chen, Li, Gao, Zhiyong, Frellsen, Jes, Hauberg, Søren

Energy-based models (EBMs) estimate unnormalized densities in an elegant framework, but they are generally difficult to train. Recent work has linked EBMs to generative adversarial networks, by noting that they can be trained through a minimax game using a variational lower bound. To avoid the instabilities caused by minimizing a lower bound, we propose to instead work with bidirectional bounds, meaning that we maximize a lower bound and minimize an upper bound when training the EBM. We investigate four different bounds on the log-likelihood derived from different perspectives. We derive lower bounds based on the singular values of the generator Jacobian and on mutual information. To upper bound the negative log-likelihood, we consider a gradient penalty-like bound, as well as one based on diffusion processes. In all cases, we provide algorithms for evaluating the bounds. We compare the different bounds to investigate the pros and cons of the different approaches. Finally, we demonstrate that the use of bidirectional bounds stabilizes EBM training and yields high-quality density estimation and sample generation.
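As background for the lower bound "based on the singular values of the generator Jacobian", the standard change-of-variables relation for an injective, differentiable generator may be useful. The notation is assumed rather than taken from the abstract: $G:\mathbb{R}^d \to \mathbb{R}^D$ is the generator, $J_G(z)$ its Jacobian with singular values $\sigma_i$, $p_z$ the latent density and $p_g$ the induced sample density. This is the textbook identity, not necessarily the exact bound the paper derives.

```latex
\log p_g\bigl(G(z)\bigr)
  = \log p_z(z) - \tfrac{1}{2}\log\det\!\bigl(J_G(z)^{\top} J_G(z)\bigr)
  = \log p_z(z) - \sum_{i=1}^{d} \log \sigma_i\bigl(J_G(z)\bigr),
\qquad
H[p_g] = H[p_z] + \mathbb{E}_{z \sim p_z}\Bigl[\,\sum_{i=1}^{d} \log \sigma_i\bigl(J_G(z)\bigr)\Bigr].
```

Under this identity the generator entropy depends on the Jacobian only through its singular values, and a rank-deficient Jacobian (some $\sigma_i = 0$) drives the entropy term to $-\infty$.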

artificial intelligence, international conference, machine learning, (17 more...)

2506.04609

Country:

North America > United States (0.46)
North America > Canada (0.46)
Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)