ABS: Enforcing Constraint Satisfaction On Generated Sequences Via Automata-Guided Beam Search
Collura, Vincenzo, Tit, Karim, Bussi, Laura, Giunchiglia, Eleonora, Cordy, Maxime
Sequence generation and prediction form a cornerstone of modern machine learning, with applications spanning natural language processing, program synthesis, and time-series forecasting. These tasks are typically modeled in an autoregressive fashion, where each token is generated conditional on the preceding ones, and beam search is commonly used to balance exploration and fluency during decoding. While deep learning models and Large Language Models (LLMs) excel at capturing statistical patterns in this setting, they remain ill-equipped to guarantee compliance with formal constraints. In this paper, we introduce ABS: a general and model-agnostic inference-time algorithm that guarantees compliance with any constraint that can be compiled into a Deterministic Finite Automaton (DFA), without requiring retraining. ABS leverages the DFA to guide a constrained variant of beam search: at each decoding step, transitions leading to violations are masked, while remaining paths are dynamically re-ranked according to both the model's probabilities and the automaton's acceptance structure. We formally prove that the resulting sequences are guaranteed to satisfy the given constraints, and we empirically demonstrate that ABS also improves output quality. We validate our approach on three distinct tasks: constrained image-stream classification, controlled text generation, and text infilling. In all settings, ABS achieves perfect constraint satisfaction, while outperforming or matching state-of-the-art baselines on standard quality metrics and efficiency.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Dominican Republic (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Save on last year's Patagonia jackets, apparel, and accessories during REI's seasonal clearance sale
We may earn revenue from the products available on this page and participate in affiliate programs. Patagonia has been making great jackets, bags, and pretty much everything else you need for outdoor activities since 1973. While Patagonia gear is great, it's not usually cheap. Fortunately, REI currently has a ton of last year's products on sale at steep discounts, including some of the most popular items, like puffer jackets and fleece pullovers.
- Retail (0.86)
- Health & Medicine > Consumer Health (0.35)
The rise of intelligent matter: Taking cues from nature to develop smarter tech
Imagine if the pullover you're wearing automatically adapted itself to the temperature, warming you if you were shivering, or cooling you down if you were sweating. This means that the pullover would have to learn to recognize your discomfort and alter its properties so as to counter this discomfort. Other potential functionalities could include rapid drying or cushioning a fall. But how can such a pullover be created? What energy would it need?
Multi-class Generative Adversarial Nets for Semi-supervised Image Classification
Motamed, Saman, Khalvati, Farzad
From generating never-before-seen images to domain adaptation, applications of Generative Adversarial Networks (GANs) spread wide in the domain of vision and graphics problems. With the remarkable ability of GANs in learning the distribution and generating images of a particular class, they can be used for semi-supervised classification tasks. However, if two classes of images share similar characteristics, the GAN might learn to generalize across them and hinder their classification. In this paper, we use various images from the MNIST and Fashion-MNIST datasets to illustrate how similar images cause the GAN to generalize, leading to poor classification. We propose a modification to the traditional training of GANs that allows for improved multi-class classification among similar classes of images in a semi-supervised learning framework.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.56)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.52)
Why is Attention Not So Interpretable?
Bai, Bing, Liang, Jian, Zhang, Guanhua, Li, Hao, Bai, Kun, Wang, Fei
Attention-based methods have played an important role in model interpretations, where the calculated attention weights are expected to highlight the critical parts of inputs (e.g., keywords in sentences). However, recent research points out that attention-as-importance interpretations often do not work as well as we expect. For example, learned attention weights sometimes highlight less meaningful tokens like "[SEP]", ",", and ".", and are frequently uncorrelated with other feature importance indicators like gradient-based measures. Consequently, a debate on the effectiveness of attention-based interpretations has arisen. In this paper, we reveal that one root cause of this phenomenon can be ascribed to combinatorial shortcuts: in addition to highlighting parts of the input, the attention weights themselves may carry extra information that downstream models of attention layers can exploit. As a result, the attention weights are no longer pure importance indicators. We theoretically analyze the combinatorial shortcuts, design one intuitive experiment to demonstrate their existence, and propose two methods to mitigate this issue. Empirical studies on attention-based interpretation models are conducted, and the results show that the proposed methods can effectively improve the interpretability of attention mechanisms on a variety of datasets.
- North America > United States (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Data Science > Data Mining (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
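The combinatorial-shortcut phenomenon described in the abstract above can be made concrete with a deliberately tiny toy, which is my own illustrative construction and not the paper's experiment: two inputs share the same token values, yet their different attention weights alone make the pooled outputs linearly separable, so a downstream model can read class information out of the weights themselves rather than out of what they "highlight".

```python
import numpy as np

# Two token value vectors, shared by both examples: the token content is
# identical, so any class signal in the attended output must come from the
# attention weights themselves.
values = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

attn_class_0 = np.array([0.9, 0.1])   # example 0 attends mostly to token 0
attn_class_1 = np.array([0.1, 0.9])   # example 1 attends mostly to token 1

pooled_0 = attn_class_0 @ values      # attention-weighted sum of values
pooled_1 = attn_class_1 @ values

# A fixed linear "downstream model" separates the two examples perfectly,
# even though the underlying token content is the same for both inputs:
# the weights act as an extra information channel, not a pure importance map.
w = np.array([1.0, -1.0])
score_0 = float(w @ pooled_0)
score_1 = float(w @ pooled_1)
```

Here `score_0` is positive and `score_1` is negative, so the classes are recoverable from the attention distribution alone — a loose caricature of why attention weights need not be faithful importance indicators.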