finite-state transducer
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
ShrutiSense: Microtonal Modeling and Correction in Indian Classical Music
Ghosh, Rajarshi, Athipatla, Jayanth
Indian classical music relies on a sophisticated microtonal system of 22 shrutis (pitch intervals), which provides expressive nuance beyond the 12-tone equal temperament system. Existing symbolic music processing tools fail to account for these microtonal distinctions and culturally specific raga grammars that govern melodic movement. We present ShrutiSense, a comprehensive symbolic pitch processing system designed for Indian classical music, addressing two critical tasks: (1) correcting westernized or corrupted pitch sequences, and (2) completing melodic sequences with missing values. Our approach employs complementary models for different tasks: a Shruti-aware finite-state transducer (FST) that performs contextual corrections within the 22-shruti framework and a grammar-constrained Shruti hidden Markov model (GC-SHMM) that incorporates raga-specific transition rules for contextual completions. Comprehensive evaluation on simulated data across five ragas demonstrates that ShrutiSense (FST model) achieves 91.3% shruti classification accuracy for correction tasks, with example sequences showing 86.7-90.0% accuracy at corruption levels of 0.2 to 0.4. The system exhibits robust performance under pitch noise up to +/-50 cents, maintaining consistent accuracy across ragas (90.7-91.8%), thus preserving the cultural authenticity of Indian classical music expression.
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Neural Search Space in Gboard Decoder
Zhang, Yanxiang, Zhang, Yuanbo, Sun, Haicheng, Wang, Yun, Dou, Billy, Sivek, Gary, Zhai, Shumin
Gboard Decoder produces suggestions by looking for paths that best match input touch points on the context aware search space, which is backed by the language Finite State Transducers (FST). The language FST is currently an N-gram language model (LM). However, N-gram LMs, limited in context length, are known to have sparsity problem under device model size constraint. In this paper, we propose \textbf{Neural Search Space} which substitutes the N-gram LM with a Neural Network LM (NN-LM) and dynamically constructs the search space during decoding. Specifically, we integrate the long range context awareness of NN-LM into the search space by converting its outputs given context, into the language FST at runtime. This involves language FST structure redesign, pruning strategy tuning, and data structure optimizations. Online experiments demonstrate improved quality results, reducing Words Modified Ratio by [0.26\%, 1.19\%] on various locales with acceptable latency increases. This work opens new avenues for further improving keyboard decoding quality by enhancing neural LM more directly.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain (0.04)
- Europe > Portugal (0.04)
Online Learning with Transductive Regret
We study online learning with the general notion of transductive regret, that is regret with modification rules applying to expert sequences (as opposed to single experts) that are representable by weighted finite-state transducers. We show how transductive regret generalizes existing notions of regret, including: (1) external regret; (2) internal regret; (3) swap regret; and (4) conditional swap regret. We present a general and efficient online learning algorithm for minimizing transductive regret. We further extend that to design efficient algorithms for the time-selection and sleeping expert settings. A by-product of our study is an algorithm for swap regret, which, under mild assumptions, is more efficient than existing ones, and a substantially more efficient algorithm for time selection swap regret.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
Large Language Models and the Extended Church-Turing Thesis
Wiedermann, Jiří, van Leeuwen, Jan
The Extended Church-Turing Thesis (ECTT) posits that all effective information processing, including unbounded and non-uniform interactive computations, can be described in terms of interactive Turing machines with advice. Does this assertion also apply to the abilities of contemporary large language models (LLMs)? From a broader perspective, this question calls for an investigation of the computational power of LLMs by the classical means of computability and computational complexity theory, especially the theory of automata. Along these lines, we establish a number of fundamental results. Firstly, we argue that any fixed (non-adaptive) LLM is computationally equivalent to a, possibly very large, deterministic finite-state transducer. This characterizes the base level of LLMs. We extend this to a key result concerning the simulation of space-bounded Turing machines by LLMs. Secondly, we show that lineages of evolving LLMs are computationally equivalent to interactive Turing machines with advice. The latter finding confirms the validity of the ECTT for lineages of LLMs. From a computability viewpoint, it also suggests that lineages of LLMs possess super-Turing computational power. Consequently, in our computational model knowledge generation is in general a non-algorithmic process realized by lineages of LLMs. Finally, we discuss the merits of our findings in the broader context of several related disciplines and philosophies.
- Europe > Netherlands (0.04)
- Europe > Czechia > Prague (0.04)
Structure-Aware Path Inference for Neural Finite State Transducers
Tan, Weiting, Lin, Chu-cheng, Eisner, Jason
Neural finite-state transducers (NFSTs) form an expressive family of neurosymbolic sequence transduction models. An NFST models each string pair as having been generated by a latent path in a finite-state transducer. As they are deep generative models, both training and inference of NFSTs require inference networks that approximate posterior distributions over such latent variables. In this paper, we focus on the resulting challenge of imputing the latent alignment path that explains a given pair of input and output strings (e.g., during training). We train three autoregressive approximate models for amortized inference of the path, which can then be used as proposal distributions for importance sampling. All three models perform lookahead. Our most sophisticated (and novel) model leverages the FST structure to consider the graph of future paths; unfortunately, we find that it loses out to the simpler approaches -- except on an artificial task that we concocted to confuse the simpler approaches.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
- Instructional Material > Course Syllabus & Notes (0.46)
- Research Report > Promising Solution (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Constraining Linear-chain CRFs to Regular Languages
Papay, Sean, Klinger, Roman, Padó, Sebastian
A major challenge in structured prediction is to represent the interdependencies within output structures. When outputs are structured as sequences, linear-chain conditional random fields (CRFs) are a widely used model class which can learn \textit{local} dependencies in the output. However, the CRF's Markov assumption makes it impossible for CRFs to represent distributions with \textit{nonlocal} dependencies, and standard CRFs are unable to respect nonlocal constraints of the data (such as global arity constraints on output labels). We present a generalization of CRFs that can enforce a broad class of constraints, including nonlocal ones, by specifying the space of possible output structures as a regular language $\mathcal{L}$. The resulting regular-constrained CRF (RegCCRF) has the same formal properties as a standard CRF, but assigns zero probability to all label sequences not in $\mathcal{L}$. Notably, RegCCRFs can incorporate their constraints during training, while related models only enforce constraints during decoding. We prove that constrained training is never worse than constrained decoding, and show empirically that it can be substantially better in practice. Additionally, we demonstrate a practical benefit on downstream tasks by incorporating a RegCCRF into a deep neural model for semantic role labeling, exceeding state-of-the-art results on a standard dataset.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- (14 more...)
Benchmarking Compositionality with Formal Languages
Valvoda, Josef, Saphra, Naomi, Rawski, Jonathan, Williams, Adina, Cotterell, Ryan
Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal languages. We use deterministic finite-state transducers to make an unbounded number of datasets with controllable properties governing compositionality. By randomly sampling over many transducers, we explore which of their properties contribute to learnability of a compositional relation by a neural network. We find that the models either learn the relations completely or not at all. The key is transition coverage, setting a soft learnability limit at 400 examples per transition.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (8 more...)
Building Metadata Inference Using a Transducer Based Language Model
Waterworth, David, Sethuvenkatraman, Subbu, Sheng, Quan Z.
Solving the challenges of automatic machine translation of Building Automation System text metadata is a crucial first step in efficiently deploying smart building applications. The vocabulary used to describe building metadata appears small compared to general natural languages, but each term has multiple commonly used abbreviations. Conventional machine learning techniques are inefficient since they need to learn many different forms for the same word, and large amounts of data must be used to train these models. It is also difficult to apply standard techniques such as tokenisation since this commonly results in multiple output tags being associated with a single input token, something traditional sequence labelling models do not allow. Finite State Transducers can model sequence-to-sequence tasks where the input and output sequences are different lengths, and they can be combined with language models to ensure a valid output sequence is generated. We perform a preliminary analysis into the use of transducer-based language models to parse and normalise building point metadata.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > California (0.04)
- Construction & Engineering (1.00)
- Energy (0.95)
- Information Technology > Smart Houses & Appliances (0.34)
Using Finite-State Machines to Automatically Scan Classical Greek Hexameter
Schumann, Anne-Kathrin, Beierle, Christoph, Blößner, Norbert
Greek literature has, for centuries, served as a paradigm and model for literary writing all over Europe. The oldest surviving texts of Classical Greek literature - texts such as the Iliad, the Odyssey, and the works of Hesiod - are epic poems that all share the same metre: hexameter. They are written in an artificial language that has never been spoken in everyday life and owes its origin and many of its peculiarities to the nature of metrically bound language (Meister (1921)). Comprehensive hexameter annotation is, therefore, crucial for large-scale and data-driven investigations into some of the linguistic features of Ancient Greek epic language. Furthermore, it may provide additional criteria for the evaluation of Homer's repeated verses, the so-called iterata. Within Classical Philology, controversy around the nature of the Homeric repetitions started in 1840, and it remained one of the central research questions in the field for a long period of time (see Strasser (1984), pp.
- North America > United States > Colorado (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Germany > Bavaria > Regensburg (0.04)
- (6 more...)