AITopics | reversibility

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)

Neural Information Processing SystemsFeb-13-2026, 01:36:31 GMT

6745cb9889cc213bda803535f2d3902e-Paper-Conference.pdf

artificial intelligence, integrator, machine learning, (15 more...)

Country:

Europe > France (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Neural Information Processing SystemsFeb-8-2026, 19:45:29 GMT

6d79e030371e47e6231337805a7a2685-Paper.pdf

dynamical system, kernel, trajectory, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Virginia (0.04)
(2 more...)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Neural Information Processing SystemsFeb-7-2026, 12:17:31 GMT

ThereIsNoTurningBack: ASelf-SupervisedApproachfor Reversibility-AwareReinforcementLearning

We propose to learn to distinguish reversible from irreversible actions for better informed decision-making in Reinforcement Learning (RL).

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

arXiv.org Artificial IntelligenceNov-27-2025

Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology

Schmitz, Tcharlies

This paper introduces the Harmonic Token Projection (HTP), a reversible and deterministic framework for generating text embeddings without training, vocabularies, or stochastic parameters. Unlike neural embeddings that rely on statistical co-occurrence or optimization, HTP encodes each token analytically as a harmonic trajectory derived from its Unicode integer representation, establishing a bijective and interpretable mapping between discrete symbols and continuous vector space. The harmonic formulation provides phase-coherent projections that preserve both structure and reversibility, enabling semantic similarity estimation from purely geometric alignment. Experimental evaluation on the Semantic Textual Similarity Benchmark (STS-B) and its multilingual extension shows that HTP achieves a Spearman correlation of \r{ho} = 0.68 in English, maintaining stable performance across ten languages with negligible computational cost and sub-millisecond latency per sentence pair. This demonstrates that meaningful semantic relations can emerge from deterministic geometry, offering a transparent and efficient alternative to data-driven embeddings. Keywords: Harmonic Token Projection, reversible embedding, deterministic encoding, semantic similarity, multilingual representation.

artificial intelligence, machine learning, natural language, (18 more...)

2511.20665

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceOct-31-2025

Modular Linear Tokenization (MLT)

Schmitz, Tcharlies

This paper introduces Modular Linear Tokenization (MLT), a reversible and deterministic technique for encoding high-cardinality categorical identifiers into compact numerical vectors. Unlike traditional hashing or one-hot encodings, MLT preserves bijective mappings by leveraging modular arithmetic over finite fields and invertible linear transformations. The method offers explicit control of dimensionality and computational scalability while maintaining full reversibility, even for millions of identifiers. Experimental results on the MovieLens 20M dataset show that MLT achieves comparable predictive performance to supervised embeddings while requiring significantly fewer parameters and lower training cost. An open-source implementation of MLT is available on PyPI (https://pypi.org/project/light-mlt/) and GitHub (https://github.com/tcharliesschmitz/light-mlt).

artificial intelligence, machine learning, natural language, (18 more...)

2510.25952

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Overmars, Maik, Goseling, Jasper, Boucherie, Richard

Convergence of off-policy TD(0) with linear function approximation for reversible Markov chains

arXiv.org Machine LearningOct-30-2025

We study the convergence of off-policy TD(0) with linear function approximation when used to approximate the expected discounted reward in a Markov chain. It is well known that the combination of off-policy learning and function approximation can lead to divergence of the algorithm. Existing results for this setting modify the algorithm, for instance by reweighing the updates using importance sampling. This establishes convergence at the expense of additional complexity. In contrast, our approach is to analyse the standard algorithm, but to restrict our attention to the class of reversible Markov chains. We demonstrate convergence under this mild reversibility condition on the structure of the chain, which in many applications can be assumed using domain knowledge. In particular, we establish a convergence guarantee under an upper bound on the discount factor in terms of the difference between the on-policy and off-policy process. This improves upon known results in the literature that state that convergence holds for a sufficiently small discount factor by establishing an explicit bound. Convergence is with probability one and achieves projected Bellman error equal to zero. To obtain these results, we adapt the stochastic approximation framework that was used by Tsitsiklis and Van Roy [1997 for the on-policy case, to the off-policy case. We illustrate our results using different types of reversible Markov chains, such as one-dimensional random walks and random walks on a weighted graph.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2510.25514

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.84)

Sorstkins, Andrejs, Tariq, Omer, Bilal, Muhammad

Learning to Undo: Rollback-Augmented Reinforcement Learning with Reversibility Signals

arXiv.org Artificial IntelligenceOct-17-2025

This paper proposes a reversible learning framework to improve the robustness and efficiency of value based Reinforcement Learning agents, addressing vulnerability to value overestimation and instability in partially irreversible environments. The framework has two complementary core mechanisms: an empirically derived transition reversibility measure called Phi of s and a, and a selective state rollback operation. We introduce an online per state action estimator called Phi that quantifies the likelihood of returning to a prior state within a fixed horizon K. This measure is used to adjust the penalty term during temporal difference updates dynamically, integrating reversibility awareness directly into the value function. The system also includes a selective rollback operator. When an action yields an expected return markedly lower than its instantaneous estimated value and violates a predefined threshold, the agent is penalized and returns to the preceding state rather than progressing. This interrupts sub optimal high risk trajectories and avoids catastrophic steps. By combining reversibility aware evaluation with targeted rollback, the method improves safety, performance, and stability. In the CliffWalking v0 domain, the framework reduced catastrophic falls by over 99.8 percent and yielded a 55 percent increase in mean episode return. In the Taxi v3 domain, it suppressed illegal actions by greater than or equal to 99.9 percent and achieved a 65.7 percent improvement in cumulative reward, while also sharply reducing reward variance in both environments. Ablation studies confirm that the rollback mechanism is the critical component underlying these safety and performance gains, marking a robust step toward safe and reliable sequential decision making.

machine learning, reinforcement learning, rollback, (17 more...)

2510.14503

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.83)

Industry:

Transportation > Passenger (0.48)
Transportation > Ground > Road (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Klinger, Eitan, Wadhwa, Vivaan, Park, Jungyeul

Punctuation-aware treebank tree binarization

arXiv.org Artificial IntelligenceOct-14-2025

This article presents a curated resource and evaluation suite for punctuation-aware treebank binarization. Standard binarization pipelines drop punctuation before head selection, which alters constituent shape and harms head-child identification. We release (1) a reproducible pipeline that preserves punctuation as sibling nodes prior to binarization, (2) derived artifacts and metadata (intermediate @X markers, reversibility signatures, alignment indices), and (3) an accompanying evaluation suite covering head-child prediction, round-trip reversibility, and structural compatibility with derivational resources (CCGbank). On the Penn Treebank, punctuation-aware preprocessing improves head prediction accuracy from 73.66\% (Collins rules) and 86.66\% (MLP) to 91.85\% with the same classifier, and achieves competitive alignment against CCGbank derivations. All code, configuration files, and documentation are released to enable replication and extension to other corpora.

artificial intelligence, machine learning, natural language, (17 more...)

2510.10951

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)