AITopics | ctm

Testing For Distribution Shifts with Conditional Conformal Test Martingales

Shaer, Shalev, Bar, Yarin, Prinster, Drew, Romano, Yaniv

arXiv.org Machine Learning, Feb-17-2026

We propose a sequential test for detecting arbitrary distribution shifts that allows conformal test martingales (CTMs) to work under a fixed, reference-conditional setting. Existing CTM detectors construct test martingales by continually growing a reference set with each incoming sample, using it to assess how atypical the new sample is relative to past observations. While this design yields anytime-valid type-I error control, it suffers from test-time contamination: after a change, post-shift observations enter the reference set and dilute the evidence for distribution shift, increasing detection delay and reducing power. In contrast, our method avoids contamination by design by comparing each new sample to a fixed null reference dataset. Our main technical contribution is a robust martingale construction that remains valid conditional on the null reference data, achieved by explicitly accounting for the estimation error in the reference distribution induced by the finite reference set. This yields anytime-valid type-I error control together with guarantees of asymptotic power one and bounded expected detection delay. Empirically, our method detects shifts faster than standard CTMs, providing a powerful and reliable distribution-shift detector.
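For orientation, the standard growing-reference CTM that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration, not the authors' proposed method: the function names, the choice of nonconformity score (larger means stranger), and the power betting function with exponent 0.5 are all assumptions made here for the example.

```python
import numpy as np

def conformal_p_values(scores, rng):
    """Smoothed conformal p-values with a growing reference set.

    Each score is ranked against all scores seen so far (itself included),
    so post-shift samples gradually enter the reference set.
    """
    p = np.empty(len(scores))
    for n, s in enumerate(scores):
        seen = scores[: n + 1]
        greater = np.sum(seen > s)
        ties = np.sum(seen == s)  # at least 1: the current score itself
        p[n] = (greater + rng.uniform() * ties) / (n + 1)
    return p

def log_power_martingale(p, eps=0.5):
    """Log of the power martingale, the running product of eps * p**(eps - 1).

    The betting function integrates to 1 on [0, 1], so the product is a
    test martingale when the p-values are i.i.d. uniform.
    """
    p = np.clip(p, 1e-12, 1.0)  # guard against log(0)
    return np.cumsum(np.log(eps) + (eps - 1.0) * np.log(p))
```

Rejecting when the martingale exceeds 1/alpha gives anytime-valid type-I error control by Ville's inequality. Because each post-shift score is ranked against a set that already contains earlier post-shift scores, the p-values drift back toward uniformity over time, which is the contamination effect the abstract describes.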

artificial intelligence, distribution shift, machine learning, (18 more...)

arXiv.org Machine Learning

2602.13848

Country:

Asia > Middle East > Israel (0.04)
North America > United States > Maryland > Baltimore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Conformal Blindness: A Note on $A$-Cryptic change-points

Szabadváry, Johan Hallberg

arXiv.org Machine Learning, Jan-23-2026

Conformal Test Martingales (CTMs) are a standard method within the Conformal Prediction framework for testing the crucial assumption of data exchangeability by monitoring deviations from uniformity in the p-value sequence. Although exchangeability implies uniform p-values, the converse does not hold. This raises the question of whether a significant break in exchangeability can occur, such that the p-values remain uniform, rendering CTMs blind. We answer this affirmatively, demonstrating the phenomenon of conformal blindness. Through explicit construction, for the theoretically ideal "predictive oracle" conformity measure (given by the true conditional density), we demonstrate the possibility of an $A$-cryptic change-point (where $A$ refers to the conformity measure). Using bivariate Gaussian distributions, we identify a line along which a change in the marginal means does not alter the distribution of the conformity scores, thereby producing perfectly uniform p-values. Simulations confirm that even a massive distribution shift can be perfectly cryptic to the CTM, highlighting a fundamental limitation and emphasising the critical role of the alignment of the conformity measure with potential shifts. By contrasting the predictive oracle with recent results on detection-optimal scores, we emphasise that validity monitoring in safety-critical systems requires careful separation of predictive and diagnostic goals.
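One way such a cryptic direction can arise is sketched below. It rests on assumptions made here rather than on the paper's stated construction: the conformity score is taken to be the pre-change conditional density of y given x, the covariance is assumed unchanged, and the shift direction (the regression line with slope rho * sigma_y / sigma_x) is this example's reading of "a line". The function names are hypothetical. For a bivariate Gaussian, that conditional density depends on (x, y) only through the regression residual, so any mean shift that leaves the residual untouched leaves the scores untouched.

```python
import numpy as np

def oracle_residual(x, y, mu, sigma_x, sigma_y, rho):
    """Residual of y around the pre-change conditional mean E[y | x].

    The Gaussian conditional density p(y | x) is a monotone function of the
    absolute value of this residual, so the residual determines the score.
    """
    beta = rho * sigma_y / sigma_x
    return (y - mu[1]) - beta * (x - mu[0])

def cryptic_mean_shift(delta, sigma_x, sigma_y, rho):
    """Mean shift of size delta in x, moved along the regression line.

    Assumption of this sketch: the covariance is unchanged by the shift.
    """
    beta = rho * sigma_y / sigma_x
    return np.array([delta, beta * delta])
```

Adding `cryptic_mean_shift(delta, ...)` to every sample changes both marginal means by an arbitrary amount while `oracle_residual` returns exactly the same values, so a martingale built on these scores sees nothing.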

artificial intelligence, machine learning, modeling & simulation, (16 more...)

arXiv.org Machine Learning

2601.01147

Country: Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Continuous Thought Machines

Darlow, Luke, Regan, Ciaran, Risi, Sebastian, Seely, Jeffrey, Jones, Llion

arXiv.org Artificial Intelligence, Oct-6-2025

Biological brains demonstrate complex neural activity, where neural dynamics are critical to how brains process information. Most artificial neural networks ignore the complexity of individual neurons. We challenge that paradigm. By incorporating neuron-level processing and synchronization, we reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process incoming histories; and (2) neural synchronization as a latent representation. The CTM aims to strike a balance between neuron abstractions and biological realism. It operates at a level of abstraction that effectively captures essential temporal dynamics while remaining computationally tractable. We demonstrate the CTM's performance and versatility across a range of tasks, including solving 2D mazes, ImageNet-1K classification, parity computation, and more. Beyond displaying rich internal representations and offering a natural avenue for interpretation owing to its internal process, the CTM is able to perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances. The goal of this work is to share the CTM and its associated innovations, rather than pushing for new state-of-the-art results. To that end, we believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems. We provide an accompanying interactive online demonstration at https://pub.sakana.ai/ctm/ and an extended technical report at https://pub.sakana.ai/ctm/paper .
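The two innovations named in the abstract can be illustrated with a toy sketch. It is not the released CTM architecture: the single linear layer per neuron, the tanh nonlinearity, and the use of plain inner products over time as the synchronization measure are assumptions made here, and both function names are hypothetical.

```python
import numpy as np

def neuron_level_step(pre_history, weights, bias):
    """Per-neuron temporal processing.

    pre_history: (D, M) last M incoming values for each of D neurons
    weights:     (D, M) one private weight vector per neuron
    bias:        (D,)
    Each neuron applies its own weights to its own history.
    """
    return np.tanh(np.einsum("dm,dm->d", pre_history, weights) + bias)

def synchronization(post_history):
    """Pairwise synchronization of neuron activity over time.

    post_history: (D, T) activations of D neurons across T internal steps.
    Returns a (D, D) matrix of inner products between activation traces,
    which a downstream layer could read as the latent representation.
    """
    return post_history @ post_history.T
```

The point of the sketch is the shape of the computation: weights are indexed by neuron rather than shared across a layer, and the representation is a function of how activity traces co-vary over time rather than of a single activation vector.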

internal tick, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2505.05522

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)

285f89b802bcb2651801455c86d78f2a-Reviews.html

Neural Information Processing Systems, Oct-3-2025, 07:53:23 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors describe two novel inference methods for the correlated topic model (CTM). They build on analytic results for the conditional logistic normal likelihood to arrive at a fast, easily parallelized exact inference. This leads to an approximate sampling method for producing Polya-Gamma variates. Finally, they propose a method for efficiently drawing samples in the presence of sparsity.

algorithm, inference, topic model, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.04)

Genre:

Overview (0.35)
Research Report (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Elucidating the Preconditioning in Consistency Distillation

Zheng, Kaiwen, He, Guande, Chen, Jianfei, Bao, Fan, Zhu, Jun

arXiv.org Artificial Intelligence, Feb-5-2025

Consistency distillation is a prevalent way of accelerating diffusion models, adopted in consistency (trajectory) models, in which a student model is trained to traverse backward on the probability flow (PF) ordinary differential equation (ODE) trajectory determined by the teacher model. Preconditioning is a vital technique for stabilizing consistency distillation: it linearly combines the input data and the network output with pre-defined coefficients to form the consistency function. It imposes the boundary condition of consistency functions without restricting the form and expressiveness of the neural network. However, previous preconditionings are hand-crafted and may be suboptimal choices. In this work, we offer the first theoretical insights into the preconditioning in consistency distillation, by elucidating its design criteria and the connection to the teacher ODE trajectory. Based on these analyses, we further propose a principled way dubbed Analytic-Precond to analytically optimize the preconditioning according to the consistency gap (defined as the gap between the teacher denoiser and the optimal student denoiser) on a generalized teacher ODE. We demonstrate that Analytic-Precond can facilitate the learning of trajectory jumpers, enhance the alignment of the student trajectory with the teacher's, and achieve $2\times$ to $3\times$ training acceleration of consistency trajectory models in multi-step generation across various datasets.
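To make the role of preconditioning concrete, here is a small sketch of a hand-crafted scheme of the kind the abstract calls possibly suboptimal. The coefficients follow the form used in the original Consistency Models work as recalled here (with sigma_data = 0.5 and minimum time eps = 0.002 taken as assumed defaults); they are not Analytic-Precond, and `network` is a hypothetical stand-in for the trained model.

```python
import math

SIGMA_DATA = 0.5   # assumed data standard deviation
EPS = 0.002        # assumed smallest time on the trajectory

def c_skip(t):
    """Weight on the input; equals 1 at t = EPS."""
    return SIGMA_DATA ** 2 / ((t - EPS) ** 2 + SIGMA_DATA ** 2)

def c_out(t):
    """Weight on the network output; equals 0 at t = EPS."""
    return SIGMA_DATA * (t - EPS) / math.sqrt(SIGMA_DATA ** 2 + t ** 2)

def consistency_function(x, t, network):
    """Linear combination of the input and the network output."""
    return c_skip(t) * x + c_out(t) * network(x, t)
```

Because `c_skip(EPS)` is 1 and `c_out(EPS)` is 0, the boundary condition f(x, EPS) = x holds for any network, which is why the constraint does not restrict the network's form.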

artificial intelligence, diffusion model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.02922

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation

Novack, Zachary, McAuley, Julian, Berg-Kirkpatrick, Taylor, Bryan, Nicholas

arXiv.org Artificial Intelligence, May-30-2024

Controllable music generation methods are critical for human-centered AI-based music creation, but are currently limited by speed, quality, and control design trade-offs. Diffusion Inference-Time T-optimization (DITTO), in particular, offers state-of-the-art results, but is over 10x slower than real-time, limiting practical use. We propose Distilled Diffusion Inference-Time T-Optimization (or DITTO-2), a new method to speed up inference-time optimization-based control and unlock faster-than-real-time generation for a wide variety of applications such as music inpainting, outpainting, intensity, melody, and musical structure control. Our method works by (1) distilling a pre-trained diffusion model for fast sampling via an efficient, modified consistency or consistency trajectory distillation process, (2) performing inference-time optimization using our distilled model with one-step sampling as an efficient surrogate optimization task, and (3) running a final multi-step sampling generation (decoding) using our estimated noise latents for best-quality, fast, controllable generation. Through thorough evaluation, we find our method not only speeds up generation over 10-20x, but also improves control adherence and generation quality. Furthermore, we apply our approach to a new application of maximizing text adherence (CLAP score) and show we can convert an unconditional diffusion model without text inputs into a model that yields state-of-the-art text control. Sound examples can be found at https://ditto-music.github.io/ditto2/.

ditto-2, latexit sha1, music generation, (13 more...)

arXiv.org Artificial Intelligence

2405.20289

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.40)

Industry:

Media > Music (0.73)
Leisure & Entertainment (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Beyond Automated Evaluation Metrics: Evaluating Topic Models On Practical Social Science Content Analysis Tasks

Li, Zongxia, Mao, Andrew, Stephens, Daniel, Goel, Pranav, Walpole, Emily, Dima, Alden, Fung, Juan, Boyd-Graber, Jordan

arXiv.org Artificial Intelligence, Jan-29-2024

Topic models are a popular tool for understanding text collections, but their evaluation has been a point of contention. Automated evaluation metrics such as coherence are often used; however, their validity has been questioned for neural topic models (NTMs), and they can overlook the benefits of a model in real-world applications. To this end, we conduct the first evaluation of neural, supervised and classical topic models in an interactive task-based setting. We combine topic models with a classifier and test their ability to help humans conduct content analysis and document annotation. From simulated, real user and expert pilot studies, the Contextual Neural Topic Model does the best on cluster evaluation metrics and human evaluations; however, LDA is competitive with two other NTMs under our simulated experiment and user study results, contrary to what coherence scores suggest. We show that current automated metrics do not provide a complete picture of topic modeling capabilities, but the right choice of NTMs can be better than classical models on practical tasks.

classifier, lda, topic model, (17 more...)

arXiv.org Artificial Intelligence

2401.16348

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York > New York County > New York City (0.04)
(9 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry:

Law > Statutes (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

Kim, Dongjun, Lai, Chieh-Hsin, Liao, Wei-Hsiang, Murata, Naoki, Takida, Yuhta, Uesaka, Toshimitsu, He, Yutong, Mitsufuji, Yuki, Ermon, Stefano

arXiv.org Machine Learning, Oct-1-2023

Consistency Models (CM) (Song et al., 2023) accelerate score-based diffusion model sampling at the cost of sample quality but lack a natural way to trade off quality for speed. To address this limitation, we propose Consistency Trajectory Model (CTM), a generalization encompassing CM and score-based models as special cases. CTM trains a single neural network that can -- in a single forward pass -- output scores (i.e., gradients of log-density) and enables unrestricted traversal between any initial and final time along the Probability Flow Ordinary Differential Equation (ODE) in a diffusion process. CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance and achieves new state-of-the-art FIDs for single-step diffusion model sampling on CIFAR-10 (FID 1.73) and ImageNet at 64×64 resolution (FID 2.06). CTM also enables a new family of sampling schemes, both deterministic and stochastic, involving long jumps along the ODE solution trajectories. It consistently improves sample quality as computational budgets increase, avoiding the degradation seen in CM. Furthermore, CTM's access to the score accommodates all diffusion model inference techniques, including exact likelihood computation.
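The "unrestricted traversal between any initial and final time" can be pictured with a toy jump function. The parameterization below, a mix of the current point and a network output weighted by s/t, is the form recalled here from the CTM paper and should be read as an assumption; `g` is a hypothetical stand-in for the trained network, and the constant `g` in the usage note is exact only for a data distribution concentrated on a single point.

```python
def jump(x_t, t, s, g):
    """Move a sample from time t to an earlier time s along the PF ODE.

    Assumed form: G(x_t, t, s) = (s / t) * x_t + (1 - s / t) * g(x_t, t, s).
    With s == t the sample is returned unchanged; with s == 0 the output
    is the network's prediction of the clean sample.
    """
    ratio = s / t
    return ratio * x_t + (1.0 - ratio) * g(x_t, t, s)
```

With a constant network such as `g = lambda x, t, s: 2.0`, jumping from t = 4 to s = 1 and then to 0.5 lands on the same point as jumping from 4 directly to 0.5, which is the trajectory-consistency property that makes long jumps and multi-step sampling interchangeable in this idealized case.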

artificial intelligence, ctm, machine learning, (15 more...)

arXiv.org Machine Learning

2310.02279

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for CTR Prediction

Zhu, Chen, Du, Liang, Chen, Hong, Zhao, Shuang, Sun, Zixun, Wang, Xin, Zhu, Wenwu

arXiv.org Artificial Intelligence, Sep-5-2023

Click-Through Rate (CTR) prediction is a pivotal task in product and content recommendation, where learning effective feature embeddings is of great significance. However, traditional methods typically learn fixed feature representations without dynamically refining feature representations according to the context information, leading to suboptimal performance. Some recent approaches attempt to address this issue by learning bit-wise weights or augmented embeddings for feature representations, but suffer from uninformative or redundant features in the context. To tackle this problem, inspired by the Global Workspace Theory in conscious processing, which posits that only a specific subset of the product features are pertinent while the rest can be noisy and even detrimental to human-click behaviors, we propose a CTR model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction, termed DELTA. DELTA contains two key components: (I) conscious truncation module (CTM), which utilizes curriculum learning to apply adaptive truncation on attention weights to select the most critical feature in the context; (II) explicit embedding optimization (EEO), which applies an auxiliary task during training that directly and independently propagates the gradient from the loss layer to the embedding layer, thereby optimizing the embedding explicitly via linear feature crossing. Extensive experiments on five challenging CTR datasets demonstrate that DELTA achieves new state-of-the-art performance among current CTR methods.

artificial intelligence, interaction, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.04891

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

A Cosine Similarity-based Method for Out-of-Distribution Detection

Ngoc-Hieu, Nguyen, Hung-Quang, Nguyen, Ta, The-Anh, Nguyen-Tang, Thanh, Doan, Khoa D, Thanh-Tung, Hoang

arXiv.org Artificial Intelligence, Jun-23-2023

The ability to detect out-of-distribution (OOD) data is a crucial aspect of practical machine learning applications. In this work, we show that cosine similarity between the test feature and the typical in-distribution (ID) feature is a good indicator of OOD data. We propose Class Typical Matching (CTM), a post hoc OOD detection algorithm that uses a cosine similarity scoring function. Extensive experiments on multiple benchmarks show that CTM outperforms existing post hoc OOD detection methods.
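A cosine-similarity score of this kind might look like the sketch below. The abstract does not say how the "typical" feature is obtained, so using the per-class mean of training features is an assumption of this example, as are the function names and the choice to take the maximum over classes.

```python
import numpy as np

def class_prototypes(features, labels):
    """One typical feature per class, here assumed to be the class mean.

    features: (N, d) in-distribution training features
    labels:   (N,) integer class labels
    """
    classes = np.unique(labels)
    return np.stack([features[labels == c].mean(axis=0) for c in classes])

def cosine_ood_score(x, prototypes):
    """Largest cosine similarity between a test feature and any prototype.

    Higher means more in-distribution; thresholding the score gives a
    post hoc detector that needs no retraining.
    """
    x = x / np.linalg.norm(x)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return float(np.max(p @ x))
```

Because only the direction of the feature matters, the score is insensitive to feature norm, which is one reason a cosine-based rule can behave differently from distance- or logit-based scores.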

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.1492

Country:

South America > Peru > Loreto Department (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > Mexico > Gulf of Mexico (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
