AITopics | sdt

Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion

Neural Information Processing Systems, Jun-16-2026, 02:07:07 GMT

Discrete diffusion models, like continuous diffusion models, generate high-quality samples by gradually undoing noise applied to datapoints with a Markov process. Gradual generation in theory comes with many conceptual benefits; for example, inductive biases can be incorporated into the noising Markov process, and improved sampling algorithms become available. In practice, however, the consistently best performing discrete diffusion model is, surprisingly, masking diffusion, which does not denoise gradually. Here we explain the superior performance of masking diffusion by noting that it makes use of a fundamental difference between continuous and discrete Markov processes: discrete Markov processes evolve by discontinuous jumps at a fixed rate and, unlike other discrete diffusion models, masking diffusion builds in the known distribution of jump times and only learns where to jump to. We show that we can similarly bake the known distribution of jump times into any discrete diffusion model. The resulting models -- schedule-conditioned diffusion (SCUD) -- generalize classical discrete diffusion and masking diffusion. By applying SCUD to models with noising processes that incorporate inductive biases on images, text, and protein data, we build models that outperform masking.
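The point about jump times can be illustrated with a minimal sketch. It is not the SCUD implementation; it assumes a single constant jump rate, and the function name `sample_jump_times` and its parameters are hypothetical. With a fixed rate, the waiting times between jumps are exponentially distributed regardless of which states are visited, so the schedule can be sampled without a learned model.

```python
import random

def sample_jump_times(rate, horizon, rng):
    """Sample jump times of a constant-rate Markov jump process on [0, horizon).

    Assumption for illustration: one fixed rate, so holding times are
    i.i.d. Exponential(rate) and do not depend on the current state.
    """
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(0)
schedule = sample_jump_times(rate=2.0, horizon=5.0, rng=rng)
# The schedule is drawn from a known distribution; a model conditioned on it
# would only need to predict where each jump lands.
print(len(schedule), "jumps at", [round(t, 2) for t in schedule])
```

Because the schedule comes from a known distribution, only the destination of each jump remains to be learned, which is the division of labor the abstract attributes to masking diffusion.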

artificial intelligence, forward process, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

A Comprehensive Survey on Surgical Digital Twin

Khan, Afsah Sharaf, Fan, Falong, Kim, Doohwan DH, Alshareef, Abdurrahman, Chen, Dong, Kim, Justin, Carter, Ernest, Liu, Bo, Rozenblit, Jerzy W., Zeigler, Bernard

arXiv.org Artificial Intelligence, Dec-2-2025

Such models are integral to the development of context-aware surgical training systems and process monitoring platforms [11], [19] as well as for encoding adaptive robotic control policies in teleoperated environments [13], [20], [78]. However, their limited capacity to capture continuous biophysical dynamics can constrain their utility in applications where physiological fidelity is essential. Recognizing the limitations inherent in purely continuous or discrete approaches, hybrid modeling strategies have emerged as a state-of-the-art solution for surgical digital twins. These frameworks integrate continuous dynamic models with discrete state machines, enabling the simultaneous tracking of physiological changes and procedural events [8], [7], [19], [37]. For example, hybrid automata have been deployed to synchronize real-time updates of tissue deformation with the sequencing of surgical tool actions [7], [19]. This integration allows digital twins to provide context-sensitive support, adapting to abrupt workflow transitions and physiological perturbations alike--a critical requirement in both routine and emergent surgical scenarios [8], [11], [7].

B. Mutual Information and Information-Theoretic Approaches

With the proliferation of multi-modal surgical data, information-theoretic concepts have become indispensable for quantifying uncertainty, relevance, and redundancy across heterogeneous information streams. Mutual information I(X; Y) has been adopted as a rigorous metric for selecting the most informative sensors, imaging modalities, or clinical parameters, thereby enhancing the efficiency and robustness of digital twin-enabled decision support [2], [3], [13], [34], [11], [51], [48], [26], [29]. This is formally captured as Eq.
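The excerpt breaks off before its equation, so the following is not the survey's own formula. It is a small sketch of the textbook definition of mutual information for discrete variables, I(X; Y) = sum over x, y of p(x, y) log[p(x, y) / (p(x) p(y))], computed from a joint probability table. The function name and the two toy tables are assumptions made for illustration.

```python
import math

def mutual_information(joint):
    """Return I(X; Y) in nats for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal p(x)
    py = [sum(col) for col in zip(*joint)]    # marginal p(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                     # 0 * log(0) is taken as 0
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi

# Two hypothetical sensor pairs: one independent, one fully redundant.
independent = [[0.25, 0.25], [0.25, 0.25]]
redundant = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent))  # no shared information
print(mutual_information(redundant))    # equals log(2) nats
```

In a sensor-selection setting, a stream whose mutual information with an already selected stream is close to its own entropy adds little new information, while a stream with high mutual information with the target variable is a strong candidate.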

decision support system, machine learning, real time system, (20 more...)

arXiv.org Artificial Intelligence

2512.00019

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Modeling & Simulation (1.00)
(10 more...)

Subjective Depth and Timescale Transformers: Learning Where and When to Compute

Wieser, Frederico, Benfeghoul, Martin, Ammar, Haitham Bou, Wang, Jun, Fountas, Zafeirios

arXiv.org Artificial Intelligence, Nov-27-2025

The rigid, uniform allocation of computation in standard Transformer (TF) architectures can limit their efficiency and scalability, particularly for large-scale models and long sequences. Addressing this, we introduce Subjective Depth Transformers (SDT) and Subjective Timescale Transformers (STT), two distinct architectures that leverage Bayesian surprise signals to dynamically route computation, learning where and when to compute within decoder-only TFs. SDT augments a decoder-only stack with alternating Decision and Dynamic layers: a Decision layer computes a full block 'posterior' and a lightweight 'prior,' while a Dynamic layer employs fixed-capacity Top-K routing based on Bayesian surprise (Expected and Unexpected Change), maintaining a static compute graph. STT extends this conditional computation to the temporal domain: a transition network predicts residual updates, forming a temporal 'change hypothesis' that informs a router to dynamically execute or bypass TF blocks for each token, managing KV-cache contributions. Both architectures exhibit the predicted shift from novelty-driven to prediction-driven gating over training, suggesting alignment with surprise-based principles. While operating at reduced capacity, they offer preliminary insights into the compute-accuracy trade-offs of conditional computation. The proposed architectures establish a flexible framework for efficiency, reducing self-attention computation by 75% and KV-cache requirements by 50% within each compute-skipping layer, setting a pathway for more efficient models.
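Fixed-capacity Top-K routing can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the surprise scores are supplied as plain numbers (how the paper derives them from the 'prior' and 'posterior' is not shown here), and the names `top_k_route` and `dynamic_layer` are hypothetical.

```python
def top_k_route(surprise, k):
    """Return the indices of the k highest-surprise tokens, in sequence order."""
    ranked = sorted(range(len(surprise)), key=lambda i: surprise[i], reverse=True)
    return sorted(ranked[:k])

def dynamic_layer(hidden, surprise, k, block):
    """Apply `block` to the routed tokens only; all others pass through unchanged."""
    routed = set(top_k_route(surprise, k))
    return [block(h) if i in routed else h for i, h in enumerate(hidden)]

hidden = [1.0, 2.0, 3.0, 4.0]          # stand-in for per-token hidden states
surprise = [0.1, 0.9, 0.5, 0.3]        # assumed per-token surprise scores
out = dynamic_layer(hidden, surprise, k=2, block=lambda h: h * 10.0)
print(out)  # tokens 1 and 2 are processed: [1.0, 20.0, 30.0, 4.0]
```

Because k is fixed ahead of time, the number of tokens sent through the block is the same for every input, which is what allows the compute graph to stay static.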

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.21408

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Grounding Language Models with Semantic Digital Twins for Robotic Planning

Naeem, Mehreen, Melnik, Andrew, Beetz, Michael

arXiv.org Artificial Intelligence, Jun-23-2025

We introduce a novel framework that integrates Semantic Digital Twins (SDTs) with Large Language Models (LLMs) to enable adaptive and goal-driven robotic task execution in dynamic environments. The system decomposes natural language instructions into structured action triplets, which are grounded in contextual environmental data provided by the SDT. This semantic grounding allows the robot to interpret object affordances and interaction rules, enabling action planning and real-time adaptability. In case of execution failures, the LLM utilizes error feedback and SDT insights to generate recovery strategies and iteratively revise the action plan. We evaluate our approach using tasks from the ALFRED benchmark, demonstrating robust performance across various household scenarios. The proposed framework effectively combines high-level reasoning with semantic environment understanding, achieving reliable task completion in the face of uncertainty and failure.

large language model, natural language, semantic digital twin, (16 more...)

arXiv.org Artificial Intelligence

2506.16493

Country: Europe > Germany > Bremen > Bremen (0.28)

Genre:

Research Report (0.68)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)

Towards a Formal Theory of the Need for Competence via Computational Intrinsic Motivation

Lintunen, Erik M., Ady, Nadia M., Deterding, Sebastian, Guckelsberger, Christian

arXiv.org Artificial Intelligence, Feb-11-2025

Computational models offer powerful tools for formalising psychological theories, making them both testable and applicable in digital contexts. However, they remain little used in the study of motivation within psychology. We focus on the "need for competence", postulated as a key basic human need within Self-Determination Theory (SDT) -- arguably the most influential psychological framework for studying intrinsic motivation (IM). The need for competence is treated as a single construct across SDT texts. Yet, recent research has identified multiple, ambiguously defined facets of competence in SDT. We propose that these inconsistencies may be alleviated by drawing on computational models from the field of artificial intelligence, specifically from the domain of reinforcement learning (RL). By aligning the aforementioned facets of competence -- effectance, skill use, task performance, and capacity growth -- with existing RL formalisms, we provide a foundation for advancing competence-related theory in SDT and motivational psychology more broadly. The formalisms reveal underlying preconditions that SDT fails to make explicit, demonstrating how computational models can improve our understanding of IM. Additionally, our work can support a cycle of theory development by inspiring new computational models formalising aspects of the theory, which can then be tested empirically to refine the theory. While our research lays a promising foundation, empirical studies of these models in both humans and machines are needed, inviting collaboration across disciplines.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2502.07423

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting

Zhang, Liyun, Ding, Dian, Lu, Yu, Chen, Yi-Chao, Xue, Guangtao

arXiv.org Artificial Intelligence, Nov-26-2024

Understanding the emotions in a dialogue usually requires external knowledge to interpret its contents accurately. As LLMs become more and more powerful, we do not want to settle for the limited ability of a pre-trained language model. However, LLMs either can process only the text modality or are too expensive to apply to multimedia information. We aim to utilize both the power of LLMs and the supplementary features from the multimedia modalities. In this paper, we present a framework, Lantern, that can improve the performance of a given vanilla model by prompting large language models with receptive-field-aware attention weighting. The framework trains a multi-task vanilla model to produce probabilities of emotion classes and dimension scores. These predictions are fed into the LLMs as references, and the LLMs adjust the predicted probability of each emotion class using their external knowledge and contextual understanding. We slice the dialogue into different receptive fields, and each sample is included in exactly t receptive fields. Finally, the predictions of the LLMs are merged by a receptive-field-aware attention-driven weighting module. In the experiments, the vanilla models CORECT and SDT are deployed in Lantern with GPT-4 or Llama-3.1-405B. Experiments on IEMOCAP with 4-way and 6-way settings demonstrate that Lantern can significantly improve the performance of current vanilla models, by up to 1.23% and 1.80%.

llm, prediction, vanilla model, (17 more...)

arXiv.org Artificial Intelligence

2411.17674

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning

Guo, Zijian, Zhou, Weichao, Li, Wenchao

arXiv.org Artificial Intelligence, Feb-27-2024

Offline safe reinforcement learning (RL) aims to train a constraint satisfaction policy from a fixed dataset. Current state-of-the-art approaches are based on supervised learning with a conditioned policy. However, these approaches fall short in real-world applications that involve complex tasks with rich temporal and logical structures. In this paper, we propose temporal logic Specification-conditioned Decision Transformer (SDT), a novel framework that harnesses the expressive power of signal temporal logic (STL) to specify complex temporal rules that an agent should follow and the sequential modeling capability of Decision Transformer (DT). Empirical evaluations on the DSRL benchmarks demonstrate the better capacity of SDT in learning safe and high-reward policies compared with existing approaches. In addition, SDT shows good alignment with respect to different desired degrees of satisfaction of the STL specification that it is conditioned on.
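The "degree of satisfaction" of an STL specification is commonly measured by its quantitative robustness. The sketch below uses the standard robustness semantics for two simple operators over a sampled trajectory; it is an illustration only and does not reproduce how SDT conditions on the specification. The function names and the toy `cost` signal are assumptions.

```python
def robustness_always_below(signal, limit):
    """Robustness of 'always (x < limit)': the worst-case margin over time."""
    return min(limit - x for x in signal)

def robustness_eventually_above(signal, goal):
    """Robustness of 'eventually (x > goal)': the best-case margin over time."""
    return max(x - goal for x in signal)

cost = [0.2, 0.5, 0.9]  # hypothetical per-step cost along one trajectory
# A positive value means the specification holds with that much margin;
# a negative value means it is violated by that much.
print(robustness_always_below(cost, limit=1.0))
print(robustness_eventually_above(cost, goal=0.8))
```

A scalar of this kind can serve as a conditioning target in the same way a return-to-go does in a Decision Transformer, although the abstract does not spell out the exact mechanism.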

decision transformer, suffix, trajectory, (10 more...)

arXiv.org Artificial Intelligence

2402.17217

Country:

Europe > Russia (0.04)
Europe > France (0.04)
Asia > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Sequential Detection and Tracking of Very Low SNR Objects

Rezaie, Reza

arXiv.org Artificial Intelligence, Dec-25-2023

A sequential detection and tracking (SDT) approach is proposed for detection and tracking of very low signal-to-noise ratio (SNR) objects. The proposed approach is compared with two existing particle filter track-before-detect (TBD) methods. It is shown that the former outperforms the latter. A conventional detection and tracking (CDT) approach, based on one-data-frame thresholding, is considered as a benchmark for comparison. Simulations demonstrate the performance.

conditionally markov sequence, rmse, snr, (14 more...)

arXiv.org Artificial Intelligence

2312.15823

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
North America > United States > New York > Monroe County > Rochester (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.74)

Exact and Cost-Effective Automated Transformation of Neural Network Controllers to Decision Tree Controllers

Chang, Kevin, Dahlin, Nathan, Jain, Rahul, Nuzzo, Pierluigi

arXiv.org Artificial Intelligence, Sep-15-2023

Over the past decade, neural network (NN)-based controllers have demonstrated remarkable efficacy in a variety of decision-making tasks. However, their black-box nature and the risk of unexpected behaviors and surprising results pose a challenge to their deployment in real-world systems with strong guarantees of correctness and safety. We address these limitations by investigating the transformation of NN-based controllers into equivalent soft decision tree (SDT)-based controllers and its impact on verifiability. Unlike previous approaches, we focus on discrete-output NN controllers that include rectified linear unit (ReLU) activation functions as well as argmax operations. We then devise an exact but cost-effective transformation algorithm, in that it can automatically prune redundant branches. We evaluate our approach using two benchmarks from the OpenAI Gym environment. Our results indicate that the SDT transformation can benefit formal verification, showing runtime improvements of up to 21x and 2x for MountainCar-v0 and CartPole-v0, respectively.

controller, node, sdt, (14 more...)

arXiv.org Artificial Intelligence

2304.06049

Country:

North America > United States > California (0.14)
North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Towards Ubiquitous Semantic Metaverse: Challenges, Approaches, and Opportunities

Li, Kai, Lau, Billy Pik Lik, Yuan, Xin, Ni, Wei, Guizani, Mohsen, Yuen, Chau

arXiv.org Artificial Intelligence, Aug-5-2023

In recent years, ubiquitous semantic Metaverse has been studied to revolutionize immersive cyber-virtual experiences for augmented reality (AR) and virtual reality (VR) users, which leverages advanced semantic understanding and representation to enable seamless, context-aware interactions within mixed-reality environments. This survey focuses on the intelligence and spatio-temporal characteristics of four fundamental system components in ubiquitous semantic Metaverse, i.e., artificial intelligence (AI), spatio-temporal data representation (STDR), semantic Internet of Things (SIoT), and semantic-enhanced digital twin (SDT). We thoroughly survey the representative techniques of the four fundamental system components that enable intelligent, personalized, and context-aware interactions with typical use cases of the ubiquitous semantic Metaverse, such as remote education, work and collaboration, entertainment and socialization, healthcare, and e-commerce marketing. Furthermore, we outline the opportunities for constructing the future ubiquitous semantic Metaverse, including scalability and interoperability, privacy and security, performance measurement and standardization, as well as ethical considerations and responsible AI. Addressing those challenges is important for creating a robust, secure, and ethically sound system environment that offers engaging immersive experiences for the users and AR/VR applications.

artificial intelligence, machine learning, metaverse, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/JIOT.2023.3302159

2307.06687

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Singapore (0.04)
South America > Ecuador > Guayas Province > Guayaquil (0.04)
(9 more...)

Genre:

Overview (1.00)
Research Report (0.81)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)
Energy (1.00)
(2 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
(2 more...)
