Tractable Learning for Complex Probability Queries

Jessa Bekker, Jesse Davis, Arthur Choi, Adnan Darwiche, Guy Van den Broeck

Neural Information Processing Systems

Tractable learning aims to learn probabilistic models where inference is guaranteed to be efficient. However, the particular class of queries that is tractable depends on the model and underlying representation. Usually this class consists of MPE or conditional probabilities Pr(x | y) for joint assignments x, y. We propose a tractable learner that guarantees efficient inference for a broader class of queries. It simultaneously learns a Markov network and its tractable circuit representation, in order to guarantee and measure tractability. Our approach differs from earlier work by using Sentential Decision Diagrams (SDDs) as the tractable language instead of Arithmetic Circuits (ACs). SDDs have desirable properties, which more general representations such as ACs lack, that enable basic primitives for Boolean circuit compilation. This allows us to support a broader class of complex probability queries, including counting, threshold, and parity queries, in polytime.
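To make the query classes concrete, the sketch below spells out what "threshold" and "parity" probability queries mean on a toy distribution. It enumerates a fully tabulated joint over four binary variables; this brute-force enumeration is exactly what an SDD-based circuit avoids doing exponentially, so the code only illustrates the query semantics, not the paper's algorithm. The distribution itself is an arbitrary assumption for illustration.

```python
import itertools

# Hypothetical toy joint distribution over 4 binary variables, given as a
# fully enumerated probability table. A tractable circuit representation
# would answer these queries without this exponential enumeration.
def make_toy_distribution():
    states = list(itertools.product([0, 1], repeat=4))
    weights = [1 + sum(s) for s in states]   # arbitrary positive weights
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}

def threshold_query(dist, k):
    # "threshold" query: Pr(at least k variables are true)
    return sum(p for s, p in dist.items() if sum(s) >= k)

def parity_query(dist):
    # "parity" query: Pr(an odd number of variables are true)
    return sum(p for s, p in dist.items() if sum(s) % 2 == 1)

dist = make_toy_distribution()
```

On a circuit representation these queries become polytime because the threshold/parity constraint can itself be compiled into an SDD and conjoined with the model, which is the kind of Boolean primitive the abstract refers to.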


Scale-Wise VAR is Secretly Discrete Diffusion

Kumar, Amandeep, Nair, Nithin Gopalakrishnan, Patel, Vishal M.

arXiv.org Artificial Intelligence

Autoregressive (AR) transformers have emerged as a powerful paradigm for visual generation, largely due to their scalability, computational efficiency, and unified architecture with language and vision. Among them, next-scale-prediction Visual Autoregressive Generation (VAR) has recently demonstrated remarkable performance, even surpassing diffusion-based models. In this work, we revisit VAR and uncover a theoretical insight: when equipped with a Markovian attention mask, VAR is mathematically equivalent to a discrete diffusion process. We term this reinterpretation Scalable Visual Refinement with Discrete Diffusion (SRDD), establishing a principled bridge between AR transformers and diffusion models. Leveraging this new perspective, we show how one can directly import the advantages of diffusion, such as iterative refinement, into VAR and reduce architectural inefficiencies, yielding faster convergence, lower inference cost, and improved zero-shot reconstruction. Across multiple datasets, we show that the diffusion-based perspective of VAR leads to consistent gains in efficiency and generation.
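The Markovian attention mask in the abstract can be sketched directly: tokens of scale k are allowed to attend only to tokens of the previous scale k-1 (and to their own scale), so each scale's generation depends on the last one alone, mirroring a discrete diffusion chain. The scale sizes below are toy values, not the paper's configuration.

```python
import numpy as np

# Toy scale sizes: 1 token at the coarsest scale, then 4, then 9.
scales = [1, 4, 9]
ids = np.concatenate([[k] * n for k, n in enumerate(scales)])

# mask[i, j] = True  ->  token i may attend to token j.
# A token attends to its own scale and to the immediately preceding scale,
# giving the Markovian (scale-to-scale) dependency structure.
mask = (ids[:, None] == ids[None, :]) | (ids[:, None] == ids[None, :] + 1)
```

Under this mask, scale 2 tokens never see scale 0 directly, which is the property that collapses the full autoregressive history into a one-step transition.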


SDD: Self-Degraded Defense against Malicious Fine-tuning

Chen, Zixuan, Lu, Weikai, Lin, Xin, Zeng, Ziqian

arXiv.org Artificial Intelligence

Open-source Large Language Models (LLMs) often employ safety alignment methods to resist harmful instructions. However, recent research shows that maliciously fine-tuning these LLMs on harmful data can easily bypass these safeguards. To counter this, we theoretically uncover why malicious fine-tuning succeeds and identify potential defense strategies. Building on the theoretical analysis, we introduce the Self-Degraded Defense (SDD) framework. SDD encourages LLMs to produce high-quality but irrelevant responses to harmful prompts. When attackers attempt malicious fine-tuning, the general capability of the LLM aligned by SDD will significantly decrease, rendering it incapable of following harmful instructions. Our experimental results confirm SDD's effectiveness against such attacks.


Machine-Precision Prediction of Low-Dimensional Chaotic Systems

Schötz, Christof, Boers, Niklas

arXiv.org Artificial Intelligence

Low-dimensional chaotic systems such as the Lorenz-63 model are commonly used to benchmark system-agnostic methods for learning dynamics from data. Here we show that learning from noise-free observations in such systems can be achieved up to machine precision: using ordinary least squares regression on high-degree polynomial features with 512-bit arithmetic, our method exceeds the accuracy of standard 64-bit numerical ODE solvers of the true underlying dynamical systems. Depending on the configuration, we obtain valid prediction times of 32 to 105 Lyapunov times for the Lorenz-63 system, dramatically outperforming prior work that reaches 13 Lyapunov times at most. We further validate our results on Thomas' Cyclically Symmetric Attractor, a non-polynomial chaotic system that is considerably more complex than the Lorenz-63 model, and show that similar results extend also to higher dimensions using the spatiotemporally chaotic Lorenz-96 model. Our findings suggest that learning low-dimensional chaotic systems from noise-free data is a solved problem.
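The core recipe (ordinary least squares on polynomial features of noise-free states) can be sketched in a few lines. The paper uses 512-bit arithmetic and trajectory data; the sketch below uses plain float64 and fits the exact Lorenz-63 derivatives at sampled states with degree-2 polynomial features, which suffices to show why the approach can be exact: the Lorenz-63 right-hand side is itself a degree-2 polynomial, so OLS recovers it up to rounding. Parameter values are the standard sigma=10, rho=28, beta=8/3.

```python
import numpy as np

rng = np.random.default_rng(0)

def lorenz_rhs(x, y, z, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Exact Lorenz-63 vector field (a degree-2 polynomial in x, y, z).
    return np.array([sigma * (y - x), x * (rho - z), x * y - beta * z])

# Noise-free samples of the state and the corresponding derivatives.
X = rng.uniform(-20, 20, size=(200, 3))
D = np.array([lorenz_rhs(*p) for p in X])

def features(p):
    # Degree-2 polynomial feature map: 1, x, y, z, x^2, xy, xz, y^2, yz, z^2.
    x, y, z = p
    return np.array([1, x, y, z, x * x, x * y, x * z, y * y, y * z, z * z])

Phi = np.array([features(p) for p in X])
coef, *_ = np.linalg.lstsq(Phi, D, rcond=None)

# Since the true dynamics lie exactly in the feature span, the residual is
# limited only by floating-point rounding.
residual = np.abs(Phi @ coef - D).max()
```

In the paper's setting the same idea is pushed much further with 512-bit arithmetic, which is what lets the learned model out-predict 64-bit ODE solvers of the true system.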


Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models

Wang, Ya, Zhuo, Zhijian, Zeng, Yutao, Zhou, Xun, Yang, Jian, Li, Xiaoqing

arXiv.org Artificial Intelligence

Training stability is a persistent challenge in the pre-training of large language models (LLMs), particularly for architectures such as Post-Norm Transformers, which are prone to gradient explosion and dissipation. In this paper, we propose Scale-Distribution Decoupling (SDD), a novel approach that stabilizes training by explicitly decoupling the scale and distribution of the weight matrix in fully-connected layers. SDD applies a normalization mechanism to regulate activations and a learnable scaling vector to maintain well-conditioned gradients, effectively preventing gradient explosion and dissipation. This separation improves optimization efficiency, particularly in deep networks, by ensuring stable gradient propagation. Experimental results demonstrate that our method stabilizes training across various LLM architectures and outperforms existing techniques in different normalization configurations. Furthermore, the proposed method is lightweight and compatible with existing frameworks, making it a practical solution for stabilizing LLM training. Code is available at https://github.com/kaihemo/SDD.
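A minimal sketch of the decoupling idea, as read from the abstract (not the authors' exact formulation): the fully-connected layer uses only the normalized "distribution" of its weight matrix, while a separate learnable per-output scale vector carries the magnitude. The forward pass then becomes insensitive to the raw scale of the weights.

```python
import numpy as np

rng = np.random.default_rng(42)

def sdd_linear(x, W, scale, eps=1e-6):
    # Distribution part: normalize each output row of W to unit L2 norm.
    W_hat = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)
    # Scale part: a learnable per-output vector reintroduces magnitude.
    return scale * (x @ W_hat.T)

x = rng.normal(size=(4, 16))
W = rng.normal(size=(8, 16)) * 100.0   # deliberately badly scaled weights
scale = np.ones(8)

y = sdd_linear(x, W, scale)
# Rescaling the raw weights leaves the output essentially unchanged,
# since only their normalized distribution enters the computation.
y_rescaled = sdd_linear(x, W * 1e3, scale)
```

Because gradient magnitude now flows through the explicit `scale` vector rather than the weight norms, drift in weight scale cannot by itself blow up or dissipate activations, which is the stabilization mechanism the abstract describes.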


Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics

Sanokowski, Sebastian, Berghammer, Wilhelm, Ennemoser, Martin, Wang, Haoyu Peter, Hochreiter, Sepp, Lehner, Sebastian

arXiv.org Machine Learning

Learning to sample from complex unnormalized distributions over discrete domains has emerged as a promising research direction with applications in statistical physics, variational inference, and combinatorial optimization. Recent work has demonstrated the potential of diffusion models in this domain. However, existing methods face limitations in memory scaling, and thus in the number of attainable diffusion steps, since they require backpropagation through the entire generative process. To overcome these limitations, we introduce two novel training methods for discrete diffusion samplers, one grounded in the policy gradient theorem and the other leveraging Self-Normalized Neural Importance Sampling (SN-NIS). These methods yield memory-efficient training and achieve state-of-the-art results in unsupervised combinatorial optimization. Numerous scientific applications additionally require the ability to draw unbiased samples. We introduce adaptations of SN-NIS and Neural Markov Chain Monte Carlo that enable, for the first time, the application of discrete diffusion models to this problem. We validate our methods on Ising model benchmarks and find that they outperform popular autoregressive approaches. Our work opens new avenues for applying diffusion models to a wide range of scientific applications in discrete domains that were hitherto restricted to exact likelihood models.
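The "SN" in SN-NIS is ordinary self-normalized importance sampling, which the sketch below illustrates on a deliberately tiny problem: estimating a spin correlation under an unnormalized two-spin Ising-like target, using a uniform proposal. The target, proposal, and coupling value are toy assumptions; in the paper the proposal is a learned discrete diffusion model rather than a uniform distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

def p_tilde(s):
    # Unnormalized target: exp(J * s0 * s1) with coupling J = 0.5,
    # spins in {-1, +1}. The normalizing constant is never needed.
    return np.exp(0.5 * s[:, 0] * s[:, 1])

n = 200_000
samples = rng.choice([-1, 1], size=(n, 2))   # uniform proposal q
w = p_tilde(samples)                          # importance weights p~/q (q constant)
f = samples[:, 0] * samples[:, 1]             # observable: spin correlation

# Self-normalized estimator: the unknown partition function cancels.
estimate = np.sum(w * f) / np.sum(w)

# Exact value for this 2-spin model: E[s0 * s1] = tanh(0.5).
exact = np.tanh(0.5)
```

Self-normalization is what makes the estimator usable with an unnormalized target, at the cost of a small bias that vanishes as the sample count grows; the paper's contribution is making the proposal a discrete diffusion model trained without backpropagating through every step.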


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Abstract: The paper introduces LearnSDD, an algorithm that learns log-linear models for discrete random variables but adds a penalty term for models that are expensive at query time. Compared to earlier work in this direction, the paper studies a new way of describing models (SDDs instead of ACs) and is interested in "complex queries". The computational complexity of complex queries is not directly addressed in the algorithm, but as it turns out, the choice of SDDs as the model space also gives good run-time performance for certain complex queries (Theorem 1). Quality: there are no obvious errors, but some definitions in the proof are missing, and some key elements of the algorithm are not motivated or discussed (see comments below). Clarity: the presentation is good enough, but can be improved.


Sparse Data Generation Using Diffusion Models

Ostheimer, Phil, Nagda, Mayank, Kloft, Marius, Fellenz, Sophie

arXiv.org Artificial Intelligence

Despite significant advances in generative modeling, a critical gap remains in developing models explicitly designed for sparse data. Directly generating sparse data ensures that models learn realistic structures and distributions, preserving meaningful relationships that thresholding dense data would distort. Sparse data is crucial for applications like data augmentation, where realistic but varied samples improve model robustness, and compressed representations. SDD extends continuous state-space diffusion models by explicitly modeling sparsity through the introduction of Sparsity Bits. Empirical validation on image data from various domains, including two scientific applications (physics and biology), demonstrates that SDD achieves high fidelity in representing data sparsity while preserving the quality of the generated data.
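One plausible reading of the "Sparsity Bits" idea (an assumption based on the abstract, not the authors' exact construction): each entry of a sparse array is represented by a binary bit marking whether it is exactly zero, plus a continuous value channel used only where the bit is on. Generating the bits explicitly lets a model place exact zeros rather than thresholding near-zero continuous values.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy sparse data: a small array with roughly 70% exact zeros.
dense = rng.normal(size=(8, 8))
dense[rng.random((8, 8)) < 0.7] = 0.0

# Decompose into a binary sparsity mask plus a continuous value channel.
bits = (dense != 0.0).astype(np.float64)   # "sparsity bits": 1 where nonzero
values = dense                             # continuous channel

# Recomposition recovers the exact zeros from the bits alone.
reconstructed = bits * values
```

The point of the decomposition is that a generative model producing `(bits, values)` pairs can emit exact zeros by construction, which thresholding a purely continuous output cannot guarantee.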