AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Neural Information Processing Systems, Feb-18-2026, 05:21:34 GMT

Provable Editing of Deep Neural Networks using Parametric Linear Relaxation

However, the problem of provably editing a DNN to satisfy a property remains challenging.

artificial intelligence, machine learning, parametric linear relaxation, (15 more...)

Country:

North America > United States > California > Yolo County > Davis (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Neural Information Processing Systems, Feb-17-2026, 21:41:52 GMT

A Missing lemmas for the proof of Theorem 3.1

The following proof is from Daniely and Vardi [15], and we give it here for completeness. By Lemma A.1, there exists a DNF formula We construct such an affine layer in Lemma A.2. At least one of the k size-n slices in z contains 0 more than once. We define the outputs of our affine layer as follows. Pr[z represents a hyperedge] = n(n − 1) ... (n − k + 1) null 1 n null Pr null z Z null 1 2 log(n) .

artificial intelligence, machine learning, neuron, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Neural Information Processing Systems, Oct-10-2025, 17:00:37 GMT

ce6326ac4794bb04d5eb16f597446baf-Paper-Conference.pdf

definition 3, formula, parametric linear relaxation, (12 more...)

Country:

North America > United States > California > Yolo County > Davis (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Software (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Oct-9-2024

Multi-Neuron Unleashes Expressivity of ReLU Networks Under Convex Relaxation

Mao, Yuhao, Zhang, Yani, Vechev, Martin

Neural network certification has established itself as a crucial tool for ensuring the robustness of neural networks. Certification methods typically rely on convex relaxations of the feasible output set to provide sound bounds. However, complete certification requires exact bounds, which strongly limits the expressivity of ReLU networks: even for the simple ``$\max$'' function in $\mathbb{R}^2$, there does not exist a ReLU network that expresses this function and can be exactly bounded by single-neuron relaxation methods. This raises the question of whether there exists a convex relaxation that can provide exact bounds for general continuous piecewise linear functions in $\mathbb{R}^n$. In this work, we answer this question affirmatively by showing that (layer-wise) multi-neuron relaxation provides complete certification for general ReLU networks. Based on this novel result, we show that the expressivity of ReLU networks is no longer limited under multi-neuron relaxation. To the best of our knowledge, this is the first positive result on the completeness of convex relaxations, shedding light on the practice of certified robustness.
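
As a rough illustration of the gap the abstract describes, the sketch below encodes max(x, y) as x + ReLU(y - x) and pushes plain interval bounds through that encoding. This is only one particular encoding and only the coarsest single-neuron relaxation (interval arithmetic); the function names are hypothetical and the code is not taken from the paper, whose claim covers every ReLU encoding and every single-neuron relaxation.

```python
def relu(v):
    """Scalar ReLU."""
    return max(v, 0.0)


def max_via_relu(x, y):
    """One ReLU encoding of max: max(x, y) = x + ReLU(y - x)."""
    return x + relu(y - x)


def interval_bounds_max_via_relu(x_lo, x_hi, y_lo, y_hi):
    """Interval bounds propagated operation by operation through the encoding.

    Each intermediate value is bounded on its own, so the dependency between
    x and (y - x) is lost. This is an assumption-level illustration of a
    single-neuron relaxation, not the method analysed in the paper.
    """
    d_lo, d_hi = y_lo - x_hi, y_hi - x_lo    # bounds on y - x
    r_lo, r_hi = relu(d_lo), relu(d_hi)      # ReLU is monotone
    return x_lo + r_lo, x_hi + r_hi          # bounds on x + ReLU(y - x)


def exact_bounds_max(x_lo, x_hi, y_lo, y_hi):
    """Exact range of max(x, y) over a box (max is monotone in each argument)."""
    return max(x_lo, y_lo), max(x_hi, y_hi)


if __name__ == "__main__":
    box = (0.0, 1.0, 0.0, 1.0)
    print("interval bounds:", interval_bounds_max_via_relu(*box))
    print("exact bounds:   ", exact_bounds_max(*box))
```

On the unit box the propagated upper bound exceeds the true maximum, which is the kind of looseness that motivates relaxations coupling several neurons at once.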

constraint, relaxation, relu network, (14 more...)

2410.06816

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

arXiv.org Artificial Intelligence, Apr-17-2024

Towards White Box Deep Learning

Satkiewicz, Maciej

The main advantages of deep neural networks (DNNs) are their architectural simplicity and automatic feature learning. The latter is crucial for working with unstructured data as developers don't need to design features by hand. However, giving away the control over features leads to black box models - DNNs tend to learn hardly interpretable "shortcut" correlations [17] that leak from train to test [20], hampering alignment and out-of-distribution performance. In particular, this gives rise to adversarial attacks [35] - semantically negligible perturbations of data that arbitrarily change a model's predictions. Adversarial vulnerability is a widespread phenomenon (vision [35], segmentation/detection [39], speech recognition [9], tabular data [10], RL [19], NLP [41]) and largely contributes to the general lack of trust in DNNs, substantially limiting their adoption in high-stakes applications such as healthcare, military, autonomous vehicles or cybersecurity. Conversely, the main advantage of hand-designed features is the fine-grained control over the model's performance; however, such systems quickly become infeasibly complex. This paper aims to address those issues by reconciling Deep Learning with feature engineering - with the help of locality engineering. Specifically, semantic features are introduced as a general conceptual machinery for controlled dimensionality reduction inside a neural network layer. Figure 1 presents the core idea behind the notion and the rigorous definition is given in Section 4. Implementing a semantic feature predominantly involves encoding appropriate invariants (i.e.

arxiv, robustness, semantic feature, (14 more...)

2403.09863

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Poland > Lesser Poland Province > Kraków (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (0.55)
Government > Military (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cho, Jaejin, Villalba, Jesús, Moro-Velazquez, Laureano, Dehak, Najim

Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech

arXiv.org Artificial Intelligence, Aug-10-2022

In recent studies, self-supervised pre-trained models tend to outperform supervised pre-trained models in transfer learning. In particular, self-supervised learning (SSL) of utterance-level speech representation can be used in speech applications that require discriminative representation of consistent attributes within an utterance: speaker, language, emotion, and age. Existing frame-level self-supervised speech representation, e.g., wav2vec, can be used as utterance-level representation with pooling, but the models are usually large. There are also SSL techniques to learn utterance-level representation. One of the most successful is a contrastive method, which requires negative sampling: selecting alternative samples to contrast with the current sample (anchor). However, without labels, this does not ensure that all the negative samples belong to classes different from the anchor class. This paper applies a non-contrastive self-supervised method to learn utterance-level embeddings. We adapted DIstillation with NO labels (DINO) from computer vision to speech. Unlike contrastive methods, DINO does not require negative sampling. We compared DINO to x-vector trained in a supervised manner. When transferred to downstream tasks (speaker verification, speech emotion recognition (SER), and Alzheimer's disease detection), DINO outperformed x-vector. We studied the influence of several aspects during transfer learning such as dividing the fine-tuning process into steps, chunk lengths, or augmentation. During fine-tuning, tuning the last affine layers first and then the whole network surpassed fine-tuning all at once. Using shorter chunk lengths, although they generate more diverse inputs, did not necessarily improve performance, implying that speech segments of at least a certain length are required for better performance per application. Augmentation was helpful in SER.

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1109/JSTSP.2022.3197315

2208.05445

Country: North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
(2 more...)

arXiv.org Artificial Intelligence, Feb-22-2021

Sandwich Batch Normalization

Gong, Xinyu, Chen, Wuyang, Chen, Tianlong, Wang, Zhangyang

We present Sandwich Batch Normalization (SaBN), an embarrassingly easy improvement of Batch Normalization (BN) with only a few lines of code changes. SaBN is motivated by addressing the inherent feature distribution heterogeneity that can be identified in many tasks, which can arise from data heterogeneity (multiple input domains) or model heterogeneity (dynamic architectures, model conditioning, etc.). Our SaBN factorizes the BN affine layer into one shared sandwich affine layer, cascaded by several parallel independent affine layers. Concrete analysis reveals that, during optimization, SaBN promotes balanced gradient norms while still preserving diverse gradient directions: a property that many application tasks seem to favor. We demonstrate the prevailing effectiveness of SaBN as a drop-in replacement in four tasks: conditional image generation, neural architecture search (NAS), adversarial training, and arbitrary style transfer. Leveraging SaBN immediately achieves better Inception Score and FID on CIFAR-10 and ImageNet conditional image generation with three state-of-the-art GANs; boosts the performance of a state-of-the-art weight-sharing NAS algorithm significantly on NAS-Bench-201; substantially improves the robust and standard accuracies for adversarial defense; and produces superior arbitrary stylized results. We also provide visualizations and analysis to help understand why SaBN works. Code is available at https://github.com/VITA-Group/Sandwich-Batch-Normalization.
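
The factorization described in the abstract can be sketched for a single channel as follows. This is a minimal reading of the sentence "one shared sandwich affine layer, cascaded by several parallel independent affine layers", assuming the shared affine is applied to the normalized features first and a branch-specific affine second; the function and parameter names are hypothetical and this is not the code from the linked repository.

```python
import math


def batch_normalize(xs, eps=1e-5):
    """Normalize a batch of scalars to zero mean and (approximately) unit variance."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def sandwich_bn(xs, branch, shared_affine, branch_affines, eps=1e-5):
    """Single-channel sketch of a sandwich-style BN layer.

    shared_affine:  (gamma_sa, beta_sa), used by every branch.
    branch_affines: list of (gamma_i, beta_i), one per domain/condition.
    branch:         index selecting which independent affine to apply.
    """
    gamma_sa, beta_sa = shared_affine
    gamma_i, beta_i = branch_affines[branch]
    normalized = batch_normalize(xs, eps)
    # Shared affine first, then the branch-specific affine on top of it.
    return [gamma_i * (gamma_sa * h + beta_sa) + beta_i for h in normalized]


if __name__ == "__main__":
    batch = [1.0, 3.0]
    shared = (2.0, 1.0)
    branches = [(1.0, 0.0), (0.5, 1.0)]
    print(sandwich_bn(batch, 0, shared, branches))
    print(sandwich_bn(batch, 1, shared, branches))
```

In a conditional setting the `branch` index would typically be the class or domain label, so all branches share one normalization and one affine while keeping their own final scale and shift.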

affine layer, arxiv preprint arxiv, normalization, (14 more...)

2102.11382

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ryder, Tom, Golightly, Andrew, Matthews, Isaac, Prangle, Dennis

Scalable approximate inference for state space models with normalising flows

arXiv.org Machine Learning, Oct-2-2019

By exploiting mini-batch stochastic gradient optimisation, variational inference has had great success in scaling up approximate Bayesian inference to big data. To date, however, this strategy has only been applicable to models of independent data. Here we extend mini-batch variational methods to state space models of time series data. To do so we introduce a novel generative model as our variational approximation, a local inverse autoregressive flow. This allows a subsequence to be sampled without sampling the entire distribution. Hence we can perform training iterations using short portions of the time series at low computational cost. We illustrate our method on AR(1), Lotka-Volterra and FitzHugh-Nagumo models, achieving accurate parameter estimation in a short time.

affine layer, inference, iteration, (13 more...)


1910.00879

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Alsubaihi, Salman, Bibi, Adel, Alfadly, Modar, Ghanem, Bernard

Probabilistically True and Tight Bounds for Robust Deep Neural Network Training

arXiv.org Machine Learning, May-28-2019

Training Deep Neural Networks (DNNs) that are robust to norm-bounded adversarial attacks remains an elusive problem. While verification-based methods are generally too expensive to robustly train large networks, it was demonstrated in Gowal et al. that bounded input intervals can be inexpensively propagated per layer through large networks. This interval bound propagation (IBP) approach led to high robustness and was the first to be employed on large networks. However, due to the very loose nature of the IBP bounds, particularly for large networks, the required training procedure is complex and involved. In this paper, we closely examine the bounds of a block of layers composed of an affine layer followed by a ReLU nonlinearity followed by another affine layer. In doing so, we propose probabilistic bounds, true bounds with overwhelming probability, that are provably tighter than IBP bounds in expectation. We then extend this result to deeper networks through blockwise propagation and show that we can achieve orders of magnitude tighter bounds compared to IBP. With such tight bounds, we demonstrate that a simple standard training procedure can achieve the best robustness-accuracy trade-off across several architectures on both MNIST and CIFAR10.
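
For orientation, the IBP baseline mentioned in the abstract can be sketched for the same affine, ReLU, affine block in a few lines. This shows only standard interval propagation, not the probabilistic bounds the paper proposes; the function names and the small example weights are assumptions made for illustration.

```python
def affine_bounds(W, b, lo, hi):
    """Interval bounds of W @ x + b for x in the box [lo, hi].

    For a lower bound, positive weights take the input lower bound and
    negative weights take the input upper bound (and vice versa).
    """
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        out_lo.append(bias + sum(w * (l if w >= 0 else h)
                                 for w, l, h in zip(row, lo, hi)))
        out_hi.append(bias + sum(w * (h if w >= 0 else l)
                                 for w, l, h in zip(row, lo, hi)))
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to both bounds directly."""
    return [max(l, 0.0) for l in lo], [max(h, 0.0) for h in hi]


def ibp_block(W1, b1, W2, b2, lo, hi):
    """Propagate a box through affine -> ReLU -> affine."""
    lo, hi = affine_bounds(W1, b1, lo, hi)
    lo, hi = relu_bounds(lo, hi)
    return affine_bounds(W2, b2, lo, hi)


def forward(W1, b1, W2, b2, x):
    """Plain forward pass through the same block, for checking soundness."""
    hidden = [max(bias + sum(w * xi for w, xi in zip(row, x)), 0.0)
              for row, bias in zip(W1, b1)]
    return [bias + sum(w * hj for w, hj in zip(row, hidden))
            for row, bias in zip(W2, b2)]


if __name__ == "__main__":
    W1, b1 = [[1.0, -1.0], [1.0, 1.0]], [0.0, 0.0]
    W2, b2 = [[1.0, -1.0]], [0.0]
    print(ibp_block(W1, b1, W2, b2, [-1.0, -1.0], [1.0, 1.0]))
```

Every output of `forward` for an input inside the box lies within the bounds returned by `ibp_block`; how tight those bounds are depends on the weights, which is the quantity the paper studies.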

artificial intelligence, machine learning, test accuracy, (17 more...)


1905.12418

Genre: Research Report (0.82)

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)