exactness
Just-In-Time Piecewise-Linear Semantics for ReLU-type Networks
Duan, Hongyi, Liu, Haoyang, Zhang, Jian'an, Liu, Fengrui, Wang, Yiyi
We present a JIT PL semantics for ReLU-type networks that compiles models into a guarded CPWL transducer with shared guards. The system adds hyperplanes only when operands are affine on the current cell, maintains global lower/upper envelopes, and uses a budgeted branch-and-bound. We obtain anytime soundness, exactness on fully refined cells, monotone progress, guard-linear complexity (avoiding global $\binom{k}{2}$), dominance pruning, and decidability under finite refinement. The shared carrier supports region extraction, decision complexes, Jacobians, exact/certified Lipschitz, LP/SOCP robustness, and maximal causal influence. A minimal prototype returns certificates or counterexamples with cost proportional to visited subdomains.
Towards Evaluation for Real-World LLM Unlearning
Miao, Ke, Hu, Yuke, Li, Xiaochen, Bao, Wenjie, Liu, Zhihao, Qin, Zhan, Ren, Kui
This paper analyzes the limitations of existing unlearning evaluation metrics in terms of practicality, exactness, and robustness in real-world LLM unlearning scenarios. To overcome these limitations, we propose a new metric called Distribution Correction-based Unlearning Evaluation (DCUE). It identifies core tokens and corrects distributional biases in their confidence scores using a validation set. The evaluation results are quantified using the Kolmogorov-Smirnov test. Experimental results demonstrate that DCUE overcomes the limitations of existing metrics, which also guides the design of more practical and reliable unlearning algorithms in the future.
Exactness of Approximate MAP Inference in Continuous MRFs
Computing the MAP assignment in graphical models is generally intractable. As a result, for discrete graphical models, the MAP problem is often approximated using linear programming relaxations. Much research has focused on characterizing when these LP relaxations are tight, and while they are relatively well-understood in the discrete case, only a few results are known for their continuous analog. In this work, we use graph covers to provide necessary and sufficient conditions for continuous MAP relaxations to be tight. We use this characterization to give simple proofs that the relaxation is tight for log-concave decomposable and log-supermodular decomposable models.
Rockafellian Relaxation and Stochastic Optimization under Perturbations
Royset, Johannes O., Chen, Louis L., Eckstrand, Eric
In practice, optimization models are often prone to unavoidable inaccuracies due to dubious assumptions and corrupted data. Traditionally, this placed special emphasis on risk-based and robust formulations, and their focus on ``conservative" decisions. We develop, in contrast, an ``optimistic" framework based on Rockafellian relaxations in which optimization is conducted not only over the original decision space but also jointly with a choice of model perturbation. The framework enables us to address challenging problems with ambiguous probability distributions from the areas of two-stage stochastic optimization without relatively complete recourse, probability functions lacking continuity properties, expectation constraints, and outlier analysis. We are also able to circumvent the fundamental difficulty in stochastic optimization that convergence of distributions fails to guarantee convergence of expectations. The framework centers on the novel concepts of exact and limit-exact Rockafellians, with interpretations of ``negative'' regularization emerging in certain settings. We illustrate the role of Phi-divergence, examine rates of convergence under changing distributions, and explore extensions to first-order optimality conditions. The main development is free of assumptions about convexity, smoothness, and even continuity of objective functions. Numerical results in the setting of computer vision and text analytics with label noise illustrate the framework.
Projective Integral Updates for High-Dimensional Variational Inference
Variational inference is an approximation framework for Bayesian inference that seeks to improve quantified uncertainty in predictions by optimizing a simplified distribution over parameters to stand in for the full posterior. Capturing model variations that remain consistent with training data enables more robust predictions by reducing parameter sensitivity. This work introduces a fixed-point optimization for variational inference that is applicable when every feasible log density can be expressed as a linear combination of functions from a given basis. In such cases, the optimizer becomes a fixed-point of projective integral updates. When the basis spans univariate quadratics in each parameter, feasible densities are Gaussian and the projective integral updates yield quasi-Newton variational Bayes (QNVB). Other bases and updates are also possible. As these updates require high-dimensional integration, this work first proposes an efficient quasirandom quadrature sequence for mean-field distributions. Each iterate of the sequence contains two evaluation points that combine to correctly integrate all univariate quadratics and, if the mean-field factors are symmetric, all univariate cubics. More importantly, averaging results over short subsequences achieves periodic exactness on a much larger space of multivariate quadratics. The corresponding variational updates require 4 loss evaluations with standard (not second-order) backpropagation to eliminate error terms from over half of all multivariate quadratic basis functions. This integration technique is motivated by first proposing stochastic blocked mean-field quadratures, which may be useful in other contexts. A PyTorch implementation of QNVB allows for better control over model uncertainty during training than competing methods. Experiments demonstrate superior generalizability for multiple learning problems and architectures.
On the Exactness of Dantzig-Wolfe Relaxation for Rank Constrained Optimization Problems
In the rank-constrained optimization problem (RCOP), it minimizes a linear objective function over a prespecified closed rank-constrained domain set and $m$ generic two-sided linear matrix inequalities. Motivated by the Dantzig-Wolfe (DW) decomposition, a popular approach of solving many nonconvex optimization problems, we investigate the strength of DW relaxation (DWR) of the RCOP, which admits the same formulation as RCOP except replacing the domain set by its closed convex hull. Notably, our goal is to characterize conditions under which the DWR matches RCOP for any m two-sided linear matrix inequalities. From the primal perspective, we develop the first-known simultaneously necessary and sufficient conditions that achieve: (i) extreme point exactness -- all the extreme points of the DWR feasible set belong to that of the RCOP; (ii) convex hull exactness -- the DWR feasible set is identical to the closed convex hull of RCOP feasible set; and (iii) objective exactness -- the optimal values of the DWR and RCOP coincide. The proposed conditions unify, refine, and extend the existing exactness results in the quadratically constrained quadratic program (QCQP) and fair unsupervised learning. These conditions can be very useful to identify new results, including the extreme point exactness for a QCQP problem that admits an inhomogeneous objective function with two homogeneous two-sided quadratic constraints and the convex hull exactness for fair SVD.
Artificial Intelligence in speech Recognition Technology – What you need to know
It's a well-known fact that the science of speech recognition has made some fantastic progress since IBM introduced its first speech recognition machine in 1962. As the innovation has advanced, speech recognition has gotten progressively embedded in our everyday lives with voice-driven applications like Apple's Siri, Amazon's Alexa, Microsoft's Cortana, or the many voice-responsive highlights of Google. From our phones, PCs, watches, and refrigerators, each new voice-interactive gadget that we bring into our lives develops our reliance on AI and ML. Speech recognition in AI is the cycle that empowers a PC to perceive and react to verbally expressed words and afterwards changing over them in a format that the machine gets it. The machine may then change over it into another type of information relying upon the ultimate aim.
Moment in Time: The Biggest Short Video Dataset For Data Scientists
Moment in Time is one of the biggest human-commented video datasets catching visual and discernible short occasions created by people, creatures, articles and nature. It was developed in 2018 by the researchers: Mathew Monfort, Alex Andonian, Bolei Zhou and Kandan Ramakrishnan. The dataset comprises more than 1,000,000 3-second recordings relating to 339 unique action words. Every action word is related to more than 1,000 recordings bringing about a huge adjusted dataset for taking in powerful occasions from recordings. The various day to day activities associated with this dataset includes falling on the floor, the opening of the mouth, eye, swimming, bouncing etc.
Human Participation in Removal of Biased Data from AI – VentureSoft AI
McKinsey Global Institute as of late detailed that organizations receiving every one of the five types of AI -- computer vision, remote assistants, natural language, robotic process automation, and advanced artificial intelligence -- remain to profit lopsidedly versus their rivals. In any case, before organizations can appreciate the advantages of AI, they should guarantee the data they use for their works is usable and unbiased. All things considered; Machine Learning (ML) algorithms are just in the same class as the data on which they are prepared. What's more, one worrying trend manifesting itself is biased calculations. To eliminate algorithmic bias, organizations should initially guarantee the training data they use is as free as conceivable from bias.
Exactness of Approximate MAP Inference in Continuous MRFs
Computing the MAP assignment in graphical models is generally intractable. As a result, for discrete graphical models, the MAP problem is often approximated using linear programming relaxations. Much research has focused on characterizing when these LP relaxations are tight, and while they are relatively well-understood in the discrete case, only a few results are known for their continuous analog. In this work, we use graph covers to provide necessary and sufficient conditions for continuous MAP relaxations to be tight. We use this characterization to give simple proofs that the relaxation is tight for log-concave decomposable and log-supermodular decomposable models.