Marecek, Jakub
Sample Complexity of Bias Detection with Subsampled Point-to-Subspace Distances
Matilla, German Martinez, Marecek, Jakub
Sample complexity of bias estimation is a lower bound on the runtime of any bias detection method. Many regulatory frameworks require the bias to be tested for all subgroups, whose number grows exponentially with the number of protected attributes. Unless one wishes to run a bias detection with a doubly-exponential run-time, one should like to have polynomial complexity of bias detection for a single subgroup. At the same time, the reference data may be based on surveys, and thus come with non-trivial uncertainty. Here, we reformulate bias detection as a point-to-subspace problem on the space of measures and show that, for supremum norm, it can be subsampled efficiently. In particular, our probabilistically approximately correct (PAC) results are corroborated by tests on well-known instances.
Quantum open system identification via global optimization: Optimally accurate Markovian models of open systems from time-series data
Popovych, Zakhar, Jacobs, Kurt, Korpas, Georgios, Marecek, Jakub, Bondar, Denys I.
Accurate models of the dynamics of quantum circuits are essential for optimizing and advancing quantum devices. Since first-principles models of environmental noise and dissipation in real quantum systems are often unavailable, deriving accurate models from measured time-series data is critical. However, identifying open quantum systems poses significant challenges: powerful methods from systems engineering can perform poorly beyond weak damping (as we show) because they fail to incorporate essential constraints required for quantum evolution (e.g., positivity). Common methods that can include these constraints are typically multi-step, fitting linear models to physically grounded master equations, often resulting in non-convex functions in which local optimization algorithms get stuck in local extrema (as we show). In this work, we solve these problems by formulating quantum system identification directly from data as a polynomial optimization problem, enabling the use of recently developed global optimization methods. These methods are essentially guaranteed to reach global optima, allowing us for the first time to efficiently obtain the most accurate Markovian model for a given system. In addition to its practical importance, this allows us to take the error of these Markovian models as an alternative (operational) measure of the non-Markovianity of a system. We test our method with the spin-boson model -- a two-level system coupled to a bath of harmonic oscillators -- for which we obtain the exact evolution using matrix-product-state techniques. We show that polynomial optimization using moment/sum-of-squares approaches significantly outperforms traditional optimization algorithms, and we show that even for strong damping Lindblad-form master equations can provide accurate models of the spin-boson system.
ExDBN: Exact learning of Dynamic Bayesian Networks
Rytir, Pavel, Wodecki, Ales, Korpas, Georgios, Marecek, Jakub
Causal learning from data has received much attention in recent years. One way of capturing causal relationships is by utilizing Bayesian networks. There, one recovers a weighted directed acyclic graph, in which random variables are represented by vertices, and the weights associated with each edge represent the strengths of the causal relationships between them. This concept is extended to capture dynamic effects by introducing a dependency on past data, which may be captured by the structural equation model, which is utilized in the present contribution to formulate a score-based learning approach. A mixed-integer quadratic program is formulated and an algorithmic solution proposed, in which the pre-generation of exponentially many acyclicity constraints is avoided by utilizing the so-called branch-and-cut ("lazy constraint") method. Comparing the novel approach to the state of the art, we show that the proposed approach turns out to produce excellent results when applied to small and medium-sized synthetic instances of up to 25 time-series. Lastly, two interesting applications in bio-science and finance, to which the method is directly applied, further stress the opportunities in developing highly accurate, globally convergent solvers that can handle modest instances.
Fairness in Ranking: Robustness through Randomization without the Protected Attribute
Kliachkin, Andrii, Psaroudaki, Eleni, Marecek, Jakub, Fotakis, Dimitris
There has been great interest in fairness in machine learning, especially in relation to classification problems. In ranking-related problems, such as in online advertising, recommender systems, and HR automation, much work on fairness remains to be done. Two complications arise: first, the protected attribute may not be available in many applications. Second, there are multiple measures of fairness of rankings, and optimization-based methods utilizing a single measure of fairness of rankings may produce rankings that are unfair with respect to other measures. In this work, we propose a randomized method for post-processing rankings, which do not require the availability of the protected attribute. In an extensive numerical study, we show the robustness of our methods with respect to P-Fairness and effectiveness with respect to Normalized Discounted Cumulative Gain (NDCG) from the baseline ranking, improving on previously proposed methods.
Learning quantum Hamiltonians at any temperature in polynomial time with Chebyshev and bit complexity
Wodecki, Ales, Marecek, Jakub
We consider the problem of learning local quantum Hamiltonians given copies of their Gibbs state at a known inverse temperature, following Haah et al. [2108.04842] and Bakshi et al. [arXiv:2310.02243]. Our main technical contribution is a new flat polynomial approximation of the exponential function based on the Chebyshev expansion, which enables the formulation of learning quantum Hamiltonians as a polynomial optimization problem. This, in turn, can benefit from the use of moment/SOS relaxations, whose polynomial bit complexity requires careful analysis [O'Donnell, ITCS 2017]. Finally, we show that learning a $k$-local Hamiltonian, whose dual interaction graph is of bounded degree, runs in polynomial time under mild assumptions.
Piecewise Polynomial Regression of Tame Functions via Integer Programming
Bareilles, Gilles, Aspman, Johannes, Nemecek, Jiri, Marecek, Jakub
We consider approximating so-called tame functions, a class of nonsmooth, nonconvex functions, with piecewise polynomial functions. Tame functions appear in a wide range of applications: functions encountered in the training of deep neural networks with all common activations, value functions of mixed-integer programs, or wave functions of small molecules. We bound the quality of approximation of a tame function by a piecewise polynomial function with a given number of segments on any full-dimensional cube. We also present the first ever mixed-integer programming formulation of piecewise polynomial regression. Together, these can be used to estimate tame functions. We demonstrate promising computational results.
Generating Likely Counterfactuals Using Sum-Product Networks
Nemecek, Jiri, Pevny, Tomas, Marecek, Jakub
Due to user demand and recent regulation (GDPR, AI Act), decisions made by AI systems need to be explained. These decisions are often explainable only post hoc, where counterfactual explanations are popular. The question of what constitutes the best counterfactual explanation must consider multiple aspects, where "distance from the sample" is the most common. We argue that this requirement frequently leads to explanations that are unlikely and, therefore, of limited value. Here, we present a system that provides high-likelihood explanations. We show that the search for the most likely explanations satisfying many common desiderata for counterfactual explanations can be modeled using mixed-integer optimization (MIO). In the process, we propose an MIO formulation of a Sum-Product Network (SPN) and use the SPN to estimate the likelihood of a counterfactual, which can be of independent interest. A numerical comparison against several methods for generating counterfactual explanations is provided.
Taming Binarized Neural Networks and Mixed-Integer Programs
Aspman, Johannes, Korpas, Georgios, Marecek, Jakub
There has been a great deal of recent interest in binarized neural networks, especially because of their explainability. At the same time, automatic differentiation algorithms such as backpropagation fail for binarized neural networks, which limits their applicability. By reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, we show that binarized neural networks admit a tame representation. This, in turn, makes it possible to use the framework of Bolte et al. for implicit differentiation, which offers the possibility for practical implementation of backpropagation in the context of binarized neural networks. This approach could also be used for a broader class of mixed-integer programs, beyond the training of binarized neural networks, as encountered in symbolic approaches to AI and beyond.
Joint Problems in Learning Multiple Dynamical Systems
Niu, Mengjia, He, Xiaoyu, Rysavy, Petr, Zhou, Quan, Marecek, Jakub
Clustering of time series is a well-studied problem, with applications ranging from quantitative, personalized models of metabolism obtained from metabolite concentrations to state discrimination in quantum information theory. We consider a variant, where given a set of trajectories and a number of parts, we jointly partition the set of trajectories and learn linear dynamical system (LDS) models for each part, so as to minimize the maximum error across all the models. We present globally convergent methods and EM heuristics, accompanied by promising computational results.
Group-blind optimal transport to group parity and its constrained variants
Zhou, Quan, Marecek, Jakub
Fairness holds a pivotal role in the realm of machine learning, particularly when it comes to addressing groups categorised by sensitive attributes, e.g., gender, race. Prevailing algorithms in fair learning predominantly hinge on accessibility or estimations of these sensitive attributes, at least in the training process. We design a single group-blind projection map that aligns the feature distributions of both groups in the source data, achieving (demographic) group parity, without requiring values of the protected attribute for individual samples in the computation of the map, as well as its use. Instead, our approach utilises the feature distributions of the privileged and unprivileged groups in a boarder population and the essential assumption that the source data are unbiased representation of the population.