Directed Networks
A Job I Like or a Job I Can Get: Designing Job Recommender Systems Using Field Experiments
Bied, Guillaume, Caillou, Philippe, Crépon, Bruno, Gaillac, Christophe, Pérennes, Elia, Sebag, Michèle
Recommendation systems (RSs) are increasingly used to guide job seekers on online platforms, yet the algorithms currently deployed are typically optimized for predictive objectives such as clicks, applications, or hires, rather than job seekers' welfare. We develop a job-search model with an application stage in which the value of a vacancy depends on two dimensions: the utility it delivers to the worker and the probability that an application succeeds. The model implies that welfare-optimal RSs rank vacancies by an expected-surplus index combining both, and shows why rankings based solely on utility, hiring probabilities, or observed application behavior are generically suboptimal, an instance of the inversion problem between behavior and welfare. We test these predictions and quantify their practical importance through two randomized field experiments conducted with the French public employment service. The first experiment, comparing existing algorithms and their combinations, provides behavioral evidence that both dimensions shape application decisions. Guided by the model and these results, the second experiment extends the comparison to an RS designed to approximate the welfare-optimal ranking. The experiments generate exogenous variation in the vacancies shown to job seekers, allowing us to estimate the model, validate its behavioral predictions, and construct a welfare metric. Algorithms informed by the model-implied optimal ranking substantially outperform existing approaches and perform close to the welfare-optimal benchmark. Our results show that embedding predictive tools within a simple job-search framework and combining it with experimental evidence yields recommendation rules with substantial welfare gains in practice.
- Europe > France (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.87)
- Banking & Finance > Economy (0.46)
- Education > Educational Setting (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
On the Number of Conditional Independence Tests in Constraint-based Causal Discovery
Monés, Marc Franquesa, Zhang, Jiaqi, Uhler, Caroline
Learning causal relations from observational data is a fundamental problem with wide-ranging applications across many fields. Constraint-based methods infer the underlying causal structure by performing conditional independence tests. However, existing algorithms such as the prominent PC algorithm need to perform a large number of independence tests, which in the worst case is exponential in the maximum degree of the causal graph. Despite extensive research, it remains unclear if there exist algorithms with better complexity without additional assumptions. Here, we establish an algorithm that achieves a better complexity of $p^{\mathcal{O}(s)}$ tests, where $p$ is the number of nodes in the graph and $s$ denotes the maximum undirected clique size of the underlying essential graph. Complementing this result, we prove that any constraint-based algorithm must perform at least $2^{Ω(s)}$ conditional independence tests, establishing that our proposed algorithm achieves exponent-optimality up to a logarithmic factor in terms of the number of conditional independence tests needed. Finally, we validate our theoretical findings through simulations, on semi-synthetic gene-expression data, and real-world data, demonstrating the efficiency of our algorithm compared to existing methods in terms of number of conditional independence tests needed.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Hard labels sampled from sparse targets mislead rotation invariant algorithms
Ghosh, Avrajit, Yu, Bin, Warmuth, Manfred, Bartlett, Peter
One of the most common machine learning setups is logistic regression. In many classification models, including neural networks, the final prediction is obtained by applying a logistic link function to a linear score. In binary logistic regression, the feedback can be either soft labels, corresponding to the true conditional probability of the data (as in distillation), or sampled hard labels (taking values $\pm 1$). We point out a fundamental problem that arises even in a particularly favorable setting, where the goal is to learn a noise-free soft target of the form $σ(\mathbf{x}^{\top}\mathbf{w}^{\star})$. In the over-constrained case (i.e. the number of samples $n$ exceeds the input dimension $d$) with examples $(\mathbf{x}_i,σ(\mathbf{x}_i^{\top}\mathbf{w}^{\star}))$, it is sufficient to recover $\mathbf{w}^{\star}$ and hence achieve the Bayes risk. However, we prove that when the examples are labeled by hard labels $y_i$ sampled from the same conditional distribution $σ(\mathbf{x}_i^{\top}\mathbf{w}^{\star})$ and $\mathbf{w}^{\star}$ is $s$-sparse, then rotation-invariant algorithms are provably suboptimal: they incur an excess risk $Ω\!\left(\frac{d-1}{n}\right)$, while there are simple non-rotation invariant algorithms with excess risk $O(\frac{s\log d}{n})$. The simplest rotation invariant algorithm is gradient descent on the logistic loss (with early stopping). A simple non-rotation-invariant algorithm for sparse targets that achieves the above upper bounds uses gradient descent on the weights $u_i,v_i$, where now the linear weight $w_i$ is reparameterized as $u_iv_i$.
- Research Report > New Finding (0.89)
- Research Report > Experimental Study (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
A Generalised Exponentiated Gradient Approach to Enhance Fairness in Binary and Multi-class Classification Tasks
Boubekraoui, Maryam, d'Aloisio, Giordano, Di Marco, Antinisca
The widespread use of AI and ML models in sensitive areas raises significant concerns about fairness. While the research community has introduced various methods for bias mitigation in binary classification tasks, the issue remains under-explored in multi-class classification settings. To address this limitation, in this paper, we first formulate the problem of fair learning in multi-class classification as a multi-objective problem between effectiveness (i.e., prediction correctness) and multiple linear fairness constraints. Next, we propose a Generalised Exponentiated Gradient (GEG) algorithm to solve this task. GEG is an in-processing algorithm that enhances fairness in binary and multi-class classification settings under multiple fairness definitions. We conduct an extensive empirical evaluation of GEG against six baselines across seven multi-class and three binary datasets, using four widely adopted effectiveness metrics and three fairness definitions. GEG overcomes existing baselines, with fairness improvements up to 92% and a decrease in accuracy up to 14%.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Abruzzo > L'Aquila Province > L'Aquila (0.04)
- South America > Peru (0.04)
- (6 more...)
- Research Report > New Finding (0.92)
- Research Report > Experimental Study (0.67)
- Health & Medicine > Consumer Health (0.67)
- Health & Medicine > Therapeutic Area (0.46)
- Education > Educational Setting (0.46)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Rule-State Inference (RSI): A Bayesian Framework for Compliance Monitoring in Rule-Governed Domains
Existing machine learning frameworks for compliance monitoring -- Markov Logic Networks, Probabilistic Soft Logic, supervised models -- share a fundamental paradigm: they treat observed data as ground truth and attempt to approximate rules from it. This assumption breaks down in rule-governed domains such as taxation or regulatory compliance, where authoritative rules are known a priori and the true challenge is to infer the latent state of rule activation, compliance, and parametric drift from partial and noisy observations. We propose Rule-State Inference (RSI), a Bayesian framework that inverts this paradigm by encoding regulatory rules as structured priors and casting compliance monitoring as posterior inference over a latent rule-state space S = {(a_i, c_i, delta_i)}, where a_i captures rule activation, c_i models the compliance rate, and delta_i quantifies parametric drift. We prove three theoretical guarantees: (T1) RSI absorbs regulatory changes in O(1) time via a prior ratio correction, independently of dataset size; (T2) the posterior is Bernstein-von Mises consistent, converging to the true rule state as observations accumulate; (T3) mean-field variational inference monotonically maximizes the Evidence Lower BOund (ELBO). We instantiate RSI on the Togolese fiscal system and introduce RSI-Togo-Fiscal-Synthetic v1.0, a benchmark of 2,000 synthetic enterprises grounded in real OTR regulatory rules (2022-2025). Without any labeled training data, RSI achieves F1=0.519 and AUC=0.599, while absorbing regulatory changes in under 1ms versus 683-1082ms for full model retraining -- at least a 600x speedup.
- Africa > Togo > Maritime Region > Lome (0.05)
- Africa > Middle East > Morocco > Rabat-Salé-Kénitra Region > Rabat (0.04)
- Law (0.75)
- Government (0.75)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)
Active Inference for Physical AI Agents -- An Engineering Perspective
Physical AI agents, such as robots and other embodied systems operating under tight and fluctuating resource constraints, remain far less capable than biological agents in open-ended real-world environments. This paper argues that Active Inference (AIF), grounded in the Free Energy Principle, offers a principled foundation for closing that gap. We develop this argument from first principles, following a chain from probability theory through Bayesian machine learning and variational inference to active inference and reactive message passing. From the FEP perspective, systems that maintain their structural and functional integrity over time can, under suitable assumptions, be described as minimizing variational free energy (VFE), and AIF operationalizes this by unifying perception, learning, planning, and control within a single computational objective. We show that VFE minimization is naturally realized by reactive message passing on factor graphs, where inference emerges from local, parallel computations. This realization is well matched to the constraints of physical operation, including hard deadlines, asynchronous data, fluctuating power budgets, and changing environments. Because reactive message passing is event-driven, interruptible, and locally adaptable, performance degrades gracefully under reduced resources while model structure can adjust online. We further show that, under suitable coupling and coarse-graining conditions, coupled AIF agents can be described as higher-level AIF agents, yielding a homogeneous architecture based on the same message-passing primitive across scales. Our contribution is not empirical benchmarking, but a clear theoretical and architectural case for the engineering community.
- South America > Brazil (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Integrative Learning of Dynamically Evolving Multiplex Graphs and Nodal Attributes Using Neural Network Gaussian Processes with an Application to Dynamic Terrorism Graphs
Rodriguez-Acosta, Jose, Guha, Sharmistha, Patel, Lekha, Shuler, Kurtis
Exploring the dynamic co-evolution of multiplex graphs and nodal attributes is a compelling question in criminal and terrorism networks. This article is motivated by the study of dynamically evolving interactions among prominent terrorist organizations, considering various organizational attributes like size, ideology, leadership, and operational capacity. Statistically principled integration of multiplex graphs with nodal attributes is significantly challenging due to the need to leverage shared information within and across layers, account for uncertainty in predicting unobserved links, and capture temporal evolution of node attributes. These difficulties increase when layers are partially observed, as in terrorism networks where connections are deliberately hidden to obscure key relationships. To address these challenges, we present a principled methodological framework to integrate the multiplex graph layers and nodal attributes. The approach employs time-varying stochastic latent factor models, leveraging shared latent factors to capture graph structure and its co-evolution with node attributes. Latent factors are modeled using Gaussian processes with an infinitely wide deep neural network-based covariance function, termed neural network Gaussian processes (NN-GP). The NN-GP framework on latent factors exploits the predictive power of Bayesian deep neural network architecture while propagating uncertainty for reliability. Simulation studies highlight superior performance of the proposed approach in achieving inferential objectives. The approach, termed as dynamic joint learner, enables predictive inference (with uncertainty) of diverse unobserved dynamic relationships among prominent terrorist organizations and their organization-specific attributes, as well as clustering behavior in terms of friend-and-foe relationships, which could be informative in counter-terrorism research.
- South America > Colombia (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Texas (0.04)
- (13 more...)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Austria > Styria > Graz (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Scalable Learning of Multivariate Distributions via Coresets
Ding, Zeyu, Ickstadt, Katja, Klein, Nadja, Munteanu, Alexander, Omlor, Simon
Efficient and scalable non-parametric or semi-parametric regression analysis and density estimation are of crucial importance to the fields of statistics and machine learning. However, available methods are limited in their ability to handle large-scale data. We address this issue by developing a novel coreset construction for multivariate conditional transformation models (MCTMs) to enhance their scalability and training efficiency. To the best of our knowledge, these are the first coresets for semi-parametric distributional models. Our approach yields substantial data reduction via importance sampling. It ensures with high probability that the log-likelihood remains within multiplicative error bounds of $(1\pm\varepsilon)$ and thereby maintains statistical model accuracy. Compared to conventional full-parametric models, where coresets have been incorporated before, our semi-parametric approach exhibits enhanced adaptability, particularly in scenarios where complex distributions and non-linear relationships are present, but not fully understood. To address numerical problems associated with normalizing logarithmic terms, we follow a geometric approximation based on the convex hull of input data. This ensures feasible, stable, and accurate inference in scenarios involving large amounts of data. Numerical experiments demonstrate substantially improved computational efficiency when handling large and complex datasets, thus laying the foundation for a broad range of applications within the statistics and machine learning communities.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)