Goto

Collaborating Authors

 dnf



Leveraging Predictive Equivalence in Decision Trees

McTavish, Hayden, Boner, Zachery, Donnelly, Jon, Seltzer, Margo, Rudin, Cynthia

arXiv.org Artificial Intelligence

Decision trees are widely used for interpretable machine learning due to their clearly structured reasoning process. However, this structure belies a challenge we refer to as predictive equivalence: a given tree's decision boundary can be represented by many different decision trees. The presence of models with identical decision boundaries but different evaluation processes makes model selection challenging. The models will have different variable importance and behave differently in the presence of missing values, but most optimization procedures will arbitrarily choose one such model to return. We present a boolean logical representation of decision trees that does not exhibit predictive equivalence and is faithful to the underlying decision boundary. We apply our representation to several downstream machine learning tasks. Using our representation, we show that decision trees are surprisingly robust to test-time missingness of feature values; we address predictive equivalence's impact on quantifying variable importance; and we present an algorithm to optimize the cost of reaching predictions.





Learning DNF through Generalized Fourier Representations

Heidari, Mohsen, Khardon, Roni

arXiv.org Artificial Intelligence

The Fourier representation for the uniform distribution over the Boolean cube has found numerous applications in algorithms and complexity analysis. Notably, in learning theory, learnability of Disjunctive Normal Form (DNF) under uniform as well as product distributions has been established through such representations. This paper makes five main contributions. First, it introduces a generalized Fourier expansion that can be used with any distribution $D$ through the representation of the distribution as a Bayesian network (BN). Second, it shows that the main algorithmic tools for learning with the Fourier representation, that use membership queries to approximate functions by recovering their heavy Fourier coefficients, can be used with slight modifications with the generalized expansion. These results hold for any distribution. Third, it analyzes the $L_1$ spectral norm of conjunctions under the new expansion, showing that it is bounded for a class of distributions which can be represented by difference bounded tree BN, where a parent node in the BN representation can change the conditional expectation of a child node by at most $α<0.5$. Lower bounds are presented to show that such constraints are necessary. The fourth contribution uses these results to show the learnability of DNF with membership queries under difference bounded tree BN. The final contribution is to develop an algorithm for learning difference-bounded tree BN distributions, thus extending the DNF learnability result to cases where the distribution is not known in advance.


Discrete Neural Flow Samplers with Locally Equivariant Transformer

Ou, Zijing, Zhang, Ruixiang, Li, Yingzhen

arXiv.org Machine Learning

Sampling from unnormalised discrete distributions is a fundamental problem across various domains. While Markov chain Monte Carlo offers a principled approach, it often suffers from slow mixing and poor convergence. In this paper, we propose Discrete Neural Flow Samplers (DNFS), a trainable and efficient framework for discrete sampling. DNFS learns the rate matrix of a continuous-time Markov chain such that the resulting dynamics satisfy the Kolmogorov equation. As this objective involves the intractable partition function, we then employ control variates to reduce the variance of its Monte Carlo estimation, leading to a coordinate descent learning algorithm. To further facilitate computational efficiency, we propose locally equivaraint Transformer, a novel parameterisation of the rate matrix that significantly improves training efficiency while preserving powerful network expressiveness. Empirically, we demonstrate the efficacy of DNFS in a wide range of applications, including sampling from unnormalised distributions, training discrete energy-based models, and solving combinatorial optimisation problems.


Contextual modulation of language comprehension in a dynamic neural model of lexical meaning

Stern, Michael C., Piñango, Maria M.

arXiv.org Artificial Intelligence

We propose and computationally implement a dynamic neural model of lexical meaning, and experimentally test its behavioral predictions. We demonstrate the architecture and behavior of the model using as a test case the English lexical item 'have', focusing on its polysemous use. In the model, 'have' maps to a semantic space defined by two continuous conceptual dimensions, connectedness and control asymmetry, previously proposed to parameterize the conceptual system for language. The mapping is modeled as coupling between a neural node representing the lexical item and neural fields representing the conceptual dimensions. While lexical knowledge is modeled as a stable coupling pattern, real-time lexical meaning retrieval is modeled as the motion of neural activation patterns between metastable states corresponding to semantic interpretations or readings. Model simulations capture two previously reported empirical observations: (1) contextual modulation of lexical semantic interpretation, and (2) individual variation in the magnitude of this modulation. Simulations also generate a novel prediction that the by-trial relationship between sentence reading time and acceptability should be contextually modulated. An experiment combining self-paced reading and acceptability judgments replicates previous results and confirms the new model prediction. Altogether, results support a novel perspective on lexical polysemy: that the many related meanings of a word are metastable neural activation states that arise from the nonlinear dynamics of neural populations governing interpretation on continuous semantic dimensions.


Local Minima Drive Communications in Cooperative Interaction

Moore, Roger K.

arXiv.org Artificial Intelligence

An important open question in human-robot interaction (HRI) is precisely when an agent should decide to communicate, particularly in a cooperative task. Perceptual Control Theory (PCT) tells us that agents are able to cooperate on a joint task simply by sharing the same 'intention', thereby distributing the effort required to complete the task among the agents. This is even true for agents that do not possess the same abilities, so long as the goal is observable, the combined actions are sufficient to complete the task, and there is no local minimum in the search space. If these conditions hold, then a cooperative task can be accomplished without any communication between the contributing agents. However, for tasks that do contain local minima, the global solution can only be reached if at least one of the agents adapts its intention at the appropriate moments, and this can only be achieved by appropriately timed communication. In other words, it is hypothesised that in cooperative tasks, the function of communication is to coordinate actions in a complex search space that contains local minima. These principles have been verified in a computer-based simulation environment in which two independent one-dimensional agents are obliged to cooperate in order to solve a two-dimensional path-finding task.


Meta Navigation Functions: Adaptive Associations for Coordination of Multi-Agent Systems

Macktoobian, Matin, Duc, Guillaume Ferdinand

arXiv.org Artificial Intelligence

In this paper, we introduce a new class of potential fields, i.e., meta navigation functions (MNFs) to coordinate multi-agent systems. Thanks to the MNF formulation, agents can contribute to each other's coordination via partial and/or total associations, contrary to traditional decentralized navigation functions (DNFs). In particular, agents may stimulate each other via their MNFs. Moreover, MNFs need to be confined which is a weaker condition compared to the Morse condition of DNFs. An MNF is composed of a confined function and an attraction kernel. The critical points of the former can be confined in a safe region around a target critical point. The collision-free trajectory of an agent and its associations to its peers are governed by a confined function before reaching its safe region. Then, the attraction kernel drives the agent to its target in the safe region. MNFs provide faster coordination compared to DNFs. We illustrate how MNFs may exhibit some social behaviors in the course of partial and total associations among agents. Our simulations verify the efficiency of MNFs to coordinate complex swarms of agents.