Goto

Collaborating Authors

 representational capacity


Characterizing the Representational Capacity of Neural Processes

arXiv.org Machine Learning

What functions can Neural Processes represent? We analyze the representational capacity of popular NP architectures: Conditional Neural Processes (CNPs), Attentive Neural Processes (ANPs), Transformer Neural Processes (TNPs), and their latent variants. We prove these architectures form a strict hierarchy. CNP-representable functions are exactly those depending on finitely many expected features of the context distribution. ANPs strictly generalize CNPs via query-dependent reweighting, enabling kernel smoothers. ConvCNPs and ANPs are incomparable; each contains functions outside the other, separated by stationarity versus translation equivariance. TNPs with $L$ self-attention layers capture $L$-hop context interactions. For latent NPs, we show finite-dimensional latents provide coherent sampling but do not circumvent encoder limitations; matching GP posterior distributions requires latent dimension scaling with context size. These results provide a theoretical foundation for architecture selection based on task structure.



Diffusion Actor-Critic with Entropy Regulator

Neural Information Processing Systems

Reinforcement learning (RL) has proven highly effective in addressing complex decision-making and control tasks. However, in most traditional RL algorithms, the policy is typically parameterized as a diagonal Gaussian distribution with learned mean and variance, which constrains their capability to acquire complex policies. In response to this problem, we propose an online RL algorithm termed diffusion actor-critic with entropy regulator (DACER). This algorithm conceptualizes the reverse process of the diffusion model as a novel policy function and leverages the capability of the diffusion model to fit multimodal distributions, thereby enhancing the representational capacity of the policy. Since the distribution of the diffusion policy lacks an analytical expression, its entropy cannot be determined analytically. To mitigate this, we propose a method to estimate the entropy of the diffusion policy utilizing Gaussian mixture model. Building on the estimated entropy, we can learn a parameter $\alpha$ that modulates the degree of exploration and exploitation. Parameter $\alpha$ will be employed to adaptively regulate the variance of the added noise, which is applied to the action output by the diffusion model. Experimental trials on MuJoCo benchmarks and a multimodal task demonstrate that the DACER algorithm achieves state-of-the-art (SOTA) performance in most MuJoCo control tasks while exhibiting a stronger representational capacity of the diffusion policy.







Diffusion Actor-Critic with Entropy Regulator

Neural Information Processing Systems

Reinforcement learning (RL) has proven highly effective in addressing complex decision-making and control tasks. However, in most traditional RL algorithms, the policy is typically parameterized as a diagonal Gaussian distribution with learned mean and variance, which constrains their capability to acquire complex policies. In response to this problem, we propose an online RL algorithm termed diffusion actor-critic with entropy regulator (DACER). This algorithm conceptualizes the reverse process of the diffusion model as a novel policy function and leverages the capability of the diffusion model to fit multimodal distributions, thereby enhancing the representational capacity of the policy. Since the distribution of the diffusion policy lacks an analytical expression, its entropy cannot be determined analytically.


Theoretical Foundations for Semantic Cognition in Artificial Intelligence

arXiv.org Artificial Intelligence

This monograph presents a modular cognitive architecture for artificial intelligence grounded in the formal modeling of belief as structured semantic state. Belief states are defined as dynamic ensembles of linguistic expressions embedded within a navigable manifold, where operators enable assimilation, abstraction, nullification, memory, and introspection. Drawing from philosophy, cognitive science, and neuroscience, we develop a layered framework that enables self-regulating epistemic agents capable of reflective, goal-directed thought. At the core of this framework is the epistemic vacuum: a class of semantically inert cognitive states that serves as the conceptual origin of belief space. From this foundation, the Null Tower arises as a generative structure recursively built through internal representational capacities. The theoretical constructs are designed to be implementable in both symbolic and neural systems, including large language models, hybrid agents, and adaptive memory architectures. This work offers a foundational substrate for constructing agents that reason, remember, and regulate their beliefs in structured, interpretable ways.