involution
Beyond World Models: Rethinking Understanding in AI Models
World models have garnered substantial interest in the AI community. These are internal representations that simulate aspects of the external world, track entities and states, capture causal relationships, and enable prediction of consequences. This contrasts with representations based solely on statistical correlations. A key motivation behind this research direction is that humans possess such mental world models, and finding evidence of similar representations in AI models might indicate that these models "understand" the world in a human-like way. In this paper, we use case studies from the philosophy of science literature to critically examine whether the world model framework adequately characterizes human-level understanding. We focus on specific philosophical analyses where the distinction between world model capabilities and human understanding is most pronounced. While these represent particular views of understanding rather than universal definitions, they help us explore the limits of world models.
- North America > United States > New York > Nassau County > Mineola (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > India > Karnataka > Bengaluru (0.04)
A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining
Shen, Yifan, Zhao, Zihan, Xue, Xiao, Guo, Yuwei, Ma, Qun, Zhou, Deyu, Zhang, Ming
With the rise of service computing, cloud computing, and IoT, service ecosystems are becoming increasingly complex. The intricate interactions among intelligent agents make abnormal emergence analysis challenging, as traditional causal methods focus on individual trajectories. Large language models offer new possibilities for Agent-Based Modeling (ABM) through Chain-of-Thought (CoT) reasoning that reveals agent intentions. However, existing approaches remain limited to microscopic and static analysis. This paper introduces a framework, Emergence Analysis based on Multi-Agent Intention (EAMI), which enables dynamic and interpretable emergence analysis. EAMI first employs a dual-perspective thought-track mechanism, in which an Inspector Agent and an Analysis Agent extract agent intentions under bounded and perfect rationality. Then, k-means clustering identifies phase-transition points in group intentions, followed by an Intention Temporal Emergence diagram for dynamic analysis. Experiments validate EAMI in a complex online-to-offline (O2O) service system and the Stanford AI Town experiment, with ablation studies confirming its effectiveness, generalizability, and efficiency. This framework provides a novel paradigm for abnormal emergence and causal analysis in service ecosystems. The code is available at https://anonymous.4open.science/r/EAMI-B085.
- Asia > China > Tianjin Province > Tianjin (0.05)
- Asia > China > Shandong Province > Jinan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Devon > Exeter (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)
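The abstract's "k-means clustering identifies phase-transition points in group intentions" step can be illustrated with a minimal sketch. Everything here is an assumption, not the EAMI implementation: the helper names (`kmeans`, `phase_transition_points`), the farthest-point initialization, and the idea of flagging a timestep whenever the dominant intention cluster changes are all hypothetical choices for illustration.

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Tiny Lloyd's-algorithm k-means over intention embeddings.

    Uses deterministic greedy farthest-point initialization so the
    sketch is reproducible without a random seed.
    """
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(points[:, None] - centers[None], axis=2), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

def phase_transition_points(intentions_over_time, k=2):
    """Flag timesteps where the group's dominant intention cluster changes.

    intentions_over_time: list of (n_agents, dim) arrays, one per timestep.
    Clusters are fit jointly over all timesteps so labels stay comparable.
    """
    T = len(intentions_over_time)
    stacked = np.concatenate(intentions_over_time)
    labels = kmeans(stacked, k).reshape(T, -1)
    dominant = [np.bincount(row).argmax() for row in labels]
    return [t for t in range(1, T) if dominant[t] != dominant[t - 1]]
```

On synthetic embeddings where the group's intentions jump at timestep 3, the sketch reports a single phase-transition point there.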
Spatially Optimized Compact Deep Metric Learning Model for Similarity Search
Islam, Md. Farhadul, Reza, Md. Tanzim, Manab, Meem Arafat, Mahin, Mohammad Rakibul Hasan, Zabeen, Sarah, Noor, Jannatun
Spatial optimization is often overlooked in many computer vision tasks. Filters should be able to recognize the features of an object regardless of where it is in the image, yet the capacity of convolution to capture visual patterns across various locations is limited. Similarity search is a crucial task in which spatial features decide an important output. In contrast to convolution, the involution kernel is dynamically created at each pixel based on the pixel value and learned parameters. This study demonstrates that a single involution feature-extractor layer alongside a compact convolution model significantly enhances similarity-search performance. Additionally, we improve predictions by using the GELU activation function rather than ReLU. Because involution adds a negligible number of weight parameters to a compact model while improving performance, the resulting model is well suited to real-world implementations; our proposed model is below 1 megabyte in size. We evaluated our proposed methodology and other models on the CIFAR-10, FashionMNIST, and MNIST datasets, and our method outperforms the compared models across all three.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
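The abstract's core contrast, a kernel "dynamically created at each pixel based on the pixel value and learned parameters", can be sketched directly. This is a minimal illustrative sketch, not the authors' model: the function name `involution2d`, the single-group layout, and the linear kernel-generating weights `kernel_gen_w` are all assumptions.

```python
import numpy as np

def involution2d(x, kernel_gen_w, k=3):
    """Minimal single-group 2D involution over a feature map.

    Unlike convolution, whose weights are fixed across locations, the
    k*k kernel here is generated per pixel from that pixel's own
    feature vector: kernel = kernel_gen_w @ x[:, i, j].
    x: (C, H, W) feature map; kernel_gen_w: (k*k, C) learned weights.
    """
    C, H, W = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            # dynamically create this pixel's spatial kernel
            kern = (kernel_gen_w @ x[:, i, j]).reshape(k, k)
            patch = xp[:, i:i + k, j:j + k]   # (C, k, k) neighborhood
            # the same spatial kernel is shared across all channels
            out[:, i, j] = (patch * kern).sum(axis=(1, 2))
    return out
```

Note the parameter count: the layer needs only `k*k*C` weights regardless of image size, which is consistent with the abstract's point about involution adding negligible parameters.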
The AI Hype Train Has Stalled in China
Building his own large language model (LLM) is out of the realm of possibility for startup founders like Zhang Haiwei. He'd need hundreds of millions of dollars, and he'd be competing with China's internet giants, who have a long head start. The likes of Baidu and IFlyTek have been working on LLMs, the foundation of artificial intelligence systems that can mimic human intelligence, for years, long before the current AI boom took off. Instead, Zhang's motion-capture startup, Chingmu, is using OpenAI's models trained with its own data to analyze how people and objects move, to use in animation and sports training. "My view of this year is involution," Zhang says, applying a popular term in China which describes a cycle of manic competition that leads to everyone working harder and harder for fewer rewards.
- Asia > China (0.91)
- North America > United States > California (0.06)
- Europe (0.06)
Evolution: A Unified Formula for Feature Operators from a High-level Perspective
Traditionally, different types of feature operators (e.g., convolution, self-attention, and involution) use different approaches to extract and aggregate features, and little resemblance can be discovered in their mathematical formulas. However, these three operators all serve the same paramount purpose and are no different in essence. Hence, we probe the essence of various feature operators from a high-level perspective, transform their components equivalently, and explore their mathematical expressions in higher dimensions. We propose a clear and concrete unified formula for these feature operators, termed Evolution. Evolution uses an Evolution Function to generate an Evolution Kernel, which extracts and aggregates the features at certain positions of the input feature map. We mathematically deduce the equivalent transformation from the traditional formulas of these feature operators to Evolution and prove the unification. In addition, we discuss the forms of Evolution Functions and the properties of the generated Evolution Kernels, aiming to inspire further research on and innovation in powerful feature operators.
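One simplified reading of this unification can be sketched in code. This is an assumption-laden illustration, not the paper's actual Evolution formula: the names `evolve`, `conv_kernel`, and `inv_kernel` are hypothetical, and the sketch only shows the shared "generate a kernel at each position, then aggregate the neighborhood" structure that convolution and involution have in common.

```python
import numpy as np

def evolve(x, kernel_fn, k=3):
    """Generic 'generate kernel, then aggregate' loop.

    out[:, i, j] = sum(K_ij * neighborhood(i, j)), where the kernel
    K_ij = kernel_fn(x, i, j) plays the role of an Evolution Kernel
    produced by an Evolution Function at position (i, j).
    x: (C, H, W) feature map; kernel_fn returns a (k, k) array.
    """
    C, H, W = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            kern = kernel_fn(x, i, j)
            out[:, i, j] = (xp[:, i:i + k, j:j + k] * kern).sum(axis=(1, 2))
    return out

# convolution-like: the kernel is static, identical at every position
conv_kernel = lambda w: (lambda x, i, j: w)
# involution-like: the kernel is generated from the pixel's own features
inv_kernel = lambda w: (lambda x, i, j: (w @ x[:, i, j]).reshape(3, 3))
```

With a static delta kernel (1 at the center, 0 elsewhere) `evolve` reduces to the identity map, while swapping in `inv_kernel` changes only how the kernel is produced, not how it is applied, which is the sense in which the two operators share one formula.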
Quaternion Backpropagation
Pöppelbaum, Johannes, Schwung, Andreas
Quaternion-valued neural networks have experienced rising popularity and interest from researchers in recent years. In existing work, the derivatives with respect to quaternions needed for optimization are calculated as the sum of the partial derivatives with respect to the real and imaginary parts. However, we show that the product and chain rules do not hold with this approach. We solve this by employing the GHR calculus and derive quaternion backpropagation based on it. Furthermore, we experimentally demonstrate the functionality of the derived quaternion backpropagation.
- Europe > Germany (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (4 more...)
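The root of the difficulty the abstract points to is that quaternion multiplication is non-commutative, so real-valued differentiation rules cannot be transplanted component by component. A minimal sketch of the Hamilton product makes this concrete; the function name `qmul` and the tuple representation are assumptions for illustration, not the authors' code.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
# i*j = k but j*i = -k: the product order matters, which is why
# real-valued product/chain rules cannot be applied naively and a
# dedicated framework such as the GHR calculus is needed
ij, ji = qmul(i, j), qmul(j, i)
```

Here `ij` is the unit quaternion k and `ji` is -k, so `qmul(p, q)` and `qmul(q, p)` differ in general.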
Nonparametric Involutive Markov Chain Monte Carlo
Mak, Carol, Zaiser, Fabian, Ong, Luke
A challenging problem in probabilistic programming is to develop inference algorithms that work for arbitrary programs in a universal probabilistic programming language (PPL). We present the nonparametric involutive Markov chain Monte Carlo (NP-iMCMC) algorithm as a method for constructing MCMC inference algorithms for nonparametric models expressible in universal PPLs. Building on the unifying involutive MCMC framework, and by providing a general procedure for driving state movement between dimensions, we show that NP-iMCMC can generalise numerous existing iMCMC algorithms to work on nonparametric models. We prove the correctness of the NP-iMCMC sampler. Our empirical study shows that the existing strengths of several iMCMC algorithms carry over to their nonparametric extensions. Applying our method to the recently proposed Nonparametric HMC, an instance of (Multiple Step) NP-iMCMC, we have constructed several nonparametric extensions (all of which are new) that exhibit significant performance improvements.
- North America > United States > New York (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Involutive MCMC: a Unifying Framework
Neklyudov, Kirill, Welling, Max, Egorov, Evgenii, Vetrov, Dmitry
Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied, and how efficiently they sample. Despite all the differences, many of them share the same core principle, which we unify as the Involutive MCMC (iMCMC) framework. Building upon this, we describe a wide range of MCMC algorithms in terms of iMCMC, and formulate a number of "tricks" which one can use as design principles for developing new MCMC algorithms. Thus, iMCMC provides a unified view of many known MCMC algorithms, which facilitates the derivation of powerful extensions. We demonstrate the latter with two examples where we transform known reversible MCMC algorithms into more efficient irreversible ones.
Table 1: List of algorithms that we describe by the Involutive MCMC framework. See their descriptions and formulations in terms of iMCMC in the corresponding appendices.
- Metropolis-Hastings (Hastings, 1970): B.1
- Mixture Proposal (Habib & Barber, 2018): B.2
- Multiple-Try Metropolis (Liu et al., 2000): B.3
- Sample-Adaptive MCMC (Zhu, 2019): B.4
- Reversible-Jump MCMC (Green, 1995): B.5
- Hybrid Monte Carlo (Duane et al., 1987): B.6
- RMHMC (Girolami & Calderhead, 2011): B.7
- NeuTra (Hoffman et al., 2019): B.8
- A-NICE-MC (Song et al., 2017): B.9
- L2HMC (Levy et al., 2017): B.10
- Persistent HMC (Horowitz, 1991): B.11
- Gibbs (Geman & Geman, 1984): B.12
- Look Ahead (Sohl-Dickstein et al., 2014): B.13
- NRJ (Gagnon & Doucet, 2019): B.14
- Lifted MH (Turitsyn et al., 2011): B.15
- Europe > Austria > Vienna (0.14)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
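The core involutive MCMC step can be sketched in a few lines: augment the state x with an auxiliary variable v, apply a deterministic involution f(x, v) = (x', v'), and accept or reject. This is a minimal sketch under explicit assumptions, not the paper's general formulation: it restricts to a volume-preserving involution (Jacobian determinant 1), and the names `imcmc_step`, `log_p`, `log_q`, and `sample_q` are hypothetical. Random-walk Metropolis-Hastings is recovered via the involution f(x, v) = (x + v, -v), which applied twice returns (x, v).

```python
import math
import random

def imcmc_step(x, log_p, involution, log_q, sample_q):
    """One involutive MCMC step for a volume-preserving involution.

    Draw v ~ q, apply the involution f(x, v) = (x', v'), and accept
    with probability min(1, p(x') q(v') / (p(x) q(v))).
    """
    v = sample_q()
    x_new, v_new = involution(x, v)
    log_ratio = (log_p(x_new) + log_q(v_new)) - (log_p(x) + log_q(v))
    return x_new if math.log(random.random()) < log_ratio else x

# target: standard normal (up to a constant); auxiliary v is also N(0, 1)
log_std_normal = lambda z: -0.5 * z * z

random.seed(0)
x, samples = 0.0, []
for _ in range(20000):
    # f(x, v) = (x + v, -v) is an involution: applying it twice gives (x, v)
    x = imcmc_step(x, log_std_normal, lambda x, v: (x + v, -v),
                   log_std_normal, lambda: random.gauss(0.0, 1.0))
    samples.append(x)
mean = sum(samples) / len(samples)
```

Because q is symmetric, q(-v) = q(v) and the acceptance ratio collapses to p(x')/p(x), i.e. exactly the random-walk Metropolis-Hastings rule, illustrating how a classical sampler becomes an iMCMC instance.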