power
On the Power of Decision Trees in Auto-Regressive Language Modeling
Originally proposed for handling time series data, Auto-regressive Decision Trees (ARDTs) have not yet been explored for language modeling. This paper delves into both the theoretical and practical applications of ARDTs in this new context. We theoretically demonstrate that ARDTs can compute complex functions, such as simulating automata, Turing machines, and sparse circuits, by leveraging "chain-of-thought" computations. Our analysis provides bounds on the size, depth, and computational efficiency of ARDTs, highlighting their surprising computational power. Empirically, we train ARDTs on simple language generation tasks, showing that they can learn to generate coherent and grammatically correct text on par with a smaller Transformer model.
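As a rough illustration of the auto-regressive decision-tree idea, here is a minimal sketch assuming a fixed context window of k tokens and scikit-learn's DecisionTreeClassifier as the tree learner; the windowing scheme and toy vocabulary are illustrative choices, not the paper's exact setup.

```python
# Minimal sketch of auto-regressive generation with a decision tree.
# Assumptions (not from the paper): a fixed k-token context window and
# scikit-learn's DecisionTreeClassifier as the tree learner.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def make_ngram_dataset(token_ids, k):
    """Turn a token stream into (k-token context -> next token) pairs."""
    X = [token_ids[i:i + k] for i in range(len(token_ids) - k)]
    y = [token_ids[i + k] for i in range(len(token_ids) - k)]
    return np.array(X), np.array(y)

def generate(tree, prompt, k, steps):
    """Auto-regressive loop: feed the tree's own outputs back as context."""
    seq = list(prompt)
    for _ in range(steps):
        context = np.array(seq[-k:]).reshape(1, -1)
        seq.append(int(tree.predict(context)[0]))
    return seq

# Toy corpus: integer token ids standing in for words.
corpus = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
k = 3
X, y = make_ngram_dataset(corpus, k)
tree = DecisionTreeClassifier(max_depth=8).fit(X, y)
print(generate(tree, prompt=corpus[:k], k=k, steps=8))
```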
On the Power of Small-size Graph Neural Networks for Linear Programming
Graph neural networks (GNNs) have recently emerged as powerful tools for addressing complex optimization problems. It has been theoretically demonstrated that GNNs can universally approximate the solution mapping functions of linear programming (LP) problems. However, these theoretical results typically require GNNs to have large parameter sizes. Conversely, empirical experiments have shown that relatively small GNNs can solve LPs effectively, revealing a significant discrepancy between theoretical predictions and practical observations. In this work, we aim to bridge this gap by providing a theoretical foundation for the effectiveness of small-size GNNs.
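For context, GNN-for-LP work typically encodes the LP as a bipartite graph between constraint nodes and variable nodes; the sketch below shows that standard encoding for min c^T x s.t. Ax <= b, with illustrative feature choices (the theoretical results above concern the GNN acting on such a graph, not this particular featurization).

```python
# Sketch of the standard bipartite-graph encoding of an LP
#   min c^T x  s.t.  A x <= b:
# one node per constraint (feature b_i), one node per variable
# (feature c_j), and an edge (i, j) weighted by A_ij for every
# nonzero coefficient. Feature choices here are illustrative.
import numpy as np

def lp_to_bipartite(A, b, c):
    m, n = A.shape
    cons_feat = b.reshape(m, 1)          # constraint-node features
    var_feat = c.reshape(n, 1)           # variable-node features
    rows, cols = np.nonzero(A)
    edges = np.stack([rows, cols])       # constraint i -- variable j
    edge_weight = A[rows, cols]
    return cons_feat, var_feat, edges, edge_weight

A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([4.0, 1.0])
c = np.array([-1.0, -1.0])
cons_feat, var_feat, edges, w = lp_to_bipartite(A, b, c)
print(edges)   # 3 nonzeros in A -> 3 edges
```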
The Power of Resets in Online Reinforcement Learning
Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general function approximation. We explore the power of simulators through online reinforcement learning with local simulator access (or, local planning), an RL protocol where the agent is allowed to reset to previously observed states and follow their dynamics during training. We use local simulator access to unlock new statistical guarantees that were previously out of reach:
- We show that MDPs with low coverability (Xie et al. 2023) -- a general structural condition that subsumes Block MDPs and Low-Rank MDPs -- can be learned in a sample-efficient fashion with only $Q^\star$-realizability (realizability of the optimal state-value function); existing online RL algorithms require significantly stronger representation conditions.
- As a consequence, we show that the notorious Exogenous Block MDP problem (Efroni et al. 2022) is tractable under local simulator access.
The results above are achieved through a computationally inefficient algorithm. We complement them with a more computationally efficient algorithm, RVFS (Recursive Value Function Search), which achieves provable sample complexity guarantees under a strengthened statistical assumption known as pushforward coverability.
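A minimal sketch of the local-simulator protocol follows: during training the agent may reset to any state it has already observed. The deepcopy-based checkpointing and the bare step() interface are assumptions for illustration, not the paper's construction.

```python
# Sketch of local simulator access (local planning): visited states are
# checkpointed so the agent can revisit them and follow their dynamics.
import copy

class LocalSimulatorAccess:
    """Wraps a simulator so previously observed states can be revisited."""

    def __init__(self, env):
        self.env = env
        self.checkpoints = []              # snapshots of observed states

    def step(self, action):
        result = self.env.step(action)
        self.checkpoints.append(copy.deepcopy(self.env))
        return result

    def reset_to(self, i):
        """Reset to the i-th previously observed state and continue."""
        self.env = copy.deepcopy(self.checkpoints[i])
        return self.env
```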
On the Power of Heuristics in Temporal Graphs
Cornell, Filip, Smirnov, Oleg, Gandler, Gabriela Zarzar, Cao, Lele
Dynamic graph datasets often exhibit strong temporal patterns, such as recency, which prioritizes recent interactions, and popularity, which favors frequently occurring nodes. We demonstrate that simple heuristics leveraging only these patterns can perform on par with or even outperform state-of-the-art neural network models under standard evaluation protocols. To further explore these dynamics, we introduce metrics that quantify the impact of recency and popularity across datasets. Our experiments on BenchTemp and the Temporal Graph Benchmark show that our approaches achieve state-of-the-art performance across all datasets in the latter and secure top ranks on multiple datasets in the former. These results emphasize the importance of refined evaluation schemes to enable fair comparisons and promote the development of more robust temporal graph models. Additionally, they reveal that current deep learning methods often struggle to capture the key patterns underlying predictions in real-world temporal graphs. For reproducibility, we have made our code publicly available.
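A sketch of such recency and popularity heuristics for temporal link prediction is given below: candidate destination nodes are ranked by how recently and how often they have appeared. The linear score combination and the alpha weight are illustrative assumptions, not the paper's exact formulation.

```python
# Rank candidates by a mix of recency (time since last interaction)
# and popularity (share of all interactions). Illustrative scoring only.
from collections import defaultdict

class HeuristicRanker:
    def __init__(self):
        self.last_seen = {}                # node -> last interaction time
        self.count = defaultdict(int)      # node -> number of interactions

    def observe(self, src, dst, t):
        for node in (src, dst):
            self.last_seen[node] = t
            self.count[node] += 1

    def score(self, node, t_now, alpha=0.5):
        recency = 1.0 / (1.0 + t_now - self.last_seen.get(node, float("-inf")))
        popularity = self.count.get(node, 0) / max(1, sum(self.count.values()))
        return alpha * recency + (1 - alpha) * popularity

r = HeuristicRanker()
for src, dst, t in [("u1", "v1", 1), ("u2", "v1", 2), ("u1", "v2", 3)]:
    r.observe(src, dst, t)
print(sorted(["v1", "v2"], key=lambda v: -r.score(v, t_now=4)))
```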
ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model
Chen, Qiguang, Qin, Libo, Liu, Jinhao, Peng, Dengyun, Wang, Jiaqi, Hu, Mengkang, Chen, Zhi, Che, Wanxiang, Liu, Ting
Recent advancements in large language models (LLMs) have led to significant successes across various applications, where the most noticeable is to a series of emerging capabilities, particularly in the areas of In-Context Learning (ICL) and Chain-of-Thought (CoT). To better understand and control model performance, many studies have begun investigating the underlying causes of these phenomena and their impact on task outcomes. However, existing explanatory frameworks predominantly focus on isolating and explaining ICL and CoT independently, leading to an incomplete understanding of their combined influence on model performance. To address this gap, we propose the Electronic Circuit Model (ECM), which provides a foundation for developing scalable, learnable policies and improving the management of AI-generated content. Specifically, ECM conceptualizes model behavior as an electronic circuit: ICL is represented as semantic magnetic field to providing an additional voltage following Faraday's Law, while CoT is modeled as series resistors to constrain the model output performance following Ohm's Law. Experimental results demonstrate that the ECM effectively predicts and explains LLM performance across a variety of prompting strategies. Furthermore, we apply ECM to advanced reasoning strategy optimization on a series of tasks, such as the International Olympiad in Informatics (IOI) and the International Mathematical Olympiad (IMO), achieving competitive performance that surpasses nearly 80% of top human competitors.
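As a toy rendering of this analogy (the functional forms below are illustrative assumptions, not the paper's fitted model), ICL examples can be treated as added voltage and CoT steps as series resistors, with performance behaving like the resulting current.

```python
# Toy rendering of the ECM analogy: ICL examples add "voltage", CoT
# steps act as series resistors, and performance behaves like the
# current I = V / R by Ohm's law. Forms and constants are illustrative.
def ecm_performance(n_icl_examples, cot_resistances, base_voltage=1.0, k=0.1):
    voltage = base_voltage + k * n_icl_examples     # Faraday-style boost
    resistance = sum(cot_resistances)               # series resistors
    return voltage / resistance                     # Ohm's law: I = V / R

print(ecm_performance(n_icl_examples=4, cot_resistances=[0.5, 0.5, 1.0]))
```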
On the Power of Differentiable Learning versus PAC and SQ Learning
We study the power of learning via mini-batch stochastic gradient descent (SGD) on the loss of a differentiable model or neural network, and ask what learning problems can be learnt using this paradigm. We show that SGD can always simulate learning with statistical queries (SQ), but its ability to go beyond that depends on the precision $\rho$ of the gradients and the minibatch size $b$. With fine enough precision relative to minibatch size, namely when $b\rho$ is small enough, SGD can go beyond SQ learning and simulate any sample-based learning algorithm and thus its learning power is equivalent to that of PAC learning; this extends prior work that achieved this result for $b=1$. Moreover, with polynomially many bits of precision (i.e. when $\rho$ is exponentially small), SGD can simulate PAC learning regardless of the batch size. On the other hand, when $b\rho^2$ is large enough, the power of SGD is equivalent to that of SQ learning.
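The setting can be made concrete with a small sketch: mini-batch SGD where each coordinate of the averaged gradient is rounded to precision $\rho$ before the update. The quadratic toy loss and nearest-multiple rounding rule are illustrative choices, not the paper's construction.

```python
# Mini-batch SGD with rho-precision gradients on a toy quadratic loss.
import numpy as np

def quantize(g, rho):
    """Round each gradient coordinate to the nearest multiple of rho."""
    return rho * np.round(g / rho)

def sgd_with_precision(grad_fn, w0, data, b, rho, lr=0.1, steps=100, seed=0):
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        batch = data[rng.choice(len(data), size=b, replace=False)]
        g = np.mean([grad_fn(w, x) for x in batch], axis=0)
        w -= lr * quantize(g, rho)    # SGD sees only rho-precision gradients
    return w

# Toy problem: minimize the mean squared distance to the data points.
data = np.array([[1.0, 2.0], [3.0, 0.0], [2.0, 2.0]])
grad_fn = lambda w, x: 2.0 * (w - x)  # gradient of ||w - x||^2
print(sgd_with_precision(grad_fn, w0=[0.0, 0.0], data=data, b=2, rho=1e-3))
```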
The history of overhyped tech, and a chilling new graphic novel from Charles Burns
Richard Powers' Playground is a novel of contrasts: the vast unknown of Earth's oceans, a place of constant discovery and marvelous creatures that seem always to be at play, versus technological advancement and the rise of AI; the unlikely friendship between a young poet and a boy whose life revolves around coding; a remote island with a tiny population still feeling the effects of a history of exploitation, and the tech elites who envision it as the stepping stone to their own utopia. Through the perspectives of four characters who have been brought together on Makatea, an atoll in the South Pacific, Playground explores friendship, play, the wonders of the natural world and humanity in the age of artificial intelligence. Powers' writing is beautiful, and Playground promises to leave you with a lot to think about.
CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability
Lv, Minxuan, Dai, Chengwei, Li, Kun, Zhou, Wei, Hu, Songlin
Neural network models are vulnerable to adversarial examples, and adversarial transferability further increases the risk of adversarial attacks. Current methods based on transferability often rely on substitute models, which can be impractical and costly in real-world scenarios due to the unavailability of training data and the victim model's structural details. In this paper, we propose a novel approach that directly constructs adversarial examples by extracting transferable features across various tasks. Our key insight is that adversarial transferability can extend across different tasks. Specifically, we train a sequence-to-sequence generative model named CT-GAT using adversarial sample data collected from multiple tasks to acquire universal adversarial features and generate adversarial examples for different tasks. We conduct experiments on ten distinct datasets, and the results demonstrate that our method achieves superior attack performance at a small cost.
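A sketch of the recipe as described follows: fine-tune a seq2seq model on (original, adversarial) text pairs pooled from multiple tasks, then use it to rewrite new inputs. The T5 backbone, the single toy pair, and the hyperparameters are assumptions for illustration; the paper's exact setup may differ.

```python
# Train a seq2seq model on (clean, adversarial) pairs, then generate
# adversarial candidates for new inputs. Illustrative setup only.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
optim = torch.optim.AdamW(model.parameters(), lr=3e-4)

pairs = [  # (clean input, adversarial rewrite) pooled across tasks
    ("the movie was great", "the movie waz gr eat"),
]
for src, tgt in pairs:
    batch = tok(src, return_tensors="pt")
    labels = tok(tgt, return_tensors="pt").input_ids
    loss = model(**batch, labels=labels).loss   # teacher forcing
    loss.backward()
    optim.step()
    optim.zero_grad()

# Generate an adversarial candidate for a new input.
ids = model.generate(**tok("service was excellent", return_tensors="pt"))
print(tok.decode(ids[0], skip_special_tokens=True))
```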
Who's saying what? Artificial Intelligence in Power - Filings Trends Q3 2022