Prior-Fitted Networks Scale to Larger Datasets When Treated as Weak Learners

Wang, Yuxin, Jiang, Botian, Guo, Yiran, Gan, Quan, Wipf, David, Huang, Xuanjing, Qiu, Xipeng

arXiv.org Artificial Intelligence

Prior-Fitted Networks (PFNs) have recently been proposed to efficiently perform tabular classification tasks. Although they achieve good performance on small datasets, they encounter limitations with larger datasets. These limitations include significant memory consumption and increased computational complexity, primarily because it is impractical to feed all training samples into these networks as inputs. To address these challenges, we investigate the fitting assumption that PFNs make about their input samples. Building on this understanding, we propose BoostPFN, a method designed to enhance the performance of these networks, especially on large-scale datasets. We also theoretically validate the convergence of BoostPFN, and our empirical results demonstrate that BoostPFN can outperform standard PFNs given the same number of training samples on large datasets, while achieving a significant acceleration in training time compared to established baselines, including widely used Gradient Boosting Decision Trees (GBDTs), deep learning methods, and AutoML systems. High performance is maintained for up to 50x the pre-training size of PFNs, substantially extending the limit on training samples. Through this work, we address the challenge of efficiently handling large datasets with PFN-based models, paving the way for faster and more effective training and prediction in tabular data classification. Code is available on GitHub.
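The weak-learner view in the abstract can be illustrated with a short sketch. The snippet below is a minimal boosting loop in which each weak learner is fit on a small, weight-biased subsample, so a context-limited model (such as a PFN) never sees more rows than it can take as input. The `boost_weak_learners` helper, its hyperparameters, and the decision-stump stand-in are assumptions for illustration only, not the paper's exact BoostPFN procedure.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier  # runnable stand-in; BoostPFN would use a PFN-style classifier


def boost_weak_learners(X, y, base_learner, n_rounds=10, subsample=1000, seed=0):
    """SAMME-style boosting where each weak learner only sees a small,
    weight-biased subsample of the training data (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, classes = len(y), np.unique(y)
    w = np.full(n, 1.0 / n)                       # per-sample boosting weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        # Draw a subsample proportional to the current weights (small enough for a PFN context).
        idx = rng.choice(n, size=min(subsample, n), replace=False, p=w / w.sum())
        learner = clone(base_learner).fit(X[idx], y[idx])
        pred = learner.predict(X)
        err = np.average(pred != y, weights=w)
        if err >= 1.0 - 1.0 / len(classes):       # no better than chance: skip this round
            continue
        alpha = np.log((1.0 - err) / max(err, 1e-10)) + np.log(len(classes) - 1)
        w *= np.exp(alpha * (pred != y))          # up-weight the samples this learner got wrong
        w /= w.sum()
        learners.append(learner)
        alphas.append(alpha)

    def predict(X_new):
        votes = np.zeros((len(X_new), len(classes)))
        for a, lrn in zip(alphas, learners):
            votes[np.arange(len(X_new)), np.searchsorted(classes, lrn.predict(X_new))] += a
        return classes[votes.argmax(axis=1)]

    return predict


# Usage with a shallow tree as the weak learner (X_train, y_train are numpy arrays);
# swapping in a PFN classifier with the same fit/predict interface is the idea behind BoostPFN.
# predict = boost_weak_learners(X_train, y_train, DecisionTreeClassifier(max_depth=1))
```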


Strategic Behavior and AI Training Data

Peukert, Christian, Abeillon, Florian, Haese, Jérémie, Kaiser, Franziska, Staub, Alexander

arXiv.org Artificial Intelligence

Human-created works represent critical data inputs to artificial intelligence (AI). Strategic behavior can play a major role in shaping AI training datasets, whether by limiting access to existing works or by influencing which types of new works are created, or whether new works are created at all. We examine how creators change their behavior when their works become training data for AI. Specifically, we focus on contributors to Unsplash, a popular stock image platform with about 6 million high-quality photos and illustrations. In the summer of 2020, Unsplash launched an AI research program by releasing a dataset of 25,000 images for commercial use. We study contributors' reactions, comparing contributors whose works were included in this dataset to contributors whose works were not. Our results suggest that treated contributors left the platform at a higher-than-usual rate and substantially slowed their rate of new uploads. Professional and more successful photographers react more strongly than amateurs and less successful photographers. We also show that affected users changed the variety and novelty of their contributions to the platform, with long-run implications for the stock of works potentially available for AI training. Taken together, our findings highlight the trade-off between the interests of rightsholders and the promotion of innovation at the technological frontier. We discuss implications for copyright and AI policy.


LLMs with Chain-of-Thought Are Non-Causal Reasoners

Bao, Guangsheng, Zhang, Hongbo, Yang, Linyi, Wang, Cunxiang, Zhang, Yue

arXiv.org Artificial Intelligence

This paper explores the role of Chain of Thought (CoT) in Large Language Model (LLM) reasoning. Despite its potential to improve task performance, our analysis reveals a surprising frequency of correct answers following incorrect CoTs, and vice versa. We employ causal analysis to assess the cause-effect relationship between CoTs/instructions and answers in LLMs, uncovering the Structural Causal Model (SCM) that LLMs approximate. By comparing the implied SCM with that of human reasoning, we highlight discrepancies between LLM and human reasoning processes. We further examine the factors influencing the causal structure of the implied SCM, revealing that in-context learning, supervised fine-tuning, and reinforcement learning from human feedback significantly impact the causal relations. We release the code and results at https://github.com/StevenZHB/CoT_Causal_Analysis.
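To make the comparison between causal structures concrete, one illustrative way to write the two SCMs is sketched below. Here Z denotes the instruction/question, C the generated CoT, A the final answer, and U_C, U_A exogenous noise; the exact graphs and functional forms are an assumption for exposition, not necessarily the ones identified in the paper.

```latex
% Idealized human reasoning: the answer is produced *through* the chain of thought.
%   Z -> C -> A
\[
  C := f_C(Z, U_C), \qquad A := f_A(C, U_A)
\]
% Behaviour suggested by "correct answers after incorrect CoTs": the instruction
% also acts on the answer directly, so A is not fully mediated by C.
%   Z -> C,  Z -> A,  C -> A (possibly weak)
\[
  C := f_C(Z, U_C), \qquad A := f_A(Z, C, U_A)
\]
```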


cegpy: Modelling with Chain Event Graphs in Python

Walley, Gareth, Shenvi, Aditi, Strong, Peter, Kobalczyk, Katarzyna

arXiv.org Machine Learning

Chain event graphs (CEGs) are a recent family of probabilistic graphical models that generalise the popular family of Bayesian networks (BNs). Crucially, unlike BNs, a CEG is able to embed, within its graph and its statistical model, asymmetries exhibited by a process. These asymmetries might be in the conditional independence relationships or in the structure of the graph and its underlying event space. Structural asymmetries are common in many domains, and can occur naturally (e.g. a defendant's versus a prosecutor's version of events) or by design (e.g. a public health intervention). However, no software currently exists that allows a user to leverage the theoretical developments of the CEG model family when modelling processes with structural asymmetries. This paper introduces cegpy, the first Python package for learning and analysing complex processes using CEGs. The key feature of cegpy is that it is the first CEG package in any programming language that can model processes with both symmetric and asymmetric structures. cegpy contains implementations of Bayesian model selection and probability propagation algorithms for CEGs. We illustrate the functionality of cegpy using a structurally asymmetric dataset.
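A minimal sketch of the workflow described above is given below. The class and method names follow the cegpy interface as best recalled and should be treated as assumptions; the dataset path is hypothetical, and the current cegpy documentation should be consulted for the authoritative API.

```python
import pandas as pd
from cegpy import StagedTree, ChainEventGraph

# Each column is a variable in the event tree; each row is one observed path through the process.
df = pd.read_csv("falls.csv")        # hypothetical structurally asymmetric dataset

st = StagedTree(df)                  # build the event/staged tree from the data
st.calculate_AHC_transitions()       # Bayesian model selection via agglomerative hierarchical clustering

ceg = ChainEventGraph(st)            # collapse the staged tree into a chain event graph
ceg.create_figure("falls_ceg.pdf")   # render the resulting graph
```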


How COVID-19 has Impacted American Attitudes Toward China: A Study on Twitter

Cook, Gavin, Huang, Junming, Xie, Yu

arXiv.org Artificial Intelligence

Past research has studied social determinants of attitudes toward foreign countries. Confounded by potential endogeneity biases due to unobserved factors or reverse causality, the causal impact of these factors on public opinion is usually difficult to establish. Using social media data, we leverage the suddenness of the COVID-19 pandemic to examine whether a major global event has causally changed American views of another country. We collate a database of more than 297 million posts about China or COVID-19 on the social media platform Twitter up to June 2020, and we treat tweeting about COVID-19 as a proxy for individual awareness of COVID-19. Using regression discontinuity and difference-in-differences estimation, we find that awareness of COVID-19 causes a sharp rise in anti-China attitudes. Our work has implications for understanding how self-interest affects policy preference and how Americans view migrant communities.
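As an illustration of the difference-in-differences design described above, the sketch below compares the change in anti-China sentiment among users who tweeted about COVID-19 (the awareness proxy) to the change among users who did not, before versus after the outbreak. The dataframe and column names are hypothetical, and this is not the authors' estimation code.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical user-period panel: one row per user per period, with a sentiment
# outcome, a COVID-19 awareness indicator, and a post-outbreak period dummy.
df = pd.read_csv("tweet_panel.csv")

# The coefficient on the interaction covid_aware:post is the DiD estimate: the extra
# change in anti-China sentiment among COVID-aware users after the outbreak.
model = smf.ols("anti_china_sentiment ~ covid_aware * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["user_id"]}  # cluster SEs by user
)
print(model.summary())
```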


Will Advanced AIs Ever Be Treated as Our Equals in Society?

#artificialintelligence

For the first time, artificial intelligence (AI) may have eclipsed the science that brought it to "life": we now have AI that can converse in a human-like manner, robots designed to look like us, and deep learning machines built specifically to learn, think, and act the way we do. Experts believe that by 2030, machine intelligence will be on par with humans, and that by 2045, the capabilities of AI will actually surpass human intelligence. Given the rate of advancement the field has seen and the resources being dedicated to its continued development, robotics researchers believe we're closer to "thinking" machines than ever before. "It's getting to a point where we might be able to say this thing has a sense of itself, and maybe there is a threshold moment where suddenly this consciousness emerges," mathematician Marcus du Sautoy from the University of Oxford said. "And if we understand these things are having a level of consciousness, we might well have to introduce rights."




Digital Analogues (Intro): Artificial Intelligence Systems Should Be Treated Like... - FLI - Future of Life Institute

#artificialintelligence

This piece was originally published on Medium in Imaginary Papers, an online publication of Arizona State University's Center for Science and the Imagination. Matt Scherer runs the Law and AI blog. Artificial intelligence (A.I.) systems are becoming increasingly ubiquitous in our economy and society, and are being designed with an ever-increasing ability to operate free of direct human supervision. Algorithmic trading systems account for a huge and still-growing share of stock market transactions, and autonomous vehicles with A.I. "drivers" are already being tested on the roads. Because they operate with less human supervision and control than earlier technologies, the rising prevalence of autonomous A.I. raises the question of how legal systems can ensure that victims receive compensation if (read: when) an A.I. system causes physical or economic harm during the course of its operations.