Are GANs Created Equal? A Large-Scale Study
Generative adversarial networks (GANs) are a powerful subclass of generative models. Despite very rich research activity leading to numerous interesting GAN algorithms, it is still very hard to assess which algorithms perform better than others. We conduct a neutral, multi-faceted, large-scale empirical study of state-of-the-art models and evaluation measures. We find that most models can reach similar scores with enough hyperparameter optimization and random restarts. This suggests that improvements can arise more from a higher computational budget and tuning than from fundamental algorithmic changes. To overcome some limitations of the current metrics, we also propose several datasets on which precision and recall can be computed. Our experimental results suggest that future GAN research should be based on more systematic and objective evaluation procedures.
A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation
Leite, João A., Arora, Arnav, Gargova, Silvia, Luz, João, Sampaio, Gustavo, Roberts, Ian, Scarton, Carolina, Bontcheva, Kalina
While Large Language Models (LLMs) have made agentic AI, chatbots, and other intelligent applications possible, they have also enabled the affordable creation of highly convincing AI-generated disinformation (Bontcheva et al., 2024), which poses a systemic risk to democratic stability and global security (VIGINUM, 2025; Bengio, 2025). Initially, AI-generated texts suffered from linguistic mistakes and thus were more easily detectable by humans. However, modern LLMs, particularly instruction-tuned models, have significantly improved in producing outputs which are indistinguishable from human-written text (Spitale et al., 2023; Heppell et al., 2024). These advances have resulted in their misuse in generating persuasive disinformation narratives, including political manipulation, health disinformation, conspiracy propagation, and Foreign Information Manipulation and Interference (FIMI) (Vykopal et al., 2024; Chen and Shu, 2024a; Barman et al., 2024; Chen and Shu, 2024b; Heppell et al., 2024; VIGINUM, 2025). While there is a growing body of research on the generation and detection of LLM-produced disinformation (Chen and Shu, 2024a; Lucas et al., 2023; Vykopal et al., 2024; Heppell et al., 2024), a critical aspect remains largely unstudied: namely, whether LLMs are capable of generating fluent and convincing personalised disinformation (i.e., disinformation narratives tailored to specific audiences) in multiple languages and at scale. The few prior studies on AI-generated personalised disinformation are limited to English and address a very narrow set of personas (e.g., students, parents) (Zugecova et al., 2024). Crucially, prior work has not yet examined whether LLMs can adapt disinformation to country-specific linguistic and cultural contexts in multiple languages.
Actions Speak Louder than Prompts: A Large-Scale Study of LLMs for Graph Inference
Finkelshtein, Ben, Cucerzan, Silviu, Jauhar, Sujay Kumar, White, Ryen
Large language models (LLMs) are increasingly used for text-rich graph machine learning tasks such as node classification in high-impact domains like fraud detection and recommendation systems. Yet, despite a surge of interest, the field lacks a principled understanding of the capabilities of LLMs in their interaction with graph data. In this work, we conduct a large-scale, controlled evaluation across several key axes of variability to systematically assess the strengths and weaknesses of LLM-based graph reasoning methods in text-based applications. The axes include the LLM-graph interaction mode, comparing prompting, tool-use, and code generation; dataset domains, spanning citation, web-link, e-commerce, and social networks; structural regimes contrasting homophilic and heterophilic graphs; feature characteristics involving both short- and long-text node attributes; and model configurations with varying LLM sizes and reasoning capabilities. We further analyze dependencies by methodically truncating features, deleting edges, and removing labels to quantify reliance on input types. Our findings provide practical and actionable guidance. (1) LLMs as code generators achieve the strongest overall performance on graph data, with especially large gains on long-text or high-degree graphs where prompting quickly exceeds the token budget. (2) All interaction strategies remain effective on heterophilic graphs, challenging the assumption that LLM-based methods collapse under low homophily. (3) Code generation is able to flexibly adapt its reliance between structure, features, or labels to leverage the most informative input type. Together, these findings provide a comprehensive view of the strengths and limitations of current LLM-graph interaction modes and highlight key design principles for future approaches.
Reviews: Are GANs Created Equal? A Large-Scale Study
This paper introduces a large set of experiments to compare recently proposed GANs. It discusses two previously proposed measures -- Inception Score (IS) and Fréchet Inception Distance (FID) -- and it proposes a new measure in the context of GAN assessment, based on precision, recall, and F1. Precision (P) is measured as the fraction of generated samples with distance below a pre-defined threshold \delta, while recall (R) is measured as the fraction of inversely generated samples (from the test set) with squared Euclidean distance below \delta (F1 is the usual harmonic mean of P and R). The paper argues that IS only measures precision while FID measures both, so IS is essentially dropped as a measurement for GANs. The paper then argues that it is important to report the mean and variance of FID and P-R-F1 measurements instead of the best values, computed over a set of random initialisations and hyper-parameter search points.
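The precision/recall/F1 procedure described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the distances of generated samples (to the data manifold) and the reconstruction errors of inverted test samples have already been computed, and the function name and argument layout are hypothetical.

```python
import numpy as np

def precision_recall_f1(generated_dists, inverted_test_errors, delta):
    """Threshold-based precision/recall/F1 for GAN assessment (sketch).

    generated_dists: distances of generated samples to the data manifold.
    inverted_test_errors: squared Euclidean reconstruction errors of test
        samples mapped back ("inverted") into the latent space.
    delta: the pre-defined distance threshold.
    """
    # Precision: fraction of generated samples within delta of the manifold.
    precision = float(np.mean(np.asarray(generated_dists) < delta))
    # Recall: fraction of test samples invertible with error below delta.
    recall = float(np.mean(np.asarray(inverted_test_errors) < delta))
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with generated distances `[0.1, 0.5, 0.9]`, inversion errors `[0.2, 0.8]`, and `delta = 0.6`, the sketch yields P = 2/3 and R = 1/2.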
Large-Scale Study of Temporal Shift in Health Insurance Claims
Ji, Christina X, Alaa, Ahmed M, Sontag, David
Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task--that is, an outcome to be predicted at a particular time point--to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.
Large-Scale Study of Curiosity-Driven Learning
Reinforcement learning algorithms rely on carefully engineered environment rewards that are extrinsic to the agent. However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the development of reward functions that are intrinsic to the agent. Curiosity is a type of intrinsic reward function which uses prediction error as the reward signal. In this paper, we perform the first large-scale study of purely curiosity-driven learning, i.e., without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite. Our results show surprisingly good performance and a high degree of alignment between the intrinsic curiosity objective and the hand-designed extrinsic rewards of many game environments.
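The core mechanism described above (prediction error as intrinsic reward) can be illustrated with a toy sketch. This is not the paper's learned network: the forward model here is a fixed random linear map, and all names are hypothetical, but the reward computation follows the same idea -- the agent is rewarded for visiting transitions its dynamics model predicts poorly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy forward dynamics model: predicts the next state from (state, action)
# via a fixed random linear map. In curiosity-driven learning this model
# would be learned alongside the policy.
W = rng.normal(size=(4, 4))

def forward_model(state, action):
    # Hypothetical dynamics prediction.
    return W @ state + 0.1 * action

def curiosity_reward(state, action, next_state):
    # Intrinsic reward = squared prediction error of the forward model.
    predicted = forward_model(state, action)
    return float(np.sum((predicted - next_state) ** 2))
```

A transition the model predicts exactly yields zero intrinsic reward, while surprising transitions yield large rewards, pushing the agent toward novel states.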
Who Does What on the Web: A Large-Scale Study of Browsing Behavior
Goel, Sharad (Yahoo! Research) | Hofman, Jake M. (Yahoo! Research) | Sirer, M. Irmak (Northwestern University)
As the Web has become integrated into daily life, understanding how individuals spend their time online impacts domains ranging from public policy to marketing. It is difficult, however, to measure even simple aspects of browsing behavior via conventional methods---including surveys and site-level analytics---due to limitations of scale and scope. In part addressing these limitations, large-scale Web panel data are a relatively novel means for investigating patterns of Internet usage. In one of the largest studies of browsing behavior to date, we pair Web histories for 250,000 anonymized individuals with user-level demographics---including age, sex, race, education, and income---to investigate three topics. First, we examine how behavior changes as individuals spend more time online, showing that the heaviest users devote nearly twice as much of their time to social media relative to typical individuals. Second, we revisit the digital divide, finding that the frequency with which individuals turn to the Web for research, news, and healthcare is strongly related to educational background, but not as closely tied to gender and ethnicity. Finally, we demonstrate that browsing histories are a strong signal for inferring user attributes, including ethnicity and household income, a result that may be leveraged to improve ad targeting.