Goto

Collaborating Authors

 Generative AI


Organizational Adaptation to Generative AI in Cybersecurity: A Systematic Review

arXiv.org Artificial Intelligence

Cybersecurity organizations are adapting to GenAI integration through modified frameworks and hybrid operational processes, with success influenced by existing security maturity, regulatory requirements, and investments in human capital and infrastructure. This qualitative research employs systematic document analysis and comparative case study methodology to examine how cybersecurity organizations adapt their threat modeling frameworks and operational processes to address generative artificial intelligence integration. Through examination of 25 studies from 2022 to 2025, the research documents substantial transformation in organizational approaches to threat modeling, moving from traditional signature-based systems toward frameworks incorporating artificial intelligence capabilities. The research identifies three primary adaptation patterns: Large Language Model integration for security applications, GenAI frameworks for risk detection and response automation, and AI/ML integration for threat hunting. Organizations with mature security infrastructures, particularly in finance and critical infrastructure sectors, demonstrate higher readiness through structured governance approaches, dedicated AI teams, and robust incident response processes. Organizations achieve successful GenAI integration when they maintain appropriate human oversight of automated systems, address data quality concerns and explainability requirements, and establish governance frameworks tailored to their specific sectors. Organizations encounter ongoing difficulties with privacy protection, bias reduction, personnel training, and defending against adversarial attacks. This work advances understanding of how organizations adopt innovative technologies in high-stakes environments and offers actionable insights for cybersecurity professionals implementing GenAI systems.


The Download: how AI can improve a city, and inside OpenAI's empire

MIT Technology Review

Bright LEDs could spell the end of dark skies Scientists have known for years that light pollution is growing and can harm both humans and wildlife. In people, increased exposure to light at night disrupts sleep cycles and has been linked to cancer and cardiovascular disease, while wildlife suffers from interruption to their reproductive patterns, and increased danger. Astronomers, policymakers, and lighting professionals are all working to find ways to reduce light pollution. Many of them advocate installing light-emitting diodes, or LEDs, in outdoor fixtures such as city streetlights, mainly for their ability to direct light to a targeted area. But the high initial investment and durability of modern LEDs mean cities need to get the transition right the first time or potentially face decades of consequences.


The Biased Samaritan: LLM biases in Perceived Kindness

arXiv.org Artificial Intelligence

While Large Language Models (LLMs) have become ubiquitous in many fields, understanding and mitigating LLM biases is an ongoing issue. This paper provides a novel method for evaluating the demographic biases of various generative AI models. By prompting models to assess a moral patient's willingness to intervene constructively, we aim to quantitatively evaluate different LLMs' biases towards various genders, races, and ages. Our work differs from existing work by aiming to determine the baseline demographic identities for various commercial models and the relationship between the baseline and other demographics. We strive to understand if these biases are positive, neutral, or negative, and the strength of these biases. This paper can contribute to the objective assessment of bias in Large Language Models and give the user or developer the power to account for these biases in LLM output or in training future LLMs. Our analysis suggested two key findings: that models view the baseline demographic as a white middle-aged or young adult male; however, a general trend across models suggested that non-baseline demographics are more willing to help than the baseline. These methodologies allowed us to distinguish these two biases that are often tangled together.


Collaborative Prediction: To Join or To Disjoin Datasets

arXiv.org Machine Learning

With the recent rise of generative Artificial Intelligence (AI), the need of selecting high-quality dataset to improve machine learning models has garnered increasing attention. However, some part of this topic remains underexplored, even for simple prediction models. In this work, we study the problem of developing practical algorithms that select appropriate dataset to minimize population loss of our prediction model with high probability. Broadly speaking, we investigate when datasets from different sources can be effectively merged to enhance the predictive model's performance, and propose a practical algorithm with theoretical guarantees. By leveraging an oracle inequality and data-driven estimators, the algorithm reduces population loss with high probability. Numerical experiments demonstrate its effectiveness in both standard linear regression and broader machine learning applications. Code is available at https://github.com/kkrokii/collaborative_prediction.


Enter: Graduated Realism: A Pedagogical Framework for AI-Powered Avatars in Virtual Reality Teacher Training

arXiv.org Artificial Intelligence

Virtual Reality simulators offer a powerful tool for teacher training, yet the integration of AI-powered student avatars presents a critical challenge: determining the optimal level of avatar realism for effective pedagogy. This literature review examines the evolution of avatar realism in VR teacher training, synthesizes its theoretical implications, and proposes a new pedagogical framework to guide future design. Through a systematic review, this paper traces the progression from human-controlled avatars to generative AI prototypes. Applying learning theories like Cognitive Load Theory, we argue that hyper-realism is not always optimal, as high-fidelity avatars can impose excessive extraneous cognitive load on novices, a stance supported by recent empirical findings. A significant gap exists between the technological drive for photorealism and the pedagogical need for scaffolded learning. To address this gap, we propose Graduated Realism, a framework advocating for starting trainees with lower-fidelity avatars and progressively increasing behavioral complexity as skills develop. To make this computationally feasible, we outline a novel single-call architecture, Crazy Slots, which uses a probabilistic engine and a Retrieval-Augmented Generation database to generate authentic, real-time responses without the latency and cost of multi-step reasoning models. This review provides evidence-based principles for designing the next generation of AI simulators, arguing that a pedagogically grounded approach to realism is essential for creating scalable and effective teacher education tools.


The Newspaper That Hired ChatGPT

The Atlantic - Technology

For more than 20 years, print media has been a bit of a punching bag for digital-technology companies. Craigslist killed the paid classifieds, free websites led people to think newspapers and magazines were committing robbery when they charged for subscriptions, and the smartphone and social media turned reading full-length articles into a chore. Now generative AI is in the mix--and many publishers, desperate to avoid being left behind once more, are rushing to harness the technology themselves. Several major publications, including The Atlantic, have entered into corporate partnerships with OpenAI and other AI firms. Any number of experiments have ensued--publishers have used the software to help translate work into different languages, draft headlines, and write summaries or even articles.


A Survey of Automatic Evaluation Methods on Text, Visual and Speech Generations

arXiv.org Artificial Intelligence

Recent advances in deep learning have significantly enhanced generative AI capabilities across text, images, and audio. However, automatically evaluating the quality of these generated outputs presents ongoing challenges. Although numerous automatic evaluation methods exist, current research lacks a systematic framework that comprehensively organizes these methods across text, visual, and audio modalities. To address this issue, we present a comprehensive review and a unified taxonomy of automatic evaluation methods for generated content across all three modalities; We identify five fundamental paradigms that characterize existing evaluation approaches across these domains. Our analysis begins by examining evaluation methods for text generation, where techniques are most mature. We then extend this framework to image and audio generation, demonstrating its broad applicability. Finally, we discuss promising directions for future research in cross-modal evaluation methodologies.


"Check My Work?": Measuring Sycophancy in a Simulated Educational Context

arXiv.org Artificial Intelligence

This study examines how user-provided suggestions affect Large Language Models (LLMs) in a simulated educational context, where sycophancy poses significant risks. Testing five different LLMs from the OpenAI GPT-4o and GPT-4.1 model classes across five experimental conditions, we show that response quality varies dramatically based on query framing. In cases where the student mentions an incorrect answer, the LLM correctness can degrade by as much as 15 percentage points, while mentioning the correct answer boosts accuracy by the same margin. Our results also show that this bias is stronger in smaller models, with an effect of up to 30% for the GPT-4.1-nano model, versus 8% for the GPT-4o model. Our analysis of how often LLMs "flip" their answer, and an investigation into token level probabilities, confirm that the models are generally changing their answers to answer choices mentioned by students in line with the sycophancy hypothesis. This sycophantic behavior has important implications for educational equity, as LLMs may accelerate learning for knowledgeable students while the same tools may reinforce misunderstanding for less knowledgeable students. Our results highlight the need to better understand the mechanism, and ways to mitigate, such bias in the educational context.


The Role of Generative AI in Facilitating Social Interactions: A Scoping Review

arXiv.org Artificial Intelligence

Reduced social connectedness increasingly poses a threat to mental health, life expectancy, and general well-being. Generative AI (GAI) technologies, such as large language models (LLMs) and image generation tools, are increasingly integrated into applications aimed at enhancing human social experiences. Despite their growing presence, little is known about how these technologies influence social interactions. This scoping review investigates how GAI-based applications are currently designed to facilitate social interaction, what forms of social engagement they target, and which design and evaluation methodologies designers use to create and evaluate them. Through an analysis of 30 studies published since 2020, we identify key trends in application domains including storytelling, socio-emotional skills training, reminiscence, collaborative learning, music making, and general conversation. We highlight the role of participatory and co-design approaches in fostering both effective technology use and social engagement, while also examining socio-ethical concerns such as cultural bias and accessibility. This review underscores the potential of GAI to support dynamic and personalized interactions, but calls for greater attention to equitable design practices and inclusive evaluation strategies.


A Crack in the Bark: Leveraging Public Knowledge to Remove Tree-Ring Watermarks

arXiv.org Artificial Intelligence

We present a novel attack specifically designed against Tree-Ring, a watermarking technique for diffusion models known for its high imperceptibility and robustness against removal attacks. Unlike previous removal attacks, which rely on strong assumptions about attacker capabilities, our attack only requires access to the variational autoencoder that was used to train the target diffusion model, a component that is often publicly available. By leveraging this variational autoencoder, the attacker can approximate the model's intermediate latent space, enabling more effective surrogate-based attacks. Our evaluation shows that this approach leads to a dramatic reduction in the AUC of Tree-Ring detector's ROC and PR curves, decreasing from 0.993 to 0.153 and from 0.994 to 0.385, respectively, while maintaining high image quality. Notably, our attacks outperform existing methods that assume full access to the diffusion model. These findings highlight the risk of reusing public autoencoders to train diffusion models -- a threat not considered by current industry practices. Furthermore, the results suggest that the Tree-Ring detector's precision, a metric that has been overlooked by previous evaluations, falls short of the requirements for real-world deployment.