Goto

Collaborating Authors

 Generative AI


The Morning After: OpenAI's week of security issues

Engadget

Perhaps unsurprisingly, July 4th was a quiet day for news, but we've still got editorials on e-ink writing, the most-delayed video game ever and more bad news from the makers of ChatGPT. Earlier this week, engineer and Swift developer Pedro Josรฉ Pereira Vieito dug into OpenAI's Mac ChatGPT app and found that it was storing user conversations locally in plain text, rather than encrypting them. Because that app is only available from OpenAI's website, and since it's not available on the App Store, it doesn't have to follow Apple's sandboxing requirements. OpenAI released an update that added encryption to locally stored chats. Then, more bad news stemmed from issues in 2023. Last spring, a hacker obtained information about OpenAI after illicitly accessing the company's internal messaging systems.


Japanese companies lag in AI adoption, white paper says

The Japan Times

Companies in Japan lag behind those in the United States and Europe in the use of generative artificial intelligence, a government white paper showed Friday. The white paper said that 46.8% of companies in Japan use generative AI in their operations, compared with 84.7% in the U.S. and 72.7% in Germany. Japanese companies' use of generative AI has been limited to taking meeting minutes and creating emails and documents. In contrast, firms in the U.S. and Europe use the technology in a wider range of operations, including for customer services, the white paper said. Including companies that use generative AI on a trial basis, the proportion of those using the technology stands at about 70% in Japan, lower than over 90% in both the U.S. and Germany.


Improving ensemble extreme precipitation forecasts using generative artificial intelligence

arXiv.org Artificial Intelligence

An ensemble post-processing method is developed to improve the probabilistic forecasts of extreme precipitation events across the conterminous United States (CONUS). The method combines a 3-D Vision Transformer (ViT) for bias correction with a Latent Diffusion Model (LDM), a generative Artificial Intelligence (AI) method, to post-process 6-hourly precipitation ensemble forecasts and produce an enlarged generative ensemble that contains spatiotemporally consistent precipitation trajectories. These trajectories are expected to improve the characterization of extreme precipitation events and offer skillful multi-day accumulated and 6-hourly precipitation guidance. The method is tested using the Global Ensemble Forecast System (GEFS) precipitation forecasts out to day 6 and is verified against the Climate-Calibrated Precipitation Analysis (CCPA) data. Verification results indicate that the method generated skillful ensemble members with improved Continuous Ranked Probabilistic Skill Scores (CRPSSs) and Brier Skill Scores (BSSs) over the raw operational GEFS and a multivariate statistical post-processing baseline. It showed skillful and reliable probabilities for events at extreme precipitation thresholds. Explainability studies were further conducted, which revealed the decision-making process of the method and confirmed its effectiveness on ensemble member generation. This work introduces a novel, generative-AI-based approach to address the limitation of small numerical ensembles and the need for larger ensembles to identify extreme precipitation events.


Rethinking Image Compression on the Web with Generative AI

arXiv.org Artificial Intelligence

The rapid growth of the Internet, driven by social media, web browsing, and video streaming, has made images central to the Web experience, resulting in significant data transfer and increased webpage sizes. Traditional image compression methods, while reducing bandwidth, often degrade image quality. This paper explores a novel approach using generative AI to reconstruct images at the edge or client-side. We develop a framework that leverages text prompts and provides additional conditioning inputs like Canny edges and color palettes to a text-to-image model, achieving up to 99.8% bandwidth savings in the best cases and 92.6% on average, while maintaining high perceptual similarity. Empirical analysis and a user study show that our method preserves image meaning and structure more effectively than traditional compression methods, offering a promising solution for reducing bandwidth usage and improving Internet affordability with minimal degradation in image quality.


OpenAI hit by two big security issues this week

Engadget

OpenAI seems to make headlines every day and this time it's for a double dose of security concerns. The first issue centers on the Mac app for ChatGPT, while the second hints at broader concerns about how the company is handling its cybersecurity. Earlier this week, engineer and Swift developer Pedro Josรฉ Pereira Vieito dug into the Mac ChatGPT app and found that it was storing user conversations locally in plain text rather than encrypting them. The app is only available from OpenAI's website, and since it's not available on the App Store, it doesn't have to follow Apple's sandboxing requirements. Vieito's work was then covered by The Verge, and after the exploit attracted attention, OpenAI released an update that added encryption to locally stored chats.


Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning

arXiv.org Artificial Intelligence

In this paper, we introduce a comprehensive reformulation of the task known as Circuit Discovery, along with DiscoGP, a novel and effective algorithm based on differentiable masking for discovering circuits. Circuit discovery is the task of interpreting the computational mechanisms of language models (LMs) by dissecting their functions and capabilities into sparse subnetworks (circuits). We identified two major limitations in existing circuit discovery efforts: (1) a dichotomy between weight-based and connection-edge-based approaches forces researchers to choose between pruning connections or weights, thereby limiting the scope of mechanistic interpretation of LMs; (2) algorithms based on activation patching tend to identify circuits that are neither functionally faithful nor complete. The performance of these identified circuits is substantially reduced, often resulting in near-random performance in isolation. Furthermore, the complement of the circuit -- i.e., the original LM with the identified circuit removed -- still retains adequate performance, indicating that essential components of a complete circuits are missed by existing methods. DiscoGP successfully addresses the two aforementioned issues and demonstrates state-of-the-art faithfulness, completeness, and sparsity. The effectiveness of the algorithm and its novel structure open up new avenues of gathering new insights into the internal workings of generative AI.


Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence

arXiv.org Artificial Intelligence

In this study, we explored the progression trajectories of artificial intelligence (AI) systems through the lens of complexity theory. We challenged the conventional linear and exponential projections of AI advancement toward Artificial General Intelligence (AGI) underpinned by transformer-based architectures, and posited the existence of critical points, akin to phase transitions in complex systems, where AI performance might plateau or regress into instability upon exceeding a critical complexity threshold. We employed agent-based modelling (ABM) to simulate hypothetical scenarios of AI systems' evolution under specific assumptions, using benchmark performance as a proxy for capability and complexity. Our simulations demonstrated how increasing the complexity of the AI system could exceed an upper criticality threshold, leading to unpredictable performance behaviours. Additionally, we developed a practical methodology for detecting these critical thresholds using simulation data and stochastic gradient descent to fine-tune detection thresholds. This research offers a novel perspective on AI advancement that has a particular relevance to Large Language Models (LLMs), emphasising the need for a tempered approach to extrapolating AI's growth potential and underscoring the importance of developing more robust and comprehensive AI performance benchmarks.


Advances in Diffusion Models for Image Data Augmentation: A Review of Methods, Models, Evaluation Metrics and Future Research Directions

arXiv.org Artificial Intelligence

Image data augmentation constitutes a critical methodology in modern computer vision tasks, since it can facilitate towards enhancing the diversity and quality of training datasets; thereby, improving the performance and robustness of machine learning models in downstream tasks. In parallel, augmentation approaches can also be used for editing/modifying a given image in a context- and semantics-aware way. Diffusion Models (DMs), which comprise one of the most recent and highly promising classes of methods in the field of generative Artificial Intelligence (AI), have emerged as a powerful tool for image data augmentation, capable of generating realistic and diverse images by learning the underlying data distribution. The current study realizes a systematic, comprehensive and in-depth review of DM-based approaches for image augmentation, covering a wide range of strategies, tasks and applications. In particular, a comprehensive analysis of the fundamental principles, model architectures and training strategies of DMs is initially performed. Subsequently, a taxonomy of the relevant image augmentation methods is introduced, focusing on techniques regarding semantic manipulation, personalization and adaptation, and application-specific augmentation tasks. Then, performance assessment methodologies and respective evaluation metrics are analyzed. Finally, current challenges and future research directions in the field are discussed.


A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)

arXiv.org Artificial Intelligence

Traditional recommender systems (RS) typically use user-item rating histories as their main data source. However, deep generative models now have the capability to model and sample from complex data distributions, including user-item interactions, text, images, and videos, enabling novel recommendation tasks. This comprehensive, multidisciplinary survey connects key advancements in RS using Generative Models (Gen-RecSys), covering: interaction-driven generative models; the use of large language models (LLM) and textual data for natural language recommendation; and the integration of multimodal models for generating and processing images/videos in RS. Our work highlights necessary paradigms for evaluating the impact and harm of Gen-RecSys and identifies open challenges. This survey accompanies a tutorial presented at ACM KDD'24, with supporting materials provided at: https://encr.pw/vDhLq.


Can Base ChatGPT be Used for Forecasting without Additional Optimization?

arXiv.org Artificial Intelligence

This study investigates whether OpenAI's ChatGPT-3.5 and ChatGPT-4 can forecast future events. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of our experiments (mid 2023) stopped at September 2021, and ask about events that happened in 2022. We employed two prompting strategies: direct prediction and what we call future narratives which ask ChatGPT to tell fictional stories set in the future with characters retelling events that happened in the past, but after ChatGPT's training data had been collected. We prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 trials, we find that future narrative prompts significantly enhanced ChatGPT-4's forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures like the Federal Reserve Chair, Jerome Powell. As a falsification exercise, we repeated our experiments in May 2024 at which time the models included more recent training data. ChatGPT-4's accuracy significantly improved when the training window included the events being prompted for, achieving 100% accuracy in many instances. The poorer accuracy for events outside of the training window suggests that in the 2023 prediction experiments, ChatGPT-4 was forming predictions based solely on its training data. Narrative prompting also consistently outperformed direct prompting. These findings indicate that narrative prompts leverage the models' capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs' predictive capabilities and suggests potential future applications in analytical contexts.