Generative AI
PixLens: A Novel Framework for Disentangled Evaluation in Diffusion-Based Image Editing with Object Detection + SAM
Stefanache, Stefan, Pérez, Lluís Pastor, Watanabe, Julen Costa, Tejedor, Ernesto Sanchez, Hofmann, Thomas, Simsar, Enis
Evaluating diffusion-based image-editing models is a crucial task in the field of Generative AI. Specifically, it is imperative to assess their capacity to execute diverse editing tasks while preserving the image content and realism. While recent developments in generative models have opened up previously unheard-of possibilities for image editing, conducting a thorough evaluation of these models remains a challenging and open task. The absence of a standardized evaluation benchmark, primarily due to the inherent need for a post-edit reference image for evaluation, further complicates this issue. Currently, evaluations often rely on established models such as CLIP or require human intervention for a comprehensive understanding of the performance of these image editing models. Our benchmark, PixLens, provides a comprehensive evaluation of both edit quality and latent representation disentanglement, contributing to the advancement and refinement of existing methodologies in the field.
Reviews: Flexible and accurate inference and learning for deep generative models
This paper presents an alternative to variational autoencoders and other generative models with latent variables that rely on the wake-sleep algorithm for training. The main problem with the wake-sleep algorithm is its bias: the recognition model has different conditional independencies than the generative model and it's trained to optimize a different objective. DDC-HM solves this by instead working with sufficient statistics, and using those to implicitly define the maximum entropy distribution consistent with those statistics. The measurements chosen are random functions. The methods are evaluated on synthetic data and two small vision datasets (image patches and MNIST), comparing against two baselines using the MMD metric. I don't know the related work well enough to evaluate the novelty with confidence.
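For readers unfamiliar with the MMD metric mentioned in the review, the following is a minimal sketch of a biased (V-statistic) estimate of squared maximum mean discrepancy between two sample sets. The RBF kernel, the `bandwidth` parameter, and the function names are assumptions made for illustration; the review does not say which kernel or estimator the paper uses.

```python
import math


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of MMD^2: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""

    def mean_kernel(ps, qs):
        # Average kernel value over all pairs, including identical pairs.
        total = sum(rbf_kernel(p, q, bandwidth) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

A value near zero suggests the two sample sets are hard to tell apart under the chosen kernel, and larger values indicate a bigger mismatch.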
The OpenAI Talent Exodus Gives Rivals an Opening
When investors poured $6.6 billion into OpenAI last week, they seemed largely unbothered by the latest drama, which recently saw the company's chief technology officer, Mira Murati, along with chief research officer Bob McGrew and Barret Zoph, a vice president of research, abruptly quit. And yet those three departures were just the latest in an ongoing exodus of key technical talent. Over the past few years, OpenAI has lost several researchers who played crucial roles in developing the algorithms, techniques, and infrastructure that helped make it the world leader in AI as well as a household name. Several other ex-OpenAI employees who spoke to WIRED said that an ongoing shift to a more commercial focus continues to be a source of friction. "People who like to do research are being forced to do product," says one former employee who works at a rival AI company but has friends at OpenAI. This person says some of their contacts at the firm have reached out in recent weeks to inquire about jobs.
Reviews: Deep Generative Models for Distribution-Preserving Lossy Compression
The paper proposes a novel problem formulation for lossy compression, namely distribution-preserving lossy compression (DPLC). For a rate-constrained lossy compression scheme with a large enough rate, it is possible to (almost) exactly reconstruct the original signal from its compressed version. However, as the rate gets smaller, the reconstructed signal necessarily has very high distortion. The DPLC formulation aims to alleviate this issue by enforcing an additional constraint during the design of the encoder and the decoder of the compression scheme. The constraint requires that, irrespective of the rate of the compression, the distribution of the reconstructed signal is the same as that of the original signal.
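One way to write down the formulation described above is as a constrained optimization problem. The notation here is an assumption introduced for illustration and may differ from the paper's: $E$ is the encoder, $D$ the decoder, $d$ a distortion measure, $R(\cdot)$ the rate of the code, $R_0$ the rate budget, and $P_X$ the distribution of the original signal $X$.

```latex
\begin{aligned}
\min_{E,\,D} \quad & \mathbb{E}_{X \sim P_X}\bigl[\, d\bigl(X,\, D(E(X))\bigr) \,\bigr] \\
\text{subject to} \quad & R\bigl(E(X)\bigr) \le R_0, \\
& P_{D(E(X))} = P_X .
\end{aligned}
```

Under this reading, the first constraint is the usual rate budget, and the second is the additional distribution-preserving requirement: it must hold at every rate, including rates too small for accurate reconstruction.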
The Race to Block OpenAI's Scraping Bots Is Slowing Down
It's too soon to say how the spate of deals between AI companies and publishers will shake out. OpenAI has already scored one clear win, though: Its web crawlers aren't getting blocked by top news outlets at the rate they once were. The generative AI boom sparked a gold rush for data--and a subsequent data-protection rush (for most news websites, anyway) in which publishers sought to block AI crawlers and prevent their work from becoming training data without consent. When Apple debuted a new AI agent this summer, for example, a slew of top news outlets swiftly opted out of Apple's web scraping using the Robots Exclusion Protocol, or robots.txt, the file that allows webmasters to control bots. There are so many new AI bots on the scene that it can feel like playing whack-a-mole to keep up.
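As a rough illustration of how the Robots Exclusion Protocol mentioned above works in practice, the sketch below parses a small robots.txt file with Python's standard `urllib.robotparser` module and checks whether a given crawler may fetch a URL. The file contents, the `is_allowed` helper, and the example URL are hypothetical; `GPTBot` is used only as an example user-agent token.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"
    print("GPTBot:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("OtherBot:", is_allowed(ROBOTS_TXT, "OtherBot", page))
```

Note that robots.txt is advisory: it tells well-behaved crawlers what the site owner wants, but it does not technically prevent a bot from fetching a page.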
Reviews: Bias and Generalization in Deep Generative Models: An Empirical Study
After reading author responses and discussing with other reviewers, I have decided to raise my score. I think the authors did a good job in their response to the points I raised. However, I still think that there should be more emphasis in the paper on the significance of the observations made in the paper, which was not clear to me at first. The study relies on probative experiments using synthetic image datasets (e.g. CLEVR, colored dots, pie shapes with various color proportions) in which observations can be explained by few, independent factors or features (e.g.
ChatGPT is changing the way we write. Here's how - and why it's a problem
Have you noticed certain words and phrases popping up everywhere lately? Phrases such as "delve into" and "navigate the landscape" seem to feature in everything from social media posts to news articles and academic publications. They may sound fancy, but their overuse can make a text feel monotonous and repetitive. This trend may be linked to the increasing use of generative artificial intelligence (AI) tools such as ChatGPT and other large language models (LLMs). These tools are designed to make writing easier by offering suggestions based on patterns in the text they were trained on.
Reviews: Semi-crowdsourced Clustering with Deep Generative Models
A complex DGM is proposed that jointly models observations with crowdsourced annotations of whether or not two observations belong to the same cluster. This allows crowdsourcing non-expert annotations to help with clustering complex data. Importantly, the model is developed for the semi-supervised case, i.e., annotations are only observed for a small proportion of observation pairs. The authors propose a hierarchical VAE structure to model the observations, with a discrete latent variable z \sim p(z | \pi), a continuous latent variable x \sim p(x | z), and observed data o \sim p(o | x). This is paired with a two-coin Dawid-Skene model which is conditioned on the mixture variable z for annotations: L \sim p(L | z_i, z_j, \alpha, \beta), where \alpha and \beta are annotator-specific latent variables that model the "expertise" of the m-th annotator (precision and recall parameters, respectively). Although it is not explicitly stated in the paper, to the best of my understanding z represents cluster association in the model, given the dependence of the two-coin model on the latent mixture association.
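The generative process described in this review can be sketched as ancestral sampling. Everything concrete below is an assumption made for illustration and is not taken from the paper: one-dimensional Gaussians stand in for p(x | z) and p(o | x), `alpha[m]` is read as the probability that annotator m answers "same" when the two items share a cluster, `beta[m]` as the probability of answering "different" when they do not, and `pair_fraction` is a hypothetical knob that mimics the semi-supervised setting.

```python
import random


def sample_dataset(n, pi, means, alpha, beta, pair_fraction=0.1, seed=0):
    """Sample cluster ids, latents, observations, and sparse pairwise annotations."""
    rng = random.Random(seed)
    # z ~ p(z | pi): discrete cluster assignment for each item.
    z = rng.choices(range(len(pi)), weights=pi, k=n)
    # x ~ p(x | z): continuous latent, here a unit-variance Gaussian per cluster (assumed).
    x = [rng.gauss(means[k], 1.0) for k in z]
    # o ~ p(o | x): observation, here the latent plus small Gaussian noise (assumed).
    o = [xi + rng.gauss(0.0, 0.1) for xi in x]

    labels = {}  # (i, j, m) -> 1 if annotator m says "same cluster", else 0
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() >= pair_fraction:
                continue  # most pairs stay unannotated (semi-supervised case)
            same = z[i] == z[j]
            for m in range(len(alpha)):
                if same:
                    # First coin: chance of correctly answering "same".
                    labels[(i, j, m)] = 1 if rng.random() < alpha[m] else 0
                else:
                    # Second coin: chance of correctly answering "different".
                    labels[(i, j, m)] = 0 if rng.random() < beta[m] else 1
    return z, x, o, labels
```

With `alpha` and `beta` both close to 1 the annotations reproduce the true co-clustering of z, and lowering either value injects annotator-specific noise of the corresponding kind.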
SoK: Towards Security and Safety of Edge AI
Wingarz, Tatjana, Lauscher, Anne, Edinger, Janick, Kaaser, Dominik, Schulte, Stefan, Fischer, Mathias
Advanced AI applications have become increasingly available to a broad audience, e.g., as centrally managed large language models (LLMs). Such centralization is both a risk and a performance bottleneck - Edge AI promises to be a solution to these problems. However, its decentralized approach raises additional challenges regarding security and safety. In this paper, we argue that both of these aspects are critical for Edge AI, and even more so, their integration. Concretely, we survey security and safety threats, summarize existing countermeasures, and collect open challenges as a call for more research in this area.
AI-Enhanced Ethical Hacking: A Linux-Focused Experiment
Al-Sinani, Haitham S., Mitchell, Chris J.
This technical report investigates the integration of generative AI (GenAI), specifically ChatGPT, into the practice of ethical hacking through a comprehensive experimental study and conceptual analysis. Conducted in a controlled virtual environment, the study evaluates GenAI's effectiveness across the key stages of penetration testing on Linux-based target machines operating within a virtual local area network (LAN), including reconnaissance, scanning and enumeration, gaining access, maintaining access, and covering tracks. The findings confirm that GenAI can significantly enhance and streamline the ethical hacking process while underscoring the importance of balanced human-AI collaboration rather than the complete replacement of human input. The report also critically examines potential risks such as misuse, data biases, hallucination, and over-reliance on AI. This research contributes to the ongoing discussion on the ethical use of AI in cybersecurity and highlights the need for continued innovation to strengthen security defences.