Generative AI
Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation
Vaeth, Philipp, Fruehwald, Alexander M., Paassen, Benjamin, Gregorova, Magda
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality. However, it is currently difficult to compare the performance of these VCE methods because evaluation procedures vary widely and often boil down to visual inspection of individual examples and small-scale user studies. In this work, we propose a framework for systematic, quantitative evaluation of VCE methods and a minimal set of metrics to be used. We use this framework to explore the effects of certain crucial design choices in the latest diffusion-based generative models for VCEs of natural image classification (ImageNet). We conduct a battery of ablation-like experiments, generating thousands of VCEs for a suite of classifiers of various complexity, accuracy and robustness. Our findings suggest multiple directions for future advancements and improvements of VCE methods. By sharing our methodology and our approach to tackling the computational challenges of such a study on a limited hardware setup (including the complete code base), we offer valuable guidance for researchers in the field, fostering consistency and transparency in the assessment of counterfactual explanations.
Generative AI Is Making Companies Even More Thirsty for Your Data
Zoom, the company that normalized attending business meetings in your pajama pants, was forced to unmute itself this week to reassure users that it would not use personal data to train artificial intelligence without their consent. A keen-eyed Hacker News user last week noticed that an update to Zoom's terms and conditions in March appeared to essentially give the company free rein to slurp up voice, video, and other data, and shovel it into machine learning systems. The new terms stated that customers "consent to Zoom's access, use, collection, creation, modification, distribution, processing, sharing, maintenance, and storage of Service Generated Data" for purposes including "machine learning or artificial intelligence (including for training and tuning of algorithms and models)." The discovery prompted critical news articles and angry posts across social media. On Monday, Zoom's chief product officer, Smita Hashim, wrote a blog post stating, "We will not use audio, video, or chat customer content to train our artificial intelligence models without your consent." The company also updated its terms to say the same.
Authors fear they have little defence against AI impersonators
Authors seem to be facing a new threat from artificial intelligence, with one finding books she didn't write being sold by Amazon under her name. There are fears that ready access to generative AI tools could make it easy for people to impersonate writers without their permission. The issue was raised by author Jane Friedman.
ChatGPT iOS app: How to use Custom Instructions
Artificial intelligence leader OpenAI has once again updated its ChatGPT chatbot smartphone app, making improvements and minor bug fixes. Recent changes made at the end of last month expanded access to Custom Instructions to iOS devices. "Custom instructions now give you more control over ChatGPT's responses. Set your preferences once, and they'll steer future conversations. This feature is now available for Plus users and expanding to all users in the coming weeks," the update on July 28 noted.
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Liu, Yang, Yao, Yuanshun, Ton, Jean-Francois, Zhang, Xiaoying, Guo, Ruocheng, Cheng, Hao, Klochkov, Yegor, Taufiq, Muhammad Faaiz, Li, Hang
Ensuring alignment, which refers to making models behave in accordance with human intentions [1,2], has become a critical task before deploying large language models (LLMs) in real-world applications. For instance, OpenAI devoted six months to iteratively aligning GPT-4 before its release [3]. However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations. This obstacle hinders systematic iteration and deployment of LLMs. To address this issue, this paper presents a comprehensive survey of key dimensions that are crucial to consider when assessing LLM trustworthiness. The survey covers seven major categories of LLM trustworthiness: reliability, safety, fairness, resistance to misuse, explainability and reasoning, adherence to social norms, and robustness. Each major category is further divided into several sub-categories, resulting in a total of 29 sub-categories. Additionally, a subset of 8 sub-categories is selected for further investigation, where corresponding measurement studies are designed and conducted on several widely-used LLMs. The measurement results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the different trustworthiness categories considered. This highlights the importance of conducting more fine-grained analyses, testing, and making continuous improvements on LLM alignment. By shedding light on these key dimensions of LLM trustworthiness, this paper aims to provide valuable insights and guidance to practitioners in the field. Understanding and addressing these concerns will be crucial in achieving reliable and ethically sound deployment of LLMs in various applications.
GPT-4 Can't Reason
GPT-4 was released in March 2023 to wide acclaim, marking a very substantial improvement across the board over GPT-3.5 (OpenAI's previously best model, which had powered the initial release of ChatGPT). However, despite the genuinely impressive improvement, there are good reasons to be highly skeptical of GPT-4's ability to reason. This position paper discusses the nature of reasoning; criticizes the current formulation of reasoning problems in the NLP community, as well as the way in which LLM reasoning performance is currently evaluated; introduces a small collection of 21 diverse reasoning problems; and performs a detailed qualitative evaluation of GPT-4's performance on those problems. Based on this analysis, the paper concludes that, despite its occasional flashes of analytical brilliance, GPT-4 at present is utterly incapable of reasoning.
How to Make AI Work for You, at Work
Brynjolfsson, along with researchers Danielle Li and Lindsey Raymond, authored a study in which generative AI was used by over 5,000 customer support agents at a call center, and found that AI tools boosted workers' productivity, reduced attrition, and were especially helpful for early-career workers. Through machine learning, the generative AI systems were able to use pattern recognition to identify successes and failures in customer service approaches. "It listened in on a whole bunch of transcripts and calls, and could see the patterns that turned out well and the ones that didn't turn out well," says Brynjolfsson. "It captured that tacit knowledge and passed it on to the less experienced workers." Brynjolfsson said the AI system was able to recommend specific features to solve a customer's problems, or a tone of voice or phrasing that might work better. "Maybe no human had ever written down those rules before but the AI system, by looking at literally millions of transcripts, was able to pick up on these patterns." AI tools are likely going to impact tasks that are "routine, predictable, or standardized," according to Tomas Chamorro-Premuzic, a professor of business psychology and author of I, Human: AI, Automation, and the Quest to Reclaim What Makes Us Unique. Though it might be tempting to brush off the sudden rise of AI tools as just a fad, Chamorro-Premuzic says it's important to become as familiar as possible with the tools, as they are likely to become ubiquitous. "These are tools that everybody will use, and if you're the only person not even trying it out or not using it, you might actually suffer," he says, comparing such resistance to deciding not to use Google's search engine.
This AI Company Releases Deepfakes Into the Wild. Can It Control Them?
Erica is on YouTube, detailing how much it costs to hire a divorce attorney in the state of Massachusetts. Dr. Dass is selling private medical insurance in the UK. But Jason has been on Facebook spreading disinformation about France's relationship with its former colony, Mali. And Gary has been caught impersonating a CEO as part of an elaborate crypto scam. They're deepfakes, let loose into the wild by Victor Riparbelli, CEO of Synthesia.
AI hysteria is a distraction: algorithms already sow disinformation in Africa
More than 70 countries are due to hold regional or national elections by the end of 2024. It will be a period of huge political significance across the globe, with more than 2 billion people (mostly from the global south) directly affected by the outcome of these elections. The stakes for the integrity of democracy have never been higher. As concerns mount about the influential role of information pollution, disseminated through the vast platforms of US and Chinese corporations, in shaping these elections, a new shadow looms: how artificial intelligence – more specifically, generative AI such as OpenAI's ChatGPT – has increasingly moved into the mainstream of technology. The recent wave of hype around AI has seen a fair share of doom-mongering.
"Generate" the Future of Work through AI: Empirical Evidence from Online Labor Markets
Liu, Jin, Xu, Xingchen, Li, Yongjun, Tan, Yong
With the advent of general-purpose Generative AI, the interest in discerning its impact on the labor market escalates. In an attempt to bridge the extant empirical void, we interpret the launch of ChatGPT as an exogenous shock, and implement a Difference-in-Differences (DID) approach to quantify its influence on text-related jobs and freelancers within an online labor marketplace. Our results reveal a significant decrease in transaction volume for gigs and freelancers directly exposed to ChatGPT. Additionally, this decline is particularly marked in units of relatively higher past transaction volume or lower quality standards. Yet, the negative effect is not universally experienced among service providers. Subsequent analyses illustrate that freelancers proficiently adapting to novel advancements and offering services that augment AI technologies can yield substantial benefits amidst this transformative period. Consequently, even though the advent of ChatGPT could conceivably substitute existing occupations, it also unfolds immense opportunities and carries the potential to reconfigure the future of work. This research contributes to the limited empirical repository exploring the profound influence of LLM-based generative AI on the labor market, furnishing invaluable insights for workers, job intermediaries, and regulatory bodies navigating this evolving landscape.
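As a rough illustration of the Difference-in-Differences (DID) idea named in the abstract above, here is a minimal two-group, two-period sketch. It is not the authors' specification: their regression model, controls, and data are not given here. The function name `did_estimate` and the toy transaction volumes are hypothetical, chosen only to show the arithmetic. The estimate is the change in the exposed group's mean outcome minus the change in the unexposed group's mean outcome, which is only meaningful under the usual parallel-trends assumption.

```python
from statistics import mean


def did_estimate(records):
    """Two-by-two difference-in-differences on group means.

    records: iterable of (treated, post, outcome) tuples, where
    `treated` marks units exposed to the shock and `post` marks
    observations taken after the shock.
    """
    # Collect outcomes into the four (treated, post) cells.
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in records:
        cells[(bool(treated), bool(post))].append(outcome)

    # Every cell must be populated for the estimate to exist.
    for key, values in cells.items():
        if not values:
            raise ValueError(f"no observations for cell {key}")

    means = {key: mean(values) for key, values in cells.items()}
    treated_change = means[(True, True)] - means[(True, False)]
    control_change = means[(False, True)] - means[(False, False)]
    return treated_change - control_change


# Hypothetical toy data: (exposed to the shock?, after launch?, transaction volume)
records = [
    (True, False, 10.0), (True, False, 12.0),
    (True, True, 7.0), (True, True, 9.0),
    (False, False, 8.0), (False, False, 10.0),
    (False, True, 8.0), (False, True, 9.0),
]

effect = did_estimate(records)
print(f"DID estimate: {effect:+.2f}")
```

With these made-up numbers the exposed group falls by 3.0 on average and the unexposed group by 0.5, so the sketch attributes a change of -2.5 to the shock. A real study would typically estimate this with a regression that includes unit and time fixed effects and would check the pre-period trends.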