Generative AI
AI bubble or 'revolution'? OpenAI's big payday fuels debate
Fear of missing out has rocketed the value of artificial intelligence companies, despite few signs as to when the technology will turn a profit, raising talk of AI overenthusiasm. The mystery deepens when it comes to predicting which generative AI firms will prevail, according to analysts. ChatGPT-maker OpenAI secured $6.6 billion in a funding round that propelled its valuation to an eye-popping $157 billion, sparking new worries there is an AI bubble poised to burst.
Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
Wang, Wenxuan, Gao, Kuiyi, Jia, Zihan, Yuan, Youliang, Huang, Jen-tse, Liu, Qiuzhi, Wang, Shuai, Jiao, Wenxiang, Tu, Zhaopeng
WARNING: This paper contains unsafe model generation. Text-based image generation models, such as Stable Diffusion and DALL-E 3, hold significant potential in content creation and publishing workflows, making them the focus in recent years. Despite their remarkable capability to generate diverse and vivid images, considerable efforts are being made to prevent the generation of harmful content, such as abusive, violent, or pornographic material. To assess the safety of existing models, we introduce a novel jailbreaking method called Chain-of-Jailbreak (CoJ) attack, which compromises image generation models through a step-by-step editing process. Specifically, for malicious queries that cannot bypass the safeguards with a single prompt, we intentionally decompose the query into multiple sub-queries. The image generation models are then prompted to generate and iteratively edit images based on these sub-queries. To evaluate the effectiveness of our CoJ attack method, we constructed a comprehensive dataset, CoJ-Bench, encompassing nine safety scenarios, three types of editing operations, and three editing elements. Experiments on four widely-used image generation services provided by GPT-4V, GPT-4o, Gemini 1.5 and Gemini 1.5 Pro, demonstrate that our CoJ attack method can successfully bypass the safeguards of models in over 60% of cases, which significantly outperforms other jailbreaking methods (i.e., 14%). Further, to enhance these models' safety against our CoJ attack method, we also propose an effective prompting-based method, Think Twice Prompting, that can successfully defend against over 95% of CoJ attacks. Image generation models, which generate images from a given text, have recently drawn lots of interest from academia and the industry.
Generative Artificial Intelligence for Navigating Synthesizable Chemical Space
Gao, Wenhao, Luo, Shitong, Coley, Connor W.
We introduce SynFormer, a generative modeling framework designed to efficiently explore and navigate synthesizable chemical space. Unlike traditional molecular generation approaches, we generate synthetic pathways for molecules to ensure that designs are synthetically tractable. By incorporating a scalable transformer architecture and a diffusion module for building block selection, SynFormer surpasses existing models in synthesizable molecular design. We demonstrate SynFormer's effectiveness in two key applications: (1) local chemical space exploration, where the model generates synthesizable analogs of a reference molecule, and (2) global chemical space exploration, where the model aims to identify optimal molecules according to a black-box property prediction oracle. Additionally, we demonstrate the scalability of our approach via the improvement in performance as more computational resources become available. With our code and trained models openly available, we hope that SynFormer will find use across applications in drug discovery and materials science.
Images Speak Volumes: User-Centric Assessment of Image Generation for Accessible Communication
Anschütz, Miriam, Sylaj, Tringa, Groh, Georg
Explanatory images play a pivotal role in accessible and easy-to-read (E2R) texts. However, the images available in online databases are not tailored toward the respective texts, and the creation of customized images is expensive. In this large-scale study, we investigated whether text-to-image generation models can close this gap by providing customizable images quickly and easily. We benchmarked seven image generation models, four open-source and three closed-source, and provide an extensive evaluation of the resulting images. In addition, we performed a user study with people from the E2R target group to examine whether the images met their requirements. We find that some of the models show remarkable performance, but none of the models are ready to be used at a larger scale without human supervision. Our research is an important step toward facilitating the creation of accessible information for E2R creators and tailoring accessible images to the target group's needs.
OpenAI rolls out Canvas, its newest ChatGPT interface
OpenAI is beta testing a new workspace interface for ChatGPT called Canvas. The AI giant unveiled its new ChatGPT workspace on its official blog and it's currently available for ChatGPT Plus and Team users. Enterprise and Edu users will be able to access Canvas sometime next week. Canvas is a virtual interface space for writing and coding projects that allows users to consult with ChatGPT on certain portions of a project. A separate window opens beside the main chat space and users can put writing or code on this new "canvas" and highlight sections to have the model focus on and edit "like a copy editor or code reviewer," according to the blog.
Google will expand Gemini Live to over 40 languages in the coming weeks
Gemini Live, Google's AI chatbot you can talk to like a person, is about to support more languages. The company is rolling out support for the generative AI virtual assistant in over 40 languages in the coming weeks. Gemini Live is Google's take on "free-flowing, natural conversations" in this new generative AI era. You can use it for things like brainstorming for events, diving down learning rabbit holes or practicing for job interview questions (and receiving real-time feedback). Although Google describes it as like talking with a friend, I'm unsure how many would do all of that.
OpenAI's ChatGPT Breaks Out of Its Box--and Onto a Canvas
Just one day after OpenAI announced a $6.6 billion funding round, the company is launching its first major interface evolution for ChatGPT. In what could be recognition from OpenAI that its transformational chatbot is ready for user experiences beyond a question and answer format, the new beta feature is an editable canvas that opens in a window alongside ChatGPT's standard chatbox. "The core thing we're trying to solve is a better way to collaborate with ChatGPT on writing and coding," says Daniel Levine, a product lead at OpenAI for the canvas feature. Canvas is rolling out in beta to ChatGPT Plus and Team subscribers today, and Enterprise and Edu customers will likely get the feature next week. The feature is fully functional on desktops--mobile users can only view the canvas projects for now.
OpenAI now has a $4 billion credit line on top of $6.6 billion in funding
Keeping ChatGPT running is expensive as heck, so OpenAI needs access to plenty of cash to make sure the lights stay on. A day after the company said it had secured $6.6 billion in funding -- the biggest ever funding round for a startup -- it confirmed that it has a new $4 billion revolving line of credit. OpenAI has yet to tap the credit line, which it obtained from JPMorgan Chase, Citi, Goldman Sachs, Morgan Stanley, Santander, Wells Fargo, SMBC, UBS and HSBC. Some of those banks are also among OpenAI's customers. All told, OpenAI now has a war chest of over $10 billion in liquid funds.
SoftBank's Son envisions AI running households in the next few years
SoftBank Group founder Masayoshi Son sketched out one of the most aggressive timelines for the adoption of artificial intelligence yet, envisioning a near future where the technology would run entire households. AI will soon be able to monitor the health of family members, call the doctor when needed, do grocery shopping, make reservations, judge optimal investments and tutor young children, Son said in a speech at an annual forum for enterprise clients on Thursday. He moved up his expectation for when artificial general intelligence -- the long-term goal for developers from OpenAI to Meta Platforms and Alphabet's Google -- would arrive to within the next two to three years.