Generative AI
DiffusionLight: Light Probes for Free by Painting a Chrome Ball
Phongthawee, Pakkapon, Chinchuthakun, Worameth, Sinsunthithet, Nontaphat, Raj, Amit, Jampani, Varun, Khungurn, Pramook, Suwajanakorn, Supasorn
We present a simple yet effective technique to estimate lighting in a single input image. Current techniques rely heavily on HDR panorama datasets to train neural networks to regress an input with limited field-of-view to a full environment map. However, these approaches often struggle with real-world, uncontrolled settings due to the limited diversity and size of their datasets. To address this problem, we leverage diffusion models trained on billions of standard images to render a chrome ball into the input image. Despite its simplicity, this task remains challenging: the diffusion models often insert incorrect or inconsistent objects and cannot readily generate images in HDR format. Our research uncovers a surprising relationship between the appearance of chrome balls and the initial diffusion noise map, which we utilize to consistently generate high-quality chrome balls. We further fine-tune an LDR diffusion model (Stable Diffusion XL) with LoRA, enabling it to perform exposure bracketing for HDR light estimation. Our method produces convincing light estimates across diverse settings and demonstrates superior generalization to in-the-wild scenarios.
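As a rough illustration of the exposure bracketing idea mentioned in the abstract, the sketch below merges several LDR readings of one pixel into a single linear radiance estimate. This is a generic textbook-style merge, not the authors' pipeline: the function name `merge_exposures`, the gamma of 2.2, the saturation threshold, and the exposure values are all assumptions made for the example.

```python
from typing import Sequence


def merge_exposures(
    ldr_values: Sequence[float],
    evs: Sequence[float],
    gamma: float = 2.2,        # assumed display gamma, not from the paper
    saturation: float = 0.95,  # assumed clipping threshold, not from the paper
) -> float:
    """Estimate linear radiance for one pixel from bracketed LDR values.

    ldr_values[i] is the pixel value in [0, 1] captured at exposure evs[i],
    where an exposure value of ev scales the incoming light by 2 ** ev.
    """
    if len(ldr_values) != len(evs) or not evs:
        raise ValueError("need one exposure value per LDR reading")

    estimates = []
    for value, ev in zip(ldr_values, evs):
        if value >= saturation:
            continue  # clipped highlights carry no usable information
        linear = value ** gamma              # undo the gamma encoding
        estimates.append(linear / (2.0 ** ev))  # undo the exposure scaling

    if not estimates:
        # Every reading is clipped: fall back to the darkest exposure.
        darkest = min(range(len(evs)), key=lambda i: evs[i])
        return (ldr_values[darkest] ** gamma) / (2.0 ** evs[darkest])
    return sum(estimates) / len(estimates)


# Simulate a bright light source that clips in all but the darkest bracket.
true_radiance = 4.0
evs = [0.0, -2.0, -4.0]
observed = [min(1.0, true_radiance * 2.0 ** ev) ** (1.0 / 2.2) for ev in evs]
recovered = merge_exposures(observed, evs)
```

In this toy case only the darkest bracket is unclipped, so the recovered radiance comes entirely from that reading. This shows why brackets are useful for bright light sources that saturate in a single LDR image.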
An attempt to generate new bridge types from latent space of variational autoencoder
This paper attempts to generate new bridge types using generative artificial intelligence technology. Grayscale images of bridge facades with varying component widths were rendered with the 3dsMax animation software, and the OpenCV module then applied a moderate amount of geometric transformation (rotation, horizontal scaling, vertical scaling) to obtain an image dataset of three-span beam bridges, arch bridges, cable-stayed bridges and suspension bridges. Using the Python programming language with the TensorFlow and Keras deep learning frameworks, a variational autoencoder was constructed and trained, yielding a low-dimensional bridge-type latent space that is convenient for vector operations. The variational autoencoder can combine two bridge types, starting from original human designs, into a new bridge type. Generative artificial intelligence technology can assist bridge designers in bridge-type innovation and can be used as a copilot.
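The "vector operations" on the latent space can be pictured with a minimal sketch like the one below, which blends two latent codes by linear interpolation. The two-dimensional vectors and the names `blend_latents`, `z_arch`, and `z_cable_stayed` are hypothetical. In the paper's setting the codes would come from the trained encoder and the blended code would be passed to the decoder, neither of which is shown here.

```python
from typing import List, Sequence


def blend_latents(z_a: Sequence[float], z_b: Sequence[float], weight: float = 0.5) -> List[float]:
    """Linearly interpolate between two latent vectors.

    weight = 0 returns z_a, weight = 1 returns z_b, values in between mix them.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimension")
    return [(1.0 - weight) * a + weight * b for a, b in zip(z_a, z_b)]


# Hypothetical latent codes for two bridge types (made-up numbers).
z_arch = [1.0, -2.0]
z_cable_stayed = [3.0, 0.0]

# A code halfway between the two types; decoding it would give the blended image.
z_mix = blend_latents(z_arch, z_cable_stayed, weight=0.5)
```

Sweeping `weight` from 0 to 1 would trace a path between the two bridge types in latent space, which is one simple way to look for intermediate designs.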
Global chip market forecast to grow to record $588 billion in 2024
The global semiconductor market is expected to grow 13.1% in 2024 to a record $588.36 billion, following a slump this year, thanks to growing demand for chips used for artificial intelligence, according to a forecast by an industry organization. The World Semiconductor Trade Statistics, an organization formed by major chip manufacturers, revised its growth forecast higher for the next year from the previous growth estimate made in June of 11.8%. If realized, the market size in terms of billings will exceed the previous record of $574.08 billion in 2022. In 2023, the market is expected to decrease 9.4% to $520.13 billion due to weaker demand for memory chips. The optimistic outlook comes as the industry has started to see signs of recovery in demand driven by widespread use of generative AI following the launch of ChatGPT, an AI chatbot developed by U.S.-based OpenAI, and improving sales of PCs and smartphones.
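The percentages quoted in the forecast can be checked against the billings figures with a few lines of arithmetic. The helper name `growth_pct` is just for this sketch; the dollar figures are the ones given in the article.

```python
def growth_pct(previous: float, current: float) -> float:
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Market size in billions of US dollars, as reported in the article.
billings = {2022: 574.08, 2023: 520.13, 2024: 588.36}

change_2023 = growth_pct(billings[2022], billings[2023])  # expected decline in 2023
change_2024 = growth_pct(billings[2023], billings[2024])  # forecast growth in 2024

print(f"2023: {change_2023:+.1f}%  2024: {change_2024:+.1f}%")
```

Rounded to one decimal place, the two changes match the 9.4% decline and 13.1% growth stated in the forecast.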
After a slow start, generative AI gathers speed in Japan
When Japanese business mogul Masayoshi Son, who founded SoftBank Group, took the stage at his firm's event in October, he spoke passionately about the boom in generative artificial intelligence, and asked a question: "Please raise your hand if you use ChatGPT, GPT-4 almost every day for work?" Seeing that fewer than 10% of the audience did, Son castigated the remainder: "This is bad! If you didn't raise your hand, you should be repentant and rethink your life."
Generation Z's Ability to Discriminate Between AI-generated and Human-Authored Text on Discord
Ramu, Dhruv, Jain, Rishab, Jain, Aditya
The growing popularity of generative artificial intelligence (AI) chatbots such as ChatGPT is having transformative effects on social media. As the prevalence of AI-generated content grows, concerns have been raised regarding privacy and misinformation online. Among social media platforms, Discord enables AI integrations -- making its primarily "Generation Z" userbase particularly exposed to AI-generated content. We surveyed Generation Z-aged individuals (n = 335) to evaluate their proficiency in discriminating between AI-generated and human-authored text on Discord. The investigation employed one-shot prompting of ChatGPT, disguised as a text message received on the Discord.com platform. We explore the influence of demographic factors on ability, as well as participants' familiarity with Discord and artificial intelligence technologies. We find that Generation Z individuals are unable to discern between AI and human-authored text (p = 0.011), and that those with lower self-reported familiarity with Discord demonstrated an improved ability in identifying human-authored text compared to those with self-reported experience with AI (p << 0.0001). Our results suggest that there is a nuanced relationship between AI technology and popular modes of communication for Generation Z, contributing valuable insights into human-computer interactions, digital communication, and artificial intelligence literacy.
HSC-GPT: A Large Language Model for Human Settlements Construction
Ran, Chen, Xueqi, Yao, Xuhui, Jiang, Zhengqi, Han, Jingze, Guo, Xianyue, Zhang, Chunyu, Lin, Chumin, Liu, Jing, Zhao, Zeke, Lian, Jingjing, Zhang, Keke, Li
The field of human settlement construction encompasses a range of spatial designs and management tasks, including urban planning and landscape architecture design. These tasks involve a plethora of instructions and descriptions presented in natural language, which are essential for understanding design requirements and producing effective design solutions. Recent research has sought to integrate natural language processing (NLP) and generative artificial intelligence (AI) into human settlement construction tasks. Due to the efficient processing and analysis capabilities of AI with data, significant successes have been achieved in design within this domain. However, this task still faces several fundamental challenges. The semantic information involved includes complex spatial details, diverse data source formats, high sensitivity to regional culture, and demanding requirements for innovation and rigor in work scenarios. These factors lead to limitations when applying general generative AI in this field, further exacerbated by a lack of high-quality data for model training. To address these challenges, this paper first proposes HSC-GPT, a large-scale language model framework specifically designed for tasks in human settlement construction, considering the unique characteristics of this domain.
Viz: A QLoRA-based Copyright Marketplace for Legally Compliant Generative AI
This paper introduces and comprehensively analyzes Viz, a novel system architecture that integrates Quantized Low-Rank Adapters (QLoRA) to fine-tune large language models (LLMs) within a legally compliant and resource-efficient marketplace. Viz represents a significant contribution to the field of artificial intelligence, particularly in addressing the challenges of computational efficiency, legal compliance, and economic sustainability in the utilization and monetization of LLMs. The paper delineates the scholarly discourse and developments that have informed the creation of Viz, focusing primarily on the advancements in LLMs, copyright issues in AI training (NYT case, 2023), and the evolution of model fine-tuning techniques, particularly low-rank adapters and quantized low-rank adapters, to create a sustainable and economically compliant framework for LLM utilization. The economic model it proposes benefits content creators, AI developers, and end-users, delineating a harmonious integration of technology, economy, and law, and offering a comprehensive solution to the complex challenges of today's AI landscape.
2023 was the year of generative AI. What can we expect in 2024?
In 2023, artificial intelligence (AI) truly entered our daily lives. The latest data shows four in five teenagers in the United Kingdom are using generative AI tools. About two-thirds of Australian employees report using generative AI for work. At first, many people used these tools because they were curious about generative AI or wanted to be entertained. Now, people ask generative AI for help with studies, for advice, or use it to find or synthesise information. Other uses include getting help coding and making images, videos, or audio.
Microsoft's Copilot AI chatbot app arrives on iOS
A few days ago, Microsoft released a standalone Android app for Microsoft Copilot, giving you a quick way to access the AI assistant. Turns out the iOS and iPad versions weren't far behind, because they're now available from Apple's App Store. Just like in Copilot on desktop and other AI chatbots, such as ChatGPT, you can type in your question and wait for responses generated by artificial intelligence. In Copilot's case, you'll get responses spun by GPT-4, OpenAI's latest large language model. The free version of ChatGPT, in comparison, is powered by the older GPT-3.5, and you'll need to pay for ChatGPT Plus to get access to the newer model.
Learning from a Generative AI Predecessor -- The Many Motivations for Interacting with Conversational Agents
Brinkman, Donald, Grudin, Jonathan
For generative AI to succeed, how engaging a conversationalist must it be? For almost sixty years, some conversational agents have responded to any question or comment to keep a conversation going. In recent years, several utilized machine learning or sophisticated language processing, such as Tay, Xiaoice, Zo, Hugging Face, Kuki, and Replika. Unlike generative AI, they focused on engagement, not expertise. Millions of people were motivated to engage with them. What were the attractions? Will generative AI do better if it is equally engaging, or should it be less engaging? Prior to the emergence of generative AI, we conducted a large-scale quantitative and qualitative analysis to learn what motivated millions of people to engage with one such 'virtual companion,' Microsoft's Zo. We examined the complete chat logs of 2000 anonymized people. We identified over a dozen motivations that people had for interacting with this software. Designers learned different ways to increase engagement. Generative conversational AI does not yet have a clear revenue model to address its high cost. It might benefit from being more engaging, even as it supports productivity and creativity. Our study and analysis point to opportunities and challenges.