AITopics | Generative AI

Generative AI

The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)

Yang, Zhengyuan, Li, Linjie, Lin, Kevin, Wang, Jianfeng, Lin, Chung-Ching, Liu, Zicheng, Wang, Lijuan

arXiv.org Artificial Intelligence, Oct-11-2023

Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory skills, such as visual understanding, to achieve stronger generic intelligence. In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs. The analysis focuses on the intriguing tasks that GPT-4V can perform, and contains test samples to probe the quality and genericity of GPT-4V's capabilities, its supported inputs and working modes, and the effective ways to prompt the model. In our approach to exploring GPT-4V, we curate and organize a collection of carefully designed qualitative samples spanning a variety of domains and tasks. Observations from these samples demonstrate that GPT-4V's unprecedented ability to process arbitrarily interleaved multimodal inputs and the genericity of its capabilities together make GPT-4V a powerful multimodal generalist system. Furthermore, GPT-4V's unique capability of understanding visual markers drawn on input images can give rise to new human-computer interaction methods such as visual referring prompting. We conclude the report with in-depth discussions on the emerging application scenarios and the future research directions for GPT-4V-based systems. We hope that this preliminary exploration will inspire future research on the next-generation multimodal task formulation, new ways to exploit and enhance LMMs to solve real-world problems, and a better understanding of multimodal foundation models. Finally, we acknowledge that the model under our study is solely the product of OpenAI's innovative work, and they should be fully credited for its development. Please see the GPT-4V contributions paper for the authorship and credit attribution: https://cdn.openai.com/contributions/gpt-4v.pdf

gpt-4v(ision), preliminary exploration, (1 more...)

arXiv.org Artificial Intelligence

2309.17421

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

MatChat: A Large Language Model and Application Service Platform for Materials Science

Chen, Ziyi, Xie, Fankai, Wan, Meng, Yuan, Yang, Liu, Miao, Wang, Zongguo, Meng, Sheng, Wang, Yangang

arXiv.org Artificial Intelligence, Oct-11-2023

The prediction of chemical synthesis pathways plays a pivotal role in materials science research. Challenges, such as the complexity of synthesis pathways and the lack of comprehensive datasets, currently hinder our ability to predict these chemical processes accurately. However, recent advancements in generative artificial intelligence (GAI), including automated text generation and question-answering systems, coupled with fine-tuning techniques, have facilitated the deployment of large-scale AI models tailored to specific domains. In this study, we harness the power of the LLaMA2-7B model and enhance it through a learning process that incorporates 13,878 pieces of structured material knowledge data. This specialized AI model, named MatChat, focuses on predicting inorganic material synthesis pathways. MatChat exhibits remarkable proficiency in generating and reasoning with knowledge in materials science. Although MatChat requires further refinement to meet the diverse material design needs, this research undeniably highlights its impressive reasoning capabilities and innovative potential in the field of materials science. MatChat is now accessible online and open for use, with both the model and its application framework available as open source. This study establishes a robust foundation for collaborative innovation in the integration of generative AI in materials science.

language model, materials science, model and application service platform, (1 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/1674-1056/ad04cb

2310.07197

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.53)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)

State of the Art on Diffusion Models for Visual Computing

Po, Ryan, Yifan, Wang, Golyanik, Vladislav, Aberman, Kfir, Barron, Jonathan T., Bermano, Amit H., Chan, Eric Ryan, Dekel, Tali, Holynski, Aleksander, Kanazawa, Angjoo, Liu, C. Karen, Liu, Lingjie, Mildenhall, Ben, Nießner, Matthias, Ommer, Björn, Theobalt, Christian, Wonka, Peter, Wetzstein, Gordon

arXiv.org Artificial Intelligence, Oct-11-2023

The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. In these domains, diffusion models are the generative AI architecture of choice. Within the last year alone, the literature on diffusion-based tools and applications has seen exponential growth and relevant papers are published across the computer graphics, computer vision, and AI communities with new works appearing daily on arXiv. This rapid growth of the field makes it difficult to keep up with all recent developments. The goal of this state-of-the-art report (STAR) is to introduce the basic mathematical concepts of diffusion models, implementation details and design choices of the popular Stable Diffusion model, as well as overview important aspects of these generative AI tools, including personalization, conditioning, inversion, among others. Moreover, we give a comprehensive overview of the rapidly growing literature on diffusion-based generation and editing, categorized by the type of generated medium, including 2D images, videos, 3D objects, locomotion, and 4D scenes. Finally, we discuss available datasets, metrics, open challenges, and social implications. This STAR provides an intuitive starting point to explore this exciting topic for researchers, artists, and practitioners alike.

diffusion model, visual computing

arXiv.org Artificial Intelligence

2310.07204

Genre:

Overview (0.53)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.73)

Adobe brings more generative AI features to Express

Engadget, Oct-10-2023, 16:00:18 GMT

Few tech companies have embraced generative AI as wholeheartedly as Adobe. At Adobe Max, its annual creativity conference, it unveiled a new version of the Firefly GAI model. Not only that, the company announced more GAI features for Adobe Express, just weeks after making Firefly more broadly available in the app. Adobe Express now includes features such as Generative Fill. This enables users to add, remove or replace items, people and other aspects of images using text prompts.

adobe bring, generative ai feature, template

Engadget

Industry: Information Technology (0.58)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.63)

Generative AI deployment: Strategies for smooth scaling

MIT Technology Review, Oct-10-2023, 14:12:19 GMT

One-quarter of respondents expect generative AI's primary effect to be a reduction in their workforce. The figure was higher in industrial sectors like energy and utilities (43%), manufacturing (34%), and transport and logistics (31%). It was lowest in IT and telecommunications (7%). Overall, this is a modest figure compared to the more dystopian job replacement scenarios in circulation. Demand for skills is increasing in technical fields that focus on operationalizing AI models and in organizational and management positions tackling thorny topics including ethics and risk. AI is democratizing technical skills across the workforce in ways that could lead to new job opportunities and increased employee satisfaction.

artificial intelligence, machine learning, natural language, (4 more...)

MIT Technology Review

Industry: Telecommunications (0.32)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.75)

VerifAI: Verified Generative AI

Tang, Nan, Yang, Chenyu, Fan, Ju, Cao, Lei, Luo, Yuyu, Halevy, Alon

arXiv.org Artificial Intelligence, Oct-10-2023

Generative AI has made significant strides, yet concerns about the accuracy and reliability of its outputs continue to grow. Such inaccuracies can have serious consequences such as inaccurate decision-making, the spread of false information, privacy violations, legal liabilities, and more. Although efforts to address these risks are underway, including explainable AI and responsible AI practices such as transparency, privacy protection, bias mitigation, and social and environmental responsibility, misinformation caused by generative AI will remain a significant challenge. We propose that verifying the outputs of generative AI from a data management perspective is an emerging issue for generative AI. This involves analyzing the underlying data from multi-modal data lakes, including text files, tables, and knowledge graphs, and assessing its quality and consistency. By doing so, we can establish a stronger foundation for evaluating the outputs of generative AI models. Such an approach can ensure the correctness of generative AI, promote transparency, and enable decision-making with greater confidence. Our vision is to promote the development of verifiable generative AI and contribute to a more trustworthy and responsible use of AI.

verifai, verified generative ai

arXiv.org Artificial Intelligence

2307.02796

Genre: Research Report (0.40)

Industry:

Law (0.53)
Information Technology > Security & Privacy (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Tertiary Lymphoid Structures Generation through Graph-based Diffusion

Madeira, Manuel, Thanou, Dorina, Frossard, Pascal

arXiv.org Artificial Intelligence, Oct-10-2023

Graph-based representation approaches have been proven to be successful in the analysis of biomedical data, due to their capability of capturing intricate dependencies between biological entities, such as the spatial organization of different cell types in a tumor tissue. However, to further enhance our understanding of the underlying governing biological mechanisms, it is important to accurately capture the actual distributions of such complex data. Graph-based deep generative models are specifically tailored to accomplish that. In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs. In particular, we show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content, a well-established biomarker for evaluating the cancer progression in oncology research. Additionally, we further illustrate the utility of the learned generative models for data augmentation in a TLS classification task. To the best of our knowledge, this is the first work that leverages the power of graph diffusion models in generating meaningful biological cell structures.

digress, generative model, graph, (14 more...)

arXiv.org Artificial Intelligence

2310.06661

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > Saint Martin (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

The Morning After: ChatGPT creator OpenAI might start making its own AI chips

Engadget, Oct-9-2023, 11:15:21 GMT

According to Reuters, OpenAI is exploring making its own artificial intelligence chips, and is even looking into an acquisition. OpenAI CEO Sam Altman previously blamed GPU shortages for users' concerns about the speed and reliability of the company's API, which prompted these moves. Using its own chips could also reduce OpenAI's costs. Based on analysis by Bernstein Research, each ChatGPT query costs the company around four cents. At the moment, NVIDIA controls the market for chips that power AI applications. The Microsoft supercomputer OpenAI used to develop its technology, for instance, uses 10,000 NVIDIA GPUs.

chatgpt creator openai, openai, own ai chip, (3 more...)

Engadget

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

G7 to draw up AI code of conduct this autumn: Kishida

The Japan Times, Oct-9-2023, 04:45:00 GMT

Prime Minister Fumio Kishida unveiled a plan on Monday to hold a video conference with Group of Seven leaders this autumn to formulate international guidelines and a code of conduct for developers of artificial intelligence (AI) tools. Kishida presented the plan in a speech at a special session of the U.N.-sponsored Internet Governance Forum in Kyoto. The guidelines and code of conduct are part of the Hiroshima AI Process, an initiative for international best practices regarding generative AI, according to the Japanese leader. Kishida also said that the Japanese government's new economic package, to be drawn up late this month, will include aid for the development of computational resources used to process the huge volumes of data needed for AI development and use, and of basic computational models, as well as measures to step up the introduction of AI in small businesses and the medical field. The Hiroshima AI Process, which was agreed on at the G7 summit held in Hiroshima in May, also calls for creating international guidelines by the end of the year that will cover generative AI users as well.

autumn, international guideline, kishida, (4 more...)

The Japan Times

Country:

Asia > Japan > Honshū > Chūgoku > Hiroshima Prefecture > Hiroshima (0.74)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.27)
North America > United States (0.07)
(5 more...)

Industry: Government > Regional Government > Asia Government > Japan Government (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

Subsurface Characterization using Ensemble-based Approaches with Deep Generative Models

Bao, Jichao, Yoon, Hongkyu, Lee, Jonghyun

arXiv.org Machine Learning, Oct-9-2023

Estimating spatially distributed properties such as hydraulic conductivity (K) from available sparse measurements is a great challenge in subsurface characterization. However, the use of inverse modeling is limited for ill-posed, high-dimensional applications due to computational costs and poor prediction accuracy with sparse datasets. In this paper, we combine Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP), a deep generative model that can accurately capture complex subsurface structure, and Ensemble Smoother with Multiple Data Assimilation (ES-MDA), an ensemble-based inversion method, for accurate and accelerated subsurface characterization. WGAN-GP is trained to generate high-dimensional K fields from a low-dimensional latent space and ES-MDA then updates the latent variables by assimilating available measurements. Several subsurface examples are used to evaluate the accuracy and efficiency of the proposed method and the main features of the unknown K fields are characterized accurately with reliable uncertainty quantification. Furthermore, the estimation performance is compared with a widely-used variational, i.e., optimization-based, inversion approach, and the proposed approach outperforms the variational inversion method, especially for the channelized and fractured field examples. We explain such superior performance by visualizing the objective function in the latent space: because of nonlinear and aggressive dimension reduction via generative modeling, the objective function surface becomes extremely complex while the ensemble approximation can smooth out the multi-modal surface during the minimization. This suggests that the ensemble-based approach works well over the variational approach when combined with deep generative models at the cost of forward model runs unless convergence-ensuring modifications are implemented in the variational inversion.
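
As a rough illustration of the latent-space update described in this abstract, the sketch below applies the standard ES-MDA update to an ensemble of latent vectors. It is not the authors' implementation: the linear `forward` map is a hypothetical stand-in for the trained WGAN-GP generator composed with a flow simulator, and the ensemble size, noise level, inflation coefficients, and all function names are assumptions chosen for the demonstration.

```python
import numpy as np


def es_mda(latents, forward, d_obs, obs_var, alphas, rng):
    """Update an ensemble of latent vectors with ES-MDA.

    latents : (n_ens, n_latent) prior ensemble
    forward : maps one latent vector to predicted data of shape (n_obs,)
    alphas  : inflation coefficients; their reciprocals should sum to 1
    """
    z = np.array(latents, dtype=float)
    n_ens = z.shape[0]
    c_d = obs_var * np.eye(d_obs.shape[0])  # observation-error covariance
    for alpha in alphas:
        d_pred = np.array([forward(zi) for zi in z])  # (n_ens, n_obs)
        dz = z - z.mean(axis=0)
        dd = d_pred - d_pred.mean(axis=0)
        c_zd = dz.T @ dd / (n_ens - 1)  # latent/data cross-covariance
        c_dd = dd.T @ dd / (n_ens - 1)  # data auto-covariance
        # Perturb the observations with noise inflated by alpha.
        noise = rng.standard_normal(d_pred.shape)
        d_pert = d_obs + np.sqrt(alpha * obs_var) * noise
        # Solve a linear system rather than forming an explicit inverse;
        # the system matrix is symmetric, so this yields the transposed gain.
        gain_t = np.linalg.solve(c_dd + alpha * c_d, c_zd.T)
        z = z + (d_pert - d_pred) @ gain_t
    return z


def run_demo(seed=0):
    """Toy problem: recover 4 latent variables from 8 noisy measurements."""
    rng = np.random.default_rng(seed)
    n_latent, n_obs, n_ens, obs_var = 4, 8, 200, 0.01
    # Hypothetical linear operator standing in for generator + simulator.
    a = rng.standard_normal((n_obs, n_latent))

    def forward(zi):
        return a @ zi

    z_true = rng.standard_normal(n_latent)
    d_obs = forward(z_true) + np.sqrt(obs_var) * rng.standard_normal(n_obs)
    prior = rng.standard_normal((n_ens, n_latent))
    # Four assimilation steps with alpha = 4 each, so sum(1 / alpha) = 1.
    post = es_mda(prior, forward, d_obs, obs_var, [4.0] * 4, rng)

    def misfit(ens):
        return float(np.linalg.norm(forward(ens.mean(axis=0)) - d_obs))

    return misfit(prior), misfit(post), post.shape


if __name__ == "__main__":
    prior_misfit, post_misfit, _ = run_demo()
    print(f"data misfit of ensemble mean: {prior_misfit:.3f} -> {post_misfit:.3f}")
```

In a setting like the one the abstract describes, `forward` would decode a latent vector into a K field and then run the flow model, which is where the cost in forward model runs arises: each assimilation step needs one run per ensemble member.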

artificial intelligence, machine learning, manuscript, (17 more...)

arXiv.org Machine Learning

2310.00839

Country:

North America > United States > Hawaii (0.14)
North America > United States > Mississippi (0.14)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)
