AITopics | Generative AI

Generative AI

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Chen, Phil, Itkina, Masha, Senanayake, Ransalu, Kochenderfer, Mykel J.

arXiv.org Machine Learning | Oct-27-2021

Many applications of generative models rely on the marginalization of their high-dimensional output probability distributions. Normalization functions that yield sparse probability distributions can make exact marginalization more computationally tractable. However, sparse normalization functions usually require alternative loss functions for training since the log-likelihood is undefined for sparse probability distributions. Furthermore, many sparse normalization functions often collapse the multimodality of distributions. In this work, we present $\textit{ev-softmax}$, a sparse normalization function that preserves the multimodality of probability distributions. We derive its properties, including its gradient in closed-form, and introduce a continuous family of approximations to $\textit{ev-softmax}$ that have full support and can be trained with probabilistic loss functions such as negative log-likelihood and Kullback-Leibler divergence. We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures. Our method outperforms existing dense and sparse normalization techniques in distributional accuracy. We demonstrate that $\textit{ev-softmax}$ successfully reduces the dimensionality of probability distributions while maintaining multimodality.
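The abstract's two claims about sparse normalizers (the log-likelihood becomes undefined, and modes can be lost) can be illustrated with a small sketch. This does not implement ev-softmax, whose definition the abstract does not give. It uses sparsemax, a well-known sparse normalizer, as a stand-in. The function names and the example logits are illustrative assumptions.

```python
import math

def softmax(z):
    """Dense normalizer: every entry gets strictly positive probability."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex.
    Stand-in sparse normalizer; this is NOT the paper's ev-softmax."""
    cumsum, tau = 0.0, 0.0
    for j, v in enumerate(sorted(z, reverse=True), start=1):
        cumsum += v
        if 1.0 + j * v > cumsum:       # entry j is still in the support
            tau = (cumsum - 1.0) / j   # threshold for the current support size
    return [max(v - tau, 0.0) for v in z]

def nll(p, target):
    """Negative log-likelihood of one target index; infinite if p[target] is 0."""
    return math.inf if p[target] == 0.0 else -math.log(p[target])

logits = [2.0, 1.0, -1.0]        # made-up logits with a secondary mode at index 1
dense = softmax(logits)
sparse = sparsemax(logits)       # the secondary mode is pruned to exactly 0
tied = sparsemax([1.0, 1.0, -1.0])  # two equal modes survive, the tail is pruned

print("softmax  :", dense, "NLL(target=1) =", nll(dense, 1))
print("sparsemax:", sparse, "NLL(target=1) =", nll(sparse, 1))
print("sparsemax on tied logits:", tied)
```

With these logits the sparse output assigns zero mass to index 1, so a negative log-likelihood loss on that target is infinite, while the dense softmax loss stays finite. That is the training difficulty the abstract describes, shown here for the stand-in normalizer only.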

normalization function, oftmax, softmax, (14 more...)

arXiv.org Machine Learning

2110.14182

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

#NotJustACadburyAd with Rephrase.ai - Create personalized AI videos for local stores across India

#artificialintelligence | Oct-25-2021, 14:05:52 GMT

All local retailers in India can create a video of Shahrukh Khan (one of the largest celebrities in the world) endorsing their local stores! An initiative by Cadbury, conceptualized by Ogilvy, powered by Rephrase.ai's

local store, notjustacadburyad, rephrase, (2 more...)

#artificialintelligence

Country: Asia > India (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.63)

GPT-3 Scared You? Meet Wu Dao 2.0: A Monster of 1.75 Trillion Parameters

#artificialintelligence | Oct-24-2021, 15:35:07 GMT

Jack Clark, OpenAI's policy director, calls this trend of copying GPT-3 "model diffusion." Yet, among all the copies, Wu Dao 2.0 holds the record as the largest, with a striking 1.75 trillion parameters (10x GPT-3). Coco Feng reported for South China Morning Post that Wu Dao 2.0 was trained on 4.9TB of high-quality text and image data, which makes GPT-3's training dataset (570GB) pale in comparison. Yet it's worth noting that OpenAI researchers curated 45TB of data to extract those 570GB of clean data. Wu Dao 2.0 can learn from text and images and tackle tasks that include both types of data (something GPT-3 can't do).

meet wu dao 2, trillion parameter, wu dao 2, (6 more...)

#artificialintelligence

Country: Asia > China (0.27)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Gartner identifies the top strategic technology trends for 2022

#artificialintelligence | Oct-24-2021, 12:15:36 GMT

Generative AI, distributed enterprise and cloud-native platforms are amongst the top strategic technology trends for 2022, Gartner has predicted. David Groombridge, research vice president at Gartner, says with CEOs and boards striving to find growth through direct digital connections with customers, the priorities of a CIO must reflect the same business imperatives, which run through each of Gartner's top strategic tech trends for 2022. "CIOs must find the IT force multipliers to enable growth and innovation, and create scalable, resilient technical foundations whose scalability will free cash for digital investments," Groombridge says. "These imperatives form the three themes of this year's trends: engineering trust, sculpting change and accelerating growth." Gartner says one of the most visible and powerful AI techniques coming to market is generative AI – machine learning methods that learn about content or objects from their data, and use it to generate brand-new, completely original, realistic artefacts.

gartner, groombridge, top strategic technology trend, (13 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)

Enterprise AI startup SambaNova releases a large language model tool

#artificialintelligence | Oct-23-2021, 18:55:09 GMT

SambaNova Systems, a Palo Alto–based AI startup, announced a new language service model with a familiar description: GPT, which stands for Generative Pre-trained Transformer, and has no links to OpenAI's GPT series of language models. SambaNova markets their GPT as an everyman's alternative to OpenAI's GPT-3, writing in the press release that it will allow companies "to be up and running with a customized language model in as fast as one month as opposed to nine months or a year." What it's all about: You may have heard of SaaS (Software as a Service) or IaaS (Infrastructure as a Service), but SambaNova Systems offers DaaS: Dataflow-as-a-Service. The aim is to sell a suite of AI tools, including natural language processing ones, for startups to adopt quickly and seamlessly. GPT is the latest tool in its box: a model that can both produce and process natural language. It's built for enterprise use cases, according to the company.

enterprise ai startup sambanova release, language model tool, use case, (2 more...)

#artificialintelligence

Country: North America > United States > California > Santa Clara County > Palo Alto (0.28)

Industry: Information Technology > Software (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)

Exclusive: OpenAI summarizes KDnuggets - KDnuggets

#artificialintelligence | Oct-23-2021, 15:24:24 GMT

OpenAI has recently published an important work focused on the alignment problem: the problem of ensuring that general-purpose AI and machine learning systems align with human intentions. The "Paperclip Maximizer" is a famous example of alignment gone wrong. To test scalable alignment methods, OpenAI trained a model to summarize entire books, as described in their blog on KDnuggets: Scaling human oversight of AI systems for difficult tasks – OpenAI approach. OpenAI's model works by first summarizing small sections of a book, then summarizing those summaries into a higher-level summary, and so on. The results were pretty amazing, so we asked OpenAI to summarize two top KDnuggets blogs from last year, and here are the summaries.
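The summarize-then-summarize-the-summaries procedure described above can be sketched as a simple loop. This is only a structural illustration: `toy_summarize` is a hypothetical stand-in that keeps the first few words of a chunk, where the real system uses a trained language model, and the chunk sizes are arbitrary assumptions.

```python
def chunk(words, size):
    """Split a list of words into consecutive pieces of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]

def toy_summarize(words, keep):
    """Hypothetical stand-in for a learned summarizer: keep the first `keep` words."""
    return words[:keep]

def recursive_summary(text, chunk_size=8, keep=4):
    """Summarize small sections, then summarize the concatenated summaries,
    repeating until the text fits in a single chunk."""
    assert 0 < keep < chunk_size, "each pass must shrink the text"
    words = text.split()
    while len(words) > chunk_size:
        pieces = chunk(words, chunk_size)
        # One level of the hierarchy: summarize every piece, then join the results.
        words = [w for piece in pieces for w in toy_summarize(piece, keep)]
    return " ".join(toy_summarize(words, keep))

book = " ".join(f"w{i}" for i in range(32))  # made-up 32-word "book"
print(recursive_summary(book))
```

Because `keep` is smaller than `chunk_size`, every pass strictly shortens the text, so the loop terminates after a number of levels that grows roughly logarithmically with the length of the input.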

learning, machine learning, scientist, (12 more...)

#artificialintelligence

Genre: Instructional Material (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

What is GPT-3 and Why Does it Matter?

#artificialintelligence | Oct-20-2021, 02:10:19 GMT

The recent hype surrounding Generative Pre-trained Transformer 3 (GPT-3), the new artificial intelligence (AI) based natural language processing (NLP) model, is worth observing, particularly from the enterprise front. Whether you study it closely or only take a casual look, this latest language model, which generates human-like written content, is worth your time and effort. It can also show you that the hype is real. Like every technological innovation, GPT-3 has its shortcomings, yet it is a great leap for AI. In May 2020, OpenAI, an AI research lab co-founded by Elon Musk, launched the latest version of an AI-based Natural Language Processing system named GPT-3 that can mimic human language.

gpt-3, language model, openai, (10 more...)

#artificialintelligence

Country: North America > Canada > Ontario > Toronto (0.15)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.31)

Deep Generative Models in Engineering Design: A Review

Regenwetter, Lyle, Nobari, Amin Heyrani, Ahmed, Faez

arXiv.org Machine Learning | Oct-20-2021

Automated design synthesis has the potential to revolutionize the modern human design process and improve access to highly optimized and customized products across countless industries. Successfully adapting generative Machine Learning to design engineering may be the key to such automated design synthesis and is a research subject of great importance. We present a review and analysis of Deep Generative Learning models in engineering design. Deep Generative Models (DGMs) typically leverage deep networks to learn from an input dataset and learn to synthesize new designs. Recently, DGMs such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), feedforward Neural Networks (NNs) and certain Deep Reinforcement Learning (DRL) frameworks have shown promising results in design applications like structural optimization, materials design, and shape synthesis. The prevalence of DGMs in Engineering Design has skyrocketed since 2016. Anticipating continued growth, we conduct a review of recent advances with the hope of benefitting researchers interested in DGMs for design. We structure our review as an exposition of the algorithms, datasets, representation methods, and applications commonly used in the current literature. In particular, we discuss key works that have introduced new techniques and methods in DGMs, successfully applied DGMs to a design-related domain, or directly supported development of DGMs through datasets or auxiliary methods. We further identify key challenges and limitations currently seen in DGMs across design fields, such as design creativity, handling complex constraints and objectives, and modeling both form and functional performance simultaneously. In our discussion we identify possible solution pathways as key areas on which to target future work.

dataset, engineering conference, optimization, (17 more...)

arXiv.org Machine Learning

2110.10863

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Austria (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.61)

Microsoft and Nvidia build largest ever AI to mimic human language

New Scientist | Oct-18-2021, 10:10:00 GMT

Microsoft and chip manufacturer Nvidia have created a vast artificial intelligence that can mimic human language more convincingly than ever before. But the cost and time involved in creating the neural network have called into question whether such AIs can continue to scale up. The new neural network, known as the Megatron-Turing Natural Language Generation (MT-NLG) model, has 530 billion parameters, more than tripling the scale of OpenAI's groundbreaking GPT-3 neural network, which was considered the state of the art up until now.

microsoft and nvidia, mimic human language, neural network

New Scientist

Industry: Information Technology > Hardware (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Prato, Gabriele, Guiroy, Simon, Caballero, Ethan, Rish, Irina, Chandar, Sarath

arXiv.org Artificial Intelligence | Oct-18-2021

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in the light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-e. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source, training, data distribution, in a sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization. Over the past decade, deep learning has made tremendous progress in multiple fields, especially in vision (Alam et al., 2020) and natural language processing (Torfi et al., 2020). However, several important issues remain unsolved, including the ability to generalize well to novel, out-of-distribution data (Arjovsky, 2021). 
A particularly challenging situation involves simultaneous changes at test time in both the input and the task (class) distributions, p(x) and p(y|x). For example, a self-driving car seeing an elephant for the first time should be able to recognize it as a "new object", and on seeing another elephant afterwards, it should be able to recognize it as the same "new object". Obviously, any deployment of deep networks in the real world will likely require them to deal with new situations not encountered during training.
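The abstract's first observation, that performance is well approximated by a power law and therefore looks linear on a log-log plot, can be checked with an ordinary least-squares fit in log space. The sketch below assumes the form error = a * n^b and uses synthetic numbers generated from that form. They are not results from the paper, and `fit_power_law` is an illustrative name.

```python
import math

def fit_power_law(ns, errs):
    """Fit errs ~ a * ns**b by linear least squares on (log n, log err).
    Returns (a, b); b is the slope of the line on a log-log plot."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope

# Synthetic data only: error shrinking as 2 * n^(-0.5) with training set size n.
train_sizes = [1000, 2000, 4000, 8000, 16000]
errors = [2.0 * n ** -0.5 for n in train_sizes]

a, b = fit_power_law(train_sizes, errors)
print(f"fitted a = {a:.4f}, exponent b = {b:.4f}")
```

On real measurements the points would scatter around the fitted line, and the exponent b is the quantity one would compare across settings, for example few-shot performance on new classes versus standard classification on seen classes.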

arxiv e-print, dataset, training data, (10 more...)

arXiv.org Artificial Intelligence

2110.0699

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Transportation > Ground (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
