Generative AI
Researchers quantify bias in Reddit content sometimes used to train AI
In a paper published on the preprint server Arxiv.org, researchers quantify bias in the language of Reddit communities. This alone isn't surprising, but the problem is that data from these communities are often used to train large language models like OpenAI's GPT-3. That in turn is important because, as OpenAI itself notes, this sort of bias leads to placing words like "naughty" or "sucked" near female pronouns and "Islam" near words like "terrorism." The scientists' approach uses representations of words called embeddings to discover and categorize language biases, which could enable data scientists to trace the severity of bias in different communities and take steps to counteract this bias. To spotlight examples of potentially offensive content in Reddit subcommunities, the method takes a language model and two sets of words representing the concepts to compare, then identifies the words in a given community that are most biased toward each concept.
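As a rough illustration of how embeddings can surface biased words, here is a minimal sketch. It assumes a simple scoring rule (cosine similarity to the centroid of one concept set minus cosine similarity to the centroid of the other), which may differ from the paper's exact formulation. The two-dimensional vectors and the names `word_x`, `word_y`, and `word_z` are invented placeholders, not data from any Reddit community.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def most_biased_words(embeddings, concept_a, concept_b, top_k=1):
    """Rank vocabulary words by how much closer they sit to concept A than to concept B.

    Assumed scoring rule (not confirmed by the source):
    score(w) = cos(w, centroid(A)) - cos(w, centroid(B)).
    """
    center_a = centroid([embeddings[w] for w in concept_a])
    center_b = centroid([embeddings[w] for w in concept_b])
    excluded = set(concept_a) | set(concept_b)
    scores = {
        word: cosine(vec, center_a) - cosine(vec, center_b)
        for word, vec in embeddings.items()
        if word not in excluded
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    # First list leans toward concept A, second list leans toward concept B.
    return ranked[:top_k], ranked[::-1][:top_k]

# Hypothetical toy embeddings, for illustration only.
toy_embeddings = {
    "she": [1.0, 0.1], "her": [0.9, 0.2],
    "he": [0.1, 1.0], "his": [0.2, 0.9],
    "word_x": [0.8, 0.3], "word_y": [0.3, 0.8], "word_z": [0.5, 0.5],
}
toward_a, toward_b = most_biased_words(
    toy_embeddings, concept_a=["she", "her"], concept_b=["he", "his"]
)
print(toward_a, toward_b)
```

In practice the embeddings would be trained on a community's own text, so the same word can score differently from one community to the next.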
Here are a few ways GPT-3 can go wrong - TechCrunch
OpenAI's latest language generation model, GPT-3, has made quite the splash within AI circles, astounding reporters to the point where even Sam Altman, OpenAI's leader, mentioned on Twitter that it may be overhyped. Still, there is no doubt that GPT-3 is powerful. Those with early-stage access to OpenAI's GPT-3 API have shown how to translate natural language into code for websites, solve complex medical question-and-answer problems, create basic tabular financial reports, and even write code to train machine learning models -- all with just a few well-crafted examples as input (i.e., via "few-shot learning"). Soon, anyone will be able to purchase GPT-3's generative power to make use of the language model, opening doors to build tools that will quietly (but significantly) shape our world. Enterprises aiming to take advantage of GPT-3, and the increasingly powerful iterations that will surely follow, must take great care to ensure that they install extensive guardrails when using the model, because of the many ways that it can expose a company to legal and reputational risk.
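To make "few-shot learning" concrete, the sketch below assembles a prompt from a handful of worked examples followed by a new query. The `Input:`/`Output:` layout and the function name `build_few_shot_prompt` are assumptions chosen for illustration; the excerpt does not describe a specific prompt format, and no API call is made here.

```python
def build_few_shot_prompt(examples, query, input_label="Input", output_label="Output"):
    """Join worked (input, output) pairs and a final unanswered query into one prompt string."""
    blocks = [
        f"{input_label}: {source}\n{output_label}: {target}"
        for source, target in examples
    ]
    # The last block leaves the output empty so the model continues from there.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)

# Hypothetical examples: turning phrases into arithmetic expressions.
prompt = build_few_shot_prompt(
    examples=[("two plus two", "2 + 2"), ("three times five", "3 * 5")],
    query="seven minus four",
)
print(prompt)
```

The model only ever sees this text, so the quality and consistency of the examples largely determine what it produces.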
Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
Hafiz, Abdul Mueed, Bhat, Ghulam Mohiuddin
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes in which the agents have to learn and communicate. The learning is however specific to each agent and communication may be satisfactorily designed for the agents. As more complex Deep Q-Networks come to the fore, the overall complexity of the multi-agent system increases, leading to issues like difficulty in training, need for higher resources and more training time, difficulty in fine-tuning, etc. To address these issues we propose a simple but efficient DQN based MAS for RL which uses shared state and rewards, but agent-specific actions, for updating the experience replay pool of the DQNs, where each agent is a DQN. The benefits of the approach are overall simplicity, faster convergence and better performance as compared to conventional DQN based approaches. It should be noted that the method can be extended to any DQN. We therefore use simple DQN and DDQN (Double Q-learning) on three separate tasks, i.e. Cartpole-v1 (OpenAI Gym environment), LunarLander-v2 (OpenAI Gym environment) and Maze Traversal (customized environment). The proposed approach outperforms the baseline on each of these tasks by decent margins.
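The replay update described in the abstract (shared state and reward, agent-specific action) might look roughly like the sketch below. It covers only the bookkeeping: the class `AgentReplay` and the function `store_shared_transition` are hypothetical names, and how the agents' binary actions are combined into an environment action is not stated in the abstract, so it is left out.

```python
import random
from collections import deque

class AgentReplay:
    """Fixed-capacity experience replay pool owned by one agent (one DQN)."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        """Draw a uniform random minibatch, capped at the current pool size."""
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def store_shared_transition(replays, state, actions, reward, next_state, done):
    """Write one environment step into every agent's pool.

    State, reward, next state and done flag are shared by all agents;
    only the action differs from agent to agent.
    """
    if len(replays) != len(actions):
        raise ValueError("need exactly one action per agent")
    for replay, action in zip(replays, actions):
        replay.add(state, action, reward, next_state, done)

# Hypothetical step with two binary-action agents.
replays = [AgentReplay(), AgentReplay()]
store_shared_transition(
    replays,
    state=(0.0, 0.1),
    actions=[0, 1],
    reward=1.0,
    next_state=(0.1, 0.2),
    done=False,
)
print([replay.buffer[0] for replay in replays])
```

Each agent would then train its own Q-network on minibatches drawn from its own pool.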
Has OpenAI Surpassed DeepMind?
OpenAI's GPT-3 is the talk of the town, and the media is giving it all the attention. Many analysts are even comparing it to AGI because of its practical applicability. Initially disclosed in a research paper in May, GPT-3 is the successor to GPT-2 and is roughly 100x larger. It is far more competent than its forerunner owing to its parameter count: 175 billion for GPT-3 versus 1.5 billion for GPT-2. After the successful launch of GPT-3, other AI companies seem to have been overshadowed.
This AI Could Bring Us Computers That Can Write Their Own Software
When OpenAI first published a paper on their new language generation AI, GPT-3, the hype was slow to build. The paper indicated GPT-3, the biggest natural language AI model yet, was advanced, but it only had a few written examples of its output. Then OpenAI gave select access to a beta version of GPT-3 to see what developers would do with it, and minds were blown. Developers playing with GPT-3 have taken to Twitter with examples of its capabilities: short stories, press releases, articles about itself, a search engine. Perhaps most surprising was the discovery that GPT-3 can write simple computer code. When web developer Sharif Shameem modified it to spit out HTML instead of natural language, the program generated code for webpage layouts from prompts like "a button that looks like a watermelon."
Philosophers On GPT-3 (updated with replies by GPT-3) - Daily Nous
Nine philosophers explore the various issues and questions raised by the newly released language model, GPT-3, in this edition of Philosophers On, guest edited by Annette Zimmermann. Introduction Annette Zimmermann, guest editor GPT-3, a powerful, 175 billion parameter language model developed recently by OpenAI, has been galvanizing public debate and controversy. As the MIT Technology Review puts it: "OpenAI's new language generator GPT-3 is shockingly good -- and completely mindless". Parts of the technology community hope (and fear) that GPT-3 could bring us one step closer to the hypothetical future possibility of human-like, highly sophisticated artificial general intelligence (AGI). Meanwhile, others (including OpenAI's own CEO) have critiqued claims about GPT-3's ostensible proximity to AGI, arguing that they are vastly overstated. Why the hype? As it turns out, GPT-3 is unlike other natural language processing (NLP) systems, the latter of which often struggle with what comes comparatively easily to humans: performing entirely new language tasks based on a few simple instructions and examples. Instead, NLP systems usually have to be pre-trained on a large corpus of text, and then fine-tuned in order to successfully perform a specific task. GPT-3, by contrast, does not require fine-tuning of this kind: it seems to be able to perform a whole range of tasks reasonably well, from producing fiction, poetry, and press releases to functioning code, and from music, jokes, and technical manuals, to "news articles which human evaluators have difficulty distinguishing from articles written by humans". The Philosophers On series contains group posts on issues of current interest, with the aim being to show what the careful thinking characteristic of philosophers (and occasionally scholars in related fields) can bring to popular ongoing conversations.
Contributors present not fully worked out position papers but rather brief thoughts that can serve as prompts for further reflection and discussion. The contributors to this installment of "Philosophers On" are Amanda Askell (Research Scientist, OpenAI), David Chalmers (Professor of Philosophy, New York University), Justin Khoo (Associate Professor of Philosophy, Massachusetts Institute of Technology), Carlos Montemayor (Professor of Philosophy, San Francisco State University), C. Thi Nguyen (Associate Professor of Philosophy, University of Utah), Regina Rini (Canada Research Chair in Philosophy of Moral and Social Cognition, York University), Henry Shevlin (Research Associate, Leverhulme Centre for..
OpenAI's latest breakthrough is astonishingly powerful, but still fighting its flaws
The most exciting new arrival in the world of AI looks, on the surface, disarmingly simple. It's not some subtle game-playing program that can outthink humanity's finest or a mechanically advanced robot that backflips like an Olympian. You start typing and it predicts what comes next. But while this sounds simple, it's an invention that could end up defining the decade to come. The program itself is called GPT-3 and it's the work of San Francisco-based AI lab OpenAI, an outfit that was founded with the ambitious (some say delusional) goal of steering the development of artificial general intelligence or AGI: computer programs that possess all the depth, variety, and flexibility of the human mind. For some observers, GPT-3 -- while very definitely not AGI -- could well be the first step toward creating this sort of intelligence.
OpenAI's new GPT-3 language explained in under 3 minutes
So, you've seen some amazing GPT-3 demos on Twitter (if not, where have you been?). This mega machine learning model, created by OpenAI, can write its own op-eds, poems, articles, and even working code: With GPT-3, I built a layout generator where you just describe any layout you want, and it generates the JSX code for you. GPT3()... the spreadsheet function to rule them all. Impressed with how well it pattern matches from a few examples. The same function looked up state populations, people's Twitter usernames and employers, and did some math.
The (Un)ethical Story of GPT-3: OpenAI's Million Dollar Model
Back on October 12, 2019, the world witnessed a previously unimaginable accomplishment: the first sub-two-hour marathon was run in an incredible time of 1:59:40 by Kenyan native Eliud Kipchoge. He would later say of the achievement that he "expected more people all over the world to run under 2 hours after today" [1]. While Kipchoge set new records in long distance running, across the world a team of natural language processing (NLP) experts at OpenAI, the Elon Musk-backed AI firm, published a new transformer-based language model with 1.5 billion parameters that achieved previously unthinkable performance in nearly every language task it faced [2]. The main takeaway from the paper for many experts was that bigger is better: the intelligence of transformer models can dramatically increase with the scale of parameters. In March of 2020, this theory gained support with OpenAI's release of version three of the model, or GPT-3, which encapsulates a staggering 175 billion parameters and achieved even more remarkable performance than version 2, despite sharing, quite literally, the same architecture [3].
Deep Generative Models that Solve PDEs: Distributed Computing for Training Large Data-Free Models
Recent progress in scientific machine learning (SciML) has opened up the possibility of training novel neural network architectures that solve complex partial differential equations (PDEs). Several (nearly data-free) approaches have been recently reported that successfully solve PDEs, with examples including deep feed-forward networks, generative networks, and deep encoder-decoder networks. However, practical adoption of these approaches is limited by the difficulty in training these models, especially to make predictions at large output resolutions (≥ 1024 × 1024). Here we report on a software framework for data-parallel distributed deep learning that resolves the twin challenges of training these large SciML models: training in reasonable time as well as distributing the storage requirements. Our framework provides several out-of-the-box features including (a) loss integrity independent of number of processes, (b) synchronized batch normalization, and (c) distributed higher-order optimization methods.
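One way to read "loss integrity independent of number of processes" is that the global loss should come out the same however the samples are split across workers. The sketch below simulates that in plain Python, under that assumed reading: it is not the framework's implementation, the function names are hypothetical, and the Python sums over local results stand in for an all-reduce. Summing per-process loss totals and sample counts before dividing gives the same mean for any process count, whereas averaging per-process means does not when shards are uneven.

```python
def shard(items, num_procs):
    """Split items into num_procs contiguous shards whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_procs)
    shards, start = [], 0
    for rank in range(num_procs):
        size = base + (1 if rank < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards

def global_mean_loss(per_sample_losses, num_procs):
    """Reduce (loss sum, sample count) across simulated processes, then divide once."""
    local = [(sum(part), len(part)) for part in shard(per_sample_losses, num_procs)]
    total = sum(loss_sum for loss_sum, _ in local)  # stands in for an all-reduce sum
    count = sum(n for _, n in local)                # stands in for an all-reduce sum
    return total / count

def naive_mean_of_means(per_sample_losses, num_procs):
    """Average the per-process means; this drifts when shard sizes differ."""
    parts = [part for part in shard(per_sample_losses, num_procs) if part]
    return sum(sum(part) / len(part) for part in parts) / len(parts)

losses = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up per-sample losses
for procs in (1, 2, 3):
    print(procs, global_mean_loss(losses, procs), naive_mean_of_means(losses, procs))
```

With five samples on two simulated processes, the shards hold three and two samples, which is enough to make the two reductions disagree.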