Ars is excited to be hosting this online debut of Sunspring, a short science fiction film that's not entirely what it seems. You know it's the future because H (played with neurotic gravity by Silicon Valley's Thomas Middleditch) is wearing a shiny gold jacket, H2 (Elisabeth Gray) is playing with computers, and C (Humphrey Ker) announces that he has to "go to the skull" before sticking his face into a bunch of green lights. It sounds like your typical sci-fi B-movie, complete with an incoherent plot. Except Sunspring isn't the product of Hollywood hacks--it was written entirely by an AI. To be specific, it was authored by a recurrent neural network called long short-term memory, or LSTM for short.
Larger language models have higher accuracy on average, but are they better on every single instance (datapoint)? Some work suggests larger models have higher out-of-distribution robustness, while other work suggests they have lower accuracy on rare subgroups. To understand these differences, we investigate these models at the level of individual instances. However, one major challenge is that individual predictions are highly sensitive to randomness in training. We develop statistically rigorous methods to address this, and after accounting for pretraining and finetuning noise, we find that BERT-Large is worse than BERT-Mini on at least 1-4% of instances across MNLI, SST-2, and QQP, compared to the overall accuracy improvement of 2-10%. We also find that finetuning noise increases with model size and that instance-level accuracy has momentum: improvement from BERT-Mini to BERT-Medium correlates with improvement from BERT-Medium to BERT-Large. Our findings suggest that instance-level predictions provide a rich source of information; we therefore recommend that researchers supplement model weights with model predictions.
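The kind of instance-level comparison described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: it assumes you have predictions from several independently finetuned runs of each model, and simply measures the fraction of instances where the larger model's per-instance accuracy (averaged over seeds, to smooth out finetuning noise) falls below the smaller model's.

```python
import numpy as np

def instance_level_comparison(preds_small, preds_large, labels):
    """Fraction of instances where the larger model is worse.

    preds_small, preds_large: int arrays of shape (n_seeds, n_instances)
    with predicted labels from independently finetuned runs.
    labels: int array of shape (n_instances,) with gold labels.
    """
    # Per-instance accuracy, averaged over finetuning seeds.
    acc_small = (preds_small == labels).mean(axis=0)
    acc_large = (preds_large == labels).mean(axis=0)
    # Instances where the larger model does strictly worse.
    return float((acc_large < acc_small).mean())
```

Averaging over seeds is the key move: a single run's instance-level wins and losses are dominated by training randomness, so any claim that "the large model is worse here" needs to hold across runs.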
Gunrock 2.0 is built on top of Gunrock with an emphasis on user adaptation. Gunrock 2.0 combines various neural natural language understanding modules, including named entity detection, linking, and dialog act prediction, to improve user understanding. Its dialog management is a hierarchical model that handles various topics, such as movies, music, and sports. The system-level dialog manager can handle question detection, acknowledgment, error handling, and additional functions, making downstream modules much easier to design and implement. The dialog manager also adapts its topic selection to accommodate different users' profile information, such as inferred gender and personality. The generation model is a mix of templates and neural generation models. Gunrock 2.0 achieved an average rating of 3.73 with its latest build, from May 29th to June 4th.
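The hierarchical split described above — a system-level manager handling generic functions before routing to topic modules — can be sketched roughly like this. All module names and responses here are illustrative placeholders, not Gunrock's actual API.

```python
# Topic-level modules: each owns the conversation for one topic.
def movies_module(utterance):
    return "Let's talk movies! What have you watched recently?"

def music_module(utterance):
    return "I love music. Who's your favorite artist?"

TOPIC_MODULES = {"movie": movies_module, "music": music_module}

def system_level_manager(utterance):
    """System-level manager: handles generic functions first (here, a
    crude question check standing in for question detection), then
    dispatches to a topic module by keyword."""
    text = utterance.lower().strip()
    if text.endswith("?"):
        return "Good question! Let me think about that."
    for keyword, module in TOPIC_MODULES.items():
        if keyword in text:
            return module(utterance)
    return "Tell me more about that."
```

The benefit of this layering is exactly what the abstract claims: topic modules never need their own question detection or error handling, so they stay small.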
Every week, we talk about important data and analytics topics with data science leaders from around the world on Facebook Live. You can subscribe to the DataTalk podcast on iTunes, Google Play, Stitcher, SoundCloud and Spotify. This data science video and podcast series is part of Experian's effort to help people understand how data-powered decisions can help organizations develop innovative solutions and drive more business. To keep up with upcoming events, join our Data Science Community on Facebook or check out the archive of recent data science videos. To suggest future data science topics or guests, please contact Mike Delgado. In this week's #DataTalk, we talked with Dr. Anima Anandkumar, Principal Scientist at Amazon AI and Bren Professor at Caltech, about what data scientists need to know about deep learning and how to scale deep learning frameworks. Today we're excited to talk about deep learning with Dr. Anima Anandkumar. Anima serves as the principal scientist at Amazon Web Services. Anima earned her BTech in electrical engineering from the Indian Institute of Technology. She also earned her Ph.D. in electrical engineering from Cornell University, and then after that she served as a postdoctoral researcher at MIT. She's the recipient of dozens of awards.
This particular article needs more visibility on Medium, especially among Machine Learning practitioners. Wael transcribes an interview with Michael Kanaan, an individual who held an AI leadership role in the US Air Force and is currently working with MIT's premier AI lab. Early on in the interview, Michael quickly discards the stereotypical portrayal of AI within Hollywood movies and provides the reader with a more accurate description of AI. Michael accurately points out that what we call AI are simply machine learning algorithms that can derive patterns from data, which in turn create a predictive model of a subject of interest, such as a person's behaviour, stock prices, etc. The interview goes on to include discussions around the type of individuals that are suitable for roles in AI, a conversation in which Michael debunks the myth that AI-based positions are reserved for individuals with a STEM (science, technology, engineering, and mathematics) background.
Building an open-domain socialbot that talks to real people is challenging: such a system must meet multiple user expectations, such as broad world knowledge, conversational style, and emotional connection. Our socialbot engages users on their terms, prioritizing their interests, feelings, and autonomy. As a result, our socialbot provides a responsive, personalized user experience, capable of talking knowledgeably about a wide variety of topics, as well as chatting empathetically about ordinary life. Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone. At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0.
"What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.
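The priming trick described above — ending the prompt with the first words of the desired output — amounts to simple string construction. A minimal sketch, with a hypothetical helper and entirely illustrative prompt text:

```python
def build_primed_prompt(task_description, examples, target_prefix):
    """Build a prompt that ends with the opening words of the desired
    output, constraining the model's mode of completion. The model is
    left to continue from target_prefix."""
    parts = [task_description]
    parts.extend(examples)
    parts.append(target_prefix)
    return "\n\n".join(parts)

prompt = build_primed_prompt(
    "My second grader asked me what this passage means:",
    ['"Photosynthesis converts light into chemical energy."'],
    # The primed first words of the target output:
    'I rephrased it for him, in plain language a second grader can understand: "',
)
```

Because the prompt ends mid-quotation, a completion that continues in any mode other than the plain-language rephrasing is much less likely.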
To build Sounding Board, we develop a system architecture that is capable of accommodating dialog strategies that we designed for socialbot conversations. The architecture consists of a multi-dimensional language understanding module for analyzing user utterances, a hierarchical dialog management framework for dialog context tracking and complex dialog control, and a language generation process that realizes the response plan and makes adjustments for speech synthesis. Additionally, we construct a new knowledge base to power the socialbot by collecting social chat content from a variety of sources. An important contribution of the system is the synergy between the knowledge base and the dialog management, i.e., the use of a graph structure to organize the knowledge base that makes dialog control very efficient in bringing related content to the discussion. Using the data collected from Sounding Board during the competition, we carry out in-depth analyses of socialbot conversations and user ratings which provide valuable insights in evaluation methods for socialbots. We additionally investigate a new approach for system evaluation and diagnosis that allows scoring individual dialog segments in the conversation. Finally, observing that socialbots suffer from the issue of shallow conversations about topics associated with unstructured data, we study the problem of enabling extended socialbot conversations grounded on a document. To bring together machine reading and dialog control techniques, a graph-based document representation is proposed, together with methods for automatically constructing the graph. Using the graph-based representation, dialog control can be carried out by retrieving nodes or moving along edges in the graph. To illustrate the usage, a mixed-initiative dialog strategy is designed for socialbot conversations on news articles.
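The graph-based dialog control described above — retrieving nodes and moving along edges to bring related content into the discussion — can be sketched as follows. The graph contents and function names here are illustrative, not Sounding Board's actual data structures.

```python
# Knowledge organized as a graph: each node holds chat content, and
# edges link to related topics the dialog manager can move to next.
knowledge_graph = {
    "mars_rover": {
        "content": "NASA's rover found new evidence of ancient water on Mars.",
        "edges": ["space_travel"],
    },
    "space_travel": {
        "content": "Several companies are planning crewed missions to Mars.",
        "edges": ["mars_rover"],
    },
}

def retrieve_node(graph, node_id):
    """Dialog control by retrieval: fetch a node's content to discuss."""
    return graph[node_id]["content"]

def next_topic(graph, node_id):
    """Dialog control by traversal: move along an edge to related content."""
    edges = graph[node_id]["edges"]
    return edges[0] if edges else None
```

The appeal of this representation is that "bring related content to the discussion" reduces to an edge lookup, which keeps dialog control cheap at conversation time.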
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize perplexity of the next token. We also propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of a human-like multi-turn conversation. Our experiments show strong correlation between perplexity and SSA. The fact that the end-to-end trained Meena with the best (lowest) perplexity scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher in absolute SSA than the existing chatbots we evaluated.
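As described, SSA averages two per-response human judgments. A minimal sketch of the metric as the abstract defines it (not Google's evaluation code), assuming binary 0/1 labels per response:

```python
def ssa(sensibleness_labels, specificity_labels):
    """Sensibleness and Specificity Average: the mean of the
    sensibleness rate and the specificity rate, each computed from
    binary per-response human judgments."""
    sensibleness = sum(sensibleness_labels) / len(sensibleness_labels)
    specificity = sum(specificity_labels) / len(specificity_labels)
    return (sensibleness + specificity) / 2
```

The two components are deliberately asymmetric in difficulty: a bot can be sensible by being vague ("I don't know"), so the specificity term penalizes exactly that escape hatch.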
The following is an excerpt of You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It's Making the World a Weirder Place by Janelle Shane. Listen to a radio interview with Janelle Shane about the mistakes artificial intelligence can make. Will the music, movies, and novels of the future be written by AI? Maybe at least partially. AI-generated art can be striking, weird, and unsettling: infinitely morphing tulips; glitchy humans with half-melted faces; skies full of hallucinated dogs. A T. rex may turn into flowers or fruit; the Mona Lisa may take on a goofy grin; a piano riff may turn into an electric guitar solo.