Evaluating the Evaluation of Diversity in Commonsense Generation
Zhang, Tianhui, Peng, Bei, Bollegala, Danushka
In commonsense generation, given a set of input concepts, a model must generate a response that is not only commonsense-bearing but also captures multiple diverse viewpoints. Numerous evaluation metrics based on form- and content-level overlap have been proposed in prior work for evaluating the diversity of a commonsense generation model. However, it remains unclear which metrics are best suited for evaluating diversity in commonsense generation. To address this gap, we conduct a systematic meta-evaluation of diversity metrics for commonsense generation. We find that form-based diversity metrics tend to consistently overestimate the diversity in sentence sets, where even randomly generated sentences are assigned overly high diversity scores. We then use a Large Language Model (LLM) to create a novel dataset annotated for the diversity of sentences generated for a commonsense generation task, and use it to conduct a meta-evaluation of the existing diversity evaluation metrics. Our experimental results show that content-based diversity evaluation metrics consistently outperform their form-based counterparts, showing high correlations with the LLM-based ratings. We recommend that future work on commonsense generation use content-based metrics for evaluating the diversity of their outputs.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
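The distinction the abstract draws between form- and content-based diversity metrics can be sketched in a few lines. The functions below are illustrative stand-ins, not the paper's exact metrics: distinct-n is a standard form-based measure, and mean pairwise (1 - Jaccard) over word sets is a simplified proxy for content-based metrics, which in practice typically use sentence embeddings.

```python
from itertools import combinations

def distinct_n(sentences, n=2):
    """Form-based diversity: fraction of unique n-grams across a sentence set."""
    ngrams = []
    for s in sentences:
        toks = s.lower().split()
        ngrams += [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

def pairwise_overlap_diversity(sentences):
    """Content-oriented proxy: mean pairwise (1 - Jaccard) over word sets."""
    pairs = list(combinations(sentences, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        sa, sb = set(a.lower().split()), set(b.lower().split())
        total += 1 - len(sa & sb) / len(sa | sb)
    return total / len(pairs)
```

A set of identical sentences scores 0 on the pairwise measure but can still receive a nonzero distinct-n score, which hints at why form-based metrics can overestimate diversity.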
VER: Unifying Verbalizing Entities and Relations
Huang, Jie, Chang, Kevin Chen-Chuan
Entities and relationships between entities are vital in the real world. Essentially, we understand the world by understanding entities and relations. For instance, to understand a field, e.g., computer science, we need to understand the relevant concepts, e.g., machine learning, and the relationships between concepts, e.g., machine learning and artificial intelligence. To understand a person, we should first know who he/she is and how he/she is related to others. To understand entities and relations, humans may refer to natural language descriptions. For instance, when learning a new scientific term, people usually start by reading its definition in dictionaries or encyclopedias. To know the relationship between two entities, humans tend to create a sentence to connect them. In this paper, we propose VER: a unified model for Verbalizing Entities and Relations. Specifically, we attempt to build a system that takes any entity or entity set as input and generates a sentence to represent entities and relations. Extensive experiments demonstrate that our model can generate high-quality sentences describing entities and entity relationships and facilitate various tasks on entities and relations, including definition modeling, relation modeling, and generative commonsense reasoning.
- North America > United States > Illinois (0.05)
- Europe > Sweden > Östergötland County > Linköping (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Ethical AI isn't just how you build it, it's how you use it
Lapses such as racially biased facial recognition or apparently sexist credit card approval algorithms have thankfully left companies asking how to build AI ethically. Many companies have released "ethical AI" guidelines, such as Microsoft's Responsible AI principles, which require that AI systems be fair, inclusive, reliable and safe, transparent, respectful of privacy and security, and accountable. These are laudable, and will help prevent the harms listed above. But harm can also result from what a system is used for, not just from unfairness, black-box opacity, or other implementation details. Consider an autonomous Uber: if it recognizes people using wheelchairs less accurately than people walking, this can be fixed by using training data reflective of the many ways people traverse a city to build a fairer system.
How DALL-E 2 could solve major computer vision challenges
OpenAI has recently released DALL-E 2, a more advanced version of DALL-E, an ingenious multimodal AI capable of generating images purely based on text descriptions. DALL-E 2 does that by employing advanced deep learning techniques that improve the quality and resolution of the generated images and provide further capabilities, such as editing an existing image or creating new versions of it. Many AI enthusiasts and researchers have tweeted about how amazing DALL-E 2 is at generating art and images from nothing but a text prompt, yet in this article I'd like to explore a different application for this powerful text-to-image model -- generating datasets to solve computer vision's biggest challenges. Caption: A DALL-E 2 generated image.
EETimes - Will Machines Ever Fully Understand What They Are Seeing?
Embedded vision technologies are giving machines the power of sight, but today's systems still fall short of understanding all the nuances of an image. An approach used for natural language processing could address that. Attention-based neural networks, particularly transformer networks, have revolutionized natural language processing (NLP), giving machines a better understanding of language than ever before. This technique, which is designed to mimic cognitive processes by giving an artificial neural network an idea of history or context, has produced much more sophisticated AI agents than older approaches that also employ memory, such as long short-term memory (LSTM) and recurrent neural networks (RNNs). NLP now has a deeper level of understanding of the questions or prompts it is fed and can create long pieces of text in response that are often indistinguishable from what a human might write.
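The attention mechanism the article credits with this deeper contextual understanding can be stated compactly. The following is a minimal single-head scaled dot-product attention sketch (the core operation of transformer networks), not any specific production implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends over all keys, so every output row is a
    context-weighted mixture of the value vectors."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Because the weights are computed over the entire sequence at once, each position can draw on context from anywhere in the input, which is what distinguishes this approach from the step-by-step memory of LSTMs and RNNs.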
Researchers Disappointed By Efforts to Teach AI Common Sense
"Current machine text-generation models can write an article that may be convincing to many humans, but they're basically mimicking what they have seen in the training phase," said [PhD student Yuchen] Lin. "Our goal in this paper is to study the problem of whether current state-of-the-art text-generation models can write sentences to describe natural scenarios in our everyday lives." Essentially, after ingesting thousands of natural examples, fake news bots can mimic the sound of the New York Times or of marketing copy. Specifically, Ren and Lin tested the models' ability to reason and showed there is a large gap between current text generation models and human performance. Given a set of common nouns and verbs, state-of-the-art NLP computer models were tasked with creating believable sentences describing an everyday scenario.
The Fear of Artificial Intelligence in Job Loss
With all the hype over Artificial Intelligence, there is also a lot of disturbing buzz about the negative consequences of AI. More than one-quarter (27%) of all employees say they are worried that the work they have now will be eliminated within the next five years because of new innovation, robots, or artificial intelligence, according to the quarterly CNBC/SurveyMonkey Workplace Happiness survey. In certain industries where technology has already played a profoundly disruptive role, employees' fear of automation likewise runs higher than average: workers in automotive, business logistics and support, marketing and advertising, and retail are proportionately more worried about new technology replacing their jobs than those in other industries. The dread stems from the fact that these industries are already witnessing it: self-driving trucks are already threatening the jobs of truck drivers, causing a huge frenzy in this line of work.
- Transportation > Ground > Road (0.56)
- Transportation > Freight & Logistics Services (0.52)
- Banking & Finance > Economy (0.43)
New test reveals AI still lacks common sense
Natural language processing (NLP) has taken great strides recently, but how much does AI understand of what it reads? Less than we thought, according to researchers at USC's Department of Computer Science. In a recent paper, Assistant Professor Xiang Ren and Ph.D. student Yuchen Lin found that despite advances, AI still doesn't have the common sense needed to generate plausible sentences. "Current machine text-generation models can write an article that may be convincing to many humans, but they're basically mimicking what they have seen in the training phase," said Lin. "Our goal in this paper is to study the problem of whether current state-of-the-art text-generation models can write sentences to describe natural scenarios in our everyday lives." Specifically, Ren and Lin tested the models' ability to reason and showed there is a large gap between current text generation models and human performance.
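The task described above (producing a sentence from a set of common nouns and verbs) is typically scored in part on whether the output actually mentions every required concept. The helper below is an illustrative sketch of that check; the function name and the exact-token matching rule are assumptions, and the real benchmark's coverage metric matches lemmas rather than surface tokens:

```python
def covers_concepts(sentence, concepts):
    """Return True if the generated sentence mentions every required concept.
    Simplified: compares lowercased surface tokens, with no lemmatization."""
    tokens = set(sentence.lower().split())
    return all(c.lower() in tokens for c in concepts)
```

For example, a sentence containing "chases" would fail a check for the concept "chase" under this token-level rule, which is exactly why lemma-based matching is used in practice.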