AITopics

Euphemisms are figures of speech which aim to soften the blow of certain words which may be too direct or too harsh (Magu and Luo, 2018; Felt and Riloff, 2020). In the EMNLP 2022 FigLang Workshop Euphemism Shared Task, participating teams are given a set of sentences with potentially euphemistic terms (PETs) enclosed in brackets, and the task is to classify whether or not the PET in a given sentence is used euphemistically.

Compared to other figures of speech like similes (Chakrabarty et al., 2020) and metaphors (Chakrabarty et al., 2021), work on euphemisms has been limited. Recently, Gavidia et al. (2022); Lee et al. (2022) released a new dataset of diverse euphemisms and conducted analysis on automatically identifying potentially euphemistic terms. In the past, Felt and Riloff (2020) used sentiment analysis techniques to recognize euphemistic and dysphemistic phrases. Other studies also focused on
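As a rough illustration of the input format described above, the sketch below pulls the bracketed PET out of a sentence before it would be handed to a classifier. The square-bracket delimiter, the pattern, and the function names are assumptions made for this example; the excerpt only says the terms are "enclosed in brackets".

```python
import re

# Assumption: each PET is wrapped in one pair of square brackets.
# The shared task data may use a different delimiter.
PET_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_pets(sentence: str) -> list[str]:
    """Return the bracketed candidate terms in a sentence, in order."""
    return [term.strip() for term in PET_PATTERN.findall(sentence)]


def strip_brackets(sentence: str) -> str:
    """Return the sentence with the bracket markers removed."""
    return PET_PATTERN.sub(r"\1", sentence)


example = "He [passed away] last spring."
print(extract_pets(example))
print(strip_brackets(example))
```

A classifier would then receive the sentence together with the extracted term and predict whether the usage is euphemistic; that step is not sketched here.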

euphemism, large language model, natural language, (17 more...)

2210.12926

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Diversity-boosted Generalization-Specialization Balancing for Zero-shot Learning

Li, Yun, Liu, Zhe, Chang, Xiaojun, McAuley, Julian, Yao, Lina

Zero-Shot Learning (ZSL) aims to transfer classification capability from seen to unseen classes. Recent methods have proved that generalization and specialization are two essential abilities to achieve good performance in ZSL. However, focusing on only one of the abilities may result in models that are either too general with degraded classification ability or too specialized to generalize to unseen classes. In this paper, we propose an end-to-end network, termed as BGSNet, which equips and balances generalization and specialization abilities at the instance and dataset level. Specifically, BGSNet consists of two branches: the Generalization Network (GNet), which applies episodic meta-learning to learn generalized knowledge, and the Balanced Specialization Network (BSNet), which adopts multiple attentive extractors to extract discriminative features and achieve instance-level balance. A novel self-adjusted diversity loss is designed to optimize BSNet with redundancy reduced and diversity boosted. We further propose a differentiable dataset-level balance and update the weights in a linear annealing schedule to simulate network pruning and thus obtain the optimal structure for BSNet with dataset-level balance achieved. Experiments on four benchmark datasets demonstrate our model's effectiveness. Sufficient component ablations prove the necessity of integrating and balancing generalization and specialization abilities.
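The abstract mentions updating weights on a linear annealing schedule but gives no endpoints or step counts. The sketch below shows only a generic linear schedule; the function name, the default start and end values, and the clamping behavior are assumptions, not details taken from BGSNet.

```python
def linear_anneal(step: int, total_steps: int,
                  start: float = 1.0, end: float = 0.0) -> float:
    """Linearly interpolate from `start` to `end` over `total_steps` steps.

    Values of `step` outside [0, total_steps] are clamped, so the schedule
    stays at `end` once annealing has finished.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * fraction


# Hypothetical usage: a weight that decays from 1.0 to 0.0 over 10 steps.
for step in (0, 5, 10):
    print(step, linear_anneal(step, 10))
```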

large language model, machine learning, natural language, (16 more...)

2201.01961

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

Vu, Tu, Barua, Aditya, Lester, Brian, Cer, Daniel, Iyyer, Mohit, Constant, Noah

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. We assume a strict setting with no access to parallel data or machine translation and find that common transfer learning approaches struggle in this setting, as a generative multilingual model fine-tuned purely on English catastrophically forgets how to generate non-English. Given the recent rise of parameter-efficient adaptation techniques, we conduct the first investigation into how one such method, prompt tuning (Lester et al., 2021), can overcome catastrophic forgetting to enable zero-shot cross-lingual generation. Our experiments show that parameter-efficient prompt tuning provides gains over standard fine-tuning when transferring between less-related languages, e.g., from English to Thai. However, a significant gap still remains between these methods and fully-supervised baselines. To improve cross-lingual transfer further, we explore several approaches, including: (1) mixing in unlabeled multilingual data, and (2) explicitly factoring prompts into recombinable language and task components. Our approaches can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.

large language model, machine learning, natural language, (18 more...)

2205.12647

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > Pakistan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models

Liu, Frederick, Huang, Terry, Lyu, Shihang, Shakeri, Siamak, Yu, Hongkun, Li, Jing

Pre-trained encoder-decoder transformer architectures have become increasingly popular recently with the advent of T5 models. T5 has also become favored over other architectures like BERT due to the amount of data that it is pre-trained on, the increased scale of model parameter sizes, and its easy applicability to a diverse set of tasks due to the generative nature of the model. While being able to generalize to a wide variety of tasks, it is not clear that encoder-decoder architectures are the most efficient for fine-tuning tasks that don't require auto-regressive decoding. In this work, we study fine-tuning pre-trained encoder-decoder models for tasks such as classification, multi-label classification, and structured prediction. We propose EncT5, a framework for these problems, and illustrate instantiations for these tasks. Our experiment results show that EncT5 has advantages over T5 such as efficiency and usability, and that it outperforms BERT when evaluated on publicly available pre-trained checkpoints.

large language model, machine learning, natural language, (19 more...)

2110.08426

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

Dziri, Nouha, Kamalloo, Ehsan, Milton, Sivan, Zaiane, Osmar, Yu, Mo, Ponti, Edoardo M., Reddy, Siva

The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as a training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts performance by 12.8 F1 points on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.

computational linguistic, large language model, natural language, (19 more...)

2204.10757

Country:

North America > Canada > Alberta (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Football (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

#artificialintelligence, Oct-22-2022, 22:45:16 GMT

AI technology is not dark magic, it's just misunderstood

Most technology applications are well understood. Every computer programme can be deconstructed into the basic building blocks of code, and if it goes wrong, you can debug the software, often by simply stepping through the code line by line to find out where the problem lies. Artificial Intelligence, or AI, is different. With the latest AI large language models, we can't predict exactly what they will output, but they will do a good job of writing an article or creating poetry. What makes them human-like is the lack of predictable outcomes: humans simply aren't predictable!

application, dark magic, minimal experience, (15 more...)

Country: Europe > United Kingdom (0.05)

Industry: Information Technology > Security & Privacy (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

#artificialintelligence, Oct-22-2022, 21:00:39 GMT

Venture FOMO Returns as Investors Chase Artificial Intelligence Deals

Venture capitalists are shaking themselves out of a bear market slumber to chase deals in a pocket of artificial intelligence that's spilled into the mainstream this year: AI that generates art, videos and writing. Jasper AI, which last year started selling an AI-assisted writing tool, raised funding from Insight Partners at a $1.5 billion pre-investment valuation around June this year, according to two sources familiar with the talks. Startup valuations have fallen since then. But earlier this month hedge fund Coatue Management paid a higher price for new shares in the Austin, Tex.–based Jasper, which has rapidly increased its revenues using software developed by startup OpenAI. Meanwhile, Descript, a startup that uses AI for video and audio editing and was founded by Groupon co-founder Andrew Mason, has been in talks with OpenAI CEO Sam Altman and other investors to raise a new round, according to people familiar with the discussions.

investor chase artificial intelligence deal, valuation, venture fomo return

Country: North America > United States > Texas > Travis County > Austin (0.30)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.57)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)

#artificialintelligence, Oct-22-2022, 19:15:11 GMT

Will artificial intelligence ever rival true human thinking?

The narrowness of AI will someday be replaced by artificial general intelligence. But will it have the capability to rival human intelligence and creativity? Some of the world's most advanced artificial intelligence (AI) systems, at least the ones the public hear about, are famous for beating human players at chess or poker. Other algorithms are known for their ability to learn how to recognize cats or their inability to recognize people with darker skin. But are current AI systems anything more than toys?

algorithm, intelligence, rival true human thinking, (14 more...)

Country: Asia > China (0.05)

Genre: Personal (0.36)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.59)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

#artificialintelligence, Oct-22-2022, 14:05:06 GMT

Digital transformation with Google Cloud

Alphabet's Google Cloud empowers organisations to digitally transform themselves into smarter businesses. Its diverse solutions include cloud computing, data analytics, and the latest artificial intelligence (AI) and machine learning tools. Last week, many of the platform's latest advances were shared at Next '22, Google Cloud's annual developer and tech conference about digital transformation in the cloud. We've partnered with Google Cloud over the last few years to apply our AI research for making a positive impact on core solutions used by their customers. Here, we introduce a few of these projects, including optimising document understanding, enhancing the value of wind energy, and offering easier use of AlphaFold.

ai research, digital transformation, google cloud, (10 more...)

Country: Europe > Germany (0.05)

Industry:

Information Technology > Services (1.00)
Energy > Renewable > Wind (0.75)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Less is More: Summary of Long Instructions is Better for Program Synthesis

Kuznia, Kirby, Mishra, Swaroop, Parmar, Mihir, Baral, Chitta

arXiv.org Artificial Intelligence, Oct-22-2022

Despite the success of large pre-trained language models (LMs) such as Codex, they show below-par performance on larger and more complicated programming-related questions. We show that LMs benefit from summarized versions of complicated questions. Our findings show that superfluous information often present in problem descriptions, such as human characters, background stories, and names (which are included to help humans understand a task), does not help models understand the task. To this end, we create a meta-dataset from the frequently used APPS dataset and the newly created CodeContests dataset for the program synthesis task. Our meta-dataset consists of human and synthesized summaries of the long and complicated programming questions. Experimental results on Codex show that our proposed approach outperforms the baseline by 8.13% on the APPS dataset and 11.88% on the CodeContests dataset on average in terms of strict accuracy. Our analysis shows that summaries significantly improve performance for introductory (9.86%) and interview (11.48%) programming questions. However, the improvement is small (~2%) for competitive programming questions, implying scope for future research in this direction.
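The results above are reported as strict accuracy. As that metric is commonly defined for APPS-style benchmarks, a problem counts as solved only if the generated program passes every one of its test cases. The sketch below computes it from per-test pass/fail flags; the input layout and the choice to count a problem with no recorded tests as unsolved are assumptions for illustration, not the paper's evaluation code.

```python
def strict_accuracy(results: list[list[bool]]) -> float:
    """Fraction of problems whose program passed all of its test cases.

    `results[i]` holds one pass/fail flag per test case of problem i.
    Assumption: a problem with no recorded tests counts as unsolved.
    """
    if not results:
        return 0.0
    solved = sum(1 for tests in results if tests and all(tests))
    return solved / len(results)


# Hypothetical outcomes for three problems: only the first and third
# pass every test, so two of three problems count as solved.
outcomes = [[True, True], [True, False], [True]]
print(strict_accuracy(outcomes))
```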

large language model, machine learning, natural language, (20 more...)

2203.08597

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Arizona (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)