AITopics | Ju, Da

Collaborating Authors

Ju, Da

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Open Language Models by Learning from Organic Interactions

Xu, Jing, Ju, Da, Lane, Joshua, Komeili, Mojtaba, Smith, Eric Michael, Ung, Megan, Behrooz, Morteza, Ngan, William, Moritz, Rashel, Sukhbaatar, Sainbayar, Boureau, Y-Lan, Weston, Jason, Shuster, Kurt

arXiv.org Artificial IntelligenceJun-7-2023

We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with organic data is challenging because interactions with people "in the wild" include both high quality conversations and feedback, as well as adversarial and toxic behavior. We study techniques that enable learning from helpful teachers while avoiding learning from people who are trying to trick the model into unhelpful or toxic responses. BlenderBot 3x is both preferred in conversation to BlenderBot 3, and is shown to produce safer responses in challenging situations. While our current models are still far from perfect, we believe further improvement can be achieved by continued use of the techniques explored in this work.

machine learning, natural language, reward model, (17 more...)

arXiv.org Artificial Intelligence

2306.04707

Country:

Europe > Italy (0.14)
Asia > Middle East > UAE (0.14)
Asia > China (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Not All Memories are Created Equal: Learning to Forget by Expiring

Sukhbaatar, Sainbayar, Ju, Da, Poff, Spencer, Roller, Stephen, Szlam, Arthur, Weston, Jason, Fan, Angela

arXiv.org Artificial IntelligenceJun-13-2021

Attention mechanisms have shown promising results in sequence modeling tasks that require long-term memory. Recent work investigated mechanisms to reduce the computational cost of preserving and storing memories. However, not all content in the past is equally important to remember. We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information. This forgetting of memories enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently, as not all states from previous timesteps are preserved. We demonstrate that Expire-Span can help models identify and retain critical information and show it can achieve strong performance on reinforcement learning tasks specifically designed to challenge this functionality. Next, we show that Expire-Span can scale to memories that are tens of thousands in size, setting a new state of the art on incredibly long context tasks such as character-level language modeling and a frame-by-frame moving objects task. Finally, we analyze the efficiency of Expire-Span compared to existing approaches and demonstrate that it trains faster and uses less memory.

artificial intelligence, neural network, xpire, (17 more...)

arXiv.org Artificial Intelligence

2105.06548

Country: Africa (0.14)

Genre: Research Report (0.50)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Goyal, Naman, Gao, Cynthia, Chaudhary, Vishrav, Chen, Peng-Jen, Wenzek, Guillaume, Ju, Da, Krishnan, Sanjana, Ranzato, Marc'Aurelio, Guzman, Francisco, Fan, Angela

arXiv.org Artificial IntelligenceJun-6-2021

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.

health & medicine, machine translation, translation, (15 more...)

arXiv.org Artificial Intelligence

2106.03193

Country:

Asia (0.92)
Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Recipes for Safety in Open-domain Chatbots

Xu, Jing, Ju, Da, Li, Margaret, Boureau, Y-Lan, Weston, Jason, Dinan, Emily

arXiv.org Artificial IntelligenceOct-22-2020

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for evaluating them, as well as a novel method to distill safety considerations inside generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find our new techniques are (i) safer than existing models as measured by automatic and human evaluations while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.

chatbot, classifier, social media, (16 more...)

arXiv.org Artificial Intelligence

2010.07079

Country:

Europe (0.46)
North America > United States (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Multi-Modal Open-Domain Dialogue

Shuster, Kurt, Smith, Eric Michael, Ju, Da, Weston, Jason

arXiv.org Artificial IntelligenceOct-2-2020

Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020). However, if we want to build agents with human-like abilities, we must expand beyond handling just text. A particularly important topic is the ability to see images and communicate about what is perceived. With the goal of engaging humans in multi-modal dialogue, we investigate combining components from state-of-the-art open-domain dialogue agents with those from state-of-the-art vision models. We study incorporating different image fusion schemes and domain-adaptive pre-training and fine-tuning strategies, and show that our best resulting model outperforms strong existing models in multi-modal dialogue while simultaneously performing as well as its predecessor (text-only) BlenderBot (Roller et al., 2020) in text-based conversation. We additionally investigate and incorporate safety components in our final model, and show that such efforts do not diminish model performance with respect to engagingness metrics.

dataset, image understanding, information fusion, (22 more...)

arXiv.org Artificial Intelligence

2010.01082

Country:

Europe (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.46)
(2 more...)

Add feedback

Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

Roller, Stephen, Boureau, Y-Lan, Weston, Jason, Bordes, Antoine, Dinan, Emily, Fan, Angela, Gunning, David, Ju, Da, Li, Margaret, Poff, Spencer, Ringshia, Pratik, Shuster, Kurt, Smith, Eric Michael, Szlam, Arthur, Urbanek, Jack, Williamson, Mary

arXiv.org Artificial IntelligenceJul-13-2020

Further, we discuss only open academic research with entertaining wit and knowledge while making others feel reproducible published results, hence we will not address heard. The breadth of possible conversation topics and lack much of the considerable work that has been put into building of a well-defined objective make it challenging to define a commercial systems, where methods, data and results roadmap towards training a good conversational agent, or are not in the public domain. Finally, given that we focus on chatbot. Despite recent progress across the board (Adiwardana open-domain conversation, we do not focus on specific goaloriented et al., 2020; Roller et al., 2020), conversational agents techniques; we also do not cover spoken dialogue in are still incapable of carrying an open-domain conversation this work, focusing on text and image input/output only. For that remains interesting, consistent, accurate, and reliably more general recent surveys, see Gao et al. (2019); Jurafsky well-behaved (e.g., not offensive) while navigating a variety and Martin (2019); Huang, Zhu, and Gao (2020). of topics. Traditional task-oriented dialogue systems rely on slotfilling and structured modules (e.g., Young et al. (2013); Gao et al. (2019); Jurafsky and Martin (2019)).

computational linguistics, deep learning, neural network, (23 more...)

arXiv.org Artificial Intelligence

2006.12442

Country:

Europe (1.00)
Asia (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine (1.00)
Education (0.93)
Media (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

Gehring, Jonas, Ju, Da, Mella, Vegard, Gant, Daniel, Usunier, Nicolas, Synnaeve, Gabriel

arXiv.org Machine LearningNov-20-2018

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy. Here, a good strategy successfully counters the opponent's current and possible future strategies which can only be estimated using partial observations. We investigate whether we can utilize the full game state information during training time (in the form of an auxiliary prediction task) to increase performance. Experiments carried out within a StarCraft: Brood War bot against strong community bots show substantial win rate improvements over a fixed-strategy baseline and encouraging results when learning with the auxiliary task.

build order, computer game, deep learning, (19 more...)

arXiv.org Machine Learning

1811.08568

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Add feedback