AITopics | Jwalapuram, Prathyusha

Jwalapuram, Prathyusha

RakutenAI-7B: Extending Large Language Models for Japanese

Rakuten Group, Levine, Aaron, Huang, Connie, Wang, Chenguang, Batista, Eduardo, Szymanska, Ewa, Ding, Hongyi, Chou, Hou Wei, Pessiot, Jean-François, Effendi, Johanes, Chiu, Justin, Ohlhus, Kai Torben, Chopra, Karan, Shinzato, Keiji, Murakami, Koji, Xiong, Lee, Chen, Lei, Kubota, Maki, Tkachenko, Maksim, Lee, Miroku, Takahashi, Naoki, Jwalapuram, Prathyusha, Tatsushima, Ryutaro, Jain, Saurabh, Yadav, Sunil Kumar, Cai, Ting, Chen, Wei-Te, Xia, Yandi, Nakayama, Yuki, Higashiyama, Yutaka

arXiv.org Artificial Intelligence, Mar-21-2024

We introduce RakutenAI-7B, a suite of Japanese-oriented large language models that achieve the best performance on the Japanese LM Harness benchmarks among the open 7B models. Along with the foundation model, we release instruction- and chat-tuned models, RakutenAI-7B-instruct and RakutenAI-7B-chat respectively, under the Apache 2.0 license.

artificial intelligence, large language model, natural language, (17 more...)

arXiv:2403.15484

Country:

Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Dynamic Scheduled Sampling with Imitation Loss for Neural Text Generation

Lin, Xiang, Jwalapuram, Prathyusha, Joty, Shafiq

arXiv.org Artificial Intelligence, Jan-31-2023

State-of-the-art neural text generation models are typically trained to maximize the likelihood of each token in the ground-truth sequence conditioned on the previous target tokens. However, during inference, the model needs to make a prediction conditioned on the tokens generated by itself. This train-test discrepancy is referred to as exposure bias. Scheduled sampling is a curriculum learning strategy that gradually exposes the model to its own predictions during training to mitigate this bias. Most of the proposed approaches design a scheduler based on training steps, which generally requires careful tuning depending on the training setup. In this work, we introduce Dynamic Scheduled Sampling with Imitation Loss (DySI), which maintains the schedule based solely on the training-time accuracy, while enhancing the curriculum learning by introducing an imitation loss, which attempts to make the behavior of the decoder indistinguishable from the behavior of a teacher-forced decoder. DySI is universally applicable across training setups with minimal tuning. Extensive experiments and analysis show that DySI not only achieves notable improvements on standard machine translation benchmarks, but also significantly improves the robustness of other text generation models.
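The general idea of scheduled sampling described in this abstract can be sketched in a few lines: at each decoder position, the next input is either the ground-truth token or the model's own earlier prediction. The sketch below is only illustrative. The function names `sampling_probability` and `mix_decoder_inputs` are hypothetical, and mapping training accuracy directly to a sampling probability is an assumption made for the example, not the schedule or the imitation loss defined in the DySI paper.

```python
import random


def sampling_probability(train_accuracy, max_prob=1.0):
    """Hypothetical accuracy-driven schedule (an assumption, not DySI's rule).

    The more accurate the model is on training data, the more often it is
    fed its own predictions. The value is clamped to [0, max_prob].
    """
    return max(0.0, min(max_prob, train_accuracy))


def mix_decoder_inputs(gold_tokens, predicted_tokens, sample_prob, rng):
    """Build decoder inputs by mixing gold tokens and model predictions.

    For each position, the model's prediction is used with probability
    `sample_prob`; otherwise the ground-truth token is kept
    (plain teacher forcing corresponds to sample_prob == 0).
    """
    if len(gold_tokens) != len(predicted_tokens):
        raise ValueError("gold and predicted sequences must have equal length")
    mixed = []
    for gold, predicted in zip(gold_tokens, predicted_tokens):
        use_model_token = rng.random() < sample_prob
        mixed.append(predicted if use_model_token else gold)
    return mixed
```

With `sample_prob` set to 0 the function returns the gold sequence unchanged, and with 1 it returns the model's predictions, so the schedule interpolates between teacher forcing and inference-like conditioning.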

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:2301.13753

Country:

Europe (1.00)
North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite

Jwalapuram, Prathyusha, Joty, Shafiq, Temnikova, Irina, Nakov, Preslav

arXiv.org Artificial Intelligence, Aug-31-2019

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations. Unfortunately, even when the resulting improvements are seen as substantial by humans, they remain virtually unnoticed by traditional automatic evaluation measures like BLEU, as only a few words end up being affected. Thus, specialized evaluation measures are needed. With this aim in mind, we contribute an extensive, targeted dataset that can be used as a test suite for pronoun translation into English, covering multiple source languages and different pronoun errors drawn from real system translations. We further propose an evaluation measure to differentiate good and bad pronoun translations. We also conduct a user study to report correlations with human judgments.
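The claim that n-gram measures barely register a pronoun error can be illustrated with a small sketch. The code below computes clipped n-gram precision, which is one ingredient of BLEU rather than the full metric (there is no brevity penalty and no averaging across n-gram orders). The function names and the example sentences are made up for illustration and do not come from the paper's test suite.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference. Returns 0.0 if the hypothesis has no n-grams.
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / total


# Illustrative (invented) sentence pair: only the pronoun is wrong.
reference = "she said that it was broken".split()
hypothesis = "he said that it was broken".split()
unigram_precision = clipped_precision(hypothesis, reference, 1)
bigram_precision = clipped_precision(hypothesis, reference, 2)
```

For this pair the unigram precision is 5/6 and the bigram precision is 4/5, even though the wrong pronoun changes who the sentence is about. A single substituted content word in the same position would give the same scores, so the measure cannot tell a pronoun error apart from any other one-word change.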

deep learning, neural network, translation, (20 more...)

arXiv:1909.00131

Country:

Asia > Middle East > Qatar (0.14)
North America > United States > Michigan (0.14)
North America > United States > Massachusetts (0.14)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

A Unified Linear-Time Framework for Sentence-Level Discourse Parsing

Lin, Xiang, Joty, Shafiq, Jwalapuram, Prathyusha, Bari, M Saiful

arXiv.org Artificial Intelligence, Jun-12-2019

We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDUs) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$).

deep learning, neural network, parser, (21 more...)

arXiv:1905.05682

Country:

North America > United States > Maryland (0.14)
Asia > Middle East > Qatar (0.14)
North America > Canada > British Columbia (0.14)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
