
Collaborating Authors

Krishna


Still Not There: Can LLMs Outperform Smaller Task-Specific Seq2Seq Models on the Poetry-to-Prose Conversion Task?

Das, Kunal Kingkar, Jagadeeshan, Manoj Balaji, Sahith, Nallani Chakravartula, Sandhan, Jivnesh, Goyal, Pawan

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly treated as universal, general-purpose solutions across NLP tasks, particularly in English. But does this assumption hold for low-resource, morphologically rich languages such as Sanskrit? We address this question by comparing instruction-tuned and in-context-prompted LLMs with smaller task-specific encoder-decoder models on the Sanskrit poetry-to-prose conversion task. This task is intrinsically challenging: Sanskrit verse exhibits free word order combined with rigid metrical constraints, and its conversion to canonical prose (anvaya) requires multi-step reasoning involving compound segmentation, dependency resolution, and syntactic linearisation. This makes it an ideal testbed to evaluate whether LLMs can surpass specialised models. For LLMs, we apply instruction fine-tuning on general-purpose models and design in-context learning templates grounded in Paninian grammar and classical commentary heuristics. For task-specific modelling, we fully fine-tune a ByT5-Sanskrit Seq2Seq model. Our experiments show that domain-specific fine-tuning of ByT5-Sanskrit significantly outperforms all instruction-driven LLM approaches. Human evaluation strongly corroborates this result, with scores exhibiting high correlation with Kendall's Tau scores. Additionally, our prompting strategies provide an alternative to fine-tuning when domain-specific verse corpora are unavailable, and the task-specific Seq2Seq model demonstrates robust generalisation on out-of-domain evaluations.
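The Kendall's Tau scores mentioned above compare the word order a system produces against the reference prose ordering. A minimal sketch of that computation, assuming a one-to-one alignment of unique tokens between prediction and reference (the paper's exact evaluation pipeline is not reproduced here):

```python
def kendall_tau(pred_order, ref_order):
    """Kendall's Tau between two orderings of the same (unique) tokens.

    Tau = (concordant pairs - discordant pairs) / total pairs,
    ranging from 1.0 (identical order) to -1.0 (reversed order).
    """
    # Position of each token in the reference ordering.
    ref_pos = {tok: i for i, tok in enumerate(ref_order)}
    ranks = [ref_pos[tok] for tok in pred_order]
    n = len(ranks)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if ranks[i] < ranks[j]:
                concordant += 1
            else:
                discordant += 1
    total = n * (n - 1) // 2
    return (concordant - discordant) / total
```

For production use, `scipy.stats.kendalltau` offers an equivalent (tie-aware) implementation.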


Can A.I. Writing Be More Than a Gimmick?

The New Yorker

The new essay collection "Searches: Selfhood in the Digital Age," by Vauhini Vara, opens with a transcript. "If I paste some writing here, can we talk about it?" Her interlocutor, the large language model ChatGPT, responds, "Of course!" The chatbot asks what specific themes it should focus on. "Nothing in particular," Vara replies.


AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

Seo, Jamin, Ramachandran, Akshat, Chuang, Yu-Chuan, Itagi, Anirudh, Krishna, Tushar

arXiv.org Artificial Intelligence

Design space exploration (DSE) plays a crucial role in enabling custom hardware architectures, particularly for emerging applications like AI, where optimized and specialized designs are essential. With the growing complexity of deep neural networks (DNNs) and the introduction of advanced foundation models (FMs), the design space for DNN accelerators is expanding at an exponential rate. Additionally, this space is highly non-uniform and non-convex, making it increasingly difficult to navigate and optimize. Traditional DSE techniques rely on search-based methods, which involve iterative sampling of the design space to find the optimal solution. However, this process is both time-consuming and often fails to converge to the global optimum for such design spaces. Recently, AIrchitect v1, the first attempt to address the limitations of search-based techniques, transformed DSE into a constant-time classification problem using recommendation networks. In this work, we propose AIrchitect v2, a more accurate and generalizable learning-based DSE technique applicable to large-scale design spaces that overcomes the shortcomings of earlier approaches. Specifically, we devise an encoder-decoder transformer model that (a) encodes the complex design space into a uniform intermediate representation using contrastive learning and (b) leverages a novel unified representation blending the advantages of classification and regression to effectively explore the large DSE space without sacrificing accuracy. Experimental results on 10^5 real DNN workloads demonstrate that, on average, AIrchitect v2 outperforms existing techniques by 15% in identifying optimal design points. Furthermore, to demonstrate the generalizability of our method, we evaluate performance on unseen model workloads (LLMs) and attain a 1.7x improvement in inference latency on the identified hardware architecture.
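The "unified representation blending the advantages of classification and regression" can be illustrated with a toy encoding: a continuous design parameter is split into a coarse class (a bin index, the classification part) plus a continuous offset within that bin (the regression part). The bin count and value range below are invented for illustration; the paper's actual encoder-decoder transformer is not reproduced:

```python
def encode_unified(value, lo, hi, n_bins):
    """Map a continuous design value to (bin index, within-bin offset).

    The bin index is a classification target; the offset refines it as a
    regression target, so neither coarse bins nor unbounded regression
    alone must carry the full precision.
    """
    width = (hi - lo) / n_bins
    idx = min(int((value - lo) / width), n_bins - 1)  # clamp the top edge
    offset = (value - (lo + idx * width)) / width
    return idx, offset

def decode_unified(idx, offset, lo, hi, n_bins):
    """Invert encode_unified back to a continuous design value."""
    width = (hi - lo) / n_bins
    return lo + (idx + offset) * width
```

A model trained on such targets predicts the bin with a softmax head and the offset with a regression head, then the two are decoded jointly.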


Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?

Patel, Ajay, Andrews, Nicholas, Callison-Burch, Chris

arXiv.org Artificial Intelligence

Authorship style transfer involves altering text to match the style of a target author whilst preserving the original meaning. Existing unsupervised approaches like STRAP have largely focused on style transfer to target authors with many examples of their writing style in books, speeches, or other published works. This high-resource training data requirement (often greater than 100,000 words) makes these approaches primarily useful for style transfer to published authors, politicians, or other well-known figures and authorship styles, while style transfer to non-famous authors has not been well-studied. We introduce the low-resource authorship style transfer task, a more challenging class of authorship style transfer where only a limited amount of text in the target author's style may exist. In our experiments, we specifically choose source and target authors from Reddit and style transfer their Reddit posts, limiting ourselves to just 16 posts (on average ~500 words) of the target author's style. Style transfer accuracy is typically measured by how often a classifier or human judge will classify an output as written by the target author. Recent authorship representation models excel at authorship identification even with just a few writing samples, making automatic evaluation of this task possible for the first time through the evaluation metrics we propose. Our results establish an in-context learning technique we develop as the strongest baseline, though we find current approaches do not yet achieve mastery of this challenging task. We release our data and implementations to encourage further investigation.
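The style transfer accuracy described above reduces to a simple computation once an authorship classifier is available. In this sketch, `classify_author` is a hypothetical stand-in for any authorship-identification model (its interface is an assumption, not the paper's API):

```python
def style_transfer_accuracy(outputs, target_author, classify_author):
    """Fraction of transferred texts attributed to the target author.

    outputs: list of style-transferred texts.
    target_author: the author whose style was imitated.
    classify_author: callable mapping a text to a predicted author label;
        a placeholder for an authorship-representation model.
    """
    hits = sum(1 for text in outputs if classify_author(text) == target_author)
    return hits / len(outputs)
```

In practice the classifier would be a nearest-neighbour lookup over learned authorship embeddings rather than the toy callable used here.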


MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning

Xu, Zhiyang, Shen, Ying, Huang, Lifu

arXiv.org Artificial Intelligence

Instruction tuning, a new learning paradigm that fine-tunes pre-trained language models on tasks specified through instructions, has shown promising zero-shot performance on various natural language processing tasks. However, it has yet to be explored for vision and multimodal tasks. In this work, we introduce MULTIINSTRUCT, the first multimodal instruction tuning benchmark dataset, consisting of 62 diverse multimodal tasks in a unified seq-to-seq format covering 10 broad categories. The tasks are derived from 21 existing open-source datasets, and each task is equipped with 5 expert-written instructions. We take OFA as the base pre-trained model for multimodal instruction tuning, and to further improve its zero-shot performance, we explore multiple transfer learning strategies to leverage the large-scale NATURAL INSTRUCTIONS dataset. Experimental results demonstrate strong zero-shot performance on various unseen multimodal tasks and the benefit of transfer learning from a text-only instruction dataset. We also design a new evaluation metric, Sensitivity, to evaluate how sensitive the model is to the variety of instructions. Our results indicate that fine-tuning the model on a diverse set of tasks and instructions reduces the model's sensitivity to variations in the instructions for each task.
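One plausible formalisation of a sensitivity metric like the one described above (an assumption for illustration, not the paper's exact definition) is the dispersion of a model's score across the paraphrased instructions of each task, averaged over tasks:

```python
import statistics

def instruction_sensitivity(scores_per_task):
    """Average per-task score dispersion across instruction paraphrases.

    scores_per_task: list of lists; scores_per_task[t][i] is the model's
        performance on task t under its i-th instruction paraphrase
        (e.g. 5 expert-written instructions per task).
    Lower values mean performance is more stable under instruction rewording.
    """
    return statistics.mean(
        statistics.pstdev(scores) for scores in scores_per_task
    )
```

A model that is robust to instruction variation scores near 0 regardless of how well it performs on average.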


IBM to freeze hiring as CEO expects AI to replace 7,800 jobs

Al Jazeera

IBM will freeze hiring as it expects about 7,800 jobs to be replaced by Artificial Intelligence (AI) in the coming years, the tech giant's CEO has said. In an interview with Bloomberg News, IBM CEO Arvind Krishna said he could "easily see" nearly one-third of the company's non-customer-facing roles being replaced in the next five years. "These non-customer-facing roles amount to roughly 26,000 workers," Krishna said in the interview published on Tuesday. "I could easily see 30 percent of that getting replaced by AI and automation over a five-year period." Back-office employees are only a small portion of IBM's 260,000 or so workers and the company, based in Armonk, New York, has continued to fill roles even after letting go of about 5,000 workers in other areas, according to Bloomberg.


'Godfather of AI' Geoffrey Hinton quits Google and warns over dangers of machine learning

The Guardian

The man often touted as the godfather of AI has quit Google, citing concerns over the flood of fake information, videos and photos online and the possibility for AI to upend the job market. Dr Geoffrey Hinton, who with two of his students at the University of Toronto built a neural net in 2012, quit Google this week, the New York Times reported. Hinton, 75, said he quit to speak freely about the dangers of AI, and in part regrets his contribution to the field. He was brought on by Google a decade ago to help develop the company's AI technology. Hinton's research led the way for current systems like ChatGPT.


An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis

Shukla, Akshat, Bansal, Chaarvi, Badhe, Sushrut, Ranjan, Mukul, Chandra, Rohitash

arXiv.org Artificial Intelligence

Google Translate has been prominent for language translation; however, limited work has been done to evaluate the quality of its translations against those of human experts. Sanskrit is one of the oldest written languages in the world, and in 2022 it was added to the Google Translate engine. Sanskrit is known as the mother of languages such as Hindi and an ancient source of the Indo-European group of languages. It is the original language of sacred Hindu texts such as the Bhagavad Gita. In this study, we present a framework that evaluates Google Translate for Sanskrit using the Bhagavad Gita. We first publish a translation of the Bhagavad Gita produced with Google Translate. Our framework then compares the Google Translate version of the Bhagavad Gita with expert translations using sentiment and semantic analysis via BERT-based language models. Our results indicate that, in terms of sentiment and semantic analysis, there is a low level of similarity between selected verses of the Google Translate output and the expert translations. In the qualitative evaluation, we find that Google Translate is unsuitable for translating certain Sanskrit words and phrases because of their poetic nature, contextual significance, metaphor, and imagery. The mistranslations are not surprising, since the Bhagavad Gita is known as a difficult text not only to translate but also to interpret, as it relies on contextual, philosophical, and historical information. Our framework lays the foundation for the automatic evaluation of other languages supported by Google Translate.
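At its core, the semantic comparison described above reduces to measuring similarity between sentence embeddings of the machine and expert translations, typically via cosine similarity. A minimal sketch, with short placeholder vectors standing in for the BERT-based embeddings:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (1.0 = identical
    direction, 0.0 = orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

In the actual framework each verse would first be embedded with a BERT-based model, and a low per-verse cosine score would flag a semantic mismatch between the Google Translate output and the expert translation.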


A review of Implementation and Challenges of Unmanned Aerial Vehicles for Spraying Applications and Crop Monitoring in Indonesia

Fikri, Muhamad Rausyan, Candra, Taufiq, Saptaji, Kushendarsyah, Noviarini, Ajeng Nindi, Wardani, Dilla Ayu

arXiv.org Artificial Intelligence

Abstract: The rapid development of technology has made unmanned aerial vehicles (UAVs) widely known in the current era. The market for UAVs is also predicted to keep growing, along with related technologies, in the future. UAVs have been used in various sectors, including livestock, forestry, and agriculture. In agricultural applications, UAVs are highly capable of increasing farm productivity and reducing farmers' workload. This study examines the urgency of UAV implementation in the agriculture sector. A short history of UAVs is provided in this paper to portray their development over time. The classification of UAVs is also discussed to differentiate the various types of UAVs. The discussion of UAV applications in spraying and crop monitoring draws on previous studies by the many scientific groups and researchers working closely to propose solutions for agriculture-related issues. Furthermore, the limitations of UAV applications are identified, and the challenges of implementing agricultural UAVs in Indonesia are presented.

Keywords: Unmanned aerial vehicle, agricultural UAV, spraying, crop monitoring.

1. Introduction

According to the United Nations (UN), the world population is projected to reach 9.7 billion people by 2050 (UN, 2015). This vast population could potentially double food demand in the future (Hunter et al., 2017); consequently, the ever-growing population could cause food shortages. The issue has become a severe problem since the Food and Agriculture Organization (FAO) announced a similar projection, in which current agricultural production must be increased by 70 percent by 2050 to meet the rising demand for high-quality food (Mundial, 2021). The many people suffering from hunger signal how severe the food shortage is: more than 820 million people were reported to be undernourished in 2018 (WHO, 2019). Strikingly, this shows an increasing tendency, since only around 690 million people were considered to be suffering from hunger in 2015.


Can companies make decisions with AI?

#artificialintelligence

AI can play many roles in the technology stack of a modern enterprise. Its performance as a neutral, data-based, analytical advisor could allow businesses to use algorithms to predict whether a decision is the right one. AI-based decisions are part of an arsenal of tools leveraged by technology high performers. Businesses led by digitally savvy leaders, those who champion emerging technologies such as AI, outperform other like-sized businesses by 48% on valuation and revenue growth, according to one MIT research study. "The integration of traditional decisioning into AI is really just starting to hit its stride right now," said Rowan Curran, analyst at Forrester.