Collaborating Authors

Venugopalan, Subhashini


Towards AI-assisted Academic Writing

arXiv.org Artificial Intelligence

We present components of an AI-assisted academic writing system, including citation recommendation and introduction writing. The system recommends citations by considering the user's current document context to provide relevant suggestions. It generates introductions in a structured fashion, situating the contributions of the research relative to prior work. We demonstrate the effectiveness of these components through quantitative evaluations. Finally, the paper presents qualitative research exploring how researchers incorporate citations into their writing workflows. Our findings indicate that there is demand for precise AI-assisted writing systems and for simple, effective methods that meet those needs.
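
The context-conditioned recommendation step can be pictured as nearest-neighbor search over paper embeddings. The sketch below is a minimal illustration, not the paper's system: the hashed bag-of-words `embed` is a runnable stand-in for whatever learned encoder the authors use, and the candidate-record fields ("title", "abstract") are assumptions.

```python
import numpy as np

def embed(text, dim=256):
    # Stand-in embedding: hashed bag-of-words, normalized. A real system
    # would use a learned dense encoder; this stub just keeps the sketch
    # self-contained and runnable.
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def recommend_citations(context, candidates, k=3):
    # Rank candidate papers by cosine similarity between the user's
    # current document context and each candidate's title + abstract.
    q = embed(context)
    scored = [(float(q @ embed(c["title"] + " " + c["abstract"])), c)
              for c in candidates]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:k]]
```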


CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning

arXiv.org Artificial Intelligence

Scientific problem-solving involves synthesizing information while applying expert knowledge. We introduce CURIE, a scientific long-Context Understanding, Reasoning, and Information Extraction benchmark to measure the potential of Large Language Models (LLMs) in scientific problem-solving and in assisting scientists in realistic workflows. The benchmark comprises ten challenging tasks with a total of 580 problem-and-solution pairs curated by experts in six disciplines - materials science, condensed matter physics, quantum computing, geospatial analysis, biodiversity, and proteins - covering both experimental and theoretical workflows in science. We evaluate a range of closed and open LLMs on the CURIE tasks, which require domain expertise, comprehension of long in-context information, and multi-step reasoning. While Gemini Flash 2.0 and Claude-3 show consistently high comprehension across domains, the popular GPT-4o and Command-R+ fail dramatically on protein sequencing tasks. With the best performance at 32%, there is much room for improvement for all models. We hope that insights gained from CURIE can guide the future development of LLMs in the sciences. Evaluation code and data are available at https://github.com/google/curie
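
A benchmark of this shape reduces to a simple evaluation loop: load expert-curated problems, query a model with the long context, and aggregate scores per task. The sketch below assumes a JSONL schema and caller-supplied `ask_model` and `score` callables; the actual CURIE release defines its own format and metrics.

```python
import json

def evaluate_curie(problems_path, ask_model, score):
    # problems_path: JSONL whose fields are assumed here to be
    # {"task": ..., "context": ..., "question": ..., "solution": ...}.
    # ask_model(context, question) -> prediction string;
    # score(prediction, solution) -> float in [0, 1].
    per_task = {}
    with open(problems_path) as f:
        for line in f:
            ex = json.loads(line)
            pred = ask_model(ex["context"], ex["question"])
            per_task.setdefault(ex["task"], []).append(
                score(pred, ex["solution"]))
    # Mean score per task, so weak domains stand out individually.
    return {t: sum(s) / len(s) for t, s in per_task.items()}
```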


Speech Recognition With LLMs Adapted to Disordered Speech Using Reinforcement Learning

arXiv.org Artificial Intelligence

We introduce a large language model (LLM) capable of processing speech inputs and show that tuning it further with reinforcement learning from human feedback (RLHF) enables it to adapt better to disordered speech than traditional fine-tuning. Our method replaces low-frequency text tokens in the LLM's vocabulary with audio tokens and enables the model to recognize speech by fine-tuning it on speech with transcripts. We then use RL with rewards based on syntactic and semantic accuracy measures, further generalizing the LLM to recognize disordered speech. While the resulting LLM does not outperform existing systems for speech recognition, we find that tuning with reinforcement learning using custom rewards leads to substantially better performance than supervised fine-tuning of the language model, specifically when adapting to speech in a different setting. This presents a compelling alternative tuning strategy for speech recognition using large language models.
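
A reward combining syntactic and semantic accuracy can be sketched as below. The word-level WER term is standard; the equal weighting and the caller-supplied `semantic_sim` (e.g. cosine similarity of sentence embeddings) are assumptions, not the paper's exact reward.

```python
def edit_distance(a, b):
    # Word-level Levenshtein distance via single-row dynamic programming.
    n = len(b)
    d = list(range(n + 1))
    for i in range(1, len(a) + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                   d[j - 1] + 1,    # insertion
                                   prev + (a[i - 1] != b[j - 1]))  # sub
    return d[n]

def reward(hyp, ref, semantic_sim):
    # Syntactic term: 1 - WER, clipped at 0. Semantic term: any
    # similarity in [0, 1]. The 0.5/0.5 mix is an assumed weighting.
    h, r = hyp.split(), ref.split()
    wer = edit_distance(h, r) / max(len(r), 1)
    return 0.5 * max(0.0, 1.0 - wer) + 0.5 * semantic_sim(hyp, ref)
```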


SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

arXiv.org Artificial Intelligence

Seeking answers to questions within long scientific research articles is a crucial area of study, helping readers quickly resolve their inquiries. However, existing question-answering (QA) datasets based on scientific papers are limited in scale and focus solely on textual content. To address this limitation, we introduce SPIQA (Scientific Paper Image Question Answering), the first large-scale QA dataset specifically designed to interpret complex figures and tables within the context of scientific research articles across various domains of computer science. Leveraging the breadth of expertise and the ability of multimodal large language models (MLLMs) to understand figures, we employ automatic and manual curation to create the dataset. We craft an information-seeking task involving multiple images that cover a wide variety of plots, charts, tables, schematic diagrams, and result visualizations. SPIQA comprises 270K questions divided into training, validation, and three different evaluation splits. Through extensive experiments with 12 prominent foundation models, we evaluate the ability of current multimodal systems to comprehend the nuanced aspects of research articles. Additionally, we propose a Chain-of-Thought (CoT) evaluation strategy with in-context retrieval that allows fine-grained, step-by-step assessment and improves model performance. We further explore the upper bounds of performance enhancement with additional textual information, highlighting its promising potential for future research and the dataset's potential to reshape how we interact with scientific literature.
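
CoT evaluation with in-context retrieval amounts to selecting the most relevant figures and asking the model to reason over them step by step. Below is a minimal prompt-assembly sketch; the keyword-overlap retriever, the figure-record fields, and the prompt wording are stand-ins, not SPIQA's actual pipeline.

```python
def build_cot_prompt(question, figures, retrieve_top_k=2):
    # figures: assumed list of dicts with "id" and "caption" fields.
    # Retrieval stand-in: rank captions by keyword overlap with the question.
    q_words = set(question.lower().split())

    def overlap(fig):
        return len(q_words & set(fig["caption"].lower().split()))

    picked = sorted(figures, key=overlap, reverse=True)[:retrieve_top_k]
    context = "\n".join(f"[{f['id']}] {f['caption']}" for f in picked)
    return (f"Relevant figures:\n{context}\n\n"
            f"Question: {question}\n"
            "First identify which figure answers the question, then reason "
            "step by step before giving the final answer.")
```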


A Design Space for Intelligent and Interactive Writing Assistants

arXiv.org Artificial Intelligence

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.
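
The aspect / dimension / code hierarchy is a three-level taxonomy, which maps naturally onto a nested mapping. In the sketch below the five aspects come from the paper, but the dimensions and codes shown are illustrative placeholders, not the paper's full coding scheme.

```python
# Three-level taxonomy: aspect -> dimension -> list of codes.
# Aspects are the paper's five; dimensions/codes here are examples only.
DESIGN_SPACE = {
    "task": {"writing stage": ["planning", "drafting", "revision"]},
    "user": {"expertise": ["novice", "professional"]},
    "technology": {"model type": ["rule-based", "LLM"]},
    "interaction": {"initiative": ["user-initiated", "system-initiated"]},
    "ecosystem": {"deployment": ["standalone", "embedded"]},
}

def codes_for(aspect, dimension):
    # Look up the recorded options for one dimension of one aspect,
    # e.g. codes_for("task", "writing stage") -> ["planning", ...].
    return DESIGN_SPACE[aspect][dimension]
```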


Quantum Many-Body Physics Calculations with Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated an unprecedented ability to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physics. We focus on a broadly used approximation method in quantum physics, the Hartree-Fock method, which requires an analytic multi-step calculation deriving an approximate Hamiltonian and the corresponding self-consistency equations. To carry out the calculations using LLMs, we design multi-step prompt templates that break down the analytic calculation into standardized steps with placeholders for problem-specific information. We evaluate GPT-4's performance in executing the calculation for 15 research papers from the past decade, demonstrating that, with correction of intermediate steps, it correctly derives the final Hartree-Fock Hamiltonian in 13 cases and makes minor errors in 2. Aggregating across all research papers, we find an average score of 87.5 (out of 100) on the execution of individual calculation steps. Overall, the requisite skill for doing these calculations is at the graduate level in quantum condensed matter theory. We further use LLMs to mitigate the two primary bottlenecks in this evaluation process: (i) extracting information from papers to fill in templates and (ii) automatically scoring the calculation steps, demonstrating good results in both cases. This strong performance is a first step toward developing algorithms that automatically explore theoretical hypotheses at an unprecedented scale.
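
The prompt-template idea can be sketched as a chain of steps whose placeholders are filled with paper-specific information. The template text and placeholder names below are illustrative only; the paper's actual templates carry far more physics detail.

```python
from string import Template

# One step of a multi-step Hartree-Fock prompt chain. Placeholder names
# ($system, $h_int) and the wording are assumptions for illustration.
STEP_TEMPLATE = Template(
    "You are deriving the Hartree-Fock Hamiltonian for $system.\n"
    "Starting from the interacting Hamiltonian $h_int, apply the "
    "mean-field decoupling and write the quadratic (Hartree and Fock) "
    "terms explicitly."
)

def run_step(llm, system, h_int, history):
    # Fill the placeholders with problem-specific information extracted
    # from the paper, prepend prior (verified) steps so the chain builds
    # on corrected intermediate results, then query the model.
    prompt = "\n\n".join(
        history + [STEP_TEMPLATE.substitute(system=system, h_int=h_int)])
    return llm(prompt)
```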


Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion

arXiv.org Artificial Intelligence

Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior conversations to enhance the relevance of predictions, particularly when the user data is small (~1000 samples). Specifically, we compare fine-tuning, prompt-tuning, and retrieval-augmented generation of expanded text suggestions for abbreviated inputs. Our case study with a deployed 8B parameter LLM on a real user living with ALS, together with experiments on movie character personalization, indicates that (1) customization may be necessary in some scenarios, and prompt-tuning generalizes well to them; (2) fine-tuning on in-domain data (with as few as 600 samples) still shows some gains; however, (3) retrieval-augmented few-shot selection also outperforms fine-tuning; and (4) parameter-efficient tuning allows for efficient and scalable personalization. For prompt-tuning, we also find that initializing the learned "soft prompts" to user-relevant concept tokens leads to higher accuracy than random initialization.
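
Initializing soft prompts from concept tokens just means copying rows of the embedding matrix rather than sampling noise. The sketch below assumes a NumPy embedding matrix and a tokenizer exposing `encode`; the paper's exact recipe (number of virtual tokens, padding scheme) may differ.

```python
import numpy as np

def init_soft_prompt(embedding_matrix, tokenizer, concept_text, n_virtual=8):
    # Initialize learnable "soft prompt" vectors from the embeddings of
    # user-relevant concept tokens instead of random noise. Only these
    # vectors are trained; the LLM's weights stay frozen.
    ids = tokenizer.encode(concept_text)[:n_virtual]
    prompt = embedding_matrix[ids].copy()
    if len(ids) < n_virtual:
        # Pad with small random rows if the concept text is too short
        # (padding scheme is an assumption).
        extra = 0.02 * np.random.randn(n_virtual - len(ids),
                                       embedding_matrix.shape[1])
        prompt = np.concatenate([prompt, extra], axis=0)
    return prompt  # shape: (n_virtual, d_model)
```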


Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments

arXiv.org Artificial Intelligence

Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking keyboards is important for improving the quality of life of such individuals. Recent advances in neural networks for natural language offer new opportunities to rethink strategies and user interfaces for enhanced text entry by AAC users. In this paper, we present SpeakFaster, which combines large language models (LLMs) with a co-designed user interface for text entry in a highly abbreviated form, saving 57% more motor actions than traditional predictive keyboards in offline simulation. A pilot study with 19 non-AAC participants typing on a mobile device by hand demonstrated gains in motor savings in line with the offline simulation, while introducing relatively small effects on overall typing speed. Lab and field testing with two eye-gaze typing users with amyotrophic lateral sclerosis (ALS) demonstrated text-entry rates 29-60% faster than traditional baselines, due to significant savings in expensive keystrokes achieved through phrase and word predictions from context-aware LLMs. These findings provide a strong foundation for further exploration of substantially accelerated text communication for motor-impaired users and demonstrate a direction for applying LLMs to text-based user interfaces.
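
The core loop is: the user types a short abbreviation, and an LLM proposes full-phrase expansions conditioned on the conversation so far. The sketch below uses word-initial abbreviations; the prompt wording and the `llm` callable are illustrative assumptions, not the deployed interface.

```python
def abbreviate(phrase):
    # Word-initial abbreviation: "would you like to go" -> "wyltg".
    # (The full UI supports richer refinement; this is the basic scheme.)
    return "".join(w[0] for w in phrase.lower().split())

def expand(llm, context, abbrev, n_options=5):
    # Ask an LLM for full-phrase candidates conditioned on conversation
    # context. Candidate selection then costs the user one keystroke
    # instead of typing the whole phrase.
    prompt = (f"Conversation so far:\n{context}\n"
              f"The user typed the abbreviation '{abbrev}', where each "
              f"letter is the first letter of a word. List {n_options} "
              "likely expansions, one per line.")
    return llm(prompt).splitlines()[:n_options]
```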


Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

arXiv.org Artificial Intelligence

Automatic Speech Recognition (ASR) in medical contexts has the potential to save time, cut costs, increase report accuracy, and reduce physician burnout. However, the healthcare industry has been slower to adopt this technology, in part due to the importance of avoiding medically relevant transcription mistakes. In this work, we present the Clinical BERTScore (CBERTScore), an ASR metric that penalizes clinically relevant mistakes more than others. We demonstrate that this metric aligns more closely with clinician preferences on medical sentences than other metrics (WER, BLEU, METEOR, etc.), sometimes by wide margins. We collect a benchmark of 18 clinician preferences on 149 realistic medical sentences, called the Clinician Transcript Preference benchmark (CTP), and make it publicly available for the community to further develop clinically aware ASR metrics. To our knowledge, this is the first public dataset of its kind, and on it CBERTScore more closely matches what clinicians prefer.
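
The idea of weighting clinically relevant errors more heavily can be illustrated with a toy word-overlap metric. This is only a stand-in: the real CBERTScore reweights BERTScore's token similarities, and the `w_clinical` weight below is an arbitrary assumption.

```python
def clinical_penalty_score(hyp, ref, clinical_terms, w_clinical=3.0):
    # Score a hypothesis transcript against a reference, where missing a
    # clinically relevant word costs w_clinical times as much as missing
    # an ordinary word. Returns a value in [0, 1], higher is better.
    h_words = set(hyp.lower().split())
    total = penalty = 0.0
    for w in set(ref.lower().split()):
        weight = w_clinical if w in clinical_terms else 1.0
        total += weight
        if w not in h_words:
            penalty += weight
    return 1.0 - penalty / total if total else 1.0
```

For example, with `clinical_terms={"metformin"}`, dropping "metformin" from a transcript lowers the score three times as much as dropping a filler word, which is the behavior the clinician-preference benchmark is meant to reward.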


TRILLsson: Distilled Universal Paralinguistic Speech Representations

arXiv.org Artificial Intelligence

Recent advances in self-supervision have dramatically improved the quality of speech representations. However, deployment of state-of-the-art embedding models on devices has been restricted due to their limited public availability and large resource footprint. Our work addresses these issues by publicly releasing a collection of paralinguistic speech models that are small and achieve near state-of-the-art performance. Our approach is based on knowledge distillation, and our models are distilled on public data only. We explore different architectures and thoroughly evaluate our models on the Non-Semantic Speech (NOSS) benchmark. Our largest distilled model is less than 15% of the size of the original model (314MB vs. 2.2GB), achieves over 96% of its accuracy on 6 of 7 tasks, and is trained on 6.5% of the data. The smallest model is 1% of the size (22MB) and achieves over 90% of the accuracy on 6 of 7 tasks. Our models outperform the open-source Wav2Vec 2.0 model on 6 of 7 tasks, and our smallest model outperforms open-source Wav2Vec 2.0 on both emotion recognition tasks despite being 7% of its size.
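
Representation distillation of this kind trains a small student to reproduce the frozen teacher's embeddings. The sketch below uses an MSE regression objective and assumes the student output has already been projected to the teacher's width; both choices are assumptions, not necessarily the paper's exact loss.

```python
import numpy as np

def distillation_loss(student_emb, teacher_emb):
    # Regress student embeddings onto the frozen teacher's: the core of
    # embedding-level knowledge distillation. Minimizing this pushes the
    # small model to mimic the large one's paralinguistic representation.
    return float(np.mean((student_emb - teacher_emb) ** 2))

# Toy usage: a batch of 4 teacher embeddings and a noisy student output.
teacher = np.random.randn(4, 1024)          # teacher targets
student = teacher + 0.1 * np.random.randn(4, 1024)  # student (post-projection)
print(distillation_loss(student, teacher))  # small value -> good mimicry
```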