
Collaborating Authors

Jain, Nihal


LibEvolutionEval: A Benchmark and Study for Version-Specific Code Generation

arXiv.org Artificial Intelligence

Recent advancements in code completion models have primarily focused on local file contexts. However, these studies do not fully capture the complexity of real-world software development, which often requires the use of rapidly evolving public libraries. To fill this gap, we introduce LibEvolutionEval, a detailed study that requires an understanding of library evolution to perform in-line code completion accurately. LibEvolutionEval provides a version-specific code-completion task comprising eight libraries (torch, torchvision, scipy, pil, tqdm, pyyaml, matplotlib, and pandas) as they evolve over the years, along with a detailed analysis of the evolution of two popular and well-maintained public libraries: PyTorch and Matplotlib. We evaluate popular public models and find that public library evolution significantly influences model performance. We explore mitigation methods by studying how retrieved version-specific library documentation and prompting can improve a model's ability to handle these fast-evolving packages, pointing to a promising path toward better handling of rapidly evolving libraries.
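A minimal sketch of what a version-specific in-line completion case and a documentation-augmented prompt might look like. The example construction, prompt format, and helper names are illustrative assumptions, not drawn from LibEvolutionEval itself (the matplotlib API change referenced in the comments is real, but its use here is our own example).

```python
# Hypothetical version-specific completion example: the correct completion
# depends on which release of the library is in use.
example = {
    "library": "matplotlib",
    "version": "3.6",
    "prefix": "import matplotlib.pyplot as plt\nfig = plt.figure()\nax = fig.",
    # Older releases commonly created 3D axes via fig.gca(projection='3d');
    # newer releases expect fig.add_subplot(projection='3d').
    "ground_truth": "add_subplot(projection='3d')",
}

def build_prompt(example, retrieved_doc=None):
    """Assemble a completion prompt, optionally prepending version-specific
    documentation retrieved for the target release (hypothetical format)."""
    parts = [f"# {example['library']} == {example['version']}"]
    if retrieved_doc:
        parts.append(f"# Retrieved docs: {retrieved_doc}")
    parts.append(example["prefix"])
    return "\n".join(parts)

print(build_prompt(example, retrieved_doc="Figure.add_subplot(projection=...)"))
```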


Approximately Aligned Decoding

arXiv.org Artificial Intelligence

It is common to reject undesired outputs of Large Language Models (LLMs); however, current methods to do so require an excessive amount of computation or severely distort the distribution of outputs. We present a method that balances distortion of the output distribution with computational efficiency, allowing the generation of long sequences of text under difficult-to-satisfy constraints, with less amplification of low-probability outputs than existing methods. We show through a series of experiments that the task-specific performance of our method is comparable to that of methods that do not distort the output distribution, while being much more computationally efficient.

Language models sometimes generate undesirable outputs, such as syntactically incorrect code, hallucinated PII, or profanity. These conditions, which we collectively refer to as errors for the remainder of the paper, can be detected with incremental parsers, regular expression matching, or even simple substring searches. Once an error is detected, however, there are several competing methods for mitigating it. One set of methods, constrained generation (Beurer-Kellner et al., 2024; Geng et al., 2024; Melcer et al., 2024), avoids errors by disabling the generation of any token that immediately leads to such an error. While effective, this approach can lead to the amplification of low-probability outputs. Another class of methods avoids errors without any amplification of low-probability outputs, at the cost of additional computation. Rejection sampling is the simplest such method: if the output contains an error, simply generate another sample until the output is acceptable. Adaptive Sampling with Approximate Expected Futures (ASAp) (Park et al., 2024) improves on rejection sampling while maintaining the output distribution by effectively sampling without replacement, but there are still many situations in which it may converge too slowly. In our experiments, we show that our method obtains task-specific performance on par with ASAp while converging significantly faster when constraints are difficult to satisfy. We first describe autoregressive language models and their properties.
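A hedged sketch contrasting the two baseline strategies discussed above: constrained generation (masking tokens whose addition immediately causes an error) and rejection sampling (resampling whole outputs until one passes the checker). The model and checker interfaces below are hypothetical placeholders, not the paper's implementation.

```python
import random

def constrained_generate(step_probs, is_prefix_ok, max_len=50):
    """Constrained generation: at each step, drop tokens whose addition makes
    the prefix invalid, renormalize, and sample. This never emits an error,
    but it can amplify low-probability continuations."""
    out = []
    for _ in range(max_len):
        probs = step_probs(out)  # dict: token -> probability under the LM
        allowed = {t: p for t, p in probs.items() if is_prefix_ok(out + [t])}
        if not allowed:
            break
        r, acc = random.random() * sum(allowed.values()), 0.0
        for token, p in allowed.items():
            acc += p
            if r <= acc:
                out.append(token)
                break
    return out

def rejection_sample(generate, is_output_ok, max_tries=100):
    """Rejection sampling: preserves the conditional output distribution, but
    may need many attempts when valid outputs are rare."""
    for _ in range(max_tries):
        candidate = generate()
        if is_output_ok(candidate):
            return candidate
    return None
```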


On Mitigating Code LLM Hallucinations with API Documentation

arXiv.org Artificial Intelligence

In this study, we address the issue of API hallucinations in various software engineering contexts. We introduce CloudAPIBench, a new benchmark designed to measure the occurrence of API hallucinations. CloudAPIBench also provides annotations for the frequency of each API's occurrence in the public domain, allowing us to study API hallucinations at various frequency levels. Our findings reveal that Code LLMs struggle with low-frequency APIs: e.g., GPT-4o achieves only 38.58% valid low-frequency API invocations. We demonstrate that Documentation Augmented Generation (DAG) significantly improves performance for low-frequency APIs (rising to 47.94% with DAG) but negatively impacts high-frequency APIs when using sub-optimal retrievers (a 39.02% absolute drop). To mitigate this, we propose to trigger DAG selectively: we check a proposed API against an API index, or leverage the Code LLM's confidence scores, and retrieve documentation only when needed. We demonstrate that our proposed methods improve the balance between low- and high-frequency API performance, resulting in more reliable API invocations (an 8.20% absolute improvement on CloudAPIBench for GPT-4o).
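A minimal sketch of the selective-triggering idea described above: consult an API index and/or the model's confidence before paying the cost (and risk) of documentation-augmented generation. The index contents, threshold, and function names are illustrative assumptions, not CloudAPIBench's implementation.

```python
def should_retrieve_docs(api_name, known_api_index, token_confidence, threshold=0.5):
    """Decide whether to augment the prompt with API documentation.

    Retrieval is triggered when the proposed API is absent from a local index
    of valid APIs, or when the model's confidence in the generated API tokens
    is low. Both signals are stand-ins for the strategies in the abstract."""
    if api_name not in known_api_index:
        return True                      # likely hallucinated or low-frequency API
    return token_confidence < threshold  # valid name, but the model is unsure

# Illustrative usage with made-up values:
index = {"s3.create_bucket", "ec2.run_instances"}
print(should_retrieve_docs("s3.create_bucket", index, token_confidence=0.92))  # False
print(should_retrieve_docs("s3.creat_bucket", index, token_confidence=0.92))   # True
```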


CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

arXiv.org Artificial Intelligence

Code completion models have made significant progress in recent years, yet current popular evaluation datasets, such as HumanEval and MBPP, predominantly focus on code completion tasks within a single file. This over-simplified setting falls short of representing real-world software development, where repositories span multiple files with numerous cross-file dependencies, and accessing and understanding cross-file context is often required to complete the code correctly. To fill this gap, we propose CrossCodeEval, a diverse and multilingual code completion benchmark that necessitates an in-depth cross-file contextual understanding to complete the code accurately. CrossCodeEval is built on a diverse set of real-world, open-sourced, permissively licensed repositories in four popular programming languages: Python, Java, TypeScript, and C#. To create examples that strictly require cross-file context for accurate completion, we propose a straightforward yet efficient static-analysis-based approach to pinpoint the use of cross-file context within the current file. Extensive experiments on state-of-the-art code language models like CodeGen and StarCoder demonstrate that CrossCodeEval is extremely challenging when the relevant cross-file context is absent, and we see clear improvements when adding this context to the prompt. Even with the highest-performing model, however, performance remains far from saturated, indicating that CrossCodeEval can also assess a model's capability to leverage extensive context for better code completion. Finally, we benchmark various methods of retrieving cross-file context and show that CrossCodeEval can also be used to measure the capability of code retrievers.
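A hedged sketch of the kind of lightweight static analysis the abstract alludes to: walking a Python file's AST to find uses of names imported from elsewhere in the repository, i.e., places where completing the code correctly requires cross-file context. The exact heuristics used by CrossCodeEval are not reproduced here; this is a simplified stand-in.

```python
import ast

def cross_file_usage_lines(source):
    """Return line numbers where an imported (cross-file) name is used.

    Simplification: any use of a name brought in via `import` or
    `from ... import` is treated as requiring context defined outside
    the current file."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.asname or alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imported.update(alias.asname or alias.name for alias in node.names)
    lines = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in imported:
            lines.add(node.lineno)
    return sorted(lines)

print(cross_file_usage_lines("from utils import helper\n\nresult = helper(42)\n"))  # [3]
```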


ContraCLM: Contrastive Learning For Causal Language Model

arXiv.org Artificial Intelligence

Despite exciting progress in causal language models, the expressiveness of their representations is largely limited by poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework that operates at both the token and sequence levels. We assess ContraCLM on a variety of downstream tasks and show that it enhances the discrimination of the representations and bridges the gap with encoder-only models, making causal language models better suited for tasks beyond language generation. Specifically, we attain a 44% relative improvement on Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts source code generation capability, with a 9% relative improvement in execution accuracy on the HumanEval benchmark.
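A hedged sketch of a standard sequence-level contrastive (InfoNCE-style) objective of the kind the abstract describes, where two views of the same sequence are pulled together and other sequences in the batch are pushed apart. This is a generic formulation under our own assumptions, not ContraCLM's exact loss.

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(z1, z2, temperature=0.05):
    """InfoNCE-style loss over sequence representations.

    z1, z2: (batch, dim) embeddings of two views of the same sequences
    (e.g., two forward passes with dropout). Matching rows are positives;
    all other rows in the batch act as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Illustrative usage with random embeddings:
z1, z2 = torch.randn(8, 256), torch.randn(8, 256)
print(sequence_contrastive_loss(z1, z2).item())
```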


MultiViz: Towards Visualizing and Understanding Multimodal Models

arXiv.org Artificial Intelligence

The promise of multimodal models for real-world applications has inspired research in visualizing and understanding their internal mechanics, with the end goal of empowering stakeholders to visualize model behavior, perform model debugging, and promote trust in machine learning models. However, modern multimodal models are typically black-box neural networks, which makes their internal workings challenging to understand. How can we visualize the internal modeling of multimodal interactions in these models? Our paper aims to fill this gap by proposing MultiViz, a method for analyzing the behavior of multimodal models by scaffolding the problem of interpretability into four stages: (1) unimodal importance: how each modality contributes to downstream modeling and prediction; (2) cross-modal interactions: how different modalities relate to each other; (3) multimodal representations: how unimodal and cross-modal interactions are represented in decision-level features; and (4) multimodal prediction: how decision-level features are composed to make a prediction. MultiViz is designed to operate on diverse modalities, models, tasks, and research areas. Through experiments on 8 trained models across 6 real-world tasks, we show that the complementary stages in MultiViz together enable users to (1) simulate model predictions, (2) assign interpretable concepts to features, (3) perform error analysis on model misclassifications, and (4) use insights from error analysis to debug models. MultiViz is publicly available, will be regularly updated with new interpretation tools and metrics, and welcomes input from the community.
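A hedged sketch of the first stage (unimodal importance) using a simple leave-one-modality-out probe: how much does the prediction change when one modality's input is ablated? MultiViz itself supports richer attribution methods; the model interface and ablation scheme below are hypothetical placeholders.

```python
def unimodal_importance(predict, inputs, baseline=None):
    """Estimate each modality's contribution by ablating it.

    predict:  callable mapping {modality: features} -> scalar score
    inputs:   dict of modality name -> features
    baseline: value substituted for an ablated modality (here simply None)."""
    full = predict(inputs)
    scores = {}
    for m in inputs:
        ablated = dict(inputs)
        ablated[m] = baseline
        scores[m] = full - predict(ablated)  # drop in score when modality m is removed
    return scores

# Toy usage: a "model" that weighs text twice as much as image.
toy = lambda x: 2.0 * (x["text"] or 0) + 1.0 * (x["image"] or 0)
print(unimodal_importance(toy, {"text": 1.0, "image": 1.0}))  # {'text': 2.0, 'image': 1.0}
```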


Self-supervised Multi-view Disentanglement for Expansion of Visual Collections

arXiv.org Artificial Intelligence

Image search engines enable the retrieval of images relevant to a query image. In this work, we consider the setting where the query for similar images is derived from a collection of images. For visual search, similarity may be measured along multiple axes, or views, such as style and color. We assume access to a set of feature extractors, each of which computes representations for a specific view. Our objective is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views. To this end, we propose a self-supervised learning method for extracting disentangled, view-specific representations of images such that inter-view overlap is minimized. We show how this allows us to compute the intent of a collection as a distribution over views, and how effective retrieval can be performed by prioritizing candidate expansion images that match the intent of the query collection. Finally, we present a new querying mechanism for image search, enabled by composing multiple collections, and perform retrieval in this setting using the techniques presented in the paper.
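A hedged sketch of the retrieval scoring idea described above: estimate the query collection's intent as a distribution over views (here, from how coherent the collection is under each view's representation) and rank candidate expansion images by intent-weighted similarity. The features, coherence measure, and weighting scheme are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def collection_intent(collection_feats):
    """Infer a distribution over views from a query collection.

    collection_feats: dict view -> (n_images, dim) array of view-specific features.
    A view gets more weight when the collection is more coherent (higher mean
    pairwise cosine similarity) under that view; a softmax turns coherence
    scores into a distribution."""
    views, coherence = list(collection_feats), []
    for view in views:
        f = collection_feats[view]
        f = f / np.linalg.norm(f, axis=1, keepdims=True)
        sims, n = f @ f.T, len(f)
        coherence.append((sims.sum() - n) / (n * (n - 1)))  # mean off-diagonal similarity
    weights = np.exp(np.array(coherence))
    weights /= weights.sum()
    return dict(zip(views, weights))

def score_candidate(candidate_feats, collection_feats, intent):
    """Intent-weighted similarity of a candidate image to the collection."""
    score = 0.0
    for view, weight in intent.items():
        f = collection_feats[view]
        f = f / np.linalg.norm(f, axis=1, keepdims=True)
        c = candidate_feats[view] / np.linalg.norm(candidate_feats[view])
        score += weight * float((f @ c).mean())
    return score

# Toy usage with random features for two views:
rng = np.random.default_rng(0)
coll = {"style": rng.normal(size=(5, 16)), "color": rng.normal(size=(5, 16))}
cand = {"style": rng.normal(size=16), "color": rng.normal(size=16)}
intent = collection_intent(coll)
print(intent, score_candidate(cand, coll, intent))
```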