AITopics

2407.09283

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
North America > United States > Florida > Escambia County > Pensacola (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Palacio, David N., Rodriguez-Cardenas, Daniel, Velasco, Alejandro, Khati, Dipin, Moran, Kevin, Poshyvanyk, Denys

Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations

arXiv.org Artificial IntelligenceJul-12-2024

Trustworthiness and interpretability are inextricably linked concepts for LLMs. The more interpretable an LLM is, the more trustworthy it becomes. However, current techniques for interpreting LLMs when applied to code-related tasks largely focus on accuracy measurements, measures of how models react to change, or individual task performance instead of the fine-grained explanations needed at prediction time for greater interpretability, and hence trust. To improve upon this status quo, this paper introduces ASTrust, an interpretability method for LLMs of code that generates explanations grounded in the relationship between model confidence and syntactic structures of programming languages. ASTrust explains generated code in the context of syntax categories based on Abstract Syntax Trees and aids practitioners in understanding model predictions at both local (individual code snippets) and global (larger datasets of code) levels. By distributing and assigning model confidence scores to well-known syntactic structures that exist within ASTs, our approach moves beyond prior techniques that perform token-level confidence mapping by offering a view of model confidence that directly aligns with programming language concepts with which developers are familiar. To put ASTrust into practice, we developed an automated visualization that illustrates the aggregated model confidence scores superimposed on sequence, heat-map, and graph-based visuals of syntactic structures from ASTs. We examine both the practical benefit that ASTrust can provide through a data science study on 12 popular LLMs on a curated set of GitHub repos and the usefulness of ASTrust through a human study.

category, explanation, llm, (16 more...)

2407.08983

Country:

North America > United States > Florida > Orange County > Orlando (0.14)
North America > United States > Virginia > Williamsburg (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine (1.00)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-11-2024

Automata-based constraints for language model decoding

Koo, Terry, Liu, Frederick, He, Luheng

Language models (LMs) are often expected to generate strings in some formal language; for example, structured data, API calls, or code snippets. Although LMs can be tuned to improve their adherence to formal syntax, this does not guarantee conformance, especially with smaller LMs suitable for large-scale deployment. In addition, tuning requires significant resources, making it impractical for uncommon or task-specific formats. To prevent downstream parsing errors we would ideally constrain the LM to only produce valid output, but this is severely complicated by tokenization, which is typically both ambiguous and misaligned with the formal grammar. We solve these issues through the application of automata theory, deriving an efficient closed-form solution for the regular languages, a broad class of formal languages with many practical applications, including API calls or schema-guided JSON and YAML. We also discuss pragmatic extensions for coping with the issue of high branching factor. Finally, we extend our techniques to deterministic context-free languages, which similarly admit an efficient closed-form solution. In spite of its flexibility and representative power, our approach only requires access to per-token decoding logits and lowers into simple calculations that are independent of LM size, making it both efficient and easy to apply to almost any LM architecture.

constraint, fsa, regular expression, (16 more...)

2407.08103

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > Dominican Republic (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.75)

Majdinasab, Vahid, Nikanjam, Amin, Khomh, Foutse

DeepCodeProbe: Towards Understanding What Models Trained on Code Learn

arXiv.org Artificial IntelligenceJul-11-2024

Machine learning models trained on code and related artifacts offer valuable support for software maintenance but suffer from interpretability issues due to their complex internal variables. These concerns are particularly significant in safety-critical applications where the models' decision-making processes must be reliable. The specific features and representations learned by these models remain unclear, adding to the hesitancy in adopting them widely. To address these challenges, we introduce DeepCodeProbe, a probing approach that examines the syntax and representation learning abilities of ML models designed for software maintenance tasks. Our study applies DeepCodeProbe to state-of-the-art models for code clone detection, code summarization, and comment generation. Findings reveal that while small models capture abstract syntactic representations, their ability to fully grasp programming language syntax is limited. Increasing model capacity improves syntax learning but introduces trade-offs such as increased training time and overfitting. DeepCodeProbe also identifies specific code patterns the models learn from their training data. Additionally, we provide best practices for training models on code to enhance performance and interpretability, supported by an open-source replication package for broader application of DeepCodeProbe in interpreting other code-related models.

programming language, representation, syntax, (14 more...)

2407.0889

Country:

North America > Canada > Quebec > Montreal (0.05)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Doumen, Jonas, Schmalz, Veronica Juliana, Beuls, Katrien, Van Eecke, Paul

The Computational Learning of Construction Grammars: State of the Art and Prospective Roadmap

arXiv.org Artificial IntelligenceJul-10-2024

This paper documents and reviews the state of the art concerning computational models of construction grammar learning. It brings together prior work on the computational learning of form-meaning pairings, which has so far been studied in several distinct areas of research. The goal of this paper is threefold. First of all, it aims to synthesise the variety of methodologies that have been proposed to date and the results that have been obtained. Second, it aims to identify those parts of the challenge that have been successfully tackled and reveal those that require further research. Finally, it aims to provide a roadmap which can help to boost and streamline future research efforts on the computational learning of large-scale, usage-based construction grammars.

construction, construction grammar, grammar, (13 more...)

2407.07606

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > District of Columbia > Washington (0.04)
(13 more...)

Genre: Overview (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

arXiv.org Artificial IntelligenceJul-10-2024

Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Zhou, Jinxing, Guo, Dan, Mao, Yuxin, Zhong, Yiran, Chang, Xiaojun, Wang, Meng

Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improving the early audio-visual encoders to embed more effective features, the decoding phase -- crucial for final event classification, often receives less attention. We aim to advance the decoding phase and improve its interpretability. Specifically, we introduce a new decoding paradigm, \underline{l}abel s\underline{e}m\underline{a}ntic-based \underline{p}rojection (LEAP), that employs labels texts of event categories, each bearing distinct and explicit semantics, for parsing potentially overlapping events.LEAP works by iteratively projecting encoded latent features of audio/visual segments onto semantically independent label embeddings. This process, enriched by modeling cross-modal (audio/visual-label) interactions, gradually disentangles event semantics within video segments to refine relevant label embeddings, guaranteeing a more discriminative and interpretable decoding process. To facilitate the LEAP paradigm, we propose a semantic-aware optimization strategy, which includes a novel audio-visual semantic similarity loss function. This function leverages the Intersection over Union of audio and visual events (EIoU) as a novel metric to calibrate audio-visual similarities at the feature level, accommodating the varied event densities across modalities. Extensive experiments demonstrate the superiority of our method, achieving new state-of-the-art performance for AVVP and also enhancing the relevant audio-visual event localization task.

audio-visual encoder, audio-visual video, label-anticipated event disentanglement, (13 more...)

2407.08126

Country:

Asia > China > Anhui Province > Hefei (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)

Bhatt, Krishnakant, J, Karthika N, Ramakrishnan, Ganesh, Jyothi, Preethi

CharSS: Character-Level Transformer Model for Sanskrit Word Segmentation

Subword tokens in Indian languages inherently carry meaning, and isolating them can enhance NLP tasks, making sub-word segmentation a crucial process. Segmenting Sanskrit and other Indian languages into subtokens is not straightforward, as it may include sandhi, which may lead to changes in the word boundaries. We propose a new approach of utilizing a Character-level Transformer model for Sanskrit Word Segmentation (CharSS). We perform experiments on three benchmark datasets to compare the performance of our method against existing methods. On the UoH+SandhiKosh dataset, our method outperforms the current state-of-the-art system by an absolute gain of 6.72 points in split prediction accuracy. On the hackathon dataset, our method achieves a gain of 2.27 points over the current SOTA system in terms of perfect match metric. We also propose a use-case of Sanskrit-based segments for a linguistically informed translation of technical terms to lexically similar low-resource Indian languages. In two separate experimental settings for this task, we achieve an average improvement of 8.46 and 6.79 chrF++ scores, respectively.

artificial intelligence, machine learning, natural language, (18 more...)

2407.06331

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

FuncEvalGMN: Evaluating Functional Correctness of SQL via Graph Matching Network

Zhan, Yi, Sun, Yang, Weng, Han, Cui, Longjie, Wang, Guifeng, Xie, Jiajun, Tian, Yu, Yin, Xiaoming, Liu, Boyi, Huang, Dongchi

In this paper, we propose a novel graph-based methodology to evaluate the functional correctness of SQL generation. Conventional metrics for assessing SQL code generation, such as matching-based and execution-based methods (e.g., exact set match and execution accuracy), are subject to two primary limitations. Firstly, the former fails to effectively assess functional correctness, as different SQL queries may possess identical functionalities. Secondly, the latter is susceptible to producing false positive samples in evaluations. Our proposed evaluation method, \texttt{FuncEvalGMN}, does not depend on the sufficient preparation of the test data, and it enables precise testing of the functional correctness of the code. Firstly, we parse SQL using a relational operator tree (ROT) called \textit{Relnode}, which contains rich semantic information from the perspective of logical execution.Then, we introduce a GNN-based approach for predicting the functional correctness of generated SQL. This approach incorporates global positional embeddings to address the limitations with the loss of topological information in conventional graph matching frameworks. As an auxiliary contribution, we propose a rule-based matching algorithm, Relnode Partial Matching (\texttt{RelPM}) as a baseline. Finally, we contribute a dataset, \texttt{Pair-Aug-Spider} with a training set and two testing sets, each comprising pairs of SQL codes to simulate various SQL code evaluation scenarios. The training set and one testing dataset focus on code generation using large language models (LLMs), while the other emphasizes SQL equivalence rewriting.

dataset, graph, node, (16 more...)

2407.1453

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Nakos, Constantine, Forbus, Kenneth D.

Interactively Diagnosing Errors in a Semantic Parser

Hand-curated natural language systems provide an inspectable, correctable alternative to language systems based on machine learning, but maintaining them requires considerable effort and expertise. Interactive Natural Language Debugging (INLD) aims to lessen this burden by casting debugging as a reasoning problem, asking the user a series of questions to diagnose and correct errors in the system's knowledge. In this paper, we present work in progress on an interactive error diagnosis system for the CNLU semantic parser. We show how the first two stages of the INLD pipeline (symptom identification and error localization) can be cast as a model-based diagnosis problem, demonstrate our system's ability to diagnose semantic errors on synthetic examples, and discuss design challenges and frontiers for future work.

assumption, interpretation, knowledge, (16 more...)

2407.064

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Srivastava, Nikit, Ma, Mengshi, Vollmers, Daniel, Zahera, Hamada, Moussallem, Diego, Ngomo, Axel-Cyrille Ngonga

MST5 -- Multilingual Question Answering over Knowledge Graphs

Knowledge Graph Question Answering (KGQA) simplifies querying vast amounts of knowledge stored in a graph-based model using natural language. However, the research has largely concentrated on English, putting non-English speakers at a disadvantage. Meanwhile, existing multilingual KGQA systems face challenges in achieving performance comparable to English systems, highlighting the difficulty of generating SPARQL queries from diverse languages. In this research, we propose a simplified approach to enhance multilingual KGQA systems by incorporating linguistic context and entity information directly into the processing pipeline of a language model. Unlike existing methods that rely on separate encoders for integrating auxiliary information, our strategy leverages a single, pretrained multilingual transformer-based language model to manage both the primary input and the auxiliary data. Our methodology significantly improves the language model's ability to accurately convert a natural language query into a relevant SPARQL query. It demonstrates promising results on the most recent QALD datasets, namely QALD-9-Plus and QALD-10. Furthermore, we introduce and evaluate our approach on Chinese and Japanese, thereby expanding the language diversity of the existing datasets.

dataset, qald-9-plus, sparql query, (12 more...)

2407.06041

Country:

Asia > Singapore (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)