 Problem Solving


A Survey on Machine Learning Techniques for Source Code Analysis

arXiv.org Artificial Intelligence

The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. Such a large number of studies hinders the community from understanding the current research landscape. This paper aims to summarize the current knowledge in applied machine learning for source code analysis. We review studies belonging to twelve categories of software engineering tasks and corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we conducted an extensive literature search and identified 479 primary studies published between 2011 and 2021. We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task and summarize machine learning techniques employed. We identify a comprehensive list of available datasets and tools useable in this context. Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources.


Bridging between LegalRuleML and TPTP for Automated Normative Reasoning (extended version)

arXiv.org Artificial Intelligence

LegalRuleML is a comprehensive XML-based representation framework for modeling and exchanging normative rules. The TPTP input and output formats, on the other hand, are general-purpose standards for the interaction with automated reasoning systems. In this paper we provide a bridge between the two communities by (i) defining a logic-pluralistic normative reasoning language based on the TPTP format, (ii) providing a translation scheme between relevant fragments of LegalRuleML and this language, and (iii) proposing a flexible architecture for automated normative reasoning based on this translation. We exemplarily instantiate and demonstrate the approach with three different normative logics.


Analysing the Predictivity of Features to Characterise the Search Space

arXiv.org Artificial Intelligence

Exploring search spaces is one of the most unpredictable challenges that has attracted the interest of researchers for decades. One way to handle unpredictability is to characterise the search spaces and take actions accordingly. A well-characterised search space can assist in mapping the problem states to a set of operators for generating new problem states. In this paper, a landscape analysis-based set of features has been analysed using the most renowned machine learning approaches to determine the optimal feature set. However, in order to deal with problem complexity and induce commonality for transferring experience across domains, the selection of the most representative features remains crucial. The proposed approach analyses the predictivity of a set of features in order to determine the best categorization.


Survey on Applications of Neurosymbolic Artificial Intelligence

arXiv.org Artificial Intelligence

In recent years, the Neurosymbolic framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance. This success is due to its stellar performance combined with attractive properties, such as learning and reasoning. The new emerging Neurosymbolic field is currently experiencing a renaissance, as novel frameworks and algorithms motivated by various practical applications are being introduced, building on top of the classical neural and reasoning problem setting. This article aims to provide a comprehensive review of significant recent developments in real-world applications of Neurosymbolic Artificial Intelligence. Specifically, we introduce a taxonomy of common Neurosymbolic applications and summarize the state-of-the-art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this burgeoning field.


A Survey of Neural Trees

arXiv.org Artificial Intelligence

Neural networks (NNs) and decision trees (DTs) are both popular machine learning models, yet they come with mutually exclusive advantages and limitations. To bring together the best of the two worlds, a variety of approaches have been proposed to integrate NNs and DTs explicitly or implicitly. In this survey, these approaches are grouped into a school that we term neural trees (NTs). This survey aims to present a comprehensive review of NTs and attempts to identify how they enhance model interpretability. We first propose a thorough taxonomy of NTs that expresses the gradual integration and co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their interpretability and performance, and suggest possible solutions to the remaining challenges. Finally, this survey concludes with a discussion of other considerations, such as conditional computation, and of promising directions for this field. A list of papers reviewed in this survey, along with their corresponding code, is available at: https://github.com/zju-vipa/awesome-neural-trees


How to use Binary Search Trees part 1 (Data Structures and Algorithms)

#artificialintelligence

Abstract: This paper presents a parallel solution based on the coarse-grained multicomputer (CGM) model using the four-splitting technique to solve the optimal binary search tree problem. The well-known sequential algorithm of Knuth solves this problem in O(n^2) time and space, where n is the number of keys used to build the optimal binary search tree. To parallelize this algorithm on the CGM model, the irregular partitioning technique, which consists of subdividing the dependency graph into subgraphs (or blocks) of variable size, has been proposed to tackle the trade-off between minimizing the number of communication rounds and balancing the load of processors. This technique, however, induces a high latency time of processors (which accounts for most of the global communication time), because varying the blocks' sizes does not enable processors to start evaluating some blocks as soon as the data they need are available. The four-splitting technique proposed in this paper solves this shortcoming by evaluating a block as a sequence of computation and communication steps of four subblocks.
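For readers unfamiliar with the sequential baseline, the following is a minimal Python sketch of the dynamic program with Knuth's root-monotonicity speedup, which is what gives the O(n^2) bound mentioned above. It is a simplification made for illustration: it only considers access frequencies of successful searches (no dummy keys for unsuccessful searches), and the function name `optimal_bst_cost` is a name chosen here, not one from the paper. It does not attempt to show the CGM parallelization or the four-splitting technique.

```python
def optimal_bst_cost(freq):
    """Minimum weighted search cost of a BST over keys with the given
    access frequencies (keys are assumed to be already sorted)."""
    n = len(freq)

    # prefix[j] - prefix[i] is the total frequency of keys i..j-1.
    prefix = [0] * (n + 1)
    for i, f in enumerate(freq):
        prefix[i + 1] = prefix[i] + f

    # cost[i][j]: optimal cost for keys i..j-1 (half-open range).
    # root[i][j]: index of the (smallest) optimal root for that range.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        cost[i][i + 1] = freq[i]
        root[i][i + 1] = i

    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            weight = prefix[j] - prefix[i]
            best, best_r = None, None
            # Knuth's observation: the optimal root lies between the
            # optimal roots of the two overlapping shorter ranges.
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                c = cost[i][r] + cost[r + 1][j] + weight
                if best is None or c < best:
                    best, best_r = c, r
            cost[i][j] = best
            root[i][j] = best_r

    return cost[0][n]


if __name__ == "__main__":
    print(optimal_bst_cost([34, 8, 50]))
```

Restricting the root candidates to the range between `root[i][j - 1]` and `root[i + 1][j]` is what brings the total work down from cubic to quadratic; the table itself accounts for the quadratic space.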


Semi-supervised Training for Knowledge Base Graph Self-attention Networks on Link Prediction

arXiv.org Artificial Intelligence

The task of link prediction aims to solve the problem of incomplete knowledge caused by the difficulty of collecting facts from the real world. GCN-based models are widely applied to link prediction problems because of their sophistication, but they suffer from two problems in their structure and training process: 1) the transformation methods of GCN layers become increasingly complex in GCN-based knowledge representation models; 2) because the knowledge graph collection process is incomplete, the labeled negative samples contain many uncollected true facts. Therefore, this paper investigates the characteristics of the information aggregation coefficient (self-attention) of adjacent nodes and redesigns the self-attention mechanism of the GAT structure. Meanwhile, inspired by human thinking habits, we designed a semi-supervised self-training method over pre-trained models. Experimental results on the benchmark datasets FB15k-237 and WN18RR show that our proposed self-attention mechanism and semi-supervised self-training method can effectively improve the performance of the link prediction task. On FB15k-237, for example, the proposed method improves Hits@1 by about 30%.
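As a reminder of what the Hits@1 figure measures, here is a minimal sketch of the standard Hits@k metric for link prediction: the fraction of test queries whose correct entity is ranked within the top k candidates. The function name `hits_at_k` and the sample ranks are illustrative assumptions, not data or code from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of test queries whose true entity has rank <= k.

    `ranks` holds the 1-based rank assigned to the correct entity
    for each test triple.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Hypothetical ranks for four test triples, for illustration only.
    example_ranks = [1, 3, 2, 10]
    print(hits_at_k(example_ranks, 1))
    print(hits_at_k(example_ranks, 3))
```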


Faithful Reasoning Using Large Language Models

arXiv.org Artificial Intelligence

Although contemporary large language models (LMs) demonstrate impressive question-answering capabilities, their answers are typically the product of a single call to the model. This entails an unwelcome degree of opacity and compromises performance, especially on problems that are inherently multi-step. To address these limitations, we show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem. Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs, one for selection and one for inference, to produce a valid reasoning trace. Our method carries out a beam search through the space of reasoning traces to improve reasoning quality. We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy, and generates humanly interpretable reasoning traces whose validity can be checked by the user.
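The control flow described above, alternating a selection step and an inference step while keeping only the best few partial traces, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: `select`, `infer`, and `score` are hypothetical callables standing in for the two fine-tuned LMs and for whatever trace-scoring the method uses, and the rule-matching stubs at the bottom exist only to make the sketch runnable.

```python
def beam_search_traces(context, select, infer, score,
                       beam_width=2, max_steps=3):
    """Beam search over reasoning traces (tuples of inferred facts).

    select(known) -> iterable of selections drawn from known statements
    infer(selection) -> one new statement derived from that selection
    score(trace) -> number, higher is better
    All three are placeholders for model calls (assumption).
    """
    beams = [()]
    for _ in range(max_steps):
        candidates = []
        for trace in beams:
            known = list(context) + list(trace)
            for selection in select(known):
                new_fact = infer(selection)
                if new_fact in known:
                    continue  # skip steps that add nothing new
                candidates.append(trace + (new_fact,))
        if not candidates:
            break
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy stand-ins for the two models: rules are strings "x -> y".
def toy_select(known):
    for statement in known:
        if " -> " in statement:
            premise = statement.split(" -> ")[0]
            if premise in known:
                yield (premise, statement)


def toy_infer(selection):
    _premise, rule = selection
    return rule.split(" -> ")[1]


if __name__ == "__main__":
    facts = ["a", "a -> b", "b -> c"]
    print(beam_search_traces(facts, toy_select, toy_infer, score=len))
```

Because each inference step only sees the statements picked by the selection step, every entry in a returned trace can be traced back to the statements it was derived from, which is the property the abstract refers to as faithfulness.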


Nayland Blake, the Art-Problem Solver, Will See You Now

The New Yorker

The multidisciplinary artist Nayland Blake was once a child gazing in wonder at Alexander Calder's "circus" in the lobby of the Whitney Museum. "Who's the circus now?" Blake said the other day, some fifty years later, gesturing around a conference room across from the museum's education center. Blake, bearish, Merlin-bearded, soft-spoken in the manner of a blacksmith teaching kindergartners, was preparing for a session in their performance series, "Got an Art Problem?," part of this year's Whitney Biennial. In June, Blake threw a "Gender Discard Party" in the museum's lobby, to which guests were invited to "bring your own baggage" and dance away the woes of classification, in view of the artist's "Rear Entry," a reproduction of the door to the Mineshaft, the former gay club in the meatpacking district. Blake thought the curators would never go for "Rear Entry."


4 AI research trends everyone is (or will be) talking about

#artificialintelligence

Using AI in the real world remains challenging in many ways. Organizations are struggling to attract and retain talent, build and deploy AI models, define and apply responsible AI practices, and understand and prepare for regulatory framework compliance. At the same time, the DeepMinds, Googles and Metas of the world are pushing ahead with their AI research.