D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
Zheng, Yunhui, Pujar, Saurabh, Lewis, Burn, Buratti, Luca, Epstein, Edward, Yang, Bo, Laredo, Jim, Morari, Alessandro, Su, Zhong
Static analysis tools are widely used for vulnerability detection because they can analyze programs with complex behavior and millions of lines of code. Despite their popularity, static analysis tools are known to generate an excess of false positives. The recent ability of machine learning models to understand programming languages opens new possibilities when applied to static analysis. However, existing datasets for training vulnerability-identification models suffer from multiple limitations, such as limited bug context, limited size, and synthetic, unrealistic source code. We propose D2A, a differential-analysis-based approach to label issues reported by static analysis tools. The D2A dataset is built by analyzing version pairs from multiple open-source projects. From each project, we select bug-fixing commits and run static analysis on the versions before and after each commit. If an issue detected in a before-commit version disappears in the corresponding after-commit version, it is very likely a real bug that was fixed by the commit. We use D2A to generate a large labeled dataset to train models for vulnerability identification. We show that the dataset can be used to build a classifier that identifies likely false alarms among the issues reported by static analysis, helping developers prioritize and investigate potential true positives first.
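The differential-labeling heuristic described in the abstract can be sketched in a few lines. This is an illustrative simplification, not the paper's pipeline: the `Issue` fields below are hypothetical, and the real D2A implementation matches static-analyzer reports across versions using richer information than shown here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Issue:
    """Hypothetical, simplified representation of a static-analysis report."""
    bug_type: str   # e.g. "BUFFER_OVERRUN"
    function: str   # function in which the issue is reported

def label_issues(before, after):
    """Label issues found in the pre-commit version.

    An issue that disappears after a bug-fixing commit is labeled as a
    likely true positive (True); one that persists is a likely false
    alarm (False).
    """
    after_set = set(after)
    return {issue: issue not in after_set for issue in before}

# Toy example: one issue vanishes after the fix, one persists.
before = [Issue("BUFFER_OVERRUN", "parse_header"),
          Issue("NULL_DEREFERENCE", "init_state")]
after = [Issue("NULL_DEREFERENCE", "init_state")]

labels = label_issues(before, after)
```

Here the `BUFFER_OVERRUN` report disappears in the after-commit version, so it is labeled a likely real bug, while the persisting `NULL_DEREFERENCE` report is labeled a likely false alarm.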
Analyzing and Predicting Not-Answered Questions in Community-based Question Answering Services
Yang, Lichun (Shanghai Jiao Tong University) | Bao, Shenghua (IBM Research China) | Lin, Qingliang (Shanghai Jiao Tong University) | Wu, Xian (IBM Research China) | Han, Dingyi (Shanghai Jiao Tong University) | Su, Zhong (IBM Research China) | Yu, Yong (Shanghai Jiao Tong University)
This paper focuses on analyzing and predicting not-answered questions in Community-based Question Answering (CQA) services, such as Yahoo! Answers. In CQA services, users express their information needs by submitting natural language questions and await answers from other human users. Compared to receiving results from web search engines via keyword queries, CQA users are likely to get more specific answers, because human answerers can catch the main point of a question. However, one key problem of this pattern is that sometimes no one gives an answer, whereas web search engines rarely fail to respond. In this paper, we analyze not-answered questions and make a first attempt at predicting whether a question will receive answers. More specifically, we first analyze questions from Yahoo! Answers based on features selected from different perspectives. Then, we formalize the prediction problem as a supervised binary classification problem and leverage the proposed features to make predictions. Extensive experiments are conducted on 76,251 questions collected from Yahoo! Answers. We analyze the specific characteristics of not-answered questions and suggest possible reasons why a question is unlikely to be answered. As for prediction, the experimental results show that classification based on the proposed features significantly outperforms a simple word-based approach.
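The prediction setup described above — hand-crafted question features fed into a supervised binary classifier — can be sketched as follows. The feature set and model here are assumptions for illustration only (the abstract does not specify either); this is a minimal hand-rolled logistic regression, not the paper's method.

```python
import math

def features(question):
    """Hypothetical question features; the paper's actual feature set
    spans several perspectives not modeled here."""
    text = question["title"]
    return [
        1.0,                                        # bias term
        len(text.split()) / 20.0,                   # normalized length
        1.0 if "?" in text else 0.0,                # contains a question mark
        1.0 if question["category"] == "Programming" else 0.0,
    ]

def train(samples, labels, epochs=200, lr=0.5):
    """Fit a tiny logistic-regression classifier by gradient descent."""
    w = [0.0] * len(features(samples[0]))
    for _ in range(epochs):
        for q, y in zip(samples, labels):
            x = features(q)
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            for i in range(len(w)):
                w[i] += lr * (y - p) * x[i]
    return w

def predict(w, question):
    """Return 1 if the question is predicted to receive an answer."""
    z = sum(wi * xi for wi, xi in zip(w, features(question)))
    return 1 if z > 0 else 0

# Toy training data: a well-formed question vs. a bare fragment.
questions = [
    {"title": "How do I sort a list in Python?", "category": "Programming"},
    {"title": "thoughts", "category": "Other"},
]
labels = [1, 0]
w = train(questions, labels)
```

The two toy examples are linearly separable in this feature space, so the trained weights recover the labels; the interesting work in the paper lies in choosing features that generalize to real CQA data.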