Collaborating Authors

Releasing a new benchmark and data set for evaluating neural code search models


We are releasing a new benchmark for evaluating code search techniques. The benchmark includes the largest evaluation data set currently available for Java, consisting of natural language queries paired with code snippets. This data set comprises 287 Stack Overflow question-and-answer pairs from the Stack Exchange Data Dump. Also included is a search corpus that contains more than 24,000 of the most popular Android repositories on GitHub (ranked by number of stars), indexed using the more than 4.7 million method bodies parsed from these repositories. A score sheet on the evaluation data set, using two models from our recent work, is also included.

Neural Code Search: ML-based code search using natural language queries


Engineers work best when they can easily find code examples to guide them on particular coding tasks. For some questions (for example, "How to programmatically close or hide the Android soft keyboard?"), answers are readily available on public forums such as Stack Overflow. But questions specific to proprietary code or APIs (or code written in less common programming languages) need a different solution, since they are not typically discussed in those forums. To address this need, we've developed a code search tool that applies natural language processing (NLP) and information retrieval (IR) techniques directly to source code text. This tool, called Neural Code Search (NCS), accepts natural language queries and returns relevant code fragments retrieved directly from the code corpus.
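To illustrate the general idea of answering a natural language query with a code fragment, here is a toy sketch in Java. It is not how NCS works: it swaps in plain token overlap as a stand-in for the NLP and IR techniques mentioned above, and the class name ToySnippetSearch and its methods are hypothetical.

```java
import java.util.*;

// Toy stand-in for code search: ranks snippets by shared tokens with the query.
// This is NOT the NCS model; all names here are hypothetical.
class ToySnippetSearch {
    // Lowercases the text and splits on anything that is not a letter or digit.
    static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Counts how many distinct query tokens also occur in the snippet.
    static int overlap(String query, String snippet) {
        Set<String> shared = tokens(query);
        shared.retainAll(tokens(snippet));
        return shared.size();
    }

    // Returns the snippet with the highest overlap; ties keep the earliest one.
    // Returns null for an empty corpus.
    static String best(String query, List<String> corpus) {
        String bestSnippet = null;
        int bestScore = -1;
        for (String s : corpus) {
            int score = overlap(query, s);
            if (score > bestScore) {
                bestScore = score;
                bestSnippet = s;
            }
        }
        return bestSnippet;
    }
}
```

A matcher like this only finds snippets that literally share words with the query, which is one reason a real system would need something stronger than exact token matching.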

SQL Server 2019 & Java: Parameters


We see in Code Snippet 1 how we want to call into the adder method in the JavaTest1 class, passing in two parameters: @x and @y. When we execute the code, the Java C language extension gets the parameter values, looks in the code for two class-level variables named x and y, and assigns the values to those variables. The adder method then uses x and y. I mentioned above how a couple of things changed after Microsoft introduced the Java Language SDK. One of them was that you no longer define a method in SPEES's @script parameter. The parameter instead defines a class you want to call into.
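To make the pre-SDK parameter mapping described above concrete, here is a minimal sketch of what a class like JavaTest1 might look like. The class name, the class-level variables x and y, and the adder method come from the text; the int types, the static modifiers, and the returned sum are assumptions for illustration, not the original code.

```java
// Hypothetical sketch of the pre-SDK shape: the extension is described as
// assigning @x and @y to class-level variables with matching names.
// Declared without `public` so the sketch compiles in any file.
class JavaTest1 {
    // Class-level variables the extension would populate (types are assumed).
    static int x;
    static int y;

    // Uses the populated fields; returning the sum is an assumption.
    static int adder() {
        return x + y;
    }
}
```

The point of the sketch is that the method itself takes no arguments: under this model the values arrive through the fields, which is why the field names have to match the parameter names.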

Improve Web Search Using Image Snippets

AAAI Conferences

The Web has become the largest information repository in the world. Therefore, searching the Web effectively and efficiently has become a key challenge. Previous research on Web search mainly attempts to exploit the text in Web pages and the link information between pages. This paper shows that Web search performance can be enhanced if image information is considered. In detail, a new Web search framework is proposed in which image snippets are extracted from Web pages and provided to the user along with text snippets. This makes it much easier for the user to accurately identify the Web pages he or she expects and to reformulate the initial query. Experimental evaluations demonstrate the promise of the proposed framework.

Google's fake news Snippets - BBC News


Over the weekend, I put a question to the Google Home speaker I'd brought back from the United States. "Is Obama planning a coup?" I'd asked this after reading an article that suggested a relatively new feature that gives answers - or Snippets as the search company call them - to queries, rather than just links, had been producing some troubling results. The piece said a search asking which US presidents were in the Ku Klux Klan had listed several as members of the KKK, despite there being no evidence for that. It also featured a search for "Proposition 63", a gun control measure, that had produced a Snippet describing it as "a deceptive ballot initiative that will criminalise millions of law abiding Californians". And then there was "Is Obama planning a coup?" which had resulted in a Snippets box describing "Western Center for Journalism's exclusive video".