Coda: An End-to-End Neural Program Decompiler

Neural Information Processing Systems

Reverse engineering of binary executables is a critical problem in the computer security domain. On the one hand, malicious parties may recover interpretable source code from software products to gain commercial advantages. On the other hand, binary decompilation can be leveraged for code vulnerability analysis and malware detection. However, efficient binary decompilation is challenging. Conventional decompilers have the following major limitations: (i) they are only applicable to a specific source-target language pair, and hence incur undesired development cost for new language tasks; (ii) their output high-level code cannot effectively preserve the correct functionality of the input binary; (iii) their output program does not capture the semantics of the input, and the reversed program is hard to interpret.


elaborate on the algorithm description accordingly

Neural Information Processing Systems

We thank all reviewers for their valuable feedback and comments. Please find our responses below. Reviewer 1 - Explanation in the introduction: we strive for clarity and we appreciate this comment. We thank the reviewer for pointing this out. This can be done in many ways, as discussed in Appendix C. The theoretical value used for the bounds is rather conservative, however.


5 projects Perplexity's new Labs AI tool can whip up for you now - in minutes

ZDNet

Designing a detailed web app, dashboard, or even spreadsheet might take you hours to complete. What if someone or something could do the same work in just a few minutes? In a blog post published Thursday, Perplexity explained how Labs can create anything from reports to spreadsheets to dashboards to simple web apps. The new feature is accessible only to Pro subscribers, who pay $20 per month (though there are a couple of ways to score the plan for free). This new capability is available on Perplexity's website and in its iOS and Android apps. The company has also promised its imminent arrival in its Windows and Mac apps.


How AI coding agents could destroy open source software

ZDNet

Imagine a single rogue line of code slipping past your tired eyes - and suddenly your entire app is compromised. AI coding agents could be the silent saboteurs of the next big cybersecurity crisis.


Generalized Fast Exact Conformalization

Neural Information Processing Systems

Conformal prediction converts nearly any point estimator into a prediction interval under standard assumptions while ensuring valid coverage. However, the extensive computational demands of full conformal prediction are daunting in practice, as it necessitates a comprehensive number of trainings across the entire latent label space. Unfortunately, existing efforts to expedite conformalization often carry strong assumptions and are developed specifically for certain models, or they only offer approximate solution sets. To address this gap, we develop a method for fast exact conformalization of generalized statistical estimation. Our analysis reveals that the structure of the solution path is inherently piecewise smooth, and indicates that utilizing second-order information of difference equations suffices to approximate the entire solution spectrum arbitrarily. We provide a unified view that not only encompasses existing work but also attempts to offer geometric insights.
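
To make the computational burden mentioned above concrete, here is a minimal sketch of brute-force full conformal prediction. It is not the paper's method: the one-dimensional least-squares model, the absolute-residual conformity score, the candidate grid, and the names `fit_line` and `full_conformal_set` are all assumptions chosen for illustration. The point is only that the model is refit once for every candidate label.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (illustrative model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def full_conformal_set(
    xs: Sequence[float],
    ys: Sequence[float],
    x_new: float,
    candidates: Sequence[float],
    alpha: float = 0.2,
) -> List[float]:
    """Return the candidate labels that full conformal prediction keeps for x_new."""
    kept = []
    for y_cand in candidates:
        # Augment the data with the hypothesised label and refit from scratch.
        aug_x = list(xs) + [x_new]
        aug_y = list(ys) + [y_cand]
        slope, intercept = fit_line(aug_x, aug_y)
        # Conformity score: absolute residual under the refitted model.
        scores = [abs(y - (slope * x + intercept)) for x, y in zip(aug_x, aug_y)]
        # p-value: fraction of points at least as nonconforming as the new one.
        p_value = sum(s >= scores[-1] for s in scores) / len(scores)
        if p_value > alpha:
            kept.append(y_cand)
    return kept


# Toy data near y = 2x + 1 with small alternating noise.
xs = [float(i) for i in range(10)]
ys = [2 * x + 1 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]
grid = [i * 0.5 for i in range(45)]  # candidate labels 0.0, 0.5, ..., 22.0
prediction_set = full_conformal_set(xs, ys, x_new=5.0, candidates=grid)
```

The cost is one model fit per grid point, and the grid only approximates the continuous label space, which is the gap that exact path-based methods aim to close.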


A Benchmark for Evaluating Language Model Fit

Neural Information Processing Systems

Evaluations of language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains--varying distributions of language.
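
As a small illustration of why a single pooled number can hide domain differences, the sketch below computes perplexity as the exponential of the negative mean token log-probability, once per domain and once over the pooled tokens. The domain names and log-probabilities are made-up values, and the helper names are hypothetical, not part of any benchmark's API.

```python
import math
from typing import Dict, List


def perplexity(token_logprobs: List[float]) -> float:
    """Perplexity = exp(-mean natural-log probability per token)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def perplexity_by_domain(logprobs: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute perplexity separately for each domain's tokens."""
    return {domain: perplexity(lps) for domain, lps in logprobs.items()}


# Hypothetical held-out data: the model assigns probability 0.5 to every
# "news" token and 0.125 to every "code" token.
held_out = {
    "news": [math.log(0.5)] * 4,
    "code": [math.log(0.125)] * 4,
}

per_domain = perplexity_by_domain(held_out)
pooled = perplexity([lp for lps in held_out.values() for lp in lps])
```

Here the pooled perplexity sits between the two per-domain values, so reporting it alone would not reveal that the model fits one domain much worse than the other.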




Automatic Binary Dataset Construction for Machine Learning

Neural Information Processing Systems

Binary code is pervasive, and binary analysis is a key task in reverse engineering, malware classification, and vulnerability discovery. Unfortunately, while there exist large corpora of malicious binaries, obtaining high-quality corpora of benign binaries for modern systems has proven challenging (e.g., due to licensing issues). Consequently, machine learning based pipelines for binary analysis utilize either costly commercial corpora (e.g., VirusTotal) or open-source binaries (e.g., coreutils) available in limited quantities.


Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Neural Information Processing Systems

Autonomous agents that accomplish complex computer tasks with minimal human intervention can significantly enhance the accessibility and productivity of human-computer interactions. Existing benchmarks either lack interactive environments or are limited to specific applications/domains, failing to reflect the diversity and complexity of real-world computer use and limiting agent scalability.