Machine Learning Can Identify the Authors of Anonymous Code
Researchers who study stylometry, the statistical analysis of linguistic style, have long known that writing is a unique, individualistic process. The vocabulary you select, your syntax, and your grammatical decisions leave behind a signature. Automated tools can now accurately identify the author of a forum post, for example, as long as they have adequate training data to work with. But newer research shows that stylometry can also apply to artificial language samples, like code. Software developers, it turns out, leave behind a fingerprint as well.
Aug-10-2018, 19:48:12 GMT