Evaluating Methods for Distinguishing Between Human-Readable Text and Garbled Text

Henderson, Jette L. (The University of Texas at Austin) | Frazee, Daniel J. (The University of Texas at Austin) | Siegel, Nick P. (The University of Texas at Austin) | Martin, Cheryl E. (The University of Texas at Austin) | Liu, Alexander Y. (The University of Texas at Austin)

AAAI Conferences 

In some cybersecurity applications, it is useful to differentiate between human-readable text and garbled text (e.g., encoded or encrypted text). Automated methods are necessary for performing this task on large volumes of data. Which method is best is an open question that depends on the specific problem context. In this paper, we explore this open question via empirical tests of many automated categorization methods for differentiating human-readable versus garbled text under a variety of conditions (e.g., different class priors, different problem contexts, concept drift, etc.). The results indicate that the best approaches tend to be either variants of naïve Bayes or classifiers that use low-dimensional, structural features. The results also indicate that concept drift is one of the most problematic issues when classifying garbled text.
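To make the "low-dimensional, structural features" idea concrete, here is a minimal illustrative sketch (not the paper's actual method or thresholds): it computes two structural features of a byte string, Shannon entropy and the fraction of printable characters, and applies simple cutoffs to flag likely garbled (e.g., encrypted or encoded) text. The `entropy_cutoff` and `printable_cutoff` values are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def structural_features(text: bytes):
    """Compute two simple structural features of a byte string:
    per-byte Shannon entropy and the ratio of printable bytes."""
    n = len(text)
    counts = Counter(text)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    printable = sum(1 for b in text if 32 <= b < 127 or b in (9, 10, 13)) / n
    return entropy, printable

def looks_garbled(text: bytes, entropy_cutoff=6.0, printable_cutoff=0.9):
    """Flag text as garbled when its entropy is high or too few bytes
    are printable. Cutoffs are illustrative, not taken from the paper."""
    entropy, printable = structural_features(text)
    return entropy > entropy_cutoff or printable < printable_cutoff

# English prose: low entropy, fully printable -> readable
print(looks_garbled(b"The quick brown fox jumps over the lazy dog."))
# Uniform byte distribution (entropy = 8 bits/byte) -> garbled
print(looks_garbled(bytes(range(256)) * 4))
```

Real encrypted or compressed data approaches 8 bits of entropy per byte, while English text sits near 4, which is why even a two-feature classifier like this can separate many cases; the paper's point is that which method wins depends on class priors, problem context, and drift.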
