

Using Randomness to Improve Robustness of Machine-Learning Models Against Evasion Attacks

Machine learning models have been widely used in security applications such as intrusion detection, spam filtering, and virus or malware detection. However, it is well known that adversaries are always trying to adapt their attacks to evade detection. For example, an email spammer may guess what features spam detection models use and modify or remove those features to avoid detection. There has been some work on making machine learning models more robust to such attacks. However, one simple but promising approach called randomization is underexplored. This paper proposes a novel randomization-based approach to improve the robustness of machine learning models against evasion attacks. The proposed approach incorporates randomization at both model training time and model application time (that is, when the model is used to detect attacks). We also apply this approach to random forest, an existing ML method that already has some degree of randomness. Experiments on intrusion detection and spam filtering data show that our approach further improves the robustness of the random forest method. We also discuss how this approach can be applied to other ML models.
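The abstract does not spell out the mechanism, so the following is only a minimal sketch of what randomization at both training time and application time could look like. It is not the paper's algorithm. The decision-stump learner and the names `train_stump`, `train_randomized_ensemble`, and `predict_randomized` are assumptions made for illustration.

```python
import random


def train_stump(X, y, feature):
    """Fit a one-feature threshold classifier (decision stump) on `feature`."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        for polarity in (1, -1):
            preds = [1 if polarity * (row[feature] - threshold) >= 0 else 0
                     for row in X]
            errors = sum(p != t for p, t in zip(preds, y))
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return (feature, threshold, polarity)


def stump_predict(stump, row):
    """Return the stump's 0/1 label for a single row."""
    feature, threshold, polarity = stump
    return 1 if polarity * (row[feature] - threshold) >= 0 else 0


def train_randomized_ensemble(X, y, n_models, rng):
    """Training-time randomness (assumed design): every stump is fit on a
    bootstrap sample of the rows and on one randomly chosen feature."""
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]
        feature = rng.randrange(n_features)
        ensemble.append(
            train_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return ensemble


def predict_randomized(ensemble, row, k, rng):
    """Application-time randomness (assumed design): only a random subset of
    k models votes, so an attacker cannot know which features are consulted
    for any particular query."""
    voters = rng.sample(ensemble, k)
    votes = sum(stump_predict(s, row) for s in voters)
    return 1 if 2 * votes >= k else 0
```

In this sketch, an attacker who modifies one feature to evade detection only defeats the stumps that happen to use that feature, and cannot tell in advance whether those stumps will be among the `k` voters for a given query.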

Training Machine Learning Models On 311, 511, and 911 City Data


We have been working hard to understand the core stack of data services that make our cities work, or not work, depending on where you live. These are the data sets currently available via existing services, which may or may not exist in a machine-readable format, or via an API, depending on the city you live in. There is a huge amount of data already available at the municipal level, but here is where we have started as of January.

Real Time Streaming 311 Incidents In Chicago

511 - Traffic, Travel & Transit
Adding 511 Data To Our Existing Transit Data Research
Getting Your 511 Traffic Incidents in the San Francisco Bay Area as a Real Time Streaming API

911 - Emergency Events
Making 911 Data Real Time
Streaming 911 Emergency Data For Baltimore, MD

We've targeted these three areas because they make a difference in our lives at the local level, and because their data has huge potential when made available via web APIs and in real time using Server-Sent Events (SSE). Now that we have these three critical aspects of municipal operations profiled, we are going to work to profile as many cities as we can.
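For readers unfamiliar with Server-Sent Events, here is a minimal sketch of how a client might parse a `text/event-stream` feed. It handles only the `event` and `data` fields of the format. The function name `parse_sse`, the `incident` event name, and the sample payloads are hypothetical and are not taken from any of the city services mentioned above.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of text/event-stream lines.

    A blank line ends one event; lines starting with ':' are comments;
    several 'data:' lines in one event are joined with newlines.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event if it carried any data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical sample stream, for illustration only.
sample = [
    "event: incident\n",
    'data: {"id": 1}\n',
    "\n",
    ": keep-alive\n",
    "data: a\n",
    "data: b\n",
    "\n",
]
events = list(parse_sse(sample))
```

In practice the lines would come from a long-lived HTTP response rather than a list, but the parsing logic would stay the same.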