Collaborating Authors

 Yegneswaran, Vinod


Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning

arXiv.org Artificial Intelligence

The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are platform-specific and have proven brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship. In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the potential of using large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods. Our study shows that supervised models, trained using expert-derived labels on instances of known anomalies and possible censorship, can learn the detection heuristics employed by different measurement platforms. More crucially, we find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing heuristics. Moreover, both methods demonstrate the capability to uncover a substantial number of new DNS blocking signatures, i.e., injected fake IP addresses overlooked by existing heuristics. These results are underpinned by an important methodological finding: comparing the outputs of models trained using the same probes but with labels arising from independent processes allows us to more reliably detect cases of censorship in the absence of ground-truth labels of censorship.
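As a rough illustration of the unsupervised direction described above, the sketch below fits a toy detector on uncensored probes only and flags probes whose features deviate sharply. The feature names, values, and threshold are invented for illustration; this is not the paper's model.

```python
import statistics

# Hypothetical numeric features per DNS probe (illustrative only):
# (response size in bytes, TTL, answer count).
benign = [
    (120, 300, 1), (118, 290, 1), (125, 310, 1),
    (119, 305, 1), (122, 295, 1), (121, 300, 1),
]

def fit(samples):
    """Learn per-feature mean and stdev from uncensored probes only."""
    cols = list(zip(*samples))
    return [(statistics.mean(c), statistics.stdev(c)) for c in cols]

def is_anomalous(probe, model, z=3.0):
    """Flag a probe if any feature deviates by more than z stdevs."""
    return any(abs(x - m) > z * max(s, 1e-9)
               for x, (m, s) in zip(probe, model))

model = fit(benign)
print(is_anomalous((121, 298, 1), model))  # close to the benign profile
print(is_anomalous((40, 5, 1), model))     # e.g. a tiny injected response
```

The key property mirrored here is that no censored examples are needed at training time, so the detector can surface blocking behaviors that no existing heuristic encodes.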


Data Masking with Privacy Guarantees

arXiv.org Machine Learning

We study the problem of data release with privacy, where data is made available with privacy guarantees while keeping its usability as high as possible; this is important in health care and other domains with sensitive data. In particular, we propose a method of masking the private data with a privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, so as to maintain usability. We analyze the theoretical risks of the proposed method and of the traditional input perturbation method. Results show that the proposed method achieves lower risk than input perturbation, especially as the number of training samples grows large. We illustrate the effectiveness of the proposed data masking method for privacy-sensitive learning on 12 benchmark datasets.
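For context, the traditional input perturbation baseline that the abstract compares against can be sketched with the generic Laplace mechanism from differential privacy. This is not the paper's proposed masking method, and the parameter names are illustrative.

```python
import random

def perturb(record, epsilon, sensitivity=1.0, seed=0):
    """Input perturbation: add Laplace(sensitivity/epsilon) noise per feature.

    Laplace noise is sampled as the difference of two exponentials with the
    same rate; smaller epsilon means stronger privacy and more noise.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [x + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
            for x in record]

masked = perturb([5.0, 10.0, 15.0], epsilon=1.0)
```

Because every feature is noised independently of the downstream task, a classifier trained on such output can degrade badly, which is the usability gap the masking method above is designed to close.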


Trusted Neural Networks for Safety-Constrained Autonomous Control

arXiv.org Artificial Intelligence

We propose Trusted Neural Network (TNN) models, which are deep neural network models that satisfy safety constraints critical to the application domain. We investigate different mechanisms for incorporating rule-based knowledge in the form of first-order logic constraints into a TNN model, where rules that encode safety are accompanied by weights indicating their relative importance. This framework allows the TNN model to learn from knowledge available in the form of data as well as logical rules. We propose multiple approaches for solving this problem: (a) a multi-headed model structure that allows a trade-off between satisfying logical constraints and fitting training data in a unified training framework, and (b) creating a constrained optimization problem and solving it in dual formulation by posing a new constrained loss function and using a proximal gradient descent algorithm. We demonstrate the efficacy of our TNN framework through experiments using the open-source TORCS 3D simulator for self-driving cars. Experiments using our first approach of a multi-headed TNN model, on a dataset generated by a customized version of TORCS, show that (1) adding safety constraints to a neural network model results in increased performance and safety, and (2) the improvement increases with the importance of the safety constraints. Experiments were also performed using the second approach, the proximal algorithm for constrained optimization; they demonstrate that (1) the overall TNN model satisfies the constraints even when the training data violates some of the constraints, and (2) the proximal gradient descent algorithm on the constrained objective converges faster than its unconstrained counterpart.
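The idea of weighting a safety rule against the data-fitting loss can be sketched in miniature: fit a single scalar (a hypothetical "target speed") to data that violates a speed-limit rule, with a weighted penalty pulling the solution back toward the constraint. All names and numbers are invented for illustration; the actual TNN formulation uses first-order logic constraints over a deep network.

```python
SPEED_LIMIT = 50.0
data = [60.0, 62.0, 58.0]   # training targets that all violate the rule
rule_weight = 10.0          # relative importance of the safety rule

def loss_grad(w):
    # Gradient of mean squared error plus a weighted one-sided
    # quadratic penalty rule_weight * max(0, w - SPEED_LIMIT)^2.
    g = sum(2 * (w - y) for y in data) / len(data)
    if w > SPEED_LIMIT:
        g += rule_weight * 2 * (w - SPEED_LIMIT)
    return g

w = 0.0
for _ in range(2000):
    w -= 0.01 * loss_grad(w)

# The learned w settles between the data mean (60) and the limit (50),
# pulled closer to the limit as rule_weight grows.
print(round(w, 3))
```

This mirrors property (1) reported above: even though every training example violates the rule, the weighted penalty keeps the learned behavior near the constraint.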


ATOL: A Framework for Automated Analysis and Categorization of the Darkweb Ecosystem

AAAI Conferences

We present a framework for automated analysis and categorization of .onion websites in the darkweb to facilitate analyst situational awareness of new content that emerges from this dynamic landscape. Over the last two years, our team has developed a large-scale darkweb crawling infrastructure called OnionCrawler that acquires new onion domains on a daily basis, and crawls and indexes millions of pages from these new and previously known .onion sites. It stores this data into a research repository designed to help better understand Tor’s hidden service ecosystem. The analysis component of our framework is called Automated Tool for Onion Labeling (ATOL), which introduces a two-stage thematic labeling strategy: (1) it learns descriptive and discriminative keywords for different categories, and (2) it uses these terms to map onion site content to a set of thematic labels. We also present empirical results of ATOL and our ongoing experimentation with it, as we have gained experience applying it to the entirety of our darkweb repository, now over 70 million indexed pages. We find that ATOL can perform site-level thematic label assignment more accurately than keyword-based schemes developed by domain experts: we expand the analyst-provided keywords using an automatic keyword discovery algorithm, and gain 12% in accuracy by using a machine learning classification model. We also show how ATOL can discover categories on previously unlabeled onions and discuss applications of ATOL in supporting various analyses and investigations of the darkweb.
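The keyword-to-label mapping in stage (2) can be sketched as a simple scoring scheme: count how many category keywords appear on a page and assign the best-scoring label. The categories and keywords below are invented placeholders, not ATOL's learned terms, which come from its automatic keyword discovery stage.

```python
# Illustrative category lexicons (placeholders, not ATOL's learned keywords).
CATEGORY_KEYWORDS = {
    "marketplace": {"vendor", "escrow", "listing"},
    "forum": {"thread", "reply", "moderator"},
}

def label_page(text):
    """Assign the category with the most keyword hits, or 'unlabeled'."""
    tokens = set(text.lower().split())
    scores = {cat: len(tokens & kws)
              for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unlabeled"

print(label_page("Trusted vendor with escrow and a new listing"))
```

ATOL's reported gain comes from replacing exactly this kind of hand-built lexicon match with discovered keywords and a learned classifier.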