AITopics | Tavabi, Nazgol

Collaborating Authors

Tavabi, Nazgol


Do not Mask Randomly: Effective Domain-adaptive Pre-training by Masking In-domain Keywords

Golchin, Shahriar | Surdeanu, Mihai | Tavabi, Nazgol | Kiapour, Ata

arXiv.org Artificial Intelligence | Jul-14-2023

We propose a novel task-agnostic in-domain pre-training method that sits between generic pre-training and fine-tuning. Our approach selectively masks in-domain keywords, i.e., words that provide a compact representation of the target domain. We identify such keywords using KeyBERT (Grootendorst, 2020). We evaluate our approach using six different settings: three datasets combined with two distinct pre-trained language models (PLMs). Our results reveal that the fine-tuned PLMs adapted using our in-domain pre-training strategy outperform PLMs that used in-domain pre-training with random masking as well as those that followed the common pre-train-then-fine-tune paradigm. Further, the overhead of identifying in-domain keywords is reasonable, e.g., 7-15% of the pre-training time (for two epochs) for BERT Large (Devlin et al., 2019).
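The keyword-masking idea in the abstract above can be illustrated with a minimal sketch. It assumes the in-domain keywords have already been extracted (the paper uses KeyBERT for this; here the list is supplied by hand). The function name `mask_keywords`, the regex tokenizer, and the example sentence are illustrative assumptions, not the authors' implementation.

```python
import re

MASK_TOKEN = "[MASK]"  # BERT-style mask token; assumed here for illustration


def mask_keywords(text, keywords, mask_token=MASK_TOKEN):
    """Mask only tokens that appear in a given in-domain keyword list.

    Returns (masked_tokens, labels), where labels holds the original token
    at each masked position and None elsewhere, mirroring the usual
    masked-language-modeling setup in which only masked positions are scored.
    """
    keyword_set = {k.lower() for k in keywords}
    # Simple word/punctuation tokenizer (an assumption; a real pipeline
    # would use the PLM's own subword tokenizer).
    tokens = re.findall(r"\w+|[^\w\s]", text)
    masked, labels = [], []
    for tok in tokens:
        if tok.lower() in keyword_set:
            masked.append(mask_token)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Hypothetical example: keywords would normally come from a keyword extractor.
masked, labels = mask_keywords(
    "The patient received insulin daily.", ["insulin", "patient"]
)
print(" ".join(masked))
```

Random masking would instead pick positions uniformly; the contrast is that here every masked position is a word judged representative of the target domain.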

information retrieval, machine learning, natural language

2307.0716

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Arizona > Pima County > Tucson (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Discovering Signals from Web Sources to Predict Cyber Attacks

Goyal, Palash | Hossain, KSM Tozammel | Deb, Ashok | Tavabi, Nazgol | Bartley, Nathan | Abeliuk, Andrés | Ferrara, Emilio | Lerman, Kristina

arXiv.org Machine Learning | Jun-8-2018

Cyber attacks are growing in frequency and severity. Over the past year alone we have witnessed massive data breaches that stole personal information of millions of people and wide-scale ransomware attacks that paralyzed critical infrastructure of several countries. Combating the rising cyber threat calls for a multi-pronged strategy, which includes predicting when these attacks will occur. The intuition driving our approach is this: during the planning and preparation stages, hackers leave digital traces of their activities on both the surface web and dark web in the form of discussions on platforms like hacker forums, social media, blogs and the like. These data provide predictive signals that allow anticipating cyber attacks. In this paper, we describe machine learning techniques based on deep neural networks and autoregressive time series models that leverage external signals from publicly available Web sources to forecast cyber attacks. Performance of our framework across ground truth data over real-world forecasting tasks shows that our methods yield a significant lift or increase of F1 for the top signals on predicted cyber attacks. Our results suggest that, when deployed, our system will be able to provide an effective line of defense against various types of targeted cyber attacks.

cyberwarfare, deep learning, prediction

1806.03342

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.86)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DarkEmbed: Exploit Prediction With Neural Language Models

Tavabi, Nazgol (USC Information Sciences Institute) | Goyal, Palash (USC Information Sciences Institute) | Almukaynizi, Mohammed (Arizona State University) | Shakarian, Paulo (Arizona State University) | Lerman, Kristina (USC Information Sciences Institute)

AAAI Conferences | Feb-8-2018

Software vulnerabilities can expose computer systems to attacks by malicious actors. With the number of vulnerabilities discovered in recent years surging, creating timely patches for every vulnerability is not always feasible. At the same time, not every vulnerability will be exploited by attackers; hence, prioritizing vulnerabilities by assessing the likelihood they will be exploited has become an important research problem. Recent works used machine learning techniques to predict exploited vulnerabilities by analyzing discussions about vulnerabilities on social media. These methods relied on traditional text processing techniques, which represent statistical features of words but fail to capture their context. To address this challenge, we propose DarkEmbed, a neural language modeling approach that learns low-dimensional distributed representations, i.e., embeddings, of darkweb/deepweb discussions to predict whether vulnerabilities will be exploited. By capturing linguistic regularities of human language, such as syntactic and semantic similarity and logical analogy, the learned embeddings are better able to classify discussions about exploited vulnerabilities than traditional text analysis methods. Evaluations demonstrate the efficacy of learned embeddings on both structured text (such as security blog posts) and unstructured text (darkweb/deepweb posts). DarkEmbed outperforms state-of-the-art approaches on the exploit prediction task with an F1-score of 0.74.

cyberwarfare, law enforcement, vulnerability

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (0.94)

Genre:

Research Report (0.48)
Overview > Innovation (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety (0.68)
Government > Military > Cyberwarfare (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Structured Features in Naive Bayes Classification

Choi, Arthur (University of California, Los Angeles) | Tavabi, Nazgol (Sharif University of Technology) | Darwiche, Adnan (University of California, Los Angeles)

AAAI Conferences | Apr-19-2016

We propose the structured naive Bayes (SNB) classifier, which augments the ubiquitous naive Bayes classifier with structured features. SNB classifiers facilitate the use of complex features, such as combinatorial objects (e.g., graphs, paths and orders) in a general but systematic way. Underlying the SNB classifier is the recently proposed Probabilistic Sentential Decision Diagram (PSDD), which is a tractable representation of probability distributions over structured spaces. We illustrate the utility and generality of the SNB classifier via case studies. First, we show how we can distinguish players of simple games in terms of play style and skill level based purely on observing the games they play. Second, we show how we can detect anomalous paths taken on graphs based purely on observing the paths themselves.
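For orientation, the baseline that the abstract above builds on can be sketched as follows. This is only the ordinary naive Bayes classifier over simple categorical features, with Laplace smoothing added as an assumption; it does not implement the structured features or the PSDD representation that the SNB classifier introduces. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples, alpha=1.0):
    """Fit a categorical naive Bayes model.

    examples: list of (features, label), where features is a tuple of
    categorical values. alpha is the Laplace smoothing constant (assumed).
    """
    class_counts = Counter(label for _, label in examples)
    # feature_counts[label][i][value] = times feature i took `value` in class
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values observed for each feature
    for features, label in examples:
        for i, v in enumerate(features):
            feature_counts[label][i][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values, alpha


def predict(model, features):
    """Return the label maximizing log P(c) + sum_i log P(x_i | c)."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for i, v in enumerate(features):
            count = feature_counts[label][i][v]
            score += math.log((count + alpha) / (n + alpha * len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up data: (play style, pace) -> skill level.
data = [
    (("aggressive", "fast"), "expert"),
    (("aggressive", "fast"), "expert"),
    (("passive", "slow"), "novice"),
    (("passive", "fast"), "novice"),
]
model = train_naive_bayes(data)
print(predict(model, ("aggressive", "fast")))
```

In the structured setting described in the abstract, each per-feature likelihood term would be replaced by a distribution over a combinatorial object such as a path or an ordering, which is where the PSDD comes in.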

artificial intelligence, classifier, machine learning

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > California (0.14)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
