Goto

Collaborating Authors

 Markov Models


Optimizing Personalized Email Filtering Thresholds to Mitigate Sequential Spear Phishing Attacks

AAAI Conferences

Highly targeted spear phishing attacks are increasingly common, and have been implicated in many major security breeches. Email filtering systems are the first line of defense against such attacks. These filters are typically configured with uniform thresholds for deciding whether or not to allow a message to be delivered to a user. However, users have very significant differences in both their susceptibility to phishing attacks as well as their access to critical information and credentials that can cause damage. Recent work has considered setting personalized thresholds for individual users based on a Stackelberg game model. We consider two important extensions of the previous model. First, in our model user values can be substitutable, modeling cases where multiple users provide access to the same information or credential. Second, we consider attackers who make sequential attack plans based on the outcome of previous attacks. Our analysis starts from scenarios where there is only one credential and then extends to more general scenarios with multiple credentials. For single-credential scenarios, we demonstrate that the optimal defense strategy can be found by solving a binary combinatorial optimization problem called PEDS. For multiple-credential scenarios, we formulate it as a bilevel optimization problem for finding the optimal defense strategy and then reduce it to a single level optimization problem called PEMS using complementary slackness conditions. Experimental results show that both PEDS and PEMS lead to significant higher defender utilities than two existing benchmarks in different parameter settings. Also, both PEDS and PEMS are more robust than the existing benchmarks considering uncertainties.


Predicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts

AAAI Conferences

Spatial and temporal contextual information plays a key role for analyzing user behaviors, and is helpful for predicting where he or she will go next. With the growing ability of collecting information, more and more temporal and spatial contextual information is collected in systems, and the location prediction problem becomes crucial and feasible. Some works have been proposed to address this problem, but they all have their limitations. Factorizing Personalized Markov Chain (FPMC) is constructed based on a strong independence assumption among different factors, which limits its performance. Tensor Factorization (TF) faces the cold start problem in predicting future actions. Recurrent Neural Networks (RNN) model shows promising performance comparing with PFMC and TF, but all these methods have problem in modeling continuous time interval and geographical distance. In this paper, we extend RNN and propose a novel method called Spatial Temporal Recurrent Neural Networks (ST-RNN). ST-RNN can model local temporal and spatial contexts in each layer with time-specific transition matrices for different time intervals and distance-specific transition matrices for different geographical distances. Experimental results show that the proposed ST-RNN model yields significant improvements over the competitive compared methods on two typical datasets, i.e., Global Terrorism Database (GTD) and Gowalla dataset.


A Scalable Framework to Choose Sellers in E-Marketplaces Using POMDPs

AAAI Conferences

In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors). Partially Observable Markov Decision Processes (POMDPs) have shown to be an effective framework for optimally selecting sellers by selectively querying advisors. However, current solution methods do not scale to hundreds or even tens of agents operating in the e-market. In this paper, we propose the Mixture of POMDP Experts (MOPE) technique, which exploits the inherent structure of trust-based domains, such as the seller selection problem in e-markets, by aggregating the solutions of smaller sub-POMDPs. We propose a number of variants of the MOPE approach that we analyze theoretically and empirically. Experiments show that MOPE can scale up to a hundred agents thereby leveraging the presence of more advisors to significantly improve buyer satisfaction.


Smoothed Hierarchical Dirichlet Process: A Non-Parametric Approach to Constraint Measures

arXiv.org Machine Learning

Time-varying mixture densities occur in many scenarios, for example, the distributions of keywords that appear in publications may evolve from year to year, video frame features associated with multiple targets may evolve in a sequence. Any models that realistically cater to this phenomenon must exhibit two important properties: the underlying mixture densities must have an unknown number of mixtures, and there must be some "smoothness" constraints in place for the adjacent mixture densities. The traditional Hierarchical Dirichlet Process (HDP) may be suited to the first property, but certainly not the second. This is due to how each random measure in the lower hierarchies is sampled independent of each other and hence does not facilitate any temporal correlations. To overcome such shortcomings, we proposed a new Smoothed Hierarchical Dirichlet Process (sHDP). The key novelty of this model is that we place a temporal constraint amongst the nearby discrete measures $\{G_j\}$ in the form of symmetric Kullback-Leibler (KL) Divergence with a fixed bound $B$. Although the constraint we place only involves a single scalar value, it nonetheless allows for flexibility in the corresponding successive measures. Remarkably, it also led us to infer the model within the stick-breaking process where the traditional Beta distribution used in stick-breaking is now replaced by a new constraint calculated from $B$. We present the inference algorithm and elaborate on its solutions. Our experiment using NIPS keywords has shown the desirable effect of the model.


Consistently Estimating Markov Chains with Noisy Aggregate Data

arXiv.org Machine Learning

We address the problem of estimating the parameters of a time-homogeneous Markov chain given only noisy, aggregate data. This arises when a population of individuals behave independently according to a Markov chain, but individual sample paths cannot be observed due to limitations of the observation process or the need to protect privacy. Instead, only population-level counts of the number of individuals in each state at each time step are available. When these counts are exact, a conditional least squares (CLS) estimator is known to be consistent and asymptotically normal. We initiate the study of method of moments estimators for this problem to handle the more realistic case when observations are additionally corrupted by noise. We show that CLS can be interpreted as a simple "plug-in" method of moments estimator. However, when observations are noisy, it is not consistent because it fails to account for additional variance introduced by the noise. We develop a new, simpler method of moments estimator that bypasses this problem and is consistent under noisy observations.


Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

arXiv.org Machine Learning

Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that if the transition model is unknown, additional samples from the system's dynamics are accessible, or the observed behavior provides enough samples of the system's dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system's dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements regarding the sample efficiency as well as the accuracy of the estimated reward functions and transition models.


Reinforcement Learning as a Framework for Ethical Decision Making

AAAI Conferences

Emerging AI systems will be making more and more decisions that impact the lives of humans in a significant way. It is essential, then, that these AI systems make decisions that take into account the desires, goals, and preferences of other people, while simultaneously learning about what those preferences are. In this work, we argue that the reinforcement-learning framework achieves the appropriate generality required to theorize about an idealized ethical artificial agent, and offers the proper foundations for grounding specific questions about ethical learning and decision making that can promote further scientific investigation. We define an idealized formalism for an ethical learner, and conduct experiments on two toy ethical dilemmas, demonstrating the soundness and flexibility of our approach. Lastly, we identify several critical challenges for future advancement in the area that can leverage our proposed framework.


Toward Caching Symmetrical Subtheories for Weighted Model Counting

AAAI Conferences

Model counting and weighted model counting are key problems in artificial intelligence. Marginal inference can be reduced to model counting in many statistical-relational systems, such as Markov Logic. One common approach used by model counters is splitting a theory into disjoint subtheories, performing model counting on the subtheories, and then caching the result. If an identical subtheory is encountered again in the search, the cached result is used, greatly reducing runtime. In this work we introduce a way to cache symmetric subtheories compactly, which could potentially decrease required cache size, increase cache hits, and decrease runtime of solving.


Planning under Uncertainty for Aggregated Electric Vehicle Charging Using Markov Decision Processes

AAAI Conferences

The increasing penetration of renewable energy sources and electric vehicles raises important challenges related to the operation of electricity grids. For instance, the amount of power generated by wind turbines is time-varying and dependent on the weather, which makes it hard to match flexible electric vehicle demand and uncertain wind power supply. In this paper we propose a vehicle aggregation framework which uses Markov Decision Processes to control charging of multiple electric vehicles and deals with uncertainty in renewable supply. We present a grouping technique to address the scalability aspects of our framework. In experiments we show that the aggregation framework maximizes the profit of the aggregator while reducing usage of conventionally-generated power and cost of customers.


Automatic Label Correction and Appliance Prioritization in Single Household Electricity Disaggregation

AAAI Conferences

Electricity disaggregation focuses on classification ofindividual appliances by monitoring aggregate electricalsignals. In this paper we present a novel algorithmto automatically correct labels, discard contaminatedtraining samples, and boost signal to noise ratio throughhigh frequency noise reduction. We also propose amethod for prioritized classification which classifies applianceswith the most intense signals first. When testedon four houses in Kaggles Belkin dataset, these methodsautomatically relabel over 77% of all training samplesand decrease error rate by an average of 45% in bothreal power and high frequency noise classification.