Xu, Xiaoxi
Strategy Mining
Xu, Xiaoxi (University of Massachusetts, Amherst) | Jensen, David (University of Massachusetts, Amherst) | Rissland, Edwina L. (University of Massachusetts, Amherst)
Strategy mining is a new area of research about discovering strategies for decision-making. It is motivated by how similarity is assessed in retrospect in law. In the legal domain, when both case facts and court decisions are present, it is often useful to assess similarity by accounting for both case facts and case outcomes. In this paper, we formulate the strategy mining problem as a clustering problem with the goal of finding clusters that represent disparate conditional dependencies of decision labels on other features. Existing clustering algorithms are ill-suited to clustering dependency because they either assume feature independence, such as K-means, or only consider the co-occurrence of features without explicitly modeling the special dependency of the decision label on other features, such as Latent Dirichlet Allocation (LDA). We propose an Expectation Maximization (EM) style unsupervised learning algorithm for dependency clustering. Like EM, our algorithm is grounded in statistical learning theory. It minimizes the empirical risk of decision tree learning. Unlike other clustering algorithms, our algorithm is resistant to irrelevant features, and its learned clusters, modeled by decision trees, are strongly interpretable and predictive. We systematically evaluate both the convergence property and solution quality of our algorithm using a common law dataset composed of actual cases. Experimental results show that our algorithm significantly outperforms K-means and LDA on clustering dependency.
Identifying Social Deliberative Behavior from Online Communication: A Cross-Domain Study
Xu, Xiaoxi (University of Massachusetts Amherst) | Murray, Tom (University of Massachusetts Amherst) | Woolf, Beverly Park (University of Massachusetts Amherst) | Smith, David A. (Northeastern University)
In this paper we describe automatic systems for identifying whether participants demonstrate social deliberative behavior within their online conversations. We test 3 corpora containing 2617 annotated segments. With machine learning models using linguistic features, we identify social deliberative behavior with up to 68.09% in-domain accuracy (compared to 50% baseline), 62.17% in-domain precision, and 84% in-domain recall. In cross-domain identification tasks, we achieve up to 55.56% cross-domain accuracy, 59.84% cross-domain precision, and 86.58% cross-domain recall. We also discover linguistic characteristics of social deliberative behavior. In the context of identifying social deliberative behavior, we offer insights into why certain machine learning models generalize well across domains and why certain domains pose great challenges to machine learning models.
Computational Predictors in Online Social Deliberations
Woolf, Beverly Park (University of Massachusetts-Amherst) | Murray, Thomas (University of Massachusetts-Amherst) | Xu, Xiaoxi (University of Massachusetts-Amherst) | Osterweil, Leon (University of Massachusetts-Amherst) | Clarke, Lori (University of Massachusetts-Amherst) | Wing, Leah (University of Massachusetts-Amherst) | Katsh, Ethan (University of Massachusetts-Amherst)
This research seeks to identify online participants' disposition and skills. A prototype dashboard and annotation scheme were developed to support facilitators, and several computational predictors were identified that show statistically significant correlations with dialogue skills as observed by human annotators.
Discovering Latent Strategies
Xu, Xiaoxi (University of Massachusetts Amherst)
Strategy mining is a new area of research about discovering strategies in decision-making. In this paper, we formulate the strategy-mining problem as a clustering problem, called the latent-strategy problem. In a latent-strategy problem, a corpus of data instances is given, each of which is represented by a set of features and a decision label. The inherent dependency of the decision label on the features is governed by a latent strategy. The objective is to find clusters, each of which contains data instances governed by the same strategy. Existing clustering algorithms are ill-suited to clustering dependency because they either assume feature independence (e.g., K-means) or only consider the co-occurrence of features without explicitly modeling the special dependency of the decision label on other features (e.g., Latent Dirichlet Allocation (LDA)). In this paper, we present a baseline unsupervised learning algorithm for dependency clustering. Our model-based clustering algorithm iterates between an assignment step and a minimization step to learn a mixture of decision tree models that represent latent strategies. Like the Expectation Maximization algorithm, our algorithm is grounded in statistical learning theory. Unlike other clustering algorithms, our algorithm is resistant to irrelevant features, and its learned clusters (modeled by decision trees) are strongly interpretable and predictive. We systematically evaluate our algorithm using a common law dataset composed of actual cases. Experimental results show that our algorithm significantly outperforms K-means and LDA on clustering dependency.
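The alternation between an assignment step and a minimization step described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes depth-1 decision trees (stumps) in place of full decision trees, a 0-1 loss, hard assignments, and a caller-supplied initial assignment. The names `fit_stump`, `predict`, and `cluster_strategies` are hypothetical and chosen only for illustration.

```python
from collections import Counter

def fit_stump(rows):
    """Fit a depth-1 tree (stump) with minimal 0-1 loss on rows of (features, label).
    Returns (feature_index, {feature_value: label}, default_label), or None if rows is empty."""
    if not rows:
        return None
    default = Counter(y for _, y in rows).most_common(1)[0][0]
    best = None
    for j in range(len(rows[0][0])):
        counts = {}
        for x, y in rows:
            counts.setdefault(x[j], Counter())[y] += 1
        # Each leaf predicts the majority label among rows with that feature value.
        leaves = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(leaves[x[j]] != y for x, y in rows)
        if best is None or errors < best[0]:
            best = (errors, j, leaves)
    return (best[1], best[2], default)

def predict(stump, x):
    """Predict a label with a stump; an empty cluster's model (None) predicts None."""
    if stump is None:
        return None
    j, leaves, default = stump
    return leaves.get(x[j], default)

def cluster_strategies(data, assignment, k, max_iter=50):
    """Alternate between fitting one stump per cluster (minimization step)
    and moving each instance to a cluster whose stump predicts its label
    (assignment step). Ties keep the current cluster, so the total 0-1 loss
    never increases and the loop stops once no instance moves."""
    assignment = list(assignment)
    stumps = []
    for _ in range(max_iter):
        stumps = [fit_stump([d for d, a in zip(data, assignment) if a == c])
                  for c in range(k)]
        new_assignment = []
        for (x, y), current in zip(data, assignment):
            losses = [0 if predict(s, x) == y else 1 for s in stumps]
            lowest = min(losses)
            new_assignment.append(current if losses[current] == lowest
                                  else losses.index(lowest))
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, stumps

# Toy data (made up for illustration): two latent strategies over binary features.
# Strategy A decides y = x0; strategy B decides y = 1 - x0; x1 is irrelevant.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1),   # strategy A
        ((0, 0), 1), ((0, 1), 1), ((1, 0), 0), ((1, 1), 0)]   # strategy B
# Initial assignment with the last instance placed in the wrong cluster.
final_assignment, models = cluster_strategies(data, [0, 0, 0, 0, 1, 1, 1, 0], k=2)
```

In this toy run the misplaced instance moves to the other cluster after one pass, and both stumps split on feature 0 rather than the irrelevant feature 1. Like other hard-assignment schemes, the sketch is sensitive to the initial assignment and can stop at a local optimum, for example when two clusters start out with identical label mixtures.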