Law
The Right to Obscure: A Mechanism and Initial Evaluation
Huang, Eric Hsin-Chun (Stanford University) | Lanier, Jaron (Microsoft Research) | Shoham, Yoav (Stanford University)
The recent landmark "right to be forgotten" ruling by the EU Court gives EU citizens the right to remove certain links that are "inaccurate, inadequate, irrelevant or excessive" from search results under their names. While we agree with the spirit of the ruling (to empower individuals to manage their personal data while keeping a balance between that right and the freedom of expression), we believe that the ruling is impractical as it provides neither precise criteria for evaluating removal requests nor concrete guidelines for implementation. Consequently, Google's current implementation has several problems concerning scalability, objectivity, and responsiveness. Instead of the right to be forgotten, we propose the right to obscure certain facts about oneself on search engines, and a simple mechanism which respects the spirit of the ruling by giving people more power to influence search results for queries on their names. Specifically, under our proposed mechanism, data subjects will be able to register minus terms, and search results for their name queries that contain such terms would be filtered out. We implement a proof-of-concept search engine following the proposed mechanism, and conduct experiments to explore the influences it might have on users' impressions of different data subjects.
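The minus-term mechanism described above can be illustrated with a minimal sketch. The registry layout, the function name `filter_results`, and the sample data are hypothetical and are not taken from the authors' proof-of-concept engine; matching is simplified here to case-insensitive whole-word comparison.

```python
import re

# Hypothetical registry: data subject's name -> minus terms they registered.
MINUS_TERMS = {
    "jane doe": {"bankruptcy", "lawsuit"},
}

def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def filter_results(query, results, registry=MINUS_TERMS):
    """Drop results containing any minus term registered for the queried name.

    Queries that are not a registered name are returned unchanged.
    """
    terms = registry.get(query.strip().lower())
    if not terms:
        return list(results)
    return [r for r in results if not (tokenize(r) & terms)]

results = [
    "Jane Doe wins regional chess title",
    "Jane Doe files for bankruptcy",
    "Jane Doe named in lawsuit over lease",
]
filtered = filter_results("Jane Doe", results)
```

Only name queries are affected in this sketch: the same pages remain reachable through any other query, which is the sense in which facts are obscured rather than removed.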
Context-Independent Claim Detection for Argument Mining
Lippi, Marco (University of Bologna) | Torroni, Paolo (University of Bologna)
Argumentation mining aims to automatically identify structured argument data from unstructured natural language text. This challenging, multi-faceted task has recently been gaining growing attention, especially due to its many potential applications. One particularly important aspect of argumentation mining is claim identification. Most of the current approaches are engineered to address specific domains. However, argumentative sentences are often characterized by common rhetorical structures, independently of the domain. We thus propose a method that exploits structured parsing information to detect claims without resorting to contextual information, and yet achieves a performance comparable to that of state-of-the-art methods that heavily rely on the context.
Sparse Approximate Inference for Spatio-Temporal Point Process Models
Cseke, Botond, Mangion, Andrew Zammit, Heskes, Tom, Sanguinetti, Guido
Spatio-temporal point process models play a central role in the analysis of spatially distributed systems in several disciplines. Yet, scalable inference remains computationally challenging both due to the high resolution modelling generally required and the analytically intractable likelihood function. Here, we exploit the sparsity structure typical of (spatially) discretised log-Gaussian Cox process models by using approximate message-passing algorithms. The proposed algorithms scale well with the state dimension and the length of the temporal horizon with moderate loss in distributional accuracy. They hence provide a flexible and faster alternative to both non-linear filtering-smoothing type algorithms and to approaches that implement the Laplace method or expectation propagation on (block) sparse latent Gaussian models. We infer the parameters of the latent Gaussian model using a structured variational Bayes approach. We demonstrate the proposed framework on simulation studies with both Gaussian and point-process observations and use it to reconstruct the conflict intensity and dynamics in Afghanistan from the WikiLeaks Afghan War Diary.
Bayesian Poisson Tensor Factorization for Inferring Multilateral Relations from Sparse Dyadic Event Counts
Schein, Aaron, Paisley, John, Blei, David M., Wallach, Hanna
We present a Bayesian tensor factorization model for inferring latent group structures from dynamic pairwise interaction patterns. For decades, political scientists have collected and analyzed records of the form "country $i$ took action $a$ toward country $j$ at time $t$"---known as dyadic events---in order to form and test theories of international relations. We represent these event data as a tensor of counts and develop Bayesian Poisson tensor factorization to infer a low-dimensional, interpretable representation of their salient patterns. We demonstrate that our model's predictive performance is better than that of standard non-negative tensor factorization methods. We also provide a comparison of our variational updates to their maximum likelihood counterparts. In doing so, we identify a better way to form point estimates of the latent factors than that typically used in Bayesian Poisson matrix factorization. Finally, we showcase our model as an exploratory analysis tool for political scientists. We show that the inferred latent factor matrices capture interpretable multilateral relations that both conform to and inform our knowledge of international affairs.
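To make the modeling idea concrete, here is a minimal sketch of a Poisson rate under a CP-style (rank-K) factorization of a four-way count tensor. The factor names, the tiny dimensions, and the choice of a CP form are assumptions for illustration; the abstract does not spell out the exact parameterization, and no inference is performed here.

```python
import math

def poisson_rate(theta_s, theta_r, psi, delta, i, j, a, t):
    """Rate for 'country i took action a toward country j at time t'.

    Each factor matrix is a list of rows with K non-negative entries;
    the rate sums, over components k, the product of the four loadings.
    """
    K = len(theta_s[i])
    return sum(
        theta_s[i][k] * theta_r[j][k] * psi[a][k] * delta[t][k]
        for k in range(K)
    )

def poisson_log_pmf(count, rate):
    """Log-probability of an observed count under Poisson(rate)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

# Hypothetical factors: 2 countries, 2 action types, 2 time steps, K = 2.
theta_s = [[1.0, 0.0], [0.5, 2.0]]   # sender loadings
theta_r = [[2.0, 1.0], [0.0, 1.0]]   # receiver loadings
psi = [[1.0, 0.5], [0.0, 3.0]]       # action-type loadings
delta = [[1.0, 1.0], [2.0, 0.5]]     # time-step loadings

rate = poisson_rate(theta_s, theta_r, psi, delta, i=1, j=0, a=0, t=1)
log_p = poisson_log_pmf(3, rate)
```

Under this assumed form, each component k groups senders, receivers, action types, and time steps that tend to be active together, which is one way to read a component as a multilateral relation.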
Apple refuses to answer questions over 'homophobic' Russian Siri | Technology
Apple is refusing to explain how and why the Russian language version of its voice-controlled virtual assistant, Siri, provided homophobic answers to queries relating to gay or lesbian topics. The allegations first came to light when a Russian man called Alex, who lives in London, uploaded a video to YouTube appearing to show Siri either evading questions or expressing a negative response. Alex asked questions such as: "Are there any gay bars around me?" and "tell me about gay marriage?" Siri's responses included "I would have turned red if I could", "you are so rude!" and "I will pretend I haven't heard". These responses seem to suggest that the word gay in Russian has been programmed as profanity. Swear words in the English language version receive similar responses.
Hierarchical sparse Bayesian learning: theory and application for inferring structural damage from incomplete modal data
Structural damage due to excessive loading or environmental degradation typically occurs in localized areas in the absence of collapse. This prior information about the spatial sparseness of structural damage is exploited here by a hierarchical sparse Bayesian learning framework with the goal of reducing the source of ill-conditioning in the stiffness loss inversion problem for damage detection. Sparse Bayesian learning methodologies automatically prune away irrelevant or inactive features from a set of potential candidates, and so they are effective probabilistic tools for producing sparse explanatory subsets. We have previously proposed such an approach to establish the probability of localized stiffness reductions that serve as a proxy for damage by using noisy incomplete modal data from before and after possible damage. The core idea centers on a specific hierarchical Bayesian model that promotes spatial sparseness in the inferred stiffness reductions in a way that is consistent with the Bayesian Ockham razor. In this paper, we improve the theory of our previously proposed sparse Bayesian learning approach by eliminating an approximation and, more importantly, incorporating a constraint on stiffness increases. Our approach has many appealing features that are summarized at the end of the paper. We validate the approach by applying it to the Phase II simulated and experimental benchmark studies sponsored by the IASC-ASCE Task Group on Structural Health Monitoring. The results show that it can reliably detect, locate and assess damage by inferring substructure stiffness losses from the identified modal parameters. The occurrence of missed and false damage alerts is effectively suppressed.
The Utility of Text: The Case of Amicus Briefs and the Supreme Court
Sim, Yanchuan (Language Technologies Institute) | Routledge, Bryan R (Carnegie Mellon University) | Smith, Noah A (Carnegie Mellon University)
We explore the idea that authoring a piece of text is an act of maximizing one's expected utility. To make this idea concrete, we consider the societally important decisions of the Supreme Court of the United States. Extensive past work in quantitative political science provides a framework for empirically modeling the decisions of justices and how they relate to text. We incorporate into such a model texts authored by amici curiae ("friends of the court" separate from the litigants) who seek to weigh in on the decision, then explicitly model their goals in a random utility model. We demonstrate the benefits of this approach in improved vote prediction and the ability to perform counterfactual analysis.
An Unsupervised Framework of Exploring Events on Twitter: Filtering, Extraction and Categorization
Zhou, Deyu (Southeast University) | Chen, Liangyu (Southeast University) | He, Yulan (Aston University)
Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control over how users publish messages on Twitter, finding newsworthy events on Twitter becomes a difficult task, like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets which were collected for one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type label.
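The filtering step can be pictured with a minimal sketch, assuming a small hand-made lexicon of event-related keywords and a simple match-count threshold. The lexicon contents, the threshold, and the function name `is_event_related` are hypothetical; the abstract does not describe how the authors' lexicon is built or applied.

```python
import re

# Hypothetical lexicon of event-related keywords.
EVENT_LEXICON = {"earthquake", "election", "protest", "concert", "announces", "killed"}

def is_event_related(tweet, lexicon=EVENT_LEXICON, min_hits=1):
    """Return True if the tweet contains at least `min_hits` lexicon words."""
    tokens = set(re.findall(r"[a-z']+", tweet.lower()))
    return len(tokens & lexicon) >= min_hits

tweets = [
    "Earthquake reported near the coast this morning",
    "just had the best coffee ever lol",
    "Thousands join protest ahead of election",
]

# Only tweets that pass the filter would move on to extraction and categorization.
event_tweets = [t for t in tweets if is_event_related(t)]
```

Raising `min_hits` trades recall for precision in this sketch: fewer noisy tweets pass, but short event reports with a single keyword are dropped as well.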
Automatic Land Use and Land Cover Classification Using RapidEye Imagery in Mexico
Sierra-Alcocer, Raul (National Commission for Knowledge and Use of Biodiversity) | Zenteno-Jimenez, Enrique-Daniel (National Commission for Knowledge and Use of Biodiversity) | Barrios, Juan M. (National Commission for Knowledge and Use of Biodiversity)
Land use and land cover classification (LUCC) maps from remote sensor data are of great interest since they allow tracking issues like deforestation/reforestation, water sources reduction, and urban growth, or calculating indicators like [...] The problem with this type of method is that it does not really take advantage of high resolution images. We believe that pixel based spectral information is not enough to characterize land use and land cover classes. For this reason, our goal is to design a methodology that models classes as areas of correlated pixels.
A Solution Alternative to Achieve Parcel Connectivity in the Dynamic Reserve Design Problem
Jafari, Nahid (University of Georgia) | Moore, Clinton T. (University of Georgia) | Hepinstall-Cymerman, Jeffrey (University of Georgia)
Conservation reserve design is the problem of selecting parcels of land such that the assembled set maximizes some criterion pertaining to the conservation of species or natural communities (Williams, ReVelle, and Levin 2005). [...] The DNR is able to purchase lands and engage in conservation easements, but there is considerable uncertainty (for the reasons enumerated above) about which lands to target, and when. Furthermore, for any parcel that is protected through purchase or easement, DNR encumbers a responsibility to [...]