Law
Privacy-Preserving Classification with Secret Vector Machines
Hartmann, Valentin, Modi, Konark, Pujol, Josep M., West, Robert
Today, large amounts of valuable data are distributed among millions of user-held devices, such as personal computers, phones, or Internet-of-things devices. Many companies collect such data with the goal of using it for training machine learning models allowing them to improve their services. However, user-held data is often sensitive, and collecting it is problematic in terms of privacy. We address this issue by proposing a novel way of training a supervised classifier in a distributed setting akin to the recently proposed federated learning paradigm (McMahan et al. 2017), but under the stricter privacy requirement that the server that trains the model is assumed to be untrusted and potentially malicious; we thus preserve user privacy by design, rather than by trust. In particular, our framework, called secret vector machine (SecVM), provides an algorithm for training linear support vector machines (SVM) in a setting in which data-holding clients communicate with an untrusted server by exchanging messages designed to not reveal any personally identifiable information. We evaluate our model in two ways. First, in an offline evaluation, we train SecVM to predict user gender from tweets, showing that we can preserve user privacy without sacrificing classification performance. Second, we implement SecVM's distributed framework for the Cliqz web browser and deploy it for predicting user gender in a large-scale online evaluation with thousands of clients, outperforming baselines by a large margin and thus showcasing that SecVM is practicable in production environments. Overall, this work demonstrates the feasibility of machine learning on data from thousands of users without collecting any personal data. We believe this is an innovative approach that will help reconcile machine learning with data privacy.
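The distributed training of a linear SVM described above can be illustrated with a minimal sketch. It assumes a plain averaged subgradient scheme on the L2-regularized hinge loss; the function names, hyperparameters, and the one-example-per-client layout are hypothetical. It does not reproduce SecVM's privacy-preserving message design, which the abstract does not detail, and the raw subgradients exchanged here would not by themselves protect user data.

```python
def hinge_subgradient(w, x, y, lam):
    """Subgradient of lam/2 * ||w||^2 + max(0, 1 - y * <w, x>) for one example."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    grad = [lam * wi for wi in w]
    if margin < 1:  # example violates the margin, so the hinge term is active
        grad = [g - y * xi for g, xi in zip(grad, x)]
    return grad


def train(clients, dim, lam=0.01, lr=0.1, rounds=200):
    """clients: list of (feature_vector, label) pairs, one per client, label in {-1, +1}."""
    w = [0.0] * dim
    for _ in range(rounds):
        # Each client evaluates the subgradient locally on its own example.
        updates = [hinge_subgradient(w, x, y, lam) for x, y in clients]
        # The server only averages the received vectors and takes a step.
        avg = [sum(col) / len(updates) for col in zip(*updates)]
        w = [wi - lr * gi for wi, gi in zip(w, avg)]
    return w


def predict(w, x):
    """Sign of the linear score, with ties mapped to +1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

In this sketch the last feature of each vector can be fixed to 1.0 to act as a bias term.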
Solving Partial Assignment Problems using Random Clique Complexes
Sharma, Charu, Nathani, Deepak, Kaul, Manohar
We present an alternative formulation of the partial assignment problem as matching random clique complexes, which are higher-order analogues of random graphs, designed to provide a set of invariants that better detect higher-order structure. The proposed method creates random clique adjacency matrices for each k-skeleton of the random clique complexes and matches them, taking into account each point as the affine combination of its geometric neighbourhood. We justify our solution theoretically by analyzing the runtime and storage complexity of our algorithm along with the asymptotic behaviour of the quadratic assignment problem (QAP) that is associated with the underlying random clique adjacency matrices. Experiments on both synthetic and real-world datasets, containing severe occlusions and distortions, provide insight into the accuracy, efficiency, and robustness of our approach. We outperform diverse matching algorithms by a significant margin.
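As a rough illustration of what a clique adjacency matrix could look like, the sketch below enumerates the cliques of a fixed size in a small graph and marks two cliques as adjacent when they share all but one vertex. This adjacency rule, the function names, and the brute-force enumeration are assumptions made for illustration; the abstract does not define the authors' construction, and the matching step is not shown.

```python
from itertools import combinations


def cliques_of_size(n, edges, size):
    """All vertex subsets of the given size whose members are pairwise adjacent.

    Brute force over all subsets, so only suitable for small graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    return [
        c
        for c in combinations(range(n), size)
        if all(frozenset(p) in edge_set for p in combinations(c, 2))
    ]


def clique_adjacency(cliques):
    """0/1 matrix: two cliques of size s are adjacent if they share s - 1 vertices."""
    m = len(cliques)
    adj = [[0] * m for _ in range(m)]
    for i, j in combinations(range(m), 2):
        if len(set(cliques[i]) & set(cliques[j])) == len(cliques[i]) - 1:
            adj[i][j] = adj[j][i] = 1
    return adj
```

For a graph on four vertices made of two triangles glued along an edge, `cliques_of_size(4, edges, 3)` returns the two triangles and `clique_adjacency` links them through their shared edge.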
Towards Interpretable Deep Extreme Multi-label Learning
Kang, Yihuang, Cheng, I-Ling, Mao, Wenjui, Kuo, Bowen, Lee, Pei-Ju
Many machine learning algorithms, such as deep neural networks, have long been criticized as "black boxes": models that cannot explain how they arrive at a decision without further interpretive effort. This problem has raised concerns about trust, safety, nondiscrimination, and other ethical issues in model applications. In this paper, we discuss the machine learning interpretability of a real-world application, eXtreme Multi-label Learning (XML), which involves learning models from annotated data with many pre-defined labels. We propose a two-step XML approach that combines a deep non-negative autoencoder with other multi-label classifiers to tackle different data applications with a large number of labels. Our experimental results show that the proposed approach is able to cope with many-label problems and to provide interpretable label hierarchies and dependencies that help us understand how the model recognizes the presence of objects in an image.
Week in Review: IoT, Security, Auto
Products/Services: Visa agreed to acquire the token and electronic ticketing business of Rambus for $75 million in cash. The business involved is part of the Smart Card Software subsidiary of Rambus. It includes the former Bell ID mobile-payment businesses and the Ecebs smart-ticketing systems for transit providers. Meanwhile, Rambus expanded its CryptoManager Root of Trust product line. "Security is a mission-critical imperative for SoC designs serving virtually every application space," Neeraj Paliwal, vice president of products, cryptography at Rambus, said in a statement.
Council of Europe and Artificial Intelligence
Organised around the three main pillars that constitute the Council of Europe core values, human rights, democracy, and the rule of law, panel discussions addressed the challenges and opportunities of AI development for individuals, for societies, and for the viability of our legal and institutional frameworks, and explored options for ensuring that effective mechanisms of democratic oversight are in place.
Fake videos prompt need for law - Letters The Star Online
TECHNOLOGY has advanced so much that one can now produce or alter audio or video content to show or present something that actually didn't happen. With deepfake technology (which combines "deep learning" with "fake"), one can, for example, superimpose someone's face over another person's to create a video to support his or her own agenda. The video is then circulated online, with disastrous consequences for the victim if the purpose is vile in nature, such as the sex video that is currently doing the rounds on social media in Malaysia. Deepfake is artificial intelligence (AI) at work, and there is little you can do to prevent it from happening to you, as highly-paid Hollywood actress Scarlett Johansson lamented. The subject of a fake porn video, she told the Washington Post (Dec 31, 2018): "The truth is, there is no difference between someone hacking my account or someone hacking the person standing behind me on line at the grocery store's account. It just depends on whether or not someone has the desire to target you. Obviously, if a person has more resources, they may employ various forces to build a bigger wall around their digital identity.
Artificial intelligence boosts Abu Dhabi courts' speed, accuracy
With the new artificial intelligence (AI) system of the Abu Dhabi Judicial Department (ADJD), cases are identified with a high level of accuracy and requests are processed in an efficient and timely manner, said a tech expert. Alaa Youssef, managing director of SAS Middle East, the firm that offered the AI solutions to Abu Dhabi courts, said judiciary systems worldwide are transforming their operations and functions to keep pace with the digital era. "They provide judiciary systems with the capabilities to understand and model their tasks and operations with greater flexibility and accuracy, besides facilitating efficiency and consistency in the overall judicial practice," said Youssef. He pointed out that the goal of introducing the AI system at ADJD was to reduce the time spent on decision-making. The tech expert explained that the judicial department's engagement with SAS was carried out in three main phases: phase one was based on creating visualisations, which involved viewing the operational performance of the organisation, gaining performance insights, and ad-hoc analysis. Phase two was about more complex data governance. The third phase was the application of AI and machine learning models to real-world business challenges within the judicial system. "We have been able to tap into huge reserves of data about individuals that is collected by the judicial department.
Artificial Intelligence and Law: An Overview by Harry Surden :: SSRN
Much has been written recently about artificial intelligence (AI) and law. But what is AI, and what is its relation to the practice and administration of law? The discussion aims to be nuanced but also understandable to those without a technical background. To that end, I first discuss AI generally. I then turn to AI and how it is being used by lawyers in the practice of law, people and companies who are governed by the law, and government officials who administer the law.
Unsupervised predictive coding models may explain visual brain representation
Deep predictive coding networks are neuroscience-inspired unsupervised learning models that learn to predict future sensory states. We build upon the PredNet implementation by Lotter, Kreiman, and Cox (2016) to investigate whether predictive coding representations are useful for predicting brain activity in the visual cortex. We use representational similarity analysis (RSA) to compare PredNet representations to functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) data from the Algonauts Project (Cichy et al., 2019). In contrast to previous findings in the literature (Khaligh-Razavi & Kriegeskorte, 2014), we report empirical data suggesting that unsupervised models trained to predict frames of videos may outperform supervised image classification baselines in terms of correlation to spatial (fMRI) data. Our best submission achieves average noise-normalized correlation scores of 16.67% and 27.67% on the fMRI and MEG tracks of the Algonauts Challenge, respectively.
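The core of a representational similarity analysis can be sketched as follows. This is a generic, minimal version that assumes each stimulus is described by a list of responses (model activations or brain measurements), uses 1 minus Pearson correlation as the dissimilarity, and compares the two dissimilarity matrices by Spearman rank correlation. The function names are hypothetical, and the noise normalization used for the challenge scores is not reproduced here.

```python
from itertools import combinations
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for every stimulus pair."""
    pairs = combinations(range(len(patterns)), 2)
    return [1 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def rsa_score(model_patterns, brain_patterns):
    """Spearman correlation between the two upper-triangle dissimilarity vectors."""
    return pearson(ranks(rdm_upper(model_patterns)), ranks(rdm_upper(brain_patterns)))
```

Both inputs must describe the same stimuli in the same order; the number of features per stimulus may differ between the model and the brain data.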
Patent Claim Generation by Fine-Tuning OpenAI GPT-2
In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.