Raising standards for global data-sharing

Science

In their Policy Forum “How to fix the GDPR's frustration of global biomedical research” (2 October 2020, p. [40][1]), J. Bovenberg et al. argue that the biomedical research community has struggled to share data outside the European Union as a result of the EU's General Data Protection Regulation (GDPR), which strictly limits the international transfer of personal data. However, they do not acknowledge the law's flexibility, and their solutions fail to recognize the importance of multilateral efforts to raise standards for global data-sharing.

Bovenberg et al. express concern about the thwarting of “critical data flows” in biomedical research. However, the limited number of critical commentaries ([ 1 ][2], [ 2 ][3]) and registered complaints ([ 3 ][4]) indicates that hindered data exchange may not be a substantial global problem. Moreover, the authors concede that during the COVID-19 pandemic, data transfers remain ongoing because transfers “necessary for important reasons of public interest” are already provided for in the law [([ 4 ][5]), Article 49(1)(d)]. The European Data Protection Board (EDPB) has cautioned that transfers under this derogation shall not become the rule in practice ([ 5 ][6]), but this conditional support for international COVID-19 data sharing shows that the law already provides suitable flexibility. This flexibility also reflects the EDPB's recognition of the pressing social need that biomedical research represents for the global research community during the COVID-19 pandemic, while seeking to ensure that such transfers remain the exception and not the beginning of a normalized practice.

Bovenberg et al. contend that pseudonymized data should not be considered personal data in the hands of an entity that does not possess the key needed for re-identification. This proposal runs against well-established guidance in EU member states such as Ireland ([ 6 ][7]) and Germany ([ 7 ][8]), and it does not take into account the cases in which identifiers remain attached to transferred biomedical data or in which data could be identified without a key. Bovenberg et al. also neglect to state that the GDPR has special principles and safeguards for particularly sensitive re-identifiable data, not just for the protection of privacy but also for the security and integrity of health research data—aims that align with all high-quality scientific research. Respecting these standards (both technical and organizational) is fundamental to ensuring better data security and accuracy in the transfer of the huge datasets of sensitive health data that are essential to global collaboration [([ 4 ][5]), Articles 5 and 9, Recitals 53 and 54, and ([ 8 ][9])]. Thus, these rules should not be subject to exemptions, which would result from not classifying pseudonymized data as personal data.

The purpose of the GDPR's strict rules is to ensure that when personal data are transferred to non-EU countries, the level of protection ensured in the European Union is not undermined. The EU Court of Justice's decisions ([ 9 ][10], [ 10 ][11]) make it clear that ensuring an adequate level of protection in non-EU countries, especially independent oversight and judicial remedies—which the Court found lacking in the United States—is a matter of fundamental rights. This discrepancy is an opportunity for non-EU countries, including the United States, to raise their data protection standards to the level of the European Union's, not for the European Union to decrease its own standards in a regulatory race to the bottom.
We encourage research organizations and country delegations to work with the European Commission, national data protection authorities, and the EDPB to craft interoperable rules on data sharing applicable to biomedical research in ways that do not undermine the fundamental rights owed to data subjects.

1. R. Eiss, Nature 584, 498 (2020).
2. R. Becker et al., J. Med. Internet Res. 22, e19799 (2020).
3. A. Jelinek, EDPB response letter to Mark W. Libby, Chargé d'Affaires, United States Mission to the European Union (2020); [https://edpb.europa.eu/sites/edpb/files/files/file1/edpb\_letter\_out2020-0029\_usmission\_covid19.pdf][17].
4. GDPR (2016); .
5. EDPB, “Guidelines 03/2020 on the processing of data concerning health for the purpose of scientific research in the context of the COVID-19 outbreak” (2020).
6. Data Protection Commission, “Guidance on Anonymisation and Pseudonymisation” (2019); [www.dataprotection.ie/sites/default/files/uploads/2019-06/190614%20Anonymisation%20and%20Pseudonymisation.pdf][21].
7. German Federal Ministry of the Interior, Building and Community, “Draft for a Code of Conduct on the use of GDPR compliant pseudonymisation” (2019); [www.gdd.de/downloads/aktuelles/whitepaper/Data\_Protection\_Focus\_Group-Draft\_CoC\_Pseudonymisation\_V1.0.pdf][23].
8. D. Anderson et al., Int. Data Privacy L. 10, 180 (2020).
9. Case C-362/14, Maximilian Schrems v. Data Protection Commissioner (Court of Justice of the EU, 2015).
10. Case C-311/18, Data Protection Commissioner v. Facebook Ireland Limited and Maximillian Schrems (Court of Justice of the EU, 2020).
[1]: http://www.sciencemag.org/content/370/6512/40
[2]: #ref-1
[3]: #ref-2
[4]: #ref-3
[5]: #ref-4
[6]: #ref-5
[7]: #ref-6
[8]: #ref-7
[9]: #ref-8
[10]: #ref-9
[11]: #ref-10
[12]: #xref-ref-1-1 "View reference 1 in text"
[13]: {openurl}?query=rft.jtitle%253DNature%26rft.volume%253D584%26rft.spage%253D498%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx
[14]: #xref-ref-2-1 "View reference 2 in text"
[15]: {openurl}?query=rft.jtitle%253DJ.%2BMed.%2BInternet%2BRes.%26rft.volume%253D22%26rft.spage%253De19799%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx
[16]: #xref-ref-3-1 "View reference 3 in text"
[17]: https://edpb.europa.eu/sites/edpb/files/files/file1/edpb_letter_out2020-0029_usmission_covid19.pdf
[18]: #xref-ref-4-1 "View reference 4 in text"
[19]: #xref-ref-5-1 "View reference 5 in text"
[20]: #xref-ref-6-1 "View reference 6 in text"
[21]: http://www.dataprotection.ie/sites/default/files/uploads/2019-06/190614%20Anonymisation%20and%20Pseudonymisation.pdf
[22]: #xref-ref-7-1 "View reference 7 in text"
[23]: http://www.gdd.de/downloads/aktuelles/whitepaper/Data_Protection_Focus_Group-Draft_CoC_Pseudonymisation_V1.0.pdf
[24]: #xref-ref-8-1 "View reference 8 in text"
[25]: {openurl}?query=rft.jtitle%253DInt.%2BData%2BPrivacy%2BL.%26rft.volume%253D10%26rft.spage%253D180%26rft.genre%253Darticle%26rft_val_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Ajournal%26ctx_ver%253DZ39.88-2004%26url_ver%253DZ39.88-2004%26url_ctx_fmt%253Dinfo%253Aofi%252Ffmt%253Akev%253Amtx%253Actx
[26]: #xref-ref-9-1 "View reference 9 in text"
[27]: #xref-ref-10-1 "View reference 10 in text"


Query-free Black-box Adversarial Attacks on Graphs

arXiv.org Artificial Intelligence

Many graph-based machine learning models are known to be vulnerable to adversarial attacks, where even limited perturbations on input data can result in dramatic performance deterioration. Most existing works focus on moderate settings in which the attacker either knows the model structure and parameters (white-box) or is able to send queries to fetch model information. In this paper, we propose a query-free black-box adversarial attack on graphs, in which the attacker has no knowledge of the target model and no query access to it. With the mere observation of the graph topology, the proposed attack strategy flips a limited number of links to mislead the graph models. We prove that the impact of the flipped links on the target model can be quantified by spectral changes, and thus be approximated using eigenvalue perturbation theory. Accordingly, we model the proposed attack strategy as an optimization problem and adopt a greedy algorithm to select the links to be flipped. Due to its simplicity and scalability, the proposed model not only applies generically to various graph-based models but can also be easily extended when different levels of knowledge are accessible. Extensive experiments demonstrate the effectiveness and efficiency of the proposed model on various downstream tasks, as well as on several different graph-based learning models.
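The pipeline in this abstract can be sketched roughly as follows, assuming an undirected graph given as a dense adjacency matrix. The scoring rule uses the standard first-order eigenvalue perturbation result (a symmetric change ΔA shifts an eigenvalue with eigenvector u by roughly uᵀΔAu), while the budget, the `k` parameter, and the greedy loop below are our illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def greedy_flip_attack(adj, budget, k=1):
    """Greedily flip links, scoring each candidate by the first-order
    eigenvalue perturbation it induces: for a symmetric adjacency A with
    eigenpair (lam, u), changing entries (i, j) and (j, i) by delta
    shifts the eigenvalue by roughly 2 * delta * u[i] * u[j]."""
    adj = adj.astype(float).copy()
    n = adj.shape[0]
    candidates = [(i, j) for i in range(n) for j in range(i + 1, n)]
    flips = []
    for _ in range(budget):
        _, vecs = np.linalg.eigh(adj)
        top = vecs[:, -k:]                   # leading k eigenvectors
        best, best_score = None, -1.0
        for i, j in candidates:
            delta = 1.0 - 2.0 * adj[i, j]    # +1 adds a link, -1 removes one
            score = np.abs(2.0 * delta * top[i] * top[j]).sum()
            if score > best_score:
                best, best_score = (i, j), score
        i, j = best
        adj[i, j] = adj[j, i] = 1.0 - adj[i, j]
        candidates.remove(best)
        flips.append(best)
    return adj, flips
```

Because only the topology is used, the attacker needs neither model parameters nor query access, matching the query-free black-box setting.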


Achieving Security and Privacy in Federated Learning Systems: Survey, Research Challenges and Future Directions

arXiv.org Artificial Intelligence

Federated learning (FL) allows a server to learn a machine learning (ML) model across multiple decentralized clients that privately store their own training data. In contrast with centralized ML approaches, FL reduces computation at the server and does not require the clients to outsource their private data to it. However, FL is not free of issues. On the one hand, the model updates sent by the clients at each training epoch might leak information on the clients' private data. On the other hand, the model learnt by the server may be subjected to attacks by malicious clients; these security attacks might poison the model or prevent it from converging. In this paper, we first examine security and privacy attacks on FL and critically survey the solutions proposed in the literature to mitigate each attack. Afterwards, we discuss the difficulty of simultaneously achieving security and privacy protection. Finally, we sketch ways to tackle this open problem and attain both security and privacy.
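The training loop the abstract describes, where clients fit locally on private data and the server aggregates only their model updates, can be illustrated with a minimal federated-averaging round for a linear least-squares model (our simplified example, not a construction from the paper):

```python
import numpy as np

def fed_avg_round(global_w, client_data, lr=0.1, local_epochs=5):
    """One round of federated averaging: each client starts from the
    global weights, takes gradient steps on its own private (X, y), and
    the server averages the returned weights by local dataset size.
    Raw data never leaves a client; only the updated weights do."""
    local_ws, sizes = [], []
    for X, y in client_data:
        w = global_w.copy()
        for _ in range(local_epochs):
            grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
            w -= lr * grad
        local_ws.append(w)
        sizes.append(len(y))
    return np.average(local_ws, axis=0, weights=sizes)
```

It is exactly these per-round weight updates that can leak information about client data and that malicious clients can poison, the two threat classes the survey examines.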


I-GCN: Robust Graph Convolutional Network via Influence Mechanism

arXiv.org Machine Learning

Deep learning models for graphs, especially Graph Convolutional Networks (GCNs), have achieved remarkable performance in the task of semi-supervised node classification. However, recent studies show that GCNs suffer from adversarial perturbations. Such vulnerability to adversarial attacks significantly decreases the stability of GCNs when applied to security-critical applications. Defense methods such as preprocessing, attention mechanisms and adversarial training have been discussed in various studies. While able to achieve desirable performance when perturbation rates are low, such methods remain vulnerable to high perturbation rates. Meanwhile, some defense algorithms perform poorly when the node features are not visible. Therefore, in this paper, we propose a novel mechanism called the influence mechanism, which significantly enhances the robustness of GCNs. The influence mechanism divides the effect of each node into two parts: introverted influence, which tries to maintain the node's own features, and extroverted influence, which the node exerts on other nodes. Utilizing the influence mechanism, we propose the Influence GCN (I-GCN) model. Extensive experiments show that our proposed model achieves higher accuracy than state-of-the-art methods when defending against non-targeted attacks.
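One way to picture the introverted/extroverted split described above is a node update that mixes a term computed from the node's own features with a term aggregated from its neighbors. The layer below is our illustrative reading of that idea, not the paper's actual formulation; all names and the mixing weight `alpha` are our assumptions:

```python
import numpy as np

def influence_layer(X, A, W_self, W_nbr, alpha=0.5):
    """Hypothetical node update separating the two influences: an
    introverted term that preserves a node's own features, and an
    extroverted term carrying the mean influence of its neighbors."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)   # avoid divide-by-zero
    introverted = X @ W_self                         # keep own features
    extroverted = (A @ X / deg) @ W_nbr              # mean neighbor influence
    return np.maximum(alpha * introverted + (1 - alpha) * extroverted, 0)  # ReLU
```

Keeping the introverted term separate means a node's own representation is never fully overwritten by (possibly perturbed) neighbors, which is one intuition for the claimed robustness.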


Risk Management Framework for Machine Learning Security

arXiv.org Artificial Intelligence

Adversarial attacks on machine learning models have become a highly studied topic in both academia and industry. These attacks, along with traditional security threats, can compromise the confidentiality, integrity, and availability of an organization's assets that depend on the usage of machine learning models. While it is not easy to predict the types of new attacks that might be developed over time, it is possible to evaluate the risks connected to using machine learning models and to design measures that help minimize these risks. In this paper, we outline a novel framework to guide the risk management process for organizations reliant on machine learning models. First, we define sets of evaluation factors (EFs) in the data domain, model domain, and security controls domain. We develop a method that takes asset and task importance as input, sets the weights of each EF's contribution to confidentiality, integrity, and availability, and, based on the EFs' implementation scores, determines the overall security state of the organization. Based on this information, it is possible to identify weak links in the implemented security measures and to find out which measures might be missing completely. We believe our framework can help address the security issues related to the usage of machine learning models in organizations and guide them in focusing on adequate security measures to protect their assets.
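A minimal sketch of the kind of scoring the framework describes follows; the field names, the 0-to-1 score scale, and the 0.5 weak-link threshold are our assumptions, not details from the paper:

```python
def security_state(factors, importance=1.0):
    """Aggregate evaluation factors (EFs) into per-property scores.
    Each EF carries an implementation score in [0, 1] and weights for
    its contribution to confidentiality (C), integrity (I), and
    availability (A); the state per property is the importance-scaled
    weighted average, and low-scoring EFs surface as weak links."""
    state = {}
    for prop in ("C", "I", "A"):
        total = sum(f["weights"][prop] for f in factors)
        hit = sum(f["weights"][prop] * f["score"] for f in factors)
        state[prop] = importance * hit / total if total else 0.0
    weak_links = [f["name"] for f in factors if f["score"] < 0.5]
    return state, weak_links
```

A property whose weights are carried mostly by low-scoring EFs ends up with a low state score, pointing at the measures that are weak or missing, which is the diagnostic use the abstract describes.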


Divers just found this World War II Enigma machine dumped on the seabed

ZDNet

Underwater archeologists sponsored by the World Wide Fund for Nature (WWF) have found an Enigma machine at the bottom of the Baltic Sea, likely from a submarine that Germany scuttled at the end of World War II. The divers made the discovery while using a sonar device to search the seabed for abandoned fishing nets, which can be harmful to sea life. Enigma machines, created in Nazi Germany in the 1930s and 1940s, were used to encode military messages; these codes were finally broken by the experts assembled by the British at Bletchley Park, work which fueled the creation of modern computers. Earlier this year, a four-rotor M4 Enigma cipher machine sold at auction for £347,250 ($437,955). That machine, however, was in pristine condition, while the rusty, barnacle-covered one found in the Baltic Sea has been deformed by decades spent in salt water.


WWII: Enigma machine used by the Nazis to send secret messages found in the Baltic Sea

Daily Mail - Science & tech

Divers recovered the device at the bottom of Gelting Bay, on Germany's northern coast, while working to remove abandoned fishing nets that threaten marine life. Designed shortly after WWI by the engineer Arthur Scherbius for commercial use, the cipher engine was adopted by many national governments and militaries. The portable device is best known for its use by the Axis powers to encode military commands for safe transmission by radio as part of their rapid 'blitzkrieg' strategy. Enigma featured a number of wheels which together formed an electric circuit that scrambled each entered character, with the circuit reconfiguring after every letter. German military models -- made more complex through the addition of a plugboard for added scrambling -- and their codebooks were highly sought by the Allies.
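The stepping behaviour described here can be illustrated with a toy single-rotor scrambler, kept deliberately minimal; the wiring, seed, and stepping rule below are our simplifications, not a faithful Enigma model (a real machine chained several rotors through a reflector and a plugboard):

```python
import random

def make_rotor(seed=42):
    """Build a fixed random wiring: a permutation of the 26 letters."""
    rng = random.Random(seed)
    perm = list(range(26))
    rng.shuffle(perm)
    return perm

def enigma_like(text, rotor, offset=0):
    """Toy single-rotor scrambler (uppercase A-Z only): the wiring's
    entry point shifts, i.e. the rotor 'steps', after every keypress,
    so the electric circuit reconfigures for each letter."""
    out = []
    for ch in text.upper():
        i = (ord(ch) - 65 + offset) % 26        # enter the shifted wiring
        out.append(chr((rotor[i] - offset) % 26 + 65))
        offset += 1                             # step the rotor
    return "".join(out)
```

Because the offset advances on every keypress, typing the same letter repeatedly generally yields different ciphertext letters, the property that defeated simple frequency analysis and forced the Bletchley Park effort described above.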


A Survey on Data Pricing: from Economics to Data Science

arXiv.org Artificial Intelligence

How can we assess the value of data objectively, systematically and quantitatively? Pricing data, or information goods in general, has been studied and practiced in dispersed areas and principles, such as economics, marketing, electronic commerce, data management, data mining and machine learning. In this article, we present a unified, interdisciplinary and comprehensive overview of this important direction. We examine various motivations behind data pricing, understand the economics of data pricing and review the development and evolution of pricing models according to a series of fundamental principles. We discuss both digital products and data products. We also consider a series of challenges and directions for future work.


How artificial intelligence is changing cyber security

#artificialintelligence

Having contact with someone who has a cold increases the chances that you might pick up the bug yourself. In much the same way, businesses adding more connectivity into their system increases the opportunities for cyber criminals to introduce viruses into the system. Here, Sophie Hand, UK Country Manager at EU Automation, explains the role Artificial Intelligence (AI) can play in combatting cyber crime. Even with the most effective preventative measures in place, cyber criminals try to get around them. It is unlikely that we will ever completely eradicate cyber threats because hackers are intelligent and tenacious, always searching for new ways to breach a company's defences.


Darktrace answers The Vatican's prayers

#artificialintelligence

The Vatican Library, which holds one of the oldest and most significant collections of historical texts in the world, is deploying AI technology from Cambridge's Darktrace to protect it against cyber-attacks. Founded in 1451 by Nicholas V, the Vatican Library holds invaluable documents from across history including letters from ancient figures, drawings and writings from Michelangelo and Galileo and the oldest surviving copy of the Bible. Eight years ago, the Library embarked on a project to digitise over 80,000 of these documents in order to help preserve the content for posterity, and broaden access to new audiences and academics. As a result, however, the collections have become vulnerable to cyber-attacks and securing their digital versions is paramount. The Vatican Library chose Darktrace Immune System technology to protect against a range of potential attackers that pose a risk to these priceless works and writings.