Law
Graph Analytics to Reinforce Anti-fraud Programs
Organizations across industries are adopting graph analytics to reinforce their anti-fraud programs. In this post, we examine three types of fraud that graph analytics can help investigators combat: insurance fraud, credit card fraud, and VAT fraud. In many areas, fraud investigators have at their disposal large datasets in which clues are hidden. These clues are left behind by criminals who, for their part, try to hide their activity behind layers of more or less intricate schemes. To unveil illegal activities, investigators have to connect the pieces of the puzzle to discover evidence of wrongdoing.
A large-scale crowdsourced analysis of abuse against women journalists and politicians on Twitter
Delisle, Laure, Kalaitzis, Alfredo, Majewski, Krzysztof, de Berker, Archy, Marin, Milena, Cornebise, Julien
We report the first, to the best of our knowledge, hand-in-hand collaboration between human rights activists and machine learners, leveraging crowd-sourcing to study online abuse against women on Twitter. On a technical front, we carefully curate an unbiased yet low-variance dataset of labeled tweets, analyze it to account for the variability of abuse perception, and establish baselines, preparing it for release to community research efforts. On a social impact front, this study provides the technical backbone for a media campaign aimed at raising awareness among the public and decision-makers and at elevating the standards expected from social media companies.
How Huawei planned international robot espionage via email
Huawei began building its own phone-testing system, xDeviceRobot, in early 2012. The Chinese company hoped to improve the quality of its mobile hardware, which tended to fail far more often than competitors' devices in third-party trials. In May 2012, Huawei China asked T-Mobile if it could license or flat-out buy the company's phone-testing robot, Tappy, which served as a standard for much of the industry. So, Huawei decided to steal Tappy. Federal prosecutors claim that, after Huawei installed a handful of employees at T-Mobile's headquarters in Bellevue, Washington, Huawei USA and China employees attempted to illegally collect information on Tappy in a year-long espionage campaign that culminated in actual theft.
Weak-lensing shear measurement with machine learning: teaching artificial neural networks about feature noise
Tewes, Malte, Kuntzer, Thibault, Nakajima, Reiko, Courbin, Frédéric, Hildebrandt, Hendrik, Schrabback, Tim
Cosmic shear is a primary cosmological probe for several present and upcoming surveys investigating dark matter and dark energy, such as Euclid or WFIRST. The probe requires an extremely accurate measurement of the shapes of millions of galaxies based on imaging data. Crucially, the shear measurement must address and compensate for a range of interwoven nuisance effects related to the instrument optics and detector, noise, unknown galaxy morphologies, colors, blending of sources, and selection effects. This paper explores the use of supervised machine learning (ML) as a tool to solve this inverse problem. We present a simple architecture that learns to regress shear point estimates and weights via shallow artificial neural networks. The networks are trained on simulations of the forward observing process, and take combinations of moments of the galaxy images as inputs. A challenging peculiarity of this ML application is the combination of the noisiness of the input features and the requirements on the accuracy of the inverse regression. To address this issue, the proposed training algorithm minimizes bias over multiple realizations of individual source galaxies, reducing the sensitivity to properties of the overall sample of source galaxies. Importantly, an observational selection function of these source galaxies can be straightforwardly taken into account via the weights. We first introduce key aspects of our approach using toy-model simulations, and then demonstrate its potential on images mimicking Euclid data. Finally, we analyze images from the GREAT3 challenge, obtaining competitively low shear biases despite the use of a simple training set. We conclude that the further development of ML approaches is of high interest to meet the stringent requirements on the shear measurement in current and future surveys. A demonstration implementation of our technique is publicly available.
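The idea of minimizing bias over multiple realizations, rather than the error of each individual estimate, can be illustrated with a small sketch. The abstract does not give the cost function, so the form below is an assumption: for each simulated case (one source galaxy with a known true shear), predictions are averaged over its noise realizations before being compared with the truth. The function names and the toy numbers are hypothetical.

```python
def mean_square_error(predictions, true_shears):
    """Conventional cost: penalizes the scatter of every single estimate.

    predictions[i][j] is the estimate for case i, noise realization j.
    """
    total, count = 0.0, 0
    for preds, truth in zip(predictions, true_shears):
        for p in preds:
            total += (p - truth) ** 2
            count += 1
    return total / count


def mean_square_bias(predictions, true_shears):
    """Assumed bias-oriented cost: average over realizations first,
    then penalize only the residual offset from the true shear."""
    total = 0.0
    for preds, truth in zip(predictions, true_shears):
        bias = sum(preds) / len(preds) - truth
        total += bias ** 2
    return total / len(true_shears)


# Toy data (hypothetical): two cases, two noise realizations each.
true_shears = [0.02, -0.01]
predictions = [[0.05, -0.01], [0.02, -0.04]]

mse = mean_square_error(predictions, true_shears)
msb = mean_square_bias(predictions, true_shears)
```

In this toy case the individual estimates scatter around the truth, so the conventional error is non-zero, while the per-case averages match the true shears and the bias-oriented cost is essentially zero. An estimator can therefore be noisy per galaxy and still count as accurate under such a cost.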
Noise-tolerant fair classification
Lamy, Alexandre Louis, Zhong, Ziyuan, Menon, Aditya Krishna, Verma, Nakul
Fair machine learning concerns the analysis and design of learning algorithms that do not exhibit systematic bias with respect to some sensitive feature (e.g., race, gender). This subject has received sustained interest in the past few years, with considerable progress on both devising sensible measures of fairness, and means of achieving them. Typically, the latter involves correcting one's learning procedure so that there is no bias on the training sample. However, all such work has operated under the assumption that the sensitive feature available in one's training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of privacy concerns. This poses the question of whether one can still learn fair classifiers in the presence of such noisy sensitive features. In this paper, we answer the question in the affirmative for a widely-used measure of fairness and model of noise. We show that if one measures fairness using the mean-difference score, and sensitive features are subject to noise from the mutually contaminated learning model, then owing to a simple identity we only need to change the desired fairness-tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators. We finally show that our procedure is empirically effective on two case-studies involving sensitive feature censoring.
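The "simple identity" can be illustrated numerically. The sketch below assumes the standard two-group form of mutually contaminated noise, in which each observed group is a mixture of the two true groups with contamination rates alpha and beta. Under that assumption, the mean-difference score measured on noisy groups equals the true score multiplied by (1 - alpha - beta). The abstract does not state this factor, and the function names here are hypothetical.

```python
def mean_difference(rate_group1, rate_group0):
    """Absolute gap in positive-prediction rates between two groups."""
    return abs(rate_group1 - rate_group0)


def noisy_rates(rate_group1, rate_group0, alpha, beta):
    """Positive-prediction rates seen on the noisy groups.

    alpha: fraction of observed group 1 that truly belongs to group 0
    beta:  fraction of observed group 0 that truly belongs to group 1
    (assumed mutual-contamination model)
    """
    noisy1 = (1 - alpha) * rate_group1 + alpha * rate_group0
    noisy0 = beta * rate_group1 + (1 - beta) * rate_group0
    return noisy1, noisy0


def scaled_tolerance(tau, alpha, beta):
    """Tolerance to enforce on noisy data so the true score stays within tau."""
    return tau * (1 - alpha - beta)


# Hypothetical numbers: true rates 0.6 and 0.4, noise rates 0.1 and 0.2.
true_md = mean_difference(0.6, 0.4)
noisy_md = mean_difference(*noisy_rates(0.6, 0.4, alpha=0.1, beta=0.2))
```

Here the true gap of 0.2 shrinks to 0.14 on the noisy groups, which matches 0.2 × (1 - 0.1 - 0.2). Constraining the noisy score to `scaled_tolerance(tau, alpha, beta)` is then equivalent to constraining the true score to `tau`, provided the noise rates are known or estimated.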
Making face recognition less biased doesn't make it less scary
In the past few years, there's been a dramatic rise in the adoption of face recognition, detection, and analysis technology. You're probably most familiar with recognition systems, like Facebook's photo-tagging recommender and Apple's FaceID, which can identify specific individuals. Detection systems, on the other hand, determine whether a face is present at all; and analysis systems try to identify aspects like gender and race. All of these systems are now being used for a variety of purposes, from hiring and retail to security and surveillance. Many people believe that such systems are both highly accurate and impartial.
A Robot Named 'Tappy': Huawei Conspired To Steal T-Mobile's Trade Secrets, Says DOJ
A Justice Department indictment unsealed on Monday details an alleged conspiracy by the Chinese device maker Huawei to steal the details of a T-Mobile robot. Here, a woman uses her smartphone outside a Huawei store in Beijing on Tuesday. But only one of them reads like the script of a slapstick caper movie.
Huawei: inside the twin indictments unveiled by US authorities
The twin criminal indictments against Huawei unveiled by US authorities on Monday are packed with emails and financial transactions allegedly showing how the Chinese technology giant carried out criminal conspiracies. But the finer points of the 23 charges are less important than the overall shot they deliver across China's bows. The US considers Huawei to be an arm of the Chinese state – and its devices to be potential spying equipment for Beijing. Charges that Huawei violated US sanctions on Iran hold the most symbolic significance. They allowed Kirstjen Nielsen, the homeland security secretary, to stress the company's activities had been "detrimental to the security of the United States".
SONYC
Over an 11-month period (May 2016 to April 2017), 51% of all noise complaints in the focus area were related to after-hours construction activity (6 P.M.–7 A.M.), three times the amount in the next category. Note that combining all construction-related complaints adds up to 70% of this sample, highlighting how disruptive to the lives of ordinary citizens this particular category of noise can be. Figure 4c includes SPL values (blue line) at a five-minute resolution for the after-hours period during or immediately preceding a subset of the complaints. Dotted green lines correspond to background levels, computed as the moving average of SPL measurements within a two-hour window. Dotted black lines correspond to SPL values 10 dB above the background, the threshold defined by the city's noise code to indicate potential violations.
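The background and threshold computation described for Figure 4c can be sketched as follows. Several details are assumptions, since the text does not specify them: the window is taken as trailing (the preceding two hours, i.e. 24 samples at five-minute resolution), the dB values are averaged arithmetically, and early samples use a shorter window. The function names are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; uses fewer samples at the start of the series."""
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        averages.append(running / min(i + 1, window))
    return averages


def flag_potential_violations(spl_db, samples_per_window=24, margin_db=10.0):
    """Flag samples whose SPL exceeds the background level by more than margin_db.

    samples_per_window=24 corresponds to two hours of five-minute samples.
    """
    background = moving_average(spl_db, samples_per_window)
    return [spl > bg + margin_db for spl, bg in zip(spl_db, background)]


# Hypothetical series: steady 60 dB, one 75 dB spike, then back to 60 dB.
spl_series = [60.0] * 24 + [75.0, 60.0]
flags = flag_potential_violations(spl_series)
```

With this series only the 75 dB sample is flagged: the background at that point is 60.625 dB, so the threshold sits at 70.625 dB. Because the spike itself enters the average, a sustained loud event gradually raises the background, which is one reason the choice between a trailing and a centered window matters.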