Accuracy
Detroit police chief cops to 96-percent facial recognition error rate
Detroit's police chief admitted on Monday that facial recognition technology used by the department misidentifies suspects about 96 percent of the time. It's an eye-opening admission given that the Detroit Police Department is facing criticism for arresting a man based on a bogus match from facial recognition software. Last week, the ACLU filed a complaint with the Detroit Police Department on behalf of Robert Williams, a Black man who was wrongfully arrested for stealing five watches worth $3,800 from a luxury retail store. Investigators first identified Williams by doing a facial recognition search with software from a company called DataWorks Plus. Under police questioning, Williams pointed out that the grainy surveillance footage obtained by police didn't actually look like him.
Hawaii Is Finally Making It Easier for Tourists to Visit. Is That Smart?
Hawaii is ready for its midpandemic tourism boom. Starting on Aug. 1, tourists looking to visit Hawaii will be able to bypass the state's two-week quarantine requirement for arrivals by getting a negative COVID-19 test within 72 hours before landing in the state. Visitors can also have their quarantines cut short if they receive negative test results during those two weeks. The same rules will also apply to residents returning to the islands. Hawaii won't pay for the tests; travelers will have to handle that themselves before departure, though screeners will still administer temperature checks at airports.
Examining Redundancy in the Context of Safe Machine Learning
Doran, Hans Dermot, Reif, Monika
This paper describes a set of experiments with neural network classifiers on the MNIST database of digits. The purpose is to investigate naïve implementations of redundant architectures as a first step towards safe and dependable machine learning. We report on a set of measurements using the MNIST database which ultimately serve to underline the expected difficulties in using NN classifiers in safe and dependable systems.
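The abstract does not say which redundant architectures were tested. As a minimal sketch of one common pattern, assuming a hypothetical 2-out-of-3 voter over the digit labels predicted by three independently trained classifiers, the voting step could look like this (the function name `vote` and the parameter `min_agreement` are illustrative and not taken from the paper):

```python
from collections import Counter
from typing import Optional, Sequence


def vote(predictions: Sequence[int], min_agreement: int) -> Optional[int]:
    """Return the label that at least `min_agreement` channels agree on, else None."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical outputs of three redundant classifiers for the same image.
agreed = vote([7, 7, 1], min_agreement=2)    # two channels agree on 7
rejected = vote([7, 1, 3], min_agreement=2)  # no majority, so no label is returned
```

Returning `None` instead of a best guess lets the surrounding system treat disagreement as a detected fault. Voting of this kind only helps when the channels do not make the same mistake on the same input.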
High-recall causal discovery for autocorrelated time series with latent confounders
Gerhardus, Andreas, Runge, Jakob
We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and variants suffer from low recall in the autocorrelated time series case and identify low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation. Our method also covers causal discovery for non-time series data as a special case. We provide Python code for all methods involved in the simulation studies.
A Mean-Field Theory for Learning the Schönberg Measure of Radial Basis Functions
Khuzani, Masoud Badiei, Ye, Yinyu, Napel, Sandy, Xing, Lei
We develop and analyze a projected particle Langevin optimization method to learn the distribution in the Schönberg integral representation of the radial basis functions from training samples. More specifically, we characterize a distributionally robust optimization method with respect to the Wasserstein distance to optimize the distribution in the Schönberg integral representation. To provide theoretical performance guarantees, we analyze the scaling limits of a projected particle online (stochastic) optimization method in the mean-field regime. In particular, we prove that in the scaling limits, the empirical measure of the Langevin particles converges to the law of a reflected Itô diffusion-drift process. Moreover, the drift is also a function of the law of the underlying process. Using Itô's lemma for semi-martingales and Girsanov's change of measure for the Wiener processes, we then derive a McKean-Vlasov type partial differential equation (PDE) with Robin boundary conditions that describes the evolution of the empirical measure of the projected Langevin particles in the mean-field regime. In addition, we establish the existence and uniqueness of the steady-state solutions of the derived PDE in the weak sense. We apply our learning approach to train radial kernels in the kernel locality-sensitive hash (LSH) functions, where the training dataset is generated via a $k$-means clustering method on a small subset of the database. We subsequently apply our kernel LSH with a trained kernel to an image retrieval task on the MNIST dataset, and demonstrate the efficacy of our kernel learning approach. We also apply our kernel learning approach in conjunction with kernel support vector machines (SVMs) for classification of benchmark datasets.
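For orientation, the Schönberg integral representation that the abstract refers to is commonly stated as below. The symbols $K$, $\mu$, and $t$ are notation chosen here rather than taken from the abstract, so this is a sketch of the standard result and not necessarily the paper's exact formulation:

```latex
K(x, y) \;=\; \int_{0}^{\infty} e^{-t \,\lVert x - y \rVert_2^{2}} \, \mathrm{d}\mu(t),
\qquad x, y \in \mathbb{R}^{d},
```

where $\mu$ is a finite nonnegative Borel measure on $[0, \infty)$. Under this reading, learning the kernel amounts to learning $\mu$; a Gaussian kernel is the special case in which $\mu$ is a point mass at a single value of $t$.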
Feasibility of blood testing combined with PET-CT to screen for cancer and guide intervention
Cancers diagnosed early are often more responsive to treatment. Blood tests that detect molecular markers of cancer have successfully identified individuals already known to have the disease. Lennon et al. conducted an exploratory study that more closely reflects the way in which such blood tests would be used in the future. They evaluated the feasibility and safety of incorporating a multicancer blood test into the routine clinical care of 10,000 women with no history of cancer. Over a 12-month period, the blood test detected 26 cancers of different types. A combination of the blood test and positron emission tomography–computed tomography (PET-CT) imaging led to surgical removal of nine of these cancers. Use of the blood test did not result in a large number of futile follow-up procedures. Science, this issue p. [eabb9601][1]
### INTRODUCTION
The goal of earlier cancer detection is to identify the disease at a stage when it can be effectively treated, thereby offering the patient a better chance of long-term survival. Adherence to screening modalities known to decrease cancer mortality such as colonoscopy, mammography, low-dose computed tomography, and Pap smears varies widely. Moreover, the majority of cancer types are diagnosed only when symptoms occur. Multicancer blood tests offer the exciting possibility of detecting many cancer types at a relatively early stage and in a minimally invasive manner.
### RATIONALE
Evaluation of the feasibility and safety of multicancer blood testing requires prospective interventional studies. We designed such a study to answer four critical questions:
(i) Can a multicancer blood test detect cancers not previously detected by other means?
(ii) Can a positive test result lead to surgical intervention with curative intent?
(iii) Can testing be incorporated into routine clinical care and not discourage patients from undergoing recommended screening tests such as mammography?
(iv) Can testing be performed safely, without incurring a large number of unnecessary, invasive follow-up tests?
### RESULTS
We evaluated a blood test that detects DNA mutations and protein biomarkers of cancer in a prospective, interventional study of 10,006 women who were 65 to 75 years old and who had no prior history of cancer. Positive blood tests were followed by diagnostic positron emission tomography–computed tomography (PET-CT), which served to independently confirm and precisely localize the site and extent of disease if present. The study design incorporated several features to maximize the safety of testing to the participants. Of the 10,006 enrollees, 9911 (99.1%) could be assessed with respect to the four questions posed above.
(i) Detection: Of 96 cancers incident during the study period, 26 were first detected by blood testing and 24 additional cancers by conventional screening. Fifteen of the 26 patients in whom cancer was first detected by blood testing underwent PET-CT imaging, and 11 patients developed signs or symptoms of cancer after the blood test that led to imaging procedures other than PET-CT. The specificity and positive predictive value (PPV) of blood testing alone were 98.9% and 19.4%, respectively, and combined with PET-CT, the specificity and PPV increased to 99.6% and 28.3%. The blood test first detected 14 of 45 cancers (31%) in seven organs for which no standard-of-care screening test is available.
(ii) Intervention: Of the 26 cancers first detected by blood testing, 17 (65%) had localized or regional disease. Of the 15 participants with positive blood tests as well as positive PET-CT scans, 9 (60%) underwent surgery with curative intent.
(iii) Incorporation into clinical care: Blood testing could be combined with conventional screening, leading to detection of more than half of the total incident cancers observed during the study period. Blood testing did not deter participants from undergoing mammography, and surveys revealed that 99% of participants would join a similar, subsequent study if offered.
(iv) Safety: 99% of participants did not require any follow-up of blood testing results, and only 0.22% underwent an unnecessary invasive diagnostic procedure as a result of a false-positive blood test.
### CONCLUSION
A minimally invasive blood test in combination with PET-CT can safely detect and precisely localize several types of cancers in individuals not previously known to have cancer, in some cases enabling treatment with intent to cure. Further studies will be required to assess the clinical utility, risk-benefit ratio, and cost-effectiveness of such testing.
Figure: Overview of cancers detected by blood testing. Twenty-six cancers (blue dots) in 10 organs were first detected by blood testing. The blue dots with the red halo represent 12 of the 26 cancers that were surgically treated with intent to cure. Nine of these 12 were detected by the combination of the blood test and PET-CT, with the remaining three identified by the blood test combined with another imaging modality.
Cancer treatments are often more successful when the disease is detected early. We evaluated the feasibility and safety of multicancer blood testing coupled with positron emission tomography–computed tomography (PET-CT) imaging to detect cancer in a prospective, interventional study of 10,006 women not previously known to have cancer. Positive blood tests were independently confirmed by a diagnostic PET-CT, which also localized the cancer. Twenty-six cancers were detected by blood testing. Of these, 15 underwent PET-CT imaging and nine (60%) were surgically excised. Twenty-four additional cancers were detected by standard-of-care screening and 46 by neither approach. One percent of participants underwent PET-CT imaging based on false-positive blood tests, and 0.22% underwent a futile invasive diagnostic procedure. These data demonstrate that multicancer blood testing combined with PET-CT can be safely incorporated into routine clinical care, in some cases leading to surgery with intent to cure.
[1]: /lookup/doi/10.1126/science.abb9601
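The specificity and PPV figures quoted for blood testing alone can be sanity-checked with a short calculation. The sketch below is a back-of-the-envelope consistency check, not the study's own analysis: it assumes the 26 cancers first detected by blood testing are the true positives, that the 9911 assessable participants form the denominator, and that all 96 incident cancers fall within that group. The false-positive count it derives is implied by the reported PPV and is not a number stated in the abstract.

```python
def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of positive test results that are true cancers."""
    return tp / (tp + fp)


def specificity(tn: int, fp: int) -> float:
    """Fraction of cancer-free participants who test negative."""
    return tn / (tn + fp)


# Figures quoted in the abstract.
assessable = 9911
incident_cancers = 96
true_positives = 26
reported_ppv = 0.194

# Rearranging PPV = TP / (TP + FP) gives FP = TP * (1 - PPV) / PPV.
implied_false_positives = round(true_positives * (1 - reported_ppv) / reported_ppv)

# Assumption: everyone assessable without an incident cancer is cancer-free.
cancer_free = assessable - incident_cancers
true_negatives = cancer_free - implied_false_positives

ppv_pct = round(100 * positive_predictive_value(true_positives, implied_false_positives), 1)
spec_pct = round(100 * specificity(true_negatives, implied_false_positives), 1)
print(implied_false_positives, ppv_pct, spec_pct)
```

Under these assumptions the implied false-positive count is about 108, which reproduces both the 19.4% PPV and the 98.9% specificity. The low PPV alongside high specificity reflects how rare cancer is in the screened group: even a small false-positive rate yields more false alarms than true detections.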
Measuring the performance of a Classification problem
It is often convenient to combine precision and recall into a single metric called the F1 score, in particular, if you need a simple way to compare two classifiers. The F1 score is the harmonic mean of precision and recall. The F1 score favors classifiers that have similar precision and recall. This is not always what you want: in some contexts, you mostly care about precision, and in other contexts, you really care about the recall. For example, if you trained a classifier to detect videos that are safe for kids, you would probably prefer a classifier that rejects many good videos (low recall) but keeps only safe ones (high precision), rather than a classifier that has a much higher recall but lets a few really bad videos show up in your product (in such cases, you may even want to add a human pipeline to check the classifier's video selection). On the other hand, suppose you train a classifier to detect shoplifters on surveillance images: it is probably fine if your classifier has only 30% precision as long as it has 99% recall (sure, the security guards will get a few false alerts, but almost all shoplifters will get caught).
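As a small illustration of the harmonic mean described above, the sketch below computes F1 for the shoplifter example (30% precision, 99% recall) and compares it with a balanced classifier that has the same arithmetic mean. The function name `f1_score` is chosen here for illustration and is a plain-Python helper, not a library call.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Shoplifter detector from the text: low precision, very high recall.
shoplifter_f1 = f1_score(0.30, 0.99)
shoplifter_mean = (0.30 + 0.99) / 2

# A balanced classifier with the same arithmetic mean of precision and recall.
balanced_f1 = f1_score(0.645, 0.645)

print(f"shoplifter F1={shoplifter_f1:.3f}, arithmetic mean={shoplifter_mean:.3f}")
print(f"balanced F1={balanced_f1:.3f}")
```

The unbalanced detector scores an F1 of roughly 0.46 while the balanced one scores 0.645, even though their arithmetic means are identical. This is the sense in which F1 favors similar precision and recall, and why it can undervalue a classifier that is deliberately tuned toward recall.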
Predictive Analytics for Water Asset Management: Machine Learning and Survival Analysis
Rahbaralam, Maryam, Modesto, David, Cardús, Jaume, Abdollahi, Amir, Cucchietti, Fernando M
Understanding performance and prioritizing resources for the maintenance of the drinking-water pipe network throughout its life-cycle is a key part of water asset management. Renovation of this vital network is generally hindered by the difficulty or impossibility of gaining physical access to the pipes. We study a statistical and machine learning framework for the prediction of water pipe failures. We employ classical and modern classifiers for short-term prediction, and survival analysis to provide a broader perspective and the long-term forecast usually needed for the economic analysis of the renovation. To enrich these models, we introduce new predictors based on water distribution domain knowledge and employ a modern oversampling technique to remedy the high imbalance coming from the few failures observed each year. For our case study, we use a dataset containing the failure records of all pipes within the water distribution network in Barcelona, Spain. The results shed light on the effect of important risk factors, such as pipe geometry, age, material, and soil cover, among others, and can help utility managers conduct more informed predictive maintenance tasks.
Improved Preterm Prediction Based on Optimized Synthetic Sampling of EHG Signal
Xu, Jinshan, Chen, Zhenqin, Lu, Yanpei, Yang, Xi, Pumir, Alain
Preterm labor is the leading cause of neonatal morbidity and mortality and has attracted research efforts from many scientific areas. The inter-relationship between uterine contraction and the underlying electrical activities makes the uterine electrohysterogram (EHG) a promising direction for preterm detection and prediction. Due to the scarcity of EHG signals, especially those of preterm patients, synthetic algorithms are applied to create artificial samples of the preterm type in order to remove prediction bias towards term, at the expense of reduced feature effectiveness in machine-learning-based automatic preterm detection. To address this problem, we quantify the effect of synthetic samples (balance coefficient) on features' effectiveness, and form a general performance metric by utilizing multiple feature scores with relevant weights that describe their contributions to class separation. Combined with the activation/inactivation functions that characterize the effect of the abundance of training samples on term and preterm prediction precision, we obtain an optimal sample balance coefficient that balances the effect of synthetic samples in removing bias towards the majority class against the side effect of reducing features' importance. Substantial improvement in prediction precision has been achieved through a set of numerical tests on the publicly available TPEHG database, which verifies the effectiveness of the proposed method.
A machine learning approach to automatic detection of irregularity in skin lesion border using dermoscopic images
Skin lesion border irregularity is considered an important clinical feature for the early diagnosis of melanoma, representing the B feature in the ABCD rule. In this article we propose an automated approach for skin lesion border irregularity detection. The approach involves extracting the skin lesion from the image, detecting the skin lesion border, measuring the border irregularity, and training a Convolutional Neural Network and Gaussian naive Bayes ensemble for the automatic detection of border irregularity, which results in an objective decision on whether the skin lesion border is considered regular or irregular. The approach achieves an accuracy, sensitivity, specificity, and F-score of 93.6%, 100%, 92.5% and 96.1%, respectively.
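The four figures reported above are all derived from a binary confusion matrix. The abstract does not say which class is treated as positive or which F-score variant is used, so the sketch below assumes that "irregular" is the positive class (label 1) and that the F-score is F1. The labels are made-up toy data, not the study's results, and `binary_metrics` is a hypothetical helper name.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and F1 for 0/1 labels.

    Assumes both classes occur in y_true and at least one positive prediction,
    so that none of the denominators is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Toy labels (1 = irregular border, 0 = regular); purely illustrative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case every irregular lesion is caught, so sensitivity is 100%, while one regular lesion is flagged by mistake, which lowers specificity, accuracy, and F1. The same pattern of perfect sensitivity with lower specificity appears in the reported figures.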