Accuracy
Robust Scene Inference under Noise-Blur Dual Corruptions
Goyal, Bhavya, Lalonde, Jean-François, Li, Yin, Gupta, Mohit
Scene inference under low light is a challenging problem due to severe noise in the captured images. One way to reduce noise is to use a longer exposure during capture. However, in the presence of motion (scene or camera motion), longer exposures lead to motion blur, resulting in loss of image information. This creates a trade-off between two kinds of image degradation: motion blur (due to long exposure) vs. noise (due to short exposure), also referred to as a dual image corruption pair in this paper. With the rise of cameras capable of capturing multiple exposures of the same scene simultaneously, it is possible to overcome this trade-off. Our key observation is that although the amount and nature of degradation vary across these different image captures, the semantic content remains the same across all images. To this end, we propose a method to leverage these multi-exposure captures for robust inference under low light and motion. Our method builds on a feature consistency loss to encourage similar results from these individual captures, and uses the ensemble of their final predictions for robust visual recognition. We demonstrate the effectiveness of our approach on simulated images as well as real captures with multiple exposures, and across the tasks of object detection and image classification.
What's in the laundromat? Mapping and characterising offshore owned domestic property in London
Bourne, Jonathan, Ingianni, Andrea, McKenzie, Rex
The UK, particularly London, is a global hub for money laundering, a significant portion of which uses domestic property. However, understanding the distribution and characteristics of offshore domestic property in the UK is challenging due to limited data availability. This paper attempts to remedy that situation by enhancing a publicly available dataset of UK property owned by offshore companies. We create a data processing pipeline which draws on several datasets and machine learning techniques to create a parsed set of addresses classified into six use classes. The enhanced dataset contains 138,000 properties, 44,000 more than the original dataset. The majority are domestic (95k), with a disproportionate number of those in London (42k). The average offshore domestic property in London is worth 1.33 million GBP; collectively this amounts to approximately 56 billion GBP. We perform an in-depth analysis of the offshore domestic property in London, comparing the price, distribution and entropy/concentration with Airbnb property, low-use/empty property and conventional domestic property. We estimate that the total amount of offshore, low-use and Airbnb property in London is between 144,000 and 164,000 and that they are collectively worth between 145 and 174 billion GBP. Furthermore, offshore domestic property is more expensive and has higher entropy/concentration than all other property types. In addition, we identify two different types of offshore property, nested and individual, which have different price and distribution characteristics. Finally, we release the enhanced offshore property dataset, the complete low-use London dataset and the pipeline for creating the enhanced dataset to reduce the barriers to studying this topic.
Towards Fairness-Aware Multi-Objective Optimization
Yu, Guo, Ma, Lianbo, Du, Wei, Du, Wenli, Jin, Yaochu
Recent years have seen the rapid development of fairness-aware machine learning for mitigating unfairness or discrimination in decision-making in a wide range of applications. However, much less attention has been paid to fairness-aware multi-objective optimization, which is commonly seen in real life, for example in fair resource allocation problems and data-driven multi-objective optimization problems. This paper aims to illuminate and broaden our understanding of multi-objective optimization from the perspective of fairness. To this end, we start with a discussion of user preferences in multi-objective optimization and then explore its relationship to fairness in machine learning and multi-objective optimization. Following these discussions, representative cases of fairness-aware multi-objective optimization are presented, further elaborating the importance of fairness in traditional multi-objective optimization, data-driven optimization and federated optimization. Finally, challenges and opportunities in fairness-aware multi-objective optimization are addressed. We hope that this article makes a small step forward towards understanding fairness in the context of optimization and promotes research interest in fairness-aware multi-objective optimization.
An Impartial Take to the CNN vs Transformer Robustness Contest
Pinto, Francesco, Torr, Philip H. S., Dokania, Puneet K.
Following the surge in popularity of Transformers in Computer Vision, several studies have attempted to determine whether they could be more robust to distribution shifts and provide better uncertainty estimates than Convolutional Neural Networks (CNNs). The almost unanimous conclusion is that they are, and it is often conjectured, more or less explicitly, that this supposed superiority is attributable to the self-attention mechanism. In this paper we perform extensive empirical analyses showing that recent state-of-the-art CNNs (particularly, ConvNeXt [20]) can be as robust and reliable as, and sometimes even more so than, the current state-of-the-art Transformers. However, there is no clear winner. Therefore, although it is tempting to state the definitive superiority of one family of architectures over another, they seem to enjoy similarly extraordinary performance on a variety of tasks while also suffering from similar vulnerabilities such as texture, background, and simplicity biases.
FairGRAPE: Fairness-aware GRAdient Pruning mEthod for Face Attribute Classification
Lin, Xiaofeng, Kim, Seungbae, Joo, Jungseock
Existing pruning techniques preserve deep neural networks' overall ability to make correct predictions but could also amplify hidden biases during the compression process. We propose a novel pruning method, Fairness-aware GRAdient Pruning mEthod (FairGRAPE), that minimizes the disproportionate impacts of pruning on different sub-groups. Our method calculates the per-group importance of each model weight and selects a subset of weights that maintains the relative between-group total importance during pruning. The proposed method then prunes network edges with small importance values and repeats the procedure by updating importance values. We demonstrate the effectiveness of our method on four different datasets, FairFace, UTK-Face, CelebA, and ImageNet, for the tasks of face attribute classification, where our method reduces the disparity in performance degradation by up to 90% compared to state-of-the-art pruning algorithms. Our method is substantially more effective in a setting with a high pruning rate (99%).
Open video data sharing in developmental and behavioural science
Marschik, Peter B, Kulvicius, Tomas, Flügge, Sarah, Widmann, Claudius, Nielsen-Saines, Karin, Schulte-Rüther, Martin, Hüning, Britta, Bölte, Sven, Poustka, Luise, Sigafoos, Jeff, Wörgötter, Florentin, Einspieler, Christa, Zhang, Dajie
Video recording is a widely used method for documenting infant and child behaviours in research and clinical practice. Video data has rarely been shared due to ethical concerns about confidentiality, although the need for shared large-scale datasets keeps increasing. This demand is even more imperative when data-driven computer-based approaches are involved, such as screening tools to complement clinical assessments. To share data while abiding by privacy protection rules, a critical question arises: do efforts at data de-identification reduce data utility? We addressed this question using Prechtl's general movements assessment (GMA), an established and globally practised video-based diagnostic tool in early infancy for detecting neurological deficits, such as cerebral palsy. To date, no shared expert-annotated large data repositories for infant movement analyses exist. Such datasets would massively benefit training and recalibration of human assessors and the development of computer-based approaches. In the current study, sequences from a prospective longitudinal infant cohort with a total of 19451 available general movements video snippets were randomly selected for human clinical reasoning and computer-based analysis. We demonstrated for the first time that pseudonymisation by face-blurring video recordings is a viable approach. The video redaction did not affect classification accuracy for either human assessors or computer vision methods, suggesting an adequate and easy-to-apply solution for sharing movement video data. We call for further exploration of efficient and privacy rule-conforming approaches for de-identifying video data in scientific and clinical fields beyond movement assessments. These approaches should enable sharing and merging stand-alone video datasets into large data pools to advance science and public health.
Classification via score-based generative modelling
In this work, we investigated the application of score-based gradient learning in discriminative and generative classification settings. The score function can be used to characterize a data distribution as an alternative to the density. It can be efficiently learned via score matching, and used to flexibly generate credible samples to enhance discriminative classification quality, to recover the density and to build generative classifiers. We analysed the decision theories involving score-based representations, and performed experiments on simulated and real-world datasets, demonstrating the effectiveness of the approach in achieving and improving binary classification performance and robustness to perturbations, particularly in high dimensions and imbalanced situations.
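For readers unfamiliar with the terminology, the standard textbook definitions can be written out as follows. This is a sketch of the usual formulation only: the symbols $s(x)$, $s_\theta$, $p(x)$ and $J(\theta)$ are notation chosen here for illustration, and the abstract does not state which exact objective the paper uses.

```latex
% Score function of a density p(x): the gradient of the log-density
% with respect to the data point x (not the model parameters).
s(x) = \nabla_x \log p(x)

% Classical (Hyvarinen) score matching objective for a model s_\theta,
% which avoids evaluating the normalising constant of p:
J(\theta) = \mathbb{E}_{x \sim p}\!\left[
    \operatorname{tr}\!\big(\nabla_x s_\theta(x)\big)
    + \tfrac{1}{2}\,\lVert s_\theta(x) \rVert_2^2
\right]

% A generative classifier then compares class-conditional densities
% through Bayes' rule:
p(y \mid x) \propto p(x \mid y)\, p(y)
```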
Algorithmic Fairness in Business Analytics: Directions for Research and Practice
De-Arteaga, Maria, Feuerriegel, Stefan, Saar-Tsechansky, Maytal
The extensive adoption of business analytics (BA) has brought financial gains and increased efficiencies. However, these advances have simultaneously drawn attention to rising legal and ethical challenges when BA inform decisions with fairness implications. As a response to these concerns, the emerging study of algorithmic fairness deals with algorithmic outputs that may result in disparate outcomes or other forms of injustices for subgroups of the population, especially those who have been historically marginalized. Fairness is relevant on the basis of legal compliance, social responsibility, and utility; if not adequately and systematically addressed, unfair BA systems may lead to societal harms and may also threaten an organization's own survival, its competitiveness, and overall performance. This paper offers a forward-looking, BA-focused review of algorithmic fairness. We first review the state-of-the-art research on sources and measures of bias, as well as bias mitigation algorithms. We then provide a detailed discussion of the utility-fairness relationship, emphasizing that the frequent assumption of a trade-off between these two constructs is often mistaken or short-sighted. Finally, we chart a path forward by identifying opportunities for business scholars to address impactful, open challenges that are key to the effective and responsible deployment of BA.
Array
A confusion matrix is mainly used in machine learning and deep learning to evaluate the performance of a model by showing all of its predicted values in a table. It is mainly used to evaluate a classification model (classifier) by calculating precision, recall and F1 score from the matrix. As the name suggests, the terminology used in the table can be confusing. In the dog vs. cat example, we have designed and trained a model to predict one of two values: dog or cat. To create the confusion matrix, we take one class at a time. In our case we take dog first and derive four values: TP (true positives), TN (true negatives), FP (false positives) and FN (false negatives).
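The dog vs. cat procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the source: the function names `confusion_counts` and `precision_recall_f1` and the example labels are hypothetical, and dog is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN, treating `positive` as the class of interest."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted dog, actually dog
        elif pred == positive and truth != positive:
            fp += 1  # predicted dog, actually cat
        elif pred != positive and truth == positive:
            fn += 1  # predicted cat, actually dog
        else:
            tn += 1  # predicted cat, actually cat
    return tp, tn, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall and F1 from the counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = ["dog", "dog", "dog", "cat", "cat", "cat", "cat", "dog"]
y_pred = ["dog", "cat", "dog", "cat", "dog", "cat", "cat", "dog"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive="dog")
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Repeating the call with `positive="cat"` gives the counts for the other class, which is what taking one class at a time amounts to.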
Uncertainty quantification for predictions of atomistic neural networks
Vazquez-Salazar, Luis Itza, Boittier, Eric D., Meuwly, M.
The value of uncertainty quantification on predictions for trained neural networks (NNs) on quantum chemical reference data is quantitatively explored. For this, the architecture of the PhysNet NN was suitably modified and the resulting model was evaluated with different metrics to quantify calibration, quality of predictions, and whether prediction error and the predicted uncertainty can be correlated. The results from training on the QM9 database and evaluating data from the test set within and outside the distribution indicate that error and uncertainty are not linearly related. The results clarify that noise and redundancy complicate property prediction for molecules even in cases for which changes (e.g. double bond migration in two otherwise identical molecules) are small. The model was then applied to a real database of tautomerization reactions. Analysis of the distance between members in feature space combined with other parameters shows that redundant information in the training dataset can lead to large variances and small errors, whereas the presence of similar but unspecific information returns large errors but small variances. This was observed, for example, for nitro-containing aliphatic chains, for which predictions were difficult although the training set contained several examples of nitro groups bound to aromatic molecules. This underlines the importance of the composition of the training data and provides chemical insight into how this affects the prediction capabilities of an ML model. Finally, the approach put forward can be used for information-based improvement of chemical databases for target applications through active learning optimization.